• 2GB limitation

    From alexandru@21:1/5 to All on Mon Jul 22 06:17:09 2024
    Hi,

    Will there be a fix for the 2GB size limit that a string representation
    has in Tcl?
    Maybe already fixed in Tcl 9.0?

    Thanks
    Alexandru

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Leitgeb@21:1/5 to alexandru on Mon Jul 22 08:31:35 2024
    alexandru <alexandru.dadalau@meshparts.de> wrote:
    Will there be a fix for the 2GB size limit that a string representation
    has in Tcl?
    Maybe already fixed in Tcl 9.0?

    Yes, that's one of the reasons for switching to tcl9 as soon as
    possible.

  • From alexandru@21:1/5 to All on Mon Jul 22 17:06:07 2024
    Wow, that's unexpected and cool!
    Thanks

  • From Rich@21:1/5 to Andreas Leitgeb on Mon Jul 22 17:17:21 2024
    Andreas Leitgeb <avl@logic.at> wrote:
    alexandru <alexandru.dadalau@meshparts.de> wrote:
    Will there be a fix for the 2GB size limit that a string representation
    has in Tcl?
    Maybe already fixed in Tcl 9.0?

    Yes, that's one of the reasons for switching to tcl9 as soon as
    possible.

    What is the new larger "limit" in Tcl9?

  • From Emiliano@21:1/5 to Rich on Mon Jul 22 21:58:03 2024
    On Mon, 22 Jul 2024 17:17:21 -0000 (UTC)
    Rich <rich@example.invalid> wrote:

    Andreas Leitgeb <avl@logic.at> wrote:
    alexandru <alexandru.dadalau@meshparts.de> wrote:
    Will there be a fix for the 2GB size limit that a string representation
    has in Tcl?
    Maybe already fixed in Tcl 9.0?

    Yes, that's one of the reasons for switching to tcl9 as soon as
    possible.

    What is the new larger "limit" in Tcl9?

    In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number
    of bytes at the '*bytes' member, not including the terminating null) has
    changed from int to ptrdiff_t, so the limit remains (1<<31)-1 => 2147483647
    bytes on 32-bit platforms (unsurprisingly) and becomes (1<<63)-1 =>
    9223372036854775807 bytes (about 9.22 exabytes) on 64-bit platforms.
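
    The arithmetic above can be checked with a small C sketch. This is an
    illustration only and does not use any Tcl headers: max_signed_for_bytes
    is a hypothetical helper (not a Tcl function) that computes the largest
    value of a signed integer of a given byte width, and main compares it
    with PTRDIFF_MAX from <stdint.h> on whatever platform compiles it.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical helper (not part of Tcl): largest value that a signed
     * two's-complement integer of `nbytes` bytes can hold, that is
     * (1 << (8*nbytes - 1)) - 1. Valid for widths of 1 to 8 bytes. */
    static int64_t max_signed_for_bytes(size_t nbytes)
    {
        unsigned bits = (unsigned)(8 * nbytes);
        /* Work in unsigned arithmetic so the shift cannot overflow. */
        uint64_t v = (UINT64_C(1) << (bits - 1)) - 1;
        return (int64_t)v;
    }

    int main(void)
    {
        printf("4-byte signed max: %lld\n", (long long)max_signed_for_bytes(4));
        printf("8-byte signed max: %lld\n", (long long)max_signed_for_bytes(8));
        printf("sizeof(ptrdiff_t) = %zu, PTRDIFF_MAX = %lld\n",
               sizeof(ptrdiff_t), (long long)PTRDIFF_MAX);

        /* On this platform the helper should agree with PTRDIFF_MAX. */
        assert(max_signed_for_bytes(sizeof(ptrdiff_t)) == (int64_t)PTRDIFF_MAX);
        return 0;
    }
    ```

    Built as a 32-bit binary the last line should report a 4-byte ptrdiff_t,
    and as a 64-bit binary an 8-byte one, which is the distinction the post
    describes.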

    IIUC that's also the (new) maximum number of elements for a Tcl list. In
    practice the number will be less, since the length of the string
    representation of such a list will hit the '*bytes' max length first.

    --
    Emiliano

  • From Harald Oehlmann@21:1/5 to All on Wed Jul 24 09:07:15 2024
    On 22.07.2024 at 19:17, Rich wrote:
    Andreas Leitgeb <avl@logic.at> wrote:
    alexandru <alexandru.dadalau@meshparts.de> wrote:
    Will there be a fix for the 2GB size limit that a string representation
    has in Tcl?
    Maybe already fixed in Tcl 9.0?

    Yes, that's one of the reasons for switching to tcl9 as soon as
    possible.

    What is the new larger "limit" in Tcl9?

    expr {2**63}

  • From Andreas Leitgeb@21:1/5 to Emiliano on Wed Jul 24 16:22:53 2024
    Emiliano <emiliano@example.invalid> wrote:
    In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number
    of bytes at the '*bytes' member, not including the terminating null) has
    changed from int to ptrdiff_t, so the limit remains (1<<31)-1 => 2147483647
    bytes on 32-bit platforms (unsurprisingly) and becomes (1<<63)-1 =>
    9223372036854775807 bytes (about 9.22 exabytes) on 64-bit platforms.

    My hearsay was "generally 64 bit (minus the sign-bit)".
    Are you sure that length-type is *always* ptrdiff_t, and
    that this may be 32bit?

    The "64bit'ness" of a platform is also a bit more complicated...
    There are platforms, where pointers are 64bit, but ints are
    still 32 (despite machine words being all 64bit) - in those
    cases, I'd expect ptrdiff_t to be 64 bit, but on a real old
    32bit machine, I don't really know for sure...

    IIUC that's also the (new) maximum number of elements for a Tcl list.
    In practice the number will be less, since the length of the
    string representation of such a list will hit the '*bytes' max
    length first.

    Not all lists are ever turned into a string rep. While they are
    semantically "just strings", well-written programs can avoid
    ever generating the string rep, at least for those
    really long lists that may be relevant here.

  • From Emiliano@21:1/5 to Andreas Leitgeb on Wed Jul 24 17:05:19 2024
    On Wed, 24 Jul 2024 16:22:53 -0000 (UTC)
    Andreas Leitgeb <avl@logic.at> wrote:

    Emiliano <emiliano@example.invalid> wrote:
    In 9.0 the type of the 'length' member of the Tcl_Obj struct (the number
    of bytes at the '*bytes' member, not including the terminating null) has
    changed from int to ptrdiff_t, so the limit remains (1<<31)-1 => 2147483647
    bytes on 32-bit platforms (unsurprisingly) and becomes (1<<63)-1 =>
    9223372036854775807 bytes (about 9.22 exabytes) on 64-bit platforms.

    My hearsay was "generally 64 bit (minus the sign-bit)".
    Are you sure that length-type is *always* ptrdiff_t, and
    that this may be 32bit?

    In 9.X, it is ptrdiff_t. In 8.Y it is still int.

    See https://core.tcl-lang.org/tcl/file?ci=trunk&name=generic/tcl.h&ln=325-333 and
    https://core.tcl-lang.org/tcl/file?ci=trunk&name=generic/tcl.h&ln=740-752

    ptrdiff_t can still be a 32-bit value. See below.

    The "64bit'ness" of a platform is also a bit more complicated...
    There are platforms, where pointers are 64bit, but ints are
    still 32 (despite machine words being all 64bit) - in those
    cases, I'd expect ptrdiff_t to be 64 bit, but on a real old
    32bit machine, I don't really know for sure...

    This is what I meant by "on 32-bit platforms it is still 2GB":
    the i386-i686 platform has a 32-bit ptrdiff_t.

    On my ancient i686 machine:

    $ uname -m
    i686
    $ tclsh9.0
    % expr {(1 << (8 * $tcl_platform(pointerSize))-1) - 1}
    2147483647
    % package provide Tcl
    9.0b3
    % set tcl_platform(pointerSize)
    4
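
    A rough C counterpart to the tclsh check above, offered as a sketch. It
    prints the sizes of int, long, pointers and ptrdiff_t and names the
    common data model they match (ILP32, LP64 or LLP64). The enum and the
    functions classify_model and model_name are hypothetical names made up
    for this illustration; none of them is Tcl API. The size of ptrdiff_t is
    printed as measured rather than assumed.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical names for illustration; none of this is Tcl API. */
    enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_LLP64, MODEL_OTHER };

    /* Map the byte sizes of int, long and pointers to a common data model. */
    static enum data_model classify_model(size_t int_sz, size_t long_sz,
                                          size_t ptr_sz)
    {
        if (int_sz == 4 && long_sz == 4 && ptr_sz == 4) return MODEL_ILP32;
        if (int_sz == 4 && long_sz == 8 && ptr_sz == 8) return MODEL_LP64;
        if (int_sz == 4 && long_sz == 4 && ptr_sz == 8) return MODEL_LLP64;
        return MODEL_OTHER;
    }

    static const char *model_name(enum data_model m)
    {
        switch (m) {
        case MODEL_ILP32: return "ILP32";
        case MODEL_LP64:  return "LP64";
        case MODEL_LLP64: return "LLP64";
        default:          return "other";
        }
    }

    int main(void)
    {
        enum data_model m =
            classify_model(sizeof(int), sizeof(long), sizeof(void *));

        /* Sanity check of the classifier itself. */
        assert(classify_model(4, 8, 8) == MODEL_LP64);

        printf("int=%zu long=%zu pointer=%zu ptrdiff_t=%zu (model: %s)\n",
               sizeof(int), sizeof(long), sizeof(void *), sizeof(ptrdiff_t),
               model_name(m));
        return 0;
    }
    ```

    Both 64-bit models listed here keep int at 4 bytes while pointers are
    8 bytes, which is the case raised earlier in the thread.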

    IIUC that's also the (new) maximum number of elements for a Tcl list.
    In practice the number will be less, since the length of the
    string representation of such a list will hit the '*bytes' max
    length first.

    Not all lists are ever turned into a string rep. While they are
    semantically "just strings", well-written programs can avoid
    ever generating the string rep, at least for those
    really long lists that may be relevant here.

    Yes, but that's an optimization. Tcl semantics are still defined
    in terms of string operations. I prefer not to depend on internals.

    --
    Emiliano
