• sorting a list of dicts by a nested key...

    From Andreas Leitgeb@21:1/5 to All on Thu Jan 27 17:48:19 2022
    This is one problem I had, but completely worked around it,
    by changing it to nested lists (and lsort -index ...).

    Nevertheless, I'm curious, what would be the best approach to
    get from outset:

    set list {{key 42 a 1} {key 54 b 5} {c 6 key 30}}

    to the list sorted by the value for "key" from the nested dicts:

    {c 6 key 30} {key 42 a 1} {key 54 b 5}

    So far I can only think of lsort -command ...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From heinrichmartin@21:1/5 to Andreas Leitgeb on Thu Jan 27 14:34:50 2022
    On Thursday, January 27, 2022 at 6:48:23 PM UTC+1, Andreas Leitgeb wrote:
    This is one problem I had, but completely worked around it,
    by changing it to nested lists (and lsort -index ...).

    Nevertheless, I'm curious, what would be the best approach to
    get from outset:

    set list {{key 42 a 1} {key 54 b 5} {c 6 key 30}}

    to the list sorted by the value for "key" from the nested dicts:

    {c 6 key 30} {key 42 a 1} {key 54 b 5}

    So far I can only think of lsort -command ...

    Not an answer, but sharing my experience. I mostly avoid lists of dicts and prefer dicts of dicts by using the key in the outer-most structure, i.e.

    set data {42 {key 42 a 1} 54 {key 54 b 5} 30 {c 6 key 30}}

    I assume that [dict keys] is quite cheap as long as $data does not shimmer ... forearch key [lsort [dict keys $data]] {set record [dict get $data $key]; ...}

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Leitgeb@21:1/5 to heinrichmartin on Thu Jan 27 23:31:41 2022
    heinrichmartin <martin.heinrich@frequentis.com> wrote:
    On Thursday, January 27, 2022 at 6:48:23 PM UTC+1, Andreas Leitgeb wrote:
    This is one problem I had, but completely worked around it,
    by changing it to nested lists (and lsort -index ...).

    Nevertheless, I'm curious, what would be the best approach to
    get from outset:

    set list {{key 42 a 1} {key 54 b 5} {c 6 key 30}}

    to the list sorted by the value for "key" from the nested dicts:

    {c 6 key 30} {key 42 a 1} {key 54 b 5}

    So far I can only think of lsort -command ...

    Not an answer, but sharing my experience. I mostly avoid lists of
    dicts and prefer dicts of dicts by using the key in the outer-most
    structure, i.e.

    set data {42 {key 42 a 1} 54 {key 54 b 5} 30 {c 6 key 30}}

    I assume that [dict keys] is quite cheap as long as $data does not
    shimmer ... forearch key [lsort [dict keys $data]] {set record [dict
    get $data $key]; ...}

    Thanks!
    Very good point!

    Compared to [lsort -command ...], even an extra iteration to turn the
    outer list to a dict would have been a breeze...

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Harald Oehlmann@21:1/5 to All on Fri Jan 28 08:13:14 2022
    Am 28.01.2022 um 00:31 schrieb Andreas Leitgeb:
    heinrichmartin <martin.heinrich@frequentis.com> wrote:
    On Thursday, January 27, 2022 at 6:48:23 PM UTC+1, Andreas Leitgeb wrote: >>> This is one problem I had, but completely worked around it,
    by changing it to nested lists (and lsort -index ...).

    Nevertheless, I'm curious, what would be the best approach to
    get from outset:

    set list {{key 42 a 1} {key 54 b 5} {c 6 key 30}}

    to the list sorted by the value for "key" from the nested dicts:

    {c 6 key 30} {key 42 a 1} {key 54 b 5}

    So far I can only think of lsort -command ...

    Not an answer, but sharing my experience. I mostly avoid lists of
    dicts and prefer dicts of dicts by using the key in the outer-most
    structure, i.e.

    set data {42 {key 42 a 1} 54 {key 54 b 5} 30 {c 6 key 30}}

    I assume that [dict keys] is quite cheap as long as $data does not
    shimmer ... forearch key [lsort [dict keys $data]] {set record [dict
    get $data $key]; ...}

    Thanks!
    Very good point!

    Compared to [lsort -command ...], even an extra iteration to turn the
    outer list to a dict would have been a breeze...


    Not an answer neither. If you arange tk get key at the first position
    always, you can use lsort -index 1.

    My big data structures are normally array of dicts, to:
    - restrict shimmering on the toplevel key level (array elements does not shimmer the whole array)
    - then use highly nested dicts to have "dict index a(b) $i1 $i2 $i3".

    Harald
    -

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Leitgeb@21:1/5 to Harald Oehlmann on Fri Jan 28 08:34:41 2022
    Harald Oehlmann <wortkarg2@yahoo.de> wrote:
    Not an answer neither. If you arange to get key at the first position
    always, you can use lsort -index 1.

    That's a bit dirty, isn't it? ;-) shimmering all nested dicts...

    I named it "key" but that was meant in the sense of sort-key,
    not in the sense of fixed object key.

    It may change between one sort task and the next one.

    If the value for key changes, the key will move to the right
    within the dict, so "keeping the key at front" may not be as
    easy in practise.

    There is also a possible extra challenge, if keys aren't necessarily
    unique. While it wouldn't matter in your approach, Martin's approach
    then would have to be adapted to an outer dict holding lists of
    inner dicts of same key. Will likely still beat the lsort -command ...

    My big data structures are normally array of dicts, to:
    - restrict shimmering on the toplevel key level (array elements does not shimmer the whole array)

    To some degree, arrays may do more shimmering harm than dicts...
    in particular the keys for arrays are always shimmered to string,
    unlike for dicts... - at least that's what I've gathered here in
    the past. Not entirely sure, if still true.

    - then use highly nested dicts to have "dict index a(b) $i1 $i2 $i3".

    "dict get", or did I miss any latest features?

    some of the dict-subcommands unfortunately are not able to reach out
    into nested dicts...

    Just recently I used an array of single-level-dicts, to be able to "incr" leaf-elements indexed by two keys.

    Thanks for joining into the discussion.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Harald Oehlmann@21:1/5 to All on Fri Jan 28 09:51:56 2022
    Am 28.01.2022 um 09:34 schrieb Andreas Leitgeb:
    To some degree, arrays may do more shimmering harm than dicts...
    in particular the keys for arrays are always shimmered to string,
    unlike for dicts... - at least that's what I've gathered here in
    the past. Not entirely sure, if still true.

    That is an important point, but I suppose, this is also true for dicts.
    They both use a hash-table for internal organisation and the hash-table requires always a string representation.

    Thank you,
    Harald

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Leitgeb@21:1/5 to Harald Oehlmann on Fri Jan 28 10:17:10 2022
    Harald Oehlmann <wortkarg2@yahoo.de> wrote:
    Am 28.01.2022 um 09:34 schrieb Andreas Leitgeb:
    To some degree, arrays may do more shimmering harm than dicts...
    in particular the keys for arrays are always shimmered to string,
    unlike for dicts... - at least that's what I've gathered here in
    the past. Not entirely sure, if still true.

    That is an important point, but I suppose, this is also true for dicts.
    They both use a hash-table for internal organisation and the hash-table requires always a string representation.

    There is a subtle difference between "shimmering to string" and merely obtaining a string for the hash and then not storing the string back
    into the object.

    Adding a string-rep to an object is not always an improvement.
    Existing string-reps sometimes inhibit alternative "direct routes"
    from some object to another.

    e.g.: Given a dict object with a string-rep, it is possible, that the string-rep was originally a list's string-rep that still contains
    duplicate keys in the dict-sense, but these are perfectly legal for
    a list. Therefore, the list would be re-parsed from the string rep
    rather than from the dict-rep.

    Without a string-rep, the list can (or at least could) be directly
    obtained from the dict.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Harald Oehlmann@21:1/5 to All on Fri Jan 28 11:29:39 2022
    Am 28.01.2022 um 11:17 schrieb Andreas Leitgeb:
    Harald Oehlmann <wortkarg2@yahoo.de> wrote:
    Am 28.01.2022 um 09:34 schrieb Andreas Leitgeb:
    To some degree, arrays may do more shimmering harm than dicts...
    in particular the keys for arrays are always shimmered to string,
    unlike for dicts... - at least that's what I've gathered here in
    the past. Not entirely sure, if still true.

    That is an important point, but I suppose, this is also true for dicts.
    They both use a hash-table for internal organisation and the hash-table
    requires always a string representation.

    There is a subtle difference between "shimmering to string" and merely obtaining a string for the hash and then not storing the string back
    into the object.

    Adding a string-rep to an object is not always an improvement.
    Existing string-reps sometimes inhibit alternative "direct routes"
    from some object to another.

    e.g.: Given a dict object with a string-rep, it is possible, that the string-rep was originally a list's string-rep that still contains
    duplicate keys in the dict-sense, but these are perfectly legal for
    a list. Therefore, the list would be re-parsed from the string rep
    rather than from the dict-rep.

    Without a string-rep, the list can (or at least could) be directly
    obtained from the dict.


    Andreas,

    thank you, good point,
    Harald

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From heinrichmartin@21:1/5 to Andreas Leitgeb on Fri Jan 28 02:52:48 2022
    On Friday, January 28, 2022 at 11:17:14 AM UTC+1, Andreas Leitgeb wrote:
    Adding a string-rep to an object is not always an improvement.
    Existing string-reps sometimes inhibit alternative "direct routes"
    from some object to another.

    e.g.: Given a dict object with a string-rep, it is possible, that the string-rep was originally a list's string-rep that still contains
    duplicate keys in the dict-sense, but these are perfectly legal for
    a list. Therefore, the list would be re-parsed from the string rep
    rather than from the dict-rep.

    Without a string-rep, the list can (or at least could) be directly
    obtained from the dict.

    One never stops learning. This means it actually makes a difference whether I start a dict with [dict create] or not (namely no string rep vs string rep).
    On the other hand, I can still stick to initialising short constant dicts with braces (rather than [dict create]), because it will not be measurable in e.g.,

    set defaults {foo 1 bar 2} ;# that is imho more readable than [dict create foo 1 bar 2]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Harald Oehlmann@21:1/5 to All on Fri Jan 28 12:58:54 2022
    Am 28.01.2022 um 12:32 schrieb Harald Oehlmann:
    set defaults [dict create foo 1 bar 2];list
    % tcl::unsupported::representation $defaults

    Sorry, my trick with the ";list" did not work, there is still a string representation. Aim was to avoid this.
    I may check out later, sorry,
    Harald

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Harald Oehlmann@21:1/5 to All on Fri Jan 28 12:32:02 2022
    Am 28.01.2022 um 11:52 schrieb heinrichmartin:
    On Friday, January 28, 2022 at 11:17:14 AM UTC+1, Andreas Leitgeb wrote:
    Adding a string-rep to an object is not always an improvement.
    Existing string-reps sometimes inhibit alternative "direct routes"
    from some object to another.

    e.g.: Given a dict object with a string-rep, it is possible, that the
    string-rep was originally a list's string-rep that still contains
    duplicate keys in the dict-sense, but these are perfectly legal for
    a list. Therefore, the list would be re-parsed from the string rep
    rather than from the dict-rep.

    Without a string-rep, the list can (or at least could) be directly
    obtained from the dict.

    One never stops learning. This means it actually makes a difference whether I start a dict with [dict create] or not (namely no string rep vs string rep).
    On the other hand, I can still stick to initialising short constant dicts with braces (rather than [dict create]), because it will not be measurable in e.g.,

    set defaults {foo 1 bar 2} ;# that is imho more readable than [dict create foo 1 bar 2]

    Yes, it makes a difference:

    % set defaults {foo 1 bar 2}
    % tcl::unsupported::representation $defaults
    value is a pure string with a refcount of 4, object pointer at 007D4430,
    string representation "foo 1 bar 2"
    % dict get $defaults foo
    1
    % tcl::unsupported::representation $defaults
    value is a dict with a refcount of 4, object pointer at 007D4430,
    internal representation 040879E0:00000000, string representation "foo 1
    bar 2"

    Now without string interpretation:
    Remark the funny ";list" at the end of the command to avoid to put the
    result as command result, which then would be converted by the sehll to
    a string for printing, which would add a string representation

    % set defaults [dict create foo 1 bar 2];list
    % tcl::unsupported::representation $defaults
    value is a dict with a refcount of 7, object pointer at 007D4430,
    internal representation 040879E0:00000000, string representation "foo 1
    bar 2"

    Harald

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Leitgeb@21:1/5 to heinrichmartin on Fri Jan 28 12:05:32 2022
    heinrichmartin <martin.heinrich@frequentis.com> wrote:
    On Friday, January 28, 2022 at 11:17:14 AM UTC+1, Andreas Leitgeb wrote:
    Adding a string-rep to an object is not always an improvement.

    One never stops learning. This means it actually makes a difference
    whether I start a dict with [dict create] or not (namely no string rep
    vs string rep).

    Indeed it does, but only for that specific initial dict.

    you can see it, if you:
    set d {a 1 a 2 a 3}
    dict size $d ;# -> 1
    llength $d ;# -> 6, because it used orig string-rep.

    As soon as you add further keys, or just change the value
    for a key, you get a new dict and that new dict will then
    be free of string-rep, even if the orig one was unshared
    and thus modified in-place. Any mod on the object kills the
    string rep.

    This goes a step further, in that, when you create an empty
    dict [dict create] or empty list [list], tcl will just return
    a shared ref to the empty string object, because it doesn't
    matter for those small objects.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Leitgeb@21:1/5 to Harald Oehlmann on Fri Jan 28 13:06:01 2022
    Harald Oehlmann <wortkarg2@yahoo.de> wrote:
    Now without string interpretation:
    Remark the funny ";list" at the end of the command to avoid to put the
    result as command result, which then would be converted by the sehll to
    a string for printing, which would add a string representation

    % set defaults [dict create foo 1 bar 2];list
    % tcl::unsupported::representation $defaults
    value is a dict with a refcount of 7, object pointer at 007D4430,
    internal representation 040879E0:00000000, string representation "foo 1
    bar 2"

    I was surprised, that there even is a string rep in that case, so I
    checked that I could reproduce it (yes, I could), and then tried
    another one:

    % set defaults [dict create foo [expr {1+1}] bar 4];list
    % tcl::unsupported::representation $defaults
    value is a dict with a refcount of 2, object pointer at 0x55d874107190, internal representation 0x55d874172b60:(nil), no string representation

    Apparently, there exists a special case, where all key and value
    elements of a new dict are all strings themselves, and then a string-rep
    of the dict is (unnecessarily?) pre-generated.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From heinrichmartin@21:1/5 to Andreas Leitgeb on Fri Jan 28 05:21:13 2022
    On Friday, January 28, 2022 at 2:06:05 PM UTC+1, Andreas Leitgeb wrote:
    Harald Oehlmann wrote:
    Now without string interpretation:
    Remark the funny ";list" at the end of the command to avoid to put the result as command result, which then would be converted by the sehll to
    a string for printing, which would add a string representation

    % set defaults [dict create foo 1 bar 2];list
    % tcl::unsupported::representation $defaults
    value is a dict with a refcount of 7, object pointer at 007D4430,
    internal representation 040879E0:00000000, string representation "foo 1
    bar 2"
    I was surprised, that there even is a string rep in that case, so I
    checked that I could reproduce it (yes, I could), and then tried
    another one:

    % set defaults [dict create foo [expr {1+1}] bar 4];list
    % tcl::unsupported::representation $defaults
    value is a dict with a refcount of 2, object pointer at 0x55d874107190, internal representation 0x55d874172b60:(nil), no string representation

    Apparently, there exists a special case, where all key and value
    elements of a new dict are all strings themselves, and then a string-rep
    of the dict is (unnecessarily?) pre-generated.

    Nope, note the refcount 7. Always start such tests in a fresh tclsh - or even better in a file.
    But even in a fresh tclsh, I can reproduce the string rep ... and demonstrate how Tcl re-uses objects.

    $ tclsh
    % tcl::unsupported::representation [dict create foo bar]
    value is a dict with a refcount of 3, object pointer at 0x10306b0, internal representation 0x10afad0:(nil), string representation "foo bar"
    % tcl::unsupported::representation [dict create foo bar]
    value is a dict with a refcount of 4, object pointer at 0x10306b0, internal representation 0x10afad0:(nil), string representation "foo bar"
    % tcl::unsupported::representation [dict create foo bar]
    value is a dict with a refcount of 5, object pointer at 0x10306b0, internal representation 0x10afad0:(nil), string representation "foo bar"
    % tcl::unsupported::representation [dict {*}{create foo bar}]
    value is a dict with a refcount of 6, object pointer at 0x10306b0, internal representation 0x10afad0:(nil), string representation "foo bar"
    % tcl::unsupported::representation [dict {*}{create foo bar}]
    value is a dict with a refcount of 7, object pointer at 0x10306b0, internal representation 0x10afad0:(nil), string representation "foo bar"
    % tcl::unsupported::representation [dict {*}{create foo baz}]
    value is a dict with a refcount of 3, object pointer at 0x10cc9e0, internal representation 0x10b18d0:(nil), string representation "foo baz"

    Your special case seems to be crafted by the interactive mode - not even non-normalized string reps in the command could break that.
    But the interpreter on its own behaves as expected:

    $ cat test.tcl
    set a [dict create foo 1 bar 2]
    puts [tcl::unsupported::representation $a]
    puts $a
    puts [tcl::unsupported::representation $a]
    puts [tcl::unsupported::representation {foo 1 bar 2}]

    $ tclsh test.tcl
    value is a dict with a refcount of 2, object pointer at 0x10d0610, internal representation 0x10e9ec0:(nil), no string representation
    foo 1 bar 2
    value is a dict with a refcount of 2, object pointer at 0x10d0610, internal representation 0x10e9ec0:(nil), string representation "foo 1 bar 2"
    value is a pure string with a refcount of 1, object pointer at 0x10d0eb0, string representation "foo 1 bar 2"

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Eric@21:1/5 to All on Fri Jan 28 06:00:02 2022
    Le vendredi 28 janvier 2022 à 14:06:05 UTC+1, Andreas Leitgeb a écrit :
    Harald Oehlmann <wort...@yahoo.de> wrote:
    Now without string interpretation:
    Remark the funny ";list" at the end of the command to avoid to put the result as command result, which then would be converted by the sehll to
    a string for printing, which would add a string representation

    % set defaults [dict create foo 1 bar 2];list
    % tcl::unsupported::representation $defaults
    value is a dict with a refcount of 7, object pointer at 007D4430,
    internal representation 040879E0:00000000, string representation "foo 1 bar 2"
    I was surprised, that there even is a string rep in that case, so I
    checked that I could reproduce it (yes, I could), and then tried
    another one:

    % set defaults [dict create foo [expr {1+1}] bar 4];list
    % tcl::unsupported::representation $defaults
    value is a dict with a refcount of 2, object pointer at 0x55d874107190, internal representation 0x55d874172b60:(nil), no string representation

    Apparently, there exists a special case, where all key and value
    elements of a new dict are all strings themselves, and then a string-rep
    of the dict is (unnecessarily?) pre-generated.

    Hi,

    This is due to management of literals table.
    I had the same "issue", trying to reduce size of generated code using [list] instead of {}.
    See https://core.tcl-lang.org/tcl/tktview?name=a64ef79518

    Eric

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Andreas Leitgeb@21:1/5 to heinrichmartin on Fri Jan 28 13:57:58 2022
    heinrichmartin <martin.heinrich@frequentis.com> wrote:
    Your special case seems to be crafted by the interactive mode -
    not even non-normalized string reps in the command could break that.

    I did try to alter values first, using different numbers for value of
    bar, and always got a string rep, despite the ";list"-trick to avoid stringification of result in the shell. Only when I computed one of
    the inner values, it worked as expected.

    I fail to explain this with interactive mode alone. If it were that,
    then 2 versus [expr {1+1}] wouldn't make a reproducible difference,
    either.

    Interactive mode, however, may of course play a part, namely by not
    compiling the commands... - The interpreted path may well take a
    different route for dict create with fixed arguments, than for dict
    create with computed ones.

    Nevertheless, this is no bug, as long as the difference can only be seen
    by tcl::unsupported::-means.

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)