• ANNOUNCE: Excel file format reader/writer package OOXML 1.9 released

    From Harald Oehlmann@21:1/5 to All on Fri Nov 29 15:27:38 2024
    Dear TCL team,

    OOXML can read and write Excel files.

    New features are:
    - Set header/footer
    - TCL 8.6: optionally read using the tcllib::zip::read module, so the
    binary package vfs::zip is no longer required
    - If vfs::zip is used, version 1.0.4 is required
    - More checks for invalid files when reading

    So, the requirements are:
    - TDOM 0.9
    - TCL 8.6.7

    And for reading one of:
    - TCLLIB::ZIP::READ
    - VFS::ZIP version 1.0.4 or better
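
    The requirements above could be checked up front with a small script.
    This is a minimal sketch, not part of OOXML: the helper name
    check_requirement is made up for illustration, and "tdom" and
    "vfs::zip" are assumed to be the names these packages register under.

```tcl
# Hypothetical helper (not part of ooxml): report whether a package is
# available in at least the given version.
proc check_requirement {name minVersion} {
    if {[catch {package require $name} have]} {
        return "missing"
    }
    # "min-" means: this version or any later one, without an upper bound.
    if {[package vsatisfies $have ${minVersion}-]} {
        return "ok ($have)"
    }
    return "too old ($have)"
}

foreach {name minVersion} {Tcl 8.6.7 tdom 0.9 vfs::zip 1.0.4} {
    puts [format "%-10s %s" $name [check_requirement $name $minVersion]]
}
```

    A "missing" result for vfs::zip is not necessarily a problem, since
    only one of the two zip readers is needed.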

    The download page is here:
    https://fossil.sowaswie.de/ooxml/uv/download.html

    Thanks to all contributors!
    Harald (on behalf of the very busy group)

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From TorstenBerg@21:1/5 to All on Thu Dec 5 12:06:34 2024
    Hi,

    thanks for the new version. This is much appreciated and the new options
    on paper size and orientation work nicely.

    One issue that I found:

    When I have a Tcl script encoded in utf-8 and that script writes the
    xlsx file, then umlauts come out weird. Is there an option that can
    handle this or does ooxml assume or expect text input to be in a
    specific encoding?


    And an idea:

    When formatting cells using the '-style' option of the 'cell' method,
    it would be cool to be able to specify more than one style (e.g. as a
    list of styleIDs). Then you could have one style for font styling and
    another for borders and then combine those two to get cells with a
    specific font and border. Conflicting elements of two different styles
    of the list could be handled so that a style later in the list would
    overwrite settings for the identical option in a previous style in the
    list.
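
    The "later style wins" rule described above can be sketched with plain
    Tcl dicts. This only illustrates the proposed semantics and is not
    existing ooxml behaviour: merge_styles and the option names (-font,
    -bold, -border) are hypothetical, and styles are modelled as
    option/value dicts rather than ooxml style IDs.

```tcl
# Hypothetical helper: combine several style dicts into one.
proc merge_styles {styles} {
    # dict merge lets later dictionaries override earlier ones for equal keys.
    return [dict merge {*}$styles]
}

set fontStyle   {-font Arial -bold 1}
set borderStyle {-border thin -bold 0}

set combined [merge_styles [list $fontStyle $borderStyle]]
puts $combined
```

    Here -bold ends up as 0 because borderStyle comes later in the list,
    while -font and -border are simply combined.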

    Regards, Torsten

  • From Harald Oehlmann@21:1/5 to All on Thu Dec 5 13:53:04 2024
    Hi Torsten,
    thanks for the message. Please use the tickets in the tracker:
    https://fossil.sowaswie.de/ooxml/ticket
    You may want to file two tickets, one for each point.

    About the umlauts: this should be an internal issue. The output data
    is UTF-8, AFAIK. But this is critical.
    I have tested umlauts when reading, and that works.
    I had to add an "encoding convertfrom utf-8 $data" to make it work.

    Take care,
    Harald

  • From Ralf Fassel@21:1/5 to All on Thu Dec 5 15:46:57 2024
    * Harald Oehlmann <wortkarg3@yahoo.com>
    | About the "Umlauts". This should be an internal issue. The outputted
    | data is utf-8 afaik. But this is critical.
    | I have tested Umlauts when reading and that works.
    | I had to add an "encoding convertfrom utf-8 $data" to make it work.

    Wouldn't that also depend on how exactly the Tcl script is sourced?
    I.e. a Tcl script containing literal utf-8 data (not the \uxxxx form)
    on Windows with the default system encoding (e.g. cp1252) would require
    an explicit -encoding utf-8 for the 'source' command to read it properly.
    The OP did not specify what OS he was on, and how the Tcl script
    containing utf-8 data was sourced...
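
    The effect described above can be reproduced on any platform by
    sourcing the same file under two encodings. A minimal sketch, assuming
    Tcl 8.6 or later (for 'file tempfile'); the variable name greeting is
    made up for the demonstration, and cp1252 merely stands in for a
    Windows system encoding.

```tcl
# Write a small script as UTF-8 that contains literal (non-escaped) umlauts.
set ch [file tempfile path]
fconfigure $ch -encoding utf-8
puts $ch "set greeting \"Gr\u00fc\u00dfe\""
close $ch

# Read with the correct encoding: the string has 5 characters.
source -encoding utf-8 $path
set good $greeting

# Read as cp1252: every UTF-8 byte becomes a character of its own.
source -encoding cp1252 $path
set bad $greeting

puts "utf-8:  [string length $good] characters"
puts "cp1252: [string length $bad] characters"
file delete $path
```

    The second read yields 7 characters instead of 5, because each of the
    two non-ASCII characters is stored as two bytes in UTF-8.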

    R'

  • From Harald Oehlmann@21:1/5 to All on Thu Dec 5 16:16:41 2024
    Ralf,
    my message was misleading: I introduced the convertfrom into the
    source code for reading Excel files (not for writing). Possibly
    something like this is missing for writing, or there is another
    error; I don't know.

    Looking a bit in the source code:
    proc ooxml::Dom2zip {zf node path cd count} {
        upvar $cd mycd
        upvar $count mycount
        append mycd [::ooxml::add_str_to_archive $zf $path \
            [$node asXML -indent none -xmlDeclaration 1 -encString "UTF-8"]]
        incr mycount
    }

    Ok, always UTF-8

    Later:
    proc ::ooxml::add_str_to_archive {zipchan path data {comment {}}} {
    ...
    set utfdata [encoding convertto utf-8 $data]

    So, I see no issue in the code. I have no idea what is happening here.

    I would write the relevant data to a UTF-8 flat file for debugging:

    set h [open debug.txt w]
    fconfigure $h -encoding utf-8
    puts $h $data
    close $h
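
    When looking at such a debug file, it helps to know which code points
    the string really contains, because an editor may hide the difference.
    A small sketch in that spirit; the proc name codepoints is made up
    here, and cp1252 is only an assumed example of a wrong encoding.

```tcl
# Hypothetical helper: list the Unicode code points of a string.
proc codepoints {s} {
    set out {}
    foreach ch [split $s ""] {
        lappend out [format U+%04X [scan $ch %c]]
    }
    return $out
}

# An intact a-umlaut is one character.
set intact "\u00e4"
# The same character after its UTF-8 bytes were misread as cp1252.
set mangled [encoding convertfrom cp1252 [encoding convertto utf-8 $intact]]

puts [codepoints $intact]
puts [codepoints $mangled]
```

    The first line prints U+00E4, the second U+00C3 U+00A4. If the data
    already looks like the second form before it is handed to the writer,
    the damage happened earlier, e.g. when the script was read.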

    Thanks,
    Harald

  • From TorstenBerg@21:1/5 to All on Thu Dec 5 21:52:35 2024
    Hi,

    thanks for your ideas wrt. the encoding. I will investigate further and
    see whether I can find the culprit. The phenomenon occurs on a Windows
    machine running a utf-8 encoded script with Tcl 8.6. So, maybe this
    combination is already bad (it probably is) since Windows will expect
    the Tcl file to be in cp1252 or so ...

  • From Harald Oehlmann@21:1/5 to All on Fri Dec 6 08:31:01 2024
    All my pkgIndex files have this:

    source -encoding utf-8 $file

  • From Ralf Fassel@21:1/5 to All on Fri Dec 6 11:01:01 2024
    * berg@typoscriptics.de (TorstenBerg)
    | thanks for your ideas wrt. the encoding. I will investigate further and
    | see whether I can find the culprit. The phenomenon is found on a Windows
    | machine running a script being utf-8 with Tcl 8.6. So, maybe this
    | combination is already bad (it probably is) since Windows will expect
    | the Tcl file to be in cp1252 or so ...

    Definitely:

    https://www.tcl.tk/man/tcl/TclCmd/source.htm

    SYNOPSIS
    source fileName
    source -encoding encodingName fileName

    [...]
    The -encoding option is used to specify the encoding of the data stored
    in fileName. When the -encoding option is omitted, the system encoding
    is assumed.

    See also Harald's response (always specify the encoding with 'source'
    when the file is not ASCII). You could use the \u notation if there are
    only a few Unicode characters in the file (with many, the file becomes
    unreadable IMHO).
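
    For illustration, the \u notation keeps the script file itself pure
    ASCII, so it no longer matters which encoding 'source' assumes. A
    minimal sketch; the variable name is arbitrary.

```tcl
# A German word with o-umlaut (U+00F6) and sharp s (U+00DF), written with
# \u escapes so that the script file contains only ASCII bytes.
set word "Gr\u00f6\u00dfe"

# The Tcl parser resolves each escape to a single character.
puts [string length $word]

# The escaped character is the code point U+00F6 (decimal 246).
puts [expr {[string index $word 2] eq [format %c 0xF6]}]
```

    The length is 5, and the comparison yields 1.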

    R'
