• Bug#1052371: game-data-packager: a crash while rebuilding quake2-full-d

    From Simon McVittie@21:1/5 to undef on Fri May 2 20:00:01 2025
    On Thu, 21 Sep 2023 at 08:29:24 +0300, undef wrote:
    I tried to build quake2-full-data from the existing deb package

    I've fixed the crash that you reported, but it's perhaps worth noting
    that if you have the package installed, you don't need to keep the old
    .deb around or pass it to g-d-p: if you run

    game-data-packager quake2

    then g-d-p will gather up the files from /usr/share/games and build a
    new .deb out of them, like a specialized version of dpkg-repack.

    smcv

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Simon McVittie@21:1/5 to All on Fri May 2 19:50:01 2025
    On Wed, 07 Feb 2024 at 12:42:01 +0100, Sébastien Noel wrote:
    A fix could be as simple as

    --- a/game_data_packager/build.py
    +++ b/game_data_packager/build.py
    @@ -903,7 +903,7 @@ class PackagingTask:
    if provider is None:
    try_to_unpack: Collection[str] = self.game.files
    should_provide = set()
    - distinctive_dirs = False
    + distinctive_dirs = True
    else:
    try_to_unpack = set(f.name for f in
    provider.provides_files)
    should_provide = set(try_to_unpack)


    but i'm not familiar enough with that part of the code to have a full
    view of the implications of that change...

    I don't think that's the correct fix, but I've been able to identify a different change that I find easier to justify, so I pushed that
    instead.

    Context:
    - the file provided on the CLI is "unknown/untrusted" by g-d-p and so
    there isn't any "provider" for the stream and for that case, >'distinctive_dirs' was set to False (what's the reasoning for that ?)

    The origin of this seems to be commit ef4dc6c7 "zipfile: match on
    basename for unknown & quircky archives" [sic]. Originally, it was only
    for zip files, where the equivalent of this bug couldn't actually cause
    a crash, because zip files are seekable (although it might have led to
    some inefficiency, unpacking the same file more than once). Later, I
    refactored the same function into consider_stream(), in commit f7915ca5
    "Unify code to stream members from a TarFile or ZipFile". After that
    point, it *can* cause a crash when unpacking a non-seekable archive,
    like a tar file or (as in this case) the tar file embedded in a .deb.

    What distinctive_dirs means is: if it's true (which is the default), we
    will assume that a file named "foo/bar/abc.bin" can only be found in a directory named "foo/bar", and if we see an "abc.bin" somewhere else,
    it's a different file. This is usually what we want: we know the layout
    that the directory tree will have, and there's no point in unpacking a
    file named something like directx/setup.exe in the hope that it might
    secretly be mygame/setup.exe.

    However, if the user of g-d-p has passed an unknown archive file on the command-line, after ef4dc6c7 we will consider any "abc.bin" in any
    directory to be potentially the one we are looking for.

    I don't know why this was done: the commit message doesn't say, and the
    change isn't part of a merge request. But I can guess: probably
    Alexandre had (or was thinking about) an unofficial archive containing
    files in a layout that was not the one we expect, for example:

    quake2-pak0.zip
    \- pak0.pak (same content we expect to find in baseq2/pak0.pak)

    and in that scenario, we will not successfully match pak0.pak to
    anything unless we are willing to disregard the path.

    - quake2 have multiple (different) gamefiles that have the same
    basename()

    Not just that, but they are the same size, for example:

    31329 14d330fe2a9af05a6254b2c2bda9f384 baseq2/players/crakhor/a_grenades.md2
    ...
    31329 8b26b6a4863b7e2c30b4dcd7867a6d10 baseq2/players/cyborg/a_grenades.md2

    This means that g-d-p cannot tell which one it's looking at by just
    looking at the metadata available in the tar file: to disambiguate, it
    needs to know either the path (which we have told it to ignore), or the
    content (which requires actually unpacking).

    When I run

    /path/to/run-gdp-uninstalled \
    -d. --no-search --no-compress --debug \
    quake2 input/quake2-full-data_85_all.deb

    with this temporary change

    --- a/game_data_packager/build.py
    +++ b/game_data_packager/build.py
    @@ -945,6 +945,12 @@ class PackagingTask:
    # proceed to next entry
    continue

    + logger.debug(
    + 'Unpacking %s from %s because it might be %s',
    + entry.name,
    + name,
    + wanted.name,
    + )
    should_provide.discard(filename)

    if filename in self.found:

    it becomes obvious what the problem is:

    DEBUG:game_data_packager.build:Unpacking ./usr/share/games/quake2/baseq2/players/crakhor/a_grenades.md2 from ref/quake2-full-data_85_all.deb because it might be baseq2/players/crakhor/a_grenades.md2
    extracting ./usr/share/games/quake2/baseq2/players/crakhor/a_grenades.md2 from ref/quake2-full-data_85_all.deb
    DEBUG:game_data_packager.build:found baseq2/players/crakhor/a_grenades.md2 at /tmp/gdptmp.sun12x3u/tmp/baseq2/players/crakhor/a_grenades.md2
    DEBUG:game_da