• Upstream tarball hashes: debian/upstream/*SUMS

    From Simon Josefsson@21:1/5 to All on Thu Nov 28 09:10:01 2024
    All,

    There is discussion in the 'Simpler git workflow for packaging with upstreamless repositories' thread about the merits of pristine-tar.

    One important value people appear to see is to be able to assert that orig.tar.gz's integrity can be chained back into some data in the git repository.

    I agree with the value of being able to assert and verify
    bit-by-bit-identical upstream source tarballs.

    I'd like to explore if we can achieve the same goal without
    pristine-tar.

    How about putting SHA256 checksums of the upstream *.orig.tar.* in, say, debian/upstream/?

    What do you think about the following DEP/RFC-style specification?

    /Simon

    Upstream source tarball checksums: debian/upstream/*SUMS
    ========================================================

    Checksum files are organized on a per-hash filename basis.

    SHA256 checksums are put in a file debian/upstream/SHA256SUMS.

    Generally files MUST be parseable by the 2024-era interface of Coreutils checksum tools such as 'sha256sum -c'.

    New checksum values are added for each new upstream release tarball.

    Multiple tarballs can be supported, if the Debian package is making use
    of that feature.

    The filenames in the *SUMS file should be the *.orig.tar.* filename used
    within the Debian archive.

    A checksum of upstream's tarball name MUST be included, as it is
    retrieved by debian/watch. This normally results in the same checksum
    value as for the *.orig.tar.* file. Having both checksum lines helps to establish a cryptographic connection from Debian's tarball name to
    upstream's tarball name. The checksums will be different when Debian
    re-pack upstream's source tarball, but there is still value in recording
    the upstream tarball used as a basis for creating the Debian source
    tarball.

    Native Debian packages are not supported, as they don't have a
    reasonable external upstream that can be checksum'ed.

    Adding support for new algorithms is simple, just add a new file.

    For backwards compatibility with old tools used in the future, and to
    establish a known least-supported base-line, the
    debian/upstream/SHA256SUMS file MUST exist if any debian/upstream/*SUMS
    files are present, and MUST contain all relevant checksums.

    There MAY be checksums of auxiliary files -- such as PGP *.asc or *.gpg signatures, Sigsum *.proof files, CMS/PKCS7 signatures, Sigstore cosign artifacts, etc.

    Comments are supported by beginning each line with a # character,
    optionally preceded by whitespace.
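
    The rules above could be checked with something like the following. This is only a rough sketch, not part of the proposal: the function names parse_sums and verify are made up for illustration, it handles only the plain "<hex digest>  <filename>" line format, and it ignores the backslash escaping that sha256sum applies to unusual filenames. Files that are listed but not present are skipped, on the assumption that the *SUMS file accumulates entries for older releases.

```python
import hashlib
import re
from pathlib import Path

# One checksum line in the default coreutils format:
# 64 hex digits, a space, a mode character (' ' or '*'), then the filename.
_LINE = re.compile(r"^([0-9a-fA-F]{64}) [ *](.+)$")


def parse_sums(text):
    """Return {filename: hex digest}, skipping blank and comment lines."""
    entries = {}
    for raw in text.splitlines():
        line = raw.lstrip()  # comments may be preceded by whitespace
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match is None:
            raise ValueError(f"unparseable checksum line: {raw!r}")
        entries[match.group(2)] = match.group(1).lower()
    return entries


def verify(sums_file, directory):
    """Map each listed file that exists in 'directory' to True or False."""
    results = {}
    for name, expected in parse_sums(Path(sums_file).read_text()).items():
        path = Path(directory) / name
        if path.is_file():  # entries for other releases are simply skipped
            actual = hashlib.sha256(path.read_bytes()).hexdigest()
            results[name] = actual == expected
    return results
```

    For example, verify("debian/upstream/SHA256SUMS", "..") would report on whichever listed tarballs sit next to the unpacked source tree.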

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jonas Smedegaard@21:1/5 to All on Thu Nov 28 11:50:01 2024
    Quoting Alec Leamas (2024-11-28 11:28:02)
    Hi,

    On 28/11/2024 09:01, Simon Josefsson wrote:
    The checksums will be different when Debian
    re-pack upstream's source tarball, but there is still value in recording the upstream tarball used as a basis for creating the Debian source
    tarball

    Personally, the few packages I maintain are mostly repacked. Isn't there
    also value in storing the hash of the repacked tarball, the thing
    actually used?

    I can only understand the proposal as the *.orig.tar.* file being the
    thing actually used - i.e. for a source package repackaging the upstream
    source tarball, the *repackaged* tarball is mandatory to include the
    checksum for, whereas the upstream tarball may optionally be recorded
    as well.

    - Jonas

    --
    * Jonas Smedegaard - idealist & Internet-arkitekt
    * Tlf.: +45 40843136 Website: http://dr.jones.dk/
    * Sponsorship: https://ko-fi.com/drjones

    [x] quote me freely [ ] ask before reusing [ ] keep private
  • From sre4ever@free.fr@21:1/5 to All on Thu Nov 28 12:50:02 2024
    Hi,

    On 2024-11-28 11:28, Alec Leamas wrote:

    Personally, the few packages I maintain are mostly repacked. Isn't
    there also value in storing the hash of the repacked tarball, the thing actually used?

    Not much value, as both the hash and the data would be stored in the
    same place (the git repo): if something is able to compromise the data,
    not much more is needed to make the hash match the compromised data
    again. And anyway the (possibly signed) hash is in a .changes file, and
    git itself already uses (weaker, but good enough against data
    corruption) hashes to store things.

    Signed commits/tags would help in that case but it looks like gbp
    import-orig doesn't support them on the pristine-tar branch. You have
    them on the upstream branch though, which is what actually matters.

    Anyway, in both cases it would be wise to compute the hash against the
    uncompressed (and normalized: timestamps, ownership, order, etc.) .tar
    data, as only the delta (against uncompressed data) is stored and there
    is no guarantee of getting exactly the same compressed bit stream when
    rebuilding the compressed tarball on another system or an updated one.
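
    To illustrate the point about compressed bit streams, here is a small sketch. It covers only the decompression part, not the tar normalization (timestamps, ownership, order), and the helper name sha256_of_uncompressed is made up. The byte string is a stand-in for real .tar data, and the two gzip streams differ in compression level and header timestamp.

```python
import gzip
import hashlib
import io


def sha256_of_uncompressed(data):
    """SHA256 of the gzip-decompressed payload, read in chunks."""
    digest = hashlib.sha256()
    with gzip.open(io.BytesIO(data), "rb") as stream:
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a .tar payload; any byte string works for the demonstration.
payload = b"example tar contents\n" * 1000

# Same payload, but a different compression level and gzip header timestamp,
# so the compressed byte streams are not identical.
fast = gzip.compress(payload, compresslevel=1, mtime=0)
best = gzip.compress(payload, compresslevel=9, mtime=1)

compressed_match = hashlib.sha256(fast).digest() == hashlib.sha256(best).digest()
uncompressed_match = sha256_of_uncompressed(fast) == sha256_of_uncompressed(best)
print("compressed hashes match:", compressed_match)
print("uncompressed hashes match:", uncompressed_match)
```

    A hash taken over the compressed file changes with the compressor settings, while a hash over the decompressed stream does not.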

    Also, FYI and for reference, Gradle implemented something similar [1]
    for checking downloaded dependencies. Hashes are used only as a last
    resort, when upstream provides no signatures; signatures are much more
    convenient as they do not require recording a new hash with every
    release. Gradle also allows storing multiple hashes for the same
    dependency (group + name + version), which is useful with locally
    (re)built dependencies, and finding a matching hash allows an unsigned
    dependency where a signature would otherwise be required.


    [1]:
    https://docs.gradle.org/current/userguide/dependency_verification.html

    --
    Julien Plissonneau Duquène

  • From Simon Josefsson@21:1/5 to Alec Leamas on Thu Nov 28 13:00:01 2024
    Alec Leamas <leamas.alec@gmail.com> writes:

    Hi,

    On 28/11/2024 09:01, Simon Josefsson wrote:
    The checksums will be different when Debian
    re-pack upstream's source tarball, but there is still value in recording
    the upstream tarball used as a basis for creating the Debian source
    tarball

    Personally, the few packages I maintain are mostly repacked. Isn't
    there also value in storing the hash of the repacked tarball, the
    thing actually used?

    Absolutely, and that was my intention but I can see how it can be read otherwise -- how about the version below?

    /Simon

    Source tarball checksums: debian/upstream/*SUMS
    ===============================================

    Checksum files are organized on a per-hash filename basis.

    SHA256 checksums are put in a file debian/upstream/SHA256SUMS.

    The file MUST contain checksums of the intended *.orig.tar.* archives.
    The filenames within the *SUMS file should be the same *.orig.tar.*
    filename that will be uploaded into the Debian archive.

    Files MUST be parseable by the 2024-era interface of Coreutils checksum
    tools such as 'sha256sum -c'.

    New checksum values are added for each new upstream release.

    Multiple source tarballs are supported, if the Debian package is making
    use of that feature.

    A checksum of upstream's tarball name MUST also be included, as it is
    retrieved by debian/watch. This normally results in the same checksum
    value as for the *.orig.tar.* file. Having both checksum lines helps to establish a cryptographic connection from Debian's tarball name to
    upstream's tarball name. The checksums will be different when Debian
    re-pack upstream's source tarball, but there is still value in recording
    the upstream tarball used as a basis for creating the Debian source
    tarball.

    Native Debian packages are not supported, as they don't have a
    reasonable external upstream that can be checksum'ed.

    Adding support for new algorithms is simple, just add a new file.

    For backwards compatibility with old tools used in the future, and to
    establish a known least-supported base-line, the
    debian/upstream/SHA256SUMS file MUST exist if any debian/upstream/*SUMS
    files are present, and MUST contain all relevant checksums.

    There MAY be checksums of auxiliary files -- such as PGP *.asc or *.gpg signatures, Sigsum *.proof files, CMS/PKCS7 signatures, Sigstore cosign artifacts, etc.

    Comments are supported by beginning each line with a # character,
    optionally preceded by whitespace.
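
    As a rough illustration of what a conforming file might contain when nothing is repacked, the sketch below renders a SHA256SUMS body with one line for the upstream tarball name and one for the *.orig.tar.* name. The package name "foo", both filenames, the comment header and the helper names are hypothetical, and the byte string stands in for real tarball contents. Filenames that would need sha256sum's backslash escaping are not handled.

```python
import hashlib


def sums_line(name, data):
    """Format one line as '<hex digest>  <filename>' (two spaces)."""
    return f"{hashlib.sha256(data).hexdigest()}  {name}"


def render_sums(tarballs):
    """Render a SHA256SUMS body from {filename: bytes}, sorted by filename."""
    header = "# Checksums of upstream and Debian source tarballs"
    lines = [sums_line(name, tarballs[name]) for name in sorted(tarballs)]
    return "\n".join([header, *lines]) + "\n"


# Hypothetical names: 'foo-1.0.tar.gz' as fetched via debian/watch and
# 'foo_1.0.orig.tar.gz' as uploaded; same bytes because nothing was repacked.
upstream = b"stand-in for tarball bytes"
rendered = render_sums(
    {"foo-1.0.tar.gz": upstream, "foo_1.0.orig.tar.gz": upstream}
)
print(rendered, end="")
```

    With a repacked tarball the two lines would simply carry different digests, which is the case the revised text is meant to cover.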