• debian/gbp.conf analytics? (Re: Re: Accepting DEP14?)

    From Otto Kekäläinen@21:1/5 to All on Mon Aug 19 03:50:01 2024
    Hi!

    I am happily using debian/gbp.conf and debian-branch=debian/latest in
    all of my packages, but based on the DEP14 discussion it seems some people
    prefer debian/sid or debian/unstable (and some of them upload to
    experimental from that branch despite the name, while others maintain a
    separate debian/experimental branch for experimental uploads).

    However, the responses are just a sample based on who happens to
    have time to read debian-devel@ discussions.

    I tried to use codesearch.debian.net to find out how many packages
    have a debian/gbp.conf, but it seems it can't be used to simply list
    packages that have a specific file; it always also needs a search
    term to look up inside the file.

    With https://codesearch.debian.net/search?q=path%3Adebian%2Fgbp.conf+debian-branch+%3F%3D+%3Fdebian%2Flatest&literal=0
    I was able to find that 1655 packages have either "debian-branch = debian/latest" or "debian-branch=debian/latest".

    Is there some easy way to iterate every single Debian package and
    extract just one single file from them without having to download all
    packages?

    I'd like to see what percentage of all Debian packages have a gbp.conf
    file, and then download all of them to do stats on what they contain.

    - Otto

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Johannes Schauer Marin Rodrigues@21:1/5 to All on Mon Aug 19 06:50:01 2024
    Quoting Otto Kekäläinen (2024-08-19 03:45:37)
    > I tried to use codesearch.debian.net to find out how many packages have a debian/gbp.conf but it seems it can't be used to simply list packages that have a specific file, it always also needs a search term to look up inside the file.
    >
    > With https://codesearch.debian.net/search?q=path%3Adebian%2Fgbp.conf+debian-branch+%3F%3D+%3Fdebian%2Flatest&literal=0
    > I was able to find that 1655 packages have either "debian-branch = debian/latest" or "debian-branch=debian/latest".
    >
    > Is there some easy way to iterate every single Debian package and
    > extract just one single file from them without having to download all packages?
    >
    > I'd like to see how many % of all Debian packages have a gbp.conf file, and then download all of them to do stats on what they contain.

    Finding out which package contains a given file is better done via the Contents files from our mirrors. The apt-file tool provides an easy interface to search these Contents files and answer the question "which package contains a file or path that looks like this".

    By default, apt-file will only download (and search) Contents files for binary packages and not source packages. To change that, edit /etc/apt/apt.conf.d/50apt-file.conf and change DefaultEnabled from "false" to "true" in the section deb-src::Contents-dsc. Once that is done you run "apt update" to download the newly enabled Contents files and then you can run a search like this:

    $ apt-file --index-names dsc search debian/gbp.conf

    You need to add --index-names because the default value is "deb" and that would only search through binary packages.
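To turn that search output into a package count, something along these lines might work. It is only a sketch: it assumes apt-file prints one "package: path" line per match and that paths in the dsc index are relative to the top of the source tree. count_gbp_packages is a made-up helper name, not part of apt-file.

```shell
# Hypothetical helper: count distinct source packages in apt-file output
# read from stdin. Assumes one "package: path" line per match.
count_gbp_packages() {
    # The search term is matched as a substring, so paths such as
    # debian/gbp.conf.in may show up too. Keep only exact matches, then
    # count each package name once.
    awk -F': ' '$2 == "debian/gbp.conf" { print $1 }' | sort -u | wc -l
}

# Usage (needs the dsc Contents index enabled as described above):
#   apt-file --index-names dsc search debian/gbp.conf | count_gbp_packages
```

Filtering on the exact path is a design choice; drop the comparison if nested or templated gbp.conf files should be counted as well.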

    Thanks!

    cheers, josch
  • From Otto Kekäläinen@21:1/5 to All on Tue Aug 20 07:50:01 2024
    Hi!

    ## How many source packages are in Debian unstable as of today?

    ± zgrep "^Package: " Sources.gz | wc -l
    38199

    ## How many source packages have a gbp.conf?

    ± ls -1 *_gbp.conf | wc -l
    13570
    (24629 do not have it)
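As a percentage, the two counts above work out as follows. This is a small worked sketch; gbp_share is a hypothetical helper name.

```shell
# Hypothetical helper: share of source packages that ship a gbp.conf.
#   $1 = total number of source packages, $2 = number with a gbp.conf
gbp_share() {
    awk -v t="$1" -v w="$2" \
        'BEGIN { printf "%.1f%% with gbp.conf, %d without\n", 100 * w / t, t - w }'
}

gbp_share 38199 13570
# prints: 35.5% with gbp.conf, 24629 without
```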

    ## What is the most popular 'debian-branch'?

    Note! The Sources.gz used to analyze this was from Debian unstable so
    one would not expect to see any debian/bookworm or debian/12-bookworm
    kind of lines here.

    ± grep --no-filename --only-matching --max-count=1 --perl-regex "^debian-branch( )?=( )?\K([^ ]*)" *gbp.conf | sort | uniq -c | sort -nr
    2284 debian/master
    1898 debian/sid
    1655 debian/latest
    990 master
    520 debian
    304 debian/unstable
    179 debian/main
    156 main
    133 debian-sid
    28 debian/experimental
    24 unstable
    20 debian-unstable
    12 experimental
    5 debian-experimental
    4 latest
    3 sid
    2 llvm18/main
    2 llvm17/main
    2 llvm16/main
    2 llvm15/main
    2 llvm14/main
    2 debian-pkg
    2 deb
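The pattern above accepts at most one space on each side of the "=" and counts quoted values separately from unquoted ones. A more tolerant tally could look like the sketch below. It assumes the extracted files sit in one directory and are named *_gbp.conf as above; tally_key is a hypothetical helper name. Like the grep command, it ignores which section a key appears in and counts only the first occurrence per file.

```shell
# Hypothetical helper: tally the values of one gbp.conf key across all
# *_gbp.conf files in a directory, counting each file at most once.
#   $1 = key name, $2 = directory (defaults to the current one)
tally_key() {
    key=$1
    dir=${2:-.}
    for f in "$dir"/*_gbp.conf; do
        [ -f "$f" ] || continue
        awk -v key="$key" '
            {
                eq = index($0, "=")
                if (eq == 0) next
                k = substr($0, 1, eq - 1)
                v = substr($0, eq + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", k)    # trim the key
                if (k != key) next
                gsub(/^[ \t]+|[ \t\r]+$/, "", v)  # trim the value
                gsub(/^["\047]|["\047]$/, "", v)  # drop surrounding quotes
                print v
                exit                              # first match per file only
            }' "$f"
    done | sort | uniq -c | sort -nr
}

# Usage: tally_key debian-branch /path/to/extracted/files
```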


    ## What is the most popular 'upstream-branch'?

    ± grep --no-filename --only-matching --max-count=1 --perl-regex "^upstream-branch( )?=( )?\K([^ ]*)" *gbp.conf | sort | uniq -c | sort -nr
    1846 upstream/latest
    1488 upstream
    220 master
    130 upstream-sid
    47 upstream/master
    30 main
    15 upstream-unstable
    14 upstream/sid
    10 master-dfsg
    9 there-is-no-upstream-branch
    6 dfsg
    5 release
    4 upstream-experimental
    4 dfsg-orig
    3 upstream-tarball
    3 upstream-release
    3 upstream-dfsg
    3 invalid
    3 dfsg_clean
    3 debian-upstream


    ## What is the most popular 'upstream-tag' format?

    ± grep --no-filename --only-matching --max-count=1 --perl-regex "^upstream-tag( )?=( )?\K([^ ]*)" *gbp.conf | sort | uniq -c | sort -nr
    943 upstream/%(version)s
    350 v%(version)s
    267 %(version)s
    23 'v%(version)s'
    11 '%(version)s'
    9 upstream/v%(version)s
    4 version/%(version)s
    4 'upstream/%(version)s'
    3 upstream-tarball/v%(version)s
    3 snapshot-%(version)s
    3 release-%(version)s
    2 v%(version%~%-)s
    2 version_%(version)s
    2 upstream/%(version)s+dfsg
    2 release/v%(version)s


    ## How many packages have an 'upstream-vcs-tag' and what is it typically?

    ± grep --no-filename --only-matching --max-count=1 --perl-regex "^upstream-vcs-tag( )?=( )?\K([^ ]*)" *gbp.conf | sort | uniq -c | sort -nr
    214 %(version)s
    187 v%(version)s
    156 %(version%~%.)s
    126 %(version%~%-)s
    52 v%(version%~%-)s
    19 v%(version%~%.)s
    8 release/%(version)s
    5 release/%(version)s/final
    3 release-%(version)s
    2 v-%(version)s
    2 rel-%(version)s
    2 gnupg-%(version)s
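Formats such as %(version%~%-)s use a substitution form which, as far as I understand the git-buildpackage documentation, replaces the first string with the second in the version before building the tag name (upstream tags cannot contain "~", which is not valid in git ref names). The sketch below only mimics that expansion for illustration. expand_tag is a hypothetical helper, not gbp's code; it handles single-character substitutions only and inserts a plain %(version)s unchanged, whereas gbp itself may sanitize further.

```shell
# Hypothetical helper: expand a tag format for a given upstream version.
# Supports "%(version)s" and "%(version%A%B)s" with single-character A, B.
#   $1 = tag format, $2 = upstream version
expand_tag() {
    awk -v fmt="$1" -v ver="$2" 'BEGIN {
        if (match(fmt, /%\(version%.%.\)s/)) {
            from = substr(fmt, RSTART + 10, 1)   # A in %(version%A%B)s
            to   = substr(fmt, RSTART + 12, 1)   # B in %(version%A%B)s
            out = ""
            # Literal character replacement, no regex involved.
            for (i = 1; i <= length(ver); i++) {
                c = substr(ver, i, 1)
                out = out (c == from ? to : c)
            }
        } else if (match(fmt, /%\(version\)s/)) {
            out = ver
        } else {
            print fmt                            # nothing to expand
            exit
        }
        print substr(fmt, 1, RSTART - 1) out substr(fmt, RSTART + RLENGTH)
    }'
}

expand_tag 'v%(version%~%-)s' '1.2.0~rc1'      # prints: v1.2.0-rc1
expand_tag '%(version%~%.)s' '2.0~beta2'       # prints: 2.0.beta2
expand_tag 'release/%(version)s/final' '3.1'   # prints: release/3.1/final
```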


    ## How many packages have 'pristine-tar'?

    ± grep --no-filename --only-matching --max-count=1 --perl-regex "^pristine-tar( )?=( )?\K([^ ]*)" *gbp.conf | sort | uniq -c | sort -nr
    9098 True
    508 False
    169 true
    46 false
    3 1
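Since True, true and 1 presumably all mean the same thing to the config parser, the counts above can be collapsed into two totals. This is a sketch that assumes the usual INI-style boolean spellings (true/1/yes/on and false/0/no/off, case-insensitive); sum_booleans is a hypothetical helper name.

```shell
# Hypothetical helper: collapse "count value" lines (uniq -c style output)
# read from stdin into enabled/disabled totals.
sum_booleans() {
    awk '
        { v = tolower($2) }
        v == "true"  || v == "1" || v == "yes" || v == "on"  { on  += $1; next }
        v == "false" || v == "0" || v == "no"  || v == "off" { off += $1; next }
        { other += $1 }   # anything that is not a recognised boolean
        END { printf "enabled=%d disabled=%d other=%d\n", on, off, other }'
}

printf '%s\n' '9098 True' '508 False' '169 true' '46 false' '3 1' | sum_booleans
# prints: enabled=9270 disabled=554 other=0
```

With these numbers, roughly 68% of the 13570 gbp.conf files enable pristine-tar explicitly.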


    ## How many packages have 'upstream-signatures'?

    ± grep --no-filename --only-matching --max-count=1 --perl-regex "^upstream-signatures( )?=( )?\K([^ ]*)" *gbp.conf | sort | uniq -c | sort -nr
    7 on
    6 True
    2 auto

    ## How many packages have 'sign-tags'?

    ± grep --no-filename --only-matching --max-count=1 --perl-regex "^sign-tags( )?=( )?\K([^ ]*)" *gbp.conf | sort | uniq -c | sort -nr
    2587 True
    55 true
    9 False


    ## Which lines in gbp.conf in general are most common?

    ± cat *_gbp.conf | sort | uniq -c | sort -nr
    13032 [DEFAULT]
    10116
    8450 pristine-tar = True
    2746 [import-orig]
    2553 sign-tags = True
    1771 debian-branch = debian/sid
    1734 upstream-branch = upstream/latest
    1731 [buildpackage]
    1547 dist = DEP14
    1527 debian-branch = debian/latest
    1446 debian-branch = debian/master
    1307 upstream-branch = upstream
    1117 [dch]
    1059 patch-numbers = False
    987 filter = [ '.gitignore', '.travis.yml', '.git*' ]
    967 [pq]
    873 debian-branch = master
    870 upstream-tag = upstream/%(version)s
    771 debian-branch=debian/master
    729 # Configuration file for git-buildpackage and friends
    691 pristine-tar=True
    678 multimaint-merge = True
    604 debian-tag = debian/%(version)s
    526 filter = */.git*
    501 pristine-tar = False
    489 filter=[ '.gitignore', '.travis.yml', '.git*' ]
    466 debian-branch = debian
    436 filter-pristine-tar = True
    369 compression = xz
    360 # Always use pristine-tar.


    In light of these stats I am fine with the current version of the
    DEP-14 text, and I am happy with what I settled on in my packages, in
    particular the most complex one
    (https://salsa.debian.org/mariadb-team/mariadb-server/-/blob/debian/latest/debian/gbp.conf),
    which uses basically all features of git-buildpackage and DEP-14.

    Also, it would be cool if trends.debian.net included some kind of
    gbp.conf stats to track how things evolve over time. I filed a
    wishlist request for that at
    https://salsa.debian.org/lucas/debian-trends/-/issues/3.

  • From Simon McVittie@21:1/5 to All on Tue Aug 20 10:10:01 2024
    On Mon, 19 Aug 2024 at 22:42:53 -0700, Otto Kekäläinen wrote:
    > ## How many packages have an 'upstream-vcs-tag' and what is it typically?

    Unlike most of the other questions you asked and answered (thanks!) we
    should never expect this to be consistent, because it isn't Debian's
    decision: it's upstream's decision what they will name their tags.
    The only decision we can make here as Debian packagers is whether to use:

    1. a workflow where upstream/latest contains the same commits as
    upstream git (like src:mesa);
    2. a workflow where upstream/latest contains imported tarball snapshots,
    *without* upstream git history merged in (like src:libsdl2);
    3. a workflow where upstream/latest contains imported tarball snapshots
    *with* upstream git history merged in, most likely via upstream-vcs-tag
    (like src:glib2.0)

    and the total number of upstream-vcs-tag is effectively counting (3.) (and possibly some of (1.)).

    I'm surprised the number your statistics give for (3.) is such a small proportion: I find this workflow really useful as a way to reconcile
    devref 6.8.8.1's assertion that pristine upstream tarballs are important
    with the desire to have upstream git history readily available to make maintenance easier.

    The main reason I don't use upstream-vcs-tag in all the packages I
    maintain[1] is that some upstreams have non-DFSG or not-obviously-DFSG
    content in their VCS, and as a project we can be very uncompromising about
    the application of the DFSG, so using a non-ideal workflow is less of a
    concern to me than the prospect that the project might decide that the
    upstream VCS is insufficiently Free and demand that the packaging history
    is destroyed and re-created.

    smcv

    [1] other than the usual exceptional cases like packages not maintained
    in git upstream, or very large data packages

  • From gregor herrmann@21:1/5 to Simon McVittie on Sun Aug 25 01:40:01 2024
    On Tue, 20 Aug 2024 09:04:59 +0100, Simon McVittie wrote:

    > 3. a workflow where upstream/latest contains imported tarball snapshots
    > *with* upstream git history merged in, most likely via upstream-vcs-tag
    > (like src:glib2.0)
    > …
    > I'm surprised the number your statistics give for (3.) is such a small proportion: I find this workflow really useful as a way to reconcile
    > devref 6.8.8.1's assertion that pristine upstream tarballs are important
    > with the desire to have upstream git history readily available to make maintenance easier.

    In the Debian Perl Group, our dpt-import-orig wrapper around
    gbp-import-orig uses --upstream-vcs-tag, but it guesses the name of
    the upstream tag, so we're typically not using upstream-vcs-tag in
    gbp.conf. As a result, the numbers from checking all gbp.conf files
    are lower than the real use.

    Cheers,
    gregor

    --
    .''`. https://info.comodo.priv.at -- Debian Developer https://www.debian.org
    : :' : OpenPGP fingerprint D1E1 316E 93A7 60A8 104D 85FA BB3A 6801 8649 AA06
    `. `' Member VIBE!AT & SPI Inc. -- Supporter Free Software Foundation Europe
    `-
