• Speech mode of Debian installer

    From Roland Clobus@21:1/5 to Debian Install System Team on Tue Nov 19 12:40:01 2024
    Hello list,

    Recently I've enabled the recording of the audio that is generated by
    espeakup in the speech version of the installer (netinst image) in
    openQA. The first step of the installer is recorded.

    You can see the result here:
    https://openqa.debian.net/tests/325775

    The most striking recording is at step 2:6:1
    https://openqa.debian.net/tests/325775/file/bootwalk_2:6:1-captured.wav
    which is about 5 minutes long and lists 78 language options.

    Before asking questions at the debian-accessibility mailing list, I'll
    ask some technical questions here:
    * Are all these languages supported by the speech generators? (I've
      noticed e.g. 'Chinese letter - Chinese letter' being spoken at 1:00) ->
      i.e. should the list of languages be reduced for this specific variant
      of the installer, because the speech module cannot read it?
    * Could a different font be used to show the UTF-8 characters, similar
      to the text installer? (I've noticed the square symbols for missing glyphs)
    * The spoken text 'Prompt. For help' does not speak the most important
      bit, i.e. that the question mark will show the help text
    * Nowadays newer TTS voices exist that speak a more natural language
      (e.g. piper https://rhasspy.github.io/piper-samples/), could this be
      used instead?

    I'm very well aware of the huge amount of work needed to implement this,
    for a team that is already under load.
    At least I'll be able to help with the automated testing side, on openQA.

    With kind regards,
    Roland Clobus


    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Samuel Thibault@21:1/5 to All on Tue Nov 19 14:30:01 2024
    Hello,

    Cc-ing debian-accessibility at least for information.

    Roland Clobus wrote on Tue, 19 Nov 2024 12:31:15 +0100:
    > The most striking recording is at step 2:6:1
    > https://openqa.debian.net/tests/325775/file/bootwalk_2:6:1-captured.wav
    > which is about 5 minutes long and lists 78 language options.

    Yes. One can use arrow keys instead to quickly go over the list.

    > * Are all these languages supported by the speech generators?

    I don't remember whether that's 100% the case; as far as I recall, it is
    at least very largely covered.

    > (I've noticed e.g. 'Chinese letter - Chinese letter' being spoken
    > at 1:00) -> i.e. should the list of languages be reduced for this
    > specific variant of the installer, because the speech module cannot
    > read it?

    Ideally that could be implemented in localechooser, by looking at the
    list of voices in espeak to filter the list.
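
A minimal Python sketch of that filtering idea, not localechooser's actual code (which is not shown in this thread). It assumes the whitespace-separated table printed by `espeak-ng --voices`, with the language tag in the second column. The sample table, the function names, and the locale list are hypothetical and only illustrate the approach; installer locale codes and espeak-ng language tags may not line up one-to-one, so a real implementation would likely need a mapping table.

```python
import shutil
import subprocess

# Illustrative, abbreviated sample in the shape of `espeak-ng --voices` output.
SAMPLE_VOICES = "\n".join([
    "Pty Language       Age/Gender VoiceName          File",
    " 5  af              --/M      Afrikaans          gmw/af",
    " 5  el              --/M      Greek              grk/el",
    " 2  en-gb           --/M      English_(GB)       gmw/en",
])


def parse_voice_languages(voices_output):
    """Collect the language tags (second column) from the voices table."""
    codes = set()
    for line in voices_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2:
            codes.add(fields[1].lower())
    return codes


def installed_voice_languages():
    """Ask a locally installed espeak-ng for its voices (empty set if absent)."""
    exe = shutil.which("espeak-ng")
    if exe is None:
        return set()
    result = subprocess.run([exe, "--voices"], capture_output=True,
                            text=True, check=True)
    return parse_voice_languages(result.stdout)


def has_voice(locale_code, voice_codes):
    """True if the locale, or at least its base language, has a voice."""
    code = locale_code.lower().replace("_", "-")
    base = code.split("-")[0]
    if code in voice_codes or base in voice_codes:
        return True
    return any(v.split("-")[0] == base for v in voice_codes)


def filter_languages(entries, voice_codes):
    """Keep only (locale_code, name) entries that a voice can speak."""
    return [entry for entry in entries if has_voice(entry[0], voice_codes)]
```

For example, `filter_languages([("el", "Greek"), ("zh_CN", "Chinese (Simplified)")], parse_voice_languages(SAMPLE_VOICES))` keeps only the Greek entry, because the sample table lists no matching tag for `zh_CN`.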

    > * Could a different font be used to show the UTF-8 characters, similar to
    > the text installer?

    For the screen reader to work, we have to use the linux console, not
    an fbterm. So we are limited to the linux console capability, and thus
    cannot display everything at the same time. In practice this is not a
    problem because the speech is correct, and the font is switched once a
    language is selected.

    > * The spoken text 'Prompt. For help' does not speak the most important
    > bit, i.e. that the question mark will show the help text

    This is

    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=690343

    forwarded upstream

    https://github.com/espeak-ng/espeak-ng/issues/150

    > * Nowadays newer TTS voices exist that speak a more natural language
    > (e.g. piper https://rhasspy.github.io/piper-samples/), could this be
    > used instead?

    The problem is the size. espeak-ng supports a very wide range of
    languages with a quite small disk footprint. Piper etc. (we had mbrola
    already for a long time) take a *lot* of space. We have packages ready
    for including e.g. mbrola voices, but we cannot really include them on
    the default images, it's rather for specialized images.

    Samuel

  • From Samuel Thibault@21:1/5 to All on Sat Nov 23 18:20:01 2024
    Roland Clobus wrote on Sat, 23 Nov 2024 15:34:49 +0100:
    > On 19/11/2024 14:17, Samuel Thibault wrote:
    > > Cc-ing debian-accessibility at least for information.
    > >
    > > Roland Clobus wrote on Tue, 19 Nov 2024 12:31:15 +0100:
    > > > (I've noticed e.g. 'Chinese letter - Chinese letter' being spoken
    > > > at 1:00) -> i.e. should the list of languages be reduced for this
    > > > specific variant of the installer, because the speech module cannot
    > > > read it?
    > >
    > > Ideally that could be implemented in localechooser, by looking at the
    > > list of voices in espeak to filter the list.
    >
    > The next screen typically works fine, so there appears to be no need to
    > remove languages from this list.
    >
    > However, could localechooser use a pre-rendered pronunciation for the
    > language-selection question (activated only when espeak is active)?
    > E.g.:
    > # Prerendering of the language file:
    > echo "Ελληνικά" | espeak-ng -x -v el
    > # Then the line would become "28: Greek - [[,elinik'a]]"

    That'd be complex: espeakup just gets the whole text, it doesn't know
    what piece is in which language.
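
A hedged Python sketch of the pre-rendering step proposed in the quoted message, assuming `espeak-ng` is on the PATH and using its `-x` (print phoneme mnemonics), `-q` (no audio) and `-v` (voice) options. The function names are hypothetical. It only shows how such a line could be generated ahead of time; it does not address the objection above that espeakup receives the whole text without per-piece language information, and whether the `[[...]]` form is spoken correctly through espeakup is not confirmed in this thread.

```python
import shutil
import subprocess
from typing import Optional


def phonemes(text: str, voice: str) -> Optional[str]:
    """Return espeak-ng phoneme mnemonics for text, or None if espeak-ng is absent."""
    exe = shutil.which("espeak-ng")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "-q", "-x", "-v", voice, text],  # -q: silent, -x: phonemes to stdout
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def spoken_entry(index: int, english_name: str, phoneme_string: str) -> str:
    """Build a menu line whose native name is replaced by [[phonemes]]."""
    return f"{index}: {english_name} - [[{phoneme_string}]]"


if __name__ == "__main__":
    rendered = phonemes("Ελληνικά", "el")
    if rendered is not None:
        print(spoken_entry(28, "Greek", rendered))
```

Because the phoneme strings would be generated at build time, the installer image itself would only need to carry the resulting text lines.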

    > > > * The spoken text 'Prompt. For help' does not speak the most important bit,
    > > > i.e. that the question mark will show the help text
    > >
    > > This is
    > >
    > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=690343
    > >
    > > forwarded upstream
    > >
    > > https://github.com/espeak-ng/espeak-ng/issues/150
    >
    > With no upstream activity since 2016, should the text in the installer be changed?

    That's a possibility, but the bug really wants to be fixed by somebody
    at some point, because such prompts do happen here and there.

    > Piper has an MIT license and the voices appear to be less restrictive,
    > but I didn't look at it too deeply.
    >
    > Regarding size: perhaps the netinst image cannot handle the growth, but
    > e.g. a GNOME live image (already at 4GB) can have a bit more.

    But it's not just "a bit". Look at the size per language of piper. We
    cannot just add a GB of voices for the various languages that d-i
    supports.

    > Or there could even be an a11y live image (or are there already Debian
    > derivatives that handle this?)

    Some derivatives do this, yes.

    Samuel
