• The Pleasantry Of Paragraphs

    From Stefan Ram@21:1/5 to All on Sun Jan 26 20:14:09 2025
    TeX has a special demerit for a hyphen at the end of the
    second-to-last line. I don't mind such a hyphen!

    But usually, I dislike a very short word at the end of a
    line, as the "a" in the previous line. Sure, you can always
    write "a~line" to avoid such things, but I think as a general rule,
    one could also have a demerit for a word of one, two, or, maybe,
    three letters at the end of the line. TeX does not have this.

    Then, I also deem it to be elegant when a line ends with
    punctuation like a comma or a period. So, I would add a merit
    for this, which, however, TeX does not do.

    If I had formatted the second paragraph to start with

    But usually, I dislike a very short word at the end of a line,

    then I would have both avoided the short "a" at the end of the
    line and gained a final comma "," there, which to me seems
    more pleasant!

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Didier Verna@21:1/5 to Stefan Ram on Mon Jan 27 13:20:24 2025
    ram@zedat.fu-berlin.de (Stefan Ram) wrote:

    but I think as a general rule, one could also have a demerit for a
    word of one, two, or, maybe, three letters at the end of the line.
    TeX does not have this.

    Then, I also deem it to be elegant when a line ends with punctuation
    like a comma or a period. So, I would add a merit for this, which,
    however, TeX does not do.

    Hi Stefan,

    your aesthetic concerns are legit, but I disagree with your conclusion.

    First of all, TeX uses demerits for "contextual" information which you
    cannot know in advance because it depends on how the paragraph will be
    broken into lines (comparing two lines, figuring out the second-to-last
    line, etc.). So demerits would be the wrong place to do what you want to
    do.

    In fact, the penalty system is a better place, since it's a
    context-free weighting system. Which brings me to the second point:
    the penalty model is already powerful enough to express what you
    need. You mention a~line, but you can be less drastic by inserting a
    specific penalty between those words, and you can even vary the
    penalty in question, for example, depending on the actual size of
    the short word. In a similar fashion you can express a "merit" after
    a punctuation sign by inserting a negative penalty after it.

    Of course, as you mention, it would be cumbersome to do all this by
    hand. But since the capability is already there, it would be a bad
    idea to complicate the Knuth-Plass algorithm even more than it
    already is. A better solution would be to have a pre-processing
    stage on the text. Some people already do that in a semi-automatic
    way with regular expressions.
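
    A pre-processing stage of this kind could be sketched in Python as
    follows. This is only an illustration under stated assumptions, not
    an existing tool: the function name add_penalties, the penalty
    values, and the three-letter threshold are all made up for the
    example. It puts an explicit \penalty in front of the interword
    space that follows a short word (positive, discouraging a break) or
    a word ending in punctuation (negative, encouraging one). The empty
    group {} merely terminates the number, so the ordinary space after
    it still yields the usual interword glue.

```python
# Hypothetical penalty values, keyed by word length; tune to taste.
SHORT_WORD_PENALTY = {1: 600, 2: 400, 3: 200}
# Hypothetical "merit": a negative penalty after punctuation.
PUNCTUATION_MERIT = -50


def add_penalties(paragraph: str) -> str:
    """Return the paragraph with TeX penalties inserted between words.

    The input is assumed to be a single paragraph of plain words;
    runs of whitespace are normalised to single spaces.
    """
    words = paragraph.split()
    out = []
    for i, word in enumerate(words):
        out.append(word)
        if i == len(words) - 1:
            break  # no break opportunity after the last word
        if word[-1] in ",.;:!?":
            # Encourage a line break right after punctuation.
            out.append(f"\\penalty{PUNCTUATION_MERIT}{{}}")
        elif word.isalpha() and len(word) in SHORT_WORD_PENALTY:
            # Discourage a line break right after a short word.
            out.append(f"\\penalty{SHORT_WORD_PENALTY[len(word)]}{{}}")
        out.append(" ")
    return "".join(out)


if __name__ == "__main__":
    print(add_penalties("a line, then"))
```

    With these assumed values, add_penalties("a line, then") returns
    a\penalty600{} line,\penalty-50{} then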


    In fact, I would even argue that there's already too much in the core of
    TeX. It has two penalties related to hyphenation, but they are
    hard-wired together with the discretionary mechanism, which is not as
    general as it could be: it's not possible to manually adjust the
    breaking penalty for individual discretionaries.

    --
    Resistance is futile. You will be jazzimilated.

    Lisp, Jazz, Aïkido: http://www.didierverna.info

  • From Peter Flynn@21:1/5 to Didier Verna on Thu Jan 30 22:50:44 2025
    On 27/01/2025 12:20, Didier Verna wrote:
    ram@zedat.fu-berlin.de (Stefan Ram) wrote:

    but I think as a general rule, one could also have a demerit for a
    word of one, two, or, maybe, three letters at the end of the line.
    TeX does not have this.

    Then, I also deem it to be elegant when a line ends with punctuation
    like a comma or a period. So, I would add a merit for this, which,
    however, TeX does not do.

    Hi Stefan,

    your aesthetic concerns are legit, but I disagree with your conclusion.

    I agree with both concerns; there are also requirements in some styles
    to automate hanging punctuation, which could be combined with the second
    point. The first point is probably going to lead to some uneven
    justification unless \raggedright is in force (or perhaps \RaggedRight).

    Some people already do that in a semi-automatic way with regular
    expressions.

    A considerable amount can be done in (e.g.) XSLT when transforming a
    master XML document into LaTeX. For example, it can detect a URI at
    the end of a sentence (followed by a period, comma, exclamation
    mark, question mark, etc.) and insert \thinspace before the
    punctuation, so that copying and pasting or retyping the URI does
    not pick up the punctuation (when hyperref is not used).
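
    The same URI rule can be sketched with a regular expression in
    Python rather than XSLT. This is a rough illustration, not the
    stylesheet described above: the function name
    space_uri_punctuation and the pattern (only http/https URIs, only
    the punctuation marks . , ; : ! ?) are assumptions.

```python
import re

# A URI is assumed to start with http:// or https:// and run up to the
# last non-punctuation character; trailing punctuation followed by
# whitespace or end of text is treated as sentence punctuation.
_URI_THEN_PUNCT = re.compile(
    r"(https?://\S*[^\s.,;:!?])([.,;:!?]+)(?=\s|$)"
)


def space_uri_punctuation(text: str) -> str:
    """Insert \\thinspace{} between a URI and the punctuation after it."""
    return _URI_THEN_PUNCT.sub(r"\1\\thinspace{}\2", text)


if __name__ == "__main__":
    print(space_uri_punctuation("See http://example.org/a?b=1, then"))
```

    Punctuation inside the URI (such as the "?" in a query string) is
    left alone, since only the trailing run before whitespace or the
    end of the text is separated.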

    it's not possible to manually adjust the
    breaking penalty for individual discretionaries.

    My home style uses \hyphenation{he-li-co-pter} :-)

    Peter
