Suggestion for the Italian “eufoniche” rule

This is a good rule because, if properly implemented, it is simple yet 100% reliable, and it helps polish and normalize Italian texts to modern standards in this respect.

From what I saw, it is currently limited to “E” (as in “ed”). I propose to extend it to “A” and “O” (as in “ad” and “od”), allowing for these two commonly accepted exceptions:

  • “ad es” (including “ad es.”)
  • “ad esempio”

If the same rule cannot be extended to cover the three cases, I propose to duplicate it, renaming the resulting three rules as follows:

  • Regola D eufonica (“ad”)
  • Regola D eufonica (“ed”)
  • Regola D eufonica (“od”)

(only the “ad” rule needs the exceptions)
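To make the proposal concrete, here is a rough sketch in Python of the "remove only" check for all three cases. This is not LanguageTool code: the function name, the exception table and the regular expression are my own assumptions, and a real rule would work on tokens rather than raw text.

```python
import re

# Hypothetical illustration, not part of LanguageTool.
# Accepted exceptions from the proposal: "ad esempio" and "ad es" / "ad es."
EXCEPTIONS = {("ad", "esempio"), ("ad", "es")}

# "ad", "ed" or "od" as a whole word, followed by a word starting with a vowel.
PATTERN = re.compile(r"\b([aeo])d\s+([aeiou]\w*)", re.IGNORECASE)


def find_unneeded_d(text):
    """Return (found, suggestion) pairs where the euphonic d should be dropped."""
    hits = []
    for m in PATTERN.finditer(text):
        vowel = m.group(1).lower()
        next_word = m.group(2).lower()
        if next_word[0] == vowel:
            continue  # same vowel meets itself: the d is allowed
        if (vowel + "d", next_word) in EXCEPTIONS:
            continue  # "ad esempio", "ad es."
        hits.append((m.group(0), f"{m.group(1)} {m.group(2)}"))
    return hits
```

For instance, `find_unneeded_d("Provo ad interpretare il tuo pensiero.")` reports "ad interpretare" with the suggestion "a interpretare", while "ad esempio" and "ed effettua" are left alone.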

Reference:

I am no LanguageTool expert, but it seems there are currently several “eufoniche” rules, of which I had a look at the variants for “e”, “a”, “o”:

  • The “e” case looks good
  • The “a” case includes the exception for “esempio”, but I would also add the abbreviation “es.” as an allowed exception
  • The “o” case looks wrong, even suggesting a correction to “ed ogni” (in violation of the “e” rule) for the example “Puoi vederlo ogni mattina o ogni sera.”

In all cases, the tool flags perfectly good Italian, like “e entrando”, “a esempio”, etc., as a warning. These do not need to be marked in writing. I would propose that the tool only flag cases where the “d” needs to be removed (there is never a need to add one, only to remove one). Let the writer decide the rest. A “d” can be safely flagged or even auto-removed by a rule, but not added (this is much more complicated, and best done manually).

Hi Mike,

Thanks for pointing those problems out. I have a patch ready to fix them.

@Daniel: do we need a bug report, should I forward it to you, or can I just commit?

As far as the second part of your message is concerned (the one related to the suggestion for adding the “d”), we could think about splitting that rule in two: one for removing unwanted eufoniche and one for adding legal eufoniche. The user could then decide which rule to enable based on their own writing style.
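As a rough illustration of the split, here is a small Python sketch with one rule that only removes and one that only adds, each of which can be switched on separately. The rule IDs and function names are invented for this example and do not correspond to anything in grammar.xml; the "add" side covers only the same-vowel case.

```python
import re

# Rule IDs and function names here are invented for illustration only.
AD_EXCEPTIONS = {"esempio", "es"}

# "ad"/"ed"/"od" followed by a word starting with a vowel.
REMOVE_RE = re.compile(r"\b([aeo])d\s+([aeiou]\w*)", re.IGNORECASE)
# Bare "a"/"e"/"o"; the next word is only looked at, not consumed,
# so a sequence like "e a arrivare" is still scanned word by word.
ADD_RE = re.compile(r"\b([aeo])\s+(?=([aeiou]\w*))", re.IGNORECASE)


def rule_remove_d(text):
    """Flag a euphonic d placed before a different vowel."""
    hits = []
    for m in REMOVE_RE.finditer(text):
        vowel, word = m.group(1), m.group(2)
        if word[0].lower() == vowel.lower():
            continue
        if vowel.lower() == "a" and word.lower() in AD_EXCEPTIONS:
            continue
        hits.append((m.group(0), f"{vowel} {word}"))
    return hits


def rule_add_d(text):
    """Suggest a euphonic d where the same vowel meets itself."""
    hits = []
    for m in ADD_RE.finditer(text):
        vowel, word = m.group(1), m.group(2)
        if word[0].lower() == vowel.lower():
            hits.append((f"{vowel} {word}", f"{vowel}d {word}"))
    return hits


RULES = {"D_EUFONICA_REMOVE": rule_remove_d, "D_EUFONICA_ADD": rule_add_d}


def check(text, enabled):
    """Run only the rules the user has enabled."""
    return [
        (rule_id, found, suggestion)
        for rule_id, rule in RULES.items()
        if rule_id in enabled
        for found, suggestion in rule(text)
    ]
```

A writer who only wants the safe behaviour would call `check(text, {"D_EUFONICA_REMOVE"})`; enabling `"D_EUFONICA_ADD"` as well brings back suggestions such as "a arrivare" to "ad arrivare".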

Ciao

Paolo

Here’s the patch:

--- grammar.xml.orig 2014-12-30 13:12:54.000000000 +0100
+++ grammar.xml 2015-02-05 22:15:58.000000000 +0100
@@ -1333,7 +1333,7 @@
             <rule>
                 <pattern>
                     <token>ad</token>
-                    <token regexp="yes">[eiou].*<exception>esempio</exception></token>
+                    <token regexp="yes">[eiou].*<exception regexp="yes">esempio|es</exception></token>
                 </pattern>
                 <message>L'uso della d eufonica dovrebbe essere limitato ai casi di incontro della stessa vocale: <suggestion>a <match no="2"></match></suggestion>.</message>
                 <example type="incorrect">Provo <marker>ad interpretare</marker> il tuo pensiero.</example>
@@ -1365,7 +1365,7 @@
             <rule>
                 <pattern>
                     <token>a</token>
-                    <token regexp="yes">[a].*|esempio</token>
+                    <token regexp="yes">[a].*|esempio|es</token>
                 </pattern>
                 <message>In questo caso l'uso della 'd' eufonica è ammesso: <suggestion>ad <match no="2"></match></suggestion>.</message>
                 <example type="incorrect">Provo <marker>a arrivare</marker> prima di te.</example>
@@ -1382,13 +1382,13 @@
                 <example type="incorrect">Si concentra <marker>e effettua</marker> una bella manovra.</example>
                 <example type="correct">Si concentra <marker>ed effettua</marker> una bella manovra.</example>
             </rule>
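-            <!-- togli od -->
+            <!-- metti od -->
             <rule>
                 <pattern>
                     <token>o</token>
                     <token regexp="yes">[o].*</token>
                 </pattern>
-                <message>In questo caso l'uso della 'd' eufonica è ammesso: <suggestion>ed <match no="2"></match></suggestion>.</message>
+                <message>In questo caso l'uso della 'd' eufonica è ammesso: <suggestion>od <match no="2"></match></suggestion>.</message>
                 <example type="incorrect">Puoi vederlo ogni mattina <marker>o ogni</marker> sera.</example>
                 <example type="correct">Puoi vederlo ogni mattina <marker>od ogni</marker> sera.</example>
             </rule>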

Paolo, feel free to commit directly.

As far as the second part of your message is concerned
(the one related to the suggestion for adding the “d”) we
could think about splitting that rule in two: one for removing
unwanted eufoniche and one for adding legal eufoniche

IMHO, the two should be considered very different in terms of what can be done automatically.

Removing the D with such rules is so reliable, safe and recognized even by institutions like Accademia della Crusca, that you could IMHO safely run the rule on the entire Italian Wikipedia, and it would only get better. It might be an exciting test.
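For anyone who wants to try this on a text dump, a minimal sketch of the automatic removal in Python could look like the following. It is a plain regular-expression pass, not the LanguageTool rule itself; the function name is made up, and it knows nothing about quotations, proper names or other context that a real run over a large corpus would have to consider.

```python
import re

# Words that keep the d after "ad" (the two accepted exceptions in this thread).
KEEP_AFTER_AD = {"esempio", "es"}

# "ad"/"ed"/"od" plus whitespace; the next word is only looked at, not consumed.
_D = re.compile(r"\b([aeo])d(\s+)(?=([aeiou])(\w*))", re.IGNORECASE)


def strip_euphonic_d(text):
    """Drop the euphonic d unless the same vowel follows or an exception applies."""
    def fix(m):
        vowel, gap, first, rest = m.group(1), m.group(2), m.group(3), m.group(4)
        same_vowel = first.lower() == vowel.lower()
        excepted = vowel.lower() == "a" and (first + rest).lower() in KEEP_AFTER_AD
        if same_vowel or excepted:
            return m.group(0)  # leave "ed effettua", "ad esempio" untouched
        return vowel + gap     # "ad interpretare" -> "a interpretare"
    return _D.sub(fix, text)
```

Capitalisation and spacing are preserved because only the "d" itself is dropped, so "Ed anche" becomes "E anche".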

Adding a D automatically is usually neither good (it is generally not the problem that needs fixing) nor safe (human writers may have intentionally decided to avoid cacophonies, to use a modern writing style, etc.). For example, “ad esempio” is allowed, but a modern writer or speaker could choose to write “a esempio”. I will admit that it sounds strange to me (I too might use “ad esempio” or prefer “per esempio”), but I know some people who use “a esempio”. Under a more modern style, a euphonic D might never be used in the “O” case. By “encouraging” additions, we risk confusing people or going backwards in time. In my experience, if a writer chose not to put a D, it is an annoyance to even have that marked.

BTW, the rules we are discussing here are like these default settings used by Accenti, which were tested over the years against large corpora:

accenti.it/schermate/d-eufonica

I’ve committed this; it will be part of tonight’s snapshots (Index of /snapshots/) and will also be live on languagetool.org later tonight.