Greek rules: "Final n in articles"-rules do not work

Hi Daniel,
Hi all,

I have been using LanguageTool almost from the beginning but decided to join the forum only today. Sorry for that.
I appreciate LanguageTool very much, it is a great tool.

The reason I decided to sign in today is the following:

Some years ago I noticed that some Greek rules do not work properly, at least the “Final n in articles” rules, as far as I have noticed so far.
As far as I can tell, they work correctly only with the built-in example sentences. As soon as you use another sentence, they do not work at all.

Two years ago I tried to solve this problem.
At the time, learning the rule editor and its rule syntax was too much for me, so I decided to experiment and play around with the code of the rules involved.
Somehow I managed to make the rules work.
I honestly have no idea anymore how that happened :wink:

After that, my biggest problem was that I had to keep my own grammar.xml file somewhere, and after every update of LanguageTool (the extension as well as the stand-alone version) I had to copy it into the respective directory again.
I had to do this for OmegaT as well, which I use as a translator.
(I also had to do this for the German version, where I added some rules by copying and modifying similar existing rules.)

You can understand that after a while I found this very annoying and troublesome.

So, today I decided to give the rule editor one more try.
After experimenting with Token #2 a bit, I got this:

<!-- Greek rule, 2018-12-08 -->
<rule id="N" name="τελικό ν θηλυκού άρθρου (τη → την)">
    <pattern>
        <marker>
            <token>τη</token>
        </marker>
        <token regexp='yes'>(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>την</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="">Το Σιν Φέιν ψηφίζει υπέρ της συνεργασίας του με <marker>τη</marker> αστυνομία για πρώτη φορά στην ιστορία του κόμματος.</example>
    <example>Το Σιν Φέιν ψηφίζει υπέρ της συνεργασίας του με την αστυνομία για πρώτη φορά στην ιστορία του κόμματος.</example>
</rule>

I pressed “evaluate error pattern” and the rule seems to work now.
After posting this text, I will continue with the rest of the rules which are not working.

What I want to ask you is:

  1. Is Panagiotis Minos still the maintainer of the Greek language? If so, could you please contact me in order to discuss the improvement of the Greek rules?
  2. If that is not the case, how can I submit the corrected or new rules to your development team, so that they can be automatically included in all future versions of LanguageTool?

Thank you all for this great tool and for the help I can find here.

Best Regards

Konstantin

Hi Konstantin, thanks for the rule update. Panagiotis Minos is still the maintainer for Greek. I suggest you post updated rules as an issue in the languagetool-org/languagetool repository on GitHub.

@pminos Could you have a look at the updated rule?

Hi Daniel,

thank you for responding so fast.
Here is the posting I just made:
https://github.com/languagetool-org/languagetool/issues/1288

Hi Daniel,
is Panagiotis Minos still the maintainer for Greek? Is he doing well?
I sent him a PM in January, but still haven’t heard from him.
I also posted the issue “[EL] greek rules: not all work” (#1288 in languagetool-org/languagetool on GitHub) yesterday, and there is still no sign of him.
I really hope he is doing well.
In the meantime, as I mention in my posting, I would like to submit a few rules.
Unfortunately, I do not know what the procedure is.
Yesterday I went to the “Check a LanguageTool XML rule” page and pasted one of my rules into the window, but didn’t know what to do afterwards. I pressed Ctrl+Return, as the hint suggests, but nothing happened.
Please help me out here, in words and steps that even a dummy like me will understand, because I really do not have the time to start reading the instructions again.
Thank you in advance for your reply (if it is easier for you, you can reply in German as well).

Best Regards
Konstantin

Hi Konstantin, I last heard from Panagiotis about 4 months ago. If you have rules and the online rule editor doesn’t complain about them, you can just paste the XML here and we will add them to LT. It would be good to know which category they belong to. Currently, these categories are in use for Greek:

Agreement
Orthography
Redundant Phrases
Homonymy
Syntax
Στίξη (Punctuation)
Υφολογικά λάθη (Stylistic errors)
Φράσεις με δοτική πτώση (Phrases with the dative case)

Hi Daniel, thank you for the reply.
Here are the first 2 rules I would like to submit. They are actually the old rules, written by Minos as far as I know, but I modified them a bit so that they work with all words.
Minos wanted to have a look at them, to check whether he could improve them even further or make them more general, but since he didn’t come back with something better, I think these will do.

rule 1 Orthography:

<rule id="GREEK_ART_FEM_MISSING_N" name="τελικό ν θηλυκού άρθρου (τη → την)" type="grammar">
    <pattern>
        <marker>
            <token>τη</token>
        </marker>
        <token regexp='yes'>(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>την</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="την">Το Σιν Φέιν ψηφίζει υπέρ της συνεργασίας του με <marker>τη</marker> αστυνομία για πρώτη φορά στην ιστορία του κόμματος.</example>
    <example>Το Σιν Φέιν ψηφίζει υπέρ της συνεργασίας του με την αστυνομία για πρώτη φορά στην ιστορία του κόμματος.</example>
</rule>

rule 2 Orthography:

<rule id="GREEK_ART_FEM_EXTRA_N" name="τελικό ν θηλυκού άρθρου (την → τη)" type="grammar">
    <pattern>
        <marker>
            <token>την</token>
        </marker>
        <token>
            <exception regexp="yes">(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</exception>
        </token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, μόνο όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα : κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>τη</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="τη">Για <marker>την</marker> λειτουργία της αναπνοής υπάρχει σε κάθε οργανισμό ξεχωριστό σύστημα οργάνων</example>
    <example>Για τη λειτουργία της αναπνοής υπάρχει σε κάθε οργανισμό ξεχωριστό σύστημα οργάνων</example>
</rule>

The online rule editor doesn’t complain about them.
It would be nice to add them as soon as possible because, as I mentioned, it is quite annoying to have to copy them into the grammar file every time an update comes out.
Thank you.

I get some matches with the rule editor, did you see them, too? If not, which browser do you use? If so, have you checked these are valid matches and not false alarms?

Yes Daniel, I have seen them, and as far as I can tell, the rules work correctly.
I have been using them like this for years and I am quite happy with the results.

I just corrected the post with the rules.
I had both rules in every quote. Sorry about that.

The rules state:
Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ
which means:
The final n is maintained in written texts when the next word begins with a vowel or one of the following: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ

So, if you look at the results again, the matches are real errors that should be corrected in the wiki pages, if I am not mistaken :woozy_face:.

Thanks, I’ve added both rules. I’ve only changed the syntax from (x|y) to [xy] as that’s more compact.

Hi Daniel, thank you for adding the rules. BUT… there is a slight problem.
Changing the rule has apparently broken it.
If you check the following, you will see it for yourself:
Τη Μαρία, τη νύχτα, τη κυρία, τη πέτρα, τη έγκλιση, τη ντουλάπα, τη γκαζιέρα, τη μπουγάδα, τη ψυχή, τη τσάντα.

Checking online, everything is considered wrong.
My LibreOffice recognizes the mistakes correctly.
It should be:
Τη Μαρία, τη νύχτα, την κυρία, την πέτρα, την έγκλιση, την ντουλάπα, την γκαζιέρα, την μπουγάδα, την ψυχή, την τσάντα.

It seems that LanguageTool now checks the letters of “μπ”, “ντ”, “τσ” separately, so a word starting with μ, ν or σ alone also triggers the rule.
Therefore, even “τη νύχτα” is considered wrong.
Please be so kind as to use the rules as I posted them.
Thank you.
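
To illustrate the likely cause: in a regular expression, a character class such as [xy] matches exactly one character, so a digraph placed inside the brackets is read as two separate letters. The fragment below is only a sketch. The character-class form is an assumption about what the compacted pattern may have looked like, not the text that was actually committed, and the letter list is shortened for readability.

```xml
<!-- Alternation: every alternative is matched as a whole, so only words
     starting with κ, π, τ or with the digraphs μπ, ντ, τσ match. -->
<token regexp="yes">(κ|π|τ|μπ|ντ|τσ).*</token>

<!-- Character class (assumed form of the compacted pattern): the brackets
     list single characters, so this is the same as [κπτμνσ].* and also
     matches words starting with μ, ν or σ, for example νύχτα. -->
<token regexp="yes">[κπτμπνττσ].*</token>
```

Because token matching is case-insensitive unless case_sensitive="yes" is set, the second form would also match “Μαρία”, which fits the observation that every phrase in the test sentence was flagged.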

Sorry, fixed now. Will be online later tonight.

Thank you Daniel.

Hello Daniel,
I think it is about time to ask you to integrate the following rules too.
It took me so long to make this request because I was hoping that Panos or I would come up with a better solution, but it seems that the 2 rules we already applied are the best we can have currently.
I even double-checked the Greek grammar as well as the manual of the DGT.
The table below shows the rules already applied as well as the rules to be applied.
I must admit I didn’t look for example sentences that match the rules better, but I believe it doesn’t matter.
If you have any questions, I am at your disposal at any time.
Thank you.

already applied: τη/την
rule 1: <token>τη</token>, <example correction="την">
rule 2: <token>την</token>, <example correction="τη">

to be applied: στη/στην
rule 1: <token>στη</token>, <example correction="στην">
rule 2: <token>στην</token>, <example correction="στη">

to be applied: αυτή/αυτήν
rule 1: <token>αυτή</token>, <example correction="αυτήν">
rule 2: <token>αυτήν</token>, <example correction="αυτή">

Could you please send the complete new XML of the rules? That makes it easier for me to change them.

Hello Daniel,
sorry for not replying yesterday.
Today I decided to look for some example sentences for each rule and here are the results:

Rule 1 στη → στην

<rule id="GREEK_ART_FEM_MISSING_N" name="τελικό ν θηλυκού άρθρου (στη → στην)" type="grammar">
    <pattern>
        <marker>
            <token>στη</token>
        </marker>
        <token regexp='yes'>(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>στην</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="στην">Ο Πέτρος πήγε <marker>στη</marker> αντιπροσωπεία και αγόρασε ένα καινούργιο αυτοκίνητο.</example>
    <example>Ο Πέτρος πήγε στην αντιπροσωπεία και αγόρασε ένα καινούργιο αυτοκίνητο.</example>
</rule>

Rule 2 στην → στη

<rule id="GREEK_ART_FEM_EXTRA_N" name="τελικό ν θηλυκού άρθρου (στην → στη)" type="grammar">
    <pattern>
        <marker>
            <token>στην</token>
        </marker>
        <token>
            <exception regexp="yes">(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</exception>
        </token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, μόνο όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα : κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>στη</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="στη">Το κάπνισμα έχει ζημιογόνα επίδραση <marker>στην</marker> λειτουργία των πνευμόνων.</example>
    <example>Το κάπνισμα έχει ζημιογόνα επίδραση στη λειτουργία των πνευμόνων.</example>
</rule>

Rule 3 αυτή → αυτήν

<rule id="GREEK_ART_FEM_MISSING_N" name="τελικό ν θηλυκού άρθρου (αυτή → αυτήν)" type="grammar">
    <pattern>
        <marker>
            <token>αυτή</token>
        </marker>
        <token regexp='yes'>(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>αυτήν</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="αυτήν">Γι' <marker>αυτή</marker> ξενιτεύτηκα.</example>
    <example>Γι' αυτήν ξενιτεύτηκα.</example>
</rule>

Rule 4 αυτήν → αυτή

<rule id="GREEK_ART_FEM_EXTRA_N" name="τελικό ν θηλυκού άρθρου (αυτήν → αυτή)" type="grammar">
    <pattern>
        <marker>
            <token>αυτήν</token>
        </marker>
        <token>
            <exception regexp="yes">(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</exception>
        </token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, μόνο όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα : κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>αυτή</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="αυτή">Μ' <marker>αυτήν</marker> γειτονεύω.</example>
    <example>Μ' αυτή γειτονεύω.</example>
</rule>

Rule 5 σαυτή → σαυτήν

<rule id="GREEK_ART_FEM_MISSING_N" name="τελικό ν θηλυκού άρθρου (σαυτή → σαυτήν)" type="grammar">
    <pattern>
        <marker>
            <token>σαυτή</token>
        </marker>
        <token regexp='yes'>(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>σαυτήν</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="σαυτήν"><marker>σαυτή</marker> υποκλίθηκα</example>
    <example>σαυτήν υποκλίθηκα</example>
</rule>

Rule 6 σαυτήν → σαυτή

<rule id="GREEK_ART_FEM_EXTRA_N" name="τελικό ν θηλυκού άρθρου (σαυτήν → σαυτή)" type="grammar">
    <pattern>
        <marker>
            <token>σαυτήν</token>
        </marker>
        <token>
            <exception regexp="yes">(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</exception>
        </token>
    </pattern>
    <message>Το τελικό ν διατηρείται στον γραπτό λόγο, μόνο όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα : κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>σαυτή</suggestion></message>
    <short>Πρόβλημα ορθογραφίας</short>
    <example correction="σαυτή"><marker>σαυτήν</marker> μεγάλωσα</example>
    <example>σαυτή μεγάλωσα</example>
</rule>

Sorry, I’m still not sure what to do now: rule ids such as GREEK_ART_FEM_MISSING_N appear more than once in your list. Are these the same rules? Rule ids need to be unique.

Sorry Daniel, I forgot to mention this, although I thought about it. My bad.
I have them like this in my grammar.xml and I believe they do not cause any conflicts.

Yes, it is the same rule but it applies to the various pronouns which you see after the rule-number.

What do you suggest we do about the id?

Could you run testrules.sh (when on Linux) or testrules.bat (when on Windows)? It will run all kinds of checks, including ids being unique.
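
One possible way to keep ids unique while still treating the variants as one family is LanguageTool's <rulegroup> element, where the group carries a single id and name and the rules inside it need none of their own. The following is only a sketch: the group id GREEK_FINAL_N_MISSING and the group name are made up for illustration, and the messages are shortened. The patterns and example sentences are taken from the rules posted above.

```xml
<!-- Sketch only: group id and name are hypothetical, messages are shortened. -->
<rulegroup id="GREEK_FINAL_N_MISSING" name="τελικό ν (τη → την, στη → στην)">
    <rule>
        <pattern>
            <marker>
                <token>τη</token>
            </marker>
            <token regexp="yes">(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</token>
        </pattern>
        <message>Χρησιμοποιήστε <suggestion>την</suggestion></message>
        <example correction="την">Το Σιν Φέιν ψηφίζει υπέρ της συνεργασίας του με <marker>τη</marker> αστυνομία για πρώτη φορά στην ιστορία του κόμματος.</example>
    </rule>
    <rule>
        <pattern>
            <marker>
                <token>στη</token>
            </marker>
            <token regexp="yes">(α|ε|ι|η|υ|ο|ω|ά|έ|ί|ή|ύ|ό|ώ|ϊ|ϋ|ΐ|ΰ|κ|π|τ|ξ|ψ|γκ|μπ|ντ|τσ|τζ|a|e|i|o|u|b|d|ch|z|g|k|l|p|q|t|x).*</token>
        </pattern>
        <message>Χρησιμοποιήστε <suggestion>στην</suggestion></message>
        <example correction="στην">Ο Πέτρος πήγε <marker>στη</marker> αντιπροσωπεία και αγόρασε ένα καινούργιο αυτοκίνητο.</example>
    </rule>
</rulegroup>
```

Whether a rule group or six separately named ids fits better is a decision for the maintainers. The sketch only shows that a single id can cover several related patterns.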