
Greek rules: "Final n in articles"-rules do not work


#21

I just put the above 6 rules into the Greek grammar.xml and ran “testrules.sh el-GR”.
This is what I get:

Known languages: [English, English (US), English (GB), English (Australian), English (Canadian), English (New Zealand), English (South African), Persian, French, German, German (Germany), German (Austria), German (Swiss), Simple German, Polish, Catalan, Catalan (Valencian), Italian, Breton, Dutch, Portuguese, Portuguese (Portugal), Portuguese (Brazil), Portuguese (Angola preAO), Portuguese (Moçambique preAO), Russian, Asturian, Belarusian, Chinese, Danish, Esperanto, Galician, Greek, Japanese, Khmer, Romanian, Slovak, Slovenian, Spanish, Swedish, Tamil, Tagalog, Ukrainian, Testlanguage]
Running XML validation for el/el-GR/grammar.xml...
No rule file found at /org/languagetool/rules/el/el-GR/grammar.xml in classpath
Running pattern rule tests for Greek... Exception in thread "main" java.lang.AssertionError: Greek rule GREEK_ART_FEM_MISSING_N[1]:
"Ο Πέτρος πήγε στη αντιπροσωπεία και αγόρασε ένα καινούργιο αυτοκίνητο."
Errors expected: 1
Errors found   : 2
Message: Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>στην</suggestion>
Analyzed token readings: [/SENT_START*] Ο[Ο/null*]  [ /null*] Πέτρος[Πέτρος/null]  [ /null*] πήγε[πήγε/null]  [ /null*] στη[στη/null]  [ /null*] αντιπροσωπεία[αντιπροσωπεία/null]  [ /null*] και[και/null]  [ /null*] αγόρασε[αγόρασε/null]  [ /null*] ένα[ένα/null]  [ /null*] καινούργιο[καινούργιο/null]  [ /null*] αυτοκίνητο[αυτοκίνητο/null] .[./SENT_END*]
Matches: [GREEK_ART_FEM_MISSING_N[1]:14-17:Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>στην</suggestion>, GREEK_ART_FEM_MISSING_N[1]:14-17:Το τελικό ν διατηρείται στον γραπτό λόγο, όταν η επόμενη λέξη αρχίζει από φωνήεν ή από ένα από τα: κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ. Χρησιμοποιήστε <suggestion>στην</suggestion>]
	at org.junit.Assert.fail(Assert.java:88)
	at org.languagetool.rules.patterns.PatternRuleTest.testBadSentences(PatternRuleTest.java:324)
	at org.languagetool.rules.patterns.PatternRuleTest.testGrammarRulesFromXML(PatternRuleTest.java:267)
	at org.languagetool.rules.patterns.PatternRuleTest.runTestForLanguage(PatternRuleTest.java:192)
	at org.languagetool.rules.patterns.PatternRuleTest.runGrammarRulesFromXmlTestIgnoringLanguages(PatternRuleTest.java:142)
	at org.languagetool.rules.patterns.PatternRuleTest.main(PatternRuleTest.java:580)
Running disambiguator rule tests...
Running disambiguation tests for Greek...
1 rules tested (37ms)
Tests successful.
Running XML bitext pattern tests...
Tests successful.
Validating false-friends.xml...
Validation successfully finished.

This makes me believe that there is no conflict with the id, but that this one rule contains some other problem which I cannot identify at the moment.

What do you think, Daniel?


#22

This is the grammar.xml I tested.
I am attaching it so that you can check it yourself.

grammar.xml.zip (8.2 KB)


(Daniel Naber) #23
Errors expected: 1
Errors found   : 2

The example sentence “Ο Πέτρος πήγε στη αντιπροσωπεία και αγόρασε ένα καινούργιο αυτ…” is matched by 2 rules. So it seems the rules don’t just have the same id but are also identical or almost identical? In that case, they could maybe be merged or at least put into a common <rulegroup>.
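To illustrate how one example sentence can yield "Errors found: 2", here is a minimal Python sketch. It is not LanguageTool code: the two regular expressions are hypothetical stand-ins for two near-identical rules that share an id, and the check on the following letter is reduced to "α" for brevity.

```python
import re

# Hypothetical stand-ins for two near-identical rules that ended up with the same id.
RULES = [
    ("GREEK_ART_FEM_MISSING_N", re.compile(r"\bστη(?=\s+α)")),
    ("GREEK_ART_FEM_MISSING_N", re.compile(r"\bσ?τη(?=\s+α)")),
]

def find_matches(sentence):
    """Return (rule_id, start, end) for every match of every rule."""
    return [
        (rule_id, m.start(), m.end())
        for rule_id, pattern in RULES
        for m in pattern.finditer(sentence)
    ]

sentence = "Ο Πέτρος πήγε στη αντιπροσωπεία και αγόρασε ένα καινούργιο αυτοκίνητο."
for match in find_matches(sentence):
    print(match)
```

Both patterns report the same span, 14-17, under the same id, which is the situation shown in the "Matches:" line of the log. Keeping only one of the two patterns would bring the count back to the expected single error.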


#24

I am not sure, but I suspect it has to do with the very strong similarity between “την” and “στην”, where “την” is the rule you integrated a few weeks ago.
Would you agree with me?

If that is the case, I do not know how to merge these two rules.
Something like “(σ)την” I suppose?
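The “(σ)την” idea can be sketched with an optional prefix in a regular expression. The following is plain Python, not LanguageTool's XML rule syntax. The set of following letters is taken from the rule message in the log above (vowel, κ, π, τ, γκ, μπ, ντ, τσ, τζ, ξ, ψ), and the names `KEEP_N_BEFORE`, `MISSING_N` and `find_missing_n` are made up for illustration.

```python
import re

# Word beginnings before which the final ν is kept, per the rule message.
# "τσ" and "τζ" are already covered by "τ".
KEEP_N_BEFORE = r"(?:[αεηιουωάέήίόύώκπτξψ]|γκ|μπ|ντ)"

# The optional "σ" lets a single pattern cover both "τη" and "στη".
MISSING_N = re.compile(rf"\b(σ?τη)(?=\s+{KEEP_N_BEFORE})")

def find_missing_n(sentence):
    """Return (start, end, suggestion) for each article missing its final ν."""
    return [
        (m.start(1), m.end(1), m.group(1) + "ν")
        for m in MISSING_N.finditer(sentence)
    ]

print(find_missing_n("Ο Πέτρος πήγε στη αντιπροσωπεία."))
print(find_missing_n("Είδα τη κόρη του."))
```

Because the suggestion is built from whatever was matched, “στη” yields “στην” and “τη” yields “την”, and each position is reported only once.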


(Daniel Naber) #25

With the ZIP you attached I have a different failure: java.lang.AssertionError: Greek: Incorrect suggestions: [αυτήν] != [σαυτήν] for rule GREEK_ART_FEM_MISSING_N4[1] on input: σαυτή υποκλίθηκα expected:<[αυτήν]> but was:<[σαυτήν]> - (I had to add “4” to the rule id to make it unique)
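The mismatch in that message can be reproduced with a small Python sketch. It assumes, and this is only a guess about the attached file, that the suggestion is built from the matched token plus “ν” while the example's expected correction was written without the “σ”. The pattern and the helper names are hypothetical.

```python
import re

# Hypothetical pattern: "αυτή" with an optional "σ" prefix, before a vowel.
PRONOUN = re.compile(r"\b(σ?αυτή)(?=\s+[αεηιουωάέήίόύώ])")

def suggestions(text):
    """Build each suggestion from the matched token, so the prefix is kept."""
    return [m.group(1) + "ν" for m in PRONOUN.finditer(text)]

def check_example(text, expected):
    """Compare the expected correction with what the pattern produces."""
    found = suggestions(text)
    assert found == expected, f"Incorrect suggestions: {expected} != {found}"

check_example("σαυτή υποκλίθηκα", ["σαυτήν"])  # passes
# check_example("σαυτή υποκλίθηκα", ["αυτήν"]) would raise an AssertionError
```

Under that assumption, the fix is either to correct the expected value in the example to “σαυτήν” or to use an example sentence that contains “αυτή” without the prefix.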


#26

I suppose this confirms my suspicion.
“αυτήν” and “σαυτήν” also differ only by the “σ”.


(Daniel Naber) #27

I have committed the rules now, and the tests don’t fail.


#28

Fantastic, I am most grateful for your time and efforts to get this right.
Thank you Daniel.