Hi.
I’m new to LanguageTool, and I appreciate the page LanguageTool - Online Grammar, Style & Spell Checker for the Esperanto language. However, there seem to be mistakes that the checker does not catch. I tried to create the following rule using Create a new LanguageTool rule, then updated it manually in expert mode. The checker said that the unit tests pass and that the rule did not trigger false alarms on Wikipedia and Tatoeba. Great! Is this the right place to share it so that it can be added to the tool?
<!-- Esperanto rule, 2017-08-14, by Martin Bodin -->
<rule id="AKUZATIVA_TIKORELATIVO_RILATA_AL_NEAKUZATIVA_SUBSTANTIVO" name="Akuzativan “Ti”-korelativo rilata al neakuzativa substantivo.">
<pattern>
<token postag='V.*tr.*' postag_regexp='yes'></token>
<token postag='T.*akz.*' postag_regexp='yes'></token>
<marker>
<token postag='O.*nak.*' postag_regexp='yes'></token>
</marker>
</pattern>
<message>Aspektas ke ‘<match no="2"/>’ rilatas al ‘<match no="3"/>’, sed ‘<match no="3"/>’ ne akuzativas: probable devintus esti ‘<suggestion><match no="3"/>n</suggestion>’.</message>
<short>Mankas akuzativo</short>
<example correction='frazon'>Mi ne ŝatas tiun <marker>frazo</marker>.</example>
<example>Mi ne ŝatas tiun frazon.</example>
<example correction='ideojn'>Kiu apogas tiajn <marker>ideoj</marker>?</example>
<example>Kiu apogas tiajn ideojn?</example>
</rule>
I also wonder whether this is the right way to write a rule. For instance, I wrote the “postag='V.*tr.*' postag_regexp='yes'” part manually, because the generator produced “postag='V' postag='tr'”, which did not pass the XML checks (by the way, it then displays an error in German: is this intended?). I would like your feedback on this rule before writing other similar rules.
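To make the intent of the pattern concrete, here is a rough Python sketch of the matching logic. It is not how LanguageTool works internally: the part-of-speech tags below are hand-assigned and hypothetical (the real Esperanto tagger may emit different strings), and I am assuming that a `postag` regular expression has to match the whole tag.

```python
import re

# The three postag regexes from the rule, in order:
# transitive verb, accusative "ti-" correlative, non-accusative noun.
PATTERN = [r"V.*tr.*", r"T.*akz.*", r"O.*nak.*"]


def suggest_accusatives(tagged):
    """Return (word, suggestion) pairs for nouns matched by the marker.

    `tagged` is a list of (word, tag) pairs. The tags are hypothetical
    stand-ins for what a tagger would produce.
    """
    hits = []
    for i in range(len(tagged) - len(PATTERN) + 1):
        window = tagged[i:i + len(PATTERN)]
        # Assumption: each regex must match the entire tag.
        if all(re.fullmatch(rx, tag) for rx, (_, tag) in zip(PATTERN, window)):
            noun = window[-1][0]
            hits.append((noun, noun + "n"))  # same suggestion as the rule
    return hits


# Hand-tagged versions of the rule's own examples (tags are made up).
wrong = [("Mi", "R nak"), ("ne", "E"), ("ŝatas", "V tr as"),
         ("tiun", "T akz np"), ("frazo", "O nak np")]
right = wrong[:-1] + [("frazon", "O akz np")]
plural = [("Kiu", "T nak np"), ("apogas", "V tr as"),
          ("tiajn", "T akz pl"), ("ideoj", "O nak pl")]

print(suggest_accusatives(wrong))   # [('frazo', 'frazon')]
print(suggest_accusatives(right))   # []
print(suggest_accusatives(plural))  # [('ideoj', 'ideojn')]
```

The sketch only looks at three adjacent tokens, exactly like the pattern, so it shares the rule's limitation: anything between the correlative and the noun (an adjective, for instance) prevents a match.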
Interestingly, the sanity checks do not pass in https://community.languagetool.org/ruleEditor/expert… because they flag an erroneous sentence! I think this proves that my rule is useful, as it has helped detect a mistake in Wikipedia or Tatoeba. The sentence is “Meti tiu libron en la poŝon ne eblas”, and it should be “Meti tiun libron en la poŝon ne eblas”.
I thus have to look for this sentence in Wikipedia and Tatoeba now.
Martin.
The good news is that the offending sentence (Tatoeba’s sentence number 1450387; I am not authorised to post the link here for some reason) has already been corrected. But this means that the page https community languagetool org ruleEditor expert (sorry, I am not allowed to post the link for some reason) is not up to date with Tatoeba. This is frustrating…
@Jan_Schreiber: Sorry for the multiposting. Here is the error: « Error: XML validation failed: org.xml.sax.SAXParseException; lineNumber: 4; columnNumber: 34; Attribut “postag” wurde bereits für Element “token” angegeben. ». I do indeed have German in my HTTP_HEADER field, but with a very low priority (Spanish, Esperanto, English, French, and Portuguese all have higher priority than German in it…), so it is unlikely that the message has not been translated into any of the higher-priority languages.
Here is how to reproduce it. Go to https community languagetool org ruleEditor2 index (still not able to post a link… this is frustrating) and create a simple rule. The rule should have at least one token of the form “Part-of-speech” with more than one item in it, for instance “O akz” (a noun in the accusative form). This is translated into XML as “<token postag='O' postag='akz'></token>”, which does not validate.
Valid XML would be “<token postag='O akz'></token>”.
So there are two errors here: first, in the XML generator, which should generate valid XML whenever possible; second, in the XML checker, which should probably not display an error in German when not asked to.
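The generator problem can be checked outside the rule editor, since an element with the same attribute twice is not well-formed XML and any conforming parser should reject it. Below is a minimal sketch using Python's standard `xml.etree.ElementTree`. The helper name `is_well_formed` is mine, and this only checks well-formedness, not validity against LanguageTool's rule schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the XML fragment parses, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # Raised, among other cases, for a duplicated attribute.
        return False


# What the rule generator produced for a token "O akz".
generated = "<token postag='O' postag='akz'></token>"
# The hand-corrected form with a single postag attribute.
fixed = "<token postag='O akz'></token>"

print(is_well_formed(generated))  # False
print(is_well_formed(fixed))      # True
```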
Hoping that it can help.
Martin.
P.S.: The Travis build just finished and was accepted. So I guess it really is just the database of the web interface that is not up to date.
Well, it’s not really supposed to be up-to-date, as we only use Tatoeba as a test corpus. In other words, we’re testing LT, not Tatoeba. The same is true for the Wikipedia data.