I converted all POS tags for Dutch to the colon-separated (:) format. The POS tag dictionary seems to work fine.
But reverse lookup by tag is a problem: none of those lookups seem to work anymore.
One of the errors is:
Dutch: Incorrect suggestions: [ergert zich aan] != [(ergeren) zich aan] for rule IRRITEERT_ZICH[1] on input: Hij irriteert zich aan deze fout. expected:< [ergert zich aan] > but was:< [(ergeren) zich aan] >
This is the complete rule:
<rulegroup id="IRRITEERT_ZICH" name="irriteert zich etc">
  <rule>
    <pattern>
      <token inflected="yes">irriteren</token>
      <token regexp="yes">zich|me</token>
      <token>aan</token>
    </pattern>
    <message>U bedoelt vast: <suggestion><match no="1" postag="WKW.*" postag_regexp="yes">ergeren</match> <match no="2"/> <match no="3"/></suggestion>?</message>
    <url>https://onzetaal.nl/taaladvies/advies/ik-irriteer-erger-me-aan-haar</url>
    <example correction="ergert zich aan">Hij <marker>irriteert zich aan</marker> deze fout.</example>
  </rule>
</rulegroup>
This means the reverse lookup of the postag of ‘irriteert’ does not function.
The dictionary input file contains both entries:
irriteert<tab>irriteren<tab>WKW:TGW:3EP
ergert<tab>ergeren<tab>WKW:TGW:3EP
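To make the expected behaviour concrete, here is a toy model of the reverse lookup in Python. It is only a sketch under my own assumptions, not LanguageTool's synthesizer code: the names `build_indexes` and `suggest_form` are made up, and the dictionary is reduced to the two entries above. It takes the tag of the matched form, checks it against the `WKW.*` regex from the rule, and looks for a form of `ergeren` with the same tag.

```python
import re

# The two entries from the dictionary input file: form<TAB>lemma<TAB>postag
DICTIONARY_DUMP = (
    "irriteert\tirriteren\tWKW:TGW:3EP\n"
    "ergert\tergeren\tWKW:TGW:3EP\n"
)


def build_indexes(dump):
    """Return (form -> set of tags, (lemma, tag) -> list of forms)."""
    tags_by_form = {}
    forms_by_lemma_tag = {}
    for line in dump.splitlines():
        if not line.strip():
            continue
        form, lemma, tag = line.split("\t")
        tags_by_form.setdefault(form, set()).add(tag)
        forms_by_lemma_tag.setdefault((lemma, tag), []).append(form)
    return tags_by_form, forms_by_lemma_tag


def suggest_form(matched_form, new_lemma, postag_regex, dump=DICTIONARY_DUMP):
    """Hypothetical helper: inflect new_lemma like matched_form."""
    tags_by_form, forms_by_lemma_tag = build_indexes(dump)
    pattern = re.compile(postag_regex)
    for tag in sorted(tags_by_form.get(matched_form, ())):
        if pattern.fullmatch(tag):
            forms = forms_by_lemma_tag.get((new_lemma, tag))
            if forms:
                return forms[0]
    # Nothing found: fall back to the bare lemma in parentheses,
    # which is the shape of the failing test output.
    return f"({new_lemma})"


print(suggest_form("irriteert", "ergeren", "WKW.*"))
```

With both entries present, this toy lookup produces the form the test expects, so the data itself looks consistent and the failure has to come from somewhere in the lookup chain.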
I understand you have made all changes only locally so far? Could you upload the new dictionaries somewhere (not to git, but to Dropbox or something similar)?
dutch_tags.txt also needs to be updated. It’s a list with all tags. I’m attaching an updated file, created with awk '{print $3}' dictionary.dump | sort | uniq.
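For anyone without awk at hand, the same tag list can be built with a few lines of Python. This is a sketch with hypothetical function names (`unique_tags`, `expand_tag_regex`); unlike the awk command it skips lines with fewer than three fields, and it sorts by code point rather than by locale. The second function illustrates why the list matters, on the assumption that a `postag_regexp` such as `WKW.*` is resolved against the tags listed in dutch_tags.txt: if the new tags are missing from the list, the regex has nothing to match.

```python
import re


def unique_tags(dump_lines):
    """Collect the third whitespace-separated field of each line, sorted and de-duplicated."""
    tags = set()
    for line in dump_lines:
        fields = line.split()
        if len(fields) >= 3:
            tags.add(fields[2])
    return sorted(tags)


def expand_tag_regex(postag_regex, known_tags):
    """Return the known tags that the regex matches in full."""
    pattern = re.compile(postag_regex)
    return [tag for tag in known_tags if pattern.fullmatch(tag)]


# Sample lines in the form<TAB>lemma<TAB>postag layout of the dump.
sample = [
    "irriteert\tirriteren\tWKW:TGW:3EP",
    "ergert\tergeren\tWKW:TGW:3EP",
]

tags = unique_tags(sample)
print(tags)                             # tag list built from the dump
print(expand_tag_regex("WKW.*", tags))  # regex resolved against the updated list
print(expand_tag_regex("WKW.*", []))    # nothing to resolve against an empty list
```

To run it on the real data, pass an open file instead of `sample`, for example `unique_tags(open("dictionary.dump", encoding="utf-8"))`.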
Okay, I will put it in its place. But that raises the question of why it is not put there when -o is in the command… I never knew it was anything other than documentation.
But thanks a lot. It helps. Now I can start editing rules to pass the tests again.