Could someone please describe how LT does the reverse lookup of a matching tag? I keep having problems getting some rules to work. As far as I understand it:
1. A word is matched.
2. Its postag is stored.
3. The replacement word is stored.
4. The root word is fetched from the postag dictionary.
5. The replacement word is looked up in the postag dictionary.
6. The inflected forms of this root word are filtered by the regexp, and the ones matching the stored postag are offered as suggestions.
<rulegroup id="GESCHIEDEN" name="geschieden">
  <rule>
    <antipattern>
      <token>uw</token>
      <token>wil</token>
      <token>geschiede</token>
    </antipattern>
    <pattern>
      <token inflected="yes">geschieden</token>
    </pattern>
    <message>De tekst wordt vlotter als je deze herschrijft met <suggestion><match no="1" postag="WKW:.*" postag_regexp="yes">gebeuren</match></suggestion>.</message>
    <url>https://onzetaal.nl/taaladvies/ouderwets-taalgebruik</url>
    <example correction="gebeurde">En zo <marker>geschiedde</marker>, want voor Jan is een woord een woord.</example>
    <example correction="gebeuren">Ook dit helpen dient beheerst te <marker>geschieden</marker>.</example>
    <example correction="gebeurt">Openen <marker>geschiedt</marker> door het duwen tegen de horizontale balk.</example>
    <example correction="gebeurde">De controle <marker>geschiedde</marker> door de pastoor.</example>
  </rule>
</rulegroup>
With postag="(WKW:.*)" you select which reading of the original token you want (a token can have more than one postag), and with postag_replace="$1" you specify the tag to synthesize. In this case you keep the whole original postag, but you can also change it completely or partially.
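A minimal sketch of how that could look in the message of the rule above, assuming everything else in the rule stays unchanged. It only illustrates where the two attributes go; it is not confirmed here that it yields the right suggestion for every form:

```xml
<!-- Sketch only: postag (with a capture group) selects the reading of the
     matched token; postag_replace="$1" reuses that whole tag to synthesize
     the corresponding form of "gebeuren". -->
<message>De tekst wordt vlotter als je deze herschrijft met <suggestion><match no="1" postag="(WKW:.*)" postag_regexp="yes" postag_replace="$1">gebeuren</match></suggestion>.</message>
```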
I tried this, but it does not help. There must be more. Feel free to edit and test…
Strangely enough, it works for the other forms of ‘geschieden’, like ‘geschiedt’.
@dnaber: I think this could be a bug; ‘geschieden’ is in the dictionary as a WKW.* more than once. This is not very common, but also not rare. Could it be that LT is not choosing the correct root?
I guess something is needed to make sure a root is selected; the tag of the matched word is the one to be used (after the regexp replace) to get the right derivative.