
Negated <exception> with regex not working in disambiguator

I’m working on adding a new language to LT.
My devised POS tags:
FL.* - Verb
EM.* - Noun
ND.* - Determiner

I’m trying to create some disambiguation rules, almost the same as the ones in the docs:
Determiner + VERB/NOUN → NOUN

<rule name="determiner + verb/EM -> EM" id="ND_FL_EM">
    <pattern>
        <token postag="ND"></token>
        <marker>
            <and>
                <token postag="FL.*" postag_regexp="yes"/>
                <token postag="EM.*" postag_regexp="yes"><exception negate_pos="yes" postag_regexp="yes" postag="(FL|EM):.*"/></token>
            </and>
        </marker>
    </pattern>
    <disambig postag="EM" />
</rule>

As I understand it, the rule should match a token that has both an FL and an EM reading (excluding any EM token that also has readings other than EM.* or FL.*).
However, <exception negate_pos="yes" postag_regexp="yes" postag="(FL|EM).*"/> does not seem to work.

If I remove <exception>, the rule works. Am I missing something?

Thanks

Isn’t negate_pos in the exception the issue?

Shouldn’t that work by excluding all EM tokens with readings other than FL/EM tags?

I am having a hard time understanding exactly what you are trying to achieve. As a test, you could remove the negation from the exception and list all the other tags explicitly. Double negations are always tricky and hard to understand.

The SENT_END postag could be a problem. Perhaps you need to add it:

<exception negate_pos="yes" postag_regexp="yes" postag="(FL|EM):.*|SENT_END"/>

That seems to be the case. Without SENT_END, the disambiguator does not match the word if it’s the last one in the sentence. Thanks a lot.
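
For anyone finding this later, here is roughly what the rule from the first post looks like with the suggested exception applied. It is only a sketch assembled from the snippets in this thread. It assumes the tags have the form FL:... / EM:... so that the (FL|EM):.* regex matches them, and that the last token of a sentence carries an extra SENT_END reading, which seems to be what the negated exception was tripping over.

```xml
<rule name="determiner + verb/EM -> EM" id="ND_FL_EM">
    <pattern>
        <!-- preceding determiner -->
        <token postag="ND"></token>
        <marker>
            <and>
                <!-- the token must have a verb reading ... -->
                <token postag="FL.*" postag_regexp="yes"/>
                <!-- ... and a noun reading, but no reading other than
                     FL/EM. SENT_END is listed so that the extra reading
                     on a sentence-final word does not trigger the
                     negated exception. -->
                <token postag="EM.*" postag_regexp="yes"><exception negate_pos="yes" postag_regexp="yes" postag="(FL|EM):.*|SENT_END"/></token>
            </and>
        </marker>
    </pattern>
    <disambig postag="EM" />
</rule>
```

If other readings of this kind get in the way (for example a paragraph-end tag, if your setup adds one), they would presumably have to be added to the regex in the same way.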

I agree about the complexity; it took me a while to wrap my head around the negated exception as explained in the docs.