
Testing postag ambiguity

I thought the wiki once had a hint showing how to check whether a POS tag is the only one a token has, but I cannot find it anymore. Anyway, I want to check whether a word has more than one POS tag, no matter which ones, as long as they differ.

Does anyone know how to do that?

I think it is not possible to detect just “more than one POS tag”.
You can use <and> to detect that a token is a “noun” (or adjective, verb…) and something else:

   <pattern>
       <and>
           <token postag="N.*" postag_regexp="yes"/>
           <token postag="N.*" postag_regexp="yes" negate_pos="yes"/>
       </and>
   </pattern>

I know that, but that is not what I need.

I will have to make a large collection (>600) of separate rules to make it work then…

To improve the disambiguator, I need to know what it did so far, to be able to make additional rules.
Is there a different method to enhance the disambiguator systematically?

Perhaps you mean this: the token has the POS tag “NN” and only “NN”:

<token postag="NN"><exception postag="NN" negate_pos="yes"/></token>

And the wiki entry is here: Tips and Tricks | dev.languagetool.org

I just want to detect ambiguity. The idea was to detect all sentences that still contain an ambiguous token. Now I will try to do so per POS tag pair.
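Since writing one rule per tag pair by hand gets tedious, here is a minimal sketch of how such rules could be generated. It reuses the `<and>` construct from the reply above with two exact POS tags per rule. The function names, the `AMBIG_…` rule ids, the message text and the sample tags are all made up for illustration; a real grammar.xml rule would likely also need example sentences, and the tag list has to come from your own tagset.

```python
import re
from itertools import combinations
from xml.sax.saxutils import escape, quoteattr


def pair_rule(tag_a: str, tag_b: str) -> str:
    """Build one rule that matches a token carrying both tag_a and tag_b.

    The rule skeleton (id, name, message) is an assumption for illustration;
    only the <and> with two <token postag=...> elements comes from the thread.
    """
    # Rule ids should be plain identifiers, so replace any non-word characters.
    rule_id = "AMBIG_" + re.sub(r"\W+", "_", f"{tag_a}_{tag_b}").upper()
    label = f"Ambiguity: {tag_a} and {tag_b}"
    return "\n".join([
        f"<rule id={quoteattr(rule_id)} name={quoteattr(label)}>",
        "  <pattern>",
        "    <and>",
        f"      <token postag={quoteattr(tag_a)}/>",
        f"      <token postag={quoteattr(tag_b)}/>",
        "    </and>",
        "  </pattern>",
        f"  <message>{escape(label)}</message>",
        "</rule>",
    ])


def pair_rules(tags):
    """Return one rule per unordered pair of distinct tags."""
    unique = sorted(set(tags))
    return [pair_rule(a, b) for a, b in combinations(unique, 2)]


if __name__ == "__main__":
    # Hypothetical sample tags; replace with the tags of your language.
    for rule in pair_rules(["NN", "VB", "JJ"]):
        print(rule)
        print()
```

Each pair is emitted only once, because an `<and>` over two tags matches regardless of the order in which the tags are listed. Tokens with three or more readings will trigger several of these rules at once.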