Disambiguation testing


(Ruud Baars) #1

I produced a list of common words whose POS tags are ambiguous. For now I have added the list as a comment in disambiguation.xml, and I plan to tackle the words one by one (or by type, if possible) now that antipatterns are available.

I need tests, however, to be sure I don't introduce more confusion. How can I add test examples to disambiguation rules?


(Daniel Naber) #2

Please see en/disambiguation.xml for examples. Use examples with type="ambiguous" to check that a rule changes the readings, and type="untouched" for cases where the disambiguation pattern doesn’t match or doesn’t change anything.
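
For illustration, a rule carrying both kinds of example might look roughly like the sketch below. The rule id, name, pattern and the readings listed in inputform are made up for this post and have not been checked against the real English tagger; the test will likely fail if inputform does not list exactly the readings the tagger produces.

```xml
<!-- Hypothetical rule: id, name, pattern and readings are illustrative only. -->
<rule id="DT_WALK_NN" name="determiner + walk -> noun">
  <pattern>
    <token postag="DT"/>
    <marker>
      <token>walk</token>
    </marker>
  </pattern>
  <!-- Keep only the NN reading of the marked token. -->
  <disambig action="filter" postag="NN"/>
  <!-- ambiguous: inputform is the marked token before the rule, outputform after it. -->
  <example type="ambiguous"
           inputform="walk[walk/NN,walk/VB,walk/VBP]"
           outputform="walk[walk/NN]">A <marker>walk</marker> in the park.</example>
  <!-- untouched: no determiner precedes "walk", so the pattern should not match. -->
  <example type="untouched">They walk home.</example>
</rule>
```

The existing rules in en/disambiguation.xml remain the reference for the exact attribute syntax.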


(Ruud Baars) #3

Thanks. Does ‘untouched’ mean ‘by all disambiguation’ or ‘by this rule’? The latter would be most helpful.


(Daniel Naber) #4

I think “by this rule”, but please give it a try; I’m not 100% sure.


(Ruud Baars) #5

It is a lot of work to take examples from the corpus and turn them into valid test examples for a disambiguation rule. The format is very different from that of a tagged sentence.

Secondly, since the rules cascade, the test fails on the ‘incoming’ pattern whenever an earlier rule has already changed the tags. That makes testing rules even harder.

Is there an easier way to test a disambiguation rule on live data?