I am trying to create a rule (for Icelandic) that matches the word “Breskur” anywhere in a sentence except at the start. But my rule detects “Breskur” both in “Breskur maður…” and in “Maður er Breskur…”:
<rule...> <token postag="SENT_START" negate_pos="yes">Breskur</token> </rule>
Icelandic has not been tagged for LanguageTool, but shouldn't postag="SENT_START" still work?
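I know it’s not necessarily logical, but the sentence start tag is its own token (unlike the sentence end tag). In your rule the negated postag is checked against the word “Breskur” itself, which never carries SENT_START, so the rule matches in every position. So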
<token postag="SENT_START" negate_pos="yes"/> <token>Breskur</token>
should work (not tested). You can use http://community.languagetool.org/analysis/index?lang=is to see LT’s internal analysis.
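For reference, here is roughly how those two tokens could sit inside a complete rule. This is only a sketch and is also untested: the id, name, message text and example sentences are placeholders I made up, and I am assuming you want only “Breskur” underlined (hence the `<marker>`) and only the capitalised form matched (hence `case_sensitive="yes"`, since token matching ignores case by default).

```xml
<!-- Sketch only: id, name, message and examples are placeholders. -->
<rule id="BRESKUR_NOT_AT_SENT_START" name="Breskur inside a sentence">
  <pattern>
    <!-- Any token that is NOT the sentence-start token,
         i.e. some real word must precede "Breskur". -->
    <token postag="SENT_START" negate_pos="yes"/>
    <!-- Only this token is reported as the match. -->
    <marker>
      <token case_sensitive="yes">Breskur</token>
    </marker>
  </pattern>
  <message>Placeholder message for “Breskur” in mid-sentence.</message>
  <!-- Should match: the word is preceded by another word. -->
  <example>Maður er <marker>Breskur</marker>.</example>
  <!-- Should not match: the word directly follows SENT_START. -->
  <example>Breskur maður.</example>
</rule>
```

The two `<example>` elements double as a small test, since LT checks them against the pattern when the rules are tested. Without the `<marker>`, the reported match would also cover the preceding word, because the first token of the pattern consumes it.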
Thanks a lot. This works!