Currently, setting SENT_END on the last token makes some rules a bit flaky.
Consider these two sentences:

He pointed to it’s reddest area.


He pointed to it’s reddest area

The first generates an error, while the second does not.
The reason is that negate_pos="yes" on the last token in a rule needs to take SENT_END into account, and many rules (if not most) do not. To make the rule work for a sentence that ends on the last word, you have to add "|SENT_END" to the postag attribute. That's a bit tricky to remember.

The patch below to grammar.xml illustrates this:

What's worse, your last token may also get PARA_END (I suspect you can't trigger that in grammar.xml, but it happens on real texts via the command line or the REST API).
So technically, any rule that has negate_pos on the last token needs "|SENT_END|PARA_END" added to its postag.
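
To make the workaround concrete, here is a rough sketch of what the last token of such a rule might look like before and after. The tag pattern VB.* is only a placeholder I picked for illustration, not taken from an actual rule, and I'm assuming postag_regexp="yes" is set so that the "|" alternation is read as a regular expression:

```xml
<!-- Hypothetical last token of a rule; "VB.*" is a placeholder pattern. -->

<!-- Before: does not account for the sentence/paragraph end tags -->
<token postag="VB.*" postag_regexp="yes" negate_pos="yes"/>

<!-- After: SENT_END and PARA_END added, as described above -->
<token postag="VB.*|SENT_END|PARA_END" postag_regexp="yes" negate_pos="yes"/>
```

The same addition would presumably be needed wherever negate_pos can land on the final word of a sentence, which is what makes it easy to forget.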

This may technically also apply to some Java rules (I noticed this issue with SENT_END while writing some of the Ukrainian Java rules, but I don't think I even accounted for PARA_END).

diff --git a/languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml b/languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml
index d4e8d21..68603d2 100644
--- a/languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml
+++ b/languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml
@@ -2990,6 +2990,8 @@
                 <message>Did you mean <suggestion>its <match no="4"/> <match no="5"/></suggestion>?</message>
                 <example correction="its reddest area">For the painting, <marker>it's reddest area</marker> was in the upper left.</example>
+                <example correction="its reddest area">He pointed to <marker>it's reddest area</marker>.</example>
+                <example correction="its reddest area">He pointed to <marker>it's reddest area</marker></example>
             <!-- for it's .*/JJ|NN|NNS::word=for its::pivots=\1,its -->
             <rule id="FOR_ITS_NN" name="for its NN (possessive)">

issue #1205

I think that SENT_START and SENT_END should be handled similarly.

Here's another interesting point. Sometimes the sentence gets \n and PARA_END after SENT_END; here's the AnalyzedSentence (tagged as part of a bigger text):

[<S> Псевдосервіс[Псевдосервіс/null],[,/null] будь[бути/verb:imperf:impr:s:2] ласка[ласка/noun:anim:f:v_naz:xp1,ласка/noun:inanim:f:v_naz:xp2,</S>] <P/> ]

Here “ласка” gets SENT_END, but then the sentence has one more token, “\n”, marked as PARA_END. Interestingly, even though this last token is \n and isWhitespace() returns true, it's still returned as part of sentence.getTokensWithoutWhitespace(). So rules will get \n as a regular token.