We run LanguageTool against text that will be presented on a web page, so the text naturally includes HTML elements. We use the AnnotatedTextBuilder to separate the text from the HTML. Recently, though, we noticed some odd behavior. Consider the following HTML:
<p>Magma doesn't taste good</p><p>I don't recommend it.</p>
With the <p> and </p> tags placed in the markup field and the rest in the text, the logic appears to concatenate the text elements, so the word “goodI” is checked and, unsurprisingly, marked as misspelled.
Clearly, in this case there should be a period at the end of the first sentence, but we have observed other instances like this where no punctuation was necessary. This is just one example where I was able to reproduce the issue.
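To illustrate what we think is happening, here is a minimal sketch in plain Java. It does not call LanguageTool at all; the class MarkupSplitSketch and the method plainText are hypothetical names made up for this post, and the regex is only a rough stand-in for real HTML parsing. It shows that dropping the tags without inserting anything yields “goodI”, while treating each closing </p> as a paragraph break keeps the two sentences apart.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: this models the text/markup split,
// it is not LanguageTool's actual implementation.
public class MarkupSplitSketch {
    // Rough stand-in for a tag matcher; not a real HTML parser.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Returns the text with all tags removed. Each closing </p> tag
    // contributes paragraphBreak to the output (pass "" to drop it entirely).
    static String plainText(String html, String paragraphBreak) {
        StringBuilder text = new StringBuilder();
        Matcher m = TAG.matcher(html);
        int pos = 0;
        while (m.find()) {
            // Copy the text that sits between the previous tag and this one.
            text.append(html, pos, m.start());
            if (m.group().equalsIgnoreCase("</p>")) {
                text.append(paragraphBreak);
            }
            pos = m.end();
        }
        // Copy any text after the last tag.
        text.append(html.substring(pos));
        return text.toString();
    }

    public static void main(String[] args) {
        String html = "<p>Magma doesn't taste good</p><p>I don't recommend it.</p>";
        // Tags dropped with nothing in their place: the words run together.
        System.out.println(plainText(html, ""));
        // Closing </p> treated as a paragraph break: the words stay separate.
        System.out.println(plainText(html, "\n\n"));
    }
}
```

If I am reading the API correctly, AnnotatedTextBuilder also has an addMarkup(markup, interpretAs) overload that might serve the same purpose as the second call above, for example by interpreting </p> as "\n\n". I have not confirmed that against the version we are running, so please treat it as an assumption.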