
Requested enhancement for RuleMatch object

(Rick Meyer) #1

The text we need to check with LanguageTool includes HTML. Since LT doesn't have a filter to skip errors in the HTML code, we tried to work around this by checking the text twice: once with the HTML and once with the HTML stripped out. We would then compare the two lists of RuleMatch objects and build a list of the common errors, using the position information from the "withHtml" version. The issue is that there is no good way to compare the lists to determine which RuleMatch objects refer to the same error. I tried comparing the suggestions, but there are too many cases where no suggestions are returned, so the comparator was returning false positives.

I think the solution that would work for us is for the RuleMatch object to include the word(s) that the error is reported against. Is there any possibility of adding this in a near-future release of LanguageTool?

(Daniel Naber) #2

I think the proper solution to this is to use AnnotatedText, built by AnnotatedTextBuilder. This way, you can define that your HTML is just markup and it will be ignored.
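
As a rough sketch of how that could look: the input first has to be split into plain-text and markup segments, and each segment is then handed to the builder. The `HtmlSegmenter` class and its tag regex below are hypothetical helpers written for illustration, not part of LanguageTool, and the regex is deliberately naive (it does not handle comments, entities, or a `>` inside an attribute value). The LanguageTool calls appear only in the trailing comment so that the snippet compiles without the library on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits an HTML string into text and markup segments. */
public class HtmlSegmenter {

    /** One piece of the input, flagged as markup (a tag) or plain text. */
    public static final class Segment {
        public final boolean markup;
        public final String content;

        Segment(boolean markup, String content) {
            this.markup = markup;
            this.content = content;
        }
    }

    // Naive tag matcher; an assumption for this sketch, not a real HTML parser.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    public static List<Segment> split(String html) {
        List<Segment> segments = new ArrayList<>();
        Matcher m = TAG.matcher(html);
        int last = 0;
        while (m.find()) {
            if (m.start() > last) {
                // Plain text between the previous tag and this one.
                segments.add(new Segment(false, html.substring(last, m.start())));
            }
            segments.add(new Segment(true, m.group()));
            last = m.end();
        }
        if (last < html.length()) {
            // Trailing text after the final tag.
            segments.add(new Segment(false, html.substring(last)));
        }
        return segments;
    }

    // With LanguageTool on the classpath, the segments would be fed to the
    // builder roughly like this (langTool is an existing JLanguageTool):
    //
    //   AnnotatedTextBuilder builder = new AnnotatedTextBuilder();
    //   for (Segment s : HtmlSegmenter.split(html)) {
    //       if (s.markup) builder.addMarkup(s.content);
    //       else          builder.addText(s.content);
    //   }
    //   List<RuleMatch> matches = langTool.check(builder.build());
}
```

The match positions returned this way should refer to the original string including the markup, so a second check on sanitized text and the list comparison would no longer be needed. For real-world HTML, a proper parser is probably a safer way to produce the segments than a regex.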