Java LanguageTool Extension

Hello Everyone,

I am extending LanguageTool in Java by writing new rules for our own company texts. My issue is that around 0.09% of my text throws a RuntimeException while LanguageTool is parsing it.

I want to find a solution for this. Since the newly developed rules work in most cases, I want to ignore the error in the few cases where it occurs, but I am unable to catch the exception in a try/catch block.

I tried putting the code of the rule that I created, under this method
“public RuleMatch[] match(List<AnalyzedSentence> sentences) throws IOException”
in a try/catch block, but it didn’t catch the error.

I want LanguageTool to skip the rule that throws the runtime exception for a particular text. Does anyone here know how I can do that?

It sounds like it makes more sense to avoid the error in the first place. What’s the full stack trace of the error?

Exception in thread "main" java.lang.RuntimeException: java.lang.StringIndexOutOfBoundsException: begin 0, end 4, length 3
at org.languagetool.JLanguageTool.performCheck(JLanguageTool.java:1233)
at org.languagetool.JLanguageTool.checkInternal(JLanguageTool.java:973)
at org.languagetool.JLanguageTool.check(JLanguageTool.java:903)
at org.languagetool.JLanguageTool.check(JLanguageTool.java:885)
at org.languagetool.JLanguageTool.check(JLanguageTool.java:872)
at org.languagetool.JLanguageTool.check(JLanguageTool.java:862)
at org.languagetool.JLanguageTool.check(JLanguageTool.java:844)
at org.languagetool.JLanguageTool.check(JLanguageTool.java:801)
at org.languagetool.JLanguageTool.check(JLanguageTool.java:785)
at application.Jlangtool.main(Jlangtool.java:59)
Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 4, length 3
at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3751)
at java.base/java.lang.String.substring(String.java:1907)
at org.languagetool.JLanguageTool$TextCheckCallable.findLineColumn(JLanguageTool.java:1896)
at org.languagetool.JLanguageTool$TextCheckCallable.getTextLevelRuleMatches(JLanguageTool.java:1800)
at org.languagetool.JLanguageTool$TextCheckCallable.call(JLanguageTool.java:1767)
at org.languagetool.JLanguageTool$TextCheckCallable.call(JLanguageTool.java:1739)
at org.languagetool.JLanguageTool.performCheck(JLanguageTool.java:1229)
... 9 more

It happens when toRuleMatchArray(ruleMatches) gets the added rule matches.
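
Reading the trace, the exception is raised in JLanguageTool's findLineColumn, and the custom rule's own class does not appear in any frame. That would explain why a try/catch inside match() never sees it: the rule has already returned by then. The message "begin 0, end 4, length 3" is what String.substring reports when the end index lies past the end of the string, so one plausible cause (an assumption, not confirmed in this thread) is a match whose offsets point beyond the text being checked. Below is a minimal plain-Java sketch that reproduces the same call shape and shows a bounds check a rule could run on its offsets before reporting a match. OffsetGuard, isValidSpan and safePrefix are hypothetical names, not part of LanguageTool.

```java
// Hypothetical helper for illustration only; not a LanguageTool class.
final class OffsetGuard {

    private OffsetGuard() {}

    /** Returns true if the span [from, to) lies completely inside the text. */
    static boolean isValidSpan(String text, int from, int to) {
        return from >= 0 && from <= to && to <= text.length();
    }

    /** Returns the text before offset, clamping offset into [0, text.length()]. */
    static String safePrefix(String text, int offset) {
        int end = Math.max(0, Math.min(offset, text.length()));
        return text.substring(0, end);
    }

    public static void main(String[] args) {
        String text = "abc"; // length 3, like the string in the trace

        try {
            // Same call shape as in the trace: end index 4 on a string of length 3.
            text.substring(0, 4);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("Reproduced: " + e.getMessage());
        }

        // A rule could skip a match whose span fails this check.
        System.out.println("valid span? " + isValidSpan(text, 0, 4));

        // Clamping avoids the exception but may hide a real offset bug.
        System.out.println("clamped prefix: " + safePrefix(text, 4));
    }
}
```

Skipping an invalid match is usually the safer of the two options, because clamping silently changes where the match is reported.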

Maybe that’s caused by some rare Unicode characters. Which text exactly causes this?

It happens with some special texts, where there is a tab in between or no start token.
E.g.:
“Hello my name is Mariam”
“Hello my name is Mariam.I am a developer”

Something like this. But I also want to handle this as a general strategy, because I can’t adapt my code to every text scenario. I want LanguageTool to skip the rules that cause an error and still return results for all the other rules, so that things run smoothly for the user.
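
The "skip whatever fails, keep the rest" strategy can be sketched in plain Java as follows. This is only an illustration of the pattern: IsolatedChecks, Result and runAll are hypothetical names, and each check is modelled as a simple function from text to a list of findings rather than as a real LanguageTool rule. Since the trace shows the exception surfacing from JLanguageTool.check(...) and not from inside the rule, the equivalent with LanguageTool would presumably be to wrap the check(...) call itself, disable the suspect rule with disableRule(ruleId), and check again. That part is untested here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in types for illustration; none of these are LanguageTool classes.
final class IsolatedChecks {

    /** Findings from the checks that succeeded, plus the IDs of the checks that threw. */
    static final class Result {
        final List<String> findings = new ArrayList<>();
        final List<String> skippedIds = new ArrayList<>();
    }

    private IsolatedChecks() {}

    /** Runs every check on its own, so one failing check cannot discard the others' results. */
    static Result runAll(Map<String, Function<String, List<String>>> checks, String text) {
        Result result = new Result();
        for (Map.Entry<String, Function<String, List<String>>> entry : checks.entrySet()) {
            try {
                result.findings.addAll(entry.getValue().apply(text));
            } catch (RuntimeException e) {
                // Record the failing check instead of aborting the whole run.
                result.skippedIds.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Function<String, List<String>>> checks = new LinkedHashMap<>();
        checks.put("GOOD_CHECK", t -> Collections.singletonList("checked " + t.length() + " chars"));
        // Deliberately faulty check: reads past the end of short texts.
        checks.put("BAD_CHECK", t -> Collections.singletonList(t.substring(0, 4)));

        Result result = runAll(checks, "abc");
        System.out.println("findings: " + result.findings);
        System.out.println("skipped:  " + result.skippedIds);
    }
}
```

Logging the skipped IDs together with the offending text makes it easier to fix the underlying bug later, rather than hiding it permanently.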