
Unable to Spellcheck Correctly using LanguageTool Java API


(Rana) #1

I am trying to correct misspelled words in a text file
using the LanguageTool Java API. After going through the LT wiki and https://languagetool.org/ I tried the following example code -

import java.util.List;
import org.languagetool.JLanguageTool;
import org.languagetool.Language;
import org.languagetool.rules.RuleMatch;

String text = "I.- Any reference _in this Section to a panicular genus or species of an anirmgl, cxccpl where the context";
JLanguageTool langTool = new JLanguageTool(Language.AMERICAN_ENGLISH);
// Needed in this old version to load the XML pattern rules
langTool.activateDefaultPatternRules();

List<RuleMatch> matches = langTool.check(text);
for (RuleMatch match : matches) {
    // Line numbers reported by the API are zero-based
    System.out.println("Potential error at line " +
            match.getLine() + ", column " +
            match.getColumn() + ": " + match.getMessage());
    System.out.println("Suggested correction: " +
            match.getSuggestedReplacements());
}

Maven dependency -

<dependency>
  <groupId>org.languagetool</groupId>
  <artifactId>languagetool</artifactId>
  <version>2.0.1</version>
</dependency>

The output is as follows -

Potential error at line 0, column 19: Possible spelling mistake found
Suggested correction: [Lin, Min, ain, bin, din, fin, gin, in, kin, min, pin, sin, tin, win, yin]
Potential error at line 0, column 41: Possible spelling mistake found
Suggested correction: []
Potential error at line 0, column 74: Possible spelling mistake found
Suggested correction: []
Potential error at line 0, column 83: Possible spelling mistake found
Suggested correction: []

Expected output -

Starting check in English (American)...
1. Line 1, column 19
Message: Possible spelling mistake found (deactivate)
Correction: in; win; bin; pin; tin; min; Lin; din; gin; kin; yin; ain; fin; sin; IN; In; Min; PIN
Context: I.- Any reference _in this Section to a panicular genus or sp...
2. Line 1, column 41
Message: Possible spelling mistake found (deactivate)
Correction: particular; funicular
Context: ...I.- Any reference _in this Section to a panicular genus or species of an anirmgl, cxccpl ...
3. Line 1, column 74
Message: Possible spelling mistake found (deactivate)
Correction: animal
Context: ...n to a panicular genus or species of an anirmgl, cxccpl where the context
4. Line 1, column 83
Message: Possible spelling mistake found (deactivate)
Context: ...nicular genus or species of an anirmgl, cxccpl where the context
Potential problems found: 4 (time: 171ms)

I got this output from the LT standalone desktop application. I compared its
installation folder and contents with my source code and API jars
but could not find anything that explains why the desktop application
gives better results.

Also, I want to replace the misspelled words with the first element in the suggestion list.

Any kind of help will be highly appreciated.


(Daniel Naber) #2

This is a very old version; we're currently at 3.5. Please try <version>3.5</version>.

You should be aware that the first suggestion is not guaranteed to be the best.


(Rana) #3

Thanks, Daniel, for the quick reply. It's working with your suggestion.
I am using this -
<dependency>
  <groupId>org.languagetool</groupId>
  <artifactId>language-en</artifactId>
  <version>3.5</version>
</dependency>

Any idea how to correct misspelled words?


(Daniel Naber) #4

If you want to apply a suggestion from LT, you need to work on the string that you pass in, i.e. replace the incorrect part (match.getFromPos() to match.getToPos()) with the suggestion. But as there may be no suggestions or more than one, this will not always produce a correct text. There's no solution for that other than using LT interactively and letting a user select the best suggestion.
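
The replacement step described above can be sketched roughly as follows. This is only an illustration, not LanguageTool code: the `Match` class is a hypothetical stand-in for the values you would read from `match.getFromPos()`, `match.getToPos()` and `match.getSuggestedReplacements()`, and `SuggestionApplier` and `applyFirstSuggestions` are made-up names. It assumes the end offset is exclusive and that matches do not overlap. Replacements are applied from the end of the string backwards so that earlier offsets stay valid, and matches without suggestions are left untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class SuggestionApplier {

    /** Hypothetical stand-in for the data taken from a LanguageTool RuleMatch. */
    static final class Match {
        final int fromPos;              // start offset, inclusive
        final int toPos;                // end offset, assumed exclusive
        final List<String> suggestions; // may be empty

        Match(int fromPos, int toPos, List<String> suggestions) {
            this.fromPos = fromPos;
            this.toPos = toPos;
            this.suggestions = suggestions;
        }
    }

    /** Replaces each matched span with its first suggestion, if it has one. */
    static String applyFirstSuggestions(String text, List<Match> matches) {
        List<Match> sorted = new ArrayList<>(matches);
        // Process the last match first so earlier offsets are not shifted.
        sorted.sort(Comparator.comparingInt((Match m) -> m.fromPos).reversed());

        StringBuilder sb = new StringBuilder(text);
        for (Match m : sorted) {
            if (!m.suggestions.isEmpty()) {
                sb.replace(m.fromPos, m.toPos, m.suggestions.get(0));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "species of an anirmgl, cxccpl where";
        int a = text.indexOf("anirmgl");
        int c = text.indexOf("cxccpl");
        List<Match> matches = Arrays.asList(
                new Match(a, a + "anirmgl".length(), Arrays.asList("animal")),
                new Match(c, c + "cxccpl".length(), Collections.<String>emptyList()));

        // prints: species of an animal, cxccpl where
        System.out.println(applyFirstSuggestions(text, matches));
    }
}
```

Note that "cxccpl" stays as it is because it has no suggestion, and a first suggestion that is wrong would be applied silently, which is exactly the limitation mentioned above.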


(Rana) #5

Yes, you are correct. The first element in the suggestion list does not always give the correct word.
Thanks a lot.