Is there support to add custom words to the existing dictionary?

Hi

I’m working for a clinical research organization, so I have a requirement to add some custom medical terms to the existing dictionary.
Moreover, I want to do this using the API and not by manually editing the ignore.txt file. Is it possible?

Sudha

It’s a bit of a hack, but you can add words to the ignore.txt file,
e.g. org/languagetool/resource/en/hunspell/ignore.txt for English.

I tried that option and it works, but I want to know if there is any Java class (provided by LanguageTool) that does this operation.

It’s possible, but a bit tricky: you first need to deactivate the
existing spell checker rule by its id, then create a new one which
ignores some words, and then add that new rule. Here’s an example for
British English that now accepts the word “errorone”:

BritishEnglish language = new BritishEnglish();
JLanguageTool tool = new JLanguageTool(language);
String text = "errorone errortwo";

System.out.println(tool.check(text));  // will find two errors

tool.disableRule("MORFOLOGIK_RULE_EN_GB");
MorfologikSpellerRule spellingRule = new MorfologikSpellerRule(JLanguageTool.getMessageBundle(), language) {
   @Override
   public String getFileName() {
     return "/en/hunspell/en_GB.dict";
   }
   @Override
   public String getId() {
     return "NEW_SPELLING_RULE";
   }
};
spellingRule.addIgnoreTokens(Arrays.asList("errorone"));  // list of words to be ignored
tool.addRule(spellingRule);

System.out.println(tool.check(text));  // will only find one error
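
For a larger set of custom terms, such as the medical vocabulary mentioned at the start of this thread, the list passed to addIgnoreTokens could be read from a plain text file instead of being hard-coded. The following is a minimal sketch using only the Java standard library. The class name IgnoreWordLoader, the one-word-per-line format, and the "#" comment convention are assumptions made for illustration; none of them is part of LanguageTool.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a LanguageTool class.
final class IgnoreWordLoader {

    // Turns raw lines into a word list: trims whitespace, skips blank lines
    // and lines starting with '#', and drops duplicates (first occurrence wins).
    static List<String> parse(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            String word = line.trim();
            if (word.isEmpty() || word.startsWith("#")) {
                continue;
            }
            if (!words.contains(word)) {
                words.add(word);
            }
        }
        return words;
    }

    // Reads a UTF-8 text file with one word per line (assumed format).
    static List<String> load(Path file) throws IOException {
        return parse(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

The resulting list could then replace Arrays.asList("errorone") in the example above, for instance by loading a (hypothetical) file named medical-terms.txt.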

This is not working for me when using GermanyGerman.

I disabled MORFOLOGIK_RULE_DE_DE and used my own MorfologikSpellerRule as in the example. In my debugger I also saw the added word in “wordsToBeIgnored”. But the result is still a “possible misspelling” message.

Any ideas?

The id of the rule you need to disable is “GERMAN_SPELLER_RULE”. Yes, I know this is confusing…

Works like a charm, thank you for the great tool and support.

How does this work for French? There is no MORFOLOGIK_RULE_IT_IT? Does it also have another name? I cannot find anything similar for French.

Also, there is no fr_FR.dict, only a fr_FR.dic. Does that mean French has already been ported completely from Hunspell to Java?

How can we add French words to the ignore list?

Same for Spanish, how can we add words to the ignore list?

You can find the rules for French in French.getRelevantRules(). For French, HunspellNoSuggestionRule is used and its ID is HUNSPELL_NO_SUGGEST_RULE. Similarly for Spanish, the rule ID is HUNSPELL_RULE.

Hi,

Ah OK, the connections are getting clearer. But if I use HUNSPELL_RULE for ignoring words that way, it doesn’t work. Is there anything special to know? MORFOLOGIK_RULE_IT_IT seems to work, as it exists.

kind regards

Ah OK, the problem seems to be that I am using the wrong parent class for overriding: Hunspell-based rules need to extend HunspellRule, not MorfologikSpellerRule.
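
Since the spell checker rule IDs do not follow one naming pattern, it may help to keep them in a small lookup table. The sketch below only collects the IDs quoted in this thread; the class name SpellerRuleIds and the language codes used as keys are assumptions for illustration, and the IDs may differ in other LanguageTool versions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup table, not a LanguageTool class.
// The rule IDs are the ones quoted in this thread.
final class SpellerRuleIds {

    private static final Map<String, String> IDS = new LinkedHashMap<>();

    static {
        IDS.put("en-GB", "MORFOLOGIK_RULE_EN_GB");
        IDS.put("en-US", "MORFOLOGIK_RULE_EN_US");
        IDS.put("de-DE", "GERMAN_SPELLER_RULE");
        IDS.put("fr", "HUNSPELL_NO_SUGGEST_RULE");
        IDS.put("es", "HUNSPELL_RULE");
    }

    // Returns the ID of the spelling rule to disable, if one is known.
    static Optional<String> forLanguage(String code) {
        return Optional.ofNullable(IDS.get(code));
    }
}
```

A lookup such as SpellerRuleIds.forLanguage("de-DE") could then supply the argument for tool.disableRule(...) before the replacement rule is added.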

“/en/hunspell/en_GB.dict” is the default path for the dictionary. So how do I set a custom path for my dictionary?

I think you’ll need to extend the class MorfologikAmericanSpellerRule and override getFileName() and getId(). Then make sure that the rule with ID MORFOLOGIK_RULE_EN_US is disabled and your new rule is enabled.

Yes, right, I followed the same process as per your comment, but the API doesn’t show the suggestion list from my custom directory.

Please check the code below:

spellCheker.disableRule("MORFOLOGIK_RULE_EN_US");
MorfologikSpellerRule spellingRule = new MorfologikSpellerRule(
     JLanguageTool.getMessageBundle(), spellCheker.getLanguage()) {
   @Override
   public String getFileName() {
     ClassLoader classLoader = ClassLoader.getSystemClassLoader();
     File file = new File(classLoader.getResource("mySpelling.dict").getFile());
     return file.getName();
   }
   @Override
   public String getId() {
     return "NEW_SPELLING_RULE";
   }
};
spellCheker.addRule(spellingRule);

Could you provide your mySpelling.dict and this code in form of a self-contained test case, so it’s easier for us to reproduce the issue?
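
One detail in the snippet above that may be worth checking: File.getName() returns only the last path component, so getFileName() there yields the bare string "mySpelling.dict" without any directory or leading slash, whereas the earlier working example returns a path of the form "/en/hunspell/en_GB.dict". Whether this is the cause of the problem is not confirmed in this thread. A small standard-library sketch of the difference (the class name FileNameDemo and the path /tmp/dicts/mySpelling.dict are hypothetical):

```java
import java.io.File;

// Hypothetical demo class illustrating java.io.File behavior only.
final class FileNameDemo {

    // Returns only the last path component, as File.getName() does.
    static String nameOnly(String path) {
        return new File(path).getName();
    }

    // Returns the path the File was created from, with separators
    // normalized for the current platform.
    static String fullPath(String path) {
        return new File(path).getPath();
    }

    public static void main(String[] args) {
        String path = "/tmp/dicts/mySpelling.dict"; // hypothetical location
        System.out.println(nameOnly(path)); // prints "mySpelling.dict"
        System.out.println(fullPath(path)); // directory part is kept
    }
}
```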

MorfologikAmericanSpellerRule is a final class, which is why I’m not able to extend it.

You could extend AbstractEnglishSpellerRule; MorfologikAmericanSpellerRule doesn’t add much anyway.

I want to use my own custom suggestion words instead of the suggestions from your API, using Java. The file with these suggested words is placed in my “user.home” directory.

So, is this possible or not? If it is possible, how can I achieve it?

@Override
public String getFileName() {
 return "/en/hunspell/en_GB.dict";
}

In this method, I want to load the dictionary from the user.home directory. Is that possible?
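
As a starting point for experimenting with this question, the location of a file under user.home can be resolved with the standard library alone. This is only a sketch of locating the file: the earlier examples in this thread return classpath-style resource paths from getFileName(), and whether a filesystem path is accepted there is not established in this thread. The class name UserHomeDictionary and the file name used in the usage note are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a LanguageTool class.
final class UserHomeDictionary {

    // Builds <home>/<fileName> as an absolute, normalized path.
    static Path resolve(String home, String fileName) {
        return Paths.get(home).resolve(fileName).toAbsolutePath().normalize();
    }

    // Resolves fileName against the "user.home" system property.
    static Path inUserHome(String fileName) {
        return resolve(System.getProperty("user.home"), fileName);
    }

    // True if the path points to an existing, readable regular file.
    static boolean isUsable(Path path) {
        return Files.isRegularFile(path) && Files.isReadable(path);
    }
}
```

For example, UserHomeDictionary.inUserHome("mySpelling.dict") gives the absolute path of that file in the home directory, and isUsable(...) can confirm the file is present and readable before anything tries to open it.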