
Rules for SRXSentenceTokenizer

(rahul) #1


LanguageTool takes the complete text and runs the grammar check without first splitting it into sentences. Because the text isn’t tokenized, only the first word of the whole text is checked against the uppercase rule. If the text contains two sentences, the uppercase rule isn’t checked for the second one.

It works fine when I create a new web project and include the library, but when I include it in an existing project, it doesn’t work. Looking at the code for SRXSentenceTokenizer, I see that it loads segmentation rules for sentence tokenization, but I can’t find the rule file. Does anyone know the directory path of that file, if there is one?

Thank you.

(Daniel Naber) #2

On 2013-08-28 22:58, rahul [via LanguageTool User Forum] wrote:

> library. But when I include it in existing project, it doesn’t work.

The segmentation file is org/languagetool/resource/segment.srx inside
languagetool-core-2.2.jar. If the file is not found, you’ll get an
Exception. Could you post how you use LanguageTool in your code?
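
One way to confirm whether that file is actually visible to the application is to ask the class loader for it directly. The following is only a minimal sketch: the class name `SegmentResourceCheck` and the helper `locate` are made up for illustration, and the resource path is the one quoted above. In a web application, the thread's context class loader is usually the one to ask.

```java
import java.net.URL;

public class SegmentResourceCheck {
    // Resource path as quoted in this thread (assumed unchanged in your version).
    public static final String SRX_PATH = "org/languagetool/resource/segment.srx";

    /** Returns the URL the given class loader resolves for the path, or null if it is not visible. */
    public static URL locate(ClassLoader loader, String path) {
        return loader.getResource(path);
    }

    public static void main(String[] args) {
        // Prefer the context class loader; fall back to the loader of this class.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = SegmentResourceCheck.class.getClassLoader();
        }
        URL url = locate(loader, SRX_PATH);
        if (url == null) {
            System.out.println("segment.srx is NOT visible on the classpath");
        } else {
            // The URL shows which jar (or directory) the file was loaded from.
            System.out.println("segment.srx found at " + url);
        }
    }
}
```

If the printed URL points to an unexpected jar, that would hint at a classpath conflict.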


(rahul) #3

I use the code below in my project:

//Load American English
enLangTool = new JLanguageTool(new AmericanEnglish());
enLangTool.disableRule("MORFOLOGIK_RULE_EN_US"); //Disable Spell Check

matches = enLangTool.check(textToAnalyze);

I also tried using SRXSentenceTokenizer on its own, but with the same result: it doesn’t tokenize.

SRXSentenceTokenizer senTokenizer = new SRXSentenceTokenizer(new AmericanEnglish());
List<String> sentences = senTokenizer.tokenize("This is test to sentence tokenizer. Does it tokenize properly? Let's check.");
for (String s : sentences) {
    System.out.println(s); // loop body is missing from the post; printing each sentence is assumed
}

I don’t use Maven in my project. I have included all the required libraries in the classpath.

(rahul) #4

It doesn’t throw any exception either. :(

(Daniel Naber) #5

That’s strange, your example with using the sentence tokenizer directly
works for me. Maybe there’s a conflict with other libraries of your
project. Can you list those dependencies? Is segment.jar in your
classpath? But I guess you will need to use a debugger to find the