Why is JLanguageTool not thread safe?

Hey guys,

I’m using JLanguageTool to check text.
According to the doc in JLanguageTool.java, JLanguageTool is not thread safe, and you should create one instance per thread.

I’m not sure why JLanguageTool is not thread safe. I tried using a single JLanguageTool instance from multiple threads and got no exceptions. Is the doc out of date, or am I missing something?

If “new JLanguageTool” is called from multiple threads at once, it is very slow. Any suggestions on how to use it without the large latency?

My test Java code is:

  private static void test(Language testLanguage, String testText) throws IOException {
    new Thread(new Runnable() {
      @Override
      public void run() {
        try {
          Stopwatch stopwatch = Stopwatch.createStarted();
          JLanguageTool t = new JLanguageTool(testLanguage);
          System.out.println(testLanguage + " New time: " + stopwatch.elapsed(TimeUnit.MILLISECONDS));
          List<RuleMatch> errors = t.check(testText);
          System.out.println(testLanguage + " Check time: " + stopwatch.elapsed(TimeUnit.MILLISECONDS));
        } catch (IOException e) {
          e.printStackTrace(); // report the failure instead of silently swallowing it
        }
      }
    }).start();
  }

  public static void main(String[] args) throws IOException {
    test(new AustrianGerman(), "Esta vai a ser unha mostra de de exemplo para amosar o funcionamento de LanguageTool.");

    test(new GermanyGerman(), "Esta vai a ser unha mostra de de exemplo para amosar o funcionamento de LanguageTool.");

    test(new SwissGerman(), "Esta vai a ser unha mostra de de exemplo para amosar o funcionamento de LanguageTool.");
  }

With the code above, “new JLanguageTool()” takes more than 2 seconds.

What’s more, the following code takes more than 6 seconds on “new JLanguageTool()” and more than 20 seconds on checking the text.

  public static void main(String[] args) throws IOException {
    Language language = new GermanyGerman();
    for (int i = 0; i < 100; i++) {
      test(language, "Esta vai a ser unha mostra de de exemplo para amosar o funcionamento de LanguageTool.");
    }
  }

What is the correct way to use JLanguageTool?

Thanks in advance.

You’re initializing them all at once because you don’t wait for the first thread to finish. This way the threads block each other while initializing their data structures for the first time. Just create the object once outside a thread (or wait for the thread result) and you’ll see that all following calls to new JLanguageTool() will be fast.

This code should produce 3 threads and 3 copies of the JLanguageTool, 1 per thread.

There are two situations in which thread safety isn’t guaranteed:

  1. The same object is used across threads
  2. Threads update or work on the same object

For example, if your code created a static JLanguageTool object and used that single object across multiple threads, you could run into race conditions or other concurrency issues such as deadlocks.

What they are basically stating is that there are write operations or initializations that could be disrupted if a second thread attempted to perform the same action at the same time. I also have to say that, in a professional setting, dealing with threading issues is far more time-consuming than simply creating a JLanguageTool object inside a function, scoped to that call, rather than trying to manage a single shared instance across multiple threads. Is it optimal from a memory management perspective? No, but memory is cheap.
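If constructing an object inside every call turns out to be too costly, one common middle ground is to keep one instance per thread with ThreadLocal. The sketch below is only an illustration using the standard library: PerThread, its factory, and use() are hypothetical names that are not part of LanguageTool, and the thread itself does not say this is how LanguageTool should be wrapped.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical helper: holds one lazily created instance of T per thread.
final class PerThread<T> {
    private final ThreadLocal<T> local;

    PerThread(Supplier<T> factory) {
        // The factory runs at most once per thread, on that thread's first use.
        this.local = ThreadLocal.withInitial(factory);
    }

    // Runs `work` against the calling thread's own instance.
    <R> R use(Function<T, R> work) {
        return work.apply(local.get());
    }
}
```

With LanguageTool, the factory would presumably be something like () -> new JLanguageTool(new GermanyGerman()), as in the question's code, and the work function would call check(text). Each pool thread then keeps its own instance alive, which trades memory for not rebuilding the object on every request.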

Thanks Daniel.

Initializing the JLanguageTool objects one by one does make the process fast.

But because JLanguageTool is not thread safe, each thread needs to initialize its own JLanguageTool object, and one thread will not wait for another. So the process will still suffer from large latency.

For example, suppose I build an action that will be used by multiple users:

class RunLanguageToolAction {
     public Result run(String language, String text) {
           JLanguageTool t = new JLanguageTool(**);
           ......
     }
}

This action may then be called from multiple threads, and the threads will block each other.

Am I right?
How should I use JLanguageTool in this case?

Thanks.

Thanks. I understand.

To avoid running into deadlock issues, a new JLanguageTool object should be created in each thread.

But the threads block each other while initializing data, which makes the process very slow.

On the LanguageTool homepage (https://languagetool.org/), many users may run checks at the same time. I assume they run in different threads, so why don’t they block each other?

Thanks.

You’ll see that only the very first initialization is slow. After that, it will be fast, no matter how many threads it’s called from. That’s what we do on languagetool.org and it works fine to serve many hundred thousand requests per day from a single machine.
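A minimal sketch of that advice, assuming the slow part really is a one-time cost: build one throwaway instance before any worker starts, then let every task construct its own. WarmUpFirst, factory, and work are hypothetical stand-ins (for new JLanguageTool(language) and tool.check(text)); the code uses only the standard library and is not taken from the languagetool.org server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.function.Supplier;

final class WarmUpFirst {

    // Creates one instance on the calling thread first, then runs `tasks`
    // jobs on a small pool, each job building its own separate instance.
    static <T, R> List<R> warmUpThenRun(Supplier<T> factory,
                                        Function<T, R> work,
                                        int tasks) throws Exception {
        factory.get(); // warm-up: pay any one-time initialization cost up front

        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                // Nothing is shared: every task gets a fresh instance.
                Callable<R> task = () -> work.apply(factory.get());
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // rethrows a task failure instead of hiding it
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

In a server, the warm-up call would typically happen once at startup, before the first request is accepted, so that no user request pays for it.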

Sounds like JLanguageTool has two levels of initialization:

  1. internal initialization (slow, but only needed once per (server) session)
  2. external initialization (fast, but needed for every instance)
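As a toy model of that guess (not LanguageTool's actual internals, which this thread does not show), the two levels can be pictured as static data loaded once behind a lock plus cheap per-instance state. TwoLevelInit and all of its members are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class TwoLevelInit {
    // Counts how often the expensive shared load actually ran.
    static final AtomicInteger sharedLoads = new AtomicInteger();
    private static final AtomicInteger instances = new AtomicInteger();
    private static int[] sharedData; // stands in for large shared resources

    private final int id; // cheap per-instance state

    TwoLevelInit() {
        ensureSharedLoaded();                   // "internal": slow, once per JVM
        this.id = instances.incrementAndGet();  // "external": fast, every instance
    }

    // The lock makes concurrent first-time callers wait for each other,
    // which is the blocking described earlier in the thread.
    private static synchronized void ensureSharedLoaded() {
        if (sharedData == null) {
            sharedLoads.incrementAndGet();
            sharedData = new int[1_000];
        }
    }

    int id() {
        return id;
    }

    // Constructs one instance on each of `threads` threads and returns
    // the total number of instances created so far.
    static int createConcurrently(int threads) throws InterruptedException {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> new TwoLevelInit());
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return instances.get();
    }
}
```

In this model, however many threads construct instances at the same time, the shared load runs only once, and only callers that arrive while it is still in progress have to wait.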

Thank you very much!
I will try it. Thanks for your help.