LanguageTool 5.5 Slower than 4.5

I recently tried to update the version of LanguageTool I was using from 4.5 to 5.5. Version 5.5 seems to be almost 20 times slower than 4.5 when used for automated requests.
Is there something that I am missing?

The Python code I used to verify this is below:

from pylanguagetool import api
import time

# Send the same short text 100 times and time the whole loop,
# including the first (slow) requests.
start = time.perf_counter()
for i in range(100):
    api.check("Test Text", "http://127.0.0.1:8081/v2/", lang='en-US')

print(f'LanguageTool took {time.perf_counter() - start} seconds')

Things you can try:

  • Don’t time the first 3-6 iterations; they tend to be slow
  • Set cacheSize in the config file, which you specify with --config ... when starting the server
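
The first suggestion can be sketched as a small timing helper that runs a few untimed warm-up calls before measuring. This is only a sketch: the helper names benchmark and check_once and the warm-up count of 5 are assumptions made for illustration, and the pylanguagetool call is copied from the script above.

```python
import time
from statistics import mean


def benchmark(call, warmup=5, runs=100):
    """Run `call` `warmup` times untimed, then return per-call timings in seconds."""
    for _ in range(warmup):
        call()  # warm-up requests are deliberately not measured
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        timings.append(time.perf_counter() - start)
    return timings


def check_once():
    """One request against the local server, same call as in the original script."""
    from pylanguagetool import api  # imported lazily so the helper above has no dependencies
    api.check("Test Text", "http://127.0.0.1:8081/v2/", lang="en-US")


# Usage (needs a running LanguageTool server):
#   timings = benchmark(check_once)
#   print(f"mean {mean(timings) * 1000:.1f} ms, max {max(timings) * 1000:.1f} ms")
```

Reporting per-call timings instead of one total makes it easier to see whether only the early requests are slow or every request is.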

Do you have a suggestion for the cacheSize? The values that I have tried don’t seem to change much.

If you always send the same text, any size should be fine. You can also try these settings:

pipelineCaching=true
maxPipelinePoolSize=500
pipelineExpireTimeInSeconds=3600

That appears to work, thanks!

These settings improve response speed a lot for me, too.

I’ve measured with:

% hyperfine --warmup 3 \
  'curl -s --data "language=de-DE&text=Eine erster Versuch" http://localhost:8081/v2/check' \
  "curl -s http://localhost:8081/v2/check?c=1 \
  --data-raw 'data=%7B%22text%22%3A%22Einen+guten+Morgen%22%7D&textSessionId=56639%3A1637566034156&enableHiddenRules=true&motherTongue=de&level=picky&language=auto&noopLanguages=de%2Cen&preferredLanguages=de%2Cen&preferredVariants=en-GB%2Cde-DE%2Cpt-BR%2Cca-ES&disabledRules=WHITESPACE_RULE%2CCONSECUTIVE_SPACES&mode=textLevelOnly'"

Without the settings:

Benchmark 1: curl -s --data "language=de-DE&text=Eine erster Versuch" http://localhost:8081/v2/check
  Time (mean ± σ):     440.5 ms ±  48.1 ms    [User: 5.2 ms, System: 1.6 ms]
  Range (min … max):   387.3 ms … 547.8 ms    10 runs

Benchmark 2: curl -s http://localhost:8081/v2/check?c=1 --data-raw 'data=%7B%22text%22%3A%22Einen+guten+Morgen%22%7D&textSessionId=56639%3A1637566034156&enableHiddenRules=true&motherTongue=de&level=picky&language=auto&noopLanguages=de%2Cen&preferredLanguages=de%2Cen&preferredVariants=en-GB%2Cde-DE%2Cpt-BR%2Cca-ES&disabledRules=WHITESPACE_RULE%2CCONSECUTIVE_SPACES&mode=textLevelOnly'
  Time (mean ± σ):      82.1 ms ±  10.6 ms    [User: 5.2 ms, System: 1.7 ms]
  Range (min … max):    72.4 ms … 119.5 ms    31 runs

With the settings:

Benchmark 1: curl -s --data "language=de-DE&text=Eine erster Versuch" http://localhost:8081/v2/check
  Time (mean ± σ):      34.0 ms ±   3.5 ms    [User: 5.3 ms, System: 1.6 ms]
  Range (min … max):    30.7 ms …  50.0 ms    87 runs

Benchmark 2: curl -s http://localhost:8081/v2/check?c=1 --data-raw 'data=%7B%22text%22%3A%22Einen+guten+Morgen%22%7D&textSessionId=56639%3A1637566034156&enableHiddenRules=true&motherTongue=de&level=picky&language=auto&noopLanguages=de%2Cen&preferredLanguages=de%2Cen&preferredVariants=en-GB%2Cde-DE%2Cpt-BR%2Cca-ES&disabledRules=WHITESPACE_RULE%2CCONSECUTIVE_SPACES&mode=textLevelOnly'
  Time (mean ± σ):      10.3 ms ±   0.5 ms    [User: 4.4 ms, System: 2.2 ms]
  Range (min … max):     9.5 ms …  13.4 ms    278 runs

I’ve extended my German blog post about LanguageTool to recommend these settings.
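
To put the hyperfine results above into one number each, the speedup is simply the mean time without the settings divided by the mean time with them. The short sketch below only does that division on the means reported above; the function name speedup is made up for illustration.

```python
def speedup(before_ms, after_ms):
    """Ratio of mean response time before vs. after enabling the settings."""
    return before_ms / after_ms


# Mean times taken from the hyperfine output above (milliseconds).
benchmark_1 = speedup(440.5, 34.0)   # plain de-DE check
benchmark_2 = speedup(82.1, 10.3)    # textLevelOnly request

print(f"Benchmark 1: about {benchmark_1:.1f}x faster")
print(f"Benchmark 2: about {benchmark_2:.1f}x faster")
```

On these numbers that is roughly a 13x improvement for the first request and roughly 8x for the second.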

Hi guys

After migrating LanguageTool from 4.x to 5.7, I noticed that its HTTP server now works much slower, as already described by @tsonnen. I found this thread and tried to apply the config as advised here:

java -cp "languagetool-server.jar" org.languagetool.server.HTTPServer --config languagetool.cfg --port 8081 --allow-origin

with the following contents in languagetool.cfg:

pipelineCaching=true
maxPipelinePoolSize=500
pipelineExpireTimeInSeconds=3600

but it had no effect on speed, even though @tsonnen, who started this thread, confirmed that it helped. Why might that be in my case? Could it be Windows-specific (I am on Windows), or is the command line or the contents of languagetool.cfg somehow incorrect? Or is it the Java version?

Also, which changes between versions could reduce the speed that much? Did the default values of those three config options change, so that they now have to be set explicitly to get the speed back? Or were many more rules added in the newer version, so that more time is needed to test a given text against them? Finally, would it be possible to add an option to turn off the calculation of replacement suggestions? I don’t need them for batch processing. If so, would that increase the speed?

Thanks in advance for your responses.

Could you share a curl call that’s slow and let us know how long it takes? Is it still slow the second time the same curl is called? In general, LT is slow on first start-up and then fast after the first few requests for a language. Speed then depends on text length and the language used (some languages are slower than others because they have more rules).
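
One way to answer the "is it still slow the second time" question without curl is to send the identical request several times and print each latency separately. The following is a rough sketch using only the Python standard library; it assumes the server listens on localhost:8081 as in the commands above and sends only the text and language form fields. The function names are made up for illustration.

```python
import time
import urllib.parse
import urllib.request

CHECK_URL = "http://localhost:8081/v2/check"  # assumed local server, as in the thread


def build_check_request(text, language, url=CHECK_URL):
    """Build a form-encoded POST request for the /v2/check endpoint."""
    body = urllib.parse.urlencode({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def time_repeated(text, language, repeats=5):
    """Send the same request `repeats` times; return each latency in milliseconds."""
    latencies = []
    for _ in range(repeats):
        request = build_check_request(text, language)
        start = time.perf_counter()
        with urllib.request.urlopen(request) as response:
            response.read()  # read the full body so the whole response is timed
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies


# Usage (needs a running LanguageTool server):
#   for n, ms in enumerate(time_repeated("Test Text", "en-US"), start=1):
#       print(f"request {n}: {ms:.1f} ms")
```

If only the first one or two latencies are high, the slowdown is start-up cost; if all of them are high, the per-request cost is the problem and the config is worth double-checking.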