Hey Community!
This is a follow-up to this topic: Fragen zur Optimierung des LanguageTool (Questions about optimizing LanguageTool)
We have forked LanguageTool and built our own image that we run in our Kubernetes cluster. We are facing memory-consumption issues that we have not been able to fix, although we have run quite a few experiments.
The current setup
- we use the annotation HTTP API with possibly long texts in each call (a call can be ~50 kB: 100,000 characters in total, 75,000 of text and 25,000 of annotation markup)
- this is the Dockerfile: neuris-languagetool/Dockerfile.neuris at 5d84db583430b9a443e797efbeda05524d1265fa · digitalservicebund/neuris-languagetool · GitHub
- we don’t use fasttext, as we only process German text
- this is the server.properties: neuris-languagetool/server.properties at 5d84db583430b9a443e797efbeda05524d1265fa · digitalservicebund/neuris-languagetool · GitHub
- we have disabled suggestions as we don’t need them
- this is the deployment configuration in Kubernetes (3.5 GB memory request, 4 GB memory limit, 2 CPU request, 4 CPU limit):
```yaml
resources:
  requests:
    cpu: 2
    memory: 3584Mi
  limits:
    cpu: 4
    memory: 4096Mi
    ephemeral-storage: 1Gi
```
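For context, one of our calls can be sketched roughly like this (a minimal Python sketch; the endpoint and the `data`/`annotation` parameters follow LanguageTool's public `/v2/check` API, but the host, port, and sample texts are made up):

```python
import json
from urllib.parse import urlencode

def build_check_request(chunks, language="de-DE"):
    """Build the form-encoded body for LanguageTool's /v2/check endpoint.

    `chunks` is a list of ("text" | "markup", content) pairs, matching
    the `data` parameter of the annotation API.
    """
    annotations = [{kind: content} for kind, content in chunks]
    return urlencode({
        "language": language,
        "data": json.dumps({"annotation": annotations}),
    })

body = build_check_request([
    ("text", "Das ist ein Beispieltext mit Fehlern."),
    ("markup", "<br/>"),
    ("text", "Und noch ein Satz."),
])
# POST `body` to e.g. http://languagetool:8010/v2/check with
# Content-Type: application/x-www-form-urlencoded
```

In our real calls the annotation list is much longer, adding up to the ~100,000 characters mentioned above.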
Observation
What we observe is that LanguageTool gets OOM-killed by the kernel (meaning the process exceeds the 4 GB limit) after about 15 requests sent shortly after one another (each approx. 50 kB: 100,000 characters in total, 75,000 text, 25,000 annotations, ~80 matches). Apart from this, there was no other load.
What we tried
- we removed fasttext to make sure it doesn’t add to the memory usage - no change
- we tried to set the `-Xmx` of the Java process to 2 GB or 3 GB - no change
- we tried to set `maxCheckThreads` to 4 (equal to the max number of CPUs) and `maxWorkQueueSize` to 50 - no change
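For reference, the combination of these settings looks roughly like this in our image (a fragment, not our exact Dockerfile: the jar path and port are placeholders, while the class name and server options are the standard LanguageTool ones):

```dockerfile
# Fragment only - jar path and port are placeholders.
# Cap heap and metaspace explicitly so that heap + metaspace +
# thread stacks + native allocations stay below the 4 GiB cgroup limit.
ENTRYPOINT ["java", "-Xmx2g", "-XX:MaxMetaspaceSize=256m", \
            "-cp", "languagetool-server.jar", \
            "org.languagetool.server.HTTPServer", \
            "--config", "server.properties", "--port", "8010", "--public"]
```

with `maxCheckThreads=4` and `maxWorkQueueSize=50` set in server.properties.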
To me, it seems suspicious that this amount of requests exceeds 4 GB of RAM: 15 requests × 100,000 characters is only about 1.5 million characters, i.e. a few megabytes of raw string data, so the bulk of the 4 GB must be going somewhere else.
What we considered trying
- giving it more memory - we would like to avoid that

- changing or configuring the garbage collector, e.g. setting the ratio of young and old generation with `-XX:NewRatio`, or trying ZGC or Shenandoah
- configuring metaspace, e.g. `-XX:MaxMetaspaceSize=256m`
- splitting the requests to LanguageTool to achieve a smaller size per request
- up to now, I haven’t tried the combination of `maxCheckThreads` and setting `-Xmx` to 2 GB/3 GB, but I expect no substantial change
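The request-splitting idea could be sketched like this (a hypothetical helper, assuming each annotation item is a dict with a single "text" or "markup" key; real items may also carry "interpretAs"):

```python
def split_annotations(annotations, max_chars=20_000):
    """Split a LanguageTool `data.annotation` list into batches whose
    combined content length stays under `max_chars`, without splitting
    any single annotation item.

    Note: this ignores sentence boundaries, so matches spanning two
    batches can be missed; splitting only at paragraph breaks would
    be safer.
    """
    batches, current, size = [], [], 0
    for item in annotations:
        length = len(next(iter(item.values())))
        if current and size + length > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(item)
        size += length
    if current:
        batches.append(current)
    return batches
```

Each batch would then become its own /v2/check call; match offsets in the responses are relative to their batch, so they would need to be shifted back by the cumulative length of the preceding batches.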
What do you suggest? Is there anything else you would try?