Performance optimization for Russian language - self-hosted server with 200 concurrent users

Hi,

We are running a self-hosted LanguageTool server (version 6.7) for a corporate Chrome extension used by ~200 employees. The extension checks Russian text in real time as users type.

Our server setup:

- 4 vCPUs at 3.3 GHz, 8 GB RAM

- Docker container: meyay/languagetool:latest

- JAVA_XMS=1g, JAVA_XMX=6g

- No n-gram models

- fastText enabled for language detection

- cacheSize=1000

- Single instance on port 8010

Current performance:

- After JVM warm-up: ~200-300 ms per request ✅

- Cold start (first request after container restart): 6-11 seconds ❌
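For anyone who wants to reproduce these numbers, here is a minimal TypeScript sketch of how such timings could be collected. It assumes the standard `/v2/check` HTTP endpoint with `text` and `language` form parameters, a server reachable at `http://localhost:8010` (the host is an assumption), and Node 18+ for the global `fetch`. `timeCheck` and `percentile` are hypothetical helper names, not part of LanguageTool.

```typescript
// Time a single check request against a LanguageTool server, in milliseconds.
async function timeCheck(
  baseUrl: string,
  text: string,
  language: string = "ru-RU",
): Promise<number> {
  // URLSearchParams as the body is sent as application/x-www-form-urlencoded.
  const body = new URLSearchParams({ text, language });
  const start = performance.now();
  const response = await fetch(`${baseUrl}/v2/check`, { method: "POST", body });
  if (!response.ok) {
    throw new Error(`Check failed with HTTP ${response.status}`);
  }
  await response.json(); // include response parsing in the measured time
  return performance.now() - start;
}

// Nearest-rank percentile of a list of latencies (p in the range 0..100).
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error("percentile() needs at least one value");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Example usage (not executed here; host and sample text are assumptions):
//   const cold = await timeCheck("http://localhost:8010", "Привет, как дела?");
//   const warm: number[] = [];
//   for (let i = 0; i < 50; i++) {
//     warm.push(await timeCheck("http://localhost:8010", "Привет, как дела?"));
//   }
//   console.log({ cold, p50: percentile(warm, 50), p95: percentile(warm, 95) });
```

The first call after a container restart approximates the cold-start figure, and the percentiles over later calls approximate the warmed-up figure. Sequential requests like these do not reproduce concurrent load.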

Users complain about slowness throughout the working day, not just after restarts. The extension debounces input by 300 ms and additionally triggers an immediate check when a space or punctuation mark is typed.
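To make the trigger logic concrete, here is a minimal TypeScript sketch of the scheme described above. It is an illustration rather than the extension's actual code: `CheckScheduler`, `isImmediateTrigger`, and the exact set of trigger characters are hypothetical.

```typescript
// Characters that trigger an immediate check (assumed set, adjust as needed).
const IMMEDIATE_TRIGGERS = new Set([" ", ".", ",", "!", "?", ";", ":"]);

function isImmediateTrigger(lastChar: string): boolean {
  return IMMEDIATE_TRIGGERS.has(lastChar);
}

class CheckScheduler {
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly runCheck: (text: string) => void;
  private readonly delayMs: number;

  constructor(runCheck: (text: string) => void, delayMs: number = 300) {
    this.runCheck = runCheck;
    this.delayMs = delayMs;
  }

  // Call on every input event with the current text of the field.
  onInput(text: string): void {
    this.cancel(); // any newer keystroke supersedes a pending check

    if (isImmediateTrigger(text.slice(-1))) {
      // A space or punctuation mark was just typed: check right away.
      this.runCheck(text);
      return;
    }

    // Otherwise wait until the user has paused for delayMs.
    this.timer = setTimeout(() => {
      this.timer = undefined;
      this.runCheck(text);
    }, this.delayMs);
  }

  // Drop a pending debounced check, if any.
  cancel(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
  }
}
```

With this scheme every completed word sends at least one request, so the request rate on the server scales with typing speed across all active users.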

Questions:

1. What are the best server.properties settings to maximize performance for Russian language checking with many concurrent users?

2. Does pipelineCaching / pipelinePrewarming actually help for Russian? We saw these options but couldn’t find benchmarks.

3. Is there a way to reduce response time below 200ms on this hardware without n-gram models?

4. Are there any known performance issues specific to Russian language rules that could cause slowness?

Thank you!