LanguageTool startup time in various versions and languages

I recall measuring LanguageTool startup time a long time ago.
I did it again for all versions from 4.1 (March 2018) to 2.1 (April 2013)
for various languages.

Here are the results (timings in seconds):

version  br    ca    fr    de-DE en-US it    nl    eo    es    pl    pt-PT ru    
-------  ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- 
4.1      0.57  0.95  0.97  1.78  1.75  0.52  0.70  0.63  0.55  0.89  1.45  0.84  
4.0      0.60  0.91  0.96  1.73  1.80  0.54  0.68  0.61  0.54  0.80  1.34  0.80  
3.9      0.60  0.94  0.99  1.63  1.73  0.47  0.72  0.59  0.57  1.03  1.33  0.86  
3.8      0.61  0.90  0.88  1.65  1.58  0.48  0.72  0.60  0.51  1.06  1.21  0.77  
3.7      0.60  0.92  0.89  1.38  1.60  0.47  0.63  0.59  0.54  0.96  0.85  0.77  
3.6      0.52  0.86  0.87  1.32  1.44  0.43  0.58  0.52  0.46  0.95  0.79  0.77  
3.5      0.64  0.96  0.85  1.21  1.43  0.45  0.53  0.52  0.47  0.81  0.60  0.68  
3.4      0.53  0.80  0.84  1.21  1.34  0.46  0.52  0.50  0.47  0.87  0.54  0.62  
3.3      0.57  0.88  0.86  1.19  1.38  0.46  0.56  0.52  0.46  0.69  0.52  0.68  
3.2      0.51  0.74  0.80  1.24  1.30  0.38  0.53  0.52  0.44  0.65  0.53  0.66  
3.1      0.49  0.84  0.79  1.22  1.42  0.40  0.55  0.49  0.40  0.65  0.51  0.70  
3.0      0.56  0.84  0.76  1.22  1.40  0.40  0.59  0.57  0.40  0.64  0.52  0.70  
2.9      0.54  0.88  0.79  1.27  1.28  0.39  0.54  0.55  0.46  0.70  0.50  0.62  
2.8      0.51  0.74  0.71  1.01  1.26  0.36  0.50  0.50  0.42  0.66  0.47  0.52  
2.7      0.48  0.71  0.79  1.01  1.34  0.37  0.50  0.46  0.54  0.60  0.48  0.54  
2.6      0.47  0.70  0.93  1.02  1.22  0.37  0.57  0.45  0.50  0.60  0.49  0.44  
2.5      0.49  0.62  0.69  0.88  1.31  0.23  0.57  0.45  0.51  0.59  0.47  0.43  
2.4      0.45  0.63  0.72  1.06  1.27  0.22  0.54  0.44  0.50  0.55  0.45  0.40  
2.3      0.37  0.64  0.80  0.98  1.19  0.24  0.55  0.50  0.48  0.49  0.44  0.40  
2.2      0.33  0.56  0.67  0.84  0.44  0.28  0.49  0.48  0.45  0.48  0.45  0.37  
2.1      0.32  0.51  0.68  0.93  0.39  0.20  0.35  0.48  0.57  0.50  0.45  0.36  

Here is another run to get an idea of how stable measurements are:

version  br    ca    fr    de-DE en-US it    nl    eo    es    pl    pt-PT ru    
-------  ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- 
4.1      0.62  0.97  0.88  1.76  1.62  0.49  0.74  0.60  0.56  0.89  1.36  0.86  
4.0      0.66  1.02  1.05  1.93  1.86  0.50  0.78  0.65  0.63  0.98  1.55  0.94  
3.9      0.63  1.00  0.99  1.67  1.88  0.57  0.68  0.67  0.58  0.90  1.43  0.87  
3.8      0.64  0.96  0.93  1.70  1.75  0.48  0.63  0.58  0.52  0.89  1.39  0.87  
3.7      0.67  1.02  1.05  1.58  1.76  0.54  0.63  0.61  0.51  1.14  0.84  0.81  
3.6      0.60  0.95  0.93  1.40  1.47  0.45  0.55  0.54  0.45  0.80  0.77  0.77  
3.5      0.59  0.88  0.86  1.30  1.43  0.48  0.63  0.56  0.45  0.84  0.60  0.75  
3.4      0.59  0.82  0.87  1.34  1.43  0.42  0.55  0.54  0.45  0.89  0.53  0.63  
3.3      0.59  0.87  0.78  1.25  1.47  0.40  0.58  0.51  0.46  0.87  0.60  0.75  
3.2      0.57  0.73  0.84  1.34  1.40  0.42  0.60  0.57  0.46  0.77  0.51  0.73  
3.1      0.52  0.87  0.75  1.23  1.39  0.41  0.57  0.50  0.41  0.64  0.50  0.73  
3.0      0.54  0.86  0.78  1.26  1.35  0.46  0.59  0.51  0.46  0.67  0.49  0.71  
2.9      0.56  0.93  0.83  1.29  1.40  0.44  0.52  0.54  0.44  0.69  0.55  0.71  
2.8      0.54  0.81  0.72  1.01  1.23  0.42  0.51  0.46  0.45  0.61  0.50  0.55  
2.7      0.52  0.68  0.76  1.19  1.24  0.46  0.49  0.51  0.59  0.66  0.50  0.53  
2.6      0.49  0.65  0.97  1.10  1.30  0.42  0.60  0.50  0.49  0.60  0.43  0.41  
2.5      0.45  0.73  0.69  1.04  1.26  0.25  0.53  0.48  0.52  0.57  0.51  0.45  
2.4      0.49  0.66  0.73  1.01  1.25  0.21  0.52  0.57  0.50  0.59  0.50  0.46  
2.3      0.36  0.58  0.83  1.02  1.28  0.22  0.56  0.46  0.49  0.54  0.45  0.42  
2.2      0.38  0.55  0.69  0.87  0.46  0.21  0.53  0.43  0.47  0.51  0.41  0.40  
2.1      0.30  0.55  0.77  0.92  0.52  0.20  0.39  0.47  0.54  0.51  0.42  0.36  

It clearly shows a tendency for LT to become slower and slower
over time for some languages. A startup time approaching 2 seconds
is not nice.

I have a script that automates those measurements:

  • It downloads and unzips the zip files from the LanguageTool download
    index (then caches them locally for further runs)
  • it measures the time to run LT on an empty input, i.e. the
    time it takes to run something like:
$ echo | java -jar languagetool-commandline.jar -c utf-8 -l en-US -
  • it formats the results in a table as shown above (a rough sketch of the
    timing loop follows below)
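
Roughly, the core of it is this timing loop (just a sketch: the version and
language lists are abbreviated here, and the real script also does the
download/caching and the table formatting):

#!/bin/sh
# Sketch of the timing loop; extend the lists to cover all versions/languages.
for version in 4.1 4.0 3.9; do
  for lang in br ca fr de-DE en-US it nl eo es pl pt-PT ru; do
    start=$(date +%s.%N)    # GNU date, sub-second resolution
    echo | java -jar "LanguageTool-$version/languagetool-commandline.jar" \
      -c utf-8 -l "$lang" - > /dev/null 2>&1
    end=$(date +%s.%N)
    printf '%-8s %-6s %.2f\n' "$version" "$lang" "$(echo "$end - $start" | bc)"
  done
done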

I can share the script, but I don’t see a way to attach a file in this forum.

Isn’t it just the number of rules and data files, i.e. the number of checks? E.g. nothing was structurally changed for Dutch, apart from adding more rules.

The language with the most rules is Catalan (2999 rules, 0.95 sec).
The slowest startup is German (2565 rules, 1.78 sec), followed by
English (2003 rules, 1.62 sec), both of which have fewer rules. So
something else contributes to startup time. I suppose that
German starts slower because it loads files such as
resource/de/{added,words-similar}.txt.

I measure startup time because I generally use LT
via the command line (through a Vim plugin), so I always pay
the startup cost, even for small texts.

I am uploading the script I used for the automated measurements:

measure-startup.zip (917 Bytes)

A better approach would be to start an LT server instead. TexStudio does that on the first query and then uses that server.
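
For example, something like this (a sketch, assuming the standalone package
that ships languagetool-server.jar; the port is arbitrary):

$ java -cp languagetool-server.jar org.languagetool.server.HTTPServer --port 8081
# every later check is then just an HTTP request, with no per-call startup cost:
$ curl --data "language=en-US&text=a simple test" http://localhost:8081/v2/check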

Although this can become an issue, 2 seconds is not much for anybody, nor for any use case that I can recall, since languages are not supposed to be loaded more than a couple of times per session.

At most, in a server setting, less than 2 seconds per language would amount to roughly a minute to load the server with all languages on a consumer-grade PC.
Most consumer-grade PCs are unable to even load that many languages, and… why would they?

That means that it is becoming more functional, just like @Ruud_Baars said.
Maybe this should be a benchmark of usefulness, not something to be concerned about, right?

I think that is indeed the case. Java rules that load databases can correspond to as many XML rules as there are lines in the database.
Each XML rule also has a different performance footprint, since the more complex a rule is, the longer it takes to load.
Look at the temporary impact on parsing performance caused by the recent addition of a massive number of examples in the Dutch module.


Anyway, something was done very right regarding the loading performance of languagetool-core, since:

version  br    ca    fr    de-DE en-US it    nl    eo    es    pl    pt-PT ru    
-------  ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- 
4.1      0.62  0.97  0.88  1.76  1.62  0.49  0.74  0.60  0.56  0.89  1.36  0.86  
4.0      0.66  1.02  1.05  1.93  1.86  0.50  0.78  0.65  0.63  0.98  1.55  0.94 

All languages had their load time decrease in 4.1, even languages with minimal changes such as Portuguese.

Recently I did a load test on the English version of LT. If someone could provide the data to test all the languages, I’d load-test the different versions of LT using that data.

The increase in load time was not caused by the examples, but by one new consistency rule with 8000 word pairs, which is actually only a small subset of all allowed spelling variations for Dutch. So it again comes down to the number of checks. Whether the time is spent reading files or loading the program into memory, I would not know. Maybe there is a possibility to not load and apply all rules all the time. Maybe there could be a ‘rough check’ and a ‘nitty-gritty check’, with rules split over both either by hit rate or manually.
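
As an aside, the command line already allows limiting which rules are applied; whether that also avoids the loading cost, I don't know. The rule IDs below are just placeholders:

# "rough check": run only a hand-picked subset of rules
$ echo "dit is een proefzin" | java -jar languagetool-commandline.jar -l nl \
    --enable RULE_ID_1,RULE_ID_2 --enabledonly -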

You are right. I assumed that was the reason since you made a great number of commits reverting that commit.

Looking at https://languagetool.org/regression-tests/performance-data.csv, it seems that the reason was the other increase in XML rules made on the 24th of February.

So, what was the reason for going through so much effort to revert the added examples?

I was not sure at the time; only later did I discover this. Now I assume (but am not sure) that the examples only affect the tests. In time, I will re-add the examples. Having these LT hits as examples exposed some weaknesses, which led to altered rules, and hence a need to redo the examples as well.
Currently, most of the work is on enhancing the speller and the POS tags.

Maybe a simple way to redo this is just to make a patch with this commit and resubmit it. Probably there won’t be many (if any) merge conflicts at this stage, so it will be easy to reconcile them.

Alternatively, reverting the individual commits that came afterwards will also be effective.

Too complicated for a non-developer.

  1. https://github.com/languagetool-org/languagetool/commit/f0eccfc2dbf443d83df3ff63bae7ee9d6787118f.patch
  2. Save as..
  3. patch.exe -i <patchfile> (a non-Windows equivalent is sketched after this list)
    * if needed: patch.exe --help
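
On a non-Windows machine, roughly the same thing would be (run from the top of the LanguageTool source tree; with git, running git am on the same file would recreate the commit directly):

$ curl -LO https://github.com/languagetool-org/languagetool/commit/f0eccfc2dbf443d83df3ff63bae7ee9d6787118f.patch
$ patch -p1 -i f0eccfc2dbf443d83df3ff63bae7ee9d6787118f.patch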

Next time, if needed, it will be easier to revert in case of trouble. Just type:
git revert <commit name>
It saves non-programmers at least 3 days of work ;)

No git, but Subversion, and no Windows. I am glad I finally mastered a commit at all.

Looking at what you’ve done, it seems you are selling yourself short. If you wish to, I can resubmit that patch once I get to my development PC, but I believe other developers can also do it now, if you ask them.

Please don’t. I will lose track completely.