Nightly diff

The nightly dump for Dutch contains Java error output. Since I did not change anything, I am ignoring it.

@dnaber Since 28 February, the Portuguese dailies have also been showing inconsistent results. It seems that the Wikipedia file is not being tested in full. This is apparent from the number of detections being cut by more than half (https://languagetool.org/regression-tests/20190302/result_pt-PT_20190302.html). The disambiguation changes made that day do not account for such a drop, and when testing offline, the detections that were dropped are still found. This can be verified both on the website and in the standalone tool.

-Portuguese (Portugal): 35405 total matches
-Portuguese (Portugal): ø0,89 rule matches per sentence
+Portuguese (Portugal): 12150 total matches
+Portuguese (Portugal): ø0,88 rule matches per sentence

This is also apparent from the regression summary.
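
For anyone who wants to spot this kind of drop automatically, here is a minimal sketch. It assumes the diff lines keep the exact "total matches" format pasted above. The function names and the 0.5 threshold are made up for illustration and are not part of the LanguageTool regression tooling.

```python
import re

# Matches lines such as "-Portuguese (Portugal): 35405 total matches".
# The format is assumed from the diff excerpt quoted in this thread.
LINE_RE = re.compile(r"^([+-])(.+?): (\d+) total matches$")


def total_match_changes(diff_text):
    """Return {language: (old_total, new_total)} from a nightly diff excerpt."""
    old, new = {}, {}
    for line in diff_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip "rule matches per sentence" and other lines
        sign, language, count = m.group(1), m.group(2), int(m.group(3))
        (old if sign == "-" else new)[language] = count
    # Keep only languages that have both a removed and an added line.
    return {lang: (old[lang], new[lang]) for lang in old if lang in new}


def suspicious_drops(diff_text, threshold=0.5):
    """Return {language: relative_drop} where the drop exceeds `threshold`.

    `threshold` is a hypothetical cut-off expressed as a fraction (0.5 = 50%).
    """
    flagged = {}
    for lang, (before, after) in total_match_changes(diff_text).items():
        if before > 0:
            drop = (before - after) / before
            if drop > threshold:
                flagged[lang] = round(drop, 3)
    return flagged


SAMPLE = """\
-Portuguese (Portugal): 35405 total matches
-Portuguese (Portugal): ø0,89 rule matches per sentence
+Portuguese (Portugal): 12150 total matches
+Portuguese (Portugal): ø0,88 rule matches per sentence
"""

if __name__ == "__main__":
    print(suspicious_drops(SAMPLE))
```

The total falls sharply while the matches-per-sentence figure barely moves, which is consistent with fewer sentences being tested rather than with rules detecting less.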

It seems the nightly regression test has reached the memory limit of the JVM. I’ve increased it; let’s see whether the tests work again tonight.

Many thanks for solving this, @dnaber.

@dnaber Looks like I spoke too soon. The file is now being cut off even earlier:
https://languagetool.org/regression-tests/20190304/result_pt-PT_20190304.html

-Portuguese (Portugal): 12338 total matches
-Portuguese (Portugal): ø0,88 rule matches per sentence
+Portuguese (Portugal): 4367 total matches
+Portuguese (Portugal): ø0,85 rule matches per sentence

It seems the issue is a bit more complex than just “we need a bit more memory”. I will debug it further today.

The issue should be fixed now (commit).

It is working fine now. Thank you.