Wrong fromx and tox positions

Hello,

I was testing some rules with the LT command-line version and noticed that I sometimes get a wrong fromx position.
For example, my command line:

java -jar languagetool-commandline.jar -l en-US --enable IN_A_X_MANNER -eo --api file.txt

In return I get a match with fromx=“63851”, but it should be 63868. If I don’t use the --api option I get column: 63852 (which is also wrong), but the error is marked correctly. Is this a bug, or am I doing something wrong?

My example file (the text in the file is one long line):

<div style=font-size:9px;font-family:Arial, Helvetica, sans-serif;width:127px;font-color:#44a854;> File Hosting Online Storage Backup

Thanks.

Thanks for your issue report. There might be a bug, but recently people who had this issue had some strange characters in their files. You might want to shorten your file to the minimum file that still shows the error. That will make it easier to see where the problem is.
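
To check a file for such characters before shortening it, something like the following could help. This is only a minimal sketch: the class name NonAsciiScanner and the method findSuspicious are made up for illustration, and it assumes the file is UTF-8 and that "strange" means anything outside printable ASCII apart from tab and line breaks.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LanguageTool.
final class NonAsciiScanner {

    /** Lists every code point outside printable ASCII (tab, CR and LF are allowed) with its char offset. */
    static List<String> findSuspicious(String text) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean allowedWhitespace = cp == '\n' || cp == '\r' || cp == '\t';
            if ((cp < 0x20 && !allowedWhitespace) || cp > 0x7E) {
                hits.add(String.format("offset %d: U+%04X", i, cp));
            }
            // Advance by one or two chars, depending on whether cp is a surrogate pair.
            i += Character.charCount(cp);
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the file is encoded as UTF-8.
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String hit : findSuspicious(text)) {
            System.out.println(hit);
        }
    }
}
```

If this prints nothing for your file, unusual characters are probably not the cause.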

I did that by deleting just the first 22 characters, which were only the letter “a” and spaces.

I see the issue now: we have a special case for files with more than 64,000 bytes (code: Main.java on master in languagetool-org/languagetool on GitHub). A related bug is documented in issue #254, “cross-paragraph rules don't always work with command-line version”, in the same repository. So yes, this is a bug. I’m not sure when or if we can fix it.
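
To illustrate how splitting a large input can shift reported positions in general, here is a minimal sketch. It is not the code from Main.java: the class ChunkOffsetSketch, both methods, and the simple substring search that stands in for a rule are all assumptions made for the example. Each chunk is checked on its own, so positions come back relative to the chunk and have to be shifted by the chunk's start to be valid for the whole text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not LanguageTool's implementation.
final class ChunkOffsetSketch {

    /** Stand-in for a rule: returns every position of needle, relative to the given chunk. */
    static List<Integer> findInChunk(String chunk, String needle) {
        List<Integer> positions = new ArrayList<>();
        for (int i = chunk.indexOf(needle); i >= 0; i = chunk.indexOf(needle, i + 1)) {
            positions.add(i);
        }
        return positions;
    }

    /** Checks the text in fixed-size chunks and converts chunk-relative positions to whole-text positions. */
    static List<Integer> findInChunks(String text, String needle, int chunkSize) {
        List<Integer> positions = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            String chunk = text.substring(start, Math.min(text.length(), start + chunkSize));
            for (int relative : findInChunk(chunk, needle)) {
                // Leaving out "start +" here would report positions that are too small
                // for every chunk after the first.
                positions.add(start + relative);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        String text = "aaaa bbbb ERR cccc";
        System.out.println(findInChunks(text, "ERR", 10)); // prints [10]
    }
}
```

This sketch also misses any match that straddles a chunk boundary, which is one reason chunked checking needs extra care.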

I think this issue should be fixed now. The first version to contain the fix will be tomorrow’s snapshot in the /snapshots/ directory listing.