Command line utility


(Ruud Baars) #1

I was thinking of using the command line utility to check the test data. The data has one paragraph per line, which is okay according to the tool's help. But with or without -b, the 2 lines I used as input are treated as 1 paragraph.
This is on Linux (Kubuntu), UTF-8.


(Daniel Naber) #2

Maybe the line endings are wrong. Could you attach a small file as an example, preferably as ZIP so it doesn't get modified by the forum?


(Ruud Baars) #3

I am quite sure the line endings are \n.
Nevertheless, I have attached the 2 lines as a .zip:
testinvoer.txt.zip (163 Bytes)


(Daniel Naber) #4

The line endings are indeed \n, so they are correct (for Linux). How do you see that -b doesn't work, i.e. what result would you expect?
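
For readers who want to verify the line endings of their own input file, a minimal Python sketch follows. The helper name `count_line_endings` and the sample bytes are assumptions for illustration; the sample only imitates the two-line test input quoted in this thread, it is not the attached file itself.

```python
from collections import Counter


def count_line_endings(data: bytes) -> Counter:
    """Count CRLF, lone LF, and lone CR line terminators in raw bytes."""
    counts = Counter()
    i = 0
    while i < len(data):
        if data[i:i + 2] == b"\r\n":
            # Windows-style ending: count once and skip both bytes.
            counts["CRLF"] += 1
            i += 2
        elif data[i:i + 1] == b"\n":
            # Unix-style ending.
            counts["LF"] += 1
            i += 1
        elif data[i:i + 1] == b"\r":
            # Lone carriage return (old Mac style).
            counts["CR"] += 1
            i += 1
        else:
            i += 1
    return counts


# Assumed sample resembling the two-line input discussed in this thread.
sample = "dit is een testje.\nIk loopt even om.\n".encode("utf-8")
print(dict(count_line_endings(sample)))
```

To check a real file, read it in binary mode, e.g. `count_line_endings(open("testinvoer.txt", "rb").read())`, so that Python does not translate the line endings while reading.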


(Ruud Baars) #5

Here is the output without -b, which shows that the lines are not treated separately from each other. The 'surrounding' text includes the next line as well:

ruud@ruud-laptop:~/Bureaublad/LanguageTool-3.7$ java -jar languagetool-commandline.jar -l nl testinvoer.txt
Expected text language: Dutch
Working on testinvoer.txt...
1.) Line 1, column 0, Rule ID: UPPERCASE_SENTENCE_START
Message: Deze zin begint niet met een hoofdletter
Suggestion: Dit
dit is een testje. Ik loopt even om.
^^^

2.) Line 2, column 1, Rule ID: OT_IK_LOOPT[1]
Message: U bedoelt vast 'Ik loop'
Suggestion: Ik loop
dit is een testje. Ik loopt even om.
                   ^^^^^^^^
Time: 179ms for 2 sentences (11.2 sentences/sec)

=====================================
And here is the output WITH -b (exactly the same):
ruud@ruud-laptop:~/Bureaublad/LanguageTool-3.7$ java -jar languagetool-commandline.jar -b -l nl testinvoer.txt
Expected text language: Dutch
Working on testinvoer.txt...
1.) Line 1, column 0, Rule ID: UPPERCASE_SENTENCE_START
Message: Deze zin begint niet met een hoofdletter
Suggestion: Dit
dit is een testje. Ik loopt even om.
^^^

2.) Line 2, column 1, Rule ID: OT_IK_LOOPT[1]
Message: U bedoelt vast 'Ik loop'
Suggestion: Ik loop
dit is een testje. Ik loopt even om.
                   ^^^^^^^^
Time: 182ms for 2 sentences (11.0 sentences/sec)


(Daniel Naber) #6

I see - but that's how it has been forever, I think. The -b refers to the logic of matching. With this text:

This is
a text.

You'll get different results depending on whether you use -b or not. You could try whether --line-by-line works for you instead.
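
The two readings of a line break can be illustrated with a small Python sketch. This is only a model of the behaviour described in the tool help (a single line break ends a paragraph with -b, otherwise an empty line does); it is not LanguageTool's actual code, and the function `split_paragraphs` and its parameter name are made up for this example.

```python
import re
from typing import List


def split_paragraphs(text: str, single_break_ends_paragraph: bool) -> List[str]:
    """Split text into paragraphs under one of two line-break interpretations."""
    # With the flag set, every line break ends a paragraph (modelled on -b).
    # Without it, only an empty line (two line breaks) ends a paragraph.
    pattern = r"\n" if single_break_ends_paragraph else r"\n\s*\n"
    return [p.strip() for p in re.split(pattern, text) if p.strip()]


text = "This is\na text.\n"

# Without the flag: the line break sits inside a single paragraph.
print(split_paragraphs(text, single_break_ends_paragraph=False))

# With the flag: each line becomes its own paragraph.
print(split_paragraphs(text, single_break_ends_paragraph=True))
```

Under this model, the example text is one paragraph spanning a line break in the first case and two separate paragraphs in the second, which is why rules matching across the break can give different results.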