Bulk testing


(Ruud Baars) #1

I would like to test a pre-filtered selection from the 4 GB of paragraphs, namely those without any spelling errors, for multi-word issues.

A long, long time ago, I used the server and its XML output to send the input and process the results. Unfortunately, these functions are different now. The server requires some Java-like set-up including certificates (magic to me) and the output is (I think) JSON.
What is the easiest way for a not-so-technical person to test the bulk data and process the output?


(Daniel Naber) #2

You can call java -jar languagetool-commandline.jar with the --api option, it still returns XML.
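
For bulk runs, the call above could be wrapped in a small script. This is only a minimal sketch: the helper names `build_command` and `run_check` are made up for illustration, and it assumes that `java` is on the PATH, that the jar sits in the current directory, and that the XML is written to standard output.

```python
import subprocess


def build_command(text_file, language, jar="languagetool-commandline.jar"):
    """Build the command line from this thread: -l <lang> --api <file>."""
    return ["java", "-jar", jar, "-l", language, "--api", text_file]


def run_check(text_file, language):
    """Run LanguageTool on one file and return whatever it prints to stdout.

    Assumption: the XML report arrives on stdout; warnings may be mixed in.
    """
    result = subprocess.run(
        build_command(text_file, language),
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError if java exits with an error
    )
    return result.stdout
```

Calling `run_check("testinvoer.txt", "nl")` would then return the raw report as a string, which can be saved or parsed afterwards.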


(Ruud Baars) #3

I could do that, but it is deprecated. I hope it will not be gone in the next major release.


(Ruud Baars) #4

Combining --api with --line-by-line gives something else that is strange: the fromy for the second line is reported as 2. I can live with that, but for --line-by-line I think it is strange, since fromy is meant to be the line number within the checked text, right?

ruud@ruud-laptop:~/Bureaublad/LanguageTool-3.7$ java -jar languagetool-commandline.jar -l nl --line-by-line --api testinvoer.txt
Warning: running in line by line mode. Cross-paragraph checks will not work.

<?xml version="1.0" encoding="UTF-8"?>
<!-- THIS OUTPUT IS DEPRECATED, PLEASE SEE http://wiki.languagetool.org/http-server FOR A BETTER APPROACH -->
<matches software="LanguageTool" version="3.7" buildDate="2017-03-27 10:50">
<language shortname="nl" name="Dutch"/>
<error fromy="1" fromx="-1" toy="1" tox="2" ruleId="UPPERCASE_SENTENCE_START" msg="Deze zin begint niet met een hoofdletter" replacements="Dit" context="dit is een testje. Ik loopt even om. " contextoffset="0" offset="0" errorlength="3" category="Hoofdlettergebruik" categoryid="CASING" locqualityissuetype="typographical"/>
<error fromy="2" fromx="0" toy="2" tox="8" ruleId="OT_IK_LOOPT" subId="1"  msg="U bedoelt vast &apos;Ik loop&apos;" replacements="Ik loop" context="dit is een testje. Ik loopt even om. " contextoffset="19" offset="19" errorlength="8" category="Vormfouten" categoryid="VORMFOUTEN" locqualityissuetype="uncategorized"/>
</matches>
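
To process output like the above without much set-up, the XML could be read with Python's standard library. This is a rough sketch, not an official LanguageTool client: `parse_matches` is a hypothetical helper name, and it assumes the report contains a single `<matches>` element, possibly preceded by warning text that has to be skipped.

```python
import xml.etree.ElementTree as ET


def parse_matches(output):
    """Return one dict of attributes per <error> element in the report."""
    # Skip anything printed before the report (warnings, XML declaration,
    # the deprecation comment) and parse from the <matches> element onward.
    start = output.find("<matches")
    if start == -1:
        raise ValueError("no <matches> element found in the output")
    root = ET.fromstring(output[start:])
    return [dict(error.attrib) for error in root.iter("error")]


def summarize(matches):
    """Format each match as: fromy, ruleId, suggested replacements."""
    return [
        "\t".join([m.get("fromy", ""), m.get("ruleId", ""), m.get("replacements", "")])
        for m in matches
    ]
```

Each dict keeps the attribute values as strings, so `fromy` and `offset` need `int(...)` before any arithmetic on them.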