How to use POS tagging only, programmatically?

Hello,

On the command line, I run:

java -jar languagetool-commandline.jar -l fr --taggeronly document.txt > document-tagged.txt

Now I would like to use "tagger only" with an XML document.
So I want to call LT programmatically while parsing my XML.
Can you tell me how to do that?

Thank you :slight_smile:

Try this:

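    import java.util.List;
    import org.languagetool.*;

    public class TaggerOnly {
      public static void main(String[] args) throws java.io.IOException {
        // Use "fr" instead of "en" for French (needs the French language module on the classpath).
        JLanguageTool lt = new JLanguageTool(Languages.getLanguageForShortName("en"));
        List<AnalyzedSentence> sentences = lt.analyzeText("This is a test.");
        for (AnalyzedSentence sentence : sentences) {
          AnalyzedTokenReadings[] tokens = sentence.getTokensWithoutWhitespace();
          for (AnalyzedTokenReadings token : tokens) {
            System.out.println(token.getToken() + ": " + token.getReadings());
          }
        }
      }
    }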

You’ll need to remove the XML tags first.
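
One possible way to strip the tags, using only the JDK's built-in DOM parser, is sketched below. The class `XmlTextExtractor` and the method `extractText` are hypothetical names made up for this example and are not part of LanguageTool. The sketch also assumes that a space at every element boundary is acceptable, which suits block-level markup such as paragraphs but would leave a stray space around inline tags.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XmlTextExtractor {

    // Hypothetical helper (not a LanguageTool API): returns the text of an
    // XML string with all markup removed and whitespace normalised.
    static String extractText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations to avoid external-entity problems.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            collectText(doc.getDocumentElement(), out);
            // Collapse runs of whitespace (indentation, separators) into one space.
            return out.toString().replaceAll("\\s+", " ").trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Depth-first walk that appends every text or CDATA node, followed by a
    // space so that text from neighbouring elements does not run together.
    private static void collectText(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            out.append(node.getNodeValue()).append(' ');
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            collectText(child, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<doc><p>Ceci est un test.</p><p>Voici une autre phrase.</p></doc>";
        System.out.println(extractText(xml));
    }
}
```

The returned string can then be passed to `lt.analyzeText(...)` in place of the hard-coded sentence in the snippet above.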

Perfect, thank you very much Daniel :wink:

@dnaber? I’m an old Perl monk, and I’ve been out of that for a few years (other than the odd one-liner to kill spam once in a while). I got into Python because it seems like the best fit for on-the-fly semantic experiments. That code doesn’t look like either … what language do you use for LT?

LT’s source is mostly Java code, and so is the snippet above.

Thanks… I’m never going there… just saying… I’ll crawl back under the couch now.