Tagging in French: where can I find the definition of tags

L[le/D e s]’[L’/</LE_LA>]homme[homme/N m s] est[être/V etre ind pres 3 s] dans[dans/P] le[le/D m s] bois[bois/N m sp].[./M fin, ]

I would like to have the ‘formal’ definition of the tags, e.g. for [le/D e s], what D, e and s really mean. I guess it’s determiner, e (?) and s for singular, but I would like to have a table where all those terms are defined.

Where can I find it (even in the code?)

Many thanks for this great tool

On Fr 12.04.2013, 02:49:34 you wrote:

I would like to have the ‘formal’ definition of the tags, e.g. for [le/D e s],
what D, e and s really mean. I guess it’s determiner, e (?) and s for singular,
but I would like to have a table where all those terms are defined.

Please see


http://www.danielnaber.de

Many thanks.

Just one clarification.

I tagged the sentence “L’homme est dans le bois depuis le 10 février 1990.” in the standalone tool, and I got the following result:

L[le/D e s]’[L’/</LE_LA>]homme[homme/N m s] est[être/V etre ind pres 3 s] dans[dans/P] le[le/D m s] bois[bois/N m sp] depuis[depuis/P] le[le/D m s] 10[10/Y] février[février/N m s] 1990[1990/Y].[./M fin, ]

I am able to check everything against the document you provided except one thing: for 10[10/Y], I would expect K instead of Y.

Any comment?

On Fr 12.04.2013, 04:36:47 you wrote:

I am able to check everything against the document you provided except
one thing: for 10[10/Y], I would expect K instead of Y.

Any comment?

This “Y” is introduced by the disambiguator. I have sent a mail to our mailing list,
asking for the documentation to be improved.

The disambiguator can be found at

Regards
Daniel


http://www.danielnaber.de

Thanks Daniel,

I even have access to this feature from .NET, using the IKVM framework. Here is the snippet (it works with version 2.2 of LanguageTool):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using org.languagetool;
using org.languagetool.language;
using org.languagetool.rules;

namespace languagetoolTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var lange = new BritishEnglish();
            var langf = new French();

            // Re-initialize LanguageTool's global language list with these two languages
            var langs = new java.util.ArrayList();
            langs.add(langf);
            langs.add(lange);
            Language.reInit(langs);

            // One JLanguageTool instance per language (indices as in the tested code)
            JLanguageTool langToole = new JLanguageTool(Language.LANGUAGES[1]);
            JLanguageTool langToolf = new JLanguageTool(Language.LANGUAGES[2]);

            // Tag the French sentence and print all readings, separated by ", "
            AnalyzedSentence analyzedTextf = langToolf.getAnalyzedSentence("L'homme est dans le bois depuis le 10 février 1990.");
            System.Console.WriteLine(analyzedTextf.toString(", "));

            // Same for the English sentence
            AnalyzedSentence analyzedTexte = langToole.getAnalyzedSentence("The man is in the wood since February 10th 1990.");
            System.Console.WriteLine(analyzedTexte.toString(", "));
        }
    }
}

Result:

L[le/D e s]’[L’/</LE_LA>]homme[homme/N m s] est[être/V etre ind pres 3 s] dans[dans/P] le[le/D m s] bois[bois/N m sp] depuis[depuis/P] le[le/D m s] 10[10/Y] février[février/N m s] 1990[1990/Y].[./M fin, ]

The[the/DT] man[man/NN] is[be/VBZ] in[in/IN] the[the/DT] wood[wood/JJ, wood/NN:U, wood/VB, wood/VBP] since[since/CC, since/IN, since/RB] February[February/NNP] 10th[10th/JJ] 1990[1990/CD].[./., ]

It took me some time to generate the .dll, but I finally got it to work with 2.2.

Thanks a lot for keeping this software open source, it is a great help for research work!

On Fr 12.04.2013, 07:46:13 you wrote:

I even have access to this feature from .NET, using the IKVM framework.
Here is the snippet (it works with version 2.2 of LanguageTool):

That’s interesting - it might be useful to others. Do you have a
description of the process? Or could you even publish the .dll? Is there
one large .dll or one .dll per .jar file?

I know we have a short description in our README.txt about ikvm but I’m not
sure whether it actually still works.

Regards
Daniel


http://www.danielnaber.de

Daniel, here is the outline:

========= creating the dll =============
1- Download version 2.2 of LanguageTool from languagetool.org
2- If it is not already on your PC, get the Java SDK 1.7 (needed to make jar files)
3- If it is not already present, get the IKVM software (I have version 7.3.4830.0)
4- Make a directory and put LanguageTool 2.2 in it
5- In this directory (languagetool 2.2), create a directory (I named it jar)
6- Copy the org directory from languagetool 2.2 to jar
7- Open a cmd window and cd to the jar directory
8- Use the jar tool to create a jar from the content of jar: <<jar cvf languagetool.jar *>>
9- Copy languagetool.jar into the libs directory
A- cd to the languagetool 2.2 directory
B- Use ikvmc to make the big dll (50 MB or so) with the following command line

ikvmc -target:library -out:languagetool.dll libs/cjftransform.jar libs/commons-lang.jar libs/commons-logging.jar libs/hunspell-native-libs.jar libs/ictclas4j.jar libs/jna.jar libs/junit.jar libs/jwordsplitter.jar libs/languagetool.jar libs/languagetool-core.jar libs/languagetool-core-tests.jar libs/lucene-gosen-ipadic.jar libs/morfologik-fsa.jar libs/morfologik-speller.jar libs/morfologik-stemming.jar libs/segment.jar libs/tika-core.jar

C-You have the languagetool.dll ready for a .net project
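
Typing the jar list by hand is error-prone. As a rough sketch only, a small Java helper could assemble the same ikvmc command from whatever jars are found in libs. The class name IkvmcCommandBuilder and the assumption that every jar in libs belongs in the dll are made up for illustration and are not part of the steps above; the -target:library and -out: options are the ones from the command line above. The helper only prints the command, it does not run it.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: builds the ikvmc argument list from the jars in a directory.
public class IkvmcCommandBuilder {

    // Assumption: every *.jar file in libsDir should be compiled into the dll.
    public static List<String> buildCommand(Path libsDir, String outDll) throws IOException {
        List<String> jars = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(libsDir, "*.jar")) {
            for (Path jar : stream) {
                jars.add(jar.toString());
            }
        }
        // Directory listing order is unspecified, so sort for a stable command line
        Collections.sort(jars);

        List<String> command = new ArrayList<>();
        command.add("ikvmc");
        command.add("-target:library");
        command.add("-out:" + outDll);
        command.addAll(jars);
        return command;
    }

    public static void main(String[] args) throws IOException {
        // Default to the libs directory below the current directory
        Path libs = Paths.get(args.length > 0 ? args[0] : "libs");
        System.out.println(String.join(" ", buildCommand(libs, "languagetool.dll")));
    }
}
```

Run it from the languagetool 2.2 directory (step A) and compare the printed line with the manual command before using it.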

Now how to use the dll?

1- In the languagetool 2.2 directory, with Visual Studio 10 (or 12) (I tested on Windows XP and 7), create a project in a directory named languagetoolTest
2- In the project references, add the following dlls:

  • languagetool.dll
  • IKVM.OpenJDK.Core.dll
  • IKVM.OpenJDK.XML.Parse.dll

Add the following using directives to the Program.cs file:
using org.languagetool;
using org.languagetool.language;
using org.languagetool.rules;

Add this snippet to Main {}:

        var lange = new BritishEnglish();
        var langf = new French();

        var langs = new java.util.ArrayList();
        langs.add(lange);
        langs.add(langf);
        Language.reInit(langs);

        JLanguageTool langToole = new JLanguageTool(Language.LANGUAGES[1]);
        JLanguageTool langToolf = new JLanguageTool(Language.LANGUAGES[2]);

        AnalyzedSentence analyzedTextf = langToolf.getAnalyzedSentence("L'homme est dans le bois depuis le 10 février 1990.");
        System.Console.WriteLine(analyzedTextf.toString(", "));

        AnalyzedSentence analyzedTexte = langToole.getAnalyzedSentence("The man is in the wood since February 10th 1990.");
        System.Console.WriteLine(analyzedTexte.toString(", "));

and you should get the following result:

L[le/D e s]’[L’/</LE_LA>]homme[homme/N m s] est[être/V etre ind pres 3 s] dans[dans/P] le[le/D m s] bois[bois/N m sp] depuis[depuis/P] le[le/D m s] 10[10/Y] février[février/N m s] 1990[1990/Y].[./M fin, ]

The[the/DT] man[man/NN] is[be/VBZ] in[in/IN] the[the/DT] wood[wood/JJ, wood/NN:U, wood/VB, wood/VBP] since[since/CC, since/IN, since/RB] February[February/NNP] 10th[10th/JJ] 1990[1990/CD].[./., ]
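
The printed result follows a regular pattern, token[lemma/TAG, lemma/TAG], so it can be split back into tokens and readings if you want to look tags up in a table programmatically. Below is a minimal Java sketch of that idea. The class TaggedOutputParser and its nested types are made up for illustration and are not part of LanguageTool; it assumes tokens contain no "[" and that readings are separated by commas, as in the output above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of LanguageTool: parses "token[lemma/TAG, ...]" text.
public class TaggedOutputParser {

    /** One reading of a token: a lemma plus its raw POS tag string. */
    public static final class Reading {
        public final String lemma;
        public final String posTag;

        Reading(String lemma, String posTag) {
            this.lemma = lemma;
            this.posTag = posTag;
        }
    }

    /** A surface token with all readings listed for it. */
    public static final class Token {
        public final String surface;
        public final List<Reading> readings = new ArrayList<>();

        Token(String surface) {
            this.surface = surface;
        }
    }

    // Shortest run of non-space characters followed by a [...] group
    private static final Pattern TOKEN = Pattern.compile("(\\S+?)\\[([^\\]]*)\\]");

    public static List<Token> parse(String tagged) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(tagged);
        while (m.find()) {
            Token token = new Token(m.group(1));
            for (String part : m.group(2).split(",")) {
                String reading = part.trim();
                // Split at the first slash only, since a tag may itself contain "/"
                int slash = reading.indexOf('/');
                if (slash < 0) {
                    continue; // empty or malformed reading, skip it
                }
                token.readings.add(
                        new Reading(reading.substring(0, slash), reading.substring(slash + 1)));
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String tagged = "dans[dans/P] le[le/D m s] bois[bois/N m sp] depuis[depuis/P] le[le/D m s] 10[10/Y]";
        for (Token t : parse(tagged)) {
            for (Reading r : t.readings) {
                System.out.println(t.surface + " -> lemma=" + r.lemma + ", tag=" + r.posTag);
            }
        }
    }
}
```

The tag string is kept whole (for example "D m s") rather than split on spaces, because the meaning of each part depends on the tag table for the language.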