
Need help creating rule


(Marco A.G.Pinto) #1

Hello!

<!-- Concordance error plural - QUEM + VERB SINGULAR -->
<rule id="QUEM-VERB_PLURAL" name="Erro de concordância do plural QUEM + VERBO SINGULAR">
  <pattern>
    <token>quem</token>
    <marker>
      <token postag="VMIS3P0"></token>
    </marker>
  </pattern>
  <message>Erro de concordância do plural: <suggestion><match no="2" postag_regexp="yes" postag="VMIS3P0" postag_replace="VMIS3S0"/></suggestion></message>
  <example correction="fez">Foram eles quem <marker>fizeram</marker> os trabalhos.</example>
</rule>

It gives an error when I type: TESTRULES PT

If one types:
quem FIZERAM (3rd person plural verb)
it should suggest:
quem FEZ (3rd person singular verb)

How do I fix it?

Thanks!


(Konstantin Ladutenko) #2

You should probably try to provide a similar example in English. That can help in itself, and it gives others many more possibilities to help.


(jaumeortola) #3

I have tried the rule in the rule editor. The rule gives two suggestions: "fez" and "fê", as both words have the same lemma (fazer) and the same POS tag (VMIS3S0). I don't know if this is correct. You can fix the test with:
<example correction="fez|fê">Foram eles quem <marker>fizeram</marker> os trabalhos.</example>
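
For context, here is roughly how the whole rule from post #1 would look with that example line swapped in. This is only a sketch: everything except the `correction` attribute is copied from the original post, and it assumes the dictionary still returns both forms for VMIS3S0.

```xml
<!-- Rule from post #1; only the example's correction attribute differs -->
<rule id="QUEM-VERB_PLURAL" name="Erro de concordância do plural QUEM + VERBO SINGULAR">
  <pattern>
    <token>quem</token>
    <marker>
      <token postag="VMIS3P0"></token>
    </marker>
  </pattern>
  <message>Erro de concordância do plural: <suggestion><match no="2" postag_regexp="yes" postag="VMIS3P0" postag_replace="VMIS3S0"/></suggestion></message>
  <!-- Both forms the dictionary returns are listed, separated by "|" -->
  <example correction="fez|fê">Foram eles quem <marker>fizeram</marker> os trabalhos.</example>
</rule>
```

The test compares the generated suggestions against the `correction` attribute, so the example has to list every form the synthesizer produces, even a doubtful one.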


(Tiago F. Santos) #4

I misunderstood the problem and deleted my earlier post, although I am facing a similar problem at the moment.
I leave here only the tagset question from that post.

I am trying to find the tagset to help me figure out a few remaining details. That file has the rest of the information.

Do you know who built the morphological patterns and the tagset for the Portuguese language?

The help page does not find it:
http://community.languagetool.org/ruleEditor2/posTagInformation?lang=pt1


(Marco A.G.Pinto) #5

This rule has already been implemented with the help of Yakov (as usual, I sent him a private e-mail asking for help).

Anyway, the bug was that the morphological pt_PT dictionary is full of typos, so sometimes it is not possible to write rules that will work 100%.

Here is what I used, which @jaumeortola also suggested above:
<example correction="fez|fê">Foram eles quem <marker>fizeram</marker> os trabalhos.</example>

Yes, Yakov told me to use "fez|fê" but "fê" is wrong and it shouldn't be in the morphological dictionary.

A couple of days ago or so I had the same issue with another rule.

Also, the morphological dictionary lists both "cães" and "cãos" as plurals of "cão", and "cãos" is wrong.

In 2017 I will try to add to my Proofing Tool GUI an option to convert the official (Minho University) pt_PT spellers into morphological dictionaries. Once that is done, everything should work 100% okay for Portuguese.


(Tiago F. Santos) #6

Still, that does not allow dynamic suggestions. It must be possible.

In my previous post I misunderstood the problem and was going to suggest that you extend the rule to more generic values.
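
As a rough illustration of what "more generic values" could mean, the sketch below widens the tag from VMIS3P0 to any third-person plural verb form. It is untested, the rule id is made up for the example, and it assumes that `postag_replace` accepts a back-reference such as `$1` to the group captured in `postag`.

```xml
<!-- Sketch only: hypothetical id, not checked against the pt dictionary -->
<rule id="QUEM_VERB_PLURAL_GENERIC" name="Erro de concordância do plural QUEM + VERBO SINGULAR">
  <pattern>
    <token>quem</token>
    <marker>
      <!-- V, three arbitrary positions, 3rd person, Plural, 0 -->
      <token postag="V...3P0" postag_regexp="yes"/>
    </marker>
  </pattern>
  <!-- Assumption: $1 re-inserts the captured prefix, so only P changes to S -->
  <message>Erro de concordância do plural: <suggestion><match no="2" postag_regexp="yes" postag="(V...3)P0" postag_replace="$1S0"/></suggestion></message>
  <example correction="fez|fê">Foram eles quem <marker>fizeram</marker> os trabalhos.</example>
</rule>
```

Whether the suggestions come out clean would still depend on the forms the morphological dictionary holds for each tag.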

What brought me here is that LanguageTool's European Portuguese suggestions are very conservative (though they still fail very often).

The most common grammar errors caused by typos are number and gender disagreement, so the set of rules I am creating tests the most common agreement cases, and it is doing so quite well. You can test the code from the other page on online material and see for yourself. It will be easier to correct it by adding exceptions, since the detection strategy is quite different from yours.

Regarding the conversion, it might be wise to discuss that with the U. Minho team. They actually have automated compilation for many dictionary types. If they cannot accommodate your request, maybe both groups can benefit from code sharing. Check https://natura.di.uminho.pt/svn/main/dicionarios/jspell.pt/


(jaumeortola) #7

@tiagosantos
You can find a description of the tags in the Freeling library documentation:

https://talp-upc.gitbooks.io/freeling-user-manual/content/tagsets/tagset-pt.html


(Tiago F. Santos) #8

That was exactly what I needed. Many thanks.