European Portuguese (PT-PT) rule contributions

@tiagosantos

Daniel is going to give you commit rights.

Anyway, I have committed the removed.txt you just sent.

Thanks,

:slight_smile:

Awesome and many thanks Marco.
Anyway, I will keep pacing the commits so that you can review them, find potential problems and suggest fixes.

Cheers.

I think the list needs to include the base form (lemma) of each word, like this:

oo o NCMP000
cãos cão NCMP000
cãos cão AQ0MP0
umas umar VMIP2S0

I will fix that accordingly. I downloaded the git copy only moments ago. When I feel more confident with git, I will push the updated version.

@tiagosantos

Hello!

Yesterday, I kind of finished my thesis+project, so I opened the thesis with LibreOffice and LT.

I spent hours creating a list of missing words for the pt_PT speller from Minho University (who replied saying they will add them when they have the time).

I will try to post a list of possible false positives here soon, maybe after the nightly.

All I remember is that “NATO” gave a gender error, so moments ago I checked the morphological analysis page, where it appears as a normal word:
NATO nato AQ0MS0

So it is recognised only as a normal word and not as the NATO organisation.

Could you add “NATO” as well?

Thanks!

Kind regards,

Hello Marco,

Congratulations on the completion of your thesis.

The best way is to add them yourself, since there are always more words that can be added.
They should be placed in languagetool-language-modules/pt/src/main/resources/org/languagetool/resource/pt in the file added.txt. Do not forget the lemma, and note that the columns in this file must be tab-separated.
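
As a rough illustration of the tab-separated requirement, here is a minimal Python sketch that checks lines before they go into added.txt. It assumes the column order is word form, lemma, tag, as in the entries quoted in this thread, and it assumes that lines starting with `#` are comments. The function name `check_added_lines` is made up for this example and is not part of LanguageTool.

```python
from typing import List, Tuple


def check_added_lines(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (line_number, problem) pairs for entries that look malformed."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        # Skip blank lines and comment lines ('#' as comment marker is an assumption).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            # Assumed layout: form <TAB> lemma <TAB> tag
            problems.append(
                (number, f"expected 3 tab-separated columns, found {len(fields)}")
            )
        elif any(not field.strip() for field in fields):
            problems.append((number, "empty column"))
    return problems


sample = [
    "NATO\tNATO\tNP0FS0\n",  # well formed
    "NATO NATO NP0FS0\n",    # spaces instead of tabs
    "NATO\tNP0FS0\n",        # lemma missing
]

for number, problem in check_added_lines(sample):
    print(f"line {number}: {problem}")
```

Running it over the file before committing would flag entries typed with spaces or with a missing lemma, which are the two mistakes mentioned above.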

It would also be useful to get acquainted with:
http://wiki.languagetool.org/archive-developing-a-tagger-dictionary
http://wiki.languagetool.org/developing-a-tagger-dictionary

These additions and removals work better if they are integrated into the main morphological dictionary and synthesizer, after a reasonable test period in the added.txt and removed.txt files.

Having a separate project with those lists (in text files so they can be reviewed) and updating the binaries only once per release, similar to the work in the German and Catalan projects, would be ideal.

Best regards

Regarding
NATO nato AQ0MS0
you could add the entry
NATO NATO NP0FS0
and it would fix the gender agreement false positives, but I am not sure it would not introduce other errors for the adjective ‘nato’. If I am not mistaken, the speller and rules are not case sensitive. Either way, it is a good addition.

Any specific reason you’re pointing to the archived version of the page? Doesn’t the current page (Developing a tagger dictionary - LanguageTool Wiki) work?

No good reason. It was the first link I found and read. I did not change it afterwards because of this note:

The manual process of creating and exporting a dictionary is documented at the Archive.

I will add the new link to both posts as well.