
[pt_PT] Small false positive in my CV - 20161223


(Marco A.G.Pinto) #1

Hello @tiagosantos

I was improving my CV and LanguageTool gave a hit on:
"Dar/revogar acessos (Moodle) a docentes e alunos e prestar apoio técnico;"
"Passa horas na Internet a conversar com os seus amigos internacionais"

Could you take a look at it?

Thanks!


(Tiago F. Santos) #2

Hi Marco,

Thank you for the report.
I thought this type of false positive had been handled a long time ago. I will have a look at it ASAP.

Cheers


(Tiago F. Santos) #3

Fix pushed.
The second case will need deep changes, namely excluding the passive voice in some cases. This will have to wait until after the release, since changing it now would produce too many unintended side effects.
Tags were added to the code as a reminder.


(Marco A.G.Pinto) #4

@tiagosantos

Another false positive in my CV, this time because a word that is masculine is tagged as feminine in the morphological dictionary:
https://www.priberam.pt/dlpo/workshop

"Workshop Temático"

Here is what is in the morphological dictionary:
workshop workshop
NCFS000

workshops workshop
NCFP000

To fix it one just needs to replace "F" with "M".

Tiago, I know I am a pain in the ass... I really need to learn how to fix this kind of thing myself so that in the future I don't need to bother you.

To fix it, does one add the wrong entry to removed.txt and the corrected one to added.txt?

Thanks!

Kind regards,


(Tiago F. Santos) #5

No problem, but you would benefit from knowing how to do it.

You are correct. The way to fix a POS tag:
* add the wrong tag to removed.txt. Format is 'word|TAB|lemma|TAB|POStag'. Follow the examples in the file;
* add the fixed tag to added.txt. Same format.
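
A minimal sketch of those two steps in Python, using the workshop entries from post #4. The helper names (`entry`, `gender_fix`) are hypothetical, and the assumption that the gender letter is the third character of the tag comes only from the NCFS000/NCFP000 examples in this thread. It is not LanguageTool code, just a way to generate the lines to paste into each file.

```python
def entry(word, lemma, tag):
    """Build one dictionary line: word<TAB>lemma<TAB>POStag."""
    return "\t".join((word, lemma, tag))


def gender_fix(word, lemma, wrong_tag, gender="M"):
    """Return (line for removed.txt, line for added.txt).

    Assumption: the gender letter is the third character of the tag,
    as in NCFS000 / NCFP000 from this thread.
    """
    fixed_tag = wrong_tag[:2] + gender + wrong_tag[3:]
    return entry(word, lemma, wrong_tag), entry(word, lemma, fixed_tag)


removed_lines, added_lines = [], []
for word, lemma, tag in [("workshop", "workshop", "NCFS000"),
                         ("workshops", "workshop", "NCFP000")]:
    removed, added = gender_fix(word, lemma, tag)
    removed_lines.append(removed)
    added_lines.append(added)

print("\n".join(removed_lines))  # lines to append to removed.txt
print("\n".join(added_lines))    # lines to append to added.txt
```

Check the output against the existing examples in each file before committing, since the real files may have conventions this sketch does not cover.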

You will advance much faster in your learning of LanguageTool (and programming in general) if you try out your ideas.
As long as there are no deletions, you can usually revert easily when things do not work as expected.


(Marco A.G.Pinto) #6

Thanks!

I have just fixed it and also added "Leitão" to be accepted as a proper name too.

It is good to know how to fish instead of receiving the fish. :slight_smile:


(Marco A.G.Pinto) #7

Now that we are at it:
Microsoft NPFS000
"O Microsoft Windows"
"A Microsoft tem criado muitos programas"

The other day, when I tested a small part of my thesis, LanguageTool complained about the word "Microsoft". Shall I add a masculine form to the morphological dictionary too?


(Tiago F. Santos) #8

That can only be solved with a multiword entry. Microsoft is NPFS000, but Microsoft Windows is NPMS000. When an entry has more than one word, add it to multiwords.txt. The format is the same as before, but without the lemma.
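
As a rough sketch in Python, a multiwords.txt line would then be the phrase, a TAB, and the tag. The helper name `multiword_entry` is hypothetical, and the TAB separator is an assumption carried over from the removed.txt/added.txt format; follow the examples already in the file if they differ.

```python
def multiword_entry(phrase, tag):
    """Build one multiwords.txt line: phrase<TAB>POStag (no lemma)."""
    if len(phrase.split()) < 2:
        # Single words belong in added.txt / removed.txt instead.
        raise ValueError("multiwords.txt is for entries with more than one word")
    return f"{phrase}\t{tag}"


# "Microsoft" alone stays feminine; the product name is tagged masculine.
print(multiword_entry("Microsoft Windows", "NPMS000"))
```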

I noticed your commit. You corrected Leitão, but since this is also a relatively common name, it might be better to use it only in multiwords. See 'Vilas Leitão' for an applied example, and please tell me if you find other false positives with it.

Portuguese family names are tricky because they can be almost any noun, so they should only be added in combinations, unless they are unambiguous.


(Marco A.G.Pinto) #9

Tiago, I have just added these to multiwords.txt:
Microsoft Windows NPMS000
Microsoft Office NPMS000
Microsoft Excel NPMS000
Microsoft PowerPoint NPMS000
Microsoft Word NPMS000
Microsoft Access NPMS000

Ohhhh... "Desidério Vilas Leitão" no longer appears as a grammar error. :slight_smile:

I also tried another friend's name, "Nuno Leitão", and it also worked okay.

Oki... I am about to remove "Leitão" from the added.txt.


(Marco A.G.Pinto) #10

Done!

Removed "Leitão".

:slight_smile:


(Marco A.G.Pinto) #11

@tiagosantos

I have just tried the nightly in LibreOffice and it still shows a grammar error in:
"- Dar/revogar acessos (Moodle) a docentes e alunos e prestar apoio técnico;"

:frowning:


(Tiago F. Santos) #12

Yes. I have not removed that one because it is disambiguation-related. Disambiguation is used to make the program understand the context of each word and recognize which meaning is intended for it.

In this context, 'a' is recognized as a determiner instead of an adposition. It could be solved by adding a direct exception to the grammar rule, but that would be incorrect and would cripple the rule, since the rule was written for the context in which 'a' is recognized as a determiner, and that context is valid.

Since I am giving priority to high-impact changes, disambiguation improvements have been delayed. This follows the Pareto principle. Improving disambiguation is planned, and I intend to put as much work into it as went into the rules. This will have a less visible impact for the user, but it should resolve all these borderline cases.