
[pt] Portuguese rule contribution/discussion


(Tiago F. Santos) #61

Thank you.

1) It suggests "O meu";
This is according to European Portuguese rules. Anyway, this is more of a style rule than a grammar rule, so you can ignore it if you prefer.

To suggest replacing "pás" with "paz".
A new rule has been added to handle the word confusion.


It has very few trigger words, so you may wish to add a few more, if you find more useful examples.

2) It says "pás" is informal language.
Is 2) a false positive or can it somehow be improved not to be triggered in a sentence such as the above?

Due to the priority system, the word will not be tagged as informal when misused, but a special rule for "pá" has been added as well. This is the first false positive I see with it, but it should reduce them even further.


(Marco A.G.Pinto) #62

@tiagosantos

Hello!

"
1. Informação pessoal
1.1. Nome
1.2. Morada
1.3. Telemóvel
1.4. E-mail
"

It says in the first line that the sentence starts with a number.

Most of the documents I write follow the logic above:
1. blah blah blah
1.1. blah
2. blah
2.1. blah
2.2. blah
etc.

Is it easy to improve the rule not to show the warning?

Thanks!

Kind regards,


(Tiago F. Santos) #63

I could not reproduce this on my latest build, except for the second line (1.1. Name).
I have been fiddling with this rule recently because I found this type of false positive, so it should already be fixed in yesterday's build. I also found that decimal numbers triggered the rule, and that has been improved as well.

A SENT_END detection regression may be the reason behind some false positives in these situations.


(Marco A.G.Pinto) #64

Thanks, Tiago!

I will test it when the nightly is released.


(Marco A.G.Pinto) #65

@tiagosantos

Sorry for only testing now.
"
1. Informação pessoal
1.1. Nome
1.2. Morada
1.3. Telemóvel
1.4. E-mail
"

It now flags 1.1.

All the rest works fine.


(Marco A.G.Pinto) #66

@tiagosantos

Tiago,

"Meu irmão, descansa em pás!"

Still triggers informal language :frowning:


(Tiago F. Santos) #67

This is not a problem because wrongWordinContext takes priority. Every word can change the interpretation of the sentence completely (especially in disambiguation mechanisms), so we can't safeguard against it.
Anyway, anyone can change line 20477 to
<exception postag_regexp='yes' postag='(N.|[ADP]..|V.....)[FC].*|SPS.+'/></token>
in their copy if they find this is a common error in their writing.


(Marco A.G.Pinto) #68

@tiagosantos

Hello!

You are going to murder me... but I found another issue... there is a dash issue in dates, please see the attached image (I tried to attach an .ODT but it is not supported... which is strange because it worked before?)

It complains about the dash in the first date: "20170326" but not in the one at the end of the page.


PS-> Tiago, I tried to paste the text here and then copy/paste into LO and it no longer gave the error, so I am sending you the ODT via e-mail.

Thanks!

Kind regards,


(Tiago F. Santos) #69

No problem, errors happen. I have seen inconsistent behaviour between LT-server, LT-standalone and LT-LibreOffice before, and there is even a bug I filed related to it. The sentence segmentation method is different.
I have looked into the code already regarding that, but that is an issue that requires more time to solve than I am willing to spend on it at the moment.
If the error is too specific, try to just delete the segment and rewrite it. Nothing can be done regarding special characters or special formats, especially on LO. For example:


(Marco A.G.Pinto) #70

@tiagosantos

Tiago, I have created a new rule:
"sob o ponto de vista" > "do ponto de vista"

Feel free to improve it:
"sob o MEU/TEU/SEU/NOSSO/VOSSO ponto de vista"

I can't remember how to add words that may or may not exist.

Thanks!

EDIT: added SEU/NOSSO above


(Tiago F. Santos) #71

Good one. I didn't know there was a debate on this one. But this should be in the "Style" category with a URL explaining it. Is this the best?
https://ciberduvidas.iscte-iul.pt/consultorio/perguntas/do-ponto-de-vista/450

Add min='0' to the token to make it optional. In the next update I will add all those improvements. No worries.
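
For reference, here is a minimal sketch of what the rule might look like with the optional possessive. The rule id, name, message wording and example sentences are made up for illustration, and the URL is just the candidate mentioned above, so treat this as a starting point rather than the version that will be committed.

```xml
<!-- Hypothetical sketch: id, name, message and examples are illustrative only. -->
<rule id="SOB_O_PONTO_DE_VISTA" name="sob o ponto de vista / do ponto de vista">
  <pattern>
    <!-- Only "sob o" is marked, so the suggestion replaces just these two tokens. -->
    <marker>
      <token>sob</token>
      <token>o</token>
    </marker>
    <!-- min="0" makes the possessive optional: the rule matches with or without it. -->
    <token min="0" regexp="yes">meu|teu|seu|nosso|vosso</token>
    <token>ponto</token>
    <token>de</token>
    <token>vista</token>
  </pattern>
  <message>Considere escrever <suggestion>do</suggestion>.</message>
  <url>https://ciberduvidas.iscte-iul.pt/consultorio/perguntas/do-ponto-de-vista/450</url>
  <example correction="do">Isto é verdade <marker>sob o</marker> meu ponto de vista.</example>
  <example>Isto é verdade do ponto de vista dele.</example>
</rule>
```

Marking only "sob o" avoids having to reference the optional token in the suggestion, which keeps the replacement the same whether or not the possessive is present.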


(Marco A.G.Pinto) #72

Thank you, @tiagosantos


(Marco A.G.Pinto) #73

@tiagosantos

I was on Facebook and saw this mistake:
"Temos de por em prática tudo o que aprendemos!"

I was thinking about creating a rule that suggests "pôr" but I was wondering if the rule should be created to match the words "POR EM" or if there are more matches to be checked.

Could you advise?

Thanks!


(Tiago F. Santos) #74

This can be very useful.
I believe that the best approach is to add a simple rule (detecting both "de por" and "por em") and correct it after checking the regression tests. I can't recall any exceptions, but they are bound to exist.
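
Something along these lines could serve as a first version. It is only a sketch: the group id, name, message and example sentences are placeholders, and it has no exceptions yet, so the regression tests will probably show false positives that need handling.

```xml
<!-- Hypothetical first draft: no exceptions yet, so false positives are expected. -->
<rulegroup id="POR" name="por / pôr">
  <rule>
    <!-- "de por": only "por" is marked and replaced. -->
    <pattern>
      <token>de</token>
      <marker>
        <token>por</token>
      </marker>
    </pattern>
    <message>Substitua por <suggestion>pôr</suggestion>.</message>
    <example correction="pôr">Ele tem de <marker>por</marker> a mesa.</example>
  </rule>
  <rule>
    <!-- "por em": again only "por" is marked. -->
    <pattern>
      <marker>
        <token>por</token>
      </marker>
      <token>em</token>
    </pattern>
    <message>Substitua por <suggestion>pôr</suggestion>.</message>
    <example correction="pôr">Vamos <marker>por</marker> em prática o plano.</example>
  </rule>
</rulegroup>
```

Keeping the two patterns as separate rules in one group makes it easy to drop or refine one of them later without touching the other.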


(Marco A.G.Pinto) #75

Thanks!


(Marco A.G.Pinto) #76

@tiagosantos

Nightly results of my "por > pôr" rule:

+Title: Beja
+Line 1, column 44, Rule ID: POR[1]
+Message: Substitua por 'pôr'.
+Suggestion: pôr
+A sua importância é atestada pelo facto de por lá passar uma das vias romanas.
+                                           ^^^                                
+

Should I remove the "de por" rule or is there a way to improve it?

The "por em" seems to be okay.

Thanks!


(Marco A.G.Pinto) #77

@tiagosantos

Thank you for fixing it.

Kind regards,


(Tiago F. Santos) #78

No worries. Best regards.


(Marco A.G.Pinto) #79

@tiagosantos

Hello Tiago,

I am not sure if this is a false positive.

LT flagged:
"Ele disse que quem estabelece a % de incapacidade é o médico a que vou na sexta."

It suggests "à".

Now I don't know how to write the sentence.

Thanks!


(Tiago F. Santos) #80

Great question.
It seems like a false positive to me, though it made me doubt it too, since it does make logical sense to use 'à' for time expressions.
After today's regression test results (that should be very verbose due to the NO_VERB rule correction) I will try to confirm and fix it.
This false positive is very misleading and needs to be addressed.