European Portuguese (PT-PT) rule contributions

tiagosantos · October 19, 2016, 10:05pm

@Yakov
Many thanks, Yakov. Now anyone can easily fix the dictionary, and every change can be reviewed by anyone.

Checking the regression test, the results with the new rules have been great. Considering that there are 6 more rules, the end result is this:

-Portuguese: 4468 total matches
-Portuguese: ø0,11 rule matches per sentence
+Portuguese: 3849 total matches
+Portuguese: ø0,10 rule matches per sentence
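To put the two totals above in relative terms, here is a quick check of the arithmetic. The variable names are mine, and the only inputs are the two match counts from the regression output:

```python
# Total match counts taken from the regression output quoted above
before = 4468  # the "-" line: total matches before the changes
after = 3849   # the "+" line: total matches after the changes

removed = before - after
reduction_pct = 100 * removed / before

print(f"{removed} fewer matches ({reduction_pct:.1f}% reduction)")
# prints: 619 fewer matches (13.9% reduction)
```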

Considering that some of the apparent false positives are actually valid grammar corrections, the result is even better:

```
+Line 1, column 132, Rule ID: ERRO_DE_CONCORDNCIA_DO_NMERO_DO_VERBO_3P[1]
+Message: Erro de concordância verbal.
+… mais bem servidos nessa área, ainda que em todos eles haja grandes
                                                                                           ^^^^^^^^

+Line 1, column 1, Rule ID: ERRO_DE_CONCORDNCIA_DO_NMERO_DO_VERBO_1S[1]
+Message: Erro de concordância verbal.
+Eu costuma jogar frequentemente tênis com ele nos domingos.
+^^^^^^^^^^
```

We can reduce this a bit further by adding the following entries to the new removed.txt:

oo oo NCMP000
cãos cãos NCMP000
cãos cãos AQ0MP0
uma uma VMIP2S0
uma uma VMIP2S0
umas umas VMIP2S0
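To illustrate the idea only: the sketch below shows how a removal list like this could filter wrong readings out of a tagged word list. It assumes each entry is a whitespace-separated "form lemma tag" triple, and it is not how LanguageTool itself applies removed.txt. The `parse_entries` helper and the sample `dictionary` (including the `DI0FS0` reading) are hypothetical.

```python
def parse_entries(text):
    """Parse 'form lemma tag' lines into a set of 3-tuples.

    Blank lines, '#' comment lines, and lines without exactly
    three fields are skipped.
    """
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) == 3:
            entries.add(tuple(parts))
    return entries


# The entries proposed above (the repeated line collapses in the set)
removed = parse_entries("""
oo oo NCMP000
cãos cãos NCMP000
cãos cãos AQ0MP0
uma uma VMIP2S0
uma uma VMIP2S0
umas umas VMIP2S0
""")

# Hypothetical sample readings; the last one is an invented placeholder
# standing in for a correct reading that should survive the filter.
dictionary = [
    ("uma", "uma", "VMIP2S0"),
    ("cãos", "cãos", "NCMP000"),
    ("uma", "um", "DI0FS0"),
]

# Keep only readings that are not listed for removal
kept = [entry for entry in dictionary if entry not in removed]
print(kept)
```

Matching on the full triple rather than on the word form alone means that only the wrong reading of "uma" is dropped, while its other readings stay in place.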

I was going to post all my XML rules for punctuation, but many of the rules I have recreated already exist and are simply inactive by default in the LO extension.

They are active for other languages in the same build environment. Is there any pertinent bug that requires them to be inactive by default for Portuguese?

The Java rules are active by default in most (all?) other languages. The ones I have noticed that are inactive by default specifically in Portuguese are: “Capitalization”, “Word repetition”, “Double spacing” and both “Punctuation rules”.

When you have time, can you verify this?