
[pt] Fix false positives - 2019-10-23

Hello!

The rule created by Tiago is producing tons of false positives:
https://internal1.languagetool.org/regression-tests//20191022/result_pt-PT_20191022.html

+Title:
+Line 1, column 26, Rule ID: PT_DIACRITICS_REPLACE
+Message: ‘para’ é uma expressão estrangeira importada cuja grafia tem diacríticos. É preferível escrever ‘Pará’
+Suggestion: Pará
+Nada melhor que um banho para nos acordar.
+                         ^^^^

I tried to disable it by searching for PT_DIACRITICS_REPLACE, but I couldn't find it.

Where is it?

And why did it only start producing false positives now?

Thanks!

False positives started to appear because the rule was supposed to be case-insensitive, but due to a bug that never worked (see https://github.com/languagetool-org/languagetool/issues/2051). With the bug fixed, the lowercase preposition ‘para’ now matches the list entry ‘Para’. I’ve turned the rule off by default for now so we can hopefully fix it. Instead of just turning on case-sensitivity again, maybe you could go over the list (pt/diacritics.txt) and see whether it needs to be cleaned up?

@marcoagpinto, this is the file: https://github.com/languagetool-org/languagetool/blob/master/languagetool-language-modules/pt/src/main/resources/org/languagetool/rules/pt/diacritics.txt

I already commented out “Para”. Please note that the matching is case-insensitive.
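
To illustrate why the match fires, here is a minimal sketch of case-sensitive versus case-insensitive lookup. It is not LanguageTool code: the class DiacriticsLookupSketch, the suggest() method, and the in-memory map are hypothetical, and the entry shape (‘Para’ mapped to ‘Pará’) is only inferred from the regression output quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not LanguageTool code: shows how case-insensitive
// lookup makes the preposition "para" hit a list entry written as "Para".
class DiacriticsLookupSketch {

    // Assumed entry shape: spelling without diacritics -> preferred spelling.
    private static final Map<String, String> ENTRIES = new HashMap<>();
    static {
        ENTRIES.put("Para", "Par\u00e1"); // \u00e1 is 'á'
    }

    // Returns the suggested spelling, or null when the token is not listed.
    static String suggest(String token, boolean caseSensitive) {
        for (Map.Entry<String, String> e : ENTRIES.entrySet()) {
            boolean hit = caseSensitive
                    ? e.getKey().equals(token)
                    : e.getKey().equalsIgnoreCase(token);
            if (hit) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Case-sensitive: "para" is not the same as "Para", so nothing is flagged.
        System.out.println(suggest("para", true) == null);
        // Case-insensitive: "para" matches "Para" and a suggestion is produced.
        System.out.println(suggest("para", false) != null);
    }
}
```

With case-insensitive matching, every entry in the list whose lowercase form is an ordinary word becomes a potential false positive, which is why going over the list is worthwhile.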

Looks like this fix changed the behaviour of AbstractSimpleReplaceRule2. How can downstream projects mitigate the change?

I’ll add this to our change log:

  • AbstractSimpleReplaceRule2 has been fixed so that it’s now case-insensitive.
    If you implement a subclass of it and you want the old behavior, please override
    isCaseSensitive() and have it return true. (Issue #2051)
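
For downstream projects, the override described in the change log entry could look roughly like the sketch below. Only the method name isCaseSensitive() and its return value come from the entry; RuleStandIn, DefaultRule, DownstreamRule, and matches() are hypothetical stand-ins so the example compiles on its own, and the other members a real AbstractSimpleReplaceRule2 subclass has to provide are left out.

```java
// Self-contained sketch; the nested classes are hypothetical stand-ins and
// do not reproduce the real LanguageTool base class.
class CaseSensitiveOverrideSketch {

    // Stand-in for the base class: only the hook from the change log is modelled.
    static abstract class RuleStandIn {
        // Default after the fix: matching ignores case.
        public boolean isCaseSensitive() {
            return false;
        }

        // Simplified matching that consults the hook.
        boolean matches(String entry, String token) {
            return isCaseSensitive()
                    ? entry.equals(token)
                    : entry.equalsIgnoreCase(token);
        }
    }

    // A subclass that keeps the new default behaviour.
    static class DefaultRule extends RuleStandIn {
    }

    // A downstream subclass that opts back into the old behaviour.
    static class DownstreamRule extends RuleStandIn {
        @Override
        public boolean isCaseSensitive() {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new DefaultRule().matches("Para", "para"));    // true
        System.out.println(new DownstreamRule().matches("Para", "para")); // false
    }
}
```

In a real subclass, the same two-line override would sit next to the existing rule methods; nothing else should need to change to get the old matching back.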