I came up with a shorter name for the rule we talked about last time.
However, I spent the whole day removing false positives and testing against 600 000 sentences.
BEFORE:
Portuguese (Portugal): 3780 total matches
Portuguese (Portugal): ø0.01 rule matches per sentence
Portuguese (Portugal): 17323 input lines ignored (e.g. not between 10 and 300 chars or at least 4 tokens)
CURRENTLY:
Portuguese (Portugal): 525 total matches
Portuguese (Portugal): ø0.00 rule matches per sentence
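As a rough sanity check on these figures, here is a small Python sketch. It assumes the denominator is the roughly 600 000 sentences mentioned above; the exact sentence count the tool used is not shown in the output, so treat the numbers as approximate.

```python
# Assumption: about 600 000 sentences were checked. The tool's exact
# denominator is not shown in the output above, so this is approximate.
SENTENCES = 600_000

# Total matches copied from the two runs above.
before, current = 3780, 525


def per_sentence(matches, sentences=SENTENCES):
    """Average number of rule matches per sentence."""
    return matches / sentences


# Share of the original matches that no longer fire.
reduction = 1 - current / before

print(f"before:    {per_sentence(before):.4f} matches per sentence")
print(f"current:   {per_sentence(current):.4f} matches per sentence")
print(f"reduction: {reduction:.1%}")
```

Rounded to two decimals these averages agree with the ø0.01 and ø0.00 lines, which is why the per-sentence figure hides most of the improvement; the total match count is the more useful number here.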
Removing all the false positives from this rule will take two or three more days.
Hi,
I have seen some problems when the verb is ‘ser’ or ‘estar’. With ‘ser’, one case is ‘o que é…?’, which shouldn’t be replaced in an interrogative. With ‘estar’, ‘o que está’ is being replaced by ‘sendo’ when it should be ‘estando’.
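To make the two cases concrete, here is a rough Python sketch of the behaviour I would expect. It is not the actual rule: the function `suggest_gerund`, the lookup table, and the sample sentences in the usage note are all hypothetical and only illustrate the idea of skipping interrogatives and picking the gerund from the verb that was matched.

```python
import re

# Hypothetical lookup: finite verb form -> gerund of the SAME verb.
GERUND = {"é": "sendo", "está": "estando"}


def suggest_gerund(sentence):
    """Return the gerund to suggest for 'o que <verb>', or None to leave the text alone."""
    # Case 1: interrogatives such as 'O que é isto?' should not be replaced.
    if sentence.rstrip().endswith("?"):
        return None
    match = re.search(r"\bo que (é|está)\b", sentence, flags=re.IGNORECASE)
    if match is None:
        return None
    # Case 2: choose the gerund from the matched verb instead of always using 'sendo'.
    return GERUND[match.group(1).lower()]
```

With this sketch, `suggest_gerund("O que é isto?")` returns `None`, while a statement containing ‘o que está’ gets ‘estando’ and one containing ‘o que é’ gets ‘sendo’.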