[en] Many nagging errors from rule REP_PASSIVE_VOICE

Dominique_PELLE · September 19, 2022, 8:22am

English Rule REP_PASSIVE_VOICE causes many false errors whenever I check English texts.
It’s only about style. IMO, enabling this rule by default causes more harm than good.
Should it be disabled by default?

In my personal setup, I chose to disable it by default because it causes too many false errors.

dnaber · September 19, 2022, 8:43am

Our data tells us the rule doesn’t match often, but when it matches, it does indeed get disabled quite a bit. It’s already only active in “picky” mode. Do you have some examples of cases where you turned off the rule? Maybe we can further improve it instead of turning it off completely.

Dominique_PELLE · September 20, 2022, 8:07am

I suspect the rule is triggered mostly on large enough input.

I just opened a random Wikipedia page (France - Wikipedia) and it triggered the rule REP_PASSIVE_VOICE in several places. I then opened a second random page (Earth - Wikipedia) and it triggered the rule again. I did not cherry-pick the pages. I think most English text that is long enough triggers the rule, which is annoying.

jaumeortola · September 20, 2022, 10:00am

Thanks for the comment, Dominique. The rule is triggered with these repetition settings: min_prev_matches="4" distance_tokens="20". That is, it fires when there are 4 previous matches of the same rule and they are close enough (fewer than 20 tokens apart).
The problem (or bug) is that the <20 tokens condition is only checked between the penultimate and the last repetition, not between the earlier ones. (Most repetition rules use min_prev_matches="1", where there is only one gap to check.)
I will talk to the English developers about the best solution.
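
To make the difference concrete, here is a minimal Python sketch. It is not LanguageTool's actual code: the function names are made up for illustration, match positions are plain token indices, and the "strict" variant is only one possible reading of what the settings were meant to do.

```python
def flagged_as_described(positions, min_prev_matches=4, distance_tokens=20):
    """Behaviour as described above: only the last gap is checked.

    positions: sorted token indices where the underlying pattern matched.
    Returns the positions that would be reported to the user.
    """
    flagged = []
    for i, pos in enumerate(positions):
        # Enough previous matches anywhere in the text, and the
        # penultimate match is fewer than distance_tokens away.
        if i >= min_prev_matches and pos - positions[i - 1] < distance_tokens:
            flagged.append(pos)
    return flagged


def flagged_strict(positions, min_prev_matches=4, distance_tokens=20):
    """Hypothetical stricter variant: every gap in the run must be short."""
    flagged = []
    for i, pos in enumerate(positions):
        if i < min_prev_matches:
            continue
        # The current match plus its min_prev_matches predecessors.
        window = positions[i - min_prev_matches : i + 1]
        gaps = [b - a for a, b in zip(window, window[1:])]
        if all(gap < distance_tokens for gap in gaps):
            flagged.append(pos)
    return flagged


# Four passive constructions spread over a long text, then two close together.
positions = [10, 300, 900, 1500, 1510]
print(flagged_as_described(positions))  # [1510]
print(flagged_strict(positions))        # []
```

In this sketch, a long text accumulates the 4 previous matches sooner or later, so any two nearby matches after that point get flagged. That would fit the observation that the rule fires mostly on large inputs.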

dnaber · September 20, 2022, 10:46am

Issue now at Repetition settings: distance_tokens · Issue #7107 · languagetool-org/languagetool · GitHub