
[en] [pt] [ru] [pl] Re-enabling dash rules

TL;DR:
I optimized AbstractDashRule and would like to enable these rules again for Russian, Portuguese and Polish.
Any objections, anything else I should consider?

I recently wrote a micro-benchmark that automatically profiles all rules (available here right now, but I’ll file a PR soon).
When running it on the English rules, I noticed that EN_DASH_RULE was a huge outlier, an order of magnitude slower than all other rules. Luckily, I had an optimization ready that I had already applied elsewhere: I replaced the pattern-based matching with an efficient string search algorithm. It can no longer handle arbitrary amounts of whitespace between word parts, but it is significantly faster.
Thus, I would also like to re-enable the dash rules for Portuguese, Russian and Polish.
Are there any objections? As far as I could see, they were only disabled for performance reasons.
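
To illustrate the trade-off, here is a minimal sketch. It is not the actual AbstractDashRule code: the class name, the example compound "well-known", and the use of `String.indexOf` as a stand-in for the string search algorithm are all assumptions made for illustration. The pattern variant tolerates any amount of whitespace around the dash, while the string search variant only finds the fixed spellings it was given.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not taken from the LanguageTool code base.
class DashSearchSketch {

    // Pattern-based variant: an en dash (\u2013) or em dash (\u2014)
    // between the word parts, with any amount of whitespace around it.
    static final Pattern PATTERN =
            Pattern.compile("well\\s*[\u2013\u2014]\\s*known");

    // String-search variant: a fixed list of spellings (an assumption),
    // with either no space or exactly one space around the dash.
    static final String[] NEEDLES = {
        "well\u2013known", "well\u2014known",
        "well \u2013 known", "well \u2014 known"
    };

    // Returns the start offset of every pattern match in the text.
    static List<Integer> findWithPattern(String text) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = PATTERN.matcher(text);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    // Returns the start offset of every literal occurrence of a needle,
    // grouped by needle rather than sorted by position.
    static List<Integer> findWithIndexOf(String text) {
        List<Integer> starts = new ArrayList<>();
        for (String needle : NEEDLES) {
            int from = 0;
            int idx;
            while ((idx = text.indexOf(needle, from)) >= 0) {
                starts.add(idx);
                from = idx + 1;
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        // The second occurrence has two spaces on each side of the dash.
        String text = "A well\u2013known fact and a well  \u2013  known one.";
        System.out.println("pattern: " + findWithPattern(text));
        System.out.println("indexOf: " + findWithIndexOf(text));
    }
}
```

In this sketch the pattern variant reports both occurrences in the sample text, while the string search variant misses the one with two spaces around the dash. That is the limitation mentioned above.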


Does that refer to the open-source LanguageTool? Where is the pull request with those changes?

PS - By the way, welcome back.

Yes, it’s about the open-source edition.
Sorry, the pull request will be delayed further because I ran into some semi-related problems with my changes that I still have to fix.