[en] Possible rule - 2017-06-08

(Marco A.G.Pinto) #1

Hello @Mike_Unwalla

The other day I noticed:
“This is all about English speaking people!”
(it suggests a hyphen)

But it doesn’t work with other language names:
“This is all about Portuguese speaking people!”

Could it be improved?


Kind regards,

(Mike Unwalla) #2

Hello @marcoagpinto,

The rule could be improved. I looked on COCA, and most instances of ‘English speaking’ with no hyphen are not correct. All instances of ‘French speaking’ should have a hyphen. (No instances of ‘Portuguese speaking’!)

The rule is in \resource\en\compounds.txt. The list of candidate languages for the rule is long. Rather than adding a long list of words to compounds.txt, I think that a better method is to have a grammar rule. The languages could be specified in grammar.xml in an entity definition, in the same way that we have entity definitions for weekdays and for months.
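
To make the idea concrete, here is a rough Python sketch of what such an entity amounts to: one alternation of language names reused inside a single pattern. It is not LanguageTool code; the names LANGUAGES, LANGUAGE_PATTERN, UNHYPHENATED and suggest_hyphen are hypothetical, and the five languages are only an illustrative sample of what the real list might hold.

```python
import re

# Hypothetical sample of language names. In LanguageTool the list would live
# in an entity definition in grammar.xml, not in Python.
LANGUAGES = ["English", "French", "German", "Portuguese", "Spanish"]

# Join the names into one alternation, much as an entity expands into a
# regular expression wherever it is referenced.
LANGUAGE_PATTERN = "|".join(re.escape(name) for name in LANGUAGES)

# Match "<Language> speaking" only when another word follows, as a crude
# stand-in for "used as a modifier before a noun".
UNHYPHENATED = re.compile(r"\b(" + LANGUAGE_PATTERN + r") speaking(?= \w)")


def suggest_hyphen(text: str) -> str:
    """Return text with '<Language> speaking <word>' rewritten with a hyphen."""
    return UNHYPHENATED.sub(r"\1-speaking", text)


if __name__ == "__main__":
    print(suggest_hyphen("This is all about Portuguese speaking people!"))
```

The lookahead is only a simple heuristic: it would also fire on a sentence such as "He was French speaking to her", so a real rule would probably need a part-of-speech check on the following token.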

@danielnaber, what do you think is the best method?

(Daniel Naber) #3

The entity approach sounds fine. We probably shouldn’t list thousands of languages, as it might slow down the regular expression (I haven’t tested that, though).

(Mike Unwalla) #4

Done (