[en] Possible rule - 2017-06-08

Hello @Mike_Unwalla

The other day I noticed:
“This is all about English speaking people!”
(it suggests a hyphen)

But it doesn’t work with other language names:
“This is all about Portuguese speaking people!”

Could it be improved?

Thanks!

Kind regards,

Hello @marcoagpinto,

The rule could be improved. I looked on COCA, and most instances of ‘English speaking’ with no hyphen are not correct. All instances of ‘French speaking’ should have a hyphen. (No instances of ‘Portuguese speaking’!)

The rule is in \resource\en\compounds.txt. The list of candidate languages for the rule is long: http://www.nationsonline.org/oneworld/language_code.htm. As an alternative to adding a long list of words to compounds.txt, I think that a better method is to have a grammar rule. The languages could be specified in grammar.xml in an entity definition, in the same way that we have entity definitions for weekdays and for months.
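
To make the idea concrete, here is a rough Python sketch of the same approach: one shared list of language names joined into a single alternation, which a pattern then reuses. This is not LanguageTool code and not the syntax of grammar.xml. The names `LANGUAGES`, `MISSING_HYPHEN` and `suggest_hyphen` are made up for illustration, the five languages are an arbitrary subset, and the "followed by another word" check is only an assumed heuristic for skipping cases where no hyphen is needed.

```python
import re

# Illustrative subset only (an assumption); a real rule would choose its own list.
LANGUAGES = ["English", "French", "German", "Portuguese", "Spanish"]

# One shared alternation, playing the role of a reusable entity definition.
LANGUAGE_ALTERNATION = "|".join(re.escape(name) for name in LANGUAGES)

# A language name, whitespace, then "speaking" followed by another word.
# The lookahead is a simple heuristic, not the behaviour of the actual rule.
MISSING_HYPHEN = re.compile(
    r"\b(" + LANGUAGE_ALTERNATION + r")\s+speaking(?=\s+\w)"
)


def suggest_hyphen(text: str) -> str:
    """Return text with 'X speaking' rewritten as 'X-speaking' for listed languages."""
    return MISSING_HYPHEN.sub(r"\1-speaking", text)


if __name__ == "__main__":
    print(suggest_hyphen("This is all about Portuguese speaking people!"))
```

Adding a language then means changing only the shared list, not every pattern that uses it.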

@danielnaber, what do you think is the best method?

The entity approach sounds fine. We probably shouldn’t list thousands of languages, as it might slow down the regular expression (I haven’t tested that, though).

Done ([en] Add a rule for languages to rulegroup MISSING_HYPHEN · languagetool-org/languagetool@1dee478 · GitHub).
