[en] Possible rule - 2017-06-08


(Marco A.G.Pinto) #1

Hello @Mike_Unwalla

The other day I noticed:
"This is all about English speaking people!"
(it suggests a hyphen)

But it doesn't work with other language names:
"This is all about Portuguese speaking people!"

Could it be improved?

Thanks!

Kind regards,


(Mike Unwalla) #2

Hello @marcoagpinto,

The rule could be improved. I looked on COCA, and most instances of 'English speaking' with no hyphen are not correct. All instances of 'French speaking' should have a hyphen. (No instances of 'Portuguese speaking'!)

The rule is in \resource\en\compounds.txt. The list of candidate languages for the rule is long: http://www.nationsonline.org/oneworld/language_code.htm. Rather than adding a long list of words to compounds.txt, I think that a better method is to have a grammar rule. The languages could be specified in grammar.xml in an entity definition, in the same way that we have entity definitions for weekdays and for months.
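To illustrate the idea outside LanguageTool, here is a minimal Python sketch of the entity approach: one shared alternation of language names, defined once and reused in the pattern. It is not the actual rule. The names LANGUAGES, MISSING_HYPHEN and suggest_hyphen are hypothetical, and the language list is deliberately short.

```python
import re

# Hypothetical, deliberately short list of language names (an assumption for
# this sketch). It plays the role of the entity: defined once, reused below.
LANGUAGES = "English|French|German|Portuguese|Spanish"

# Match "<Language> speaking" written with a space instead of a hyphen.
MISSING_HYPHEN = re.compile(rf"\b({LANGUAGES}) speaking\b")


def suggest_hyphen(sentence: str) -> str:
    """Rewrite '<Language> speaking' as '<Language>-speaking'."""
    return MISSING_HYPHEN.sub(r"\1-speaking", sentence)


print(suggest_hyphen("This is all about Portuguese speaking people!"))
# prints: This is all about Portuguese-speaking people!
```

A plain pattern like this flags every occurrence, so it would also change the few instances where the unhyphenated form is correct. A real rule would probably need exceptions for those.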

@danielnaber, what do you think is the best method?


(Daniel Naber) #3

The entity approach sounds fine. We probably shouldn't list thousands of languages, as it might slow down the regular expression (I haven't tested that, though).


(Mike Unwalla) #4

Done (https://github.com/languagetool-org/languagetool/commit/1dee478648c55fbe806a494d7f441d929aaf7779).