The rule could be improved. I checked COCA, and most instances of ‘English speaking’ without a hyphen are incorrect. All instances of ‘French speaking’ should have a hyphen. (There are no instances of ‘Portuguese speaking’!)
The rule is in \resource\en\compounds.txt. The list of candidate languages for the rule is long: http://www.nationsonline.org/oneworld/language_code.htm. Rather than adding a long list of words to compounds.txt, I think a better method is a grammar rule. The languages could be specified in grammar.xml in an entity definition, in the same way that we have entity definitions for weekdays and months.
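To make the proposal concrete, here is a rough sketch of what the entity and rule might look like. The entity name `languages`, the rule id, the rule name, and the message text are all placeholders I made up, and the language list is only a small sample; the entity declaration would go alongside the existing weekday and month entities at the top of grammar.xml.

```xml
<!-- Hypothetical entity: a short sample, not the full language list. -->
<!ENTITY languages "English|French|German|Portuguese|Spanish">

<!-- Hypothetical rule: id, name, and message are placeholders. -->
<rule id="LANGUAGE_SPEAKING_HYPHEN" name="Missing hyphen in 'English speaking'">
  <pattern>
    <!-- First token: any language name from the entity above. -->
    <token regexp="yes">&languages;</token>
    <!-- Second token: the literal word 'speaking'. -->
    <token>speaking</token>
  </pattern>
  <message>This compound is usually hyphenated: <suggestion>\1-\2</suggestion>.</message>
  <example correction="French-speaking">She lives in a <marker>French speaking</marker> region.</example>
</rule>
```

Since not every unhyphenated instance in COCA is wrong, a real rule would probably need some exceptions or extra context tokens to avoid false alarms.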
@danielnaber, what do you think is the best method?