Coherency rules for words such as 'Christianize' and 'balkanize'


(Mike Unwalla) #1

Words that are derived from proper nouns sometimes have an initial capital letter (for example, ‘Christianize’). Some words can have a lowercase initial letter (for example, ‘balkanize’).

Oxford Manual of Style (2002), section 4.1.12 says that if “the association is remote, merely allusive, or a matter of convention”, then initial lowercase is acceptable. “Some words of this type can have both capitals and lower case in different contexts. This depends on whether the connection with the noun is close or loose or – in the case of a term derived from a personal name – whether the word is being used to evoke a specific person or that person’s general attributes.” It gives a list of examples such as “Byzantine (of architecture) but byzantine (complexity)”.

Different dictionaries give different capitalizations.

I think that the easiest solution is to add all the coherency pairs as lower case words, even if they usually have an initial capital letter. The spell checker can deal with words such as ‘Christianize’, which all dictionaries show with initial upper case letter. Is there a problem with my proposed solution?


(Daniel Naber) #2

I’m not sure whether there might be problems, but I suggest you just give it a try. We have many unit tests and the nightly checks, and those should prevent most severe problems.


(Mike Unwalla) #3

Done (https://github.com/languagetool-org/languagetool/commit/4f767c4556a4b1d5810f47a36363237c90529dae).

My proposed ‘easiest solution’ does not work because some of the lower case words do not have postags. Where both initial lower case and initial upper case are possible, I could use initial lower case in the coherency.
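
To illustrate the idea of storing coherency pairs in lower case, here is a minimal Python sketch. It is not LanguageTool's code and it does not use LanguageTool's data files: the pair list and the function names are hypothetical. It also matches surface forms only, with no lemma or POS-tag lookup, so it does not show the postag problem described above.

```python
import re

# Hypothetical pairs for illustration only, all stored in lower case.
PAIRS = [("balkanise", "balkanize"), ("christianise", "christianize")]


def build_variant_map(pairs):
    """Map each spelling to the other spelling of its pair."""
    variant_of = {}
    for first, second in pairs:
        variant_of[first] = second
        variant_of[second] = first
    return variant_of


def find_incoherent_words(text, pairs=PAIRS):
    """Return (word, earlier_variant) for each word whose other
    spelling was already used earlier in the text."""
    variant_of = build_variant_map(pairs)
    seen = set()
    problems = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group(0)
        key = word.lower()  # compare in lower case, whatever the input case
        if key not in variant_of:
            continue
        if variant_of[key] in seen:
            problems.append((word, variant_of[key]))
        seen.add(key)
    return problems


text = "They Christianize the area. Later they christianise it again."
print(find_incoherent_words(text))  # [('christianise', 'christianize')]
```

Because the lookup key is always lower case, ‘Christianize’ and ‘christianize’ are treated as the same spelling, and only the ‘-ise’/‘-ize’ difference is reported.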


(Mike Unwalla) #4

Also https://github.com/languagetool-org/languagetool/commit/04935bbd2dfa352af9c9a6cf90f462b922b56512 for words that can have either an initial lower case letter or an initial upper case letter. (The rule is not complete. Please add missing words if you find them.)