
Problems with umlauts in GERMAN_SPELLER_RULE

As in the linked topic, LanguageTool flags every word containing an umlaut as misspelled.

Example from the API:
Text: Sie können die Police vor dem Fälligkeitsdatum kündigen.
Language: de-DE
Curl: curl -X POST --header 'Content-Type: application/x-www-form-urlencoded' --header 'Accept: application/json' -d 'text=Sie%20ko%CC%88nnen%20die%20Police%20vor%20dem%20Fa%CC%88lligkeitsdatum%20ku%CC%88ndigen.&language=de-DE&enabledOnly=false' 'https://languagetool.org/api/v2/check'

The response shows that the “GERMAN_SPELLER_RULE” rule (“Möglicher Rechtschreibfehler”) treats each word with an umlaut as two words, splitting it at the umlaut character.

Do umlauts need a special encoding?

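Special characters should reach LT in their normal (precomposed) form. In your request, “ö” looks like an “o” followed by a separate combining umlaut character (%CC%88 is the UTF-8 encoding of U+0308, the combining diaeresis), so it arrives as two characters. Instead, use the single “ö” character directly (and URL-encode it).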
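For anyone who wants to check their input before sending it, here is a minimal Python sketch of the idea. It assumes the text may arrive in decomposed form (for example, copied from a source that stores “o” + U+0308) and composes it with Unicode NFC normalization. The helper names `has_combining_marks` and `to_precomposed` are made up for illustration and are not part of LanguageTool.

```python
import unicodedata


def has_combining_marks(text: str) -> bool:
    """Return True if any code point is a combining mark (e.g. U+0308)."""
    return any(unicodedata.combining(ch) != 0 for ch in text)


def to_precomposed(text: str) -> str:
    """Compose base letter + combining mark pairs into single code points (NFC)."""
    return unicodedata.normalize("NFC", text)


# "o" followed by U+0308 COMBINING DIAERESIS: renders like "ö" but is two code points
decomposed = "ko\u0308nnen"
composed = to_precomposed(decomposed)

print(has_combining_marks(decomposed))  # True
print(has_combining_marks(composed))    # False
print(len(decomposed), len(composed))   # 7 6
```

Both strings look identical on screen, which is why the difference is easy to miss; only the code point count and the encoded bytes give it away.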

Well, looking at the URL generated by the API, I understand that the text is URL-encoded (“ö” replaced with “%25C3%25B6”).

I don’t think the URL-encoding step is the problem; its input is. “können” is encoded as ko%CC%88nnen in your original example. I think it should be k%C3%B6nnen.
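To see the two encodings side by side, here is a rough Python sketch that percent-encodes both forms of the word and then builds a form body from NFC-normalized text. The `text` and `language` parameter names come from the curl call in the first post; `build_body` is a hypothetical helper, and normalizing on the client side is just one possible fix, not something the API requires.

```python
import unicodedata
from urllib.parse import quote, urlencode

decomposed = "ko\u0308nnen"  # "o" + U+0308, as in the original request
composed = "k\u00f6nnen"     # single precomposed "ö" (U+00F6)

# Same percent-encoding step, different input bytes
print(quote(decomposed))  # ko%CC%88nnen
print(quote(composed))    # k%C3%B6nnen


def build_body(text: str, language: str = "de-DE") -> str:
    """Build an x-www-form-urlencoded body from NFC-normalized text.

    Hypothetical helper; only the parameter names are taken from the thread.
    """
    params = {
        "text": unicodedata.normalize("NFC", text),
        "language": language,
    }
    # quote_via=quote encodes spaces as %20, matching the curl example
    return urlencode(params, quote_via=quote)


print(build_body(decomposed))  # text=k%C3%B6nnen&language=de-DE
```

The resulting string could be passed to curl's `-d` option in place of the hand-written body.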

Sure, I realised that it is a different character. Thank you very much!