

(Ruud Baars) #1

To get suggestions for run-ons, I could switch on suggestions containing spaces in the spell checker. Unfortunately, this would create wrong suggestions for ad hoc compounds.
So for now I am using the SimpleReplaceRule to suggest alternatives for common run-ons, and I disable the spell checker for those words using ignore.txt.

Is Dutch the only language with this issue? I cannot imagine it is; German and Danish are sure to have the same kind of challenge. Is the way I am trying to tackle this the best way, or is there a smarter solution?

Could we create a kind of ‘spelltagger’, which accepts compounds as well as provides a tag for them?

(Daniel Naber) #2

Can you give an example?

(Ruud Baars) #3

Easily. A word like ‘krokodillengedrag’ is valid, but when it is not in the word list, the spell checker would suggest ‘krokodillen gedrag’. That could also be correct in some sentence constructions, but most of the time it will not be. Of course the existing compound detector would help, but it does not fit well with Dutch compounding rules.

Run-ons mostly involve very frequent (and short) words. So I would prefer to have the split suggestions limited to those short and common words, or limited to a list of checked cases.

Message: Possible spelling error found
Suggestion: krokodillen gedrag
Context: Test een keer krokodillengedrag
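
To make the proposed restriction concrete, here is a minimal sketch in Python. It is not LanguageTool code: the function `split_suggestions`, the word set `COMMON_SHORT_WORDS`, and the table `CHECKED_SPLITS` are all hypothetical names with illustrative data. It only offers a split when the word is in a checked list, or when both halves are short, very common words.

```python
# Illustrative data only; a real setup would load these from word lists.
COMMON_SHORT_WORDS = {"de", "het", "een", "te", "in", "op", "van", "dat", "is"}
CHECKED_SPLITS = {"inieder": "in ieder"}  # hypothetical hand-checked run-on


def split_suggestions(word, common_words, checked_splits, max_part_len=4):
    """Return split suggestions for an unknown word, restricted to safe cases."""
    word = word.lower()
    # 1. A hand-checked run-on always wins.
    if word in checked_splits:
        return [checked_splits[word]]
    # 2. Otherwise only split when both halves are short, common words.
    suggestions = []
    for i in range(1, len(word)):
        left, right = word[:i], word[i:]
        if (
            len(left) <= max_part_len
            and len(right) <= max_part_len
            and left in common_words
            and right in common_words
        ):
            suggestions.append(f"{left} {right}")
    return suggestions


print(split_suggestions("ophet", COMMON_SHORT_WORDS, CHECKED_SPLITS))
print(split_suggestions("krokodillengedrag", COMMON_SHORT_WORDS, CHECKED_SPLITS))
```

With this data, ‘ophet’ yields the single suggestion ‘op het’, while ‘krokodillengedrag’ yields no split suggestion at all, because neither half is in the common-word set.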

(Daniel Naber) #4

Maybe that’s the reason the problem is less visible in German: most compounds are accepted, even artificial ones (e.g. Kühlschrankerstbesteigung = Kühlschrank + erst + Besteigung).
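
As a rough illustration of what accepting such compounds means, the sketch below tries to decompose a word into known parts. It is an assumption-laden toy, not the German speller's actual algorithm: `decompose` and `LEXICON` are hypothetical, and linking elements (such as the German Fugen-s) are ignored.

```python
from functools import lru_cache

# Tiny illustrative lexicon; a real speller would use its full dictionary.
LEXICON = {"kühlschrank", "erst", "besteigung"}


def decompose(word, lexicon, min_part_len=3):
    """Return a tuple of lexicon parts that spell `word`, or None if none exists."""
    word = word.lower()

    @lru_cache(maxsize=None)
    def helper(start):
        if start == len(word):
            return ()  # consumed the whole word
        # Try every candidate part that begins at `start`.
        for end in range(start + min_part_len, len(word) + 1):
            part = word[start:end]
            if part in lexicon:
                rest = helper(end)
                if rest is not None:
                    return (part,) + rest
        return None  # no decomposition from this position

    return helper(0)


print(decompose("Kühlschrankerstbesteigung", LEXICON))
print(decompose("Kühlschrankxyz", LEXICON))
```

A word for which `decompose` returns a result would be accepted and never reach the suggestion stage, which is why a misleading split suggestion would not show up for it.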

(Ruud Baars) #5

I assume that is the case.