[de] How to deal with words missing from the Austrian dictionary?

Jan_Schreiber · June 26, 2017, 5:32pm

We’ve been receiving a lot of user suggestions for Austrian words lately. Most of them aren’t a problem, but there are two I don’t want in the global German spelling.txt file: ‘Leibnitz’, name of a city in Styria (Steiermark), and ‘Feber’ (February), and there are probably more to come. If somebody from Germany types one of those, it is very likely unintentional.
So my question is: Wouldn’t it be useful to have spelling files for all three variants of German, plus one global file?
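
A minimal sketch of that layering, assuming one shared word list plus one list per variant. This is not LanguageTool's actual implementation; the word sets and the `is_known` helper are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical word lists; in practice each would be loaded from its own text file.
GLOBAL_WORDS = {"Haus", "Februar"}
VARIANT_WORDS = {
    "de-DE": set(),
    "de-AT": {"Feber", "Leibnitz"},
    "de-CH": set(),
}


def is_known(word: str, variant: str) -> bool:
    """Accept a word if it is in the shared list or in the list for this variant."""
    return word in GLOBAL_WORDS or word in VARIANT_WORDS.get(variant, set())
```

With this layout, `is_known("Feber", "de-AT")` is true while `is_known("Feber", "de-DE")` is false, so the Austrian form would still be flagged for users in Germany.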

dnaber · June 26, 2017, 5:56pm

Yes, I think so. There are not that many Austrian words, but some should be included. Leibnitz is a rather small city, though, so I’m not sure whether it should be included.

dnaber · June 26, 2017, 6:05pm

For comparison, here are some German cities with about as many inhabitants as Leibnitz:

Hersbruck
Ludwigslust
Rosbach v. d. Höhe
Feuchtwangen
Twistringen
Steinheim an der Murr
Perleberg
Coswig (Anhalt)
Gladenbach
Burladingen
Pößneck
Bad Urach
Beelitz

(Source)

LT knows some of them but not others. Unfortunately there doesn’t seem to be a systematic approach.

Another idea: the spell checker should not just provide a suggestion but briefly explain it, like: “Leibnitz [Stadt in der Steiermark]”.

Jan_Schreiber · June 26, 2017, 6:27pm

I’ve never seen a feature like this in a spell checker. That would really take spell checking to the next level.

jaumeortola · June 26, 2017, 6:32pm

Well, that’s just a “grammar rule”.

You could:

add these words to the dictionary (or the spelling.txt file).
create rules that detect these words.
disable these rules for Austrian German, when appropriate.

tiagosantos · June 26, 2017, 7:17pm

The German grammar already has a category related to this:

See for example:

    <!-- Prominente -->
    <rule id="ALEXIUS_MEINONG" name="'Alexius Meinung (Meinong)'">
        <pattern case_sensitive="yes">
            <token>Alexius</token>
            <marker>
                <token>Meinung</token>
            </marker>
        </pattern>
        <message>Meinten Sie den Philosophen Alexius <suggestion>Meinong</suggestion>?</message>
        <url>https://de.wikipedia.org/wiki/Alexius_Meinong</url>
        <short>&eigenname;</short>
        <example correction="Meinong">Alexius <marker>Meinung</marker> ist ein Philosoph.</example>
    </rule>

Check line 6013.

Jan_Schreiber · June 26, 2017, 7:54pm

Yeah, I know. But it would be cool if the spell checker could prevent some of these errors from happening in the first place. If someone types ‘Leibnits’ it could suggest ‘Leibnitz (town in Austria)’ and ‘Leibniz (philosopher)’.
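
A rough sketch of what such annotated suggestions could look like, using Python's standard `difflib` for the fuzzy matching. The `GLOSSES` table and the `suggest` function are hypothetical, and a real spell checker would rank candidates differently.

```python
import difflib

# Hypothetical dictionary entries, each with a short explanation.
GLOSSES = {
    "Leibnitz": "town in Austria",
    "Leibniz": "philosopher",
    "Stil": "style",
}


def suggest(word: str, limit: int = 3) -> list[str]:
    """Return close dictionary matches, each followed by its explanation."""
    matches = difflib.get_close_matches(word, list(GLOSSES), n=limit, cutoff=0.6)
    return [f"{match} ({GLOSSES[match]})" for match in matches]
```

Calling `suggest("Leibnits")` would then offer both the town and the philosopher, each with its explanation, and leave the choice to the user.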

SkyCharger001 · June 26, 2017, 8:10pm

Perhaps a first step would be to allow the spell-checker to report the originating dictionary. (I personally have User-Defined-Dictionaries by the name of “Locations”, “Names”, “Brands” and others, so I think you’ll understand how this can help.)
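
As an illustration only, here is one way a lookup could report where a word comes from. The dictionary names follow the ones mentioned above; their contents and the `lookup` function are assumptions made for this sketch, not an existing spell-checker API.

```python
# Hypothetical user-defined dictionaries; the contents are made up for illustration.
DICTIONARIES = {
    "Locations": {"Leibnitz", "Beelitz"},
    "Names": {"Leibniz", "Meinong"},
    "Brands": {"Stihl"},
}


def lookup(word: str) -> dict:
    """Report which dictionaries contain the word, as plain data."""
    sources = sorted(name for name, words in DICTIONARIES.items() if word in words)
    return {"word": word, "known": bool(sources), "sources": sources}
```

Returning plain data rather than a formatted string means the result can be shown to a user or passed on to another program.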

dnaber · June 26, 2017, 8:43pm

Another example (also German): for Stihl, LT suggests Stil, Stiel, and Stiehl.

tiagosantos · June 26, 2017, 9:14pm

Again, if you raise the priority of the relevant rules (and create an adequate database), it will display the errors you want.
Copy and paste a Java replace rule, rename it to “place names”, populate its database with misspellings and the related city names, and set its priority higher than the spell checker’s; that gives you this effect.
That is how [pt] identifies misspellings with foreign words and regional variants.
If you want to go fancy and explain more than one category per Java class (places mixed with personalities), a third field needs to be added for a custom message.
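
The idea can be sketched as follows. This is a simplified stand-in for illustration, not the Java replace rule itself: the tab-separated three-column format, `load_table`, and `check` are all hypothetical, and "priority" is modelled simply by consulting the table before the spell checker.

```python
from typing import Callable, NamedTuple


class Replacement(NamedTuple):
    suggestion: str
    message: str


# Hypothetical table format: misspelling, replacement, custom message (tab-separated).
RAW_TABLE = "Leibnits\tLeibnitz\tplace name\nLeibnits\tLeibniz\tperson\n"


def load_table(raw: str) -> dict[str, list[Replacement]]:
    """Parse the three-column table, skipping blank lines and '#' comments."""
    table: dict[str, list[Replacement]] = {}
    for line in raw.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        wrong, right, message = line.split("\t")
        table.setdefault(wrong, []).append(Replacement(right, message))
    return table


def check(word: str, table: dict[str, list[Replacement]],
          spell_check: Callable[[str], list[str]]) -> list[str]:
    """Consult the replacement table first; fall back to the spell checker."""
    if word in table:
        return [f"{r.suggestion} ({r.message})" for r in table[word]]
    return spell_check(word)
```

The third column is what lets a single table mix places and people while still giving each suggestion its own message.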

Theseiko · July 10, 2017, 9:20am

Re Leibnitz:

Again, it was me who suggested the word.

Leibnitz is not just a city but also a district:

Theseiko · July 10, 2017, 9:23am

The origin of the message should also be readable by computer programs, so that it can be used for automated spell checking.