Current behavior: For the misspelled German word “Verhlaten”, I get the following suggestions (in that order):
- Verladen
- Verlagen
- Verhalten
- Erraten
- Verraten
That’s reasonable: the first three suggestions all have an edit distance of two from the misspelled string (counting insertions, deletions, and substitutions). Still, I think the order could be tweaked a little.
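To double-check the distance claim, here is a minimal sketch of the plain Levenshtein distance. The function name `levenshtein` is my own, and I am not claiming this is how the spell checker computes its scores; it only confirms that the three top suggestions tie under this metric.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


for suggestion in ["Verladen", "Verlagen", "Verhalten"]:
    print(suggestion, levenshtein("Verhlaten", suggestion))
```

All three come out as 2, so this metric alone cannot separate them.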
Desired behavior: IMO, the most intuitive order for the first three items would be:
- Verhalten
- Verladen
- Verlagen
I think this is because “Verhalten” (a) consists of exactly the same characters as the original string and (b), more specifically, can be obtained from the original string by swapping two adjacent characters (probably one of the most common kinds of typo). Intuitively speaking, “Verhalten” is even closer to the original string than the other two. I wonder whether criteria (a) and (b) could be used to tweak the order of the suggestions. [Also, “Verladen” is better than “Verlagen” because t and d sound similar, but I think this is already covered in de_DE.info.]
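To make the idea concrete, here is a rough sketch of criteria (a) and (b) used as a re-ranking step. All names (`is_anagram`, `is_adjacent_swap`, `rerank`) are hypothetical and not part of any existing API. In a real implementation this would probably only act as a tie-breaker among suggestions with the same score; for simplicity, the sketch applies a stable sort to the whole list.

```python
from collections import Counter


def is_anagram(a: str, b: str) -> bool:
    """Criterion (a): both strings consist of exactly the same characters."""
    return Counter(a.lower()) == Counter(b.lower())


def is_adjacent_swap(a: str, b: str) -> bool:
    """Criterion (b): b is a with exactly two adjacent characters swapped."""
    a, b = a.lower(), b.lower()
    if len(a) != len(b):
        return False
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diff) == 2
        and diff[1] == diff[0] + 1
        and a[diff[0]] == b[diff[1]]
        and a[diff[1]] == b[diff[0]]
    )


def rerank(word: str, suggestions: list[str]) -> list[str]:
    """Stable sort: adjacent swaps first, other anagrams next, then the rest."""
    def key(s: str) -> int:
        if is_adjacent_swap(word, s):
            return 0
        if is_anagram(word, s):
            return 1
        return 2
    # sorted() is stable, so the original order is kept within each group.
    return sorted(suggestions, key=key)


current = ["Verladen", "Verlagen", "Verhalten", "Erraten", "Verraten"]
print(rerank("Verhlaten", current))
```

With the list from above, this moves “Verhalten” to the front and leaves the relative order of the other suggestions unchanged.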