The texts we spell check contain many proper nouns that get flagged as possible spelling mistakes. Most are names of people and places, and because we are checking news articles, many of them are foreign.
What is the proper way to handle this? Do we simply add every variant to the dictionary, e.g. Hussain, Hussein, Hossain, Hosain, etc.? Is there a common dictionary for names, or do they have to be duplicated in each language's dictionary?
Names quite commonly appear several times in the same text. Is it possible to check them for consistency? I.e. if a name is spelt Hussain four times and Hossain once, we could assume the odd one out is a typo. This would not catch a name that is consistently misspelt as another common variant of the same name, but that seems almost impossible to catch anyway. Perhaps the error could still be caught when the name is part of a full name, e.g. Barack Hossain Obama would be corrected to Barack Hussein Obama.
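
To make the two ideas concrete, here is a rough sketch in Python of what I have in mind. It is not tied to any particular spell checker. The function names, the thresholds (`min_ratio`, `dominance`, `cutoff`) and the `KNOWN_FULL_NAMES` list are all made up for illustration, and treating every capitalised word as a name candidate is a crude assumption that will also pick up sentence-initial words.

```python
import re
from collections import Counter
from difflib import SequenceMatcher, get_close_matches

# Hypothetical list of known full names; a real one would come from some name database.
KNOWN_FULL_NAMES = ["Barack Hussein Obama"]


def find_inconsistent_names(text, min_ratio=0.75, dominance=3):
    """Map rare spellings to a similar, much more frequent spelling in the same text."""
    # Crude candidate extraction: every capitalised word (assumption, not real NER).
    counts = Counter(re.findall(r"\b[A-Z][a-z]+\b", text))
    suggestions = {}
    for rare, rare_n in counts.items():
        for common, common_n in counts.items():
            # Only suggest spellings that clearly dominate the rare one.
            if rare == common or common_n < dominance * rare_n:
                continue
            if SequenceMatcher(None, rare, common).ratio() >= min_ratio:
                best = suggestions.get(rare)
                # Prefer the most frequent similar spelling.
                if best is None or counts[best] < common_n:
                    suggestions[rare] = common
    return suggestions


def correct_full_name(candidate, known_names=KNOWN_FULL_NAMES, cutoff=0.85):
    """Return the closest known full name, or None if nothing close (or already correct)."""
    matches = get_close_matches(candidate, known_names, n=1, cutoff=cutoff)
    if matches and matches[0] != candidate:
        return matches[0]
    return None


text = "Hussain spoke first. Later Hussain left, but Hussain and Hussain met Hossain."
print(find_inconsistent_names(text))
print(correct_full_name("Barack Hossain Obama"))
```

The first function only flags a spelling when a similar one occurs at least `dominance` times as often, so two names that are both used frequently are left alone. The second one only helps for names that are already in the known list, which is exactly the part I do not know how to source.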