Hey Daniel, I was thinking about using a completely different approach to solve the problem. Word embeddings such as GloVe and word2vec are created using the continuous bag-of-words (CBOW) and skip-gram models. The basic idea is that you encode the semantic meaning of a word using the words surrounding it (both to the left and right). These word vectors are created by training a neural network to predict a word from its neighbors (CBOW), or the neighbors from the word (skip-gram).
Since these vectors are freely available online, we can use them to build a function that gives a score to each word in a confusion pair. For example, if the sentence is “what is the some of two and two”, we calculate the scores of “some” and “sum” given (“what”, “is”, “the”, “of”, “two”, “two”), and the right word is the one with the highest score.
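To make the idea concrete, here is a minimal sketch of that scoring function. It assumes we already have a `vectors` dict mapping words to embedding vectors (the toy 2-d vectors below are made up purely for illustration; in practice they would be loaded from pretrained GloVe/word2vec files), and it scores a candidate CBOW-style, as the dot product between the candidate's vector and the average of the context vectors. A real system would turn these scores into log probabilities via a softmax over the vocabulary, but the argmax over the confusion pair is the same either way.

```python
def score(candidate_vec, context_vecs):
    # CBOW-style score: dot product between the candidate word's
    # vector and the mean of the context word vectors.
    dim = len(candidate_vec)
    avg = [sum(v[i] for v in context_vecs) / len(context_vecs)
           for i in range(dim)]
    return sum(c * a for c, a in zip(candidate_vec, avg))

def choose(pair, context, vectors):
    # Pick whichever word of the confusion pair fits the context best.
    ctx = [vectors[w] for w in context if w in vectors]
    return max(pair, key=lambda w: score(vectors[w], ctx))

# Toy, hand-made 2-d "embeddings" just to show the mechanics;
# real GloVe/word2vec vectors have hundreds of dimensions.
vectors = {
    "sum":  [1.0, 0.0],
    "some": [0.0, 1.0],
    "two":  [0.9, 0.1],
    "of":   [0.8, 0.2],
    "what": [0.5, 0.5],
    "is":   [0.5, 0.5],
    "the":  [0.5, 0.5],
}

context = ["what", "is", "the", "of", "two", "two"]
print(choose(("some", "sum"), context, vectors))  # prints "sum"
```

Since this is just a handful of dot products per candidate, the per-sentence cost is tiny, which is where the speed advantage comes from.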
The advantages of this method are:
1) We don’t need to train neural networks (though we can if we want to).
2) Pretrained word vectors are freely available for more than 100 languages.
3) It should be fast, since we just need to compute a log-probability score.
4) It would also be easier to port to Java.
What do you think about this?