Week 4: 17 May – 19 May
I’ve trained the first model (colab notebook link). I applied no complicated feature engineering and no model tuning; this is just a first model to work with. It has ~65% accuracy on the task of guessing whether the user will choose the suggestion or not.
The features used are:
- left and right 3-gram context probability
- edit distance between a misspelling and the suggestion
- whether the first letter of the suggestion matches the first letter of the misspelling
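To make the feature list concrete, here is a minimal sketch of how such a feature vector could be assembled. It is not the code from the notebook: the function names are hypothetical, the edit distance is assumed to be plain Levenshtein, the first-letter comparison is assumed to be case-insensitive, and the two 3-gram context probabilities are assumed to come from an external language model and are simply passed in.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def first_letter_matches(misspelling: str, suggestion: str) -> bool:
    """True if both words are non-empty and start with the same letter.

    Case-insensitive comparison is an assumption of this sketch.
    """
    if not misspelling or not suggestion:
        return False
    return misspelling[0].lower() == suggestion[0].lower()


def build_features(misspelling: str, suggestion: str,
                   left_prob: float, right_prob: float) -> List[float]:
    """Assemble the four features in a fixed order.

    left_prob / right_prob are the left and right 3-gram context
    probabilities, assumed to be computed elsewhere.
    """
    return [
        left_prob,
        right_prob,
        float(edit_distance(misspelling, suggestion)),
        1.0 if first_letter_matches(misspelling, suggestion) else 0.0,
    ]
```

For example, `build_features("teh", "the", 0.2, 0.1)` yields the two probabilities followed by the edit distance and the first-letter flag; a vector like this can then be fed to any binary classifier.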
Today I will compare the model with the current approach used by LT, using a more relevant quality metric (the number of times the top-1 suggestion was selected by the user), and will work on integrating the model with the spellchecker’s ordering algorithm.
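The top-1 metric mentioned above could be computed roughly as follows. This is only a sketch under assumptions: the event format (a ranked suggestion list paired with the suggestion the user actually picked) and the function name are hypothetical, and the result is reported as a rate rather than a raw count so that two orderings can be compared on the same data.

```python
from typing import Iterable, List, Tuple

# Each event: (suggestions in ranked order, suggestion the user selected).
Event = Tuple[List[str], str]


def top1_selection_rate(events: Iterable[Event]) -> float:
    """Fraction of events where the top-ranked suggestion was the one selected."""
    events = list(events)
    if not events:
        return 0.0
    hits = sum(1 for ranked, selected in events
               if ranked and ranked[0] == selected)
    return hits / len(events)


# Usage: rank the same logged events with both orderings and compare the rates.
sample_events = [
    (["the", "ten"], "the"),    # top-1 was selected
    (["cat", "cart"], "cart"),  # user picked the second suggestion
    (["dog"], "dog"),           # top-1 was selected
]
rate = top1_selection_rate(sample_events)
```

Running the same events through the current LT ordering and through the model-based ordering gives two directly comparable numbers.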