Any progress on multi-word spell checking?

(Ruud Baars) #1

Some time ago @danielnaber reported someone was working on this. Any news?

(Daniel Naber) #2

I’m not sure what exactly you are referring to, could you explain?

(Ruud Baars) #3

I was suggesting that we start collecting data for ML on groups of words, to estimate the probability of corrections in cases where every word is itself spelled correctly but the group as a whole is not.
You then remarked that someone was already working on this, and that it could take a week or two…
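
To make the idea concrete, here is a minimal sketch of what I mean, not anything the project actually runs. It assumes a tiny toy corpus and a hand-made confusion set (`CORPUS`, `CONFUSION_SETS`, `context_score` and `suggest` are all hypothetical names for illustration), and it uses raw smoothed bigram counts rather than a real language model:

```python
from collections import Counter

# Toy corpus (assumption for illustration; real data would be far larger).
CORPUS = (
    "i would like a piece of cake . "
    "we want peace of mind . "
    "a piece of paper ."
).split()

# Hypothetical confusion set: each word is valid on its own,
# but the two are easily swapped in context.
CONFUSION_SETS = [{"piece", "peace"}]

# Counts of adjacent word pairs in the corpus.
bigrams = Counter(zip(CORPUS, CORPUS[1:]))


def context_score(prev, word, nxt):
    """Add-one smoothed count score for `word` between `prev` and `nxt`."""
    return (bigrams[(prev, word)] + 1) * (bigrams[(word, nxt)] + 1)


def suggest(tokens):
    """Return (index, original, suggestion) where a confusable word fits better."""
    suggestions = []
    for i, word in enumerate(tokens):
        for conf_set in CONFUSION_SETS:
            if word not in conf_set:
                continue
            prev = tokens[i - 1] if i > 0 else "<s>"
            nxt = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
            # Pick the candidate from the confusion set with the best context fit.
            best = max(sorted(conf_set), key=lambda w: context_score(prev, w, nxt))
            if context_score(prev, best, nxt) > context_score(prev, word, nxt):
                suggestions.append((i, word, best))
    return suggestions


print(suggest("a peace of cake".split()))
print(suggest("we want peace of mind".split()))
```

With this toy data the first sentence gets a suggestion to replace "peace" with "piece", and the second is left alone. The data I have in mind would replace the toy corpus, so the counts (or a trained model) would reflect real usage.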

(Daniel Naber) #4

We’re currently looking at pre-trained language models. If a pre-trained model can be used, no data needs to be collected, or the amount of data we need is smaller. If you’d like to collect data, that’s great, but I currently cannot make any promises about how or when we’d use that data.

(Ruud Baars) #5

I am collecting language data anyway. So if there is a use for it, data for training language models for languages that have none yet is available.