Continuing the discussion from LT and GSoC 2018 - looking for students:
if we have small data, users can download it and use LT offline.
Does offline usage of LT currently work by deploying the server side locally? Or did you mean an offline version of the spellchecker only, or something else? Is the ability to use LT offline already implemented?
From an architectural point of view, I think there should always be a way to download the data and use LT offline, but the default behavior should vary with the amount of data. That gives us behavior that does not depend on the language and avoids hardcoding.
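To make the idea concrete, here is a minimal sketch of what I mean. It is not existing LT code: the class name, the enum, and the 100 MB cut-off are all hypothetical. The point is only that the default is chosen from the data size, with no per-language branches, and that offline use stays possible whenever the user has downloaded the data.

```java
public class OfflineDataPolicy {

    // Hypothetical modes: use data shipped/downloaded locally, or ask a remote server.
    enum DefaultMode { LOCAL_DATA, REMOTE_SERVER }

    // Hypothetical cut-off (100 MB); the real value would need to be discussed.
    static final long SMALL_DATA_LIMIT_BYTES = 100L * 1024 * 1024;

    // The default depends only on how much data there is, not on the language.
    static DefaultMode defaultModeFor(long dataSizeBytes) {
        return dataSizeBytes <= SMALL_DATA_LIMIT_BYTES
                ? DefaultMode.LOCAL_DATA
                : DefaultMode.REMOTE_SERVER;
    }

    // Offline use is possible if the data is local by default
    // or if the user explicitly downloaded it.
    static boolean canUseOffline(boolean userDownloadedData, long dataSizeBytes) {
        return userDownloadedData
                || defaultModeFor(dataSizeBytes) == DefaultMode.LOCAL_DATA;
    }
}
```

With this shape, a language with large n-gram data would default to the remote server, but `canUseOffline(true, size)` is still true once the user has fetched the data.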
Having looked at the spellchecker’s code, I see that it uses neither n-grams nor word-specific features such as POS tags. N-grams are mentioned in the ideas list, but POS tags are not. I think that POS tags could be very useful.
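As a rough illustration of how both signals could be combined when ranking suggestions, here is a toy sketch. Everything in it is an assumption for the sake of the example: the `ContextRanker` class, the in-memory maps, the tag names, and the scoring formula are hypothetical and do not reflect how the LT spellchecker or tagger actually work.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class ContextRanker {
    // Toy data (hypothetical): "previous candidate" -> bigram count.
    private final Map<String, Integer> bigramCounts;
    // Toy lexicon (hypothetical): word -> single POS tag.
    private final Map<String, String> posOf;
    // Toy bonus (hypothetical): "prevTag candidateTag" -> plausibility bonus.
    private final Map<String, Double> tagPairBonus;

    ContextRanker(Map<String, Integer> bigramCounts,
                  Map<String, String> posOf,
                  Map<String, Double> tagPairBonus) {
        this.bigramCounts = bigramCounts;
        this.posOf = posOf;
        this.tagPairBonus = tagPairBonus;
    }

    // Score = log-scaled bigram count + bonus for a plausible POS tag pair.
    double score(String previousWord, String candidate) {
        int count = bigramCounts.getOrDefault(previousWord + " " + candidate, 0);
        String tagPair = posOf.getOrDefault(previousWord, "UNK") + " "
                + posOf.getOrDefault(candidate, "UNK");
        return Math.log(1.0 + count) + tagPairBonus.getOrDefault(tagPair, 0.0);
    }

    // Returns a new list with the best-scoring candidate first.
    List<String> rank(String previousWord, List<String> candidates) {
        List<String> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator
                .comparingDouble((String c) -> score(previousWord, c))
                .reversed());
        return sorted;
    }
}
```

For example, with no bigram evidence at all, a bonus for the tag pair `DET NOUN` is enough to rank "form" above "from" after "the", which is the kind of case where I would expect POS tags to help when n-gram data is sparse.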
There are various examples of ML-based spellcheckers on the web, so implementing a new one is not hard, but it then has to be integrated with Java and continuously updated. Does LT use trained models anywhere else, so that I could follow the technology stack and style of those cases?