Build n-gram for a new language

alex2 · April 16, 2017, 1:34pm

Hi Daniel,

I now have a large amount of text (~2 GB).
I saw your reply in the discussion below: En n-gram data

Does that mean I can use Lucene 5.2.1 to create the index?
Also, could you tell me the next steps for creating the n-grams myself?

Thanks a lot.