There’s a 7 GB zip file with a set of Polish n-grams created by the Wrocław University of Science and Technology, released under a Creative Commons BY-SA 4.0 license. It contains 2-grams, 3-grams and even some 4-grams, but they’re all in .txt format, as tables of words.
https://zasobynauki.pl/zasoby/n-gramy-jezykowe,18469/
How do I convert this into a Lucene index usable by LanguageTool?