Possible improvements for Belarusian

Hello, guys! I’m just totally new here and this is my first message.

I’ve been using the Belarusian (BY) spell checker for some time, though its vocabulary seems rather poor. Recently I saw a note like “Belarusian has very incomplete support in LanguageTool and there is nobody taking care of it”. It seems to me that there are ways to improve the Belarusian spell checker significantly and (I’m not sure) relatively easily.

One way is to reuse an existing BY vocabulary, such as the one for MS Word (or some other existing one). MS Word’s vocabulary is much richer and produces far fewer “false alarms”, though it lacks some functionality (which is why I use LT for the final spell check). Something potentially usable: http://bnkorpus.info/download.html

Is it possible to “quickly” incorporate the existing databases into LT? Or is it just that there’s nobody who could take care of this, and you’re looking for new active participants?

Hi, thanks for your interest in LanguageTool. Yes, we’re looking for new participants, as it’s very difficult to work on a language one doesn’t speak. The spell checker already uses existing data, but maybe not the latest version. What we use is this data, which is also used by e.g. LibreOffice:

Belarusian affix file and dictionaries

Version number: 0.52

Copyright: Mikalai Udodau (C) 2010, 2011

License: Creative Commons Attribution-ShareAlike

See Creative Commons — Attribution-ShareAlike 3.0 Unported — CC BY-SA 3.0

Updating it (if there’s a new version available) doesn’t require programming, but one should be familiar with the command line. The process is documented at Spell check - LanguageTool Wiki. If you’d like to give it a try, that would be great. You can always ask here if something isn’t clear.

Regards
Daniel

I can add this dictionary (dict-be-official-2008-20140108.oxt - spell checking for LibreOffice/OpenOffice, under the terms of the Creative Commons Attribution/Share-Alike 3.0 license) to LanguageTool.
But we need to test LT for Belarusian after this integration.
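For anyone who wants to look inside the .oxt file mentioned above: an .oxt extension is a ZIP archive, and the Hunspell data sits in it as .aff and .dic files. The following is only a minimal sketch of pulling those files out; the function name `extract_hunspell_files` and the output directory are made up for illustration, and this is not the import procedure LanguageTool itself uses.

```python
import zipfile
from pathlib import Path


def extract_hunspell_files(oxt_path, out_dir):
    """Copy every .aff and .dic file from an .oxt archive into out_dir.

    Hypothetical helper for illustration only. Returns the sorted list
    of extracted file names (directory parts inside the archive are dropped).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(oxt_path) as archive:
        for info in archive.infolist():
            name = Path(info.filename).name  # keep only the file name
            if name.lower().endswith((".aff", ".dic")):
                (out / name).write_bytes(archive.read(info))
                extracted.append(name)
    return sorted(extracted)
```

Calling `extract_hunspell_files("dict-be-official-2008-20140108.oxt", "be_dict")` on the downloaded file should leave the affix file and word list in `be_dict`, ready for the steps described on the wiki page.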

That would be nice! I can test LT for Belarusian afterwards.

Done!
You can test the new dictionary (https://www.languagetool.org/download/snapshots/LanguageTool-20151214-snapshot.zip
or the online form at www.languagetool.org).

Yeah, it works! :)
I cannot directly compare with the previous version. But as far as I can see from some minor tests, LT now produces far fewer “false alarms” than it did before. Moreover, combined with the older dictionary, it produces fewer “false alarms” than the MS Word dictionary alone. The ability to detect errors is preserved.
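A rough way to make such a comparison more direct is to run the same sample text against the old and the new word list and see which words each one fails to recognize. The sketch below is only an approximation under stated assumptions: the helper names `load_dic_words` and `unknown_words` are hypothetical, and it reads just the base entries of a Hunspell .dic file without applying the affix rules, so it will report more unknown words than LT actually would.

```python
import re


def load_dic_words(dic_text):
    """Parse Hunspell .dic content into a set of lowercase base words.

    The first line (the entry count) is skipped, and anything after '/'
    (affix flags) or a tab (morphological data) is dropped. Affix rules
    are NOT applied, so inflected forms are missing from the result.
    """
    words = set()
    for line in dic_text.splitlines()[1:]:
        entry = line.split("/", 1)[0].split("\t", 1)[0].strip()
        if entry:
            words.add(entry.lower())
    return words


def unknown_words(text, words):
    """Return the sorted distinct tokens of text that are not in words."""
    # Runs of letters, optionally joined by an apostrophe inside the word.
    tokens = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())
    return sorted(set(tokens) - words)
```

Loading both dictionaries with `load_dic_words` and comparing `unknown_words(sample, old_words)` with `unknown_words(sample, new_words)` gives a quick list of the words the new dictionary has gained.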

Later I’ll ask the guys to test it a little bit more, though it seems to work pretty fine already.

Thanks for the good work!

http://bnkorpus.info/download.html also contains the full database in XML format, which is much better than the old dictionary from Mikalai Udodau.