
Wrong spell checker (POLISH and RUSSIAN and ENGLISH + ALL LANGUAGES): problem still not solved

There is a problem with detecting missing capital letters after the dot, etc.
It does not detect missing spaces after the DOT.
There is also a detection issue with duplicate spaces.

The problem has been around for a long, long time. I reported this issue and it was supposed to be fixed, but it is still unresolved.

This is not really an easy problem, since all kinds of correct word groups contain a full stop, like URLs, abbreviations etc.

Could you post the text as text, not as a screenshot, or even better, both?

Mini example:


Hey how.are you.

It does not detect an error.

I have 10,000,000 such errors in the text and I don’t know how to correct them.

Maybe you can use a regular expression, e.g. \w\.\w
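As a rough illustration of that suggestion, here is a minimal Python sketch that lists every place where a dot sits directly between two word characters. The function name `find_candidates` is made up for this example. It is a plain search helper, not anything LT does internally, and it will also hit file names, URLs and numbers such as 3.14.

```python
import re

# \w\.\w: a word character, a literal dot, and another word character.
# For str patterns in Python 3, \w also covers Polish and Russian letters.
MISSING_SPACE = re.compile(r"\w\.\w")

def find_candidates(text):
    """Return (offset, matched_text) for each dot squeezed between two word characters."""
    return [(m.start(), m.group()) for m in MISSING_SPACE.finditer(text)]

print(find_candidates("Hey how.are you."))  # [(6, 'w.a')]
```

The final dot in "you." is not reported because nothing follows it.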

Unfortunately, how.are could easily be a file name. If we reported this, every file name would trigger a report.

I think it would be possible to report this kind of error between very common words. Common file extensions could be excluded, as well as strings that contain a dot multiple times (URLs).
I think it is almost language independent as well.
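The idea above could be sketched roughly as follows in Python. This is only an illustration, not the actual LT rule: the word list, the extension list and the function name `missing_space_after_dot` are all hypothetical, and a real rule would need proper per-language data.

```python
import re

# Hypothetical, deliberately tiny lists, for illustration only.
COMMON_WORDS = {"how", "are", "you", "the", "and", "is", "it", "this", "that"}
FILE_EXTENSIONS = {"txt", "pdf", "doc", "jpg", "png", "html", "com", "org"}

# A token is any run of non-whitespace characters.
TOKEN = re.compile(r"\S+")

def missing_space_after_dot(text):
    """Return (offset, token) for tokens that look like two common words glued by a dot."""
    hits = []
    for m in TOKEN.finditer(text):
        token = m.group().rstrip(".,;:!?")  # ignore sentence-final punctuation
        if token.count(".") != 1:
            continue  # no inner dot, or several dots (URL, version number, ...)
        left, right = token.split(".")
        if right.lower() in FILE_EXTENSIONS:
            continue  # looks like a file name
        if left.lower() in COMMON_WORDS and right.lower() in COMMON_WORDS:
            hits.append((m.start(), token))
    return hits

print(missing_space_after_dot("Hey how.are you."))  # [(4, 'how.are')]
```

With these lists, readme.txt and www.example.com are skipped, while how.are is reported.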

I just made a test rule for Dutch, and will see what it does in the nightly test. If successful, I will post that rule for other languages.

You could try search and replace. It looks like you are editing text from OCR, am I right?
Making some changes in a text editor is sometimes very helpful in that case.
There are some good command line tools as well, especially for Linux. And some Windows editors can use macros for specialized search and replace.
But still… I built a test rule for Dutch, which will be tested in tonight’s run.

There are many more problems in the text, but LT does not detect them! For example, runs of too many spaces should be reduced to a single standard space, an uppercase letter should be proposed after the dot, and so on.
Besides, I use EmEditor, the best and fastest editor in the world for huge text files, and I don’t know how to integrate language checking with Emurasoft EmEditor. These are (human) writing errors, not OCR errors.

Multiple spaces are very easy to replace with a single one. Just globally replace 2 spaces by one, and repeat that until no double space is found.
LT has, like all tools, some limitations.
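Outside an editor, the same repeat-until-done replacement can be sketched in a few lines of Python. Both function names are invented for this example. The regex variant is an alternative that gives the same result in a single pass, and neither touches tab characters or line breaks.

```python
import re

def collapse_spaces_loop(text):
    # Literal version of the advice: replace two spaces by one until none are left.
    while "  " in text:
        text = text.replace("  ", " ")
    return text

def collapse_spaces_regex(text):
    # One pass: every run of two or more spaces becomes a single space.
    # Only the space character is matched, so tabs and line breaks are kept.
    return re.sub(r" {2,}", " ", text)

sample = "a   b\t\tc  d"
print(repr(collapse_spaces_loop(sample)))   # 'a b\t\tc d'
print(repr(collapse_spaces_regex(sample)))  # 'a b\t\tc d'
```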

The problem is that you have to distinguish the spaces from the required tabs (regex: \t), for example in text laid out in book form.

A good editor does that, e.g. Kate (KDE) on Linux or Notepad++ on Windows.

The general rule for all languages can detect a missing space after a dot if the next character is a capital letter.
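A minimal Python sketch of such a check, for illustration only. The function names are hypothetical. Requiring a lowercase letter before the dot is an extra assumption added here so that abbreviations like U.S.A. are not reported, and a file name such as file.Txt would still be flagged.

```python
def missing_space_offsets(text):
    """Return offsets of dots that follow a lowercase letter and precede an uppercase one."""
    hits = []
    for i in range(1, len(text) - 1):
        # str.islower() and str.isupper() work for Latin and Cyrillic letters alike.
        if text[i] == "." and text[i - 1].islower() and text[i + 1].isupper():
            hits.append(i)
    return hits

def add_missing_spaces(text):
    """Insert one space after every dot reported by missing_space_offsets."""
    hits = set(missing_space_offsets(text))
    return "".join(ch + " " if i in hits else ch for i, ch in enumerate(text))

print(add_missing_spaces("Hey how.Are you."))  # Hey how. Are you.
```

A lowercase letter after the dot, as in how.are, is deliberately left alone by this check.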

This is not a good correction by LT; other correctors can detect it, e.g.:

[Automatically fixed errors (1)]

This list shows the automatically corrected errors.

This sentence does not begin with a capital letter (original message: "To zdanie nie zaczyna się wielką literą")

Hej. ja będę.

Девятая аудитория располагалась на третьем этаже главного корпуса, но была маленькой и неуютной.

Another example:

Все способы имеют свои преимущества и недостатки. Например, световое зонирование не даст полного ощущения деления комнаты на зоны, так что данный способ используют, как правило, в сочетании с другими. А слишком много перегородок, наоборот, приведет к чрезмерному дроблению пространства, в итоге - комната будет выглядеть слишком маленькой и неуютной. Так что при выборе того или иного способа необходимо учитывать эти нюансы.
How can I fix this problem?

And what is the problem? The phrases with the word in bold are correct.

Please check this GitHub issue: #3655