I am writing a small tool that verifies HTML files (mainly to convert AsciiDoc files to HTML and then verify the result).
Consider the following piece of text.
A combination of the foo, bar, and baz items.
Foo, bar, and baz are enclosed in <pre>. I am already using annotations because the contents of <pre> may not be correct English. However, that makes LanguageTool see:
A combination of the, and items.
So I get:
Articles like ‘the’ are rarely followed by punctuation. A word may be missing after ‘the’, or the punctuation mark may not be necessary.
and various spacing errors. Can I tell the API that something is a non-word (so it shouldn’t trigger spellchecking) but is still grammatically significant? (E.g., I think it should “think” there’s an adjective there.)
(Ugh, I see I had already asked this in “Draft AsciiDoc integration - #2 by koalillo”, but the initial post in that thread was flagged because… it was spam? I wanted to post a link to a project I’m starting that I think can be useful, but now I cannot link to GitHub?)
...ootnotes,#footer{padding:0}} The foo, bar, and baz tokens. Last updated 2...
~~~~
Put a space after the comma, but not before the comma.
________________________________________________________________________________
...tes,#footer{padding:0}} The foo, bar, and baz tokens. Last updated 2023-0...
~~~~
Two consecutive commas
________________________________________________________________________________
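Both reports are consistent with the markup being dropped entirely. As a rough sketch (assuming each <pre> element is sent as one `markup` entry with no `interpretAs`, which may not match exactly how the tool splits the HTML), the text left for the checker can be reconstructed like this. The `annotation` list and the helper name `text_seen_by_checker` are made up for illustration, and the sentence is the "items" example from the start of the post:

```python
# Hypothetical annotation: every <pre> element is a single markup entry
# and carries no interpretAs, so it contributes nothing to the checked text.
annotation = [
    {"text": "A combination of the "},
    {"markup": "<pre>foo</pre>"},
    {"text": ", "},
    {"markup": "<pre>bar</pre>"},
    {"text": ", and "},
    {"markup": "<pre>baz</pre>"},
    {"text": " items."},
]


def text_seen_by_checker(annotation):
    """Join the text parts; markup counts only through its interpretAs, if any."""
    return "".join(
        part.get("text", part.get("interpretAs", "")) for part in annotation
    )


print(repr(text_seen_by_checker(annotation)))
# prints 'A combination of the , , and  items.'
```

The result has a space before a comma, two commas in a row, and a double space, which would explain the comma and spacing reports.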
I think I could solve the problem by sending “green” instead of a space in interpretAs, but that seems like an ugly hack.
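For reference, a minimal sketch of that placeholder workaround, assuming the annotation is sent through the `data` parameter of the `/v2/check` endpoint. The helper names `annotate` and `request_body`, the regular expression, and the choice of "green" are illustrative only, not something LanguageTool prescribes:

```python
import json
import re
from urllib.parse import urlencode

# Capturing group, so re.split keeps the matched <pre> elements in the result.
PRE_ELEMENT = re.compile(r"(<pre>.*?</pre>)", re.DOTALL)


def annotate(fragment, placeholder):
    """Split an HTML fragment into text parts and <pre> markup parts.

    Each <pre> element becomes a markup entry whose interpretAs is the
    placeholder word, so the checker sees an ordinary word in its place.
    """
    parts = []
    for index, chunk in enumerate(PRE_ELEMENT.split(fragment)):
        if not chunk:
            continue  # empty text between adjacent elements or at the edges
        if index % 2 == 1:  # odd positions are the captured <pre> elements
            parts.append({"markup": chunk, "interpretAs": placeholder})
        else:
            parts.append({"text": chunk})
    return {"annotation": parts}


def request_body(fragment, placeholder="green", language="en-US"):
    """Form-encoded body for a POST to /v2/check (the request is not sent here)."""
    data = json.dumps(annotate(fragment, placeholder))
    return urlencode({"language": language, "data": data})


fragment = "A combination of the <pre>foo</pre>, <pre>bar</pre>, and <pre>baz</pre> items."
print(json.dumps(annotate(fragment, "green"), indent=2))
```

With this payload the checker would read "A combination of the green, green, and green items.", which avoids the comma reports. The placeholder is still a real word, though, so it can influence other rules, and that is why it feels like a hack.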