Using the Docker image erikvl87/languagetool, I added a custom rule to the TYPOS category in grammar.xml, as explained in the development-overview documentation. It should replace double quotes with guillemets.
The rule looks like this:
<category id="TYPOS" name="Mögliche Tippfehler">
...
<rule id="GUILLEMENTS" name="Guillements als Anführungszeichen verwenden">
<regexp>(^"|\s")([[\d\p{L}\p{Punct}&amp;&amp;[^"]]\s]*)([\d\p{L}\p{Punct}&amp;&amp;[^"]]{1})"</regexp>
<message>Bitte die französischen Anführungszeichen verwenden <suggestion><match no="1" regexp_match="&quot;" regexp_replace="" />»\2\3«</suggestion></message>
<example correction="»\2\3«"><marker>"\2\3"</marker></example>
</rule>
However, double-quoted text that spans multiple sentences is not matched,
e.g. "Tom goes out. Tom returns."
Quoted sentences with several clauses separated by commas are matched.
When I try my regex in a Java regex tool, the example above is matched.
I think the sentence tokenizer is separating the sentences before my custom rule is evaluated.
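To illustrate the suspicion, here is a minimal Java sketch that runs the same regex once on the whole quoted text and once on two hand-made sentence pieces. The class and method names (GuillemetRegexCheck, replaceQuotes, matches) are hypothetical, and the manual split is only an assumption about what a sentence tokenizer might produce; it is not LanguageTool's actual tokenizer output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of LanguageTool.
class GuillemetRegexCheck {

    // Same pattern as in the rule above, written as a Java string literal.
    static final Pattern QUOTED = Pattern.compile(
        "(^\"|\\s\")([[\\d\\p{L}\\p{Punct}&&[^\"]]\\s]*)([\\d\\p{L}\\p{Punct}&&[^\"]]{1})\"");

    // True if the pattern is found anywhere in the given text.
    static boolean matches(String text) {
        return QUOTED.matcher(text).find();
    }

    // Mimics the rule's suggestion: keep any leading whitespace from group 1,
    // drop its quote, and wrap groups 2 and 3 in guillemets.
    static String replaceQuotes(String text) {
        Matcher m = QUOTED.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String lead = m.group(1).replace("\"", "");
            // \u00BB is the right-pointing guillemet, \u00AB the left-pointing one.
            String replacement = lead + "\u00BB" + m.group(2) + m.group(3) + "\u00AB";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String whole = "\"Tom goes out. Tom returns.\"";
        // Assumed split: one piece per sentence, done by hand for illustration.
        String first = "\"Tom goes out. ";
        String second = "Tom returns.\"";

        System.out.println("whole text matches:  " + matches(whole));
        System.out.println("first piece matches: " + matches(first));
        System.out.println("second piece matches: " + matches(second));
        System.out.println("replacement: " + replaceQuotes(whole));
    }
}
```

On the whole text the pattern is found, but neither piece matches on its own: the first has no closing quote and the second has no opening quote at the start or after whitespace. That would be consistent with the rule only ever seeing one sentence at a time.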
Any suggestions on how to solve this case with just a custom rule in grammar.xml?
Thanks in advance for any help!