Some words can be spelled in different ways (e.g. btw and BTW, or deur-knop/deurknop), all of them correct. But it is a good thing to keep the spelling consistent across an entire document. Is there a function to get this done?
Yes, there’s a coherency.txt file for some languages. I’ll activate it for Dutch and let you know when it’s available.
I sometimes use alternate spellings to hint at (minor) accents, so would it be possible to disable this (as a rule)?
It’s a rule like any other, so it can be enabled or disabled in the same way.
Just wanted to be sure.
We could discuss the variations. Casing might not be an issue, but accents might. Let’s try it first.
Do not mix variants of the same word ('" + word1 + "' and '" + word2 + "') within a single text.
Gebruik liever geen verschillende spellingen ('" + word1 + "' en '" + word2 + "') door elkaar in een tekst.
Consistente spelling van woorden met meerdere correcte vormen. (In English: "Consistent spelling of words with multiple correct forms.")
(P.S. Is it strictly limited to 2? There are probably words with more variants in Dutch…)
I added hivtest and hiv-test as variants first, to get an idea of how it will work.
Thanks, I’ve added the translations. Could you also come up with an example in which both variants of such a word are used? This is shown when the user wants to see an example.
Currently it is.
Not a big issue; they can always be added as pairs.
An example: We raden af om in één tekst zowel hivtest als hiv-test te schrijven. (In English: "We advise against writing both hivtest and hiv-test in a single text.")
The rule is working fine. It apparently assumes the first hit is the best one; the second is reported as a deviation.
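To make the observed behaviour concrete, here is a minimal sketch of a "first hit wins" check. It is not LanguageTool's actual code; the names PAIRS and find_deviations are made up for illustration, and the variant pairs are written inline rather than read from coherency.txt.

```python
import re

# Hypothetical variant pairs, written inline for the sketch.
PAIRS = [("hivtest", "hiv-test"), ("hivvirus", "hiv-virus")]


def find_deviations(text, pairs=PAIRS):
    """Report variants that differ from the first variant seen in the text.

    Returns a list of (position, found_word, variant_used_first) tuples.
    """
    # Map every variant to the index of the pair it belongs to.
    variant_to_pair = {}
    for index, pair in enumerate(pairs):
        for word in pair:
            variant_to_pair[word.lower()] = index

    first_seen = {}   # pair index -> variant that appeared first
    deviations = []
    for match in re.finditer(r"[\w-]+", text):
        token = match.group(0).lower()
        index = variant_to_pair.get(token)
        if index is None:
            continue  # not a word we track
        chosen = first_seen.setdefault(index, token)
        if token != chosen:
            deviations.append((match.start(), match.group(0), chosen))
    return deviations
```

With the text "Een hiv-test is gratis. De hivtest duurt kort." this sketch flags hivtest, because hiv-test came first. Swap the two sentences and hiv-test gets flagged instead.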
Understandable. In Dutch, however, both variants are correct, but one is considered better than the other (hivvirus is the ‘base’ form, hiv-virus the ‘optional’ one).
So a wish of mine would be to alter the function so that the first item of each pair in the coherency file is treated as the preferred one. Should I file a request on GitHub?
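The requested behaviour could look roughly like the sketch below. It assumes that the first item of each pair is the preferred form and that a word is only reported when both variants occur in the same text, since both are correct on their own. The names PAIRS and find_nonpreferred are hypothetical, not part of LanguageTool.

```python
import re

# Hypothetical pairs: the first item is assumed to be the preferred form.
PAIRS = [("hivtest", "hiv-test"), ("hivvirus", "hiv-virus")]


def find_nonpreferred(text, pairs=PAIRS):
    """Flag the non-preferred variant whenever both variants are mixed.

    Returns a sorted list of (position, found_word, preferred_word) tuples.
    """
    tokens = [(m.start(), m.group(0)) for m in re.finditer(r"[\w-]+", text)]
    present = {token.lower() for _, token in tokens}

    results = []
    for preferred, optional in pairs:
        # Only report when the text actually mixes the two spellings.
        if preferred in present and optional in present:
            for position, token in tokens:
                if token.lower() == optional:
                    results.append((position, token, preferred))
    return sorted(results)
```

Unlike the first-hit approach, the order in which the variants appear no longer matters: in "Het hiv-virus muteert. Het hivvirus ook." the sketch reports hiv-virus even though it comes first.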
I do not fully agree with that reasoning for this specific word. The dictionary of the Dutch Language Union contains both forms, with the following clarification:
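Bij enkele woorden kan het eerste deel in losse letters worden uitgesproken, maar ook als een geheel. In het eerste geval komt er een verbindingsstreepje tussen de beide delen van het woord, in het tweede geval worden ze zonder meer aan elkaar geschreven. (In English: "For some words, the first part can be pronounced as separate letters, but also as a whole. In the first case a hyphen is placed between the two parts of the word; in the second case they are simply written as one word.")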
So the preferred orthography depends on how you actually pronounce the word. But I still agree that there should be a way to indicate a preferred orthography, as there are words where there is a clear preference. I am creating an issue on GitHub with some of my ideas for the word coherency rule.
You are correct for this example. But even then, it would be better for consistency to have a preferred one. Consider the situation:
I can say hivvirus, but also hiv-virus, and likewise hiv-patiënt and hivpatiënt.
Here hiv-virus and hivpatiënt will be marked, leaving hivvirus next to hiv-patiënt, which is also an inconsistency.
I think that ideally we should have only one line in the file, with a regular expression that matches all compound words with “hiv”. I made an issue about it here.
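As a rough illustration of the single-regex idea, the sketch below matches every compound that starts with "hiv", with or without a hyphen, and compares the hyphenation style across all of them instead of per word pair. The pattern and the name find_style_deviations are my own assumptions; how such a line would be written in the coherency file is not settled here. Note that the pattern is naive and would also match unrelated words that happen to start with "hiv".

```python
import re

# Assumed pattern: "hiv", an optional hyphen, then the rest of the compound.
HIV_COMPOUND = re.compile(r"\bhiv(-?)(\w+)", re.IGNORECASE)


def find_style_deviations(text):
    """Return hiv compounds whose hyphenation differs from the first one seen."""
    first_style = None
    deviations = []
    for match in HIV_COMPOUND.finditer(text):
        style = "hyphenated" if match.group(1) else "closed"
        if first_style is None:
            first_style = style          # the first compound sets the style
        elif style != first_style:
            deviations.append(match.group(0))
    return deviations
```

For a text that starts with hivvirus and then uses hiv-patiënt, hivpatiënt and hiv-virus, this reports hiv-patiënt and hiv-virus, so the remaining forms are all written the same way.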