hyphenated compounds with numbers on the left side are false positives:
My 2 cents: have a look at the official spelling list, e.g. woordenlijst.
One should also check the complicated rules in the ‘technische handleiding spelling’.
standard/common way of writing decades (years):
30's of goed, zoeken uit,
=> Officially it is: de 30’er jaren, de jaren 30. See "Jaren '30 / jaren 30" on Taaladvies.net.
numbers followed by units are expected to have a space in between but remain lower case:
24uur per dag,
253ha meerderjarig zijn verklaard.,
- 8uur, zijn we weer open.
=> These are easy to immunize in the disambiguator; then add a rule that suggests the missing space.
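To make the suggestion concrete, here is a rough sketch in Python of what such a rule would do. It is not the actual rule syntax, just a regex illustration. The `UNITS` list and the function name `suggest_unit_space` are my own assumptions; a real rule would need a much fuller unit list.

```python
import re

# Hypothetical, non-exhaustive list of unit tokens (assumption for illustration).
UNITS = ("uur", "ha", "km", "kg", "cm", "mm", "ml")

# A number (optionally with a decimal part) glued directly to a known unit.
_UNIT_RE = re.compile(r"\b(\d+(?:[.,]\d+)?)(" + "|".join(UNITS) + r")\b")


def suggest_unit_space(text: str) -> str:
    """Return the text with a space inserted between number and unit.

    The unit itself is left untouched, so it stays lower case.
    """
    return _UNIT_RE.sub(r"\1 \2", text)


if __name__ == "__main__":
    for example in ("24uur per dag,", "- 8uur, zijn we weer open."):
        print(suggest_unit_space(example))
```

The word boundaries keep the pattern from firing inside longer tokens, so something like "A4uur" is left alone.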
Split numbers and words?:
5Waarom moet ik thuis maar een klein beetje eten voordat ik de cursus Koken ga doen?
=> Good idea, but prepare for many exceptions, e.g. paragraph numbers. IMHO, most of these mistakes are artifacts of text extraction.
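A conservative version of the split could look like the sketch below (again plain Python, not rule syntax, and `split_leading_number` is a made-up name). I assume here that only a sentence-initial number glued to a capitalised word of at least three letters is split; that already skips things like "3D-printer" or "2a", but it does nothing clever about paragraph numbers or other exceptions.

```python
import re

# Digits at the very start, glued to a capitalised word of 3+ letters.
_GLUED_RE = re.compile(r"^(\d+)([A-Z][a-z]{2,})")


def split_leading_number(sentence: str) -> str:
    """Insert a space between a leading number and the word glued to it."""
    return _GLUED_RE.sub(r"\1 \2", sentence, count=1)


if __name__ == "__main__":
    print(split_leading_number("5Waarom moet ik thuis maar een klein beetje eten?"))
```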
On names: many names in Tatoeba are copied from a sentence in a different language, so the name is often very uncommon in Dutch. It may be better to use only untranslated Tatoeba sentences, and even those are sometimes of low quality.
Sorry for posting this on the forum when the text is on GitHub, but GitHub is unworkable for me.