I get the impression that the sentence-splitting rules in the SRX file are a bit too greedy.
Numbers like 15.4 are split into two numbers.
15.4.3.Dit is een test. (a common error: no space after a paragraph number)
is split into separate sentences.
I could understand a split between 15.4.3. and Dit, since it is also a common mistake to forget the space after the period at the end of a sentence. But I would rather flag that with a rule than split there.
How can I correct this?
By the way, is there a way to test the SRX, like throwing a corpus at it and having it mark each sentence boundary with a special character?
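
To make clear what kind of test I mean, here is a minimal Python sketch. It does not read an actual SRX file. It only imitates the way I understand SRX rules to work: at each position the rules are tried in order, and the first rule whose "before break" and "after break" patterns both match decides whether there is a break. The rule list, the function names is_break and mark_boundaries, and the "|" marker are all made up for illustration.

```python
import re

# Hypothetical rules in SRX order: (is_break, beforebreak regex, afterbreak regex).
# These are illustrative assumptions, not the rules from the real SRX file.
RULES = [
    (False, r"\d\.", r"\d"),        # no break inside numbers such as 15.4
    (False, r"\d\.", r"[^\W\d_]"),  # no break in "15.4.3.Dit"; leave it to a checker rule
    (True, r"[.!?]", r""),          # greedy default: break after any sentence terminator
]


def is_break(text, pos, rules):
    """Return True if the first rule matching at `pos` is a break rule."""
    before, after = text[:pos], text[pos:]
    for breaks, before_re, after_re in rules:
        # beforebreak must match text ending at pos, afterbreak text starting at pos
        if re.search("(?:" + before_re + r")\Z", before) and re.match(after_re, after):
            return breaks
    return False


def mark_boundaries(text, rules=RULES, marker="|"):
    """Insert `marker` at every position where the rules produce a break."""
    pieces = []
    start = 0
    for pos in range(1, len(text)):  # skip the very start and end of the text
        if is_break(text, pos, rules):
            pieces.append(text[start:pos])
            start = pos
    pieces.append(text[start:])
    return marker.join(pieces)


if __name__ == "__main__":
    sample = "Het kost 15.4 euro. 15.4.3.Dit is een test. Einde."
    print(mark_boundaries(sample))
    # prints: Het kost 15.4 euro.| 15.4.3.Dit is een test.| Einde.
```

With only the greedy break rule, mark_boundaries("15.4", rules=[(True, r"[.!?]", r"")]) gives "15.|4", which is the behaviour I am seeing. Putting the two no-break rules in front of it keeps 15.4 and 15.4.3.Dit together. Something like this run over a whole corpus, but driven by the real SRX file, is what I am looking for.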