I would like to write some advanced typographical rules that check which whitespace character is used in certain contexts.
To make the implementation usable by different languages, each token should store the whitespace character that precedes it, alongside the existing “isWhitespaceBefore” flag. A shared rule filter could then check that character, and each language could write its own XML rules that take advantage of the filter.
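To make the idea more concrete, here is a minimal sketch in Java. It is only an illustration under stated assumptions: `SketchToken`, `SketchTokenizer`, and `WhitespaceBeforeFilter` are hypothetical names invented for this example, not existing classes, and a real tokenizer would split text differently from this whitespace-only one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token type for illustration only (not an existing class).
final class SketchToken {
    final String text;
    // The character immediately preceding the token if it is whitespace, else null.
    final Character whitespaceBefore;

    SketchToken(String text, Character whitespaceBefore) {
        this.text = text;
        this.whitespaceBefore = whitespaceBefore;
    }

    // The boolean flag can be derived from the stored character.
    boolean isWhitespaceBefore() {
        return whitespaceBefore != null;
    }
}

// Hypothetical tokenizer: splits on whitespace only and records the
// whitespace character found right before each token.
final class SketchTokenizer {
    static boolean isSpace(char c) {
        // Character.isWhitespace() excludes non-breaking spaces such as
        // U+00A0 and U+202F, so Character.isSpaceChar() is checked as well.
        return Character.isWhitespace(c) || Character.isSpaceChar(c);
    }

    static List<SketchToken> tokenize(String input) {
        List<SketchToken> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        Character pending = null;   // last whitespace character seen
        Character currentWs = null; // whitespace recorded for the token being built
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isSpace(c)) {
                if (current.length() > 0) {
                    tokens.add(new SketchToken(current.toString(), currentWs));
                    current.setLength(0);
                }
                pending = c;
            } else {
                if (current.length() == 0) {
                    currentWs = pending;
                    pending = null;
                }
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(new SketchToken(current.toString(), currentWs));
        }
        return tokens;
    }
}

// Hypothetical shared filter: a language-specific rule would pass in the
// whitespace characters it accepts in a given context.
final class WhitespaceBeforeFilter {
    static boolean accepts(SketchToken token, String allowedChars) {
        return token.isWhitespaceBefore()
            && allowedChars.indexOf(token.whitespaceBefore.charValue()) >= 0;
    }
}

final class WhitespaceBeforeDemo {
    public static void main(String[] args) {
        // The first "!" follows a narrow no-break space (U+202F), the second a plain space.
        for (SketchToken t : SketchTokenizer.tokenize("Bonjour\u202F! Salut !")) {
            System.out.println(t.text + " -> narrow no-break space before: "
                + WhitespaceBeforeFilter.accepts(t, "\u202F"));
        }
    }
}
```

In this sketch the accepted characters are passed as a plain string, which is how I imagine a language's XML rule could configure the shared filter, for example to require a narrow no-break space before certain punctuation marks.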
What do you think? Is this approach reasonable?