Hello!
The current format for bitext rules does not implement the regexp element. Is this a part of your roadmap or is there any deeper reason preventing this from happening?
There’s no specific reason I can think of other than a lack of time and the fact that both features (bitext and <regexp>) are quite specific, and their combination is even more specific.
Thanks for answering so swiftly.
I wanted to use bitext rules to check that typographical elements are consistent between aligned texts, hence the regexes.
Note that you can still use regexp on a token level, e.g. this will match a token that’s a digit:
<pattern>
<token regexp="yes">\d</token>
</pattern>
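To illustrate what such a token-level match does, here is a minimal Python sketch; it is not LanguageTool code. It assumes the token regexp has to match the whole token, so `\d` would match `7` but not `42`, for which something like `\d+` would be needed. The function name `matching_tokens` and the sample tokens are made up for illustration.

```python
import re

def matching_tokens(tokens, token_regexp):
    """Return the tokens that the regexp matches in full (assumed semantics)."""
    compiled = re.compile(token_regexp)
    return [t for t in tokens if compiled.fullmatch(t)]

# Hypothetical, already tokenized sentence.
sample = ["Page", "7", "of", "42"]

single_digit = matching_tokens(sample, r"\d")   # single-digit tokens only
any_number = matching_tokens(sample, r"\d+")    # tokens made of one or more digits
```

Under that assumption, the pattern constrains each token on its own, which is why it does not cover matches that span a varying number of tokens.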
Indeed, that solves some problems.
But for looser patterns, where the number of tokens is not necessarily constant, I haven’t found a workaround so far.