

Will there be a chunk_regexp feature (similar to the existing postag_regexp) for writing tokens in an XML rule pattern?

Well, if it’s needed we’ll need to introduce it… On the other hand, a
chunk has only three possible values so far, and using regular
expressions looks a bit like overkill. We’ll also introduce <or> with
the upcoming version, so you should be able to express what you need
with that. Could you try that?

It would be great to have the logical <or>. I can try it if it is available soon.
Also, I came across another issue with postag_regexp: if there are multiple tags (POS or chunk) associated with a token, then writing a regex that applies to the multiple strings separated by spaces is a challenge.

<or> is already available in the current snapshots. So if you have
the chunk attribute, there should also be <or>.

Example of using <or> with chunks … would it be something like this?

         <token chunk="E-NP-singular" or chunk="E-NP-plural"><exception postag="NNPS"/></token>

It should be like this:
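
A sketch of that form, assuming <or> is an element that wraps the alternative <token> elements inside <pattern> rather than a keyword inside the <token> tag. The chunk and POS values are taken from the question above; whether a given snapshot's schema accepts <or> at this position is an assumption, not something confirmed in this thread:

```xml
<pattern>
  <!-- Matches a single token position if either alternative matches. -->
  <or>
    <token chunk="E-NP-singular"><exception postag="NNPS"/></token>
    <token chunk="E-NP-plural"><exception postag="NNPS"/></token>
  </or>
</pattern>
```

The exception is repeated on each alternative because each <token> carries its own conditions.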


<or> did not work within …

I am getting the following error:

cvc-complex-type.2.4.a: Invalid content was found starting with
element 'or'. One of '{unify, and, token, includephrases}' is expected.

I guess <or> is supported only in the