
chunk_regexp


(Nina) #1

Will there be a chunk_regexp feature (similar to the existing postag_regexp) for writing tokens in an XML rule pattern?


(Daniel Naber) #2

Well, if it's needed we'll need to introduce it... On the other hand, a
chunk has only three possible values so far and using regular
expressions looks a bit like overkill. We'll also introduce <or> with
the upcoming version, so you should be able to express what you need
with that. Could you try that?


(Nina) #3

It would be great to have the logical <or>. I can try it if it is available soon.
Also, I came across another issue with postag_regexp: if there are multiple tags (POS or chunk) associated with a token, then writing a regex that applies to the multiple strings separated by spaces is a challenge.


(Daniel Naber) #4

<or> is already available in the current snapshots
(http://languagetool.org/download/snapshots/?C=M;O=D). So if you have
the chunk attribute, there should also be <or>.


(Nina) #5

An example of using <or> with chunks ... would it be something like this?

         <token chunk="E-NP-singular" or chunk="E-NP-plural"><exception postag="NNPS"/></token>

(Daniel Naber) #6

It should be like this:
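
A minimal sketch of what such a fragment might look like, assuming <or> wraps alternative <token> elements inside <pattern> the same way <and> does. Repeating the exception on each alternative is an assumption about the intent of the question, not a confirmed requirement:

```xml
<!-- Hypothetical fragment: matches a token with either chunk value,
     unless that token is tagged NNPS -->
<pattern>
    <or>
        <token chunk="E-NP-singular"><exception postag="NNPS"/></token>
        <token chunk="E-NP-plural"><exception postag="NNPS"/></token>
    </or>
</pattern>
```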





(Nina) #7

Thanks.


(Nina) #8

<or> did not work within ...

I'm getting the following error:

cvc-complex-type.2.4.a: Invalid content was found starting with
element 'or'. One of '{unify, and, token, includephrases}' is
expected.

I guess <or> is supported only in the
...
.