chunk_regexp

Nina · September 26, 2013, 6:00pm

Will there be a chunk_regexp attribute (similar to the existing postag_regexp) for writing tokens in an XML rule pattern?

dnaber · September 26, 2013, 9:24pm

Well, if it’s needed we’ll need to introduce it… On the other hand, a
chunk has only three possible values so far and using regular
expressions looks a bit like overkill. We’ll also introduce <or> with
the upcoming version, so you should be able to express what you need
with that. Could you try that?

Nina · September 27, 2013, 1:46pm

It would be great to have the logical <or>. I can try it if it is available soon.
Also, I came across another issue with postag_regexp: if there are multiple tags (POS or chunk) associated with a token, then writing a regex that applies to the multiple strings separated by spaces is a challenge.

dnaber · September 27, 2013, 2:30pm

<or> is already available in the current snapshots
(Index of /snapshots/). So if you have
the chunk attribute, there should also be <or>.

Nina · September 27, 2013, 8:25pm

Example of using <or> with chunks … would it be something like this?

         <token chunk="E-NP-singular" or chunk="E-NP-plural"><exception postag="NNPS"/></token>

dnaber · September 27, 2013, 8:40pm

It should be like this:
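
The markup that belonged to this reply is not shown here. As a rough sketch only, and not necessarily the original answer: assuming <or> is an element that wraps alternative <token> elements (rather than a keyword between attributes), the pattern from the question might be written as below. The chunk values and the exception are taken from the question; the surrounding <pattern> element is an assumption.

```xml
<pattern>
  <!-- Assumed structure: <or> groups alternative tokens; either one may match at this position -->
  <or>
    <token chunk="E-NP-singular"><exception postag="NNPS"/></token>
    <token chunk="E-NP-plural"><exception postag="NNPS"/></token>
  </or>
</pattern>
```

Each alternative carries its own exception here, since an exception placed on one token would not apply to the other.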

Nina · September 27, 2013, 8:44pm

Thanks.

Nina · October 24, 2013, 6:18pm

<or> did not work within …

I am getting the following error:

cvc-complex-type.2.4.a: Invalid content was found starting with
element 'or'. One of '{unify, and, token, includephrases}' is
expected.

I guess <or> is supported only in the … .