Incorrect chunk detection

(Praneet Khandelwal) #1

The boy in addition to the friend is delayed.
In the sentence above, 'friend' is identified as E-NP-singular, but in this one:
The boy in addition to his friend is delayed.
Here 'his' and 'friend' are identified as B-NP-plural and E-NP-plural respectively.

The disambiguator log is:
IN_NNUN -> in[in/IN,B-PP]
DT_VB_NN -> friend[friend/NN,E-NP-singular]
VBN_VBD -> delayed[delay/VBN,I-VP]
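
To compare the two sentences, the log lines above can be split into their fields. This is a minimal Python sketch, assuming every line has the shape `RULE_ID -> token[lemma/POS,CHUNK-TAG]` as in the three lines shown; tokens with several readings or chunk tags may be logged differently, and the names `LogEntry` and `parse_log_line` are hypothetical helpers, not part of LanguageTool.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    rule: str   # disambiguation rule id, e.g. DT_VB_NN
    token: str  # surface form
    lemma: str
    pos: str    # POS tag kept after disambiguation
    chunk: str  # chunk tag, e.g. E-NP-singular


# Assumed line shape: RULE_ID -> token[lemma/POS,CHUNK-TAG]
LOG_LINE = re.compile(
    r"^(\S+)\s*->\s*([^\[\s]+)\[([^/\]]+)/([^,\]]+),([^\]]+)\]$"
)


def parse_log_line(line: str) -> Optional[LogEntry]:
    """Return the parsed fields, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return LogEntry(*match.groups()) if match else None


log = """IN_NNUN -> in[in/IN,B-PP]
DT_VB_NN -> friend[friend/NN,E-NP-singular]
VBN_VBD -> delayed[delay/VBN,I-VP]"""

entries = [parse_log_line(line) for line in log.splitlines()]
for entry in entries:
    print(entry.rule, entry.token, entry.pos, entry.chunk)
```

Running the same parser over the log of the second sentence would make it easy to diff the chunk tags token by token.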

Can anyone guide me on how to correct this?

(Daniel Naber) #2

We rely on OpenNLP for chunking, and it internally uses statistics, so it will not be 100% correct. I don't know of a way to fix specific cases. Details about chunking can be found at

(Praneet Khandelwal) #3

I wanted to know which library is used for POS tagging. Is it also OpenNLP?

(Daniel Naber) #4

No, we use our own dictionary with some disambiguation.
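
As a rough illustration of the general idea (dictionary lookup followed by rule-based disambiguation), here is a toy Python sketch. It is not LanguageTool's actual code: the `LEXICON` contents, the function names, and the single rule are all assumptions made for the example, and the rule is only loosely inspired by what an id such as `DT_VB_NN` in the log might stand for.

```python
from typing import Dict, List

# Hypothetical mini-dictionary: each word maps to every POS tag it may carry.
LEXICON: Dict[str, List[str]] = {
    "the": ["DT"],
    "friend": ["NN", "VB"],
    "is": ["VBZ"],
    "delayed": ["VBN", "VBD"],
}


def lookup(tokens: List[str]) -> List[List[str]]:
    """Dictionary step: return all candidate tags for each token."""
    return [list(LEXICON.get(token.lower(), ["UNKNOWN"])) for token in tokens]


def disambiguate(readings: List[List[str]]) -> List[List[str]]:
    """Toy rule: directly after a determiner, drop VB if NN is also possible."""
    result = [list(r) for r in readings]
    for i in range(1, len(result)):
        if result[i - 1] == ["DT"] and "NN" in result[i] and "VB" in result[i]:
            result[i] = [tag for tag in result[i] if tag != "VB"]
    return result


tokens = ["the", "friend", "is", "delayed"]
tags = disambiguate(lookup(tokens))
for token, candidates in zip(tokens, tags):
    print(token, candidates)
```

In this sketch 'delayed' stays ambiguous because no rule covers it; a real rule set would need many more rules to resolve such cases.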