Trouble with EnglishChunker

Mility · August 20, 2015, 9:46am

Hi Daniel,
Today I added the new rule below:

<rule id="B_NP_SINGULAR_AND_B_NP_SINGULAR_VBZ" name="singular and singular vbz(vbp)">    
		<pattern>
			<token chunk="B-NP-singular" />
			<token chunk="E-NP-singular" />
			<token>and</token>
			<token chunk="B-NP-singular" />
			<token chunk="E-NP-singular" />
			<marker>
				<token postag='VBZ'></token>
			</marker>
		</pattern>
		<message>Compound subjects joined by "and" are always plural: <suggestion><match no="6" postag="VBP" /></suggestion></message>
		<short>Subject-Verb Agreement</short>
		<example correction="make">A pencil and an eraser <marker>makes</marker> writing easier.</example>
		<example>A pencil and an eraser make writing easier.</example>
	</rule>

When I added it to my grammar.xml, it didn’t work. I found that “an eraser” is chunked in ruleEditor2 as shown below (inside the red box):

[screenshot]

But in my standalone version, where the OpenNLP tool has been updated to Apache OpenNLP 1.6.0, the chunking changed to this (inside the red box below):

[screenshot]

So I am having trouble with EnglishChunker, and I would like to know why the two results differ.

Thanks
Regards
Mility

dnaber · August 20, 2015, 3:48pm

I cannot reproduce that with the latest version in git, i.e. I always get singular. Please try the very latest version in git.

Mility · August 21, 2015, 12:54am

Thanks. With this commit applied (fix singular chunks that got detected as plural chunks · languagetool-org/languagetool@247ef10 · GitHub), the chunking is now correct, and this rule can be added to grammar.xml.