Hi all, I’m new to the forum, but a huge fan of LanguageTool. I’m hoping someone can help me figure out why a rule I have written produces gratuitous, low-quality suggestions.
The best way to communicate the issue is to provide three versions of the rule: version 1 contains a single suggestion element, version 2 contains a different single suggestion element, and version 3 contains both suggestion elements (from version 1 and version 2). The corrections generated by version 3, however, do not match what versions 1 and 2 produce, either in number or in content.
All three versions of the rule pass autotest; that is, the correction set listed in each version’s example is the correction set the rule actually produces. As you will see below, the third version’s correction set is highly undesirable: the most significant issue is that the word “the” appears in parentheses, but the rule also produces 4 corrections, whereas versions 1 and 2 combined produce only 3.
Can anyone provide insight into what causes this behavior and how I might avoid it? Either way, thanks for making such an awesome tool available for me to use!
Version 1:
<rule id="RULE_210936907448054287052768140797021576881" name="prior_to_1">
  <pattern>
    <token>prior</token>
    <token>to</token>
    <token postag="JJ.*|DT|N.*" postag_regexp="yes"/>
    <token postag="NNS|NNPS" postag_regexp="yes"/>
    <token postag="VBG"/>
    <token regexp="yes">and|or</token>
    <token postag="VBG"/>
  </pattern>
  <message>This rule works as expected.</message>
  <suggestion>before \3 \4 <match no="5" postag="V.*" postag_regexp="yes" postag_replace="VBD"/> \6 <match no="7" postag="V.*" postag_regexp="yes" postag_replace="VBD"/></suggestion>
  <example correction="Before the reporters leaved and headed|Before the reporters left and headed"><marker>Prior to the reporters leaving and heading</marker> to the briefing room,</example>
</rule>
Version 2:
<rule id="RULE_210936907448054287052768140797021576882" name="prior_to_2">
  <pattern>
    <token>prior</token>
    <token>to</token>
    <token postag="JJ.*|DT|N.*" postag_regexp="yes"/>
    <token postag="NNS|NNPS" postag_regexp="yes"/>
    <token postag="VBG"/>
    <token regexp="yes">and|or</token>
    <token postag="VBG"/>
  </pattern>
  <message>This rule works as expected.</message>
  <suggestion>before \3 \4 <match no="5" postag="V.*" postag_regexp="yes" postag_replace="VBP"/> \6 <match no="7" postag="V.*" postag_regexp="yes" postag_replace="VBD"/></suggestion>
  <example correction="Before the reporters leave and headed"><marker>Prior to the reporters leaving and heading</marker> to the briefing room,</example>
</rule>
Version 3:
<rule id="RULE_210936907448054287052768140797021576883" name="prior_to_3">
  <pattern>
    <token>prior</token>
    <token>to</token>
    <token postag="JJ.*|DT|N.*" postag_regexp="yes"/>
    <token postag="NNS|NNPS" postag_regexp="yes"/>
    <token postag="VBG"/>
    <token regexp="yes">and|or</token>
    <token postag="VBG"/>
  </pattern>
  <message>This rule DOES NOT work as expected.</message>
  <suggestion>before \3 \4 <match no="5" postag="V.*" postag_regexp="yes" postag_replace="VBD"/> \6 <match no="7" postag="V.*" postag_regexp="yes" postag_replace="VBD"/></suggestion>
  <suggestion>before \3 \4 <match no="5" postag="V.*" postag_regexp="yes" postag_replace="VBP"/> \6 <match no="7" postag="V.*" postag_regexp="yes" postag_replace="VBD"/></suggestion>
  <example correction="Before the reporters leaved and headed|Before the reporters left and heading|Before (the) reporters leaved and heading|Before (the) reporters left and heading"><marker>Prior to the reporters leaving and heading</marker> to the briefing room,</example>
</rule>
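To make the mismatch concrete, here is a small Python sketch that compares the three correction sets. It is not part of LanguageTool; the strings are simply copied from the `correction` attributes of the three examples above, and the variable names are my own.

```python
# Correction strings copied verbatim from the <example> elements of each version.
v1 = {
    "Before the reporters leaved and headed",
    "Before the reporters left and headed",
}
v2 = {
    "Before the reporters leave and headed",
}
v3 = {
    "Before the reporters leaved and headed",
    "Before the reporters left and heading",
    "Before (the) reporters leaved and heading",
    "Before (the) reporters left and heading",
}

# What I expected version 3 to produce: everything from versions 1 and 2.
expected = v1 | v2

# Corrections version 3 produces that neither version 1 nor version 2 does.
unexpected = v3 - expected

# Corrections from versions 1 and 2 that version 3 no longer produces.
missing = expected - v3

print(f"expected {len(expected)} corrections, version 3 produced {len(v3)}")
for label, items in (("unexpected", unexpected), ("missing", missing)):
    for text in sorted(items):
        print(f"{label}: {text}")
```

Only one correction ("Before the reporters leaved and headed") is shared between version 3 and the combined output of versions 1 and 2. In the other three, the second verb stays as "heading" instead of becoming "headed", and the "leave" variant from version 2 is missing entirely.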