I am running testrules.bat against a set of punctuation rules and cannot see why one of them fails.
Here is the rule:
<rule id="Conjunctions-12--SENT_START--Because" name="Conjunctions R12 PoS analysis: Start sentence with Because: need two clauses">
<!-- Okay, does it start with Because?-->
<pattern>
<token postag='SENT_START'></token>
<token >Because</token>
<marker>
<token postag='SENT_END' skip="-1"></token>
</marker>
<!-- Go to the end and skip back to see if it has two clauses? -->
<and>
<token postag=',|:' postag_regexp="yes"></token>
<token postag='CC|IN' postag_regexp="yes">
<exception>Because</exception>
</token>
</and>
</pattern>
<message>
Conjunctions R12: The sentence is a fragment, it starts with Because. Has only one punctuated clause, and no coordinating or subordinating conjunction. Basically: "Because something. What?"
</message>
<url>http://grammar.ccc.commnet.edu/grammar/conjunctions.htm</url>
<short>
CC R12: Sentence is a fragment?
</short>
<example type='incorrect'>
Because the dog likes to bark<marker>.</marker>
</example>
<example type='correct'>
Because the dog likes to bark, it annoys the neighbours.
</example>
</rule>
And here is the output from testrules.bat:
Running XML validation for en/grammar.xml…
Running pattern rule tests for English… Exception in thread "main" junit.framework.AssertionFailedError: English rule Conjunctions-12--SENT_START--Because:
"Because the dog likes to bark."
Errors expected: 1
Errors found : 0
Message:
Conjunctions R12: The sentence is a fragment, it starts with Because. Has only has one punctuated clause, and no coordinating or subordinating conjunction. Basically: “Because something, what?”
Analyzed token readings: [/SENT_START*] Because[because/CC*,B-SBAR] [ /null*] the[the/DT,B-NP-plural] [ /null*] dog[dog/NN,E-NP-plural] [ /null*] likes[like/NNS,like/VBZ,B-VP] [ /null*] to[to/IN,to/TO,I-VP] [ /null*] bark[bark/NN:UN,bark/VB,bark/VBP,I-VP] .[./.,./SENT_END,O]
Matches: []
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.TestCase.fail(TestCase.java:227)
at org.languagetool.rules.patterns.PatternRuleTest.testBadSentences(PatternRuleTest.java:290)
at org.languagetool.rules.patterns.PatternRuleTest.testGrammarRulesFromXML(PatternRuleTest.java:237)
at org.languagetool.rules.patterns.PatternRuleTest.runTestForLanguage(PatternRuleTest.java:173)
at org.languagetool.rules.patterns.PatternRuleTest.runGrammarRulesFromXmlTestIgnoringLanguages(PatternRuleTest.java:142)
at org.languagetool.rules.patterns.PatternRuleTest.main(PatternRuleTest.java:500)
Running disambiguator rule tests…
Running disambiguation tests for English…
407 rules tested.
Tests successful.
Running XML bitext pattern tests…
Tests successful.
Validating false-friends.xml…
Validation successfully finished.
I was going to mark the entire sentence, but given how particular the pattern matching has proven to be, I am now concentrating on marking just the bad termination, so far without any success.
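For what it is worth, the intent described above (match from Because to the end of the sentence, but only when no comma or colon intervenes) could perhaps be sketched as below. This is an untested assumption, not a verified fix: it moves skip="-1" onto the Because token and uses an exception with scope="next" to reject sentences that contain a comma or colon among the skipped tokens. The CC|IN check is left out, because the analysis above tags "to" as IN, which would suppress the match on the test sentence.

```xml
<pattern>
  <!-- Sentence must start with Because -->
  <token postag="SENT_START"/>
  <!-- skip="-1" lets any number of tokens sit between this token and the next one;
       the scope="next" exception rejects the match if a skipped token is , or : -->
  <token skip="-1">Because<exception scope="next" regexp="yes">[,:]</exception></token>
  <!-- Mark only the sentence-final token -->
  <marker>
    <token postag="SENT_END"/>
  </marker>
</pattern>
```

With this shape, "Because the dog likes to bark." should be flagged at the full stop, while "Because the dog likes to bark, it annoys the neighbours." should be left alone because of the comma.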
Irvine