Sorry it took so long; I have a very slow internet connection and downloads take forever.
I have run a moderately extensive set of tests (results below), and the ‘fix’ corrects the basic problem, though there are still some issues. I have reproduced the console output directly, with a note inserted before each run to explain what was tested:
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
C:\Documents and Settings\anonymous>CD C:\LTD
First test: Did it break what already worked?
C:\LTD>test-rules.bat EN
Running XML pattern tests…
Known languages: [English, English (US), English (GB), English (Australian), English (Canadian), English (New Zealand), English (South African), Persian, French
, German, German (Germany), German (Austria), German (Swiss), Simple German, Polish, Catalan, Catalan, Catalan (Valencian), Italian, Breton, Dutch, Portuguese,
Portuguese (Portugal), Portuguese (Brazil), Russian, Asturian, Belarusian, Chinese, Danish, Esperanto, Galician, Greek, Icelandic, Japanese, Khmer, Lithuanian,
Malayalam, Romanian, Slovak, Slovenian, Spanish, Swedish, Tamil, Tagalog, Ukrainian, Testlanguage]
Running XML validation for en/grammar.xml…
Running pattern rule tests for English… 1 rules tested.
Tests finished!
Running disambiguator rule tests…
Running disambiguation tests for English…
407 rules tested.
Tests successful.
Running XML bitext pattern tests…
Tests successful.
Validating false-friends.xml…
Validation successfully finished.
Second test: Did it ‘fix’ the problem with ‘skip’ when it is used with adjectival numbers such as million, billion, etc.?
C:\LTD>test-rules.bat EN
Running XML pattern tests…
Known languages: [English, English (US), English (GB), English (Australian), English (Canadian), English (New Zealand), English (South African), Persian, French
, German, German (Germany), German (Austria), German (Swiss), Simple German, Polish, Catalan, Catalan, Catalan (Valencian), Italian, Breton, Dutch, Portuguese,
Portuguese (Portugal), Portuguese (Brazil), Russian, Asturian, Belarusian, Chinese, Danish, Esperanto, Galician, Greek, Icelandic, Japanese, Khmer, Lithuanian,
Malayalam, Romanian, Slovak, Slovenian, Spanish, Swedish, Tamil, Tagalog, Ukrainian, Testlanguage]
Running XML validation for en/grammar.xml…
Running pattern rule tests for English… Exception in thread "main" junit.framework.AssertionFailedError: English: Did not expect error in:
Yet another problem list: 5 million, 6 million, and 7 million, bottles of beer.
Matching Rule: R1.3–Comma-CC[1]:[/CD, /, and|or|nor, /CD]:R1.3: Incorrect comma before coordinating conjunction
at junit.framework.Assert.fail(Assert.java:57)
at junit.framework.Assert.assertTrue(Assert.java:22)
at junit.framework.Assert.assertFalse(Assert.java:39)
at junit.framework.TestCase.assertFalse(TestCase.java:210)
at org.languagetool.rules.patterns.PatternRuleTest.testCorrectSentences(PatternRuleTest.java:415)
at org.languagetool.rules.patterns.PatternRuleTest.testGrammarRulesFromXML(PatternRuleTest.java:236)
at org.languagetool.rules.patterns.PatternRuleTest.runTestForLanguage(PatternRuleTest.java:173)
at org.languagetool.rules.patterns.PatternRuleTest.runGrammarRulesFromXmlTestIgnoringLanguages(PatternRuleTest.java:142)
at org.languagetool.rules.patterns.PatternRuleTest.main(PatternRuleTest.java:501)
Running disambiguator rule tests…
Running disambiguation tests for English…
407 rules tested.
Tests successful.
Running XML bitext pattern tests…
Tests successful.
Validating false-friends.xml…
Validation successfully finished.
Third test: Remove markers from the basic antipattern
C:\LTD>test-rules.bat EN
Running XML pattern tests…
Known languages: [English, English (US), English (GB), English (Australian), English (Canadian), English (New Zealand), English (South African), Persian, French
, German, German (Germany), German (Austria), German (Swiss), Simple German, Polish, Catalan, Catalan, Catalan (Valencian), Italian, Breton, Dutch, Portuguese,
Portuguese (Portugal), Portuguese (Brazil), Russian, Asturian, Belarusian, Chinese, Danish, Esperanto, Galician, Greek, Icelandic, Japanese, Khmer, Lithuanian,
Malayalam, Romanian, Slovak, Slovenian, Spanish, Swedish, Tamil, Tagalog, Ukrainian, Testlanguage]
Running XML validation for en/grammar.xml…
Running pattern rule tests for English… 1 rules tested.
Tests finished!
Running disambiguator rule tests…
Running disambiguation tests for English…
407 rules tested.
Tests successful.
Running XML bitext pattern tests…
Tests successful.
Validating false-friends.xml…
Validation successfully finished.
Fourth test: Remove markers from the special-case antipattern for millions, billions…
C:\LTD>test-rules.bat EN
Running XML pattern tests…
Known languages: [English, English (US), English (GB), English (Australian), English (Canadian), English (New Zealand), English (South African), Persian, French
, German, German (Germany), German (Austria), German (Swiss), Simple German, Polish, Catalan, Catalan, Catalan (Valencian), Italian, Breton, Dutch, Portuguese,
Portuguese (Portugal), Portuguese (Brazil), Russian, Asturian, Belarusian, Chinese, Danish, Esperanto, Galician, Greek, Icelandic, Japanese, Khmer, Lithuanian,
Malayalam, Romanian, Slovak, Slovenian, Spanish, Swedish, Tamil, Tagalog, Ukrainian, Testlanguage]
Running XML validation for en/grammar.xml…
Running pattern rule tests for English… 1 rules tested.
Tests finished!
Running disambiguator rule tests…
Running disambiguation tests for English…
407 rules tested.
Tests successful.
Running XML bitext pattern tests…
Tests successful.
Validating false-friends.xml…
Validation successfully finished.
Fifth test: Try adding a second skip. For this I used:
and|or|nor
Since the following is no longer a list, I expected it to fail if the two skips are working, which it did:
This is the simplest list 1, 2, or 3 apples.
And, if both skips are working as expected, only the following is a list:
I like this 1st list: 1, 2, 3, 4, and 5, it does not cause me grief.
This test was also successful.
The special-case antipattern for million|billion|… referred to above is:
hundred|thousand|million|milliard|billion|trillion|quadrillion|quintillion|sextillion|septillion|octillion|nonillion|decillion|undecillion|duodecillion|tredecillion|quattuordecillion|quindecillion|sexdecillion|sedecillion|septendecillion|octodecillion|novemdecillion|novendecillion|vigintillion|centillion|googol|googolplex
hundred|thousand|million|milliard|billion|trillion|quadrillion|quintillion|sextillion|septillion|octillion|nonillion|decillion|undecillion|duodecillion|tredecillion|quattuordecillion|quindecillion|sexdecillion|sedecillion|septendecillion|octodecillion|novemdecillion|novendecillion|vigintillion|centillion|googol|googolplex
and|or|nor
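For anyone who wants to try the alternation outside of LanguageTool, here is a rough Python sketch of the idea behind the special case. It is only an illustration under my own assumptions: the real antipattern is LanguageTool XML, not a Python regex, the token layout (number, magnitude word, comma, conjunction, number) is my guess at what the special case covers, the magnitude list is shortened, and the names `MAGNITUDE`, `SPECIAL_CASE` and `has_special_case` are made up for this example.

```python
import re

# Shortened, illustrative subset of the magnitude-word alternation quoted above.
MAGNITUDE = r"(?:hundred|thousand|million|milliard|billion|trillion)"

# Hypothetical stand-in for the special case (not LanguageTool's own matcher):
# a number, a magnitude word, a comma, a coordinating conjunction, another number.
SPECIAL_CASE = re.compile(
    rf"\b\d+\s+{MAGNITUDE}\s*,\s*(?:and|or|nor)\s+\d+\b",
    re.IGNORECASE,
)


def has_special_case(sentence: str) -> bool:
    """Return True if the sentence contains the 'N million, and M' shape."""
    return SPECIAL_CASE.search(sentence) is not None


if __name__ == "__main__":
    samples = [
        "Yet another problem list: 5 million, 6 million, and 7 million, bottles of beer.",
        "This is the simplest list 1, 2, or 3 apples.",
    ]
    for sample in samples:
        print(has_special_case(sample), "-", sample)
```

The first sample is the sentence from the failing second test, and the second is the plain list from the fifth test, which has no magnitude word and so should not be caught by the special case.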
I hope this helps. If you need anything more, just ask.
Irvine