
[en] Rule designing problem. Help needed!


#1

I'm trying to create a rule that detects sentences in which 'add/added/adding to' is wrongly written as 'add/added/adding into/at/with', but I cannot match the chunks and tagsets properly. How can I do that? Please help!

Some examples of incorrect sentences are as follows:

His comments added more into the confusion circle.
Add more salt in the juice.
Add sugar at the juice.
... added up with ...

Correct sentences will be:

His comments added more to the confusion circle.
Add more salt to the juice.
Add sugar to the juice.
... added up to ...

Here is the XML code:

  <!-- English rule, 2016-10-21 -->
  <rule id="CONFUSION_ADD_WITHADD_TO" name="confusion add with/add to">
  <pattern>
    <token inflected='yes' regexp='yes'>add|added|adding</token>
    <token postag='NN:U|JJ|JJR' postag_regexp='yes'></token>
    <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
    <token postag='DT'></token>
    <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
  </pattern>
  <message>Did you mean: add to</message>
  <short>Wrong preposition: add to</short>
  <example correction='added more to the confusion'>His comments <marker>added more into the confusion</marker> circle.</example>
  <example correction='Add more to'><marker>Add more salt in</marker> the juice.</example>
  <example correction='Add sugar to'><marker>Add sugar at</marker> the juice.</example>
  <example correction='added up to'> ... <marker>added up with</marker> ... </example>
  <example>His comments added more to the confusion circle.</example>
  <example>Add more salt to the juice.</example>
  <example>Add sugar to the juice.</example>
  <example> ... added up to ... </example>
  </rule>

I'm getting serious error messages. How do I generalise 'add to' in this code?


(Daniel Naber) #2

I haven't checked whether this is the only problem, but your <message> doesn't contain a <suggestion>. It's optional, but if you don't have a <suggestion>, you cannot specify a value in the correction attribute; it then needs to be correction=''.

I suggest working sentence by sentence, making one work after the other. If there are error messages, please post them here.
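
As a minimal sketch of the point about <suggestion> and correction (the rule id, name, and example sentences here are hypothetical and not part of the rule discussed in this thread): the correction attribute repeats exactly the text that <suggestion> produces for the marked span.

```xml
<!-- Illustrative sketch only; the rule id and name are hypothetical. -->
<rule id="ADD_INTO_SKETCH" name="add into (add to) sketch">
  <pattern>
    <token inflected="yes">add</token>
    <token>into</token>
  </pattern>
  <!-- \1 refers to the first matched token, here the matched form of "add" -->
  <message>Did you mean <suggestion>\1 to</suggestion>?</message>
  <!-- correction equals the suggestion text for this sentence: "added to" -->
  <example correction="added to">She <marker>added into</marker> the confusion.</example>
  <example>She added to the confusion.</example>
</rule>
```

Per the post above, a rule whose <message> has no <suggestion> would use correction='' in its incorrect example instead.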


#3

Sorry, I tried, but it still isn't working flawlessly.
As you suggested working sentence by sentence, I split it into three rules.
Now it is truncating the sentence. Please test it and let me know. I'm really looking for some help!
Here is the modified code:

<!-- English rule, 2016-10-21 correction of 'add with/at/in' vs 'add to'-->
<rule id="CONFUSION_ADD_WITH_ADD_TO_0001" name="confusion add with/add to 0001">
<pattern>
  <token inflected='yes' regexp='yes'>add|added|adding</token>
  <token postag='NN:U|JJ|JJR' postag_regexp='yes'></token>
  <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
  <token postag='DT'></token>
  <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
</pattern>
<message>Did you mean: <suggestion>\1 \2 to</suggestion>?</message>
<short>Wrong preposition: add to</short>
<example correction='added more to the confusion'>His comments <marker>added more into the confusion</marker> circle.</example>
<example correction='Add sugar to'><marker>Add sugar at</marker> the juice.</example>
</rule>
<rule id="CONFUSION_ADD_WITH_ADD_TO_0002" name="confusion add with/add to 0002">
<pattern>
  <token inflected='yes' regexp='yes'>add|added|adding</token>
  <token postag='JJ|JJR|JJS' postag_regexp='yes'></token>
  <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
  <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
  <token postag='DT'></token>
  <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
</pattern>
<message>Did you mean: <suggestion>\2 \3 to</suggestion>?</message>
<short>Wrong preposition: add to</short>
<example correction='Add more salt to'><marker>Add more salt in</marker> the juice.</example>
</rule>
<rule id="CONFUSION_ADD_WITH_ADD_TO_0003" name="confusion add with/add to 0003">
<pattern>
  <token inflected='yes' regexp='yes'>add|added|adding</token>
  <token>up</token>
  <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
</pattern>
<short>Wrong preposition: add to</short>
<example correction=''>... <marker>added up with</marker> ...</example>
<example> ... added up to ... </example>
<message>Did you mean: <suggestion>add \2 to</suggestion>?</message>
</rule>

(Daniel Naber) #4

Please try http://community.languagetool.org/ruleEditor/expert; it often gives quite good error messages. In your case, the value of correction was wrong: it needs to match exactly what <suggestion>...</suggestion> provides. Also, the <marker>...</marker> in the example needs to cover all the tokens of the <pattern>. The corrected first rule looks like this:

<rule id="CONFUSION_ADD_WITH_ADD_TO_0001" name="confusion add with/add to 0001">
<pattern>
  <token inflected='yes' regexp='yes'>add|added|adding</token>
  <token postag='NN:U|JJ|JJR' postag_regexp='yes'></token>
  <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
  <token postag='DT'></token>
  <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
</pattern>
<message>Did you mean: <suggestion>\1 \2 to</suggestion>?</message>
<short>Wrong preposition: add to</short>
<example correction='added more to'>His comments <marker>added more into the confusion</marker> circle.</example>
<example correction='Add sugar to'><marker>Add sugar at the juice</marker>.</example>
</rule>

#5

Thanks! I will check this code as you suggested and report back. It will take some time; I will make the necessary corrections tomorrow.


(Knorr) #6

When you add inflected="yes" to a token, you only need to add the base form, i.e., add (adds, adding, added will automatically match, too).
So, in your rule <token inflected='yes'>add</token> already does the job.

Moreover, I think <marker>...</marker> should enclose the first three tokens so that only these are replaced.
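
To illustrate the inflected attribute, here is a small sketch. These are token fragments only, not a complete rule, and the second token is an assumption about how inflected and regexp combine, included just for comparison with the original `add|added|adding` token.

```xml
<!-- Base form only: also matches adds, added, adding. -->
<token inflected="yes">add</token>

<!-- Combined with regexp="yes", each alternative is treated as a base form,
     so this would cover the inflected forms of both add and append. -->
<token inflected="yes" regexp="yes">add|append</token>
```

Listing the inflected forms explicitly, as in `add|added|adding`, is therefore redundant once inflected="yes" is set.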


#7

Thanks @Knorr and @dnaber!
@Knorr, did you say <suggestion>...</suggestion> should enclose the required tokens so that only these are replaced?
I followed your advice and it worked amazingly well. Thanks!
Now it's working as expected.

The code at the moment is:

    <!-- English rule, 2016-10-21 correction of 'add with/at/in' vs 'add to'-->
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0001" name="confusion add with/add to 0001">
    <pattern>
      <token inflected='yes'>add</token>
      <token postag='NN:U|JJ|JJR' postag_regexp='yes'></token>
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      <token postag='DT'></token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 to \4 \5</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='added more to the confusion'>His comments <marker>added more into</marker> the confusion circle.</example>
    <example>His comments added more to the confusion circle.</example>
    <example correction='Add sugar to'><marker>Add sugar at</marker> the juice.</example>
    <example>Add sugar to the juice.</example>
    </rule>
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0002" name="confusion add with/add to 0002">
    <pattern>
      <token inflected='yes'>add</token>
      <token postag='JJ|JJR|JJS' postag_regexp='yes'></token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      <token postag='DT'></token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 \3 to \5 \6</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='Add more salt to'><marker>Add more salt in</marker> the juice.</example>
    <example>Add more salt to the juice.</example>
    </rule>
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0003" name="confusion add with/add to 0003">
    <pattern>
      <token inflected='yes'>add</token>
      <token>up</token>
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 to</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='added up to'>... <marker>added up with</marker> ...</example>
    <example> ... added up to ... </example>
    </rule>

I have a request, though: please re-check the code, fix any flaws that remain, and then publish it.


#8

I'm getting error messages like this:

There are problems with your rule:
Unexpected end position of <marker>...</marker> in incorrect example sentence: 42 but expected 50

Please help me locate the error.


(Knorr) #9

Hi! Your rules look pretty good!!
Regarding the <marker>: Let's say you want the suggestion to replace only a certain part of the pattern you specified. In this case, you can use the following XML:

        <rule>
        <pattern>
          <marker>
              <token>suns</token>
          </marker>
          <token>or</token>
          <token>daughters</token>
        </pattern>
        <message>Did you mean: <suggestion>sons</suggestion>?</message>
        <example correction='sons'>Does she have <marker>suns</marker> or daughters?</example>
        </rule>

This rule will match "suns or daughters", but the suggestion / the highlighted error / the corrections apply to "suns" only.

So, your first rule could look like this:

    <rule id="CONFUSION_ADD_WITH_ADD_TO_0001" name="confusion add with/add to 0001">
    <pattern>
      <marker>
          <token inflected='yes'>add</token>
          <token postag='NN:U|JJ|JJR' postag_regexp='yes'></token>
          <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      </marker>
      <token postag='DT'></token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 to</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='added more to'>His comments <marker>added more into</marker> the confusion circle.</example>
    <example>His comments added more to the confusion circle.</example>
    <example correction='Add sugar to'><marker>Add sugar at</marker> the juice.</example>
    <example>Add sugar to the juice.</example>
    </rule>

#10

Excellent! Now I get it (partially, at least).
Using <marker>...</marker> inside <pattern>...</pattern> did all the magic.
I will try to adapt the code so that it covers a wider range of situations and report back as soon as possible.
Thanks!


#11

Okay, I have tested the code, and it works as desired on the following sentences.

===========================================================
His comments added more into the confusion circle.
Add more salt in the juice.
Add sugar at the juice.
... added up with ...
Add more acid into the beaker.
Don't add oil into petrol.
Adding more salt into the lemon juice will not enhance the taste.
Try adding more sugar in the coffee and make it tasteless.
Adding more humour onto the story doesn't necessarily make it more beautiful.
Adding more sugar within the tea would probably spoil the taste.
Adding more people like John into the party will tremendously complicate the situation.
===========================================================

I still have no clue whether it is foolproof. If you think it can be improved further, please make suggestions; if you think it's fine as it is, please publish it.
Here is the code:

    <!-- English rule, 2016-10-21 correction of 'add with/at/in' vs 'add to'-->
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0001" name="confusion add with/add to 0001">
    <pattern>
      <marker>
          <token inflected='yes'>add</token>
          <token postag='NN:U|JJ|JJR' postag_regexp='yes'></token>
          <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      </marker>
      <token postag='DT'></token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
    </pattern>
      <message>Did you mean: <suggestion>\1 \2 to</suggestion>?</message>
      <short>Wrong preposition: add to</short>
      <example correction='added more to'>His comments <marker>added more into</marker> the confusion circle.</example>
      <example>His comments added more to the confusion circle.</example>
      <example correction='Add sugar to'><marker>Add sugar at</marker> the juice.</example>
      <example>Add sugar to the juice.</example>
    </rule>
    
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0002" name="confusion add with/add to 0002">
    <pattern>
      <marker>
      <token inflected='yes'>add</token>
      <token postag='JJ|JJR|JJS' postag_regexp='yes'></token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      </marker>
      <token postag='DT'></token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 \3 to</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='Add more salt to'><marker>Add more salt in</marker> the juice.</example>
    <example>Add more salt to the juice.</example>
    </rule>
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0003" name="confusion add with/add to 0003">
    <pattern>
      <marker>
      <token inflected='yes'>add</token>
      <token>up</token>
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      </marker>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 to</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='added up to'>... <marker>added up with</marker> ...</example>
    <example> ... added up to ... </example>
    </rule>
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0004" name="confusion add with/add to 0004">
    <pattern>
      <marker>
      <token inflected='yes'>add</token>
      <token postag='NN:UN|NN:U|NNU' postag_regexp='yes'></token>
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
      </marker>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 to \4</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='add lubricant to petrol'><marker>add lubricant into petrol</marker>.</example>
    <example>add lubricant to petrol.</example>
    </rule>
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0005" name="confusion add with/add to 0005">
    <pattern>
      <marker>
      <token inflected='yes'>add</token>
      <token postag='JJ|JJR|JJS' postag_regexp='yes'></token>
      <token postag='NN:UN|NN:U|NNU' postag_regexp='yes'></token>
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
      </marker>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 \3 to \5</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='add more lubricant to petrol'><marker>add more lubricant into petrol</marker>.</example>
    <example>add more lubricant to petrol.</example>
    </rule>
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0006" name="confusion add with/add to 0006">
    <pattern>
      <marker>
      <token inflected='yes'>add</token>
      <token postag='DT|JJ' postag_regexp='yes'></token>
      <token postag='JJ|JJR|JJS' postag_regexp='yes'></token>
      <token postag='NN:UN|NN:U|NNU' postag_regexp='yes'></token>
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token>
      </marker>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 \3 \4 to \6</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='add much more lubricant to petrol'><marker>add much more lubricant into petrol</marker>.</example>
    <example>add much more lubricant to petrol.</example>
    </rule>
    <rule id="CONFUSION_ADD_WITH_ADD_TO_0007" name="confusion add with/add to 0007">
    <pattern>
      <marker>
      <token inflected='yes'>add</token>
      <token postag='RP|JJR|JJR' postag_regexp='yes'></token> <!-- MORE -->
      <token postag='NN:UN|NNS' postag_regexp='yes'></token> <!-- PEOPLE -->
      <token postag='IN|JJ|NN' postag_regexp='yes'></token> <!-- LIKE -->
      <token postag='NNP|IN|NN|NN:UN|NN:U|NNU|NNS' postag_regexp='yes'></token> <!-- JOHN -->
      <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
      <token postag='DT' postag_regexp='yes'></token> <!-- THE -->
      <token postag='NN:UN|NN:U|NN|NNU|NNS' postag_regexp='yes'></token> <!-- PARTY -->
      </marker>
    </pattern>
    <message>Did you mean: <suggestion>\1 \2 \3 \4 \5 to \7 \8</suggestion>?</message>
    <short>Wrong preposition: add to</short>
    <example correction='Adding more people like John to the party'><marker>Adding more people like John into the party</marker> will tremendously complicate the situation.</example>
    <example>Adding more people like John to the party will tremendously complicate the situation.</example>
    </rule>

#12

Instead of writing individual rules for each incorrect sentence, I came up with a different idea.

====================================================================

<rule id="CONFUSION_ADD_WITH_ADD_TO_0008" name="confusion add with/add to 0008">
    <pattern>
            <token inflected='yes' skip='-1'>add<exception scope="next" regexp='yes'>,|to</exception></token>
            <token regexp='yes' inflected='yes' skip='-1'>like|such|except|to</token>
            <!-- <token skip="-1">append</token> -->
        <marker>
            <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
        </marker>
    </pattern>
    <message>Did you mean: <suggestion>to</suggestion>?</message>
    <example correction='Adding more people like John to the party'><marker>Adding more people like John into the party</marker> will tremendously complicate the situation.</example>
    <example>Adding more people like John to the party will tremendously complicate the situation.</example>
</rule>

=======================================================================

<rule id="APPEND_AT_WITH_INTO_ONTO_VS_TO_0001" name="append at/with/into/onto vs to 0001">
    <pattern>
            <token inflected='yes' skip='-1'>append<exception scope="next" regexp='yes'>,|to|like|such|except|to|on|of</exception></token>
            <!-- <token skip="-1">append</token> -->
        <marker>
            <token regexp='yes'>about|below|excepting|off|toward|above|beneath|for|on|under|across|beside|besides|from|onto|underneath|after|between|in|out|until|against|beyond|outside|up|along|but|inside|over|upon|among|by|past|around|concerning|regarding|with|at|despite|into|since|within|down|like|through|without|before|during|near|throughout|behind|except|of</token>
        </marker>
    </pattern>
    <message>Did you mean: <suggestion>to</suggestion>?</message>
    <example correction='appended to'>Appendices are generally <marker>appended at</marker> the last pages of a book.</example>
    <example>Appendices are generally appended to the last pages of a book.</example>
</rule>

==============================================================

Now it detects all the errors, but I'm getting the same marker error from the expert mode editor.
Some of the incorrect sentences are as follows:

=========================================
His comments added more into the confusion circle.
Add more salt in the juice.
Add sugar at the juice.
... added up with ...
Add more acid into the beaker.
Don't add oil into petrol.
Avoid adding oil with diesel.
adding more salt into the lemon juice will not enhance the taste.
Try adding more sugar in the coffee and make it tasteless.
Adding more humour onto the story doesn't necessarily make it more beautiful.
Adding more people like John into the party will tremendously complicate the situation.
Adding people like John into the party will tremendously complicate the situation.
Why are you adding more men like this stupid John into the party?
Why are you adding more stupid men like this stupid John into the party?
Why are you adding more stupid men like this John into the party?
Why are you adding more stupid men like John into the party?
Appendices are generally appended at the last pages of a book.
==============================

All these incorrect sentences can be detected and corrected now, but I'm getting the marker errors!

Please help! I'm a little bit confused now.


(Knorr) #13

I like the idea of unifying the rules.

Your example sentence should look like this:
<example correction='to'>Adding more people like John <marker>into</marker> the party will tremendously complicate the situation.</example>
The correction must be identical to the text given as <suggestion>...</suggestion>.

Maybe you want to read http://wiki.languagetool.org/development-overview

I found two false positives (001 & 002):

        <example>Nothing adds up in the world.</example>
        <example>You should avoid adding sentences in a language other than your own.</example>

#14

Problems are less likely to occur when the text in the correction='...' attribute of <example> and the text enclosed by <suggestion>...</suggestion> are identical.
Thanks for the help!
However, I could not find a way to add special cases (exceptions) with this method. Do you have any suggestions?


(Knorr) #15

I do not exactly know what you mean by 'special cases'. Do you mean <antipattern>?

Let's say you wanted to match "add(s|ing|ed) together" except for sentences that start with "What". Then, you could write a rule like this:

<!-- this is an example rule: -->
<rule>
    <antipattern>
        <token postag="SENT_START"/>
        <token skip="-1">What</token>
        <token inflected="yes">add</token>
    </antipattern>
    <pattern>
        <token inflected="yes">add</token>
        <token>together</token>
    </pattern>
    <message>Did you mean <suggestion>\1 up</suggestion>?</message>
    <example correction="Add up"><marker>Add together</marker> two numbers</example>
    <example>What did you add together?</example>
</rule>

#16

I meant 'exceptions to the rules', for example:
'in a language' and 'in the world' should not be replaced with 'to a language' and 'to the world',
as you've already mentioned.
By the way, I completely messed up the code, so I'm planning to rewrite everything from scratch.
I looked into antipatterns, but they seem a bit complex for a beginner like me. Could you please elaborate on what <antipattern>...</antipattern> does so that I can get the basic idea of its use? I tried to work it out from the XML shipped with LT in "C:\Program Files (x86)\LanguageTool-3.5\org\languagetool\rules\en\grammar.xml" but could not understand anything.
I also have another question: how do I generalise phrases like 'add to' effectively, given that the words ('add' and 'to') may appear anywhere in the sentence? I'm planning to create preposition rules for the most common words we use, so finding the right way is important to me.
I'm looking forward to your help.


(Knorr) #17

OK, I see.
Your patterns already match "add up" anywhere in the sentence. If you had something like

    <pattern>
        <token postag="SENT_START"/>
        <token inflected="yes">add</token>
        <token>up</token>
    </pattern>

only "add up" at the start of a sentence would match.
As its name already suggests, an antipattern is pretty similar to a pattern. Antipatterns are exceptions to matches of a pattern. You can think of them as filters to a pattern's matches.
What do you think the following combination of (anti)patterns does?

    <antipattern>
        <token>we</token>
        <token inflected="yes">add</token>
        <token>up</token>
    </antipattern>
    <antipattern>
        <token inflected="yes">add</token>
        <token>up</token>
        <token></token>
        <token>between</token>
    </antipattern>
    <pattern>
        <token inflected="yes">add</token>
        <token>up</token>
    </pattern>

This will match any occurrence of "add up" (the pattern), except where "we" precedes "add" (1st antipattern) and except where the second word after "up" is "between" (2nd antipattern).
Match: "I added up all numbers"
No match: "We added up all numbers"
No match: "They added up numbers between 0 and 5."
Match: "They added up all integer numbers between 0 and 5"

As you have already experienced, it may be quite complicated (or even impossible) to write rules that have no false positives/negatives.
Of course, to allow "Nothing adds up in the world." you could write an antipattern like this:

        <antipattern>
            <token>up</token>
            <token>in</token>
            <token>the</token>
            <token>world</token>
        </antipattern>

But then "Nothing adds up in this world" would still be marked as wrong. It's always a trade-off: you might exclude "in" from your pattern completely. (Of course, in this case, some errors will be missed, but there will be fewer false positives, too.)
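
One way to cover both "in the world" and "in this world" with a single antipattern is a regexp token. The following is a sketch only: the rule id, name, shortened preposition list, and example sentences are illustrative assumptions, and other determiners ("in our world", "in that world") would still slip through.

```xml
<!-- Sketch only: rule id, name, and the short preposition list are illustrative. -->
<rule id="ADD_UP_PREP_SKETCH" name="add up + preposition (sketch)">
  <antipattern>
    <token inflected="yes">add</token>
    <token>up</token>
    <token>in</token>
    <!-- one token that accepts either determiner -->
    <token regexp="yes">the|this</token>
    <token>world</token>
  </antipattern>
  <pattern>
    <token inflected="yes">add</token>
    <token>up</token>
    <token regexp="yes">in|into|with|at</token>
  </pattern>
  <message>Did you mean <suggestion>\1 \2 to</suggestion>?</message>
  <example correction="added up to">The costs <marker>added up with</marker> a large sum.</example>
  <!-- both sentences are shielded by the antipattern -->
  <example>Nothing adds up in the world.</example>
  <example>Nothing adds up in this world.</example>
</rule>
```

The trade-off described above still applies: each antipattern removes some false positives at the cost of a more complex rule, and dropping "in" from the pattern remains the simpler alternative.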

You might have a look at a rule recently added by MikeUnwalla (link) for "beware + preposition".


#18

@Knorr, thanks!
You are pretty good at explaining complex stuff. Now I understand the purpose of <antipattern>...</antipattern>, at least a little.

My plan is to create and contribute a few rules for the English language to the LT team.
I will do my utmost not to include anything incorrect or inappropriate in my rules before they become a permanent part of LT, and I will do a considerable amount of research before writing a rule.
I'm hopeful that the native speakers in the community will help me out before a rule is published.

As for the 'add_to', 'add_up', 'add_up_to' rule, I haven't started reworking it yet. When I do, I will ask for help right here, in the same thread. Thanks for your effort to help me. I really appreciate your cooperation. Have a good day!


(Knorr) #19

@RuleFreak No problem! You are welcome!


#20

@Knorr, thanks for the assurance.