[rule suggestion] "can is" typo

Are the forums the correct place to suggest rules? The rule editor just links to the generic site, which suggests the forum as the first entry point, so here goes:

Sometimes you end up typing both ‘can’ and ‘is’ in a sentence, because you’ve reformulated it a bunch of times.
Here is my suggested rule:

<!-- English rule, 2020-07-26 -->
<rule id="CAN_IS" name="can is">
 <pattern>
  <token postag='CD|PRP|PRP\$|JJ|JJS|JJR|NN:U|DT' postag_regexp='yes' negate_pos='yes'></token>
  <marker>
   <token>can</token>
   <token>is</token>
  </marker>
 </pattern>
 <message>Did you mean either can or is?</message>
 <example correction=''>Solving this more accurately <marker>can is</marker> important for many applications.</example>
 <example>A soda can is a beverage container.</example>
 <example>That can is expired.</example>
 <example>Your can is expired.</example>
 <example>Looks like the old garbage can is being sold.</example>
 <example>Looks like the old can is being sold.</example>
 <example>Doing all that you can is good enough.</example>
 <example>I guess doing all that one can is the best anyone can ask for.</example>
 <example correction=''>Can is there way to help you?</example>
 <example>The oldest can is worth 500$</example>
 <example>An even older can is only worth 2$</example>
 <example correction=''>Finding ways to improve can is important for the future.</example>
</rule>

This works fine for most sentences, but it still gives a false positive for “Maybe live it up while you still can is the best approach”, which I can’t figure out how to fix.

“All one can is cry”

Ah yeah. I assume you mean this as an example wrong sentence, but the rule wouldn’t work for that either. Any idea how to write it better?

Ask @tiff or @Mike_Unwalla

No rule without exceptions. Just try it on a large text collection and check the ratio of false to true positives, as well as their absolute numbers. A very rare false positive is not a problem. The most common exceptions can be added to the rule as such. Building rules often seems easy, but then actual language use kicks in :)

“Maybe ‘live it up while you still can’ is the best approach.” should be a closer match to the spoken form.

@atnas, thanks for your contribution.

If you have a rule that solves or partly solves an issue, you can add your rule as a comment to that issue.

If there is no issue for the problem that your rule solves, I suggest that you make an issue and give examples of the problem. Then, add your rule as a comment. That’s my personal preference. @dnaber, do you have a preference/suggestion?

As @Ruud_Baars wrote, try the rule on a large corpus. As a start, you can test the rule with 250,000 sentences. Use the rule editor with devMode.

The rule that you supplied does not have the correct syntax. It contains examples such as this:
<example correction=''>Can is there way to help you?</example>

The rule editor gives a warning that tells you that <marker> is missing from the ‘incorrect’ examples.
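
For illustration, here is how two of the ‘incorrect’ examples might look once the marker is added. This is only a sketch: it assumes the rule is meant to flag the two words “can is” (as the one marked example in the posted rule does), and it keeps the empty correction attribute from the original.

```xml
<!-- Sketch: 'incorrect' examples with the expected error span wrapped in <marker> -->
<example correction=''><marker>Can is</marker> there way to help you?</example>
<example correction=''>Finding ways to improve <marker>can is</marker> important for the future.</example>
```

The ‘correct’ examples (those without a correction attribute) do not need a marker.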

As @SkyCharger001 shows, the problem is missing quote marks. Possibly, in the source text the phrase was in italics. LanguageTool does not know that. Thus, there is a false warning. There is nothing we can do about it.

I ran your rule in the rule editor (devMode). I found these false positives:

  • This whimsical watering can is filled with 8 oz. of delicious lemon pretzels.
  • After the bottle/can is empty, let the heater run for half an hour.
  • … and features a cover of his song “Southern Can Is Mine.”
  • Among younger speakers, can is more common, with tin referring to a … [Missing quote marks. There is no way to fix the false positive.]
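
One possible way to suppress the first two false positives is an antipattern: text that matches an antipattern is not flagged by the rule. The fragment below is an untested sketch, not a verified fix. It assumes the antipatterns are placed inside the rule before its pattern, and that the slash in “bottle/can” is tokenized as a separate token.

```xml
<!-- Sketch only: would go inside <rule id="CAN_IS">, before <pattern> -->
<antipattern>
  <!-- "watering can is ...": here "can" is a noun -->
  <token>watering</token>
  <token>can</token>
  <token>is</token>
</antipattern>
<antipattern>
  <!-- "bottle/can is ...": assumes "/" is its own token -->
  <token>/</token>
  <token>can</token>
  <token>is</token>
</antipattern>

<!-- Matching 'correct' examples, to be listed with the other examples after <message> -->
<example>This whimsical watering can is filled with 8 oz. of delicious lemon pretzels.</example>
<example>After the bottle/can is empty, let the heater run for half an hour.</example>
```

Antipatterns this narrow each cover only a single phrase, so they are worth adding only for false positives that turn up often in the corpus test.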

I agree, issues are a good place for rules (if you’re familiar with GitHub pull requests, that’s even better).