
English: set/sat/seat rule


(eska) #1

Hi, found this in the communication of a co-worker and thought I'd make a rule out of it:

<!-- English rule, 2016-10-20 -->
<rule id="SATSEATSET" name="sat/seat/set">
 <pattern>
  <token postag='PRP'></token>
  <!-- only the verb is marked, so the suggestion replaces just this token -->
  <marker>
   <token regexp='yes'>sea?t</token>
  </marker>
  <token>together</token>
 </pattern>
 <message>Instead of "\2", did you mean <suggestion><match no="2" regexp_match="(s)ea?(t)" regexp_replace="$1a$2"/></suggestion>, the past tense form of "sit"?</message>
 <short>Did you mean "sat"?</short>
 <example correction='sat'>We <marker>set</marker> together.</example>
 <example>We sat together.</example>
</rule>
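
For anyone who wants to play with the idea outside LanguageTool, here is a rough Python sketch of what the pattern is meant to catch: a pronoun, then "set" or "seat", then "together". It is only an approximation. The `PRONOUNS` list is a made-up stand-in for the `PRP` tag, and `find_matches` is a hypothetical helper, not part of LanguageTool.

```python
import re

# Hypothetical stand-in for the POS tagger: the real rule relies on the
# PRP tag, not on a fixed word list.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they"}

# Same regular expression as the rule's second token: "set" or "seat".
VERB = re.compile(r"(s)ea?(t)", re.IGNORECASE)


def find_matches(sentence):
    """Return (token_index, original, suggestion) for each pattern hit."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    hits = []
    for i in range(len(tokens) - 2):
        first, verb, last = tokens[i:i + 3]
        if (first.lower() in PRONOUNS
                and VERB.fullmatch(verb)
                and last.lower() == "together"):
            # Mirrors the rule's substitution: keep "s" and "t", insert "a".
            suggestion = VERB.sub(r"\1a\2", verb)
            hits.append((i + 1, verb, suggestion))
    return hits


if __name__ == "__main__":
    for text in ("We set together.", "We seat together.", "We sat together."):
        print(text, find_matches(text))
```

The correct sentence "We sat together." produces no hit, because "sat" does not match `sea?t`.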

(Mike Unwalla) #2

@eska,

Thank you.

I searched the NOW corpus (http://corpus.byu.edu/now/), which has 2.8 billion words. I found 3 incorrect sentences for the structure pronoun+set/seat+together. Thus, I think that this rule is a candidate for the statistics rules (confusion_sets.txt).

@dnaber, when you get time, please look at these pairs, and if applicable, put them in confusion_sets.txt:
seat/sat
set/sat


(Daniel Naber) #3

I've added three pairs: seat/sat, set/sat, seat/set. They work quite well, catching between roughly 35% and 75% of confusions with low false alarm rates. They will become active on languagetool.org later today. Thanks!
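
In case it helps readers picture how a confusion pair can be checked statistically, here is a toy Python sketch. It assumes the check compares how often each candidate word appears in the same context and only suggests the alternative when it is more frequent by some factor. The counts in `TOY_COUNTS`, the `factor` value, and the `check` function are all invented for illustration; they are not LanguageTool's data or API.

```python
# Toy trigram counts, invented for illustration only. Real counts would
# come from a large n-gram corpus.
TOY_COUNTS = {
    ("we", "sat", "together"): 120,
    ("we", "set", "together"): 2,
    ("we", "seat", "together"): 1,
}

# The three pairs mentioned in this thread.
CONFUSION_PAIRS = [("seat", "sat"), ("set", "sat"), ("seat", "set")]


def alternatives(word):
    """Return every word that is paired with `word` in CONFUSION_PAIRS."""
    found = []
    for a, b in CONFUSION_PAIRS:
        if word == a:
            found.append(b)
        elif word == b:
            found.append(a)
    return found


def check(prev, word, nxt, factor=10):
    """Return a clearly more frequent alternative for `word`, or None.

    `factor` is a hypothetical threshold: the alternative must be more
    than `factor` times as frequent as the original in this context.
    """
    original = TOY_COUNTS.get((prev, word, nxt), 0) + 1  # add-one smoothing
    best, best_score = None, original * factor
    for alt in alternatives(word):
        score = TOY_COUNTS.get((prev, alt, nxt), 0) + 1
        if score > best_score:
            best, best_score = alt, score
    return best


if __name__ == "__main__":
    for w in ("set", "seat", "sat"):
        print(w, "->", check("we", w, "together"))
```

A higher `factor` in this sketch means fewer false alarms but also fewer caught confusions, which is the kind of trade-off behind the catch rates quoted above.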