
[zh] Part of Speech


#1

Hi, I speak Chinese and wanted to create a new rule, but I can’t seem to create rules based on parts of speech the way other languages allow. English, for example, has tags such as CC and CD for different parts of speech. Can I use these for Chinese?


(Daniel Naber) #2

Hi, thanks for your interest in LanguageTool! Unfortunately, support for Chinese is not maintained in LanguageTool, and the tags are not properly documented. But you can use this tool to analyze text and see the tags it assigns: https://community.languagetool.org/analysis/index?lang=zh


#3

Thanks for the reply. What does it mean when the rule editor says “The rule did not find the expected error”? Does it mean I can’t add the rule?


(Daniel Naber) #4

It means the pattern did not match the example sentence, i.e. there’s some problem with the pattern. Maybe it was too strict? If that doesn’t help, you can post the rule here and we’ll try to help if we can.
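
One way to narrow down which token is too strict, as a sketch rather than a fix for any specific rule: replace the most restrictive token with an empty `<token/>`, which matches any single token, and run the example again. The tokens below are borrowed from later in this thread purely for illustration.

```xml
<!-- Hypothetical debugging step: loosen one token at a time. -->
<pattern>
 <token>不</token>
 <token>存在</token>
 <token>有</token>
 <!-- Temporarily replaces <token postag='v'></token>; matches any token -->
 <token/>
</pattern>
```

If the example matches with the loosened token, the part-of-speech constraint was the problem, and the analysis page linked above shows which tag the word actually receives.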


#5

I got it resolved. Finally wrote my first rule:

<!-- Chinese rule, 2018-04-13 -->
<rule id="" name="不存在有/没有">
 <pattern>
  <token>不</token>
  <token>存在</token>
  <token regexp='yes'>有?</token>
  <token postag='v'></token>
 </pattern>
 <message>使用<suggestion>没有</suggestion>比不存在有更简洁。</message>
 <example correction=''><marker>不存在有隐瞒</marker>。</example>
 <example>没有隐瞒。</example>
</rule>

How do I get it submitted?


(Daniel Naber) #6

Thanks! We need two more things, then I can add it: Could you set the ID for that rule (using only these characters: A-Z, _) and could you tell me which category it best fits in? Chinese currently has these categories:

词语错误 (word errors), 成语错误 (idiom errors), 词法-实词 (morphology: content words), 词法-虚词 (morphology: function words), 句法 (syntax)

Also, you’ve probably seen the message “We've checked your pattern (...) and found the following matches. Please consider modifying your rule if these matches are false alarms.” Have you checked that the match it finds is a valid match and not a false alarm?
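
For orientation, here is a rough sketch of how a rule with an ID sits inside a category in a LanguageTool grammar file. `CATEGORY_ID_HERE` and `EXAMPLE_RULE_ID` are hypothetical placeholders, not the real identifiers used in the Chinese rule file, and the `id` attribute on the category is an assumption about the file version in use.

```xml
<!-- Sketch only: both IDs are placeholders, not real identifiers. -->
<category id="CATEGORY_ID_HERE" name="句法">
 <!-- Rule IDs use only A-Z and _ -->
 <rule id="EXAMPLE_RULE_ID" name="...">
  <!-- pattern, message and examples go here -->
 </rule>
</category>
```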


#7

<!-- Chinese rule, 2018-04-13 -->
<rule id="NOT_EXIST_NO" name="不存在有/没有" type="style">
 <pattern>
  <token>不</token>
  <token>存在</token>
  <token regexp='yes'>有?</token>
  <token postag='v'></token>
 </pattern>
 <message>“不存在有”为欧化中文,您可以使用<suggestion>没有</suggestion>。</message>
 <example correction="不存在"><marker>不存在有隐瞒</marker>。</example>
 <example>没有隐瞒。</example>
</rule>

Fixed the rule. It should go in the “句法” (syntax) category. I have also checked the pattern matches, and they are indeed errors.


(Daniel Naber) #8

For me, the test says “Found wrong correction(s) in sentence '不存在有隐瞒。': '[没有]' but expected '[不存在]'”. Could you check that?


#9

Please check if this works:

<!-- Chinese rule, 2018-04-13 -->
<rule id="NOT_EXIST_NO" name="不存在有/没有" type="style">
 <pattern>
  <token>不</token>
  <token>存在</token>
  <token regexp='yes'>有?</token>
  <token postag='v'></token>
 </pattern>
 <message>“不存在有”为欧化中文,您可以使用“<suggestion>没有</suggestion>”。</message>
 <example correction="没有"><marker>不存在有</marker>隐瞒。</example>
</rule>

(Daniel Naber) #10

I get a different error now. You can test it yourself at https://community.languagetool.org/ruleEditor/expert


#11

This passed the test (I also added another example):

<rule id="NOT_EXIST_NO" name="不存在有/没有" type="style">
 <pattern>
  <marker>
   <token>不</token>
   <token>存在</token>
   <token regexp="yes">(有|任何)?</token>
  </marker>
  <token postag='v'></token>
 </pattern>
 <message>“<match no="1"/><match no="2"/><match no="3"/>”为欧化中文,您可以使用<suggestion>没有</suggestion>。</message>
 <example correction="没有">医生<marker>不存在有</marker>误解病人的病历。</example>
 <example correction="没有">政府报告<marker>不存在任何</marker>隐瞒。</example>
</rule>
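
A side note on the optional token, offered as a sketch and not as a correction to the rule above: as far as I understand the pattern syntax, every `<token>` has to match one actual token of the text, so a regular expression that can match the empty string, such as `(有|任何)?`, does not by itself make the token optional. LanguageTool has a `min="0"` attribute for that case. Assuming the tokenizer splits the text the same way as in the examples, a variant that also matches 不存在 followed directly by a verb might look like this:

```xml
<!-- Sketch of a pattern variant; not the rule that was submitted. -->
<pattern>
 <marker>
  <token>不</token>
  <token>存在</token>
  <!-- min="0" makes this token optional, so the regex needs no trailing "?" -->
  <token regexp="yes" min="0">有|任何</token>
 </marker>
 <token postag='v'></token>
</pattern>
```

Whether the broader match is wanted is a call for a Chinese speaker. It would need its own example sentence, another check for false alarms, and a look at how `<match no="3"/>` in the message behaves when the third token is absent.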

(Daniel Naber) #12

And the matches (now 4) are also real matches, not false alarms, is that correct?


#13

Not false alarms.


(Daniel Naber) #14

Thanks - I’ve just added the rule, it will go online at https://languagetool.org today at about 22:30 CEST.