Adding custom Rules


#1

Hi,

I followed the instructions for adding new rules at http://wiki.languagetool.org/tips-and-tricks, but I cannot get my custom rules to work.

As a simple test, I made a rule that just identifies the digit “1” and suggests changing it to “One”. I added it to an external file called my-custom-rules.xml:

<rule id="ONE_DIGIT" name="one digit">
 <pattern>
  <marker>
   <token>1</token>
  </marker>
 </pattern>
 <message>1 should be spelled out. Like One</message>
 <example correction=""><marker>1</marker></example>
 <example>one</example>
</rule>
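
Note that the rule above contains no <suggestion> element, so it can flag “1” but has nothing to offer as a replacement. A minimal sketch of a variant that does offer one, assuming the usual pattern-rule syntax where the suggestion sits inside the message; the example sentences are invented for illustration and are not from the original post:

```xml
<rule id="ONE_DIGIT" name="one digit">
    <pattern>
        <token>1</token>
    </pattern>
    <!-- The suggestion element is what lets LanguageTool propose a replacement. -->
    <message>Spell out the digit: <suggestion>one</suggestion></message>
    <!-- Incorrect example: the correction attribute should equal the suggestion. -->
    <example correction="one">I bought <marker>1</marker> apple.</example>
    <!-- Correct example: the rule must not match here. -->
    <example>I bought one apple.</example>
</rule>
```

The lowercase “one” is used here because the example places the digit mid-sentence; the wording of the message is a placeholder.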

I then tried to reference it from the grammar.xml file by declaring an entity in the <!DOCTYPE rules [ section:

<!ENTITY CustomRules SYSTEM "file:///home/my-custom-rules.xml">

Then I placed &CustomRules; after the last rule in a category section:
…
</rule>
&CustomRules;
</category>
</rules>

If I restart my server, the rule is not recognized.


(Daniel Naber) #2

Just to be sure the issue is really with the rule being in an extra file: does it work when in grammar.xml? Do you get an error when you specify a non-existing path at !ENTITY CustomRules SYSTEM?


#3

I’ll find out real quick. Should I only be restarting the server after editing the files, or does it take more than a simple restart?


(Daniel Naber) #4

Restart should be enough.


#5

I added the rule to the grammar.xml file, and it’s still not working, which is odd because it’s a simple rule. This time, however, LanguageTool does not raise any errors.

Before, I kept getting “Error: Internal Error: java.lang.RuntimeException: Could not activate rules”

Also, if I make a change and simply restart the server, the error does not come up. It is only raised after running ‘./build.sh languagetool-standalone package -DskipTests’ again.


#6

This is the intro part of my grammar.xml

<!DOCTYPE rules [
<!ENTITY CustomRules SYSTEM "file:///home/custom-rules.xml">
<!ENTITY weekdays "Monday|Wednesday|T(ue|hur)sday|Friday|S(atur|un)day">
<!ENTITY abbrevWeekdays "Mon?|Tue?|Wed?|Thu?|Fri?|Sat?|Sun?">
<!ENTITY months "January|February|March|April|May|Ju(ne|ly)|August|September|October|November|December">
<!ENTITY abbrevMonths "Jan|Feb|Mar|Apr|Ju[ln]|Aug|Sept?|Oct|Nov|Dec">
<!ENTITY languages "Akan|Amharic|Arabic|Assamese|Awadhi|Azerbaijani|Balochi|Bangla|Belarusian|Bengali|Bhojpuri|Burmese|Cantonese|Cebuano|Chewa|Chhattisgarhi|Chittagonian|Czech|Deccan|Dhundhari|Dutch|English|Filipino|French|Fula|Gaelic|German|Greek|Gujarati|Hakka|Haryanvi|Hausa|Hiligaynon|Hindi|Hmong|Hunanese|Hungarian|Igbo|Ilocano|Ilonggo|Indonesian|Italian|Ja[pv]anese|Jin|Kannada|Kazakh|Khmer|Kinyarwanda|Kirundi|Konkani|Korean|Kurdish|Madurese|Magahi|Maithili|Malagasy|Malay(alam)?|Malaysian|Mandarin|Marathi|Marwari|Mossi|Nepali|Odia|Oriya|Oromo|Pashto|Persian|Polish|Portuguese|Punjabi|Quechua|Romanian|Russian|Saraiki|Serbo-Croatian|Shona|Sindhi|Sinhalese|Somali|Spanish|Sundanese|Swedish|Sylheti|Tagalog|Tamil|Telugu|Thai|Turk(ish|men)|Ukrainian|Urdu|Uyghur|Uzbek|Vietnamese|Visayan|Wu|Xhosa|Xiang|Yoruba|Yue|Zhuang|Zulu"><!-- Most are from https://en.wikipedia.org/wiki/List_of_languages_by_number_of_native_speakers -->
]>

This is the end of my grammar.xml

    <rule id="EUPUB_VISA" name="visa">
        <pattern>
            <token inflected="yes">visa</token>
        </pattern>
        <message>Visa is misused to render not only 'approval', but also the act of giving approval. In English, a visa is generally 'an official authorisation appended to a passport, permitting entry into and travel within a particular country or region'. It is also the name of a credit card. Alternatives: approval, endorsement, to approve, to endorse.</message>
        <url>http://euenglish.webs.com/</url>
        <short>EU English: visa</short>
        <example type="incorrect" correction="">The centralised ex-ante <marker>visa</marker> performed by the Financial Controller, …</example>
        <example type="incorrect" correction="">The Finance Section has to <marker>visa</marker> all transactions before they can be authorised.</example>
        <example type="incorrect" correction="">… the ex-ante <marker>visa</marker> of the Delegation would be suspended unless…</example>
        <example>The centralised ex-ante <marker>approval</marker> performed by the Financial Controller, …</example>
    </rule>
    &CustomRules;
</category>

This is my custom-rules.xml file.

<rule id="ONE_DIGIT" name="one digit">
 <pattern>
  <marker>
   <token>1</token>
  </marker>
 </pattern>
 <message>1 should be spelled out. Like One</message>
 <example correction=""><marker>1</marker></example>
 <example>one</example>
</rule>

(Daniel Naber) #7

What’s the full error message? You should first make the rule work in grammar.xml and without any Java errors before you move it to its own file.


#8

I’ll post the full message shortly. I need to recreate it.

BTW, the error is not raised when the rule is just in the grammar.xml file.


(Daniel Naber) #9

Your rule won’t work there because the whole category MISUSED_TERMS_EU_PUBLICATIONS is off by default (default="off").


#10

Thanks for pointing that out! I added a new category and set default="on", and now the rule works.
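
For readers following along, a new category of this kind might look roughly like the sketch below. The category id and name are hypothetical (the thread does not give them); the rule is the ONE_DIGIT test rule from earlier in the thread, kept inline so it can be checked in grammar.xml before being moved to an external file:

```xml
<!-- Sketch only: id="CUSTOM_RULES" and name="Custom rules" are made-up placeholders. -->
<category id="CUSTOM_RULES" name="Custom rules" default="on">
    <rule id="ONE_DIGIT" name="one digit">
        <pattern>
            <token>1</token>
        </pattern>
        <message>1 should be spelled out. Like One</message>
        <example correction=""><marker>1</marker></example>
        <example>one</example>
    </rule>
</category>
```

Such a category would sit alongside the existing categories, before the closing </rules> tag. Once the inline rule works, the <rule> element could presumably be swapped for the &CustomRules; reference, provided the entity is declared in the DOCTYPE.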

But I had to run,

./build.sh languagetool-standalone package -DskipTests

and then restart the server for any changes to take effect.

Shouldn’t I be able to just restart the server?

Thanks!


(Daniel Naber) #11

Yes. It sounds like you modified the file in the source tree rather than the one in the build output. If you just work with the ZIP, there’s no such confusion.


#12

Okay, the grammar.xml file I am editing is located at:
languagetool/languagetool-language-modules/en/src/main/resources/org/languagetool/rules/en/grammar.xml

Where can I find the build files? I am working on Ubuntu 16.04 through a terminal.

I start the server with:

java -cp /languagetool/languagetool-standalone/target/LanguageTool-4.2-SNAPSHOT/LanguageTool-4.2-SNAPSHOT/languagetool-server.jar org.languagetool.server.HTTPServer --port 8081

Thank you for all your help!


(Daniel Naber) #13

Everything under target is generated by Maven. I suggest either using our snapshots or, if you actually need to make code changes, working in your IDE and only building at the very end (i.e. modify, run, test, repeat, all within the IDE).


#14

Awesome, thank you!