
Help needed with 'suppress_misspelled'


(Mike Unwalla) #1

For the Oxford spelling rules for verbs, I want to prevent a rule from giving a message for verbs such as advertise, advise, appraise, and chastise. I can do that by putting the verbs into an exception on the token. To make the rule as accurate as possible, I would also like to use ‘suppress_misspelled’, because I probably do not have a full list of the verbs that cannot be spelled with ‘ize’. However, I cannot get this to work.
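
For illustration only, the exception approach mentioned above might look roughly like the token below. This is a sketch, not a tested rule: the token regexp is an assumption, and the exception list holds only the four verbs named in this post.

```xml
<!-- Sketch only: the exception list covers just the four verbs named above. -->
<token regexp="yes">[a-z]+ise<exception regexp="yes">advertise|advise|appraise|chastise</exception></token>
```

The drawback is the one already described: any verb missing from the exception list still triggers the rule.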

“You can even suppress the whole rule from being matched if you use the same attribute for any suggestion element” (http://wiki.languagetool.org/development-overview#toc6). I think this sentence means that I must put ‘suppress_misspelled’ into both the suggestion and the match, as is done in some of the rules in English grammar.xml.

The rule that follows correctly ignores ‘advise’ but gives a message for ‘televise’. Any ideas why?

<rule id="TEST_SUPPRESS_MISSPELLED1" name="Test: suppress_misspelled">
    <pattern>
        <token regexp="yes">([a-z]+?)ise</token>
    </pattern>
    <filter class="org.languagetool.rules.en.EnglishPartialPosTagFilter"
        args="no:1 regexp:(?i)([a-z]+?ise) postag_regexp:VBP?"/>
    <message>TEST1. The word '\1' is not the Oxford spelling. Use '<suggestion suppress_misspelled="yes"><match suppress_misspelled="yes" no="1" regexp_match="([a-z]+?)ise" regexp_replace="$1ize"/></suggestion>'.</message>
    <example correction="organize">The verb '<marker>organise</marker>' is not the Oxford spelling.</example>
    <example>The word '<marker>organize</marker>' is the Oxford spelling.</example>
    <example>We <marker>advise</marker> you to be careful.</example>
    <example correction="televize">They will <marker>televise</marker> the football match.</example>
</rule>

(Daniel Naber) #2

I see no obvious reason, so I guess this would need real debugging in the Java code. Just so I understand: “Oxford spelling” is not simply en-GB but a more specific variant, and it should be possible to enable it via rules?


(Mike Unwalla) #3

Oxford spelling is a style preference that is applicable to en-GB. The Oxford Dictionaries blog has a good summary: https://blog.oxforddictionaries.com/2011/03/28/ize-or-ise/.

My plan is to create a set of rules that a user can enable to check for Oxford spelling (that is, find ~ise spellings).


(Mike Unwalla) #4

The unexpected behaviour is not a problem for me now.

I found an alternative method: put the POS in the token and use regexp_match and regexp_replace in the suggestion. With that method, suppress_misspelled works as I expect.
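
A rule along those lines might look like the sketch below. It is an untested illustration, not the actual rule: the id, name, and message label are hypothetical, the POS regexp `VBP?` and the `suppress_misspelled` attributes are carried over from the rule in post #1, and the examples only restate the behaviour described in this thread.

```xml
<!-- Sketch only: hypothetical id and name; behaviour not verified against LT. -->
<rule id="TEST_SUPPRESS_MISSPELLED2" name="Test: suppress_misspelled with POS in token">
    <pattern>
        <!-- The POS restriction sits on the token, so no filter element is used. -->
        <token regexp="yes" postag="VBP?" postag_regexp="yes">[a-z]+ise</token>
    </pattern>
    <!-- regexp_match and regexp_replace build the ~ize form inside the suggestion. -->
    <message>TEST2. The word '\1' is not the Oxford spelling. Use '<suggestion suppress_misspelled="yes"><match suppress_misspelled="yes" no="1" regexp_match="([a-z]+?)ise" regexp_replace="$1ize"/></suggestion>'.</message>
    <example correction="organize">The verb '<marker>organise</marker>' is not the Oxford spelling.</example>
    <example>We <marker>advise</marker> you to be careful.</example>
</rule>
```

The idea is that a suggestion such as 'advize' or 'televize' is not a correctly spelled word, so `suppress_misspelled` should keep the rule from matching those verbs.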


(Fabian Koglin) #5

Oxford spelling actually has its own IETF code, en-GB-oxendict. However, the differences aren’t so large that it couldn’t be implemented with a handful of optional rules in en-GB.


(Mike Unwalla) #6

Yes, I know. But thanks for making sure that I know.

This morning I found an OpenOffice .oxt file: https://sourceforge.net/projects/aoo-extensions/files/1881/4/en_gb-oed.oxt/download. Possibly, we could use that as a source, but I don’t know about copyright status.

Also, @dnaber, I think that the Oxford spelling rules are a good candidate for inclusion in the premium version of LT only. If you agree, tell me, and I will send you the prototype Oxford spelling rules that I have.


(Daniel Naber) #7

Thanks. The concept of the premium rules is currently that they are active by default, i.e. there isn’t even a UI to enable rules on the website. I understand the Oxford rules would be optional and not active by default?


(Aafreen) #8

How did you specify this filter option? For which classes can this option be used?


(Mike Unwalla) #9

Yes (if I put the rules into LT). But I thought that maybe the rules would be better only in the premium version, because they are useful for professional proofreaders.


(Mike Unwalla) #10

As far as I know, the only documentation for EnglishPartialPosTagFilter is in CHANGES.txt (for LT 2.8). Line 165 and the lines that follow explain how to use the filter.