Suggestion to improve the relevance of corrections

Guillaume · May 18, 2018, 11:56am

I want to go further by modifying the core to improve the relevance of corrections.

E.g. with the sentence “Les têtes fémorale.”, if I add a suggestion using the current attributes, LanguageTool returns two corrections, “fémorales” and “fémoraux”, because I can’t specify the gender of “têtes” in my <match> tag.

<suggestion>\2 <match no="3" postag="(.*)s" postag_regexp="yes" postag_replace="$1p"/></suggestion>

Based on our analysis, we would need to add two attributes to the <match> tag (e.g. “postag_no_search” and “postag_search”) that specify which token the suggestion must look up first to get the additional information.

<suggestion>\2 <match no_search="2" postag_search=". (.*) ." no="3" postag="(.) . s" postag_regexp="yes" postag_replace="$2 $1|e p"/></suggestion>
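To make the intended two-step lookup concrete, here is a minimal Java sketch of what I have in mind. It is not existing LanguageTool code: `PostagLookup` and `targetPostag` are hypothetical names, the tag layout "category gender number" (e.g. "N f p", "J f s") is an assumption, and the way the captured groups are combined is only one possible reading of the postag_replace value above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of LanguageTool) illustrating the two-step lookup.
class PostagLookup {
    // Assumed tag layout: "<category> <gender> <number>", e.g. "N f p" or "J f s".
    private static final Pattern SEARCH = Pattern.compile(". (.*) ."); // postag_search
    private static final Pattern TARGET = Pattern.compile("(.) . s");  // postag

    /** Returns a regex for the wanted tag, or null if either tag does not fit. */
    static String targetPostag(String searchTokenTag, String matchTokenTag) {
        Matcher search = SEARCH.matcher(searchTokenTag);
        Matcher target = TARGET.matcher(matchTokenTag);
        if (!search.matches() || !target.matches()) {
            return null;
        }
        String gender = search.group(1);   // gender of the searched token, e.g. "f"
        String category = target.group(1); // category of the matched token, e.g. "J"
        // Accept the searched token's gender or the epicene tag "e", in the plural.
        return category + " (" + gender + "|e) p";
    }

    public static void main(String[] args) {
        // Noun tag "N f p" and adjective tag "J f s" from the example sentence.
        System.out.println(targetPostag("N f p", "J f s")); // prints: J (f|e) p
    }
}
```

With such a target, only a form tagged "J f p" or "J e p" would be synthesized for token 3, so a masculine plural form would no longer be offered.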

Another option would be to add the postag attributes to the <suggestion> tag to filter the suggestion, e.g.:

<suggestion no="2" postag=". [me] p" postag_regexp="yes">\2 <match no="3" postag="(.) . s" postag_regexp="yes" postag_replace="$1 [me] p"/></suggestion>

<suggestion no="2" postag=". f p" postag_regexp="yes">\2 <match no="3" postag="(.) . s" postag_regexp="yes" postag_replace="$1 f p"/></suggestion>

If the <suggestion> postag matches, LT provides the correction from the <match>; otherwise it provides no result for that suggestion.
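The gating behaviour described above could look roughly like the following Java sketch. Again, this is only an illustration under assumptions: `SuggestionGate` and `suggest` are hypothetical names, and each suggestion is reduced to a guard regex on the noun's tag plus an already synthesized replacement string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not LanguageTool code): keep a suggestion only when
// the postag guard on its <suggestion> element matches the referenced token.
class SuggestionGate {
    /**
     * @param nounTag            POS tag of the token the guard refers to, e.g. "N f p"
     * @param guardedSuggestions guard regex mapped to the replacement it protects
     * @return the replacements whose guard matches the whole tag, in map order
     */
    static List<String> suggest(String nounTag, Map<String, String> guardedSuggestions) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, String> entry : guardedSuggestions.entrySet()) {
            if (nounTag.matches(entry.getKey())) {
                kept.add(entry.getValue());
            }
        }
        return kept;
    }
}
```

For the example sentence, the guard ". [me] p" would protect “fémoraux” and the guard ". f p" would protect “fémorales”; with the noun tag "N f p" for “têtes”, only “fémorales” would be kept.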

Before developing this new feature, I am submitting my issue to make sure that a solution doesn’t already exist.

dnaber · May 20, 2018, 8:25am

Have you checked whether this is possible with a RuleFilter (documented here)? I think it should be possible. The XML might be a bit more verbose, but no new code other than the class that extends RuleFilter would need to be introduced.