Hello! I am trying to write a suggestion that inflects a verb to match the tense of the verb it replaces. Here’s a pattern and example:
Suppose I wish to change the sentence to “He went on to schedule an interview.” This is straightforward:
<suggestion><match no="1" postag="(V.*)" postag_regexp="yes" postag_replace="$1">go</match> on to \3</suggestion>
My question is: What happens when I want to change the sentence to “He scheduled an interview”? The issue is that the verb after “to” is matched only by its VB tag, so I do not know which verb it is when I write the rule. It’s “dynamic” in the sense that I don’t know it until the rule runs against a sentence. Is there a way to accomplish this?
I naively tried the following, which doesn’t do anything:
<rule id="PROCEED_TO_VB" name="proceed to VB">
  <pattern>
    <token inflected="yes">proceed<exception>proceeding</exception></token>
    <token>to</token>
    <token postag="VB"/>
  </pattern>
  <message>Shortening this sentence may make it clearer.</message>
  <suggestion>\3<match no="1" regexp_match="proceed(.*)" regexp_replace="$1"/></suggestion>
  <suggestion>then \3</suggestion>
  <suggestion><match no="1" postag="(V.*)" postag_regexp="yes" postag_replace="$1">go</match> on to \3</suggestion>
  <example correction="scheduled|then scheduled|went on to schedule">He <marker>proceeded to schedule</marker> an interview.</example>
</rule>
The first suggestion fails because I’m trying to get “to schedule” to conjugate against “He”. Any insights are appreciated. Thanks!
Without having tried it, here is a general tip: The regexp_match and regexp_replace attributes can be prone to bugs. I suggest working with the postags instead:
Yes, this is a fantastic start. However, when we declare:
<match no="1" postag="VBP"/>
What we are actually trying to do is conjugate “schedule” into the same form as “proceed” when we don’t know the form of “proceed” until runtime. When we declare VBP, we are assuming that match 1 is non-3rd person singular present, but it might not be if “proceed” is a different form.
The only workaround I can think of right now is to create at least three different rules targeting the different forms of “proceed” and using the correct postag explicitly.
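As a rough sketch of that workaround, one of the per-form rules might look like the following. The rule id PROCEED_TO_VB_PAST is a hypothetical name, and I am assuming that a match element with a plain postag synthesizes the lemma of token 3 in that form. I have not run this.

```xml
<!-- Hypothetical past-tense variant; the id and name are illustrative only. -->
<rule id="PROCEED_TO_VB_PAST" name="proceeded to VB">
  <pattern>
    <!-- Only the past-tense form of "proceed" -->
    <token postag="VBD" inflected="yes">proceed</token>
    <token>to</token>
    <token postag="VB"/>
  </pattern>
  <message>Shortening this sentence may make it clearer.</message>
  <!-- Assumption: synthesizes the verb from token 3 as VBD -->
  <suggestion><match no="3" postag="VBD"/></suggestion>
  <example correction="scheduled">He <marker>proceeded to schedule</marker> an interview.</example>
</rule>
```

The present-tense variants would presumably swap VBD for VBZ or VBP in both the pattern and the suggestion, which is why this approach needs one rule per form.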
I see! You need to transfer the postag of \1 (any form of proceed, but it needs to be a verb form) to the verb in \3.
In the suggestion, if you want a matching form of the verb in \3, you can use our AdvancedSynthesizerFilter, which was made exactly for this kind of situation. This way, you only need one rule:
<rule id="PROCEED_TO_VB" name="proceed to VB">
  <pattern>
    <token postag="V.*" postag_regexp="yes" inflected="yes">proceed<exception>proceeding</exception></token>
    <token>to</token>
    <token postag="VB"/>
  </pattern>
  <filter class="org.languagetool.rules.en.AdvancedSynthesizerFilter" args="lemmaFrom:3 lemmaSelect:V.* postagFrom:1 postagSelect:V.*"/>
  <message>Shortening this sentence may make it clearer.</message>
  <suggestion>{suggestion}</suggestion>
  <suggestion>then {suggestion}</suggestion>
  <suggestion><match no="1" postag="(V.*)" postag_regexp="yes" postag_replace="$1">go</match> on to \3</suggestion>
  <example correction="scheduled|then scheduled|went on to schedule">He <marker>proceeded to schedule</marker> an interview.</example>
</rule>
The rule now works, but it could be improved for sentences of this form:
He proceeded to prepare and bake a cookie.
Is there a way to improve our rule such that we can leverage multiple {suggestion} tags?
For example, can we do something along the lines of this?
<rule id="PROCEED_TO_VB" name="proceed to VB">
  <pattern>
    <token postag="V.*" postag_regexp="yes" inflected="yes">proceed<exception>proceeding</exception></token>
    <token>to</token>
    <token postag="VB"/>
    <token>and</token>
    <token postag="VB"/>
  </pattern>
  <filter class="org.languagetool.rules.en.AdvancedSynthesizerFilter" args="lemmaFrom:3 lemmaSelect:V.* postagFrom:1 postagSelect:V.*"/>
  <filter class="org.languagetool.rules.en.AdvancedSynthesizerFilter" args="lemmaFrom:5 lemmaSelect:V.* postagFrom:1 postagSelect:V.*"/> <!-- Can we do this? -->
  <message>Shortening this sentence may make it clearer.</message>
  <suggestion>{suggestion} and {suggestion2}</suggestion> <!-- Unsure. -->
  <suggestion>then {suggestion} and {suggestion2}</suggestion> <!-- Unsure. -->
  <suggestion><match no="1" postag="(V.*)" postag_regexp="yes" postag_replace="$1">go</match> on to \3 and \5</suggestion>
  <example correction="prepared and baked|then prepared and baked|went on to prepare and bake">He <marker>proceeded to prepare and bake</marker> a cookie.</example>
</rule>
Obviously, it would be great if the two rules could be a single rule with the last two tokens made optional, but two separate rules can work as well. Any insights help!