
[pt] Problem developing rule "Vou apresentar de seguida"

Hello!

While reviewing my thesis, I found that I could improve the wording at the start of subchapters by replacing, for example:
“Vamos apresentar a seguir” with “Apresentamos a seguir”

So I spent about two hours creating the rule.

The problem is that TESTRULES PT reports errors throughout it.

So I tried to break it down and fix the first-person verb first:

<rulegroup id='IR_VERBO-A_DE-SEGUIR-SEGUIDA-INFINITIVO' name="Ir_verbo + a/de seguir/seguida + Verbo_inf">
	<!--      Created by Marco A.G.Pinto, Portuguese rule 2021-02-1 (1-JAN-2021+)      -->
<!--
=EU=
Vou explicar a seguir o processo.
Vou a seguir explicar o processo.
Vou de seguida explicar o processo.
=TU=
Vais explicar a seguir o processo.
Vais a seguir explicar o processo.
Vais de seguida explicar o processo.
=ELE=
Vai explicar a seguir o processo.
Vai a seguir explicar o processo.
Vai de seguida explicar o processo.
=NÓS=
Vamos explicar a seguir o processo.
Vamos a seguir explicar o processo.
Vamos de seguida explicar o processo.
=VÓS/VOCÊS/ELES=
Vão explicar a seguir o processo.
Vão a seguir explicar o processo.
Vão de seguida explicar o processo.
-->	

	  <!-- EU -> APRESENTEI -->
	  <rule> 
		<pattern>
		   <marker>
			<and>
				<token inflected='yes'>ir</token>
				<token postag='VMIP1S0' postag_regexp='no'/>
			</and>
			<token min="0" max="1" regexp='yes'>a|de</token>
			<token min="0" max="1" regexp='yes'>seguir|seguida</token>
			<token postag='VMN0000' postag_regexp='no'/>
		   </marker>
		</pattern>
		<message>Em certos contextos, esta perífrase pode ser simplificada.</message>
		<suggestion><match no='4' postag='VMIP1S0' postag_regexp="yes" postag_replace='VMN0000'/> \2 \3</suggestion>
		<example correction='Explico a seguir'><marker>Vou a seguir explicar</marker> o processo.</example>
  </rule>
  
	</rulegroup>

But I get the error:

Testing rule 2600…
Skipped 0 rules for variant language to avoid checking rules more than once
2671 rules tested.
Exception in thread "main" org.languagetool.rules.patterns.PatternRuleTest$PatternRuleTestFailure: Test failure for rule IR_VERBO-A_DE-SEGUIR-SEGUIDA-INFINITIVO[1] in file /org/languagetool/rules/pt/grammar.xml: Incorrect match position markup (expected match position: 0 - 21, actual: 0 - 12) in sentence: Vou a seguir explicar o processo.
at org.languagetool.rules.patterns.PatternRuleTest.addError(PatternRuleTest.java:310)
at org.languagetool.rules.patterns.PatternRuleTest.testBadSentences(PatternRuleTest.java:447)
at org.languagetool.rules.patterns.PatternRuleTest.lambda$testGrammarRulesFromXML$1(PatternRuleTest.java:339)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Running disambiguator rule tests…
Running disambiguation tests for Portuguese…
100…
200…
290 rules tested (397ms)
Disambiguator tests successful.
Running XML bitext pattern tests…
Bitext pattern tests successful.
Validating false-friends.xml…
Validation successfully finished.

Could Jaume or someone give me a tip on how to fix this?

Then I will apply the fix to the other parts.

Maybe it is something very simple that is missing or wrong.

Thank you!

@jaumeortola

Any clues?

Thanks!

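The first problem is that “seguir” can match the third token and also the fourth token (as an infinitive). In that case the match ends at “Vou a seguir”, which is the 0 - 12 position reported in the error instead of the expected 0 - 21.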

The second problem is that the POS tags in the suggestion are swapped. This is the fixed rule:

<rule> 
     <pattern>
       <marker>
         <and>
           <token inflected='yes'>ir</token>
           <token postag='VMIP1S0' postag_regexp='no'/>
         </and>
         <token min="0" max="1" regexp='yes'>a|de</token>
         <token min="0" max="1" regexp='yes'>seguir|seguida</token>
         <token postag='VMN0000' postag_regexp='no'><exception>seguir</exception></token>
       </marker>
     </pattern>
     <message>Em certos contextos, esta perífrase pode ser simplificada.</message>
     <suggestion><match no='4' postag='VMN0000' postag_regexp="yes" postag_replace='VMIP1S0'/> \2 \3</suggestion>
     <example correction='Explico a seguir'><marker>Vou a seguir explicar</marker> o processo.</example>
   </rule>
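The first problem can be reproduced outside LanguageTool with a small Python sketch. This is not how LanguageTool matches patterns internally. It is a toy matcher, with a made-up two-word list of infinitives (`INFINITIVES`) and hypothetical helper names, that only enumerates which character spans the four-token pattern could cover in the example sentence, with and without the exception on “seguir”.

```python
# Toy lexicon (assumption): the only words tagged as infinitives here.
INFINITIVES = {"seguir", "explicar"}


def tokenize(sentence):
    """Split on spaces and record (word, start, end) character offsets."""
    tokens, pos = [], 0
    for word in sentence.rstrip(".").split():
        start = sentence.index(word, pos)
        tokens.append((word, start, start + len(word)))
        pos = start + len(word)
    return tokens


def build_pattern(exclude_seguir):
    """Return (predicate, optional) pairs mirroring the four rule tokens."""
    def infinitive(word):
        if exclude_seguir and word == "seguir":
            return False  # plays the role of <exception>seguir</exception>
        return word in INFINITIVES

    return [
        (lambda w: w == "vou", False),                 # first-person form of "ir"
        (lambda w: w in {"a", "de"}, True),            # optional a|de
        (lambda w: w in {"seguir", "seguida"}, True),  # optional seguir|seguida
        (infinitive, False),                           # infinitive
    ]


def match_spans(tokens, pattern):
    """Enumerate every character span the pattern can cover from token 0."""
    spans = set()

    def walk(tok_i, pat_i):
        if pat_i == len(pattern):
            spans.add((tokens[0][1], tokens[tok_i - 1][2]))
            return
        accepts, optional = pattern[pat_i]
        if optional:
            walk(tok_i, pat_i + 1)  # skip the optional token
        if tok_i < len(tokens) and accepts(tokens[tok_i][0].lower()):
            walk(tok_i + 1, pat_i + 1)  # consume one word

    walk(0, 0)
    return sorted(spans)


tokens = tokenize("Vou a seguir explicar o processo.")
original_spans = match_spans(tokens, build_pattern(exclude_seguir=False))
fixed_spans = match_spans(tokens, build_pattern(exclude_seguir=True))
print(original_spans)  # [(0, 12), (0, 21)]
print(fixed_spans)     # [(0, 21)]
```

Without the exception there are two possible spans, and 0 - 12 is the one reported in the test failure. With the exception only the expected 0 - 21 remains.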

If I understand the rule, the suggestion is synthesized with the lemma of the fourth token and the POS tag of the first one. This cannot be done with the usual synthesizer, but it can be done with a new filter. Write just one rule (instead of six rules, one for each verb person and number), and I will add the filter to Portuguese.
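To make the idea concrete, here is a rough Python sketch of what such a filter would compute: take the POS tag from the form of “ir” and apply it to the lemma of the infinitive. It is only an illustration, not the actual LanguageTool filter (which would be written in Java and use the real synthesizer). The function name `simplify`, the two tables, and every tag other than `VMIP1S0` are assumptions that follow the same tag scheme, and the toy conjugation only covers regular -ar verbs.

```python
# Present-indicative forms of "ir" mapped to POS tags.
# VMIP1S0 appears in the rule; the other tags are assumed to follow the same scheme.
IR_FORMS = {
    "vou": "VMIP1S0",
    "vais": "VMIP2S0",
    "vai": "VMIP3S0",
    "vamos": "VMIP1P0",
    "ides": "VMIP2P0",
    "vão": "VMIP3P0",
}

# Endings for regular -ar verbs only (toy stand-in for a synthesizer).
AR_ENDINGS = {
    "VMIP1S0": "o",
    "VMIP2S0": "as",
    "VMIP3S0": "a",
    "VMIP1P0": "amos",
    "VMIP2P0": "ais",
    "VMIP3P0": "am",
}


def simplify(auxiliary, infinitive, adverbial=""):
    """Conjugate the infinitive with the person/number of the auxiliary."""
    tag = IR_FORMS[auxiliary.lower()]
    if not infinitive.endswith("ar"):
        raise ValueError("toy synthesizer handles regular -ar verbs only")
    verb = infinitive[:-2] + AR_ENDINGS[tag]
    if auxiliary[0].isupper():
        verb = verb.capitalize()  # keep sentence-initial capitalization
    # Join only the non-empty parts so a missing adverbial leaves no stray space.
    return " ".join(part for part in (verb, adverbial) if part)


print(simplify("Vou", "explicar", "a seguir"))      # Explico a seguir
print(simplify("Vamos", "apresentar", "a seguir"))  # Apresentamos a seguir
```

With a single lookup like this, one rule yields one suggestion for whichever person was matched, which is why six separate rules would not be needed.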

@jaumeortola

Thank you!

At 5am I will create the rule.

I have created similar rules using six rules, one for each person.

I don’t know what a filter is or how it works. Do you mean that I write a rule that suggests the verb in all persons, and then you add something that shows only one person in the results?

Something like:
“Vamos a seguir apresentar”
would suggest all 6 persons in one rule, and then you would create a filter to show just one?
“Apresento a seguir”
“Apresentas a seguir”
“Apresenta a seguir”
“Apresentamos a seguir”
“Apresentam a seguir”

At 5am I will do it.

Thanks!

@jaumeortola

Hello!

I have just committed the rule:

Can you add the filter?

Please note that there is an issue:

=EU=
Vou explicar a seguir o processo.
Vou a seguir explicar o processo.
Vou de seguida explicar o processo.
=TU=
Vais explicar a seguir o processo.
Vais a seguir explicar o processo.
Vais de seguida explicar o processo.
=ELE/ELA=
Vai explicar a seguir o processo.
Vai a seguir explicar o processo.
Vai de seguida explicar o processo.
=NÓS=
Vamos explicar a seguir o processo.
Vamos a seguir explicar o processo.
Vamos de seguida explicar o processo.
=VÓS=
Ides explicar a seguir o processo.
Ides a seguir explicar o processo.
Ides de seguida explicar o processo.
=ELES/ELAS=
Vão explicar a seguir o processo.
Vão a seguir explicar o processo.
Vão de seguida explicar o processo.

The first sentence of each group gets an extra blank space when the suggestion is applied. I have no idea why this happens.

Vou explicar a seguir o processo.
Vais explicar a seguir o processo.
Vai explicar a seguir o processo.
Vamos explicar a seguir o processo.
Ides explicar a seguir o processo.
Vão explicar a seguir o processo.
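One possible cause, not confirmed anywhere in this thread: in these sentences the infinitive comes right after the form of “ir”, so the two optional tokens match nothing, while the suggestion template still has literal spaces around `\2` and `\3`. The Python sketch below only mimics that string assembly, under the assumption that a back-reference to an unmatched optional token expands to an empty string. The function names are hypothetical, and the real LanguageTool behaviour may differ.

```python
def naive_suggestion(verb, token2, token3):
    """Mimic a fixed template with literal spaces between three slots."""
    return f"{verb} {token2} {token3}"


def joined_suggestion(verb, token2, token3):
    """Join only the slots that actually contain text."""
    return " ".join(part for part in (verb, token2, token3) if part)


# "Vou a seguir explicar": both optional tokens matched, no stray spaces.
print(repr(naive_suggestion("Explico", "a", "seguir")))  # 'Explico a seguir'

# "Vou explicar": both optional tokens are empty, trailing spaces remain.
print(repr(naive_suggestion("Explico", "", "")))         # 'Explico  '
print(repr(joined_suggestion("Explico", "", "")))        # 'Explico'
```

If this is what happens, building the replacement from only the non-empty parts would avoid the extra space.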

Also, the rule suggests too many replacements and only 5 are shown (I believe the filter will solve that?)

Thanks!