[pt] Problem creating rule - 2020-09-16

Hello @jaumeortola

I was trying to create the following rule:

<rule id='TEM_PARTICIPIO-PASSADO' name='Tem + Particípio passado'>
<!--      Created by Marco A.G.Pinto, Portuguese rule - 2020-09-16 (2-JUL-2020+)   -->
  <pattern>
    <marker>
      <token>tem</token>
      <token postag='VMP00SM' postag_regexp='no'/>
    </marker>
  </pattern>
  <message>Substitua por <suggestion><match no='2' regexp_match='VMP00SM' regexp_replace='VMIP3S0'/></suggestion>.</message>
  <example correction='pesquisa'>O professor <marker>tem pesquisado</marker> o assunto há décadas.</example>
</rule>

But TESTRULES PT gives an error:

Exception in thread "main" org.languagetool.rules.patterns.PatternRuleTest$PatternRuleTestFailure: Test failure for rule TEM_PARTICIPIO-PASSADO[1] in file /org/languagetool/rules/pt/grammar.xml: Incorrect suggestions: Expected 'pesquisa', got: 'pesquisado' on input: 'O professor tem pesquisado o assunto há décadas.'
at org.languagetool.rules.patterns.PatternRuleTest.assertSuggestions(PatternRuleTest.java:525)
at org.languagetool.rules.patterns.PatternRuleTest.testBadSentences(PatternRuleTest.java:417)
at org.languagetool.rules.patterns.PatternRuleTest.testGrammarRulesFromXML(PatternRuleTest.java:318)
at org.languagetool.rules.patterns.PatternRuleTest.runTestForLanguage(PatternRuleTest.java:169)
at org.languagetool.rules.patterns.PatternRuleTest.runGrammarRulesFromXmlTestIgnoringLanguages(PatternRuleTest.java:152)
at org.languagetool.rules.patterns.PatternRuleTest.main(PatternRuleTest.java:683)

What is wrong with it?

Thanks!

Use this suggestion:
<suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP3S0'/></suggestion>
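
For anyone reading along, here is a side-by-side sketch of the two forms and why, as far as I understand the <match> element, the first one returns the word unchanged. The comments are my reading of the error message above, not something taken from the rule file:

```xml
<!-- Original attempt: regexp_match/regexp_replace are applied to the token's
     text ("pesquisado"), not to its POS tag. The regex 'VMP00SM' never matches
     that text, so the suggestion comes back as "pesquisado". -->
<suggestion><match no='2' regexp_match='VMP00SM' regexp_replace='VMIP3S0'/></suggestion>

<!-- Suggested form: postag/postag_replace work on the POS tag, and the
     synthesizer is asked for the same lemma with the new tag ("pesquisa"). -->
<suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP3S0'/></suggestion>
```

This matches the test output: expected 'pesquisa', got 'pesquisado'.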

What are you trying to do? The rule doesn’t make sense to me.

I want to flag sentences such as:
“O Professor TEM PESQUISADO a teoria há muitos anos.”

and suggest replacing them with:
“O Professor PESQUISA a teoria há muitos anos.”

The replacement is the cleaner construction.

So, if the rule finds “TEM” + past participle, it suggests the same verb with the tag VMIP3S0.

Then I will have to create a rule group covering these forms of TER:
TENHO
TENS
TEM
TEMOS
TÊM

and check the word after the past participle to avoid false positives.

@jaumeortola

Does it make more sense this way?

<rule id='TEM_PARTICIPIO-PASSADO' name='Tem + Particípio passado'>
<!--      Created by Marco A.G.Pinto, Portuguese rule - 2020-09-16 (2-JUL-2020+)   -->
  <pattern>
	<token>tem</token>
	<token postag='VMP00SM' postag_regexp='no'/>
  </pattern>
  <message>Substitua por <suggestion><match no='2' regexp_match='VMP00SM' regexp_replace='VMIP3S0'/></suggestion>.</message>
  <example correction='pesquisa'>O professor <marker>tem pesquisado</marker> o assunto há décadas.</example>
  <example type='correct'>O professor <marker>pesquisa</marker> o assunto há décadas.</example>
  <example type='incorrect'>O professor <marker>tem pesquisado</marker> o assunto há décadas.</example>
</rule>

@jaumeortola

Sorry to bother you again.

The following rule works in the stand-alone tool:

<rule id='TER_PARTICIPIO-PASSADO' name='Ter + Particípio passado'>
<!--      Created by Marco A.G.Pinto, Portuguese rule - 2020-09-16 (2-JUL-2020+)   -->
  <pattern>
	<token>tem</token>
	<token postag='VMP00SM' postag_regexp='no'/>
  </pattern>
  <message>Substitua por <suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP3S0'/></suggestion>.</message>
  <example correction='pesquisa'>O professor <marker>tem pesquisado</marker> o assunto há décadas.</example>
  <example type='correct'>O professor <marker>pesquisa</marker> o assunto há décadas.</example>
  <example type='incorrect'>O professor <marker>tem pesquisado</marker> o assunto há décadas.</example>
</rule>

But TESTRULES PT still gives an error:

Skipped 0 rules for variant language to avoid checking rules more than once
2545 rules tested.
Exception in thread "main" org.languagetool.rules.patterns.PatternRuleTest$PatternRuleTestFailure: Test failure for rule TER_PARTICIPIO-PASSADO[1] in file /org/languagetool/rules/pt/grammar.xml: Incorrect suggestions: Expected '', got: 'pesquisa' on input: 'O professor tem pesquisado o assunto há décadas.'
at org.languagetool.rules.patterns.PatternRuleTest.assertSuggestions(PatternRuleTest.java:525)
at org.languagetool.rules.patterns.PatternRuleTest.testBadSentences(PatternRuleTest.java:417)
at org.languagetool.rules.patterns.PatternRuleTest.testGrammarRulesFromXML(PatternRuleTest.java:318)
at org.languagetool.rules.patterns.PatternRuleTest.runTestForLanguage(PatternRuleTest.java:169)
at org.languagetool.rules.patterns.PatternRuleTest.runGrammarRulesFromXmlTestIgnoringLanguages(PatternRuleTest.java:152)
at org.languagetool.rules.patterns.PatternRuleTest.main(PatternRuleTest.java:683)

Any idea of how I can fix this?

Thank you!

Hello!

@jaumeortola @tiff @dnaber @Yakov

I now have the following code, but TESTRULES PT still shows errors.

Does anyone know how to fix it?

Thanks and sorry for all the trouble regarding this.

<rulegroup id='TER_PARTICIPIO-PASSADO' name='Ter + Particípio passado'>
<!--      Created by Marco A.G.Pinto, Portuguese rule - 2020-09-16 (2-JUL-2020+)   -->

  <!-- TENHO -->
  <rule>
	<pattern>
		<marker>
			<token>tenho</token>
			<token postag='VMP00SM' postag_regexp='no'/>
		</marker>	
		<token regexp='yes'>as?|os?|em|há|desde|um|uns|umas?|muitas?|muitos?|com|que</token>
	</pattern>
	<message>Substitua por <suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP1S0'/></suggestion>.</message>
	<example correction='pesquiso'>Eu <marker>tenho pesquisado</marker> o assunto há décadas.</example>
	<example type='correct'>Eu <marker>pesquiso</marker> o assunto há décadas.</example>
	<example type='incorrect'>Eu <marker>tenho pesquisado</marker> o assunto há décadas.</example>
  </rule>

  <!-- TENS -->
  <rule>
	<pattern>
		<marker>
			<token>tens</token>
			<token postag='VMP00SM' postag_regexp='no'/>
		</marker>	
		<token regexp='yes'>as?|os?|em|há|desde|um|uns|umas?|muitas?|muitos?|com|que</token>
	</pattern>
	<message>Substitua por <suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP2S0'/></suggestion>.</message>
	<example correction='pesquisas'>Tu <marker>tens pesquisado</marker> o assunto há décadas.</example>
	<example type='correct'>Tu <marker>pesquisas</marker> o assunto há décadas.</example>
	<example type='incorrect'>Tu <marker>tens pesquisado</marker> o assunto há décadas.</example>
  </rule>

  <!-- TEM -->
  <rule>
	<pattern>
		<marker>
			<token>tem</token>
			<token postag='VMP00SM' postag_regexp='no'/>
		</marker>	
		<token regexp='yes'>as?|os?|em|há|desde|um|uns|umas?|muitas?|muitos?|com|que</token>
	</pattern>
	<message>Substitua por <suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP3S0'/></suggestion>.</message>
	<example correction='pesquisa'>O professor <marker>tem pesquisado</marker> o assunto há décadas.</example>
	<example type='correct'>O professor <marker>pesquisa</marker> o assunto há décadas.</example>
	<example type='incorrect'>O professor <marker>tem pesquisado</marker> o assunto há décadas.</example>
  </rule>
  
  <!-- TEMOS -->
  <rule>
	<pattern>
		<marker>
			<token>temos</token>
			<token postag='VMP00SM' postag_regexp='no'/>
		</marker>	
		<token regexp='yes'>as?|os?|em|há|desde|um|uns|umas?|muitas?|muitos?|com|que</token>
	</pattern>
	<message>Substitua por <suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP1P0'/></suggestion>.</message>
	<example correction='pesquisamos'>Nós <marker>temos pesquisado</marker> o assunto há décadas.</example>
	<example type='correct'>Nós <marker>pesquisamos</marker> o assunto há décadas.</example>
	<example type='incorrect'>Nós <marker>temos pesquisado</marker> o assunto há décadas.</example>
  </rule>	  

  <!-- TÊM -->
  <rule>
	<pattern>
		<marker>
			<token>têm</token>
			<token postag='VMP00SM' postag_regexp='no'/>
		</marker>	
		<token regexp='yes'>as?|os?|em|há|desde|um|uns|umas?|muitas?|muitos?|com|que</token>
	</pattern>
	<message>Substitua por <suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP3P0'/></suggestion>.</message>
	<example correction='pesquisam'>Os senhores <marker>têm pesquisado</marker> o assunto há décadas.</example>
	<example type='correct'>Os senhores <marker>pesquisam</marker> o assunto há décadas.</example>
	<example type='incorrect'>Os senhores <marker>têm pesquisado</marker> o assunto há décadas.</example>
	<example type='correct'>Eles <marker>pesquisam</marker> o assunto há décadas.</example>
	<example type='incorrect'>Eles <marker>têm pesquisado</marker> o assunto há décadas.</example>		
  </rule>	  
  
</rulegroup>

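The problem is the examples written as <example type='incorrect'>. The test expects an empty suggestion for those, which is why it reports Expected '', got: 'pesquisa'. An incorrect example has to carry the expected suggestion instead, e.g. <example correction='pesquiso'>. Where a rule already has the same sentence with a correction attribute, the type='incorrect' duplicate can simply be dropped.

And in the last rule there are two suggestions, so the example has to list both, separated by |: <example correction='pesquisamo|pesquisamos'>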
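
As an illustration, this is roughly what the TENHO rule from the rule group would look like with that change applied. It is only a sketch: the pattern and message are copied unchanged from the post above, the fragment is meant to sit inside the existing <rulegroup>, and I have not run it through TESTRULES myself.

```xml
<!-- TENHO: sketch with the duplicate type='incorrect' example removed -->
<rule>
  <pattern>
    <marker>
      <token>tenho</token>
      <token postag='VMP00SM' postag_regexp='no'/>
    </marker>
    <token regexp='yes'>as?|os?|em|há|desde|um|uns|umas?|muitas?|muitos?|com|que</token>
  </pattern>
  <message>Substitua por <suggestion><match no='2' postag='VMP00SM' postag_regexp="yes" postag_replace='VMIP1S0'/></suggestion>.</message>
  <!-- The correction attribute marks this as the incorrect example and gives the expected suggestion -->
  <example correction='pesquiso'>Eu <marker>tenho pesquisado</marker> o assunto há décadas.</example>
  <example type='correct'>Eu <marker>pesquiso</marker> o assunto há décadas.</example>
</rule>
```

The other rules in the group would follow the same shape, each with its own correction value (pesquisas, pesquisa, pesquisam, and so on).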

@jaumeortola

You are a genius!

Thanks!

I am about to run some tests against a 200 000 corpus, like I did yesterday.