[pt] Rule: PARA [NÃO] TER DE

Hello @rjlima

So, here is our first 2022 rule together.

A few weeks ago, I wrote in a document:

Lembre-se de gravar as preferências das técnicas para que não tenha de as alterar manualmente constantemente.

So, I thought it could be simplified:

para que não tenha de → para não ter de

The “não” token can be optional: min="0" max="1".
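For illustration, an optional “não” might be written roughly like this in the rule's pattern. This is only a sketch: the literal “tenha” and “de” tokens are placeholders assumed for the example (a real rule would more likely match the verb by POS tag), and no suggestion is shown.

```xml
<!-- Sketch only, not a finished rule: "não" may occur zero times or once. -->
<pattern>
    <token>para</token>
    <token>que</token>
    <token min="0" max="1">não</token>
    <!-- Placeholder: a literal form stands in for the verb "ter" here. -->
    <token>tenha</token>
    <token>de</token>
</pattern>
```

How backreferences in a suggestion behave when the optional token is skipped is something to confirm with the rule's own examples before relying on it.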

There is still the problem of making “ter” agree with the singular or plural of the original sentence, but I can run some tests against a corpus of 600 000 sentences.

What should I use for the rule ID and its description? Something like “Simplificar: Para + Que + [não] + Verbo ter + de → para + [não] + Verbo ter + de”?

This is my biggest difficulty: I am terrible at picking names.

Also, I need your advice on whether the last token should be limited to “de” or whether other tokens (prepositions) can be used as well.

Thanks!

Hi @marcoagpinto, answering your questions:

  • on naming: there seem to be other rules similar to this one, such as “Ele disse que tem aulas → Ele disse ter aulas”. It may be useful to look at the names of those rules and create one name for all these cases. One idea: “Usar a versão com infinitivo da subordinada”
  • on other prepositions: not a preposition, but in Brazil we also use “que” in this position, as in “para que não tenha que”
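If the rule should cover both variants, one option might be a regular-expression token in the last position. A minimal sketch, assuming literal tokens for the rest of the pattern purely for illustration:

```xml
<!-- Sketch only: the last token accepts either "de" or "que". -->
<pattern>
    <token>para</token>
    <token>que</token>
    <token min="0" max="1">não</token>
    <!-- Placeholder: a literal form stands in for the verb "ter" here. -->
    <token>tenha</token>
    <token regexp="yes">de|que</token>
</pattern>
```

With this shape, both “para que não tenha de” and “para que não tenha que” would be candidates for a match.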

Ahhhh… good to know… I will implement the rule tomorrow.

To make testing easier, the names will become:

“Usar a versão com infinitivo da subordinada [1]”
“Usar a versão com infinitivo da subordinada [2]”
“Usar a versão com infinitivo da subordinada [3]”

If I place all rules in one group, it will be harder to fix individual rules.

Unless someone explains to me how to test individual rules in a group.

@jaumeortola @udomai How do I trigger only one rule in a group when testing against the 600 000 sentences? Thanks!

@jaumeortola @udomai

I am using the command:

java -Dfile.encoding=UTF-8 -Xmx4500M -jar languagetool-wikipedia.jar check-data -l pt-PT -r GENERAL_NUMBER_AGREEMENT_ERRORS -f pt-PT.txt -f tatoeba-pt.txt --max-sentences 600000 --context-size 100 >afternew.txt

@rjlima

I have created the rule:

	<!-- USAR EXPRESSÕES MAIS SIMPLES SUGERINDO VERBOS NO INFINITIVO -->
    <rulegroup id='SIMPLIFICAR_CONVERTER_PARA_VERBO_INFINITIVO' name="Simplificar: Usar a versão com infinitivo da subordinada" type="style">
    <!--      Created by Marco A.G.Pinto with Ricardo Joseh Lima suggestions, Portuguese rule 2022-02-04 (1-JAN-2022+)      -->
	<!--
#1 - 1/2:
Grave as preferências para que não tenha de as alterar constantemente. → Grave as preferências para não ter de as alterar constantemente.
#2 - 2/2:
Grave as preferências para que tenha de as alterar constantemente. → Grave as preferências para ter de as alterar constantemente.
	-->
	
<!-- RULE #1 - PARA QUE <NÃO> TENHA → PARA <NÃO> TER - 1/2 -->
	<!--
Grave as preferências para que não tenha de as alterar constantemente. → Grave as preferências para não ter de as alterar constantemente.
	-->
		<rule>
			<pattern>
				<token>para</token>
				<token>que</token>
				<token postag='RN' postag_regexp='no'/>
				<token postag='VMM0.+' postag_regexp='yes'>
					<exception postag_regexp='yes' postag='VM[^MS].+'/>
				</token>
			</pattern>
			<message>Em certos contextos, esta perífrase pode ser simplificada.</message>
			<suggestion>\1 \3 <match no='4' postag='VMM0(.)(.)0' postag_regexp="yes" postag_replace='VMN0$1$20'/></suggestion>
			<example correction="para não ter">Grave as preferências <marker>para que não tenha</marker> de as alterar constantemente.</example>
		</rule>
<!-- RULE #2 - PARA QUE TENHA → PARA TER - 2/2 -->
	<!--
Grave as preferências para que tenha de as alterar constantemente. → Grave as preferências para ter de as alterar constantemente.
	-->
		<rule>
			<pattern>
				<token>para</token>
				<token>que</token>
				<token postag='VMM0.+' postag_regexp='yes'>
					<exception postag_regexp='yes' postag='VM[^MS].+'/>
				</token>
			</pattern>
			<message>Em certos contextos, esta perífrase pode ser simplificada.</message>
			<suggestion>\1 <match no='3' postag='VMM0(.)(.)0' postag_regexp="yes" postag_replace='VMN0$1$20'/></suggestion>
			<example correction="para ter">Grave as preferências <marker>para que tenha</marker> de as alterar constantemente.</example>
		</rule>	

    </rulegroup>

It gives 290 hits, and they all seem valid after a quick scroll through the results.

The results are attached:
afternew5.txt (139.2 KB)

Do you find any mistakes?

Some verbs appear between “()” in the suggestions; maybe they have an incorrect POS tag. I will try to fix them in the coming months.

Great! It all looks correct to me.

Good to know 🙂
