[pt] Rule: DEPRECIATIVO

marcoagpinto · April 18, 2022, 1:40am

Hello Ricardo,

I have just coded a rule that checks whether derogatory ("depreciativo") words are being used:

    <!-- TERMOS DEPRECIATIVOS -->
    <rule id='DEPRECIATIVO' name="5. Termos depreciativos" type="style">
      <!-- Created by Marco A.G.Pinto with Ricardo Joseh Lima suggestions, Portuguese rule 2022-04-18 (1-JAN-2022+) -->
      <!-- Vou à loja do monhé. → Vou à loja do indiano. -->
      <pattern>
        <token regexp='yes' inflected="yes">&depreciativo;</token>
      </pattern>
      <message>Este termo é depreciativo, pondere empregar um termo alternativo.</message>
      <short>Termo depreciativo</short>
      <example type="incorrect">Vou à loja do <marker>monhé</marker>.</example>
      <example type="correct">Vou à loja do <marker>indiano</marker>.</example>
    </rule>

Notice that in the rule I use inflected="yes" so that the token matches every inflected form of the words in the entity.

The commit is here:

I have attached the check against 600 000 sentences there, but it only produced one hit, which happens because the entity only has a few words so far.

Do you have any suggestions for the entity words?

Thanks!

marcoagpinto · April 18, 2022, 2:43am

@rjlima

I have organised the rules and added two more derogatory words: “portuga” and “tuga”.
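For anyone following along, an entity such as `&depreciativo;` is typically declared in the DOCTYPE internal subset at the top of the grammar file, as a regular-expression alternation. The sketch below is only an illustration: the word list is an assumption built from the three words named in this thread so far, and the actual entity in the commit may differ.

```xml
<!DOCTYPE rules [
  <!-- Hypothetical contents: only the words mentioned in this thread. -->
  <!-- With regexp='yes' on the token, the "|" separates alternatives. -->
  <!ENTITY depreciativo "monhé|portuga|tuga">
]>
```

Because the token also has inflected="yes", each alternative is compared against the base form of the word, so only the lemmas need to be listed.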

rjlima · April 18, 2022, 2:02pm

Only one hit, indeed: none of these are common in Brazilian Portuguese.
But portuga, unfortunately, is…
And so are many others.
So far I have remembered japa for japonês and china for chinês.
There are also verbs, like denegrir and judiar, both pejorative.
Searching for lists of words classified as politicamente incorreto might help you expand the rule.
The problem is that there are many lists and not all are reliable.
Filtering them would take some time. This week is busy for me, as it ends on Wednesday (carnival in Rio de Janeiro), but I will take a closer look soon, because this topic is very important.

marcoagpinto · April 19, 2022, 3:51am

@rjlima

I have just made small changes in the rule:

I added an exception on the part-of-speech tag “NP.+” (proper nouns) to differentiate between “China” (the country) and “china” (noun, derogatory).
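A minimal sketch of what such an exception could look like inside the pattern is shown here. It is not copied from the commit: it assumes LanguageTool's `<exception>` element with the `postag` and `postag_regexp` attributes, and that `NP.+` covers the proper-noun tags of the Portuguese tagger.

```xml
<pattern>
  <!-- Match any inflected form of a word in the entity... -->
  <!-- ...unless the token is tagged as a proper noun (e.g. "China"). -->
  <token regexp='yes' inflected="yes">&depreciativo;<exception postag="NP.+" postag_regexp="yes"/></token>
</pattern>
```

With this in place, a sentence about the country should no longer be flagged, while the lowercase noun still is, provided the tagger assigns the proper-noun reading only to the capitalised form.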

Basically the only words in the entity are now:

emidia · May 8, 2022, 12:43pm

I’m so glad to see this topic here! It’s a very important service LT is doing through language. It’s also a tough mission.

I agree with Ricardo’s suggestions and I would add:

aleijado (should be “pessoa com deficiência”)
criado-mudo (should be “mesa de cabeceira”)
baianada (shouldn’t be used because it expresses a bias against people from Brazil’s Northeast)

There are others I could look up and post here. There are also expressions like “da cor do pecado”, which refers to black women as sexual objects, or “coisa de preto”, unfortunately used recently by a racist politician to talk about bad habits. Could expressions like these also be in this list?

Some links:

rjlima · May 11, 2022, 11:16am

Hi, good suggestions, but as I said this is a topic that deserves attention and should be taken with extreme care.
I totally agree with baianada entering.
As for criado mudo, it is worth reading this thread: https://twitter.com/agencialupa/status/1463234768942862336
And finally, for ‘pessoas com deficiência’, I am not sure ‘deficiência’ is a word that should be used, because it is generic and there are other alternatives that these people themselves employ.

emidia · May 15, 2022, 7:32pm

Hi, Ricardo! It’s a tough mission and I strongly agree with the extra care about it.

Here are some thoughts, doubts, and comments:

About “criado-mudo”
Thanks for sharing the thread about “criado-mudo”, Ricardo! I didn’t know about this research. In the last tweet about the expression, Agência Lupa indicates: “Alternativa: mesa de cabeceira”. What is your opinion about it?

About “pessoa com deficiência”
I’m afraid I need more details about your concern. In any case, a question: hypothetically, could LT show a “warning” when “aleijado” is identified (something like “this word may not be the best term”)?*

*Is there a place where I could study the warnings LT already uses?

marcoagpinto · May 17, 2022, 12:55am

The site with the “7 palavras preconceituosas” only works if we disable the ad blocker, and it is nothing but spam: I couldn’t find anything there except tons of adverts.

rjlima · May 17, 2022, 10:08am

@emidia, as for ‘criado mudo’, I stand by not suggesting anything; the discussion is controversial. For ‘aleijado’, maybe the user would also like a word as a suggestion, but the way you phrased it is interesting, and for me it solves the issue of ‘what to put in its place’. As I said, ‘deficiência’ is not a word that people who fit in this category tend to use; see pages on capacitismo, for example.