Style guide suggestions: Allow more than 1000 rules and distinguish between languages

Heyas! I’ve been using LanguageTool for a few years now, and I’ve been a premium user for nearly two years, but only very recently have I started setting up my own style guide rules.

As of right now, my style guide CSV has more than 1200 entries, and I still want to add several hundred more, but I’ve hit the max limit of 1000.
While more than 1000 custom rules might sound like a lot, the vast majority of those rules are common words and expressions of my Catalan dialect. This also includes verb replacements, and each verb has 70+ verbal forms that are manually replaced; that’s the reason why my number of rules is so high.

I would like to know whether raising the style guide limit above 1000 rules is planned, because it would be very useful for me.

Another thing I would like to suggest: the vast majority of my style guide rules are for Catalan, and on the rare occasions when I write in Spanish, a lot of my Catalan rules are triggered, because Catalan and Spanish share many similar words.

I’d like to suggest a new column in the CSV file to specify which language or languages each rule should apply to.
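To make the language-column idea concrete, here is a minimal sketch in Python. The CSV layout, the column names (`source`, `replacement`, `languages`) and the `;` separator are all assumptions made for illustration; they are not LanguageTool’s actual style guide format.

```python
import csv
import io

# Hypothetical CSV layout: the "languages" column is the proposed addition.
SAMPLE_CSV = """source,replacement,languages
passejar,tombar,ca
elevator,lift,en-GB
sortir,eixir,ca
"""

def rules_for_language(csv_text, language):
    """Return (source, replacement) pairs that apply to `language`."""
    rules = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: several codes can share one cell, separated by ';'.
        # An empty cell means the rule applies to every language.
        codes = [code.strip() for code in row["languages"].split(";") if code.strip()]
        if not codes or language in codes:
            rules.append((row["source"], row["replacement"]))
    return rules
```

With a filter like this, checking a Spanish text (`"es"`) would pick up none of the Catalan rules, while a Catalan text (`"ca"`) would still get all of them.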

Hi Xavier. Thanks for your comments.

Regarding the limitations of the style guide, I will send your message to the product team.

Is there anything we can do to improve the grammar checker and avoid so many custom rules? For example, add new words to the dictionary, add more rules, or add more user options, that could be useful for all users in general. We are willing to make those improvements.

Jaume Ortolà

Hello Xavier,

Thank you for reaching out to us. We’re delighted to hear that you’ve been enjoying LanguageTool thus far. Regarding the issues you’ve encountered with the personal style guide:

Unfortunately, as of now, the feature is not set up in a way to fully accommodate the use case you’ve described. However, we have noted your feedback and will investigate long-term solutions. Thank you for bringing this to our attention and making us aware of your use case. In the short term, as Jaume mentioned above, please feel free to let us know what kinds of rules you’re missing in Catalan, so that everyone can benefit from your insights.

Best regards,
Christian

Hi Jaume and Christian, thanks for your answers!

Yes, here are four suggestions related to the style guide feature that would help me (and hopefully other users too).

1) Verb replacements

Verb replacements take up the largest share of my style guide: I manually replace every form of 9 Catalan verbs, for a total of more than 600 rules. The replacements are verbs that are more common in the Catalan dialect I speak (català tarragoní), or simply verbs that I’m more used to saying.

If LanguageTool had a way to suggest replacements for whole verbs, so that I could, say, replace the entire verb “passejar” with “tombar” in a single style guide rule instead of writing so many, it would save a lot of rules and a lot of time.

That said, I understand that such a system might be pretty complex or even outright unviable: in languages like English each verb has only 5 forms, but in Catalan and other languages there can be 80 or more (especially if we also replace Valencian and Balearic forms).

As of right now, here is the list of the verbs that I manually replaced:

  • passejar → tombar
  • mirar → guaitar
  • veure → guipar
  • malvendre → abarnegar
  • sortir → eixir
  • gaudir → xalar
  • trepitjar → palcigar
  • rapar → xollar
  • atipar → arrigolar

Btw, the verbs “abarnegar”, “palcigar” and “arrigolar” don’t seem to be in LanguageTool’s default Catalan dictionary, even though they are acceptable (albeit rarely used) verbs; I had to add them to my personal dictionary.
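The “one rule per verb” idea from this point could look roughly like the Python sketch below. It is only an illustration under strong assumptions: it handles nothing but the infinitive and the present indicative of regular first-conjugation (-ar) verbs, and `expand_verb_rule` is a hypothetical helper, not something LanguageTool provides. Verbs with stem spelling changes or irregular paradigms (for example “veure” or “sortir”) would need a real morphological dictionary instead of suffix swapping.

```python
# Present indicative endings of regular -ar verbs (central Catalan forms).
PRESENT_ENDINGS = ["o", "es", "a", "em", "eu", "en"]

def expand_verb_rule(source_lemma, target_lemma, endings=PRESENT_ENDINGS):
    """Expand one lemma-level rule into one rule per verb form.

    Assumes both verbs are regular -ar verbs with no stem changes.
    """
    if not (source_lemma.endswith("ar") and target_lemma.endswith("ar")):
        raise ValueError("this sketch only handles regular -ar verbs")
    source_stem = source_lemma[:-2]
    target_stem = target_lemma[:-2]
    # Start with the infinitive, then add one rule per conjugated form.
    rules = [(source_lemma, target_lemma)]
    rules += [(source_stem + ending, target_stem + ending) for ending in endings]
    return rules
```

For instance, `expand_verb_rule("mirar", "guaitar")` produces 7 rules, pairing “miro” with “guaito”, “mires” with “guaites”, and so on. Covering every tense and mood the same way is what turns 9 verbs into several hundred rules.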

2) Option to disable the Catalan diacritic accent rule

This is not directly related to the style guide feature, but it would help reduce the number of rules in my style guide, so I’ll mention it here anyway.

Whenever possible (or at least, whenever I remember… xD), I try to use the Catalan diacritic accents according to the old rules (the pre-2016 rules).

Whenever I add a diacritic accent to a word, LanguageTool suggests removing it, because it follows the new rules rather than the old ones, and it shows the message "Hi sobra l’accent diacrític (segons les normes noves; desactiveu la regla si voleu les normes tradicionals)" (roughly: “The diacritic accent is superfluous here (according to the new rules; disable the rule if you want the traditional rules)”). But there is no toggle to disable this rule, as the message suggests. Is this message taken as-is from the Softcatalà corrector, where this same rule and such a toggle do exist?

For now, as a workaround, I always ignore LanguageTool’s diacritic rule, and I have over a hundred custom rules in my style guide (133 to be precise) that remind me of those diacritics when necessary. These rules could be avoided if LanguageTool’s interface had a toggle to prefer the old diacritic accent style, just like the one in the Softcatalà corrector.

Also, while I’m asking about disabling rules, I would also like a way to toggle off the “Cal evitar el «‘lo’ neutre».” rule (“The neuter ‘lo’ should be avoided”), as I use the neuter “lo” very regularly, and I even use it as a gender-neutral pronoun in some informal contexts.

3) Allow for multiple source phrases in a single rule

In a style guide rule, you can put several replacement phrases, but you cannot put several source phrases. Why?

Allow me to explain with an example: in my beloved català tarragoní, the adjective “arrigol” can be used (in certain contexts) as a substitute for the adjectives “fart”, “tip” or “satisfet”, which are more common elsewhere. Currently, my style guide needs 12 individual rules to cover the 3 source adjectives and the 4 forms of each.

If I were allowed to place several source phrases in the same rule, those 12 rules could be condensed into just the 4 below:

  • fart,tip,satisfet → arrigol
  • farta,tipa,satisfeta → arrigola
  • farts,tips,satisfets → arrigols
  • fartes,tipes,satisfetes → arrigoles

Here’s another example: I personally prefer to say “camí” or “volta” instead of “vegada”, “intent” or “cop” in certain contexts, so I currently have 6 rules, one for each of those three source words in the singular and in the plural.

Those could be condensed into just the 2 below:

  • vegada,intent,cop → camí,volta
  • vegades,intents,cops → camins,voltes
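As a rough illustration of point 3, the Python sketch below expands one multi-source rule back into the single-source rules that the style guide accepts today. The `sources → replacements` notation is just the shorthand used in the lists above, and `expand_multi_source` is a hypothetical helper, not part of LanguageTool.

```python
def expand_multi_source(rule, arrow="→"):
    """Split 'a,b,c → x,y' into one (source, [replacements]) pair per source."""
    sources_part, replacements_part = rule.split(arrow)
    sources = [s.strip() for s in sources_part.split(",")]
    replacements = [r.strip() for r in replacements_part.split(",")]
    # Every source phrase shares the same list of suggested replacements.
    return [(source, replacements) for source in sources]

ADJECTIVE_RULES = [
    "fart,tip,satisfet → arrigol",
    "farta,tipa,satisfeta → arrigola",
    "farts,tips,satisfets → arrigols",
    "fartes,tipes,satisfetes → arrigoles",
]

# Flatten the 4 multi-source rules into today's one-source-per-rule form.
expanded = [pair for rule in ADJECTIVE_RULES for pair in expand_multi_source(rule)]
```

Here `len(expanded)` is 12, which matches the 12 individual rules mentioned above, so the multi-source form stores the same information in a third of the entries.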

4) Allow for regular expressions in the source phrases

This is another thing that I think would be very useful: allowing a regular expression in place of the source phrase of a rule.

Here’s an example of how a regular expression could be useful. In my style guide, I have a bunch of rules replacing the “he or she” expression with the singular they pronoun, as the webpage https://noheorshe.com suggests. But I need separate rules for every form of those pronouns, and separate rules again for each way of writing the pair (“he or she”, “he / she”, “he/she”, “he & she”, etc…).

Should regular expressions be compatible with the style guide, I could tell LanguageTool to replace everything that matches the expression
[Hh][Ee](\s*/\s*|\s*&\s*|\sor\s|\sand\s|\sand\s*/?\s*or\s)[Ss][Hh][Ee]
with “they” (the expression above is written in C# regex syntax, because that’s the flavor I know best). With expressions like this one, all the “he or she” rules in my style guide would take only 4 rules instead of the current 48; the expression above alone would condense 8 of them into one.

I also know that some regex engines support search-and-replace, where the replacement is computed from what the expression matched; if that were possible, several complex rules could be condensed into a single big regex. But I don’t know whether the regex engine you use supports this, so I don’t know how viable it is.
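A minimal sketch of both ideas from point 4, written with Python’s `re` module rather than C# and purely as an illustration (LanguageTool’s style guide does not accept anything like this today). The connector part is taken from the expression quoted above; the word boundaries (`\b`), the case-insensitive flag and the `PAIRS` table are additions assumed for this example.

```python
import re

# Connector between the two pronouns, taken from the expression quoted above.
CONNECTOR = r"(?:\s*/\s*|\s*&\s*|\sor\s|\sand\s|\sand\s*/?\s*or\s)"

# Pronoun pairs and their singular-they form (illustrative, not exhaustive).
PAIRS = {
    ("he", "she"): "they",
    ("him", "her"): "them",
    ("his", "her"): "their",
    ("his", "hers"): "theirs",
}

# \b keeps the pattern from matching inside words such as "breathe".
PATTERN = re.compile(
    r"\b(he|him|his)" + CONNECTOR + r"(she|hers|her)\b", re.IGNORECASE
)

def replace_pronoun_pairs(text):
    """Replace 'he or she'-style pairs with the matching singular-they form."""
    def substitute(match):
        key = (match.group(1).lower(), match.group(2).lower())
        # Combinations missing from PAIRS (e.g. "he or her") stay untouched.
        return PAIRS.get(key, match.group(0))
    return PATTERN.sub(substitute, text)
```

For example, `replace_pronoun_pairs("Ask him or her about his/her plan.")` gives `"Ask them about their plan."`. Note that this sketch always inserts the lowercase form, so a sentence-initial “He or she” would need extra handling to keep its capital letter.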

I’m going to end this already extremely long message with a quick list of some rules from my style guide that I think would be nice to have in LanguageTool by default:

  • In British English, “lift” seems to be generally preferred over “elevator”.
  • In English, consider following the guidance of the https://noheorshe.com website and use the singular they pronoun (and its forms) as a substitute for expressions like “he or she”, “him or her”, “his or her” or “his or hers”. See the example I gave in point 4.
  • In Catalan, suggest “des del” for the typo “desdel”; currently the only suggested replacements are “desgel”, “dèdal”, “Dèdal”, “descel” and “dessal”, and I think that “des del” is more likely in this case. I make this typo more often than I’m willing to admit… xD
  • In Catalan, suggest replacing the anglicism “de chill” with alternatives like “tranquil·lament”, “relaxadament”, “amb calma” or “d’estranquis”.
  • In Catalan, suggest replacing the Spanish word “temazo(s)” (when referring to a really good song) with “temacle(s)”, “temarro(s)”, “cançonás(ses)”, “cançonassa(/nasses)” or “temás(sos)”. There doesn’t seem to be much consensus on which ones are better: ésAdir prefers “temacle” or “temarro”, but I personally prefer the latter three…

I know that my use of the style guide is pretty extreme, and that I probably have far more custom rules than most people do; but here’s hoping that the style guide feature keeps improving with future updates.

Please let me know if you have questions about these suggestions.

Thank you very much for your comments, Xavier.

We have already added the suggestions that have a straightforward solution: *desdel, *temazo.

Matters such as the preference between “sortir” and “eixir” belong to a very specific style guide. We have no way of offering this as a configurable option. The rules would have to exist, along with a user option to enable them. We know they can have applications. For example, we have a sortir>eixir rule that has been used to adapt software translations. With your own server, all of this can be done, of course.

More complete user options, allowing for example the traditional diacritic rules, have been planned for a long time, but we have not been able to make progress on them.

I am forwarding the comments about English to the relevant team.

Jaume Ortolà