LanguageTool now ignores URLs and email addresses (i.e. they are not underlined as spelling mistakes). Would it be useful to do the same with hashtags (#hashtag) and user mentions (@username)? What do you think?
I agree, it sounds nice. So we have: file names, domain names (like languagetool.org), hashtags and user mentions.
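To make the list concrete, here is a rough Python sketch of how such tokens could be located so that a speller skips them. The patterns and the `ignored_spans` helper are my own assumptions for illustration, not LanguageTool's actual tokenizer or disambiguation rules.

```python
import re

# Hypothetical patterns, for illustration only.
IGNORE_PATTERNS = {
    # '#' or '@' must not follow a word character, so "user@gmail" is not a mention.
    "hashtag": re.compile(r"(?<!\w)#\w+"),
    "mention": re.compile(r"(?<!\w)@\w+"),
    # Very loose domain pattern: labels separated by dots, ending in a 2+ letter TLD.
    "domain": re.compile(r"\b[\w-]+(?:\.[\w-]+)*\.[a-z]{2,}\b", re.IGNORECASE),
}


def ignored_spans(text):
    """Return sorted (start, end, kind) tuples for spans a speller could skip."""
    spans = []
    for kind, pattern in IGNORE_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), kind))
    return sorted(spans)
```

Note that the loose domain pattern would also match "aaaa.es", so it cannot tell a real domain from a missing space between two sentences.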
I'm wondering if we will have false negatives in some languages. For example, a domain name like "aaaa.es" could be an error, although an unusual one (a missing white space plus a missing capitalization). But "es" is a common word in several languages.
A global disambiguation.xml file would be easier to maintain (and can be used for other rules, proper nouns, etc.). But it will not allow fine-tuning for each language, so it has to contain only rules fully acceptable to all languages.
I can try to implement the global file in October.
One of the things I see happening a lot is incorrect tags consisting of two parts: #hash tag or #hash-tag. (In Dutch, the - is a valid word character, but not a valid tag character for Twitter…)
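The hyphenated case at least is easy to spot mechanically. A minimal Python sketch, assuming a tag ends at the first hyphen as described above; the function name and the suggestion to join the parts are hypothetical. The "#hash tag" case with a space cannot be detected this way, because the second word looks like ordinary text.

```python
import re

# A hashtag immediately followed by one or more "-word" parts.
HYPHENATED_TAG = re.compile(r"(?<!\w)#(\w+)((?:-\w+)+)")


def find_hyphenated_hashtags(text):
    """Return (as_written, joined_suggestion) pairs for tags like '#hash-tag'."""
    results = []
    for match in HYPHENATED_TAG.finditer(text):
        joined = "#" + match.group(1) + match.group(2).replace("-", "")
        results.append((match.group(0), joined))
    return results
```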
If I run the sentence "Hallo @sprache, #sprache ist wichtig!" with language "de", it does not find any issues. But if I use "auto" it complains about "sprache" in each case. Is this intended?
With language=de you don't get spelling errors. You need language=de-DE (or auto) to get them. That is on purpose.
That means that hashtags and mentions are not ignored in German. (But they are ignored somewhat on the web page languagetool.org, aren't they? @tiff)
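For anyone who wants to reproduce the difference, here is a minimal Python sketch. It assumes the public HTTP API at api.languagetool.org with the `/v2/check` endpoint and the `text` and `language` form parameters; the helper names are made up for this post.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public endpoint; a self-hosted server would use its own URL.
API_URL = "https://api.languagetool.org/v2/check"


def build_check_request(text, language):
    """Build a POST request carrying the text and the language code."""
    data = urllib.parse.urlencode({"text": text, "language": language})
    return urllib.request.Request(API_URL, data=data.encode("utf-8"))


def check(text, language):
    """Send the request and return the list of matches from the JSON reply."""
    with urllib.request.urlopen(build_check_request(text, language)) as reply:
        return json.load(reply)["matches"]
```

Calling `check("Hallo @sprache, #sprache ist wichtig!", "de")` and then again with `"de-DE"` should show the behaviour described above: no spelling matches for the plain language code, spelling matches for the variant.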
Should we implement the global disambiguation.xml we talked about? @dnaber
Interesting that it doesn't alert me that the language "de" doesn't exist and that I should use "de-DE". But indeed, if I use "de-DE" I get the same feedback as for "auto".
Awaiting feedback on whether this will be handled via the global disambiguation.xml or whether I should handle it on my side.
That's nice! But I wonder if there could be a list of more popular addresses to correct in these cases. So, if I type user@gamil.com it wouldn't be ignored as there would be an exception for gmail; same for Google, Microsoft, Apple, and so on - and even langaugetool.org, I mean languagetool.org
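Roughly what I have in mind, as a Python sketch: compare the domain part against a short list of well-known domains and only flag near misses. The list, the 0.8 similarity cutoff and the function name are assumptions for illustration, and `difflib` from the standard library stands in for whatever similarity measure LanguageTool would really use.

```python
import difflib

# Hypothetical list of popular domains; a real one would be much longer.
POPULAR_DOMAINS = [
    "gmail.com",
    "google.com",
    "microsoft.com",
    "apple.com",
    "languagetool.org",
]


def suggest_domain(address):
    """Return a likely intended domain, or None if nothing close is known."""
    # Works for both "user@domain" and a bare "domain".
    domain = address.rsplit("@", 1)[-1].lower()
    if domain in POPULAR_DOMAINS:
        return None  # spelled correctly, nothing to suggest
    close = difflib.get_close_matches(domain, POPULAR_DOMAINS, n=1, cutoff=0.8)
    return close[0] if close else None
```

Any address whose domain is not close to an entry in the list would still be ignored as before.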
Semi-related: I guess in hashtags the main things to ignore are case and spaces being replaced with dashes or omitted entirely. But if I write #safethewales … I would still want it to tell me that it should potentially be #savethewhales.
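One way to get there would be to split the tag into dictionary words first and then check the parts. A small Python sketch with a toy vocabulary; the `segment` function and the word list are hypothetical, and a real implementation would use the language's speller dictionary.

```python
# Toy vocabulary, an assumption for this example only.
VOCABULARY = {"save", "safe", "the", "whales", "wales"}


def segment(tag, vocabulary):
    """Split a hashtag into known words, or return None if no split exists."""
    word = tag.lstrip("#").lower()
    # best[i] holds one split of word[:i] into known words, or None.
    best = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            piece = word[start:end]
            if best[start] is not None and piece in vocabulary:
                best[end] = best[start] + [piece]
                break
    return best[len(word)]
```

This also shows the limit of the idea: "#safethewales" splits into "safe", "the" and "wales", which are all valid words, so a pure spelling check on the parts would not flag it. Catching that would need something beyond spelling, such as a list of known tags.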