[en] Uncountable nouns for added.txt

@Mike_Unwalla @tiff

Hello!

I have spent two days working on a feature for Proofing Tool GUI that lets you keep a master wordlist alongside other lists and analyse them to check which entries are missing from the master.
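
The comparison described above boils down to a set difference. Here is a minimal Python sketch, assuming one entry per line in each list. The function name `missing_from_master` and the sample words are made up for illustration and are not Proofing Tool GUI's actual code:

```python
def missing_from_master(master, other):
    """Return the entries of `other` that do not appear in `master`, sorted."""
    # Strip whitespace and ignore blank lines before comparing.
    master_set = {w.strip() for w in master if w.strip()}
    other_set = {w.strip() for w in other if w.strip()}
    return sorted(other_set - master_set)


if __name__ == "__main__":
    master = ["gunpoint", "hafnium"]
    other = ["gunpoint", "semiosis", "infowar"]
    print(missing_from_master(master, other))  # ['infowar', 'semiosis']
```

The comparison here is case-sensitive; whether that is the right choice depends on how the lists are built.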

I am sending you a list of 476 uncountable nouns that are missing from the English added.txt, taken from the morphological data of my GB speller (work in progress).

Please add it to the added.txt when you have the time, but be sure to see if it matches what is planned.

Thanks!

EN_uncountable_nouns_missing_476words_20201009.zip (3.8 KB)

Hi @marcoagpinto, thanks for the data.

I do not understand:

What do you want me to do before I add the missing POS to added.txt?

Well, you should look at them to make sure they are 100% what you expected them to be.

But maybe I am being too picky on that.

:slight_smile:

Hi @marcoagpinto, if you checked the words, then I am happy to add them without doing a second check.

Did you check all the words in a good BrE dictionary such as https://www.lexico.com or https://www.ldoceonline.com ?

The uncountable nouns are basically based on Wiktionary, since it is the only dictionary that states “uncountable”.

The other dictionaries just say “Mass Noun”, but then they give examples that use the plural (like the Oxford one, for which I have a Premium account, so I can’t trust Oxford on that).

The terms ‘uncountable noun’ and ‘mass noun’ are synonyms: https://www.lexico.com/definition/uncountable_noun.

If Oxford says ‘mass noun’ but does not have an example, I think that we can safely add the noun as NN:U. @udomai, @tiff, what do you think?

I agree: mass nouns vs. count nouns is the same distinction as countable nouns vs. uncountable nouns.

A clarifying question: If we mark a word as being “uncountable”, does that mean “it can be used as a mass noun” or “it can only be used as a mass noun”?

I am asking because many nouns can be used both countably and uncountably.

  • Your picture needs more colour. (mass noun)

  • This screen can only display four colours. (count noun)

  • There was still some beer in the glass. (mass noun)

  • Can I take a Nyquil when I’ve had two beers? (count noun)

@Mike_Unwalla @tiff @udomai

Basically, you just have to open the .txt file I provided here and take a quick look to see whether you agree with the words.

All words that seem valid (most of them if not all?) should be added to the English added.txt.

@udomai,

There are 2 ways to mark a noun as uncountable:
NN:U means uncountable only.
NN:UN means countable or uncountable.

All the postags are explained in \resource\en\tagset.txt.
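
To make the distinction concrete, here is a small Python sketch that maps the two tags to their meaning. The dictionary and the function name `describe` are hypothetical. The sample words are the ones mentioned in this thread as already tagged in LT:

```python
# Meaning of the two countability tags as described above.
COUNTABILITY = {
    "NN:U": "uncountable only",
    "NN:UN": "countable or uncountable",
}


def describe(tag):
    """Return a human-readable meaning for a noun tag."""
    return COUNTABILITY.get(tag, "no countability information")


if __name__ == "__main__":
    for word, tag in [("hafnium", "NN:U"), ("infowar", "NN:UN")]:
        print(f"{word}: {describe(tag)}")
```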

@marco, I cannot agree or disagree because I do not know approximately 30% of the words.

For the words that I do know, some are already NN:U in LT. Examples: gunpoint, hafnium

@marcoagpinto, thank you very much for the data! I’ve scrolled through it. Yours is a list of exclusively uncountable nouns and I haven’t found any errors in it. I was wondering whether having several “infowars” is a possibility (maybe because of Alex Jones’… “well-known” website). And in semiotics, having several “semioses” may be possible too. So those two might be candidates for an NN:UN tag.

@Mike_Unwalla, thank you for the reminder, I am aware of the tagset. My question was basically answered by looking at the list: It’s a list of purely uncountable words.

It (infowar) is already NN:UN in LT.

So, before we add the postags to added.txt, we must make sure that each noun does not already have a tag NN:U or NN:UN.
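
That check can be sketched in Python roughly as follows. This is only an illustration: it assumes the existing entries are available as tab-separated `form<TAB>lemma<TAB>tag` lines with `#` comments, and the function names are hypothetical:

```python
UNCOUNTABLE_TAGS = {"NN:U", "NN:UN"}


def already_tagged(lines):
    """Collect word forms that already carry NN:U or NN:UN."""
    tagged = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        # Assumed layout: form, lemma, tag
        if len(parts) == 3 and parts[2] in UNCOUNTABLE_TAGS:
            tagged.add(parts[0])
    return tagged


def nouns_to_add(candidates, existing_lines):
    """Keep only candidates without an NN:U/NN:UN tag, preserving order."""
    tagged = already_tagged(existing_lines)
    return [word for word in candidates if word not in tagged]


if __name__ == "__main__":
    existing = ["gunpoint\tgunpoint\tNN:U", "infowar\tinfowar\tNN:UN"]
    print(nouns_to_add(["gunpoint", "semiosis", "infowar"], existing))  # ['semiosis']
```

Words that already have NN:U or NN:UN are dropped, so only nouns that are really missing get added.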

I will do that check this afternoon and then add the missing nouns to added.txt.

@marcoagpinto, done: https://github.com/languagetool-org/languagetool/commit/cc21962475d99788c2ef05dede864acf210570a6

@Mike_Unwalla

Thanks!

In a few weeks or months we need to start talking about other kinds of words, such as adjectives and adverbs.

Right now, as I work on the British speller, the POS information I have been adding to it is basically uncountable nouns.

But I also want to add other kinds of information so that I can export it for use in added.txt.