
Taxonomic words

Hello @tiff @Mike_Unwalla @dnaber

I am about to create a list of taxonomic words to add to spelling.txt and added.txt.

I phoned the professor yesterday, and he told me that they are capitalized and gender-neutral. I believe they are proper names.

What is the POS tag for English?

I will create the lists for PT and EN.

Thanks!

Hi @marcoagpinto, I don’t know about POS and capitalization for words in a taxonomy.

I found the ‘International Code of Nomenclature of Bacteria: Bacteriological Code, 1990 Revision’ (https://www.ncbi.nlm.nih.gov/books/NBK8808/).

Rule 7: “The name of a taxon above the rank of genus up to and including order is a substantive or an adjective used as a substantive of Latin or Greek origin, or a latinized word. It is in the feminine gender, the plural number, and written with an initial capital letter.”

So, plural noun (NNS) and adjective (JJ), not proper noun. BUT, until this morning, I thought that they were proper nouns.

Rule 10a: “The name of a genus or subgenus is a substantive, or an adjective used as a substantive, in the singular number and written with an initial capital letter.”

So, noun and adjective (JJ), not proper noun. But the rule does not tell me whether the noun is countable (NN) or non-count (NN:U).

I don’t know whether there is a single rule for all taxonomies.

I suppose that a quick win is to treat all the terms as proper nouns, but that is a bit of a hack.

And what is the POS tag for a proper noun that covers both genders?

There is no gender for proper nouns. The POS tag is NNP for singular (such as ‘Marco’) and NNPS for plural (such as ‘Englishmen’).

For future reference, you can see all the POS in \org\languagetool\resource\en\tagset.txt.
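To make the options above concrete, here is a minimal Python sketch of how the tag could be chosen from the rank, following Rule 7 (plural) and Rule 10a (singular). The function names, the rank sets, and the tab-separated "form, lemma, tag" line layout are my assumptions for illustration only, so please check them against the real added.txt and tagset.txt before relying on them.

```python
# Illustrative subset of ranks; not an exhaustive list.
PLURAL_RANKS = {"order", "suborder", "family", "subfamily", "tribe", "subtribe"}  # Rule 7
SINGULAR_RANKS = {"genus", "subgenus"}  # Rule 10a


def pos_tag_for(rank: str, as_proper_noun: bool = True) -> str:
    """Return a tag for a taxon name of the given rank.

    as_proper_noun=True is the "quick win" (NNP/NNPS);
    False follows the reading of the rules as common nouns (NN/NNS).
    """
    rank = rank.lower()
    if rank in PLURAL_RANKS:
        return "NNPS" if as_proper_noun else "NNS"
    if rank in SINGULAR_RANKS:
        return "NNP" if as_proper_noun else "NN"
    raise ValueError(f"rank not covered by Rule 7 or Rule 10a: {rank}")


def tagged_line(word: str, rank: str, as_proper_noun: bool = True) -> str:
    # Assumed layout: form, lemma, tag separated by tabs.
    # The lemma is simply the word itself in this sketch.
    return "\t".join((word, word, pos_tag_for(rank, as_proper_noun)))


if __name__ == "__main__":
    print(tagged_line("Escherichia", "genus"))
    print(tagged_line("Enterobacteriaceae", "family"))
```

Switching as_proper_noun to False gives the NN/NNS variant, so both approaches can be compared on the same word list.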

@Mike_Unwalla

Find attached the spelling.txt and added.txt files of taxonomic words.

Could you organise the core files like I did for Portuguese, and create groups?

For example, in my spelling.txt and added.txt, I have a section for computer terms, another for proper names, etc.

This way I can help in the future, for example by adding words to the files myself and sorting them with my Proofing Tool GUI.
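To show what I mean by groups, here is a rough Python sketch of sorting the words inside each group while leaving the group headers where they are. It assumes each group starts with a comment line beginning with "#" and that blank lines separate groups; sort_groups is a made-up name for this example, not the actual code of Proofing Tool GUI.

```python
def sort_groups(lines):
    """Sort and de-duplicate the words inside each group.

    Lines starting with '#' and blank lines are treated as group
    boundaries (an assumption) and are kept in their original position.
    """
    out, group = [], []

    def flush():
        # Case-insensitive order, with the exact spelling as a tie-breaker.
        out.extend(sorted(set(group), key=lambda w: (w.casefold(), w)))
        group.clear()

    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            flush()
            out.append(stripped)
        else:
            group.append(stripped)
    flush()
    return out


if __name__ == "__main__":
    sample = [
        "# Taxonomic words",
        "Escherichia",
        "Bacillus",
        "",
        "# Computer terms",
        "Unix",
        "Linux",
    ]
    print("\n".join(sort_groups(sample)))
```

With this layout, a new word only has to be appended under the right header and the file can be re-sorted afterwards.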

Thanks!

taxonomic_words_EN.zip (1.8 KB)

@marcoagpinto, no, sorry. I have a 3-month backlog of work, so I am not going to take on more. Possibly one of the other maintainers (@tiff?) can do the work.

@tiff

Can you do it?

Thanks!

I’ll take care of it tomorrow.

Done as part of https://github.com/languagetool-org/languagetool/commit/1df8ed30468bf7c6a3415b5f63b3675b10ac2d5a

I skipped the POS tagging. Most words were already tagged.

Thank you, @tiff

The words of the pseudo-Latin taxonomy are the same for all languages, are they not? Should they not be in the general list, then?

@Ruud_Baars

Good thinking, you are right.