
Matching tags (different issue)

(Ruud Baars) #1

Sometimes one spelling covers multiple meanings.
'scheppen' can mean 'to create' as well as 'to dig'. In Dutch, the two verbs have different inflected forms:
scheppen, schiep, geschapen (create)
scheppen, schepte, geschept (dig)

To refer to just one of these verbs, the spelling of the root and its word type are not enough; something extra is needed to identify the intended meaning. In dictionaries, this is often just a number.

Technically, it would be possible to add it to the POS tag, but that would have quite some impact too. Would it not be more 'pure' to have it as an extra field?

(Daniel Naber) #2

For some languages, the POS tags already look like this: SUB:AKK:SIN:NEU - it's basically possible to extend them by adding a colon and more information. Morfologik (the library we use for fast POS tag lookup) has only two fields anyway, the word and its POS tags, so extending this list seems to be a valid approach to me.
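As a rough illustration of extending a colon-separated tag, here is a minimal Python sketch. The tag names (`VERB:INF` and so on), the trailing numeric "meaning number", and the toy `LEXICON` are all assumptions made up for this example; they are not an existing LanguageTool or Morfologik convention.

```python
# Hypothetical layout: colon-separated POS fields, optionally followed by a
# numeric "meaning number" as the last field, e.g. "VERB:INF:2".

def split_sense(postag):
    """Return (tag_without_sense, sense_number or None)."""
    head, sep, last = postag.rpartition(":")
    if sep and last.isdigit():
        return head, int(last)
    # No trailing number: the tag is left untouched.
    return postag, None

# Toy word -> POS tags mapping (two fields only: the word and its tags).
LEXICON = {
    "scheppen": ["VERB:INF:1", "VERB:INF:2"],
    "geschapen": ["VERB:PTC:1"],  # create
    "geschept": ["VERB:PTC:2"],   # dig
}

def senses(word):
    """Collect the distinct meaning numbers found in a word's tags."""
    found = {split_sense(tag)[1] for tag in LEXICON.get(word, [])}
    found.discard(None)
    return sorted(found)
```

With this layout, `senses("scheppen")` gives both meaning numbers while `senses("geschapen")` gives only one, and a tag without a trailing number, such as `SUB:AKK:SIN:NEU`, passes through `split_sense` unchanged.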

(Ruud Baars) #3

Okay. I guess I could add a 'meaning number' to the list.

(Ruud Baars) #4

I am thinking about moving to this system as well, but first I will edit the existing rules until each rule gives no false positives in 100 hits.