SafeTex
(SafeTex)
November 17, 2012, 4:08pm
1
Hello
I want to write a rule for finding compound nouns with hyphens, like ‘flower-pot’.
So far I have this rule, which finds ‘flower pot’, and it works:
<category name="Hyphen check">
<rule id="HYPEN_OR_NOT" name="Hyphen or not">
<pattern>
<token postag="NN|NNS|NN:U|NN:UN" postag_regexp="yes"></token>
<token postag="NN|NNS|NN:U|NN:UN" postag_regexp="yes"></token>
</pattern>
<message>Found</message>
</rule>
</category>
But I don’t know how to add a segment for the hyphen. I’ve tried all sorts of things.
Can anyone help me expand this rule to find ‘flower-pot’?
Thanks
dnaber
(Daniel Naber)
November 17, 2012, 4:31pm
2
On Sa 17.11.2012, 08:08:36 you wrote:
I want to write a rule for finding compound nouns with hyphens like
‘flower-pot’
You can use this to find words with hyphens:
.+-.+
It’s not possible to check the POS tags of the first and second part though
(without programming at least), as this is considered a single word by LT.
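As a rough sketch, that regular expression could be placed in a rule like the one below. The rule id, name, and message are placeholders invented for illustration; the `regexp="yes"` attribute tells LT to treat the token text as a regular expression. Because ‘flower-pot’ is a single token for LT, one `<token>` element is enough, and no `postag` attribute is used since the two halves cannot be tagged separately.

```xml
<category name="Hyphen check">
  <!-- Hypothetical id and name, chosen only for this example -->
  <rule id="HYPHENATED_WORD" name="Hyphenated word">
    <pattern>
      <!-- One token: at least one character, a hyphen, at least one character -->
      <token regexp="yes">.+-.+</token>
    </pattern>
    <message>Found a hyphenated word</message>
  </rule>
</category>
```

Note that this matches any hyphenated token, such as ‘well-known’, not only noun + noun compounds.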
Regards
Daniel
–
http://www.danielnaber.de
SafeTex
(SafeTex)
November 17, 2012, 8:55pm
3
Hello
Can I ask one more question tonight?
Going back to the first rule (the one whose message is ‘Found’):
I noticed it found things like ‘take shelter’ and ‘make money’.
Normally, ‘take’ and ‘make’ are verbs, although they can be nouns, e.g. ‘to be on the take’.
But it also finds ‘in house’, and there is practically no way ‘in’ can be a noun.
So why is it finding all these combinations which are not really noun + noun?
Thanks
dnaber
(Daniel Naber)
November 18, 2012, 11:22am
4
On Sa 17.11.2012, 12:55:55 you wrote:
So why is it finding all these combinations which are not really noun +
noun?
Because guessing the word’s part-of-speech is error-prone, so we return all
readings. A disambiguator can sometimes be used to remove the invalid
readings (see “Developing a Disambiguator” on the LanguageTool Wiki).
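To give an idea of what such a disambiguation rule might look like, here is a minimal sketch for the ‘in house’ case. It is an assumption-laden example rather than a rule taken from LT: the id and name are made up, `IN` is assumed to be the preposition tag in the English tagset, and the exact `<disambig>` syntax should be checked against the wiki page for the LT version in use. The idea is that when ‘in’ is directly followed by a noun, only its preposition reading is kept.

```xml
<!-- Hypothetical entry for the disambiguation rule file, not the grammar rule file -->
<rule id="IN_BEFORE_NOUN" name="'in' before a noun is a preposition">
  <pattern>
    <!-- The marker selects the token whose readings are changed -->
    <marker>
      <token>in</token>
    </marker>
    <token postag="NN|NNS|NN:U|NN:UN" postag_regexp="yes"/>
  </pattern>
  <!-- Assumed syntax: keep only the readings of 'in' that match the IN tag -->
  <disambig action="filter" postag="IN"/>
</rule>
```

With the noun reading of ‘in’ removed, the noun + noun rule would no longer match ‘in house’. Pairs such as ‘take shelter’ are harder, because ‘take’ really can be a noun, so some false matches are likely to remain.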
Regards
Daniel
–
http://www.danielnaber.de