nancyntse13
(Praneet Khandelwal)
May 31, 2017, 5:54am
1
I need to check the list of uncountable nouns that is being used. I can see the list in the GitHub source, but not in the standalone version I downloaded to run on my PC. Thanks.
Yakov
(Yakov)
May 31, 2017, 6:34am
2
The list of uncountable nouns is packed into the main POS tag dictionary (english.dict).
nancyntse13
(Praneet Khandelwal)
May 31, 2017, 6:36am
3
Thanks, but can it be viewed, since it is encoded?
Yakov
(Yakov)
May 31, 2017, 7:21am
4
The POS tag dictionary is made with an awk script, which uses uncountable.txt:
#the script annotates uncountable nouns
BEGIN {FS="\t";
  glosfile="2of12inf.txt"; #Kevin's file
  while ((getline < glosfile) > 0){
    if ($1~/%/) {gsub(/%/,"");
      tabela[$1]="uncount"
    }
  }
  english_file="english.txt"; #created temporary file
  while ((getline < english_file) > 0){
    if (tabela[$1]=="uncount")
      lemma[$2]="uncount"
    if ($3=="VBG")
      gerund[$1]="uncount"
  }
  uncountables="uncountable.txt" #uncountable nouns
  while ((getline < uncountables) > 0)
    if ($0!~/^#/ && $0!="") {
      if ($0~/ /) {
        print "Entry " $0 " contains a space. Exiting."; exit(1)
This file has been truncated.
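For anyone who does not read awk, here is a rough Python sketch of only the visible part of the last loop: it reads entries in the style of uncountable.txt, skips comment lines starting with # and empty lines, and stops on any entry that contains a space. The function name read_uncountables and the sample lines are made up for illustration. What the real script does with the entries afterwards is in the truncated part, so it is not shown here.

```python
def read_uncountables(lines):
    """Collect entries, skipping '#' comment lines and empty lines.

    Mirrors only the checks visible in the awk script; the function name
    and the returned list are illustrative assumptions.
    """
    entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "" or line.startswith("#"):
            continue  # same filter as: $0!~/^#/ && $0!=""
        if " " in line:
            # the awk script prints a message and exits with status 1 here
            raise ValueError(f"Entry {line} contains a space.")
        entries.append(line)
    return entries


# Made-up sample lines, not the real contents of uncountable.txt
sample = ["# uncountable nouns\n", "advice\n", "\n", "furniture\n"]
print(read_uncountables(sample))  # prints ['advice', 'furniture']
```

To run it on the real file, pass an open file object instead of the sample list.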
You can also dump (export) the POS tag dictionary to a text file:
http://wiki.languagetool.org/developing-a-tagger-dictionary
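Once the dictionary is dumped to text, the uncountable nouns could be pulled out with a few lines of Python. This is only a sketch under two assumptions that should be checked against your own dump: that each line is tab-separated as word form, lemma, POS tag (the same column order the awk script reads from english.txt), and that uncountable nouns carry the tags NN:U or NN:UN. The function name and sample lines are made up for illustration.

```python
def uncountable_lemmas(lines, tags=("NN:U", "NN:UN")):
    """Return the sorted lemmas whose POS tag is in `tags`.

    Assumes tab-separated lines: word form, lemma, POS tag.
    The default tags are an assumption about the English tagset.
    """
    found = set()
    for raw in lines:
        parts = raw.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip lines that do not have all three columns
        lemma, tag = parts[1], parts[2]
        if tag in tags:
            found.add(lemma)
    return sorted(found)


# Made-up sample lines in the assumed dump format
sample = [
    "advice\tadvice\tNN:U\n",
    "cats\tcat\tNNS\n",
    "running\trun\tVBG\n",
    "wax\twax\tNN:UN\n",
]
print(uncountable_lemmas(sample))  # prints ['advice', 'wax']
```

For a real dump, open the exported text file with encoding="utf-8" and pass the file object to uncountable_lemmas in place of the sample list.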