confusion pair correction


(Tushar Lakhera) #1

I really like the concept of confusion pair correction, and I have already started working on the project.
I am facing an issue: text files like BNC_corpus.txt, targets.txt, eval_data, etc. are used frequently, and I am not able to locate them. Any help will be highly appreciated.


(Daniel Naber) #2

What do you mean by “are used frequently”? Are these files referenced from anywhere?


(Tushar Lakhera) #3

Continuing the discussion from confusion pair correction:

length = []
with open("text_files/BNC_corpus.txt") as f:
	for line in f.readlines() :
		length.append(len(line.split()))

In the above example, I am not able to find BNC_corpus.txt.
Similarly, I cannot find many of the other text files I already mentioned.
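
For anyone who wants to run this on their own data while the original corpus is unavailable, here is a minimal sketch of what the snippet appears to do: count the whitespace-separated tokens on each line, presumably to average them afterwards. It assumes a plain-text UTF-8 file with one sentence per line, which is a guess about the format of BNC_corpus.txt. The names `line_lengths` and `average_length` are made up for illustration and do not come from the repo.

```python
from pathlib import Path


def line_lengths(path):
    """Return the number of whitespace-separated tokens on each line."""
    path = Path(path)
    if not path.is_file():
        # The corpus may not be shipped with the code, so fail with a clear message.
        raise FileNotFoundError(f"corpus file not found: {path}")
    with path.open(encoding="utf-8") as f:
        # Iterating over the file object avoids loading every line at once.
        return [len(line.split()) for line in f]


def average_length(lengths):
    """Mean number of tokens per line; 0.0 if there are no lines."""
    return sum(lengths) / len(lengths) if lengths else 0.0
```

Calling `average_length(line_lengths("text_files/BNC_corpus.txt"))` would then either give the mean line length or fail with a message that names the missing file.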


(Daniel Naber) #4

Where did you take that example code from?


(Tushar Lakhera) #5

I was going through the code on GitHub… drexjojo/confusion_pair_correction
Check avglength.py.


(Daniel Naber) #6

I suggest you open an issue in that repo asking for the files. Or maybe @jaumeortola or @Yakov remember where the files are.


(Tushar Lakhera) #7

I have opened two issues on that repo…including the text files.
@dnaber, did you check the repo?