
confusion pair correction

I really like the concept of confusion pair correction and have already started working on the project.
I am facing an issue: text files such as BNC_corpus.txt, targets.txt, and eval_data are used frequently in the code, but I cannot locate them. Any help would be highly appreciated.

What do you mean by “are used frequently”? Are these files referenced from anywhere?


length = []
with open("text_files/BNC_corpus.txt") as f:
    for line in f:
        length.append(len(line.split()))

In the example above, I cannot find BNC_corpus.txt.
Similarly, I cannot find the other text files I mentioned.

Where did you take that example code from?

I was going through the code on GitHub, in drexjojo/confusion_pair_correction.
See avglength.py.

I suggest you open an issue in that repo asking for the files. Or maybe @jaumeortola or @Yakov remember where the files are.

I have opened two issues on that repo, including one about the text files.
@dnaber, did you check the repo?