Hello,
@gsider and I are thinking of creating a SpecificCaseRule for Greek. To find the relevant expressions, we thought of using a Wikipedia corpus. Is there an existing script that reads the corpus file (.xml), or should we write one from scratch? Any other ideas for solving this are welcome.
There’s org.languagetool.dev.dumpcheck.WikipediaSentenceExtractor to get sentences from the XML, and org.languagetool.dev.dumpcheck.SentenceSourceChecker to check the XML against LT.
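If you just want to look at the raw article text before running the LT tools, here is a minimal sketch of streaming a dump with the JDK's StAX parser. It is not how the dumpcheck classes work internally. The class name DumpTextReader and the method extractTexts are made up for illustration, and the sketch assumes the usual MediaWiki export layout in which each page's content sits in a <text> element.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of LanguageTool.
class DumpTextReader {

    // Collects the content of every <text> element in the stream.
    // Assumption: <text> holds plain character data (raw wikitext), as in MediaWiki dumps.
    static List<String> extractTexts(InputStream in) {
        List<String> texts = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            while (reader.hasNext()) {
                int event = reader.next();
                // getLocalName() ignores the namespace that real dumps declare on the root element.
                if (event == XMLStreamConstants.START_ELEMENT && "text".equals(reader.getLocalName())) {
                    texts.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalStateException("Could not parse dump", e);
        }
        return texts;
    }
}
```

To try it, pass a FileInputStream for the dump file to DumpTextReader.extractTexts. For a full Wikipedia dump you would process each text as it is read instead of collecting everything in a list. The result is still raw wikitext (templates, links and so on), so it needs cleaning before it is useful for finding expressions.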
Great, we’re on it!