@gsider and I are thinking of creating the SpecificCaseRule for Greek. To find the specific expressions, we thought of using a Wikipedia corpus. Is there an existing script for reading the corpus file (.xml), or should we write one from scratch? Any other ideas for solving this are welcome.
You can use org.languagetool.dev.dumpcheck.WikipediaSentenceExtractor to get sentences from the XML, and
org.languagetool.dev.dumpcheck.SentenceSourceChecker to check the XML against LT.
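Once the sentences are extracted, one possible way to collect candidate expressions is to count runs of consecutive capitalized words. The sketch below is only an illustration and is not part of LanguageTool: the class name CapitalizedPhraseCounter and its methods are hypothetical, it uses only the JDK, and it assumes the sentences are already available as plain strings. The resulting list would still need manual review before anything goes into the rule.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a LanguageTool class.
class CapitalizedPhraseCounter {

    // Counts runs of two or more consecutive capitalized words.
    // The first word of each sentence is skipped because it is
    // capitalized regardless of whether it belongs to a name.
    static Map<String, Integer> countCandidates(List<String> sentences) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String sentence : sentences) {
            String[] tokens = sentence.trim().split("\\s+");
            List<String> run = new ArrayList<>();
            for (int i = 1; i < tokens.length; i++) {
                String token = tokens[i];
                // Strip surrounding punctuation; \p{L} also matches Greek letters.
                String word = token.replaceAll("^[^\\p{L}]+|[^\\p{L}]+$", "");
                if (!word.isEmpty() && Character.isUpperCase(word.codePointAt(0))) {
                    run.add(word);
                    // Trailing punctuation (comma, period, ...) ends the run.
                    if (!Character.isLetter(token.codePointBefore(token.length()))) {
                        flush(run, counts);
                    }
                } else {
                    flush(run, counts);
                }
            }
            flush(run, counts);
        }
        return counts;
    }

    // Records the current run if it has at least two words, then resets it.
    private static void flush(List<String> run, Map<String, Integer> counts) {
        if (run.size() >= 2) {
            counts.merge(String.join(" ", run), 1, Integer::sum);
        }
        run.clear();
    }

    public static void main(String[] args) {
        List<String> sentences = List.of(
            "Yesterday I saw Harry Potter at the cinema.",
            "My sister thinks Harry Potter is overrated.");
        countCandidates(sentences)
            .forEach((phrase, n) -> System.out.println(n + "\t" + phrase));
    }
}
```

Sorting the counts by frequency and keeping only expressions above some threshold would probably cut down the noise, but the right cutoff depends on the corpus.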
Great, we’re on it!