The changes I’m making remove the assumptions behind the current LT implementation. Most of the rules (in my changed version) now work from data that is passed to them rather than from hard-coded file names. Nothing assumes that resources will be found on the classpath; resource lookup is deferred to a resource broker, which the current code does only partially. I’ve also modified things so that a language can be loaded at runtime: the files and data it needs don’t have to be on the classpath.
I’ve also decoupled the rules from the language and from JLanguageTool, so I can create and test a rule by itself.
An example (in my local codebase) would be:
// A default broker is used that knows how to look up resources from the classpath. The user can set their own broker.
AmericanEnglish en = new AmericanEnglish();
JLanguageTool lt = new JLanguageTool(en);
MorfologikAmericanSpellerRule rule = en.createMorfologikSpellerRule(messages, userConfig);
lt.check(rule, "This is my test sentence.");
The constructor for MorfologikSpellerRule is now:
public MorfologikSpellerRule(ResourceBundle messages, Language language, UserConfig userConfig, Set dictionaries, List ignoreWords, List prohibitedWords);
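To illustrate why passing the data in through the constructor makes a rule testable on its own, here is a deliberately tiny sketch. It is not the Morfologik rule and not code from LT: `WordListRuleSketch` and its `check` method are made-up names, and the element types (`Set<String>`) are an assumption for the example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a speller rule: every piece of data it needs
// arrives through the constructor, so nothing is read from the classpath.
final class WordListRuleSketch {
    private final Set<String> dictionary;
    private final Set<String> ignoreWords;
    private final Set<String> prohibitedWords;

    WordListRuleSketch(Set<String> dictionary,
                       Set<String> ignoreWords,
                       Set<String> prohibitedWords) {
        this.dictionary = dictionary;
        this.ignoreWords = ignoreWords;
        this.prohibitedWords = prohibitedWords;
    }

    // Returns the tokens that are either prohibited or unknown
    // (in neither the dictionary nor the ignore list).
    List<String> check(String sentence) {
        List<String> flagged = new ArrayList<>();
        // Split on anything that is not a letter or an apostrophe.
        for (String token : sentence.split("[^\\p{L}']+")) {
            if (token.isEmpty()) {
                continue;
            }
            String word = token.toLowerCase(Locale.ROOT);
            boolean known = dictionary.contains(word) || ignoreWords.contains(word);
            if (prohibitedWords.contains(word) || !known) {
                flagged.add(token);
            }
        }
        return flagged;
    }
}
```

A test can then build the rule from three in-memory sets and call `check` directly, with no language object, no JLanguageTool instance and no files involved.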
The language knows how to create its own rules (as it does now), but you can pick and choose the one(s) you are interested in. Resources are requested from a ResourceDataBroker, which knows how to look up the dictionaries, the ignore words and the prohibited words. Where the broker gets the actual data from is its own business; my default broker uses Path objects but can resolve those paths against the classpath if required.
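As a rough illustration of the broker idea (a sketch only, under assumptions): the interface below is a made-up, one-method simplification, not the real ResourceDataBroker API, and `SketchResourceBroker` and `PathFirstBroker` are hypothetical names. It looks for a resource under a base directory first and only falls back to the classpath if the file is not there.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical, minimal broker contract: callers ask for a resource by name
// and do not care where the bytes come from.
interface SketchResourceBroker {
    InputStream open(String resourceName) throws IOException;
}

// Tries the file system first, then the classpath.
final class PathFirstBroker implements SketchResourceBroker {
    private final Path baseDir;
    private final ClassLoader fallback;

    PathFirstBroker(Path baseDir, ClassLoader fallback) {
        this.baseDir = baseDir;
        this.fallback = fallback;
    }

    @Override
    public InputStream open(String resourceName) throws IOException {
        // Treat the name as relative, using '/' as the separator.
        String name = resourceName.replace('\\', '/');
        while (name.startsWith("/")) {
            name = name.substring(1);
        }

        // 1. Resolve against the base directory.
        Path candidate = baseDir.resolve(name).normalize();
        if (Files.isRegularFile(candidate)) {
            return Files.newInputStream(candidate);
        }

        // 2. Fall back to the classpath.
        InputStream in = fallback.getResourceAsStream(name);
        if (in == null) {
            throw new IOException("Resource not found: " + resourceName);
        }
        return in;
    }
}
```

With something like this, a language's data can sit in any directory chosen at runtime, and a user who prefers the classpath layout still gets it through the fallback.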
Obviously my changes run a lot deeper than this (for instance, the assumption that RuleFilter constructors take no arguments no longer applies). I’ve also removed things like the short code for languages and replaced it with a Locale, which simplifies a lot of things. I’ve tried as much as possible to keep compatibility with the existing codebase, but a lot has changed.
I’m happy for the changes to be merged into the current LT implementation, but I think the volume of changes may be too much for some people. Despite the increased flexibility, a lot of people won’t need it because LT already works for them. For my use case, though, I really don’t want to mess with the classpath, and I don’t want the overhead of every language the end user may ever want to use. So I’ll leave that decision up to you guys. I’m also happy to run a parallel project where I merge changes from LT into my local project. I don’t want to fragment your user base, but I really want to use LT and I can’t in its current form, hence my changes.
I’ve currently got the core and English modules working (and all tests pass). The rest of the languages will be easy to modify, since I now have all the components necessary to make it work.