Use a modern framework for embedded HTTP server discussion

I think it’s convenient to read the input from a file or a set of files: that’s the easiest way to control the input size and other parameters.

BTW, didn’t you plan to replace Maven with Gradle? The latter is more human-readable, etc.

Yes, but I got stuck because of the complexity (LibreOffice add-on, stand-alone, command-line, …). If you want to work on that, it might be a nice part of a GSoC project.

I see the maven-site-plugin in the plugins list. Is it used anywhere in the project? I can’t find any reference to it in the utility .sh scripts or in the documentation.

Where exactly do you see it? I don’t see it, I don’t think it’s being used.

image

Maybe this is just how IntelliJ IDEA displays every Maven project. I’m more familiar with Gradle than with Maven.

Anyway my question is solved :slight_smile:

Got a couple of exams this week, so was AFK :\

Finally, I’ve compared the SparkJava and SpringBoot implementations, both running with the same default LT settings: a thread pool of min 25 / max 200 threads and an unlimited request queue.

The test data was blog posts by 10-15-year-old bloggers taken from that corpus, one post per request.

The load test ran for 15 minutes per thread. The number of threads (users) grew from 1 to 100 during the first 10 minutes. The number of requests per user was not limited: each user kept sending requests until the test time expired.
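To make the load profile above concrete, here is a minimal sketch of the ramp-up in Java. It assumes the ramp is linear and that the whole test lasts 15 minutes; the class and method names are made up for illustration and are not part of the actual test setup.

```java
class RampUpSchedule {
    // Values taken from the test description; the linear shape is an assumption.
    static final int MAX_USERS = 100;
    static final int RAMP_UP_SECONDS = 10 * 60;
    static final int TOTAL_SECONDS = 15 * 60;

    /** Number of simulated users active at the given elapsed time. */
    static int activeUsers(int elapsedSeconds) {
        if (elapsedSeconds < 0 || elapsedSeconds >= TOTAL_SECONDS) {
            return 0; // test not started yet or already finished
        }
        if (elapsedSeconds >= RAMP_UP_SECONDS) {
            return MAX_USERS; // plateau after the ramp-up
        }
        // Linear interpolation from 1 user at t=0 to MAX_USERS at the end of the ramp.
        return 1 + (MAX_USERS - 1) * elapsedSeconds / RAMP_UP_SECONDS;
    }

    public static void main(String[] args) {
        for (int minute = 0; minute <= 15; minute += 5) {
            System.out.println(minute + " min: " + activeUsers(minute * 60) + " users");
        }
    }
}
```

Under these assumptions half of the users (50) are active at the 5-minute mark, and all 100 are active during the last 5 minutes.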

When an OOM error occurred because the queue overflowed, a BAD_REQUEST response was sent.

During the test both frameworks used the same amount of memory (it looks like the default Xmx512m), and the CPU load was ~99%. The test machine was an Amazon t2.micro running RHEL 7.

Results

Framework       Errors    Throughput
SparkJava       23.14 %   18.6 req/min
SpringBoot      14.45 %   32.3 req/min
None (LT 4.0)   36.34 %   25.4 req/min

So the SpringBoot implementation is more efficient. I think the next step is to compare Spring Boot with Spring WebFlux (the reactive Spring stack).
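The two result columns can be combined into a single number for easier comparison. The sketch below is only an illustration: it assumes the throughput figure counts all requests, including the failed ones, which the report does not state explicitly. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LoadTestSummary {
    /** Requests per minute that did not end in an error (assumes throughput includes failures). */
    static double successfulPerMinute(double throughputPerMinute, double errorPercent) {
        return throughputPerMinute * (1.0 - errorPercent / 100.0);
    }

    /** Relative change of 'candidate' versus 'baseline', in percent. */
    static double relativeChangePercent(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // {errors in %, throughput in req/min}, copied from the results in this thread.
        Map<String, double[]> results = new LinkedHashMap<>();
        results.put("SparkJava", new double[] {23.14, 18.6});
        results.put("SpringBoot", new double[] {14.45, 32.3});
        results.put("None (LT 4.0)", new double[] {36.34, 25.4});

        for (Map.Entry<String, double[]> entry : results.entrySet()) {
            double errors = entry.getValue()[0];
            double throughput = entry.getValue()[1];
            System.out.printf("%-14s %.2f successful req/min%n",
                    entry.getKey(), successfulPerMinute(throughput, errors));
        }
        System.out.printf("SpringBoot vs SparkJava raw throughput: %+.1f %%%n",
                relativeChangePercent(18.6, 32.3));
    }
}
```

Read this way, SpringBoot handles roughly 27.6 successful requests per minute, against roughly 14.3 for SparkJava and 16.2 for the current LT 4.0 server, and its raw throughput is about 74 % higher than SparkJava’s.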

Implementation info

Below is the difference between the SparkJava and SpringBoot implementations. The logic was moved to the framework-independent LanguageToolApiService, and the main() entry points are almost the same. With Spring I injected the service as a singleton, but that’s the only difference.

Both implementations look high-level enough to be easy to read and maintain.
Links to the complete versions: SparkJava, SpringBoot. They were written as fast as possible, so some hardcoding and hacks were needed to make these implementations provide the default LT server experience. The LT server works fine, but I don’t guarantee that the other server-dependent parts are OK; for example, I’ve commented out most of the server-related things in the GUI.

SpringBoot

@Controller
public interface LanguageToolApiController {
    @GetMapping(path = "/languages")
    ResponseEntity<List<LanguageDTO>> languages();

    @PostMapping(path = "/check", consumes = MediaType.APPLICATION_FORM_URLENCODED_VALUE)
    ResponseEntity<CheckResultDTO> check(
            @RequestParam("text") String text,
            @RequestParam("language") String language,
            @RequestParam("motherTongue") String motherTongue,
            @RequestParam("preferredVariants") String preferredVariants,
            @RequestParam("enabledRules") String enabledRules,
            @RequestParam("disabledRules") String disabledRules,
            @RequestParam("enabledCategories") String enabledCategories,
            @RequestParam("disabledCategories") String disabledCategories,
            @RequestParam("enabledOnly") boolean enabledOnly
    );
}

@Slf4j
@Controller
public class LanguageToolApiControllerImpl implements LanguageToolApiController {

    private final LanguageToolApiService languageToolApiService;

    @Autowired
    public LanguageToolApiControllerImpl(LanguageToolApiService languageToolApiService) {
        this.languageToolApiService = languageToolApiService;
    }

    @Override
    public ResponseEntity<List<LanguageDTO>> languages() {
        log.info("GET /languages request");
        ResponseEntity<List<LanguageDTO>> response;
        try {
            List<LanguageDTO> languages = languageToolApiService.languages();
            response = new ResponseEntity<>(languages, HttpStatus.OK);
        } catch (Error e) {
            log.error("Error!", e);
            response = new ResponseEntity<>(HttpStatus.BAD_REQUEST);
        }
        log.info("GET /languages response: '{}'", response);
        return response;
    }

    @Override
    public ResponseEntity<CheckResultDTO> check(String text, String language, String motherTongue, String preferredVariants,
                                                String enabledRules, String disabledRules, String enabledCategories,
                                                String disabledCategories, boolean enabledOnly) {
        log.info("POST /check request: " +
                        "text='{}', " +
                        "language='{}', " +
                        "motherTongue='{}', " +
                        "preferredVariants='{}', " +
                        "enabledRules='{}', " +
                        "disabledRules='{}', " +
                        "enabledCategories='{}', " +
                        "disabledCategories='{}', " +
                        "enabledOnly='{}'",
                text,
                language,
                motherTongue,
                preferredVariants,
                enabledRules,
                disabledRules,
                enabledCategories,
                disabledCategories,
                enabledOnly
        );
        ResponseEntity<CheckResultDTO> response;
        try {
            CheckResultDTO checkResultDTO = languageToolApiService.check(text, language, motherTongue, preferredVariants, enabledRules,
                    disabledRules, enabledCategories, disabledCategories, enabledOnly);

            response = new ResponseEntity<>(checkResultDTO, HttpStatus.OK);
        }
        catch (Exception e) {
            log.error("Error!", e);
            response = new ResponseEntity<>(HttpStatus.BAD_REQUEST);
        }

        log.info("POST /check response: " +
                        "text='{}', " +
                        "language='{}', " +
                        "motherTongue='{}', " +
                        "preferredVariants='{}', " +
                        "enabledRules='{}', " +
                        "disabledRules='{}', " +
                        "enabledCategories='{}', " +
                        "disabledCategories='{}', " +
                        "enabledOnly='{}', " +
                        "response='{}'",
                text,
                language,
                motherTongue,
                preferredVariants,
                enabledRules,
                disabledRules,
                enabledCategories,
                disabledCategories,
                enabledOnly,
                response
        );
        return response;
    }
}

SparkJava

@Slf4j
public class LanguageToolApiController {

    private final ObjectMapper mapper;
    private final LanguageToolApiService languageToolApiService;

    public LanguageToolApiController() {
        log.info("BEFORE init()");
        mapper = new ObjectMapper();
        languageToolApiService = new LanguageToolApiServiceImpl();
        threadPool(200, 25, 60000);
        setUpEndPoints();
        log.info("AFTER init()");
    }


    private void setUpEndPoints() {
        get("/languages", (request, response) -> {

            log.info("GET /languages request");

            String responseString;
            int responseStatus;
            try {
                response.type("application/json");
                List<LanguageDTO> languages = languageToolApiService.languages();
                responseString = mapper.writeValueAsString(languages);
                responseStatus = HttpStatus.OK_200;
            } catch (Exception e) {
                log.error("Error!", e);
                responseString = "";
                responseStatus = HttpStatus.BAD_REQUEST_400;
            }
            response.status(responseStatus);

            log.info("GET /languages response='[body='{}', status='{}']'", responseString, response.status());

            return responseString;
        });

        post("/check", (request, response) -> {
            String text = request.queryParams("text");
            String language = request.queryParams("language");
            String motherTongue = request.queryParams("motherTongue");
            String preferredVariants = request.queryParams("preferredVariants");
            String enabledRules = request.queryParams("enabledRules");
            String disabledRules = request.queryParams("disabledRules");
            String enabledCategories = request.queryParams("enabledCategories");
            String disabledCategories = request.queryParams("disabledCategories");
            boolean enabledOnly = Boolean.parseBoolean(request.queryParams("enabledOnly"));

            log.info("POST /check request: " +
                            "text='{}', " +
                            "language='{}', " +
                            "motherTongue='{}', " +
                            "preferredVariants='{}', " +
                            "enabledRules='{}', " +
                            "disabledRules='{}', " +
                            "enabledCategories='{}', " +
                            "disabledCategories='{}', " +
                            "enabledOnly='{}'",
                    text,
                    language,
                    motherTongue,
                    preferredVariants,
                    enabledRules,
                    disabledRules,
                    enabledCategories,
                    disabledCategories,
                    enabledOnly
            );

            String responseString;
            int responseStatus;
            try {
                response.type("application/json");
                CheckResultDTO checkResultDTO = languageToolApiService.check(text, language, motherTongue, preferredVariants, enabledRules,
                        disabledRules, enabledCategories, disabledCategories, enabledOnly);
                responseString = mapper.writeValueAsString(checkResultDTO);
                responseStatus = HttpStatus.OK_200;
            } catch (Exception e) {
                log.error("Error!", e);
                responseString = "";
                responseStatus = HttpStatus.BAD_REQUEST_400;
            }

            response.status(responseStatus);

            log.info("POST /check response: " +
                            "text='{}', " +
                            "language='{}', " +
                            "motherTongue='{}', " +
                            "preferredVariants='{}', " +
                            "enabledRules='{}', " +
                            "disabledRules='{}', " +
                            "enabledCategories='{}', " +
                            "disabledCategories='{}', " +
                            "enabledOnly='{}', " +
                            "response='[body='{}', status='{}']'",
                    text,
                    language,
                    motherTongue,
                    preferredVariants,
                    enabledRules,
                    disabledRules,
                    enabledCategories,
                    disabledCategories,
                    enabledOnly,
                    responseString,
                    response.status()
            );

            return responseString;
        });
    }
}

Feedback is welcome.

Thanks for running the test. It would be interesting to see how the current low-level implementation compares to both.

Here it is: Errors: 36.34%, Throughput 25.4 req/min. Updated the report above with that information.
The results:

Framework       Errors    Throughput
SparkJava       23.14 %   18.6 req/min
SpringBoot      14.45 %   32.3 req/min
None (LT 4.0)   36.34 %   25.4 req/min

Here is a draft of my proposal. Could you take a look?

Thanks, looks mostly good. Suggestions / comments:

  • SSL support: We use the reverse proxy approach you suggest on languagetool.org. However, we don’t know whether people rely on the SSL feature. So I’m hesitant to remove it.
  • I suggest changing the order of tasks: 1) spelling suggestions 2) HTTP server 3) Gradle - this way we focus on issues that improve LT from the user’s point of view.
  • Baseline for suggestions: the first baseline should be the current approach, which doesn’t use ML at all.

@jaumeortola, @Yakov any comment from your side?

How is Gradle better than Maven?
What is the advantage of switching to a new build environment?

Gradle offers a less verbose, more human-readable syntax (five lines of XML per dependency in Maven is overkill to me) and has good native support for unit testing (so the Surefire plugin is no longer needed). Gradle is also faster than Maven, which is a good thing especially when running tests.
Since the LT build logic is not very complex, Gradle’s well-known flexibility is not the key feature right now, but it could be useful in the future.
Finally, I asked whether it’s a good idea to migrate and received a positive answer.

Should I add that info to my proposal?

Updated the proposal with these suggestions and reorganized the timeline to give more time for each task.

Yes.
Speeding up test execution will help us develop LT faster.

Just pointing out that the test execution time itself won’t change, because it’s the JVM that executes the tests, and the JVM stays the same. The performance improvement comes mostly from the build and composition stages, which is why the overall unit-testing time goes down.

Updated the proposal with an explanation of Gradle’s pros.

The LT GSoC guide says that a student’s application should follow the same application template that Blender uses (section Application template). Is this still the only acceptable template?
In other words, could you advise me whether I should rework my proposal to follow the template suggested on Blender’s page?

No, it’s just one way to write a proposal. No need to rewrite yours, although you could check if some information might be missing from your proposal when comparing it to the Blender template.

Could you clarify the benefit of the reactive approach for LT?
The LT server does not seem to follow a microservices architecture and the HTTP endpoints are asynchronous. So where would you suggest introducing reactivity?