
[GSoC reports] spellchecker, server-side framework and build tool tasks


(Daniel Naber) #21

Thanks! For those who want to compare with the current state, you can use this command:

curl --data "language=en-US&text=A qick brown fox jamps ower the lasy dog" https://languagetool.org/api/v2/check

On Linux, you can pipe the output through json_pp to pretty-print the result:

curl --data "language=en-US&text=A qick brown fox jamps ower the lasy dog" https://languagetool.org/api/v2/check | json_pp
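
For anyone who prefers a script over curl, here is a rough Python sketch of the same request. The endpoint and the language and text parameters come from the command above. The helper names check_text and summarize_matches are made up for illustration, and the sketch assumes the response has a "matches" list whose entries carry "offset", "length" and "replacements" fields.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
API_URL = "https://languagetool.org/api/v2/check"


def check_text(text, language="en-US", url=API_URL):
    """POST the text as form data and return the decoded JSON response."""
    data = urllib.parse.urlencode({"language": language, "text": text}).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_matches(text, result, max_suggestions=3):
    """Return a (flagged fragment, first few suggestions) pair for each match.

    Assumes offsets can be used as Python string indices, which holds
    for plain ASCII input like the test sentence.
    """
    summary = []
    for match in result.get("matches", []):
        start = match["offset"]
        fragment = text[start:start + match["length"]]
        replacements = match.get("replacements", [])[:max_suggestions]
        summary.append((fragment, [r["value"] for r in replacements]))
    return summary
```

Calling summarize_matches(sentence, check_text(sentence)) with the test sentence should list each flagged word next to its first few suggestions, which is easier to skim than the raw JSON.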

Trying this manually is of limited use of course - what’s interesting will be the evaluation based on real data.


(Yakov) #22

Thanks!
Working well.
I also tried it with our Firefox extension, but that works only over HTTPS…


(Oleg) #23

The evaluation on the real hold-out data shows >87% accuracy, while the solution currently deployed on LT.org has ~86%. That does not take into account how far the correct suggestion is from the first position, so I'll try to find a nice way to use and display that information.
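
One possible way to include the position of the correct suggestion is to report top-k accuracy and mean reciprocal rank (MRR) next to plain accuracy. The sketch below is only an illustration, not the evaluation code used here: it assumes "accuracy" means the correct word is ranked first, and the function names and sample format are hypothetical.

```python
def rank_of_correct(suggestions, correct):
    """Return the 1-based position of the correct word, or None if it is absent."""
    try:
        return suggestions.index(correct) + 1
    except ValueError:
        return None


def evaluate(samples, k=5):
    """Compute top-1 accuracy, top-k accuracy and MRR.

    samples is a list of (ordered suggestions, correct word) pairs.
    A sample whose correct word is missing counts as a miss and
    contributes 0 to the MRR.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ranks = [rank_of_correct(suggestions, correct) for suggestions, correct in samples]
    n = len(ranks)
    top1 = sum(1 for r in ranks if r == 1) / n
    topk = sum(1 for r in ranks if r is not None and r <= k) / n
    mrr = sum(1.0 / r for r in ranks if r is not None) / n
    return {"top1": top1, "topk": topk, "mrr": mrr}
```

With such numbers, two orderers that have the same top-1 accuracy can still be told apart: the one that places missed corrections at position 2 rather than position 5 gets a higher MRR.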

I will try to enable SSL, but it will take some extra time to create a certificate (self-signed, I think).


(Daniel Naber) #24

It’s not that difficult using https://letsencrypt.org


(Oleg) #25

Thanks, I’ll use it!


(Oleg) #26

Week 6: 10 June

Working on multi-language support for the suggestions orderer; I hope to deploy tonight.
Found a way to use XGBoost painlessly from Java: jpmml-xgboost.


(Oleg) #27

Week 7: 11 June

Preprocessing the training data (it took more time than I expected).
Working on multi-language support for the suggestions orderer: added mock models for all languages, to be used only while the real models are being trained.


(Oleg) #28

Week 7: 12 June – 15 June

  • Studying jpmml-xgboost
  • Working on an update of the feature extractor
  • Training the models