French has no suggestions of wrong written words at all

Hi, I have a small question.

When I select “French” and enter a sentence like “C’est francais” it marks “francais” as an error but does not give me alternatives like in English or Italian (checked on LanguageTool homepage). Is this a bug or just a missing feature?

It’s a known missing feature. We had to disable the suggestions because they were too slow. Making them fast requires a bit of work (the list of words needs to be converted to a binary format for fast lookups) and I’m sure it’s on someone’s TODO list.

I don’t think that the binary format supports all the Hunspell features yet. I don’t remember what was missing though. Maybe the ICONV and OCONV features of Hunspell, among other things, were missing. That is why French uses Hunspell, but the suggestions with Hunspell are indeed too slow for large texts (hence disabled). I’d rather not sacrifice the excellent Hunspell French dictionary from Dicollecte.org for something degraded but with suggestions.

It might be worth having an option to enable/disable spelling suggestions to override the default behavior (disabled for French and some other languages).

Besides the spelling checker, some French grammar rules provide suggestions, but not all of them.

What’s still missing is documented at the bottom of Spell check - LanguageTool Wiki - I’m not sure what the comment that ICONV isn’t strictly needed actually refers to.

Has this been resolved? Is there a way to get French suggestions now? Is there something I can do to make it work somehow? Even if it’s slow it is better than nothing.

I don’t think that Dominique has worked on that yet. What’s needed is basically a large list of words with all forms. In other words, the *.dic and *.aff files at https://github.com/languagetool-org/languagetool/tree/master/languagetool-language-modules/fr/src/main/resources/org/languagetool/resource/fr/hunspell would need to be expanded. Hunspell has the “unmunch” tool for that, but as the French *.aff file uses some advanced features, it doesn’t seem to work properly. At least the result has some strange numbers in it, but I cannot properly verify it as I don’t really speak French.
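To make "expanding" the *.dic and *.aff files concrete, here is a minimal sketch in Python of what such an expansion does for plain suffix rules. It is not the unmunch tool and not LanguageTool code: the function names `parse_sfx_rules` and `expand` and the toy data are made up for illustration, and it assumes single-character flags and simple `SFX flag strip add condition` lines only. Prefixes, numeric or long flags, continuation flags and the other advanced features mentioned above are not handled.

```python
import re


def parse_sfx_rules(aff_lines):
    """Collect suffix rules per flag from simplified .aff lines."""
    rules = {}
    for line in aff_lines:
        parts = line.split()
        # Rule lines have at least 5 fields: SFX flag strip add condition.
        # Header lines ("SFX flag Y count") have only 4 and are skipped.
        if len(parts) >= 5 and parts[0] == "SFX":
            _, flag, strip, add, cond = parts[:5]
            strip = "" if strip == "0" else strip  # "0" means nothing to strip
            add = "" if add == "0" else add        # "0" means nothing to add
            rules.setdefault(flag, []).append((strip, add, cond))
    return rules


def expand(dic_lines, rules):
    """Return the set of all forms produced by the suffix rules."""
    forms = set()
    for entry in dic_lines:
        word, _, flags = entry.partition("/")
        forms.add(word)
        for flag in flags:  # assumption: one character per flag
            for strip, add, cond in rules.get(flag, []):
                if not word.endswith(strip):
                    continue
                # The condition is matched against the end of the word.
                if cond != "." and not re.search(cond + "$", word):
                    continue
                stem = word[: len(word) - len(strip)]
                forms.add(stem + add)
    return forms


# Toy data, not taken from the real French dictionary.
toy_aff = [
    "SFX S Y 1",
    "SFX S 0 s [^sxz]",
    "SFX F Y 1",
    "SFX F 0 e [^e]",
]
toy_dic = ["chat/S", "petit/SF", "nez"]

all_forms = expand(toy_dic, parse_sfx_rules(toy_aff))
print(sorted(all_forms))
```

A real *.dic file starts with a word count line, which this toy example leaves out, and forms that need two rules combined (such as "petites") are not generated here.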

French could be done the same way I did Dutch. I am willing to do that.

Ruud.

Is there a way to use the LibreOffice dictionaries, or to import them somehow? They have a new French one:
http://extensions.libreoffice.org/extension-center/dictionnaires-francais/releases/5.2

Hi Ruud, thanks. I’ve lost track a bit, could you summarize in one or two sentences what that approach was? Is it reproducible with a script, so that we can re-create the list of words once a new hunspell dictionary is published?

It is reproducible, assuming there is a raw source word list.
I have one, but there may be a better one.

I could script it. The quality of the results is better if the source word
list is bigger of course, and drawn from lots of sources.

I could do this if required. In fact for any language now.

Ruud

On 07-01-15 at 08:12, dnaber [via LanguageTool User Forum] wrote:

Hi Ruud, thanks. I lost overview a bit, could you summarize in one or
two sentences what that approach was? Is it reproducible with a
script, so that we can re-create the list of words once a new hunspell
dictionary is published?


It is a pity a direct reply is always refused.

The procedure is very simple:

  • get a big list of words and frequencies
  • turn it into a words list and a gaia list
  • spellcheck the words to find the correct ones
  • remove redundant entries from the words list (Firstupper, FULLUPPER etc)
  • combine the correct words and the gaia to a spell dic.
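The steps above could be sketched roughly as follows in Python. This is only an illustration of the idea, not the contents of the zip file mentioned in this post: `build_spelling_list` and `is_correct` are hypothetical names, the spell check is passed in as a plain function rather than a real Hunspell call, and since the "gaia list" is not defined in this thread the sketch does not model it and simply carries the frequencies along with the words.

```python
from collections import Counter


def build_spelling_list(freq_pairs, is_correct):
    """Turn (word, frequency) pairs into a cleaned list of correct words.

    freq_pairs: iterable of (word, count) tuples from any raw source.
    is_correct: hypothetical callback that says whether a word passes
                the spell check (e.g. a wrapper around an existing checker).
    """
    # Step 1: merge counts for words that occur in several sources.
    counts = Counter()
    for word, freq in freq_pairs:
        counts[word] += freq

    # Step 2: keep only the words the spell checker accepts.
    correct = {w for w in counts if is_correct(w)}

    # Step 3: drop redundant case variants (Firstupper, FULLUPPER)
    # when the lowercase form is itself an accepted word.
    kept = {}
    for w in correct:
        lower = w.lower()
        is_variant = w != lower and w in (lower.capitalize(), lower.upper())
        if is_variant and lower in correct:
            continue
        kept[w] = counts[w]

    # Step 4: output words with their frequencies, most frequent first.
    return sorted(kept.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy input, invented for the example.
raw = [("maison", 50), ("Maison", 5), ("MAISON", 1), ("Paris", 20), ("maisno", 2)]
known = {"maison"}
result = build_spelling_list(raw, lambda w: w.lower() in known or w == "Paris")
print(result)
```

Whether the counts of dropped case variants should be folded into the lowercase entry is a design choice the thread does not settle; the sketch leaves them out.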

I put it all together; for the time being it can be found here:
www.spellonit.com/PrIvAtE/mkLTdic.zip

I could not test the last step, since the computer is still recovering from a disk full crash.

Ruud

Hey all,

Any progress or updates on this? The last comment was in Jan 2015.

Thanks :)