
The matter of Dialog and Grammar Rules


(Glenn Allen Hefley) #1

I'm sure this has been brought up before, it can't be only me, but I failed to find a previous discussion. First off, very cool tool. I'm loving it -- and if there is an area you can point me to where a Python hack can toy with this issue, I'm down. It's not a big issue either, because I've never seen a grammar tool have this... anyway, it's about fiction and dialog. In non-fiction it is a quote, so they have the same issue.

Inside dialog, it is a quote, a verbatim transfer of sound. Sometimes it can only be rendered in phonetic spelling, e.g. Awk!, but more often it is a grammar rule that is breached, since most people do not speak in a grammar-friendly fashion. Speaking in a grammar-friendly manner is dull, and you can't get laid using a dull and witless voice.

The issue is, I want to tell my tool not to check certain rules inside dialog, but to always check them outside of dialog. Outside of dialog, well, that's me, and the writer has to hold to a narrower course or he looks witless and doesn't get laid. I know, it's all weird and stuff but hey, so is grammar.

So, how can I put in a regex or set up a table lookup or something that checks to see if the infraction is inside quotation marks? And, if so, to walk on by and let the man say his piece?

Is this possible? I'm much more comfortable adding nonsensical words to my dictionary, because I use the same spelling for them, and I don't use them often, but grammar rules are a whole different set of fears.

Thanks in advance.

Glenn


#2

If you plan to preprocess them, why not just do a regex replace on the quoted words, swapping each one for a word with the intended meaning?

For example, replace Awk word with a different word before sending it to the processor.

Ironically enough, I am actually trying to build a tool for novel writing. Though for now I am still developing the component of combining TinyMCE with LanguageTool.

Demo:
https://knowzero.github.io/tinymce4-languagetool/demo.html

Github:

That said, I may implement that feature to allow things of that sort. Especially accents and the like.


(Daniel Naber) #3

Hi Glenn, how are you using LT, i.e. from which software? In general, LT cannot yet do what you need, so the software you're using would need to use LT differently, depending on whether you're in a dialog.


(Mike Unwalla) #4

Hi @glennhefley

For spelling, we have 'ignore_spelling' (http://wiki.languagetool.org/hunspell-support and http://wiki.languagetool.org/developing-a-disambiguator#toc11).

In disambiguation.xml, you could make a rule that applies a special postag to each quoted word. A second rule then ignores the spelling of quoted words.

For the grammar rules, put an antipattern on each rule to ignore text that has the special postag (you can probably automate that task).

Add these rules near the bottom of disambiguation.xml:

<rule name="Is quoted speech (first word)" id="IS_QUOTED1">
  <pattern>
    <token regexp="yes">'|‘|"|“</token>
    <marker>
      <token skip="-1"/>
    </marker>
    <token regexp="yes">'|’|"|”</token>
  </pattern>
  <disambig action="add"><wd pos="IS_QUOTED"/></disambig>
  <example type="untouched">If your <marker>fergulator</marker> is defective, stop the test.</example>
  <example type="untouched">Did you write, "The cat sat on the mat? [No closing quote mark.]</example>
  <example type="ambiguous" inputform="These[these/DT]" outputform="These[These/IS_QUOTED,these/DT]">... and I quote, "<marker>These</marker> unusual gizmo2 inherbitions caused much dmage to the quipment."</example>
  <example type="ambiguous" inputform="Shhbamkrrr[Shhbamkrrr]" outputform="Shhbamkrrr[Shhbamkrrr/IS_QUOTED]">Thunder struck '<marker>Shhbamkrrr</marker>' and the frightened rabbits ran.</example>
</rule>

<rule name="Is quoted speech (remaining words)" id="IS_QUOTED2">
  <pattern>
    <token postag="IS_QUOTED"/>
    <marker>
      <token><exception regexp="yes">'|’|"|”</exception></token>
    </marker>
  </pattern>
  <disambig action="add"><wd pos="IS_QUOTED"/></disambig>
  <example type="untouched">If your <marker>fergulator</marker> is defective, stop the test.</example>
  <example type="ambiguous" inputform="fergulator[fergulator]" outputform="fergulator[fergulator/IS_QUOTED]">"Oh, no, my red and white <marker>fergulator</marker> is defective," she cried.</example>
  <example type="ambiguous" inputform="inherbitions[inherbitions]" outputform="inherbitions[inherbitions/IS_QUOTED]">... and I quote, "These unusual gizmo2 <marker>inherbitions</marker> caused much dmage to the quipment."</example>
</rule>

<rule name="ignore spelling of quoted text" id="IGNORE_SPELLING_IN_QUOTED_TEXT">
  <pattern>
    <token postag="IS_QUOTED"/>
  </pattern>
  <disambig action="ignore_spelling"/>
</rule>

In grammar.xml, add this antipattern at the top of each rule:

<antipattern>
  <token postag="IS_QUOTED"/>
</antipattern>
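
As an illustration, a complete grammar.xml rule with the antipattern in place might look like the sketch below. The rule id, name, pattern, message, and examples are hypothetical and made up for this illustration; only the antipattern comes from the text above, and it assumes the IS_QUOTED postag has already been added by the disambiguation rules.

```xml
<!-- Hypothetical grammar rule, shown only to illustrate where the antipattern goes. -->
<rule id="EXAMPLE_COULD_OF" name="Hypothetical example: could of (could have)">
  <!-- Tokens tagged IS_QUOTED by disambiguation.xml are excluded from this rule. -->
  <antipattern>
    <token postag="IS_QUOTED"/>
  </antipattern>
  <pattern>
    <token>could</token>
    <token>of</token>
  </pattern>
  <message>Did you mean <suggestion>could have</suggestion>?</message>
  <example correction="could have">I <marker>could of</marker> gone home.</example>
  <example>I could have gone home.</example>
</rule>
```

The intent is that the same phrase inside quotation marks would be left alone, while the narration outside the quotes is still checked. Whether that holds depends on the disambiguation rules tagging every quoted word, so it is worth adding a quoted example sentence to the rule and running the rule tests.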

(Glenn Allen Hefley) #5

@Mike_Unwalla Wow... you're awesome. I think I even understand that, and I've been writing all night since I posted the question... I'm going to wait on trying it out and get some sleep first, to make sure my understanding is... actual understanding.

@KnowZero, I have the standalone and the Libre... I would like to be inside the Libre, but I'll make a VM and put the standalone in there, and try out what Mike gave me.

TinyMCE is a good idea, and I was going that route myself, but then I stumbled into enough areas that my head exploded and I came up with something a bit more involved. Which is good, no.. it is.. because it's now so complicated I won't feel bad not ever getting it done.

I'm on GitHub if you want to look at my notes. Right now it's just notes and vapor and NLTK Python, some sentiment analysis with LSTMs, WordNet, VerbNet, SenseNet, FrameNet, and 10k novels used to make a repository, plus those 13k words from that emotional affective study that rated them for valence, arousal and dominance. So, it sounds cool and I could probably take over... like maybe Portland or Ontario, but I'm not sure how good it's going to be for writing novels yet.
This is the kind of stuff I'm looking to have my Muse/Daemon help with

This is where I'm going to start out

and to be honest, if I can get that first thing done and functional, I'll be a happy guy. :) Then we can take over the world and all the rest of the stuff.

I will report back.. Thank you very much for the help!


(Glenn Allen Hefley) #6

Yeah, it's never really a thing I am aware of enough to do a preprocess... the spelling, yes, but to be honest, I suck at grammar and always have. Well... I know it, I just can't write and try to use it at the same time, and I gave up trying ten years ago.... I can't afford to not write. And, yes, I can tag it -- in fact that's exactly what I do now: LT finds an offense, and I decide if it is something to fix or not. If it isn't, then I put the "mistake" into a comment on the side, fix it, and then after I'm all done, my last chore is to go back to all of the comments and put it back the way I want it to be before sending it to the editor.

Of course her first list is all of those mistakes, or was until I stopped deleting the comments so she could see I did it on purpose. -- now she just sighs and cusses at me.

But it would be cool -- dialect is one thing that it would be good for, slurs. Whacked with a Salmon, tends to get caught. And even with great coding that simple task is going to be harsh... it must be because, as I said before, no one has that.. not even the expensive commercial packages.


(Tiago F. Santos) #7

If you are in LibreOffice and your spellchecking is also done by LO, you may wish to try a change to Mike's rule.
Use immunize (used to mark the tokens as immunized, i.e., never matched by any rule).
That way you won't need to tag every rule.

Change this:
<disambig action="add"><wd pos="IS_QUOTED"/></disambig>
for this:
<disambig action="immunize"></disambig>

I look forward to seeing Wordthy advances.


(Glenn Allen Hefley) #8

Thanks for that suggestion. I'm not going to have time today. Got a deadline to meet. But will be reporting back soon.

Thanks again, and for the Wordthy .. good.. I need all the overseeing and hints I can get. It seems to me that most of the stuff I need is already developed.. but not tasked in the direction of use. So, I believe some retasking and I'll look like a stud... with a long list of thank yous on every page.

The WordTTY is first though. I planned on doing that in NodeJS, using Electron as the framework. Then I stumbled over Claudia, which looks like the developer did all the heavy lifting for the Alexa connection, and the Bot connections as well. So that's going to be some rule writing, and then training, and little else... I hope, I hope, I hope. Next is an augmented novel, which is going to bring in some user interface and build up interaction tool sets. Then Wordthy-Main. So, if you ever find yourself poking around and see that I'm in need of some ... skills? .. or hints, please, ... you won't be insulting me at all.

Thanks again.

G


(Tiago F. Santos) #9

Hasty as usual, and I goofed up here.
After looking at it again, I realized the error.

The change should be on the last rule, the one with ignore_spelling. Change this:

<disambig action="ignore_spelling"/>
for this:
<disambig action="immunize"/>

My apologies.
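
Putting this correction together with Mike's third rule, the result would presumably look like the sketch below. It is only a combination of the snippets already posted in this thread, not a tested rule, and it assumes the two IS_QUOTED rules stay unchanged above it in disambiguation.xml. The name and id are kept from Mike's rule and could be renamed to reflect the new action.

```xml
<!-- Mike's third rule with the action swapped as described in the correction. -->
<rule name="ignore spelling of quoted text" id="IGNORE_SPELLING_IN_QUOTED_TEXT">
  <pattern>
    <!-- Matches every token that the IS_QUOTED1 and IS_QUOTED2 rules have tagged. -->
    <token postag="IS_QUOTED"/>
  </pattern>
  <!-- immunize: the matched tokens are never matched by any rule. -->
  <disambig action="immunize"/>
</rule>
```

With this variant, the per-rule antipattern in grammar.xml should no longer be needed, which is the point of the suggestion: "That way you won't need to tag every rule."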