How to create rules for (typical) wrong POS order

Dear LanguageToolmakers,

we are currently in the process of deciding whether LanguageTool is something we should use in our teaching and provide to our students. One key argument for us would be the ability to write rules that warn our students when they get the POS order wrong (i.e. putting prepositional phrases outside the German “Verbklammer” of auxiliary and inflected verb, at the very end of the sentence; see Schrijven | Verbklammern und Substantivierungen).

German sentences such as
“Peter wurde intensiv hofiert von Petra im Sommer.”
are correct under certain oral discourse conditions, but are mostly regarded as unacceptable, especially in written German.
The correct version should be:
“Peter wurde im Sommer intensiv von Petra hofiert.”

From what I read in the wiki, I would guess that POS tags are the key to the solution, but I’m not sure I understand them yet … I’d really appreciate some advice on how we could write a rule or a set of rules that would warn our students when they get these things wrong.

Best, Thorsten

On Mo 08.10.2012, 01:53:18 you wrote:

Hello Thorsten,

German sentences such as
“Peter wurde intensiv hofiert von Petra im Sommer.”

Here’s an example of a pattern that would match that sentence:

werden in|im|auf|über

It basically matches werden/wurde/wird etc., later followed (skip="-1") by
a participle (PA2), later followed by a simple prepositional phrase at the
end of the sentence. This is just a first approach, so it might create false
alarms and it will not match all cases. So you would probably need several
other rules.
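
The pattern described above could be sketched roughly as follows. This is only an illustration, not necessarily the rule as originally posted: the rule id, name, message text and the noun token (SUB) are assumptions, and the example elements use the older type= style, which may differ between LanguageTool versions.

```xml
<!-- Sketch only: id, name and message are made up for illustration. -->
<rule id="DE_PP_NACH_VERBKLAMMER" name="Präpositionalphrase nach der Verbklammer">
  <pattern>
    <!-- Any inflected form of "werden" (wird, wurde, ...).
         skip="-1" lets any number of tokens intervene before the next match. -->
    <token inflected="yes" skip="-1">werden</token>
    <!-- A past participle: POS tag starting with PA2. -->
    <token postag="PA2.*" postag_regexp="yes" skip="-1"/>
    <!-- A preposition from a small hand-picked list. -->
    <token regexp="yes">in|im|auf|über</token>
    <!-- Assumption: the "simple prepositional phrase" is preposition + noun. -->
    <token postag="SUB.*" postag_regexp="yes"/>
    <!-- Sentence-final token. -->
    <token postag="SENT_END"/>
  </pattern>
  <message>Die Präpositionalphrase steht hinter der Verbklammer; im geschriebenen Deutsch steht sie meist davor.</message>
  <example type="incorrect">Peter <marker>wurde intensiv hofiert von Petra im Sommer.</marker></example>
  <example type="correct">Peter wurde im Sommer intensiv von Petra hofiert.</example>
</rule>
```

Because of the two skip="-1" attributes, “intensiv” and “von Petra” are skipped over, so the sketch is expected to fire on the first example but not on the second, where nothing follows the participle except the full stop.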

Alternatively, you can always write more complex rules in Java, if you
don’t want to be limited by our grammar syntax.

I’ll be forwarding your question to our mailing list - most developers are
over there. Maybe someone has another idea how to approach this.

Regards
Daniel


http://www.danielnaber.de

Hey Daniel,

thanks so much for your quick reply. Sentence patterns are a great way to deal with these cases, as long as one can distinguish inflected from non-inflected forms (as in your example). As far as I can see now, it should be possible to create a small set of similar rules to cover the cases I have in mind and integrate them into a rule subset for native speakers of Dutch (together with a set of common “false friends” between German and Dutch).

Still, it would be good to see these cases of uncommon POS order, and how to cover them elegantly, discussed in the developers’ community. Thanks for passing this question on to the mailing list!

Best, Thorsten

Dear Thorsten,

Please contact me for grammar rules and false friends regarding Dutch. We at OpenTaal support LanguageTool for Dutch. We have a development environment for testing new rules and false friends. Contact me at pander at opentaal punt org

Regards,

Pander