Searching for specific Unicode characters

Hello

I’d like to search for specific Unicode characters

So far, I have this

\P{u00AB}

where U00AB = «

But it doesn’t work, and I can’t spend more time on LT at the moment because of work.

Can anyone help me with my code?

Thanks

On Sat 01.12.2012, 12:07:37 you wrote:

\P{u00AB}

where U00AB = «

You don’t need to escape characters, as the grammar.xml is UTF-8 and can
contain the characters directly:

«
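
As a side note on the original attempt: in Java's java.util.regex, \P{...} is a negated Unicode property class rather than a code point escape, which is presumably why \P{u00AB} does not do what was intended. Below is a minimal sketch of matching « by its code point in plain Java, outside grammar.xml. The class and method names are hypothetical and not part of LanguageTool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of LanguageTool.
final class GuillemetSearch {
    // The regex engine receives a backslash, 'u' and four hex digits,
    // which it interprets as the single code point U+00AB.
    private static final Pattern LEFT_GUILLEMET = Pattern.compile("\\u00AB");

    /** Returns the index of the first left guillemet, or -1 if there is none. */
    static int firstIndex(String text) {
        Matcher m = LEFT_GUILLEMET.matcher(text);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        // String literals use Unicode escapes so the source file stays ASCII-only.
        System.out.println(firstIndex("He said \u00ABhello\u00BB"));
    }
}
```

Inside a UTF-8 grammar.xml, typing the character directly, as suggested above, avoids the escaping question entirely.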


http://www.danielnaber.de

Hi Daniel,
I’m trying to do exactly the same.
I’d like to detect \n and \t (newline and tab character).
Following your suggestion, I typed those characters directly into the grammar.xml.
My pattern for \n is just this:



The problem I have is that it’s detecting newLines almost anywhere in the file now.
I guess there must be something I’m missing, can that be done?
thanks, regards
Vincenzo

Just an update: I tried to use a regexp. My pattern is like this:

MY_REGEX_HERE

I’ve tried multiple regexes like:
\n \u000A

They worked OK in both http://regexpal.com/ and RegexPlanet (an online regular expression tester for Java).
However, none of them worked in LanguageTool.

Can anyone help with this, please?
Thanks, regards
Vincenzo

I’m trying to do exactly the same.
I’d like to detect \n and \t (newline and tab character).

This is not possible with XML rules; you’d need to write a Java rule.
Actually, detecting single characters can be done easily in any
programming language, so I’m not sure whether using LT isn’t overkill
here.
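
To illustrate the "any programming language" point, here is a minimal sketch in plain Java that reports the offsets of newline and tab characters in a string. It does not use the LanguageTool rule API; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone helper; it does not depend on LanguageTool.
final class WhitespaceFinder {
    /** Returns the offsets of every newline or tab character in the text. */
    static List<Integer> newlineAndTabOffsets(String text) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\n' || c == '\t') {
                offsets.add(i);
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        System.out.println(newlineAndTabOffsets("first\tcolumn\nsecond line"));
    }
}
```

A real Java rule would have to wrap a check like this in LT's rule classes, which are not shown here.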

Regards
Daniel



Hi Daniel,
thanks for your help.
I’m using LT because it’s already used in my project, so it would be good to have all logic using the same technology.
I guess that blank spaces can’t be detected either.
I’m trying to detect leading spaces in a sentence with regexes like ^ .* or ^\s.*
but it doesn’t work.
Can you please confirm that blanks can’t be detected this way too?
Thanks, regards
Vincenzo

Can you please confirm that blanks can’t be detected this way too?

That’s right, no kind of whitespace can be detected by rules. There’s
one exception described here, but I don’t know if that’s enough for you:

http://wiki.languagetool.org/tips-and-tricks#toc13
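
Outside LT rules, the leading-whitespace check from the question can be sketched in plain Java as follows. This assumes each sentence is already available as a separate string; the class and method names are hypothetical and not part of LanguageTool.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone helper; it does not depend on LanguageTool.
final class LeadingWhitespaceCheck {
    // Without the MULTILINE flag, ^ matches only at the very start of the input.
    private static final Pattern LEADING_WS = Pattern.compile("^\\s");

    /** Returns true if the sentence starts with a space, tab, or other whitespace. */
    static boolean startsWithWhitespace(String sentence) {
        return LEADING_WS.matcher(sentence).find();
    }

    public static void main(String[] args) {
        System.out.println(startsWithWhitespace(" Leading space here."));
        System.out.println(startsWithWhitespace("No leading space."));
    }
}
```

The trailing .* from the original patterns is not needed here, because find() only has to locate the first whitespace character.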

Regards
Daniel

