[en] How to suggest NARROW NO-BREAK SPACE (U+202F)

I want to suggest the Unicode character ‘NARROW NO-BREAK SPACE’ (U+202F) (http://www.fileformat.info/info/unicode/char/202F/index.htm).

This fragment of code in a suggestion is meant to find a dot and replace it with a thin space:
<match no="3" regexp_match="\." regexp_replace="\u202F"/>

It doesn’t work: the suggestion shows the literal text \u202F instead of the space.

Any ideas about what I am doing wrong?

I think you can use the XML character reference &#x202f; instead.
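A likely explanation for the difference: \u202F is a Java source-code escape, which an XML parser leaves as six ordinary characters, while &#x202f; is an XML character reference that the parser decodes itself. The sketch below is not LanguageTool's rule loader. It only assumes a stock JDK XML parser, and the class and method names (XmlEscapeDemo, replacementOf) are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; not part of LanguageTool.
class XmlEscapeDemo {

    // Parses a one-element XML snippet and returns its regexp_replace
    // attribute exactly as the XML parser hands it to the application.
    static String replacementOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("regexp_replace");
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML snippet", e);
        }
    }

    public static void main(String[] args) {
        // The doubled backslash below yields a single backslash in the XML text,
        // so the attribute matches what was written in the rule file.
        String javaStyle = replacementOf(
                "<match no=\"3\" regexp_match=\"\\.\" regexp_replace=\"\\u202F\"/>");
        String xmlStyle = replacementOf(
                "<match no=\"3\" regexp_match=\"\\.\" regexp_replace=\"&#x202f;\"/>");

        System.out.println(javaStyle.length());                            // prints 6
        System.out.println(xmlStyle.length());                             // prints 1
        System.out.println(Integer.toHexString(xmlStyle.codePointAt(0)));  // prints 202f
    }
}
```

With the Java-style escape the attribute still holds the backslash sequence, which matches the literal text seen in the suggestion. With the character reference it holds the single character U+202F.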

@dnaber, thanks. That works for the suggestion. (I can see a space.)

But the replacement does not render correctly in the GUI: instead of a space, I see a box character. (The font is Arial Unicode MS.)

Sorry, I’m not very familiar with the GUI, so I cannot spend time on debugging that.

Arial might not have that character.

@Ruud_Baars, thanks.

I also tried AWLUnicode and Lucida Sans Unicode, but they do not render it correctly either.

What is the reason for doing this? (Maybe we can find an alternative?) It looks like a Java interface issue.

It is to give a suggestion in the THOUSANDS_SEPARATOR rule for https://github.com/languagetooler-gmbh/languagetool-premium/issues/727

The rule suggests a comma (which is the usual non-technical separator).

But why suggest a character no one can enter by hand? Is it the official separator?

Hi @Ruud_Baars, some standards organizations recommend the thin space as a thousands separator: https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping.

For readability, you don’t want a number to be split across two lines, so the thin space needs to be non-breaking.
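As a rough illustration of that grouping style, here is a minimal Java sketch that formats an integer with U+202F between groups of three digits. It uses the JDK's DecimalFormat and has nothing to do with how the THOUSANDS_SEPARATOR rule builds its suggestion. The class and method names (ThinSpaceGrouping, group) are hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class; not part of LanguageTool.
class ThinSpaceGrouping {

    // NARROW NO-BREAK SPACE: a thin space that does not allow a line break.
    static final char NNBSP = '\u202F';

    // Formats n with U+202F between groups of three digits.
    static String group(long n) {
        // Locale.ROOT keeps the other symbols (digits, minus sign) locale-neutral.
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator(NNBSP);
        return new DecimalFormat("#,##0", symbols).format(n);
    }

    public static void main(String[] args) {
        String grouped = group(1234567L);
        System.out.println(grouped.length());                            // prints 9
        System.out.println(Integer.toHexString(grouped.codePointAt(1))); // prints 202f
    }
}
```

Whether the result displays as a thin space or as a box still depends on the font, as noted earlier in the thread.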

The THOUSANDS_SEPARATOR rule is not aimed at technical users. It’s for general use, so the current suggestion of a comma is sufficient.