[en] How to suggest NARROW NO-BREAK SPACE (U+202F)

Mike_Unwalla · November 28, 2019, 1:09pm

I want to suggest the Unicode character ‘NARROW NO-BREAK SPACE’ (U+202F).

This fragment of code in a suggestion is meant to find a dot and replace it with a narrow no-break space:
<match no="3" regexp_match="\." regexp_replace="\u202F"/>

It doesn’t work. I see the literal text \u202F in the suggestion.

Any ideas about what I am doing wrong?

dnaber · November 28, 2019, 2:13pm

You can use &#x202F; I think.
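
For anyone wondering why the backslash form shows up literally: XML has no \u escape, so the parser passes those six characters through unchanged, whereas a numeric character reference such as &#x202F; is resolved to the character itself. Here is a minimal sketch of that difference using Python's standard XML parser as a stand-in (LanguageTool itself runs on Java, so this only illustrates general XML behaviour, not LanguageTool's own code). The helper name replacement_value is made up for the example.

```python
import xml.etree.ElementTree as ET


def replacement_value(match_xml: str) -> str:
    """Return the regexp_replace attribute as the XML parser delivers it."""
    return ET.fromstring(match_xml).get("regexp_replace")


# The fragment from the first post. XML does not know \u escapes, so the
# attribute value stays as the six characters \ u 2 0 2 F.
backslash_form = r'<match no="3" regexp_match="\." regexp_replace="\u202F"/>'

# The same fragment with a numeric character reference, which any
# conforming XML parser resolves to the single character U+202F.
char_ref_form = r'<match no="3" regexp_match="\." regexp_replace="&#x202F;"/>'

for name, fragment in (("backslash", backslash_form), ("char ref", char_ref_form)):
    value = replacement_value(fragment)
    # Print the length and the code points so the invisible space is visible.
    print(name, len(value), [f"U+{ord(ch):04X}" for ch in value])
```

Printing code points instead of the raw value avoids depending on whether the terminal font can draw U+202F.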

Mike_Unwalla · November 28, 2019, 3:50pm

@dnaber, thanks. That works for the suggestion. (I can see a space.)

But, the replacement does not render correctly in the GUI. Instead of a space, I see a box character. (The font is Arial Unicode MS.)

dnaber · November 28, 2019, 3:55pm

Sorry, I’m not very familiar with the GUI, so I cannot spend time on debugging that.

Ruud_Baars · November 28, 2019, 6:23pm

Arial might not have that character.

Mike_Unwalla · November 29, 2019, 10:30am

@Ruud_Baars, thanks.

I also tried AWLUnicode and Lucida Sans Unicode, but neither renders the character correctly.

Ruud_Baars · November 29, 2019, 3:49pm

What is the reason for doing this? (Maybe we can find an alternative?) Looks like it is a Java interface thing.

Mike_Unwalla · November 29, 2019, 6:47pm

It is to give a suggestion in the THOUSANDS_SEPARATOR rule for https://github.com/languagetooler-gmbh/languagetool-premium/issues/727

The rule suggests a comma (which is the usual non-technical separator).

Ruud_Baars · November 30, 2019, 10:27am

But why suggest a character no-one can enter by hand? Is it the official separator?

Mike_Unwalla · December 2, 2019, 10:56am

Hi @Ruud_Baars, some standards organizations recommend the thin space as a thousands separator (see the Wikipedia article ‘Decimal separator’).

For readability, you don’t want to split a number over two lines. Thus, a non-breaking thin space is necessary.
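
As a rough illustration of what such a separator looks like in practice, here is a small Python sketch that groups digits with U+202F. The function name group_thousands is hypothetical and this is not how the LanguageTool rule works; it only shows the target format.

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def group_thousands(n: int, sep: str = NNBSP) -> str:
    """Format an integer with `sep` between groups of three digits."""
    # Python's "," format spec groups by thousands; swap the comma for `sep`.
    return f"{n:,}".replace(",", sep)


# repr() shows the separator as an escape, so it stays visible in any font.
print(repr(group_thousands(1234567)))
print(repr(group_thousands(1234567, ",")))
```

Because the separator is a no-break character, a renderer that honours Unicode line-breaking rules should keep the whole number on one line.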

The THOUSANDS_SEPARATOR rule is not for technical persons. It’s for general use, so the current suggestion of a comma is sufficient.