"context.text" in the API results strips newlines

roundrobin · March 6, 2018, 4:14pm

Is there a reason why the “text” attribute in the “context” field has all newline characters (e.g. \n) removed?

What am I trying to achieve?
On the frontend I want to highlight the words that match the error, but for that I need the surrounding context.
The problem is that if the surrounding context in the API response does not preserve whitespace characters, I can’t match it against the original text with regular expressions.

Can we keep the newlines in the “context.text” field? I think it is not a good idea to alter the original text when showing the context of a match, because IMO the newlines matter for that context.

Example:

{
  "software": {
    "name": "LanguageTool",
    "version": "4.1-SNAPSHOT",
    "buildDate": "2018-03-06 10:41",
    "apiVersion": 1,
    "status": ""
  },
  "warnings": {
    "incompleteResults": false
  },
  "language": {
    "name": "German (Germany)",
    "code": "de-DE"
  },
  "matches": [
    {
      "message": "Möglicher Rechtschreibfehler gefunden",
      "shortMessage": "Rechtschreibfehler",
      "replacements": [],
      "offset": 167,
      "length": 6,
      "context": {
        "text": "...nd umfassendem Know-how setzen Wir sind HuiBui!  wir zahlreiche Services um, die Sie t...",
        "offset": 43,
        "length": 6
      },
      "sentence": "Mit langjähriger Erfahrung und umfassendem Know-how setzen\nWir sind HuiBui!",
      "rule": {
        "id": "GERMAN_SPELLER_RULE",
        "description": "Möglicher Rechtschreibfehler",
        "issueType": "misspelling",
        "category": {
          "id": "TYPOS",
          "name": "Mögliche Tippfehler"
        }
      }
    }
  ]
}

dnaber · March 6, 2018, 7:08pm

I can’t think of a reason. It seems it has always been this way and nobody complained. That might be because you can use offset and length (the ones directly under each entry in matches, not the ones under context) to find the context yourself. Since you sent the request to LT, you should still have the original text?
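
A minimal sketch of that approach in TypeScript, assuming the frontend still holds the original text and that offset and length index into it the same way JavaScript string indices do (UTF-16 code units). The names contextFromOriginal, MatchSpan, ContextParts and radius are made up for illustration and are not part of the LanguageTool API:

```typescript
// Hypothetical shape: only the two match fields used here
// (the ones directly under each entry in "matches").
interface MatchSpan {
  offset: number; // start of the flagged span in the original text
  length: number; // length of the flagged span
}

interface ContextParts {
  before: string;
  flagged: string;
  after: string;
}

// Cut the flagged span plus up to `radius` characters on each side
// out of the original text, so newlines and other whitespace survive.
function contextFromOriginal(
  original: string,
  match: MatchSpan,
  radius: number = 40
): ContextParts {
  const start = Math.max(0, match.offset - radius);
  const end = match.offset + match.length;
  return {
    before: original.slice(start, match.offset),
    flagged: original.slice(match.offset, end),
    after: original.slice(end, end + radius),
  };
}

// Made-up sample text, not the text from the request above.
const original = "First line\nSecond lnie here\nThird line";
const parts = contextFromOriginal(original, { offset: 18, length: 4 }, 10);
```

Here parts.flagged is the misspelled word and parts.before and parts.after still contain the newlines, so the three pieces can be rendered directly (for example with the flagged part wrapped in a highlight element) without any regular-expression matching against context.text.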