API results for multiple sentences

(Ruud Baars) #1

When I accidentally feed the API two sentences, the match has an offset and length for the shortened context, but not for the sentence. Please check this match (JSON converted to a PHP array, just the match part). Isn't this inconsistent?

array(8) {
string(49) “Is dit niet overdreven? Is “hoog” niet voldoende?”
string(0) “”
array(1) {
array(1) {
string(4) “hoog”
array(3) {
string(57) “… betreft geen grenzen kent.De prijs zal erg hoog zijn.”
string(27) “De prijs zal erg hoog zijn.”
array(5) {
string(12) “OVERDRIJVING”
string(1) “2”
string(22) “mogelijke overdrijving”
string(13) “uncategorized”
array(2) {
string(5) “STYLE”
string(5) “Stijl”

(Daniel Naber) #2

I see two offsets there: one should be relative to the context, the other relative to the whole text.
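To illustrate the two offsets, here is a minimal Python sketch. It assumes the match carries a top-level `offset`/`length` pair (relative to the whole input) and a `context` entry with its own `text`, `offset` and `length`; those field names are an assumption, since the dump above shows no keys. The input text and the helper names are made up for the example, and the offsets are computed from the strings rather than taken from a real API response.

```python
def flagged_from_text(text, match):
    """Slice the flagged word out of the full input, using the top-level offset."""
    start = match["offset"]
    return text[start:start + match["length"]]


def flagged_from_context(match):
    """Slice the flagged word out of the shortened context, using the context offset."""
    ctx = match["context"]
    start = ctx["offset"]
    return ctx["text"][start:start + ctx["length"]]


# Hypothetical input: two sentences with no space after the first full stop.
text = "Dit is een zin.De prijs zal erg hoog zijn."
context_text = "…een zin.De prijs zal erg hoog zijn."

# Hypothetical match; field names are assumed, offsets are computed here.
match = {
    "offset": text.index("hoog"),
    "length": len("hoog"),
    "context": {
        "text": context_text,
        "offset": context_text.index("hoog"),
        "length": len("hoog"),
    },
    "sentence": "De prijs zal erg hoog zijn.",
}

print(flagged_from_text(text, match))
print(flagged_from_context(match))
```

Both calls return the same word, but the two offsets differ because each one counts from the start of a different string.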

(Ruud Baars) #3

Indeed, but the values relate to the input, not to the reported sentence. I will make it work using that knowledge.
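One possible way to do that, as a rough Python sketch: locate the reported sentence in the input and subtract its start position from the input-relative offset. The field names (`offset`, `length`, `sentence`) and the helper name are assumptions for illustration, and the approach only works if the sentence string appears verbatim in the input.

```python
def offset_in_sentence(text, match):
    """Convert an input-relative offset into one relative to the reported sentence.

    Returns None when the sentence cannot be found around the match.
    """
    sentence = match["sentence"]
    if not sentence:
        return None
    start = text.find(sentence)
    while start != -1:
        rel = match["offset"] - start
        # Accept this occurrence only if the match lies fully inside it.
        if 0 <= rel and rel + match["length"] <= len(sentence):
            return rel
        # The same sentence may occur more than once; try the next occurrence.
        start = text.find(sentence, start + 1)
    return None


# Hypothetical input and match; offsets are computed, not taken from the API.
text = "Kort. De prijs zal erg hoog zijn."
match = {
    "offset": text.index("hoog"),
    "length": len("hoog"),
    "sentence": "De prijs zal erg hoog zijn.",
}

rel = offset_in_sentence(text, match)
print(match["sentence"][rel:rel + match["length"]])
```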

The issue is that the sentence splitter I used is not SRX-based. I will have to look into that for standalone use.