
bug: response header wrong for non streamed responses (no chunks) #2560

Closed · Propheticus opened this issue Apr 1, 2024 · 5 comments
Labels: P1: important Important feature / fix · type: bug Something isn't working

Propheticus commented Apr 1, 2024

When calling the /chat/completions API endpoint without "stream": true set, the response is indeed a single JSON object of type "chat.completion", not a stream of server-sent event lines starting with "data: " followed by objects of type "chat.completion.chunk".

So instead of

data: {"choices":[{"delta":{"content":" Pos"},"finish_reason":null,"index":0}],"created":1711961792,"id":"GEinhvWFJJLg5Qr8KJ6r","model":"_","object":"chat.completion.chunk"}

data: {"choices":[{"delta":{"content":"itive"},"finish_reason":null,"index":0}],"created":1711961792,"id":"h05jZ89hTwi7thCiVrBR","model":"_","object":"chat.completion.chunk"}

data: {"choices":[{"delta":{"content":"."},"finish_reason":null,"index":0}],"created":1711961792,"id":"5XwVZ6y0S8aIdtRSxSe0","model":"_","object":"chat.completion.chunk"}

data: {"choices":[{"delta":{"content":""},"finish_reason":null,"index":0}],"created":1711961792,"id":"6Vi2z9md1sQCWKhSYRJI","model":"_","object":"chat.completion.chunk"}

data: {"choices":[{"delta":{"content":""},"finish_reason":null,"index":0}],"created":1711961792,"id":"dRmZRw35MDeyq23GHAxr","model":"_","object":"chat.completion.chunk"}

data: {"choices":[{"delta":{"content":""},"finish_reason":"stop","index":0}],"created":1711961792,"id":"fpmpyJEnXHidePQM1cnz","model":"_","object":"chat.completion.chunk"}

data: [DONE]

We get

{"choices":[{"finish_reason":null,"index":0,"message":{"content":" Positive.","role":"assistant"}}],"created":1711962125,"id":"7Y3MXJ4ndC5QW1BrTvd4","model":"_","object":"chat.completion","system_fingerprint":"_","usage":{"completion_tokens":3,"prompt_tokens":59,"total_tokens":62}}

However, the response header still says:

HTTP/1.1 200 OK
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Access-Control-Allow-Origin: *
Date: Mon, 01 Apr 2024 08:55:47 GMT
Transfer-Encoding: chunked

I would expect a Content-Type of "application/json", not "text/event-stream". The Transfer-Encoding: chunked is also incorrect for a single, fully buffered body.
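The mislabelled header matters because clients typically dispatch on Content-Type to decide how to parse the body. A minimal sketch of such a client (the function name and shapes are hypothetical, mirroring the payloads above): with the wrong "text/event-stream" header, a JSON body contains no "data: " lines and the SSE branch silently yields an empty string.

```python
import json

# Hypothetical sketch: a client deciding how to parse the response body
# based on the Content-Type header.
def parse_completion(content_type: str, body: str) -> str:
    """Return the assembled assistant text for either response style."""
    if content_type.startswith("text/event-stream"):
        # Server-sent events: each "data: " line carries a chat.completion.chunk.
        parts = []
        for line in body.splitlines():
            line = line.strip()
            if not line.startswith("data: "):
                continue
            payload = line[len("data: "):]
            if payload == "[DONE]":
                break
            chunk = json.loads(payload)
            parts.append(chunk["choices"][0]["delta"]["content"])
        return "".join(parts)
    elif content_type.startswith("application/json"):
        # Single chat.completion object.
        obj = json.loads(body)
        return obj["choices"][0]["message"]["content"]
    raise ValueError(f"unexpected Content-Type: {content_type}")
```

With the buggy header, the non-streamed JSON body above would fall into the first branch and produce "" instead of " Positive.".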

@Propheticus Propheticus added the type: bug Something isn't working label Apr 1, 2024
@Propheticus Propheticus changed the title bug: response header wrong for non streamed (no chunks) calls bug: response header wrong for non streamed responses (no chunks) Apr 1, 2024
@louis-jan louis-jan added this to the v0.4.11 milestone Apr 3, 2024
@louis-jan (Contributor) commented

Great find! @Propheticus

@Van-QA (Contributor) commented Apr 9, 2024

Hi @Propheticus, the issue was resolved in our nightly build, would you mind retrying it? Thank you.

@Propheticus (Author) commented

@Van-QA Will do, as soon as I get Avast to stop sandboxing and blocking the installer.

@Propheticus (Author) commented

That works! The JSON is also formatted with newlines and indents now.

HTTP/1.1 200 OK
Content-Type: application/json
Cache-Control: no-cache
Connection: keep-alive
Access-Control-Allow-Origin: *
Date: Tue, 09 Apr 2024 07:46:50 GMT
Transfer-Encoding: chunked



{
	"choices":[
		{
			"finish_reason":null,
			"index":0,
			"message":{
				"content":" Positive.",
				"role":"assistant"
			}
		}
	],
	"created":1712648810,
	"id":"pfSUz0sCcSG0jTqmQdlg",
	"model":"_",
	"object":"chat.completion",
	"system_fingerprint":"_",
	"usage":{
		"completion_tokens":3,
		"prompt_tokens":59,
		"total_tokens":62
	}
}

@Propheticus (Author) commented

Two remarks:

  • The finish_reason is shown as null. That is normal for chunks that are not the last chunk; the last chunk, or in this case the only 'chunk', should state the stop reason, e.g. "finish_reason":"stop" or "finish_reason":"length", per the OpenAI spec.

  • Transfer-Encoding: chunked is still shown in the header, while I'd expect a Content-Length: <length> instead. See the HTTP header doc.
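The two remarks above can be sketched together: when the server fully buffers a non-streamed response, it knows the exact byte length of the encoded body, so it can send Content-Length instead of Transfer-Encoding: chunked, and the single completion object should carry a real finish_reason. This is an illustrative sketch, not the server's actual code; the function name and response shape are assumptions modelled on the payloads in this thread.

```python
import json

# Hypothetical sketch of building the expected non-streamed response:
# a known-length JSON body with Content-Length set, and a non-null
# finish_reason such as "stop".
def build_response(message: str, finish_reason: str = "stop"):
    body = json.dumps({
        "object": "chat.completion",
        "choices": [{
            "index": 0,
            "finish_reason": finish_reason,  # not null for a completed answer
            "message": {"role": "assistant", "content": message},
        }],
    }).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # The body is fully buffered, so its byte length is known up front;
        # no need for Transfer-Encoding: chunked.
        "Content-Length": str(len(body)),
    }
    return headers, body
```

Per RFC 9112, a message must not combine Transfer-Encoding: chunked with Content-Length; for a buffered body, Content-Length is the appropriate choice.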
