
Logprobs Refactor #331

Merged: 13 commits merged into main from the logprobs branch, Mar 28, 2024
Conversation

hnyls2002 (Collaborator) commented Mar 25, 2024

What does this PR do?

  • Refactor all logits and logprobs handling logic, including:
    • Clarify the naming rules: *_token_logprobs stands for each token's logprob, while *_prompt_logprob stands for the sum of the logprobs over a segment of the prompt.
    • Split *_token_logprobs into prefill_token_logprobs and decode_token_logprobs.
  • Refactor the SRT API and OpenAI API logprobs response format to List[(logprob, token_id, token_text)]:
    • The OpenAI API responds with the prefill and decode logprobs in one list and always returns all of the prefill logprobs.
    • The SRT API accepts a logprob_start_len parameter and responds in meta_data["prefill_token_logprobs"] and meta_data["decode_token_logprobs"] respectively.
  • Support detokenized results in the sgl.select response, which is the only use case for multiple requests in a single HTTP request.
  • Support top_logprobs.
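As a hedged illustration of the response format described above: the meta_data keys follow the PR description, but the values and the format_logprobs helper are invented for this sketch, not taken from the actual SRT client.

```python
# Illustrative only: the meta_data keys follow the PR description, but the
# values and the format_logprobs helper are made up for this sketch.

def format_logprobs(entries):
    """Render a List[(logprob, token_id, token_text)] as readable lines."""
    return [f"{text!r} (id={tid}): {lp:.3f}" for lp, tid, text in entries]

meta_data = {
    "prefill_token_logprobs": [(-0.12, 101, "Hello"), (-1.50, 2290, " world")],
    "decode_token_logprobs": [(-0.03, 0, "!")],
}

prefill_lines = format_logprobs(meta_data["prefill_token_logprobs"])
decode_lines = format_logprobs(meta_data["decode_token_logprobs"])
```

Because the prefill and decode entries share one triple format, the same helper works on both lists.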

hnyls2002 marked this pull request as draft March 25, 2024 09:24
hnyls2002 marked this pull request as ready for review March 26, 2024 17:24
hnyls2002 (Collaborator, Author) commented:

closes #296
closes #232

comaniac (Collaborator) left a comment:
Overall LGTM. Just a few comments

Review threads (all resolved):
  • python/sglang/srt/managers/router/model_rpc.py (outdated)
  • python/sglang/srt/server.py (three threads)
hnyls2002 merged commit 3842eba into main Mar 28, 2024
hnyls2002 deleted the logprobs branch March 28, 2024 06:34
Contributor comment on the snippet below:

    if extend_seq_lens_cpu[i] == 0:
        continue
    k = input_metadata.top_logprobs_nums[i]
    t = all_logprobs[pt : pt + extend_seq_lens_cpu[i]].topk(k)

pt is not accumulated.
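A minimal sketch of the fix the reviewer is pointing at, with the sglang tensors replaced by plain nested lists; the function name and the None placeholder are assumptions for illustration, not the actual sglang code. The key point is that pt must advance by each request's extend length, and skipped requests still need a placeholder so per-request indexing stays aligned.

```python
# Sketch of the reviewer's point, with tensors replaced by nested lists.
# topk_per_request and the None padding are illustrative assumptions,
# not the actual sglang code.

def topk_per_request(all_logprobs, extend_seq_lens, top_logprobs_nums):
    """Slice a flattened per-token logprob table into per-request top-k lists."""
    results = []
    pt = 0
    for i, seq_len in enumerate(extend_seq_lens):
        if seq_len == 0:
            results.append(None)  # keep per-request indices aligned
            continue
        k = top_logprobs_nums[i]
        segment = all_logprobs[pt : pt + seq_len]
        # stand-in for Tensor.topk(k): the k largest values per token position
        results.append([sorted(row, reverse=True)[:k] for row in segment])
        pt += seq_len  # the accumulation the reviewer says is missing
    return results
```

Without the `pt += seq_len` line, every request would read its top-k candidates from the first request's rows of the flattened table.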

Contributor comment on the snippet below:

    extend_seq_lens_cpu = input_metadata.extend_seq_lens
    for i in range(len(input_metadata.extend_seq_lens)):
        if extend_seq_lens_cpu[i] == 0:
            continue

The continue will result in an out-of-bound error later.

Contributor comment on the snippet below:

    (
        normalized_prompt_logprobs,
        prefill_token_logprobs,
        decode_token_logprobs,
    ) = self.backend.select(self, expr.choices, expr.temperature)

This did not modify the OpenAI backend.
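For context on what select returns: a plausible reading of normalized_prompt_logprobs, given the naming rules in the PR description, is each choice's sum of token logprobs divided by its token count. The sketch below is an assumption for illustration, not the sglang implementation.

```python
# Assumed semantics for illustration: "normalized" = length-normalized sum
# of per-token logprobs. Not taken from the sglang source.

def normalized_prompt_logprob(token_logprobs):
    """Sum of a choice's token logprobs, divided by its token count."""
    if not token_logprobs:
        raise ValueError("a choice must have at least one token")
    return sum(token_logprobs) / len(token_logprobs)

def pick_choice(choices_token_logprobs):
    """Index of the choice with the highest normalized logprob."""
    scores = [normalized_prompt_logprob(lps) for lps in choices_token_logprobs]
    return max(range(len(scores)), key=scores.__getitem__)
```

Length normalization matters here because select compares choices of different token lengths; a raw sum would systematically penalize longer choices.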
