Skip to content

Release v0.2.9

Latest
Compare
Choose a tag to compare
@Ying1123 Ying1123 released this 02 Aug 08:55
· 6 commits to main since this release
30a9b2e

Highlights

  • New feature: Chunked prefill (#800, #811)
  • New models: Deepseek v2
  • Performance improvement: vectorized logprob computation
  • Accuracy fix: fix the double BOS problem in the chat template; move logits to float32; update flashinfer sampling kernels
  • Feature fix: fixed many missing logprob-related features in the OpenAI API server
  • CI/CD infra is now fully ready. The tests cover frontend, backend, accuracy, and performance tests.

What's Changed

New Contributors

Full Changelog: v0.2.5...v0.2.9