v0.4.0
🎉 Enhancements
- Added Mixtral by @flozi00 in #122
- Added Phi by @tgaddair in #132
- Add support for H100s by @thelinuxkid in #111
- Upgrade to Python 3.10 by @flozi00 in #121
- Add Predibase as a source for adapters by @magdyksaleh in #125
- enh: Add SOCI indexing to allow lazy loading of LoRAX images by @gyanesh-mishra in #95
🐛 Bugfixes
- fix: Set Mistral sliding window to max position embeddings when None by @tgaddair in #128
- Fix Qwen tensor parallelism by @tgaddair in #120
- fix: Llama AWQ with GQA by @tgaddair in #114
- fix: Mixtral adapter loading wraps lm_head by @tgaddair in #131
📝 Docs
- Add SkyPilot example and getting started guide by @tgaddair in #117
- docs: fix broken link by @Fluder-Paradyne in #133
- Added Mixtral and Phi to docs by @tgaddair in #134
🔧 Maintenance
- Increase default client timeout to 60s by @tgaddair in #119
- Make transpose contiguous for fan-in-fan-out by @tgaddair in #129
- Remove lorax env var by @geoffreyangus in #113
New Contributors
- @gyanesh-mishra made their first contribution in #95
- @thelinuxkid made their first contribution in #111
- @Fluder-Paradyne made their first contribution in #133
Full Changelog: v0.3.0...v0.4.0