
Releases: huggingface/lighteval

v0.3.0

29 Mar 16:42

Release Note

This release introduces the new extended tasks feature, documentation, and many other patches for improved stability.
New tasks are also introduced:

MT-Bench marks the introduction of multi-turn prompting as well as the LLM-as-a-judge metric.
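
As a rough illustration of what an LLM-as-a-judge metric over a multi-turn exchange involves, the sketch below builds a judging prompt from a conversation transcript and parses a numeric rating from the judge's reply. It is a generic Python sketch, not lighteval's implementation: `judge_multi_turn` and `judge_fn` are hypothetical names, and the prompt wording is illustrative only.

```python
from typing import Callable, List, Tuple

def judge_multi_turn(
    conversation: List[Tuple[str, str]],
    judge_fn: Callable[[str], str],
) -> float:
    """Score a multi-turn exchange with an LLM judge (illustrative sketch).

    `conversation` is a list of (user_turn, model_answer) pairs and `judge_fn`
    is any callable that sends a prompt to a judge model and returns its raw
    text reply. Both names are placeholders, not lighteval APIs.
    """
    transcript = "\n\n".join(
        f"User: {user}\nAssistant: {answer}" for user, answer in conversation
    )
    prompt = (
        "You are an impartial judge. Rate the assistant's answers in the "
        "conversation below on a scale of 1 to 10. Reply with the number only.\n\n"
        + transcript
    )
    reply = judge_fn(prompt)
    try:
        return float(reply.strip())
    except ValueError:
        return 0.0  # an unparsable judge reply counts as the lowest score
```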

New tasks

Features

Documentation

Small patches

New Contributors

Full Changelog: v0.2.0...v0.3.0

v0.2.0

01 Mar 14:31

Release Note

This release focuses on customization and personalization: it is now possible to define custom metrics, not just custom tasks; see the README for the full mechanism.
It also includes new tasks and small fixes to improve stability. We chose to split community tasks from the main library source to make maintenance easier.
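
As a purely illustrative sketch of what defining a custom metric can look like, the snippet below implements a simple exact-match score and registers it in a plain dictionary. The names (`exact_match`, `CUSTOM_METRICS`) are hypothetical; the actual registration mechanism is the one described in the README.

```python
from typing import Callable, Dict, List

def exact_match(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions that match their reference exactly,
    after trimming surrounding whitespace."""
    hits = sum(
        pred.strip() == ref.strip() for pred, ref in zip(predictions, references)
    )
    return hits / len(references) if references else 0.0

# Hypothetical registry mapping metric names to scoring functions; lighteval's
# real plug-in mechanism is documented in the README.
CUSTOM_METRICS: Dict[str, Callable[[List[str], List[str]], float]] = {
    "exact_match": exact_match,
}
```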

Better community task handling

New tasks

Features

  • Add an automatic system to compute the average for tasks with subtasks by @clefourrier in #41
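
For context, here is a minimal sketch of what averaging over subtasks means (illustration only, not the code added in #41): each subtask score is grouped under its parent task, and the per-task result is the mean of its subtask scores.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict

def average_subtask_scores(scores: Dict[str, float]) -> Dict[str, float]:
    """Average per-subtask scores into one score per parent task.

    For this sketch, subtask names are assumed to look like
    "mmlu:abstract_algebra", with the parent task before the colon.
    """
    grouped = defaultdict(list)
    for name, score in scores.items():
        parent = name.split(":", 1)[0]
        grouped[parent].append(score)
    return {parent: mean(values) for parent, values in grouped.items()}

# Averages the two MMLU subtasks into a single "mmlu" score.
print(average_subtask_scores({"mmlu:anatomy": 0.5, "mmlu:astronomy": 0.6}))
```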

Small patches

✨ Community Contributions

Full Changelog: v0.1.1...v0.2.0

v0.1.1

09 Feb 11:29

Small patch for the PyPI release

Include tasks_table.jsonl in the package
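
For context, one common way to make sure a data file such as tasks_table.jsonl ships with a Python package is to declare it as package data in setuptools. This is a generic sketch of that approach, not necessarily the exact change made in this release.

```python
# setup.py (generic setuptools sketch; the actual packaging fix may differ)
from setuptools import find_packages, setup

setup(
    name="lighteval",
    packages=find_packages(),
    # Ship the task registry JSONL alongside the Python sources.
    package_data={"lighteval": ["tasks_table.jsonl"]},
)
```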

v0.1.0

08 Feb 10:27

Init

LightEval 🌤️

A lightweight LLM evaluation suite

Context

LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.

We're releasing it with the community in the spirit of building in the open.

Note that it is still early days, so don't expect 100% stability ^^'
In case of problems or questions, feel free to open an issue!

Full Changelog: https://github.com/huggingface/lighteval/commits/v0.1