Eval Harness Documentation

Welcome to the docs for the LM Evaluation Harness!

To learn about the public interface of the library, as well as how to run evaluations from the command line or from within an external library, see the Interface.
To learn how to add a new library, API, or model type to the harness, and for a quick explainer on the different ways to evaluate an LM, see the Model Guide.
For a crash course on adding new tasks to the library, see our New Task Guide.
To learn more about pushing the limits of the task configuration options that the Eval Harness supports, see the Task Configuration Guide.