Skip to content

Latest commit

 

History

History
14 lines (11 loc) · 501 Bytes

README.md

File metadata and controls

14 lines (11 loc) · 501 Bytes

proofsearch

Experiments with Llemma and hypertree proof search (HTPS) for Lean4.

Relies on

Baseline results

Run bash scripts/eval_llemma7b.sh to run Llemma 7b (with awq quantization) on the minif2f validation data. Compute the accuracy with python compute_metrics.py. This should get around 26% accuracy.