-
Notifications
You must be signed in to change notification settings - Fork 6
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
0 parents
commit adf9240
Showing
23 changed files
with
22,606 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
|
||
*.pyc | ||
*.tar.gz | ||
# JetBrains PyCharm IDE | ||
.idea/ | ||
|
||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# macOS dir files | ||
.DS_Store | ||
|
||
# Distribution / packaging | ||
.Python | ||
env/ | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
.hypothesis/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# pyenv | ||
.python-version | ||
|
||
# celery beat schedule file | ||
celerybeat-schedule | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# dotenv | ||
.env | ||
|
||
# virtualenv | ||
.venv | ||
venv/ | ||
ENV/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
# `llemma` formal2formal | ||
|
||
Scripts for the Lean formal2formal (tactic prediction) experiments in \ | ||
[Llemma: an open language model for mathematics [Azerbayev et al 2023]]() | ||
|
||
#### Setup | ||
Install Python packages: | ||
``` | ||
pip install -r requirements.txt | ||
``` | ||
|
||
Install Lean: | ||
``` | ||
# from https://leanprover-community.github.io/install/linux.html | ||
# After running this command, select (2), then `nightly`, then `y`: | ||
curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | sh | ||
source $HOME/.elan/env | ||
lake | ||
``` | ||
|
||
Configure LeanDojo: | ||
``` | ||
export CONTAINER="native" | ||
``` | ||
|
||
#### Run | ||
See `scripts` | ||
|
||
#### Compute metrics | ||
|
||
```bash | ||
python compute_metrics.py | ||
==> | ||
|
||
codellama7b_minif2f_test 0.20491803278688525 50 244 | ||
codellama34b_minif2f_test 0.22131147540983606 54 244 | ||
llemma7b_minif2f_test 0.26229508196721313 64 244 | ||
llemma34b_minif2f_test 0.2581967213114754 63 244 | ||
``` | ||
|
||
|
||
|
||
### Troubleshooting | ||
- We observe a Ray error when running the 34b script (with VLLM `--tp-degree > 1`) on an untraced LeanDojo repo. A workaround is to run the 7b script with `--tp-degree 1` such that LeanDojo completes tracing the repo. Then run the 34b script with `--tp-degree > 1`. | ||
|
||
### Citation | ||
Please cite the following: | ||
``` | ||
@article{azerbayev2023llemma, | ||
title={Llemma: an open language model for mathematics}, | ||
author={Zhangir Azerbayev and Hailey Schoelkopf and Keiran Paster and Marco Dos Santos and Stephen McAleer and Albert Q. Jiang and Jia Deng and Stella Biderman and Sean Welleck}, | ||
eprint={xyz.xyz}, | ||
archivePrefix={arXiv} | ||
year={2023} | ||
} | ||
``` | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
import json | ||
import glob | ||
|
||
for setting in [ | ||
'codellama7b_minif2f_test', | ||
'codellama34b_minif2f_test', | ||
'llemma7b_minif2f_test', | ||
'llemma34b_minif2f_test' | ||
]: | ||
fs = [json.load(open(x)) for x in glob.glob('./output/%s/*.json' % setting)] | ||
n = 0 | ||
ns = 0 | ||
for f in fs: | ||
for result in f['results']: | ||
name = result['example']['full_name'] | ||
|
||
# Extra helper theorem in the OpenAI code | ||
if 'sum_pairs' in name: | ||
continue | ||
|
||
n += 1 | ||
if result['success']: | ||
ns += 1 | ||
|
||
print(setting, ns/n, ns, n, sep='\t') | ||
|
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
Please note: | ||
- Llemma34b was run with three shards: 1 shard for the first 122 theorems, and two for the remaining. |
Oops, something went wrong.