EleutherAI/math-lm

Llemma: an open language model for mathematics

ArXiv | Models | Data | Code | Blog | Sample Explorer

Repository for Llemma: an open language model for mathematics [Azerbayev et al., 2023].

This repository hosts data and training code related to the following artifacts:

Name              HF Hub Link
Llemma 7b         EleutherAI/llemma_7b
Llemma 34b        EleutherAI/llemma_34b
Proof-Pile-2      EleutherAI/ProofPile2
  AlgebraicStack  EleutherAI/AlgebraicStack
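
The model weights can be fetched directly from the Hugging Face Hub under the ids in the table above. A minimal sketch using git and git-lfs, shown here for Llemma 7b (any other Hub client works just as well):

```bash
# Minimal sketch: fetch the Llemma 7b weights from the Hugging Face Hub.
# Assumes git and git-lfs are installed; the repository id comes from the table above.
git lfs install
git clone https://huggingface.co/EleutherAI/llemma_7b
```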

This repository also contains submodules related to the overlap, fine-tuning, and theorem proving experiments described in the paper. Additional evaluation code is in a fork of the Eleuther LM Evaluation Harness.

Directories

This repository contains the following directories:

  • proof_pile_2: scripts for downloading and preprocessing data.
  • gpt-neox: git submodule containing a modified branch of EleutherAI/gpt-neox.
  • lm-evaluation-harness: code for all evaluations, except formal2formal theorem proving.
  • llemma_formal2formal: git submodule containing scripts for the formal2formal experiments.
  • overlap: git submodule containing the overlap and memorization analysis.
  • finetunes: git submodule containing scripts for the fine-tuning experiments.

Because this project contains submodules, you should clone this project with the --recurse-submodules flag or, alternatively, run git submodule update --init --recursive from within the project directory after cloning the project. After running git pull, you should also run git submodule update.
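
For example (the repository URL is assumed from the EleutherAI/math-lm GitHub path above):

```bash
# Clone the repository together with all of its submodules.
git clone --recurse-submodules https://github.com/EleutherAI/math-lm.git

# Or, if the repository was already cloned without submodules:
cd math-lm
git submodule update --init --recursive

# After pulling new commits, bring the submodules up to date as well.
git pull && git submodule update
```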

Citation

Please cite the following:

@article{azerbayev2023llemma,
  title={Llemma: An Open Language Model For Mathematics}, 
  author={Azerbayev, Zhangir and Schoelkopf, Hailey and Paster, Keiran and Dos Santos, Marco and McAleer, Stephen and Jiang, Albert Q. and Deng, Jia and Biderman, Stella and Welleck, Sean},
  journal={arXiv preprint arXiv:2310.10631},
  year={2023}
}

You may also be interested in citing our training data, which is a mix of novel data and data from the following sources:

@article{paster2023openwebmath,
  title={OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text},
  author={Paster, Keiran and Santos, Marco Dos and Azerbayev, Zhangir and Ba, Jimmy},
  journal={arXiv preprint arXiv:2310.06786},
  year={2023}
}

@software{together2023redpajama,
  author = {Together Computer},
  title = {RedPajama: An Open Source Recipe to Reproduce LLaMA training dataset},
  month = {April},
  year = 2023,
  url = {https://github.com/togethercomputer/RedPajama-Data}
}

@misc{kocetkov2022stack,
  title={The Stack: 3 TB of permissively licensed source code},
  author={Denis Kocetkov and Raymond Li and Loubna Ben Allal and Jia Li and Chenghao Mou and Carlos Muñoz Ferrandis and Yacine Jernite and Margaret Mitchell and Sean Hughes and Thomas Wolf and Dzmitry Bahdanau and Leandro von Werra and Harm de Vries},
  year={2022},
  eprint={2211.15533},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
