pretrain

Inspired by Andrej Karpathy's build-nanogpt: https://github.com/karpathy/build-nanogpt

  • fineweb.py: Constructs the FineWeb training data.
  • hellaswag.py: Evaluates a model on the HellaSwag dataset.
  • train_gpt2.py: Trains a nano GPT-2 model from scratch.
  • train_llama.py: Trains a nano LLaMA-3 model from scratch.
  • train_mixtral.py: Trains a nano Mixtral 8x model from scratch.
  • train_deepseek.py: Trains a nano DeepSeek MoE model from scratch.
  • play_with_llama.ipynb: Compares the training results.
  • mixtral_result.ipynb: Compares the MoE training results.
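
For readers new to HellaSwag-style evaluation, here is a minimal sketch of one common way to score a multiple-choice example: compute the model's per-token loss for each candidate ending and pick the ending with the lowest average loss. This is an illustration under stated assumptions, not necessarily what hellaswag.py in this folder does. The function name `pick_ending`, its arguments, and the toy numbers are all hypothetical.

```python
def pick_ending(token_losses, completion_masks):
    """Return the index of the candidate ending with the lowest mean loss.

    token_losses[i][t]: loss of token t in candidate i (context + ending).
    completion_masks[i][t]: 1 if token t belongs to the ending, else 0.
    Assumes every candidate has at least one ending token.
    """
    avg_losses = []
    for losses, mask in zip(token_losses, completion_masks):
        n_ending_tokens = sum(mask)
        # Only tokens in the ending contribute; context tokens are masked out.
        masked_total = sum(loss * m for loss, m in zip(losses, mask))
        avg_losses.append(masked_total / n_ending_tokens)
    # Index of the smallest average loss is the predicted ending.
    return min(range(len(avg_losses)), key=avg_losses.__getitem__)


# Toy numbers, not real model output: four candidates sharing two context tokens.
losses = [
    [2.0, 1.5, 3.0, 3.2],
    [2.0, 1.5, 1.1, 0.9],
    [2.0, 1.5, 2.5, 2.7],
    [2.0, 1.5, 4.0, 3.8],
]
masks = [[0, 0, 1, 1]] * 4
predicted = pick_ending(losses, masks)
```

Averaging over the ending tokens rather than summing keeps longer endings from being penalized just for their length; accuracy is then the fraction of examples where `predicted` matches the labeled ending.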