Training Code for LLM360 K2-65B

This repository contains the code for training K2-65B, a 65 billion parameter large language model from LLM360.

Note

This repository is under active development. If you have suggestions or find bugs, please open a GitHub issue or reach out.

Environment

The simplest way to launch training is to use our Docker image. A more detailed writeup of the environment will be provided later.
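
As a quick sanity check of the environment (inside the Docker image or a manually prepared setup), the short Python sketch below prints the PyTorch version and the GPUs that are visible. It assumes only that PyTorch is installed, which the training code requires; it does not import any project-specific modules.

```python
# Minimal environment sanity check: confirms PyTorch can see the GPUs
# that large-scale training will need. Assumes only that torch is installed.
import torch

print(f"PyTorch version : {torch.__version__}")
print(f"CUDA available  : {torch.cuda.is_available()}")
print(f"Visible GPUs    : {torch.cuda.device_count()}")
for i in range(torch.cuda.device_count()):
    print(f"  GPU {i}: {torch.cuda.get_device_name(i)}")
```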

Launch Training

To launch training, run:

bash scripts/pretrain_65b.sh

Converting Megatron Checkpoints to HuggingFace Format

To convert model checkpoints from Megatron to HuggingFace format, run:

python convert_ckpt_to_hf.py --load_path <megatron_ckpt_dir> --save_path <huggingface_ckpt_dir>
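
After conversion, the resulting directory can be loaded with the Hugging Face transformers library. The sketch below is a minimal check that the converted weights load and generate text; the path is a placeholder for the `--save_path` used above, and it assumes the tokenizer files are present in (or have been copied into) the converted checkpoint directory.

```python
# Minimal sketch of loading a converted checkpoint with Hugging Face transformers.
# <huggingface_ckpt_dir> is a placeholder for the --save_path used above;
# tokenizer files are assumed to be available in the same directory.
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt_dir = "<huggingface_ckpt_dir>"

tokenizer = AutoTokenizer.from_pretrained(ckpt_dir)
# device_map="auto" shards the 65B model across available GPUs and
# requires the accelerate package to be installed.
model = AutoModelForCausalLM.from_pretrained(
    ckpt_dir,
    torch_dtype="auto",
    device_map="auto",
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```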
