GPT-NeoX

An implementation of model parallel GPT-3-like models on GPUs, based on the DeepSpeed library. Designed to train models with hundreds of billions of parameters or more.

Requirements

$ pip install -r requirements.txt

Test DeepSpeed locally

$ deepspeed train_enwik8.py \
	--deepspeed \
	--deepspeed_config ./configs/base_deepspeed.json
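
The command above points the deepspeed launcher at a JSON config file. The repo's actual ./configs/base_deepspeed.json is not reproduced here; the following is a minimal sketch using standard DeepSpeed config keys, with illustrative values:

{
  "train_batch_size": 256,
  "gradient_accumulation_steps": 1,
  "optimizer": {
    "type": "Adam",
    "params": {
      "lr": 0.0006
    }
  },
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 1
  }
}

Inside the training script, the --deepspeed and --deepspeed_config flags are consumed by DeepSpeed's argument parser and the model is wrapped in a DeepSpeed engine. Below is a minimal sketch of that pattern; ToyModel is a hypothetical stand-in for the repo's transformer, not code from train_enwik8.py:

import argparse

import deepspeed
import torch
import torch.nn as nn


class ToyModel(nn.Module):
    """Hypothetical toy model standing in for the repo's transformer."""

    def __init__(self):
        super().__init__()
        self.net = nn.Linear(256, 256)

    def forward(self, x):
        return self.net(x)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=-1,
                        help="set automatically by the deepspeed launcher")
    # Adds the --deepspeed and --deepspeed_config flags used above.
    parser = deepspeed.add_config_arguments(parser)
    args = parser.parse_args()

    model = ToyModel()
    # The returned engine handles distributed training, fp16, and ZeRO
    # according to the JSON config; the optimizer is built from the
    # config's "optimizer" section.
    model_engine, _, _, _ = deepspeed.initialize(
        args=args, model=model, model_parameters=model.parameters())

    # Dummy training steps on random data, just to exercise the engine.
    for _ in range(10):
        x = torch.randn(8, 256, device=model_engine.device,
                        dtype=torch.half)  # half to match the fp16 config sketch
        loss = model_engine(x).pow(2).mean()
        model_engine.backward(loss)  # engine-managed backward (loss scaling)
        model_engine.step()          # optimizer step and gradient zeroing


if __name__ == "__main__":
    main()

Saved as, say, toy_train.py, this runs under the same launcher invocation: deepspeed toy_train.py --deepspeed --deepspeed_config ./configs/base_deepspeed.json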
