jamestiotio/gpt-neox (forked from EleutherAI/gpt-neox)

An implementation of model parallel, GPT-3-style autoregressive transformers on GPUs, built on the DeepSpeed library. Designed to train models with hundreds of billions of parameters or more.
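Since the training stack sits on top of DeepSpeed, a rough feel for the pattern involved may help. The sketch below is a minimal, illustrative DeepSpeed training loop: the toy model, the ZeRO/fp16 config values, and the synthetic token batches are all assumptions for demonstration, not this repository's actual entrypoint or configuration.

```python
# Minimal sketch of a DeepSpeed training loop. The tiny model, the config
# values, and the synthetic batch below are illustrative assumptions; they
# are not this repository's actual training script.
import torch
import deepspeed

# A toy language-model-shaped network standing in for the real transformer.
model = torch.nn.Sequential(
    torch.nn.Embedding(1000, 64),
    torch.nn.Linear(64, 1000),
)

ds_config = {
    "train_batch_size": 8,
    "fp16": {"enabled": True},           # assumed setting
    "zero_optimization": {"stage": 1},   # assumed ZeRO stage
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# deepspeed.initialize wraps the model in an engine that manages device
# placement, mixed precision, gradient synchronization, and optimization.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for step in range(10):
    # Synthetic next-token-prediction batch: predict token t+1 from token t.
    tokens = torch.randint(0, 1000, (8, 128), device=engine.device)
    logits = engine(tokens[:, :-1])
    loss = torch.nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1)
    )
    engine.backward(loss)  # engine handles loss scaling and grad reduction
    engine.step()
```

In practice a script like this is started through the `deepspeed` launcher (e.g. `deepspeed train.py`), which sets up the distributed environment across GPUs and nodes.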
Languages
- Python 85.2%
- C++ 11.9%
- Cuda 1.1%
- C 0.8%
- Dockerfile 0.7%
- Shell 0.3%