GPT-NeoX

An implementation of model-parallel GPT-3-like models on GPUs, based on the DeepSpeed library. It is designed to train models with hundreds of billions of parameters or more. This repository is under development and may change rapidly without warning.

Requirements

$ pip install -r requirements.txt

Running the code

The anatomy of a call to the DeepSpeed engine is the following:

$ deepspeed --hostfile=host_path train_script.py \
	--deepspeed \
	--deepspeed_config ./configs/base_deepspeed.json
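
As a rough sketch, the same call can be assembled from Python, which may be convenient when launching runs from another script. The helper name build_deepspeed_command is hypothetical and not part of this repository; host_path and train_script.py are the placeholders from the command above. Only the flags shown above are used, and leaving out the hostfile flag assumes the default hostfile location is picked up instead.

```python
import shlex


def build_deepspeed_command(train_script, deepspeed_config, hostfile=None):
    """Return the argument list for the deepspeed launcher call shown above.

    Hypothetical helper: it only mirrors the flags used in this README.
    """
    command = ["deepspeed"]
    if hostfile is not None:
        # Assumption: without this flag, the default hostfile location is used.
        command.append(f"--hostfile={hostfile}")
    command += [train_script, "--deepspeed", "--deepspeed_config", deepspeed_config]
    return command


command = build_deepspeed_command(
    "train_script.py",                 # placeholder script name from the example
    "./configs/base_deepspeed.json",
    hostfile="host_path",              # placeholder hostfile path from the example
)
# Print a copy-pasteable shell command; pass `command` to subprocess.run to launch it.
print(shlex.join(command))
```

Building the call as a list, rather than a single string, avoids quoting problems if a path contains spaces.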

Running the code locally

Running the code on a server

This code is set up to run automatically on as many GPUs as are available. To run across multiple machines, you need a hostfile that lists the IP address of each machine you wish to run the code on, followed by the number of GPUs to use. For example, 123.45.67.890 slots=8 instructs the code to run on all eight GPUs of the machine at 123.45.67.890. Each machine should be listed on a separate line with no end-of-line punctuation. It is officially recommended that you set up passwordless ssh, but we have had success entering the password at run-time. To have your hostfile used by GPT-NeoX automatically, store it at ~/jobs/hostfile. Otherwise, you can provide it as an argument as shown above.
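
The hostfile format described above (one machine per line, address followed by slots=N) can be generated with a few lines of Python. This is a minimal sketch: the function names and the 10.0.0.x addresses are made up for illustration, and the script is not part of this repository.

```python
from pathlib import Path


def format_hostfile(machines):
    """Render one '<address> slots=<gpu_count>' line per machine."""
    lines = []
    for address, gpu_count in machines:
        if gpu_count < 1:
            raise ValueError(f"{address}: slots must be at least 1")
        lines.append(f"{address} slots={gpu_count}")
    # One machine per line, no end-of-line punctuation.
    return "\n".join(lines) + "\n"


def write_hostfile(machines, path="~/jobs/hostfile"):
    """Write the hostfile, creating the parent directory if needed."""
    target = Path(path).expanduser()
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(format_hostfile(machines))
    return target


# Hypothetical machines: two nodes with 8 and 4 GPUs respectively.
print(format_hostfile([("10.0.0.1", 8), ("10.0.0.2", 4)]), end="")
```

Calling write_hostfile with no path stores the file at ~/jobs/hostfile, the location mentioned above.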

EleutherAI members:

~/scripts/

The directory ~/scripts/ stores various scripts for automatically starting runs with particular settings and configs that we have found useful. They can be run using sh scripts/script_name.sh but should not be relied upon. We do not guarantee forward compatibility of any scripts.

Datasets

Tokenizers

Using our data

Using your data

Advanced Options

Contribute

If you want to get involved, check out our repo projects. Anything that is listed as "todo" or has not been assigned to anyone is fair game, but please leave a comment so that we know you're working on it!

Resources

If you have trouble getting the model to run, consider consulting this guide to installing in a GCE virtual machine. You may also find the (very sparse) DeepSpeed docs helpful.
