ParaGen is a PyTorch deep learning framework for parallel sequence generation. Apart from sequence generation, ParaGen also supports various other NLP tasks, including sequence-level classification, extraction, and generation.
- Install third-party dependencies:

```bash
# Ubuntu
apt-get install libopenmpi-dev libssl-dev openssh-server
# CentOS
yum install openmpi openssl openssh-server
# Conda
conda install -c conda-forge mpi4py
```
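Optionally, you can check that mpi4py is linked against a working MPI installation:

```bash
python -c "from mpi4py import MPI; print(MPI.Get_version())"
```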
- To install ParaGen from source:

```bash
cd ParaGen
pip install -e .
```
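A quick sanity check that the editable install worked:

```bash
python -c "import paragen; print('paragen imported successfully')"
```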
- For distributed training on multiple GPUs, run ParaGen with `torch.distributed`:

```bash
python -m torch.distributed.launch --nproc_per_node {GPU_NUM} paragen/entries/run.py --configs {config_file}
```
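For example, to launch training on 8 GPUs with a config file at an illustrative path `configs/train.yaml`:

```bash
python -m torch.distributed.launch --nproc_per_node 8 paragen/entries/run.py --configs configs/train.yaml
```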
You can also use `horovod` for distributed training. Install `horovod` with:

```bash
# CMake is required to install horovod (https://cmake.org/install/)
HOROVOD_WITH_PYTORCH=1 HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_NCCL_HOME=${NCCL_ROOT_DIR} pip install horovod
```

Then run ParaGen with `horovod`:

```bash
horovodrun -np {GPU_NUM} -H localhost:{GPU_NUM} paragen-run --config {config_file}
```
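For example, on a single machine with 8 GPUs (the config path is again illustrative):

```bash
horovodrun -np 8 -H localhost:8 paragen-run --config configs/train.yaml
```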
- Install lightseq for faster training:

```bash
pip install lightseq
```
Before using ParaGen, it is helpful to get an overview of how ParaGen works.

ParaGen is designed as a task-oriented framework, where the task is regarded as the core of all the code. A specific task selects all the components it needs, such as model architectures, training strategies, datasets, and data processing. Any component within ParaGen can be customized, while the existing modules and methods are used as a plug-in library. As tasks are the core of ParaGen, a task works under various modes, such as `train`, `evaluate`, `preprocess`, and `serve`. Tasks act differently under different modes by reorganizing their components, without any code modification. Please refer to the examples for detailed instructions.
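For instance, changing what a task does is typically a matter of pointing the same entry at a different config rather than editing code; a minimal sketch, assuming hypothetical config files `configs/train.yaml` and `configs/eval.yaml` that describe the same task under different settings:

```bash
# Illustrative only: both config files are hypothetical.
paragen-run --config configs/train.yaml   # run the task with a training setup
paragen-run --config configs/eval.yaml    # run the same task with a different setup, no code changes
```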
We welcome any experimental algorithms developed on ParaGen. To contribute:
- Install ParaGen;
- Create your own paragen-plugin libraries under `third_party` (see the sketch after this list);
- Experiment with your own algorithms;
- Write a reproducible shell script;
- Create a merge request and assign reviewers to any of us.
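As a sketch of the plugin step above, a paragen-plugin library is an ordinary Python package placed under `third_party`; the layout and names below are purely illustrative, so check the existing plugins under `third_party` for the conventions ParaGen actually expects:

```bash
# Hypothetical plugin layout; all names are illustrative.
mkdir -p third_party/my_algorithm
touch third_party/my_algorithm/__init__.py   # typically imports your custom classes (assumed convention)
touch third_party/my_algorithm/my_task.py    # e.g. a customized task
touch third_party/my_algorithm/my_model.py   # e.g. a customized model architecture
```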