
Future Plan of Transformer Kernel #600

Closed
hxbloom opened this issue Dec 14, 2020 · 2 comments
hxbloom commented Dec 14, 2020

I'm using the Megatron-LM example to train GPT-2 on my cluster. I've also tested the DeepSpeed Transformer Kernel in the bing_bert example; it is really helpful, much faster than the original PyTorch version and with lower memory consumption.

I would like to know whether you have any future plans to extend the transformer kernel, for example, to support more models like GPT-2, or to integrate model parallelism into the kernel for large-model training.

Thanks!
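For reference, this is roughly how the fused layer gets constructed in the bing_bert example. This is only a minimal sketch: the DeepSpeedTransformerConfig argument names and the DeepSpeedTransformerLayer constructor have changed between DeepSpeed releases, and the values shown are illustrative, not the ones from my runs.

```python
# Sketch of enabling the DeepSpeed Transformer Kernel, assuming the
# DeepSpeedTransformerConfig / DeepSpeedTransformerLayer API used by the
# bing_bert example. In newer DeepSpeed releases these classes live under
# deepspeed.ops.transformer; argument names may differ across versions.
from deepspeed import DeepSpeedTransformerConfig, DeepSpeedTransformerLayer

# Illustrative BERT-large-like settings (hypothetical values).
config = DeepSpeedTransformerConfig(
    batch_size=8,
    hidden_size=1024,
    intermediate_size=4096,
    heads=16,
    attn_dropout_ratio=0.1,
    hidden_dropout_ratio=0.1,
    num_hidden_layers=24,
    initializer_range=0.02,
    local_rank=-1,
    seed=1234,
    fp16=True,
    pre_layer_norm=True,
)

# Each fused layer replaces one standard PyTorch BertLayer; the bing_bert
# model stacks num_hidden_layers of them. Older releases also took a
# layer id as the first positional argument.
layers = [DeepSpeedTransformerLayer(config)
          for _ in range(config.num_hidden_layers)]
```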

RezaYazdaniAminabadi (Contributor) commented

Hi Dong,

Thanks for pointing out the use cases for the transformer kernel. There is a plan to support other types of transformer networks. For GPT-2, we have modified the kernels to support it and obtained about 30% and 40% speedups in the forward and backward passes, respectively. We are going to release them very soon. Please stay tuned!

Thanks,
Reza

hxbloom (Author) commented Dec 18, 2020

Looking forward to your next release!

Dong
