NVlabs/EfficientDL

Full-Stack, GPU-based Acceleration of Deep Learning

A tutorial at CVPR 2023: Monday, June 19th, 13:30 to 17:00, East Hall 11, Vancouver Convention Center.

Webpage

Visit our webpage

Description

This tutorial describes techniques that let deep learning practitioners accelerate the training and inference of large deep networks, while also reducing memory requirements, across a spectrum of off-the-shelf hardware, for important applications such as autonomous driving and large language models. Topics include, but are not limited to:

  • Deep learning specialized hardware overview. We review the architecture of the most widely used deep learning acceleration hardware, including the main computational processors and memory modules.
  • How deep learning is performed on this hardware. We cover arithmetic intensity and the theoretical limits of computing on this hardware. Attendees will learn how to estimate processing time and latency from the hardware specs and the network architecture alone (a worked sketch follows this list).
  • Best practices for acceleration. We give an overview of best practices for designing efficient neural networks, including channel count selection, compute-heavy operations, and reduction operations, among others (see the channel-rounding sketch below).
  • Existing tools for model acceleration. Here we focus on existing tools for accelerating a trained neural network on GPU devices, in particular operation folding, TensorRT, ONNX graph optimization, and sparsity (see the export sketch below).
  • Research overview of recent techniques. In the last part, we focus on recent advanced techniques for post-training model optimization, including pruning, quantization, model distillation, and neural architecture search (NAS), among others (see the quantization sketch below).
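
To make the latency-estimation point concrete, below is a minimal roofline-style sketch for a single layer. The peak-compute and bandwidth numbers are illustrative assumptions, not figures taken from the tutorial, and the layer shape is an arbitrary example.

```python
# Roofline-style latency lower bound for one layer (illustrative numbers only).
PEAK_FLOPS = 150e12  # assumed FP16 peak: 150 TFLOP/s
PEAK_BW = 1.5e12     # assumed memory bandwidth: 1.5 TB/s

def layer_time_estimate(flops, bytes_moved):
    """A layer is limited by either compute throughput or memory traffic."""
    t_compute = flops / PEAK_FLOPS
    t_memory = bytes_moved / PEAK_BW
    intensity = flops / bytes_moved  # arithmetic intensity, FLOP/byte
    bound = "compute" if t_compute >= t_memory else "memory"
    return max(t_compute, t_memory), intensity, bound

# Example: 1x1 conv, 256 -> 256 channels, 56x56 feature map, batch 1, FP16.
flops = 2 * 56 * 56 * 256 * 256                    # 2 FLOPs per multiply-accumulate
bytes_moved = (2 * 56 * 56 * 256 + 256 * 256) * 2  # in/out activations + weights, 2 B each
t, ai, bound = layer_time_estimate(flops, bytes_moved)
print(f"~{t * 1e6:.1f} us, {ai:.0f} FLOP/byte, {bound}-bound")
```

With these assumed specs the ridge point is 100 FLOP/byte, so this example layer (about 123 FLOP/byte) is marginally compute-bound; halving the channel count would push it into the memory-bound regime.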
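One widely cited best practice of this kind is keeping channel counts at hardware-friendly multiples (for example, multiples of 8 for FP16 Tensor Core GEMMs); whether the tutorial presents exactly this rule is an assumption here.

```python
def round_channels(channels: int, multiple: int = 8) -> int:
    """Round a channel count up to the nearest multiple (assumed Tensor Core guideline)."""
    return ((channels + multiple - 1) // multiple) * multiple

assert round_channels(250) == 256
assert round_channels(256) == 256
```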
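As a taste of the tooling, here is a minimal sketch of exporting a PyTorch model to ONNX, the format that ONNX graph optimizers and TensorRT consume; the model choice and file name are placeholders.

```python
import torch
import torchvision

# Placeholder model; any traceable nn.Module works.
model = torchvision.models.resnet18(weights=None).eval()
dummy = torch.randn(1, 3, 224, 224)  # example input

torch.onnx.export(
    model, dummy, "resnet18.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}},  # keep batch size flexible
)
```

The exported graph can then be simplified (constant folding, conv+BN fusion) by ONNX tooling or compiled into a TensorRT engine, e.g. with `trtexec --onnx=resnet18.onnx`.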
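Of the post-training techniques listed, quantization is the easiest to sketch. The snippet below shows PyTorch's dynamic quantization, one common post-training recipe, not necessarily the exact one the tutorial covers.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512), torch.nn.ReLU(), torch.nn.Linear(512, 10)
).eval()

# Store Linear weights in int8; dequantize on the fly at inference time.
qmodel = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)
print(qmodel)  # Linear layers replaced by quantized equivalents
```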

Program

Monday, June 19th, 1:30pm to 5:00pm, Vancouver Convention Center. All times are local Vancouver time.

13:30-13:35  Opening Remarks
13:35-14:15  Jason Clemons: Foundations of GPU architecture, a hardware perspective
14:15-15:00  Pavlo Molchanov: DNN performance optimization, how to achieve more with less cost, a software perspective
15:00-15:30  Coffee Break
15:30-16:15  Maying Shen: Sparsity in DNNs and model compression
16:15-17:00  Hongxu (Danny) Yin: Recent trends in transformer acceleration, data efficiency, and security

Instructors

  • Jason Clemons
  • Pavlo Molchanov
  • Maying Shen
  • Hongxu (Danny) Yin

Organizers

  • Maying Shen, Senior Research Engineer
  • Jason Clemons, Senior Research Scientist
  • Hongxu (Danny) Yin, Senior Research Scientist
  • Pavlo Molchanov, Principal Research Scientist
  • Jose M. Alvarez, Director of Applied Research
  • Jan Kautz, VP of Research
