
feat: DiTFastAttn for PixArt #297

Merged (2 commits) · Oct 8, 2024
Conversation

ZDJeffrey (Contributor)

Summary

DiTFastAttn is an attention compression method for Diffusion Transformer (DiT) models. It exploits the redundancy in DiT attention and introduces several compression methods for self-attention to accelerate inference on a single GPU.
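
For intuition, here is a minimal, hypothetical sketch of one such compressed attention variant (a banded "window" attention compared against full attention). It is not the code from this PR, only an illustration of replacing full self-attention with a cheaper variant where the outputs stay close:

```python
# Illustrative sketch only, not the PR's implementation.
import torch
import torch.nn.functional as F

def window_attention(q, k, v, window_size):
    # q, k, v: (batch, heads, seq_len, head_dim); each query attends only to
    # keys within +/- window_size positions instead of the full sequence.
    idx = torch.arange(q.shape[-2], device=q.device)
    mask = (idx[None, :] - idx[:, None]).abs() <= window_size
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

q = k = v = torch.randn(1, 8, 256, 64)
out_full = F.scaled_dot_product_attention(q, k, v)
out_win = window_attention(q, k, v, window_size=32)
# If the relative difference is small, the cheaper variant can stand in for
# full attention on this layer/timestep.
print(((out_full - out_win).norm() / out_full.norm()).item())
```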

Implementation

  • A new config class and a new argument group for DiTFastAttn are added in xfuser/config/.
  • Following the implementation of Long Context Attention, the DiTFastAttn module is implemented in xfuser/core/fast_attention.
  • Since it can only be used with data parallelism, the attention processor is implemented independently instead of heavily modifying the original attention processor in xfuser/model_executor/layers/attention_processor.py.
  • Before DiTFastAttn can work, the compression methods need to be selected. Therefore, when prepare_run is called, the compression methods are set if DiTFastAttn is enabled (a rough sketch follows this list).
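
A rough sketch of the wiring described above, with hypothetical names (FastAttnConfig, install_fast_attn_processor, and select_compression_methods are illustrative placeholders, not the PR's actual API):

```python
from dataclasses import dataclass

@dataclass
class FastAttnConfig:
    """Illustrative stand-in for the new DiTFastAttn config class."""
    use_fast_attn: bool = False
    n_calib: int = 8
    threshold: float = 0.15
    window_size: int = 64
    use_cache: bool = True

def install_fast_attn_processor(pipeline, config):
    # Placeholder: swap each self-attention processor for the standalone
    # DiTFastAttn processor, kept separate from attention_processor.py.
    pass

def select_compression_methods(pipeline, config):
    # Placeholder: run config.n_calib calibration prompts and choose a
    # compression method per layer/timestep under config.threshold.
    pass

def prepare_run(pipeline, config, parallel_is_data_only: bool):
    """DiTFastAttn is wired up only when enabled, and only under pure
    data parallelism; otherwise an error is raised."""
    if not config.use_fast_attn:
        return
    if not parallel_is_data_only:
        raise RuntimeError("DiTFastAttn only supports data parallelism")
    install_fast_attn_processor(pipeline, config)
    select_compression_methods(pipeline, config)
```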

How to use

  • To use DiTFastAttn, the following arguments are required (a selection sketch follows this list):
    • use_fast_attn: enables fast attention.
    • n_calib: number of prompts used for compression method selection.
    • threshold: threshold for selecting the attention compression method; it effectively determines the compression ratio.
    • window_size: size of the attention window. According to the paper, a window size of 1/8 of the token length is recommended.
  • When using DiTFastAttn with a model for the first time, an additional argument must be set for the calibration:
    • coco_path: path to an MS COCO annotation JSON file (e.g. captions_val2014.json from the official site). The captions in this file are sampled as calibration prompts for the compression.
  • After calibration, the selected methods are saved to a JSON file in the cache folder. This file is reloaded for the same model with the same arguments when use_cache is set.
  • Note: DiTFastAttn can only be used with data parallelism. If any other parallelism method is used, the program raises an error.
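
A minimal sketch of how the threshold-driven selection described above could work (the helper names, the candidate list, and the cache file name are assumptions for illustration, not the PR's API): for each layer, the cheapest candidate whose output stays within the threshold of full attention on the calibration prompts is kept, and the resulting plan is written to a JSON cache that use_cache can reload.

```python
import json
import torch
import torch.nn.functional as F

def window_attention(q, k, v, window_size):
    idx = torch.arange(q.shape[-2], device=q.device)
    mask = (idx[None, :] - idx[:, None]).abs() <= window_size
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

def relative_error(approx, ref):
    return ((approx - ref).norm() / ref.norm()).item()

def select_method(q, k, v, candidates, threshold):
    """Return the first (cheapest) candidate within `threshold` relative error
    of full attention; fall back to full attention otherwise."""
    ref = F.scaled_dot_product_attention(q, k, v)
    for name, fn in candidates:  # ordered cheapest-first
        if relative_error(fn(q, k, v), ref) <= threshold:
            return name
    return "full_attn"

# Hypothetical usage on one calibration batch (q, k, v would come from running
# the n_calib prompts through the model).
q = k = v = torch.randn(1, 8, 256, 64)
candidates = [("window_attn",
               lambda q, k, v: window_attention(q, k, v, window_size=32))]
plan = {"layer_0": select_method(q, k, v, candidates, threshold=0.15)}
with open("fast_attn_cache.json", "w") as f:
    json.dump(plan, f)  # what a use_cache-style flag would reload later
```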

Test

So far, only the implementation of DiTFastAttn for the PixArt models is done. I have tested it with data parallelism on PixArt-alpha/PixArt-Sigma-XL-2-1024-MS and PixArt-alpha/PixArt-XL-2-1024-MS. Since PixArt-alpha/PixArt-Sigma-XL-2-2K-MS is currently unavailable on Hugging Face, I have not tested it yet.

WIP

Implementation of DiTFastAttn for other models is still in progress.
Benchmarking of DiTFastAttn is not done yet; I will do it after the implementation for the other models is complete.

ZDJeffrey commented Oct 8, 2024

Image Compare

PixArt-alpha/PixArt-XL-2-1024-MS

  • origin: epoch time: 3.81 sec, memory: 15.512717312 GB (images: origin1, origin2)
  • DiTFastAttn (threshold=0.15): epoch time: 3.17 sec, memory: 16.547694592 GB (images: fastattn1, fastattn2)

@Eigensystem (Collaborator) left a comment:


LGTM. Elegant code!

@xibosun (Collaborator) left a comment:


The PR defines FastAttnState, with an implementation similar to the existing state classes. Most of the DiTFastAttn code lives in xfuser/core/fast_attention, with only minor modifications to existing code. Overall, the code is correct and elegant.

@feifeibear merged commit ae504d6 into xdit-project:main on Oct 8, 2024
3 checks passed