When applied to Stable Diffusion, our agent attention accelerates generation and substantially enhances image generation quality without any additional training.
In practice, we apply agent attention to the ToMeSD model. ToMeSD merges tokens before the attention computation in Stable Diffusion, which improves generation speed. However, the post-merge token count remains considerable, so attention is still costly in both computation and latency. We therefore replace the Softmax attention in the ToMeSD model with our agent attention to obtain a further speedup.
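To make the replacement concrete, below is a minimal single-head NumPy sketch of agent attention as described in the paper: a small set of agent tokens (pooled from the queries) first aggregates information from all keys/values, then broadcasts it back to every query, replacing one O(N²) softmax attention with two O(N·n) ones. The pooling scheme and function names here are illustrative, not the repo's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def agent_attention(Q, K, V, n_agents):
    """Single-head agent attention sketch: n_agents << N."""
    N, d = Q.shape
    # Agent tokens: pool the N queries down to n_agents tokens
    # (simple chunked mean pooling for illustration).
    chunks = np.array_split(np.arange(N), n_agents)
    A = np.stack([Q[c].mean(axis=0) for c in chunks])   # (n, d)
    # Agent aggregation: agents attend to all keys/values, O(N * n).
    V_a = softmax(A @ K.T / np.sqrt(d)) @ V             # (n, d)
    # Agent broadcast: queries attend to the agent tokens, O(N * n).
    return softmax(Q @ A.T / np.sqrt(d)) @ V_a          # (N, d)

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((64, 16)) for _ in range(3))
out = agent_attention(Q, K, V, n_agents=8)
print(out.shape)  # (64, 16)
```

With n_agents fixed, cost grows linearly in the token count N, which is why agent attention stays cheap even when ToMeSD's merging leaves many tokens behind.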
Below are FID scores, generation time, and memory usage (lower is better) when using Stable Diffusion v1.5 to generate 2,000 images:
| Method | r% | FID ↓ | Time (s/im) ↓ | Memory (GB/im) ↓ |
| --- | --- | --- | --- | --- |
| Stable Diffusion v1.5 | 0 | 28.84 (baseline) | 2.62 (baseline) | 3.13 (baseline) |
| ToMeSD | 10 | 28.64 | 2.40 | 2.55 |
| ToMeSD | 20 | 28.68 | 2.15 | 2.03 |
| ToMeSD | 30 | 28.82 | 1.90 | 2.09 |
| ToMeSD | 40 | 28.74 | 1.71 | 1.69 |
| ToMeSD | 50 | 29.01 | 1.53 | 1.47 |
| AgentSD | 10 | 27.79 (↓1.05 better) | 1.97 (1.33x faster) | 1.77 (1.77x less) |
| AgentSD | 20 | 27.77 (↓1.07 better) | 1.80 (1.45x faster) | 1.60 (1.95x less) |
| AgentSD | 30 | 28.03 (↓0.81 better) | 1.65 (1.59x faster) | 2.05 (1.53x less) |
| AgentSD | 40 | 28.15 (↓0.69 better) | 1.54 (1.70x faster) | 1.55 (2.02x less) |
| AgentSD | 50 | 28.42 (↓0.42 better) | 1.42 (1.84x faster) | 1.21 (2.59x less) |
ToMeSD accelerates Stable Diffusion while maintaining similar image quality. AgentSD not only further accelerates ToMeSD but also significantly enhances image generation quality without extra training!
- PyTorch >= 1.12.1
Place the agentsd folder in your project and apply AgentSD to any Stable Diffusion model with:
```python
import agentsd

if step == 0:
    # Apply agent attention and ToMe during the first 20 diffusion steps
    agentsd.remove_patch(model)
    agentsd.apply_patch(model, sx=4, sy=4, ratio=0.4, agent_ratio=0.95)
elif step == 20:
    # Apply ToMe only in the later diffusion steps
    agentsd.remove_patch(model)
    agentsd.apply_patch(model, sx=2, sy=2, ratio=0.4, agent_ratio=0.5)
```
To apply AgentSD to the SDv1 PLMS sampler, add the following at this line:
```python
import agentsd

if i == 0:
    agentsd.remove_patch(self.model)
    agentsd.apply_patch(self.model, sx=4, sy=4, ratio=0.4, agent_ratio=0.95)
elif i == 20:
    agentsd.remove_patch(self.model)
    agentsd.apply_patch(self.model, sx=2, sy=2, ratio=0.4, agent_ratio=0.5)
```
To apply AgentSD to the SDv2 DDIM sampler, add the following at this line (setting `attn_precision="fp32"` to avoid numerical instabilities with the v2.1 model):
```python
import agentsd

if i == 0:
    agentsd.remove_patch(self.model)
    agentsd.apply_patch(self.model, sx=4, sy=4, ratio=0.4, agent_ratio=0.95, attn_precision="fp32")
elif i == 20:
    agentsd.remove_patch(self.model)
    agentsd.apply_patch(self.model, sx=2, sy=2, ratio=0.4, agent_ratio=0.5, attn_precision="fp32")
```
If you find this repo helpful, please consider citing us.
```bibtex
@article{han2023agent,
  title={Agent Attention: On the Integration of Softmax and Linear Attention},
  author={Han, Dongchen and Ye, Tianzhu and Han, Yizeng and Xia, Zhuofan and Song, Shiji and Huang, Gao},
  journal={arXiv preprint arXiv:2312.08874},
  year={2023}
}
```