Forward hooks not called when fast path is used in TransformerEncoderLayer #128413

Closed
iibrahimli opened this issue Jun 11, 2024 · 1 comment
Labels
oncall: transformer/mha Issues related to Transformers and MultiheadAttention

Comments

@iibrahimli
Contributor

iibrahimli commented Jun 11, 2024

🐛 Describe the bug

When TransformerEncoderLayer is run in evaluation mode and a few conditions are met, the fast path (a fused, optimized implementation) is used instead of calling submodules such as MultiheadAttention (e.g. self.self_attn). As a result, any forward hooks or pre-hooks registered on those submodules are never called.

import torch
from torch import nn


cache = []


# forward hook to save output
def hook(module, inputs, output):
    cache.append(output[0].detach())


enc_layer = nn.TransformerEncoderLayer(d_model=32, nhead=8, batch_first=True)
enc_layer.eval()

# register hook to get the output of the self-attention layer
handle = enc_layer.self_attn.register_forward_hook(hook)

# input tensor of shape (batch_size, seq_len, d_model)
x = torch.randn(4, 6, 32)

# forward pass
with torch.inference_mode():
    output = enc_layer(x)

# output of the self-attention layer
assert len(cache) == 1, f"Expected 1 output, got {len(cache)}"
print(cache[0].shape)

# remove hook
handle.remove()

In this example we would expect cache to contain the self-attention output, but it does not: the fast path bypasses self.self_attn entirely, so the hook never fires and the assert fails. However, if any fast-path condition is violated, e.g. by using an odd number of attention heads such as nhead=3 (with a d_model divisible by it), the fast path is not taken and the hook is called as expected. Keeping the layer in training mode also avoids the fast path, as in the sketch below.
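As a temporary workaround (a sketch, not part of the original report): the fast path is only taken when the layer is in eval mode, so staying in training mode forces the regular module-by-module path and the hook fires. Dropout is set to 0.0 here so the output matches eval-mode behavior.

import torch
from torch import nn

cache = []

def hook(module, inputs, output):
    cache.append(output[0].detach())

# dropout=0.0 so training mode produces the same output as eval mode
enc_layer = nn.TransformerEncoderLayer(d_model=32, nhead=8, dropout=0.0, batch_first=True)
enc_layer.train()  # staying in training mode disables the fast path

handle = enc_layer.self_attn.register_forward_hook(hook)

x = torch.randn(4, 6, 32)
with torch.inference_mode():
    output = enc_layer(x)

assert len(cache) == 1  # the hook fires on the regular path
print(cache[0].shape)   # torch.Size([4, 6, 32])
handle.remove()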

Versions

PyTorch version: 2.2.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.5 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: version 3.23.1
Libc version: N/A

Python version: 3.12.0 (main, Oct 5 2023, 15:44:07) [Clang 14.0.3 (clang-1403.0.22.14.1)] (64-bit runtime)
Python platform: macOS-14.5-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M1

Versions of relevant libraries:
[pip3] mypy==1.8.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] pytorch-lightning==2.2.0.post0
[pip3] torch==2.2.0
[pip3] torchmetrics==1.3.1
[pip3] torchviz==0.0.2
[conda] numpy 1.22.3 py39h64940a9_2 conda-forge
[conda] pytorch 1.11.0 cpu_py39h03f923b_1 conda-forge
[conda] torchdata 0.3.0 pyhd8ed1ab_0 conda-forge

cc @jbschlosser @bhosmer @cpuhrsch @erichan1 @drisspg @mikaylagawarecki

@iibrahimli changed the title from "Forward hooks not called when fast path is used in TransformerEncoderLayer and TransformerDecoderLayer" to "Forward hooks not called when fast path is used in TransformerEncoderLayer" on Jun 11, 2024
@iibrahimli
Contributor Author

My proposed solution would be to fall back from the fast path whenever forward pre-hooks or forward hooks are registered on any submodule of the layer. I have started working on it: #128415
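For context, a minimal sketch of the kind of check this implies (illustrative only, not the actual change in the PR; it relies on the private _forward_hooks / _forward_pre_hooks dicts of nn.Module and ignores global hooks):

from torch import nn

def any_hooks_registered(layer: nn.Module) -> bool:
    # True if the layer or any of its submodules has a forward hook
    # or forward pre-hook registered
    return any(
        len(m._forward_hooks) > 0 or len(m._forward_pre_hooks) > 0
        for m in layer.modules()
    )

# Inside TransformerEncoderLayer.forward, the fast path would then only be
# taken when this check is False in addition to the existing conditions;
# otherwise the layer would fall back to the regular path so hooks are honored.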

@colesbury added the oncall: transformer/mha label on Jun 12, 2024