src: cpu: conv: Use acl_indirect_gemm for bf16 convolutions #1933

Merged (1 commit) on Jun 24, 2024

@Ryo-not-rio Ryo-not-rio commented May 28, 2024

Description

Use acl_indirect_gemm for bf16 convolutions

Performance improvements, measured with benchdnn:

Total benchdnn tests: 57
Min: 15x
Average: 131x
Max: 320x
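
The PR does not say how these summary figures were derived. As a minimal sketch, assuming each figure is a per-test ratio of baseline time to new time and that "Average" is an arithmetic mean, a summary like this could be computed as follows. The function name and the timing values are hypothetical and are not data from this PR.

```python
from statistics import mean


def speedup_summary(baseline_ms, optimized_ms):
    """Summarise per-test speedups (baseline time / optimized time).

    Assumption: both lists hold one timing per test, in the same order.
    """
    if not baseline_ms or len(baseline_ms) != len(optimized_ms):
        raise ValueError("need two non-empty timing lists of equal length")
    ratios = [b / o for b, o in zip(baseline_ms, optimized_ms)]
    return {
        "tests": len(ratios),
        "min": min(ratios),
        "average": mean(ratios),
        "max": max(ratios),
    }


# Hypothetical timings in milliseconds, for illustration only.
baseline_ms = [10.0, 40.0, 90.0]
optimized_ms = [1.0, 2.0, 3.0]

summary = speedup_summary(baseline_ms, optimized_ms)
print(f"Total tests: {summary['tests']}")
print(f"Min: {summary['min']:.0f}x")
print(f"Average: {summary['average']:.0f}x")
print(f"Max: {summary['max']:.0f}x")
```

An arithmetic mean of ratios is sensitive to a few very large speedups; a geometric mean is a common alternative when the ratios span a wide range.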

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

Performance improvements

  • Have you submitted performance data that demonstrates performance improvements?

@vpirogov vpirogov added this to the v3.6 milestone May 28, 2024
@mgouicem mgouicem added the platform:cpu-aarch64 Codeowner: @oneapi-src/onednn-cpu-aarch64 label May 30, 2024

@jondea jondea left a comment

Reviewed internally, LGTM


@jondea jondea left a comment

We need to amend the indirect/gemm heuristics to take into account bf16*bf16->bf16. Currently they only consider fast math.


jondea commented Jun 6, 2024

> We need to amend the indirect/gemm heuristics to take into account bf16*bf16->bf16. Currently they only consider fast math.

This has been fixed by #1948

@vpirogov vpirogov merged commit 3a05ca5 into oneapi-src:main Jun 24, 2024
11 checks passed
5 participants