Pulse · ROCm/hipBLASLt

November 16, 2024 – November 23, 2024

Overview

36 Active pull requests
- 22 Merged pull requests
- 14 Open pull requests

0 Active issues
- 0 Closed issues
- 0 New issues

22 Pull requests merged by 15 people

Update 35 Equality logic yaml sizes.
#1378 merged Nov 23, 2024
update 38 Equality logic yaml sizes
#1372 merged Nov 23, 2024
gfx942 38cu F8BS NN TN NT grid tune
#1345 merged Nov 22, 2024
gfx942 38cu HHS BBS NN TN NT grid tune
#1331 merged Nov 22, 2024
Fix invalid stream-k test case, make dynamic grid the default
#1359 merged Nov 22, 2024
update gfx942 xf32 freesize
#1375 merged Nov 22, 2024
Add gfx942 xf32 NN/NT/TN Equality yamls for 1105 xf32
#1376 merged Nov 22, 2024
Fix: incorrect required workspace size for singleKernel GSU
#1371 merged Nov 22, 2024
gridbased search for batched gemm
#1362 merged Nov 22, 2024
Add profiling to TensileCreateLibrary
#1329 merged Nov 22, 2024
Set Python_ROOT virtual.env
#1344 merged Nov 21, 2024
Fixing and adding test for DepthU=48
#1303 merged Nov 20, 2024
[Hotfix] Disable setOccupancyLimit for gfx120X
#1368 merged Nov 20, 2024
Remove alias for MirrorDims in logic yaml
#1361 merged Nov 20, 2024
GFX942 equality tuning for F8HS and F8B8HS for TN,NT,NN
#1302 merged Nov 19, 2024
Add setOccupancyLimit
#1364 merged Nov 19, 2024
Remove Min/Max/TotalVgprNumber in Common.py
#1355 merged Nov 19, 2024
gfx12 - change to use byte_sel modifier for v_cvt_f32_fp8 and v_cvt_f…
#1172 merged Nov 19, 2024
Add profile logging and standardize scaleA and scaleB datatypes
#1275 merged Nov 18, 2024
Fix F8/BF8 failed cases for GWVW=8 and Beta != 0
#1333 merged Nov 18, 2024
adding bpl64 support to addLdsLoad (for Bias and scaleAlphaVector)
#1336 merged Nov 18, 2024
Add sgpr occupancy
#1349 merged Nov 18, 2024

14 Pull requests opened by 13 people

Modify to check if alpha is in host memory.
#1356 opened Nov 18, 2024
Refactoy the pack scheduling for scheduleIterAlg = 3.
#1358 opened Nov 18, 2024
Fix F32 FMAC Perf Bugs for gfx11/12
#1360 opened Nov 19, 2024
Bump rocm-docs-core from 1.8.3 to 1.8.5 in /docs/sphinx
#1363 opened Nov 19, 2024
Library Logic Format Simplification
#1365 opened Nov 19, 2024
Remove PackageLibrary option
#1367 opened Nov 20, 2024
[Sparse] fix sparse kernel generation failure
#1369 opened Nov 20, 2024
Update 12 Equality logic yamls.
#1370 opened Nov 20, 2024
Avoid divide by 0 when calculating predicted performance with streamk
#1373 opened Nov 21, 2024
Code object compression via bundling
#1374 opened Nov 22, 2024
[Experimental] hipBLASLt tensor swizzling integration
#1377 opened Nov 22, 2024
Fp8 tuning upstream
#1380 opened Nov 22, 2024
Find python
#1381 opened Nov 22, 2024
Logic fix to exclude streamk by default
#1382 opened Nov 22, 2024

14 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Adding HostLibraryTests back to hipBLASLt
#1147 commented on Nov 20, 2024 • 10 new comments
Add support for fallback from compute type f16 to f32
#1263 commented on Nov 19, 2024 • 8 new comments
feature: DTV with Swizzling (tensorA)
#1246 commented on Nov 22, 2024 • 1 new comment
Enable variant builds via device ID and cu count
#1222 commented on Nov 18, 2024 • 0 new comments
dot2 fp16 mac kernel for gfx942
#1258 commented on Nov 23, 2024 • 0 new comments
Tune Aquavanjaram942X F8F8S Equality TN 1 GEMM size
#1270 commented on Nov 20, 2024 • 0 new comments
Static build
#1283 commented on Nov 23, 2024 • 0 new comments
Check arguments in yaml file, abort if not recognized.
#1294 commented on Nov 19, 2024 • 0 new comments
Tune Aldebaran BF16 NN TN NT GEMM sizes
#1323 commented on Nov 21, 2024 • 0 new comments
Add --experimental flag to TensileCreateLibrary
#1328 commented on Nov 22, 2024 • 0 new comments
Tune Aquavanjaram 942 20CU HHS NN and TN GEMM sizes tuning in equality library
#1330 commented on Nov 21, 2024 • 0 new comments
[CQE only] gfx12 - change to use byte_sel modifier for v_cvt_f32_fp8 and v_cvt_f…
#1339 commented on Nov 18, 2024 • 0 new comments
Add initial optional stream-k libraries
#1347 commented on Nov 21, 2024 • 0 new comments
[OPT] Optimize tail loop
#1353 commented on Nov 22, 2024 • 0 new comments
