Releases: bdashore3/flash-attention

v2.6.3

26 Jul 00:14

Synced to the upstream version.

NOTE: Backward and dropout are disabled, meaning that this release is INFERENCE ONLY.

This is because including these features more than doubles the build time and makes the GitHub Action time out. If you want these features, please raise an issue on the parent repo to help reduce the build times.
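Since backward and dropout are compiled out, usage should stay inference-only. A minimal sketch under that assumption, using the standard flash_attn Python API (shapes are illustrative):

```python
# Inference-only use of this build. Backward is disabled, so keep calls
# under torch.no_grad(); dropout is disabled, so dropout_p must stay 0.0.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 1024, 8, 64
q = torch.randn(batch, seqlen, nheads, headdim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

with torch.no_grad():
    out = flash_attn_func(q, k, v, dropout_p=0.0, causal=True)
print(out.shape)  # (batch, seqlen, nheads, headdim)
```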

v2.6.1

12 Jul 00:06
Actions: Switch to CUDA 12.3

Signed-off-by: kingbri <[email protected]>

v2.5.9.post2

09 Jul 23:28
Pre-release

A quick release to add the softcapping commits. Does not include backward, dropout, or ALiBi support.
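For reference, a minimal sketch of the softcapping path, assuming the `softcap` keyword that the upstream flash_attn API exposes for this feature (the cap value 30.0 is only an example):

```python
# Softcapping sketch: with softcap > 0, attention scores are capped as
# softcap * tanh(scores / softcap) before the softmax.
# Assumes the upstream flash_attn API's `softcap` keyword.
import torch
from flash_attn import flash_attn_func

q = torch.randn(1, 512, 8, 64, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True, softcap=30.0)
```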

v2.5.9.post1

28 May 01:45
Actions: Clarify dispatch formatting

Signed-off-by: kingbri <[email protected]>

v2.5.8

28 Apr 07:55

Same as the upstream tag.

Now built only for torch 2.2.2 and 2.3.0.

v2.5.6

30 Mar 20:34

v2.5.2

07 Feb 22:37

Same as the upstream tag

Adds a PR to help fix building on Windows.

v2.4.2

03 Feb 00:29

In line with the parent repo's tag.

Built for CUDA 12.x and PyTorch 2.1.2 and 2.2.

v2.4.3 and up cannot be built on Windows at this time.

v2.4.1

25 Dec 06:26
Add Windows workflows

2.3.3-windows

18 Nov 23:37

In parity with the original tag

Built with PyTorch 2.1.1 and CUDA 12.2. This wheel will work with PyTorch 2.1+ and CUDA 12+.
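Before installing, a quick sanity check (a convenience sketch, not part of the release) that the local PyTorch and CUDA versions fall in that range:

```python
# Environment sanity check before installing the prebuilt wheel.
# Expects PyTorch 2.1+ built against CUDA 12.x.
import torch

print(torch.__version__)   # expect 2.1 or newer
print(torch.version.cuda)  # expect "12.x"
assert torch.cuda.is_available(), "A CUDA device is required for flash-attention"
```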

Full Changelog: https://github.com/bdashore3/flash-attention/commits/2.3.3