Resolve issues between kv_cache and flash attention. #1178

Merged

Commits on Mar 8, 2024

  1. When using kv cache and flash attention in conjunction, it's crucial to set the causal parameter of flash_varlen_qkv_fn to False. Failing to do so will lead to inaccurate results. (A sketch of this call pattern follows the commit details below.)
    chaochen99 committed Mar 8, 2024
    883d04d
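
For context, the decoding path the commit message describes might look roughly like the sketch below. This is a minimal illustration, not code from the PR's diff: it assumes that flash_varlen_qkv_fn is a thin wrapper around flash_attn.flash_attn_varlen_func, that q/k/v are packed as (total_tokens, n_heads, head_dim) tensors in fp16/bf16 on GPU, and that the KV cache is carried as plain tensors. The decode_step name and all shapes are hypothetical.

```python
# Minimal sketch of a single decode step with a KV cache (hypothetical helper,
# not taken from this repository). Assumes flash-attn 2.x and CUDA tensors in
# fp16/bf16 with layout (total_tokens, n_heads, head_dim).
import torch
from flash_attn import flash_attn_varlen_func


def decode_step(q_new, k_new, v_new, k_cache, v_cache):
    """Attend one new query token against the full KV cache.

    q_new, k_new, v_new: (1, n_heads, head_dim)  projections of the new token
    k_cache, v_cache:    (t, n_heads, head_dim)  keys/values from earlier steps
    """
    # Append the new key/value to the cache.
    k_all = torch.cat([k_cache, k_new], dim=0)
    v_all = torch.cat([v_cache, v_new], dim=0)

    seqlen_q, seqlen_k = q_new.shape[0], k_all.shape[0]
    device = q_new.device
    cu_seqlens_q = torch.tensor([0, seqlen_q], dtype=torch.int32, device=device)
    cu_seqlens_k = torch.tensor([0, seqlen_k], dtype=torch.int32, device=device)

    # causal=False: the single query token should attend to every cached
    # position. Depending on how the installed flash-attn version aligns the
    # causal mask when seqlen_q != seqlen_k, causal=True can mask out cached
    # keys for the lone query, which is the inaccuracy the commit refers to.
    out = flash_attn_varlen_func(
        q_new, k_all, v_all,
        cu_seqlens_q, cu_seqlens_k,
        max_seqlen_q=seqlen_q, max_seqlen_k=seqlen_k,
        causal=False,
    )
    return out, k_all, v_all
```

The essential point is the causal=False argument on the cached-decoding path: causal masking is only needed when the full sequence is processed at once, whereas a one-token query against the accumulated cache should see all prior positions.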