MDEV-30307 KILL command inside a transaction causes problem for galer… #2397

Open
wants to merge 1 commit into
base: 10.4

sjaakola

…a replication

  • The Jira issue number for this PR is: MDEV-30307

Description

If a KILL command is submitted inside a transaction, Galera replication may have trouble resolving cluster conflicts. One failure scenario occurs when the transaction submitting the KILL command is itself a victim of a BF (brute-force) abort, which can leave the node hanging indefinitely.

The sql_kill() and sql_kill_user() functions now perform an implicit commit before starting KILL command execution. Because of the implicit commit, KILL no longer executes inside a transaction context.

How can this PR be tested?

Added a new test scenario to the galera.galera_bf_kill test to make the issue surface. The scenario has a multi-statement transaction containing a KILL command. When the KILL is submitted, another transaction is replicated, which causes a BF abort during KILL command processing. In this scenario, handling the BF abort rollback while executing the KILL command causes the node to hang.

Basing the PR against the correct MariaDB version

  • [ ] This is a new feature and the PR is based against the latest MariaDB development branch
  • [x] This is a bug fix and the PR is based against the earliest branch in which the bug can be reproduced

Backward compatibility

This change deviates from stand-alone MariaDB behavior. With the fix, the open transaction is implicitly committed before the KILL command executes.
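To make the compatibility impact concrete, here is a minimal toy model of the ordering described in this PR: commit first, then run KILL outside any transaction. It is a sketch only. `ToySession` and all of its members are hypothetical names invented for illustration and do not exist in the MariaDB source; the real change lives in sql_kill() and sql_kill_user().

```cpp
#include <string>
#include <vector>

// Toy model only: none of these names exist in the MariaDB source.
struct ToySession {
    bool in_transaction = false;
    std::vector<std::string> pending;    // changes not yet committed
    std::vector<std::string> committed;  // changes that survive a rollback

    void begin() { in_transaction = true; }

    void write(const std::string& row) {
        if (in_transaction) pending.push_back(row);
        else committed.push_back(row);   // autocommit outside a transaction
    }

    void commit() {
        committed.insert(committed.end(), pending.begin(), pending.end());
        pending.clear();
        in_transaction = false;
    }

    void rollback() {
        pending.clear();
        in_transaction = false;
    }

    // Models the ordering from the PR description: an implicit commit,
    // then KILL runs with no transaction open. Returns whether KILL ran
    // inside a transaction, which is always false in this model.
    bool kill(unsigned long target_id) {
        if (in_transaction) commit();    // implicit commit before KILL
        (void)target_id;                 // the actual kill is out of scope here
        return in_transaction;
    }
};
```

In this model, a session that calls `begin()`, writes a row, calls `kill()`, and then calls `rollback()` still ends up with the row committed, because the rollback arrives after the implicit commit. That is the user-visible difference from a server where KILL does not commit.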

@janlindstrom added the Codership Galera label on Mar 18, 2024