Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDEV-28452 wsrep_ready: OFF after MDL BF-BF conflict #426

Open
wants to merge 4 commits into
base: 10.6
Choose a base branch
from

Conversation

plampio
Copy link

@plampio plampio commented May 13, 2024

Galera Cluster failed when two transactions that were executed
serially on a master node were incorrectly executed in parallel on a
slave node. The slave node detected MDL BF-BF conflict and quit the
cluster.

The problem was caused by OPTIMIZE TABLE on a table that is the child
table of a foreign key constraint. If UPDATE was performed on the
parent table of the foreign key constraint while OPTIMIZE TABLE was
running on the child table, the master node would run the transactions
serially, but a slave node might run them in parallel resulting in MDL
BF-BF conflict. The problem is fixed by adding foreign key constraint
to the replicated write set preventing parallel apply of the
transactions on the slave node. The code fixing this issue for
OPTIMIZE TABLE is similar to the code for ALTER TABLE.

Galera Cluster failed when two transactions that were executed
serially on a master node were incorrectly executed in parallel on a
slave node. The slave node detected MDL BF-BF conflict and quit the
cluster.

The problem was caused by OPTIMIZE TABLE on a table that is the child
table of a foreign key constraint. If UPDATE was performed on the
parent table of the foreign key constraint while OPTIMIZE TABLE was
running on the child table, the master node would run the transactions
serially, but a slave node might run them in parallel resulting in MDL
BF-BF conflict. The problem is fixed by adding foreign key constraint
to the replicated write set preventing parallel apply of the
transactions on the slave node.
Removed debug messages.
@plampio plampio requested review from sciascid and temeo May 13, 2024 08:37
Copy link

@janlindstrom janlindstrom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test case needs work to remove unnecessary sleeps by replacing them with wait_condition or debug_sync WAIT_FOR/SIGNAL controlling.

# 1) OPTIMIZE TABLE on the child table of the foreign key constraint,
--connect node_1a, 127.0.0.1, root, , test, $NODE_MYPORT_1
--connection node_1a
--sleep 5

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really need this sleep? I suggest using wait_condition instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, this sleep is not needed.
Removed the sleep.


# create two tables
CREATE TABLE `user` (
`id` char(36) COLLATE utf8mb4_unicode_ci NOT NULL,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is these table column names directly from customer and do you really need all columns?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These column names are indeed copied exactly from the customer case.
I have now removed all those columns that are not part of the keys and, thus, do not affect the test in any way.
But it would be possible to create a minimal test case with table names t1 and t2 and with just two columns per table.


# allow the stopped OPTIMIZE TABLE transaction to proceed
--connection node_2a
--sleep 5

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These sleeps make test slow, there is mechanism to wait until we have reached debug sync by sending messages and then using WAIT_FOR e.g see test MW-369.inc

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not been able to replace this sleep with other synchronization mechanism, such as WAIT_FOR.
But I commented out the sleep and the test seems to work without it.


# cleanup
--connection node_1
--sleep 5

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove all sleeps and rely on either wait_condition (using processlist) or debug_sync WAIT_FOR/SIGNAL

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now removed two of the three sleeps in the MTR test.
Removing the last sleep is more tricky.

DBUG_EXECUTE_IF("sync.mdev_28452",
{
const char act[]=
"now "

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you could signal that we have reached this code by now signal mdev_28452_reached and you could wait that on test case.

sql/sql_admin.cc Outdated
}
}
}
#else
WSREP_TO_ISOLATION_BEGIN_WRTCHK(NULL, NULL, first_table);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line WSREP_TO_ISOLATION_BEGIN_WRTCHK() should be removed. It is replaced by your code above.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right.
Fixed as suggested.

there. Variables are reset back in THD::reset_for_next_command() before
processing of next command.
*/
if (wsrep_auto_increment_control)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why we need to change auto_increment variables here.
What happens if you don't?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested also without changing auto_increment_* variables and the MTR tests succeeded.
I don't understand what these variable do, but the code segment was copied from
Sql_cmd_alter_table::execute() (in sql/sql_admin.cc) and this fix is very similar to that.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimize table could do table re-create, but I do not know if that has any effect on autoinc values (could be verified using show create table x; in both nodes).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants