
12.3 mdev 39429#5015

Open
janlindstrom wants to merge 2 commits into MariaDB:12.3 from mariadb-corporation:12.3-MDEV-39429

@janlindstrom janlindstrom commented Apr 29, 2026

  • Fix a regression caused by the RESET MASTER change, where RESET MASTER is not allowed while binlog threads are active
  • Test-case-only changes to: galera.MDEV-3901, galera.galera_as_slave_replay, galera_slave_replay, galera.MDEV-39011
  • Fix result set differences in galera.galera_defaults and wsrep.variables_debug (test only)
  • Fix usage of SHOW BINLOG EVENTS (caused by binlog-in-engine)
  • Test-case-only changes to: galera.galera_forced_binlog_format, galera.rpl_row_annotate, galera_sr.MDEV-18585, galera_sr.galera_sr_gtid, galera_sr.galera_sr_log_bin, galera_sr.mysql-wsrep-features#136
  • Test case stabilization: galera.GCF-360, galera.MDEV-27862, galera_3nodes.galera_vote_majority_dml, galera_3nodes.galera_sst_stateless, galera_sr.mysql-wsrep-features#8

@janlindstrom janlindstrom self-assigned this Apr 29, 2026

@dr-m dr-m left a comment

I tried to run the tests, but everything is failing for me:

mysql-test/mtr --big-test --parallel=auto --suite=galera,galera_sr,galera_3nodes,wsrep --force --max-test-fail=0

I am getting errors from wsrep_sst_rsync, I guess in every single Galera test. In this environment, the same command works on the 11.4 branch (more than half of the exactly 700 tests have completed without any failure; the rest are still pending).


dr-m commented Apr 29, 2026

I unbroke the tests by reverting #4923:

git revert 6efb70285b908be57cc50934a6c8f994773d7583


@dr-m dr-m left a comment

I tested these changes on my local system. Two tests failed because I had built with cmake -DPLUGIN_PERFSCHEMA=NO. Those failures would be fixed by the following:

diff --git a/mysql-test/suite/galera/t/mdev-36554.test b/mysql-test/suite/galera/t/mdev-36554.test
index 255c2b9cffe..09119a48996 100644
--- a/mysql-test/suite/galera/t/mdev-36554.test
+++ b/mysql-test/suite/galera/t/mdev-36554.test
@@ -3,6 +3,7 @@
 #
 
 --source include/galera_cluster.inc
+--source include/have_perfschema.inc
 
 CALL mtr.add_suppression("Event .* Update_rows.* apply failed");
 CALL mtr.add_suppression("mariadbd: Can't find record in 't1'");
diff --git a/mysql-test/suite/wsrep/t/wsrep_off.test b/mysql-test/suite/wsrep/t/wsrep_off.test
index 27e64c92e93..6d5f76c8424 100644
--- a/mysql-test/suite/wsrep/t/wsrep_off.test
+++ b/mysql-test/suite/wsrep/t/wsrep_off.test
@@ -1,4 +1,5 @@
 --source include/have_innodb.inc
+--source include/have_perfschema.inc
 --source include/have_wsrep_provider.inc
 --source include/have_binlog_format_row.inc
 

Please apply them to the earliest applicable branch. While our CI does not run any tests with this configuration, some developers do.

The tests wsrep.variables and wsrep.wsrep_provider_plugin_defaults would fail with a result difference, because I was testing with galera-4 from Debian downstream, version 26.4.25-2, instead of whatever this was tested against.

Comment on lines 5413 to 5418
 	if (opt_galera_info) {
 		if (!write_galera_info(backup_datasinks.m_data,
-				       mysql_connection)) {
+				       m_bs_con)) {
 			return(false);
 		}
 	}
I tested reverting this patch, and I got the following result: a hang of the test galera_3nodes.galera_gtid_consistency (not sure if it is related to this reversion), as well as the following type of failure in galera.galera_log_bin_ext_mariabackup, galera.galera_sst_mariabackup_gtid, galera.galera_ist_MDEV-28423:

CURRENT_TEST: galera.galera_log_bin_ext_mariabackup
mysqltest: In included file "./include/galera_wait_ready.inc": 
included from /mariadb/main/mysql-test/suite/galera/include/start_mysqld.inc at line 16:
included from /mariadb/main/mysql-test/suite/galera/t/galera_log_bin_sst.inc at line 68:
included from /mariadb/main/mysql-test/suite/galera/t/galera_log_bin_ext_mariabackup.test at line 3:
At line 28: "Server did not transition to READY state"

as well as the following:

CURRENT_TEST: galera_3nodes.galera_gtid_consistency
mysqltest: In included file "./include/wait_until_connected_again.inc": 
included from ./include/start_mysqld.inc at line 49:
included from /mariadb/main/mysql-test/suite/galera_3nodes/t/galera_gtid_consistency.test at line 120:
At line 44: Server failed to restart

So, this code change definitely is a necessary fixup after 794b1d0 which had been reviewed in #3775.

@janlindstrom janlindstrom enabled auto-merge (rebase) April 29, 2026 08:28

@dr-m dr-m left a comment

Because this needs to be merged before the pending release, I am sending my approval. The script permission issue was fixed in #5017.

This regression was caused by commit 794b1d0
"Binlog-in-engine: New binlog implementation integrated in InnoDB".
Mariabackup requests the BACKUP STAGE BLOCK_COMMIT MDL lock
using the m_bs_con connection. Because wsrep is enabled,
write_galera_info is called, and it used mysql_connection.
Note that m_bs_con and mysql_connection are different
connections. write_galera_info calls write_current_binlog_file,
which executes FLUSH BINARY LOGS. In reload_acl_and_cache
the MDL_BACKUP_START MDL lock is requested. Because a different
THD already holds the conflicting BLOCK_COMMIT MDL lock,
the request has to wait. The wait ends in a timeout and the backup
fails, so the mariabackup SST fails and the node does not join
the cluster.

Fixed by using the same connection for write_galera_info as for
BACKUP STAGE BLOCK_COMMIT, i.e. m_bs_con.
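
The lock wait described in the commit message can be illustrated with a toy model. This is only a minimal sketch and not MariaDB's MDL implementation: ToyBackupLockManager, backup_succeeds and the integer connection ids are hypothetical names introduced for illustration. The only rule modelled is the one the commit message relies on, namely that a backup lock held by one connection blocks backup lock requests from any other connection, while the holder itself is not blocked. A denied request stands in for the lock wait that ends in a timeout.

```cpp
#include <cassert>

// Toy model only; names and rules are assumptions, not MariaDB code.
enum class BackupLock { START, BLOCK_COMMIT };

constexpr int kNoOwner = -1;

class ToyBackupLockManager {
public:
  // Returns true if the lock is granted. Returning false models a
  // lock wait that ends in a timeout.
  bool acquire(int connection_id, BackupLock type) {
    if (owner_ == kNoOwner) {
      owner_ = connection_id;
      held_ = type;
      return true;
    }
    // The holder is never blocked by its own lock; everyone else is.
    return owner_ == connection_id;
  }

  BackupLock held() const { return held_; }

private:
  int owner_ = kNoOwner;
  BackupLock held_ = BackupLock::START;
};

// Models the sequence from the commit message:
//   1. BACKUP STAGE BLOCK_COMMIT on stage_conn
//   2. FLUSH BINARY LOGS on flush_conn, which requests a backup START lock
// Returns true if step 2 is granted, false if it would time out.
bool backup_succeeds(int stage_conn, int flush_conn) {
  ToyBackupLockManager mdl;
  bool granted = mdl.acquire(stage_conn, BackupLock::BLOCK_COMMIT);
  assert(granted);  // a fresh manager always grants the first request
  (void)granted;    // avoid an unused-variable warning with NDEBUG
  return mdl.acquire(flush_conn, BackupLock::START);
}
```

In this model, backup_succeeds(1, 2) corresponds to the regression (write_galera_info on mysql_connection while m_bs_con holds BLOCK_COMMIT) and returns false, whereas backup_succeeds(1, 1) corresponds to the fix (both statements on m_bs_con) and returns true.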