Making the MTR rpl suite GTID_MODE Agnostic

In MySQL 5.6 we introduced GTID_MODE as a new server option. A global transaction identifier (GTID) is a unique identifier created and associated with each transaction when it is committed on the server of origin (master). This identifier is unique not only to the server on which it originated, but is unique across all servers in a given replication setup. There is a 1-to-1 mapping between all transactions and all GTIDs. For additional information, please refer to the MySQL manual.
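To make the structure of an identifier concrete: a GTID is written as source_id:transaction_id, where source_id is the server_uuid of the originating server and transaction_id is a sequence number starting at 1. The following is only a minimal sketch of that format; the names Gtid and parse_gtid are hypothetical helpers for illustration and are not part of MySQL or MTR.

```python
import uuid
from typing import NamedTuple


class Gtid(NamedTuple):
    """Hypothetical container for one GTID (illustration only)."""
    source_id: uuid.UUID   # server_uuid of the server of origin
    transaction_id: int    # sequence number on that server, starting at 1


def parse_gtid(text: str) -> Gtid:
    """Split 'source_id:transaction_id' into its two parts."""
    source, _, seq = text.strip().rpartition(":")
    if not source:
        raise ValueError(f"not a GTID: {text!r}")
    transaction_id = int(seq)
    if transaction_id < 1:
        raise ValueError("transaction_id must be >= 1")
    return Gtid(uuid.UUID(source), transaction_id)


gtid = parse_gtid("3E11FA47-71CA-11E1-9E33-C80AA9429562:23")
print(gtid.source_id, gtid.transaction_id)
```

Because the source_id part differs for every server, two servers can both commit their 23rd transaction without the resulting identifiers colliding.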

Prior to 5.6.17 and 5.7.4, we had GTID-specific replication regression tests (replication is referred to as “rpl” within the MTR suite), and we had to separately run the binlog suite with the GTID_MODE=ON option enabled. To improve test coverage further, we wanted to make the MTR rpl suite GTID_MODE agnostic.

From 5.6.17 and 5.7.4 on, the MTR rpl suite is run daily within the continuous integration testing framework of MySQL, with GTID_MODE=ON for better code coverage.

However, we cannot just turn on the necessary mysqld switches and expect all of the test cases to pass, as the tests are bound to behave differently when run with GTID_MODE=ON. This post explains the challenges we ran into and the solutions we implemented to address them.

Challenges

We wanted to periodically run the rpl suite with --gtid-mode=ON. As noted above, the behavioral differences under GTID_MODE=ON mean that many test cases fail unless they are adapted.

Some of the challenges that we expected were:

  • C1. Result file differences due to additional output from SHOW BINLOG EVENTS. This breaks many tests, because MTR is result-file oriented: a test passes only if its output matches the recorded result file.

  • C2. Tests that mix transactional and non-transactional engines in the same statement or transaction. GTID consistency enforcement (--enforce-gtid-consistency) places restrictions on such statements.

  • C3. Some tests run without the logging of slave updates enabled, and with GTID_MODE=ON the server then fails to start. This is because all servers involved in a GTID-based replication group must have GTID_MODE=ON.

  • C4. Currently, even slaves started with the “--gtid-mode=on --enforce-gtid-consistency --log-slave-updates” server options still connect to the master using MASTER_AUTO_POSITION=0, because that is the default in the replication setup of the mysql-test rpl suite. With GTID_MODE=ON we should use MASTER_AUTO_POSITION=1 instead, since that allows the connection and replication setup to happen automatically, using the correct GTID values.
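The difference described in C4 comes down to which options the slave passes to CHANGE MASTER TO. The sketch below only illustrates that choice; change_master_sql is a hypothetical helper, and the host, port, and binary log file name are assumed placeholder values rather than anything the rpl suite actually uses.

```python
def change_master_sql(host: str, port: int, use_gtids: bool,
                      log_file: str = "master-bin.000001",
                      log_pos: int = 4) -> str:
    """Build a CHANGE MASTER TO statement (hypothetical helper).

    With use_gtids the slave asks the master to work out what to send
    based on GTIDs; otherwise it names an explicit file and position.
    """
    options = [f"MASTER_HOST='{host}'", f"MASTER_PORT={port}"]
    if use_gtids:
        # Auto-positioning cannot be combined with file/position options.
        options.append("MASTER_AUTO_POSITION=1")
    else:
        options += [
            f"MASTER_LOG_FILE='{log_file}'",
            f"MASTER_LOG_POS={log_pos}",
            "MASTER_AUTO_POSITION=0",
        ]
    return "CHANGE MASTER TO " + ", ".join(options)


print(change_master_sql("127.0.0.1", 13000, use_gtids=False))
print(change_master_sql("127.0.0.1", 13000, use_gtids=True))
```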

Solutions

To overcome these challenges, the following solutions were implemented:

  • Solution to C1: We use “show_binlog_events.inc” which filters out the additional GTID output from the result files.

    However, this is not always enough. In some cases we need to create two wrappers for the test case. One works on traditional event positioning and the other with GTIDs. The former keeps the original test name, the latter gets “_gtid_” injected into it just after “rpl”. For example, rpl.rpl_binlog_errors becomes rpl.rpl_gtid_binlog_errors.

    Both wrappers should source the original test case, which is moved into an include file that keeps the original name, with the extension renamed from .test to .inc.

  • Solution to C2: These tests are not supported, so they were simply added to the skip-test-list when GTID_MODE=ON.

  • Solution to C3: We simply needed to skip these tests as well.

  • Solution to C4: We will set MASTER_AUTO_POSITION=1 by setting --let $use_gtids=1 before including master-slave.inc.

    This setting must be applied automatically in rpl_init.inc, before calling rpl_change_topology.inc.

    It should then be unset in rpl_end.inc before calling rpl_change_topology.inc again, which reverts CHANGE MASTER to MASTER_AUTO_POSITION=0.

  • Apart from these recurring challenges, we should also deprecate --sync_slave_with_master (and similar commands) and use only “include/sync_slave_sql_with_master.inc”, which handles both the legacy file-position and the GTID-based replication protocols.
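The two parts of the C1 solution can be sketched in a few lines. This is not how show_binlog_events.inc is implemented; it only illustrates the idea, assuming that the GTID-related rows appear in SHOW BINLOG EVENTS with the event types Gtid and Previous_gtids. The function names and the sample rows are hypothetical, and the sketch ignores the fact that event positions also shift when GTID events are present.

```python
# Event types assumed to be added to SHOW BINLOG EVENTS by GTID_MODE=ON.
GTID_EVENT_TYPES = {"Gtid", "Previous_gtids"}


def strip_gtid_events(rows):
    """Drop GTID-only rows so one result file can serve both modes."""
    return [row for row in rows if row["Event_type"] not in GTID_EVENT_TYPES]


def gtid_wrapper_name(test_name: str) -> str:
    """Insert '_gtid_' right after 'rpl' in a 'suite.test' name."""
    suite, _, name = test_name.partition(".")
    if not name.startswith("rpl_"):
        raise ValueError(f"expected an rpl test name, got {test_name!r}")
    return f"{suite}.rpl_gtid_{name[len('rpl_'):]}"


# Hypothetical rows, reduced to the two columns relevant here.
rows = [
    {"Event_type": "Format_desc", "Info": "Server ver: 5.6.17, Binlog ver: 4"},
    {"Event_type": "Previous_gtids", "Info": ""},
    {"Event_type": "Gtid", "Info": "SET @@SESSION.GTID_NEXT= '<uuid>:1'"},
    {"Event_type": "Query", "Info": "CREATE TABLE t1 (a INT)"},
]

print([row["Event_type"] for row in strip_gtid_events(rows)])
print(gtid_wrapper_name("rpl.rpl_binlog_errors"))
```

Filtering keeps a single result file usable in both modes; the wrapper pair is for the cases where the output genuinely has to differ.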

Conclusion

Having solved the above challenges, we now run the MTR rpl suite with GTID_MODE=ON on a daily basis. This has greatly improved test coverage, allowing us to identify GTID-related problems well before they are pushed to the working release branch.

If you have any questions or feedback regarding this project, please post them here. I would love to hear what the community thinks about all of this.
