資安

MySQL8.0 Xtrabackup備份問題

一、背景

MySQL 8.0已經成為最受矚目的版本,增加了許多新特性,如Clone Plugin、Hash Join等等。由於新的MySQL重做日誌和數據字典格式,所以推出新的XtraBackup 8.0版本,本文基於MySQL產品在自動搭建備庫時遇到問題做出解析。

1.1 測試環境

  • MySQL 8.0.18
  • xtrabackup 8.0.8
  • binlog_format=raw,gtid_mode=on,log-bin=1
  • sysbench 1.0.17

1.2 問題來源

某天,QFusion測試人員遇到了MySQL 8.0在備庫搭建時報錯,需要DBA同學跟進問題。首先通過show slave status查看報錯信息,發現從庫的sql thread出現異常。

root@localhost : (none) 08:56:40> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: test-58-aa-f8aa500-headless
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000085
          Read_Master_Log_Pos: 1751589
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 416
        Relay_Master_Log_File: mysql-bin.000050
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1062
                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758' at master log , end_log_pos 16052967. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 15520378
              Relay_Log_Space: 19285319437
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1062
               Last_SQL_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758' at master log , end_log_pos 16052967. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 41796
                  Master_UUID: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State:
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp: 191216 18:56:49
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758-48217
            Executed_Gtid_Set: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:1-25757
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0
            Network_Namespace:
1 row in set (0.00 sec)

通過報錯信息中的提示,發現是出現了主鍵衝突。

root@localhost : (none) 08:57:05> select * from performance_schema.replication_applier_status_by_worker limit 1\G
*************************** 1. row ***************************
                                           CHANNEL_NAME:
                                              WORKER_ID: 1
                                              THREAD_ID: NULL
                                          SERVICE_STATE: OFF
                                      LAST_ERROR_NUMBER: 1062
                                     LAST_ERROR_MESSAGE: Worker 1 failed executing transaction '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758' at master log , end_log_pos 16052967; Could not execute Write_rows event on table test.sbtest24; Duplicate entry '1529529' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 16052967
                                   LAST_ERROR_TIMESTAMP: 2019-12-16 18:56:49.326384
                               LAST_APPLIED_TRANSACTION:
     LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
    LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
         LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 0000-00-00 00:00:00.000000
           LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 0000-00-00 00:00:00.000000
                                   APPLYING_TRANSACTION: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758
         APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2019-12-16 17:31:13.653629
        APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2019-12-16 17:31:13.653629
             APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2019-12-16 18:56:49.325745
                 LAST_APPLIED_TRANSACTION_RETRIES_COUNT: 0
   LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
  LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE:
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
                     APPLYING_TRANSACTION_RETRIES_COUNT: 0
       APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
      APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE:
    APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
1 row in set (0.00 sec)

查看當前slave最新的gtid信息

root@localhost : (none) 12:05:14> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000008
         Position: 155
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:1-25757
1 row in set (0.00 sec)

查看xtrabackup備份時的binlog的pos和gtid信息

[root@test-58-aa-f8aa501-0 mydata]# cat xtrabackup_info
...
binlog_pos = filename 'mysql-bin.000050', position '33110447', GTID of the last change '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:1-25757'
...

此時對比發現,slave的gtid信息和備份完成時的gtid是一致的,那為什麼還是出現了主鍵衝突的情況呢?

此時我們根據pos位點解析master的binlog信息

[root@test-58-aa-f8aa500-0 archive]# mysqlbinlog -vv mysql-bin.000050 --start-position=33110447 | head -20
mysqlbinlog: [Warning] unknown variable 'loose_default-character-set=utf8mb4'.
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 195
#191216 17:31:12 server id 41796  end_log_pos 124 CRC32 0xc9ad2c11     Start: binlog v 4, server v 8.0.18 created 191216 17:31:12
BINLOG '
4E73XQ9EowAAeAAAAHwAAAAAAAQAOC4wLjE4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEwANAAgAAAAABAAEAAAAYAAEGggAAAAICAgCAAAACgoKKioAEjQA
CgERLK3J
'/*!*/;
# at 33110447
#191216 17:31:13 server id 41796  end_log_pos 33110527 CRC32 0x27442dd4     GTID    last_committed=15    sequence_number=34    rbr_only=yes    original_committed_timestamp=1576488674113763    immediate_commit_timestamp=1576488674113763    transaction_length=1034680
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=1576488674113763 (2019-12-16 17:31:14.113763 CST)
# immediate_commit_timestamp=1576488674113763 (2019-12-16 17:31:14.113763 CST)
/*!80001 SET @@session.original_commit_timestamp=1576488674113763*//*!*/;
/*!80014 SET @@session.original_server_version=80018*//*!*/;
/*!80014 SET @@session.immediate_server_version=80018*//*!*/;
SET @@SESSION.GTID_NEXT= '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25776'/*!*/;
# at 33110527

根據binlog的pos位點,發現下一個事務GTID為4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25776;而備份結束的GTID為4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25757。

此時我們根據pos位點解析master的binlog信息

### 主庫查看4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758的執行語句
[root@test-58-aa-f8aa500-0 archive]# mysqlbinlog -vv mysql-bin.000050 | grep -B 50 4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25759
...
### INSERT INTO `test`.`sbtest24`
### SET
###   @1=1532200 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2512693 /* INT meta=0 nullable=0 is_null=0 */
###   @3='89203990588-69103074818-52114821100-93439894362-23847143576-07484173808-05428369926-02975913993-41574550041-26938280355' /* STRING(360) meta=61032 nullable=0 is_null=0 */
###   @4='45342050513-61911498937-97819152099-08587625932-60582159210' /* STRING(180) meta=65204 nullable=0 is_null=0 */
### INSERT INTO `test`.`sbtest24`
### SET
###   @1=1532201 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2516379 /* INT meta=0 nullable=0 is_null=0 */
###   @3='16016112465-66688149957-19003543589-12752783330-10140834595-17946818527-54058037200-83940103252-07788820899-59081899782' /* STRING(360) meta=61032 nullable=0 is_null=0 */
###   @4='02966014962-81012143426-29059376957-22789496327-25833494842' /* STRING(180) meta=65204 nullable=0 is_null=0 */
### INSERT INTO `test`.`sbtest24`
### SET
###   @1=1532202 /* INT meta=0 nullable=0 is_null=0 */
###   @2=2707896 /* INT meta=0 nullable=0 is_null=0 */
###   @3='16925828637-58090054889-95780407302-10293804842-12632614860-03811213734-46471823942-17333962963-31790381481-07019946586' /* STRING(360) meta=61032 nullable=0 is_null=0 */
###   @4='77178758790-70128854370-00264407205-64007208401-22279618099' /* STRING(180) meta=65204 nullable=0 is_null=0 */
# at 16555027
#191216 17:31:11 server id 41796  end_log_pos 16555058 CRC32 0x6f8135d4         Xid = 36324
COMMIT/*!*/;
# at 16555058
#191216 17:31:11 server id 41796  end_log_pos 16555138 CRC32 0x7fcb97ac         GTID    last_committed=0        sequence_number=17      rbr_only=yes    original_committed_timestamp=1576488673657448   immediate_commit_timestamp=1576488673657448     transaction_length=1034680
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
# original_commit_timestamp=1576488673657448 (2019-12-16 17:31:13.657448 CST)
# immediate_commit_timestamp=1576488673657448 (2019-12-16 17:31:13.657448 CST)
/*!80001 SET @@session.original_commit_timestamp=1576488673657448*//*!*/;
/*!80014 SET @@session.original_server_version=80018*//*!*/;
/*!80014 SET @@session.immediate_server_version=80018*//*!*/;
SET @@SESSION.GTID_NEXT= '4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25759'/*!*/;

### 從庫查看事務4a8f2de4-1fe1-11ea-a7d3-42549a90e353:25758的數據是否存在
root@localhost : (none) 09:23:04> select * from test.sbtest24 order by id desc limit 10;
+---------+---------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id      | k       | c                                                                                                                       | pad                                                         |
+---------+---------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 1532202 | 2707896 | 16925828637-58090054889-95780407302-10293804842-12632614860-03811213734-46471823942-17333962963-31790381481-07019946586 | 77178758790-70128854370-00264407205-64007208401-22279618099 |
| 1532201 | 2516379 | 16016112465-66688149957-19003543589-12752783330-10140834595-17946818527-54058037200-83940103252-07788820899-59081899782 | 02966014962-81012143426-29059376957-22789496327-25833494842 |
| 1532200 | 2512693 | 89203990588-69103074818-52114821100-93439894362-23847143576-07484173808-05428369926-02975913993-41574550041-26938280355 | 45342050513-61911498937-97819152099-08587625932-60582159210 |
| 1532199 | 2502576 | 09802368960-91917733453-96239669598-54904648916-77296852719-98016730243-90741158106-18793912721-49864482275-01438682979 | 97383280864-09719424624-52657831677-16997497487-42888128592 |
| 1532198 | 2492769 | 50767886676-93929990978-69338677932-87282029008-76889813973-90185628217-96733941319-68335970853-94711333694-25795217601 | 19336837263-95804875039-06595487043-70523328561-09413324879 |
| 1532197 | 1955396 | 00397406795-74747838176-71394165168-41273836833-69479591787-42154528121-51771653744-67255388362-49886388719-58697178839 | 02673409369-19296574832-28003465943-79301421425-80480578998 |
| 1532196 | 2493231 | 86498404871-42108054116-40071340969-11350042446-06300939909-05201025379-06411857335-03954766664-39537185462-91490507327 | 84818317688-20764928816-31090759998-91731449425-90118225230 |
| 1532195 | 2499593 | 77660688834-45252238334-24137839414-65367466774-19337774130-53076983367-01390519320-71314042514-11226819773-51096244448 | 21997720631-75948312175-91821894792-80556960963-06584097138 |
| 1532194 | 2515138 | 66646860538-72521674543-39777861094-01574256160-18755135370-00487912454-64289722053-52102961561-41410923742-00667455284 | 84070066757-97543385536-80074118200-18008298945-61299807560 |
| 1532193 | 2517655 | 50560105798-09418525134-35753621841-55990140866-27080595470-36899245149-55668180470-33559752487-83646593848-17843254144 | 79647081405-62007589765-85313418965-19437322653-89655822041 |
+---------+---------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
10 rows in set (0.00 sec)

此時我們根據pos位點解析master的binlog信息確認下一個事務已經執行。

二、問題排查

2.1 前期準備

MySQL 8.0的備份取消了FLUSH TABLES WITH READ LOCK的全局加鎖限制,具體備份流程後續會寫文章對比;使用performance_schema.log_status表來獲取binlog的pos和GTID信息。

首先我們介紹下8.0新增的performance_schema.log_status表的具體內容。

root@localhost : (none):47: > select * from performance_schema.log_status\G
*************************** 1. row ***************************
    SERVER_UUID: 9eed5dd0-213c-11ea-a06f-fa9c144d7000
          LOCAL: {"gtid_executed": "d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2344524", "binary_log_file": "mysql-bin.000036", "binary_log_position": 8237993}
    REPLICATION: {"channels": [{"channel_name": "", "relay_log_file": "mysql-relay-bin.000002", "relay_log_position": 130357303}]}
STORAGE_ENGINES: {"InnoDB": {"LSN": 13536450538, "LSN_checkpoint": 13522598241}}
1 row in set (0.00 sec)
  • SERVER_UUID:此服務器實例的服務器UUID。這是隻讀系統變量server_uuid的唯一值。
  • LOCAL:來自主服務器的日誌位置狀態信息,以單個JSON對象的形式提供,並具有以下信息:

    • binary_log_file:當前二進制日誌文件的名稱。
    • binary_log_position:訪問log_status表時的當前二進制日誌位置。
    • gtid_executed:在訪問log_status表時,全局變量gtid_execute的當前值。此信息與binary_log_file和binary_log_position鍵一致。
  • REPLICATION:一個通道的JSON數組,每個通道具有以下信息:

    • channel_name:複製通道的名稱。
    • relay_log_file:複製通道的當前中繼日誌文件的名稱。
    • relay_log_pos:訪問log_status表時的當前中繼日誌位置。
  • STORAGE_ENGINES:來自各個存儲引擎的相關信息,以JSON對象的形式提供,每個適用的存儲引擎都有一個密鑰。

2.2 驗證show master status的輸出信息

evernotecid://232882CB-0E86-4D89-B07A-EA1AE2F94E20/appyinxiangcom/15365034/ENResource/p17613image.png

這裡可以看到,在併發場景下,show master status輸出的GTID和binlog的pos位點不是對應的。查看代碼後發現:binlog的file和pos位點的更新在group commit的flush階段後,即可對其他會話可見。而GTID是在group commit的commit階段後可對其他會話可見,所以也就造成了show master status輸出的binlog信息和GTID信息非原子性。

三、復現問題

3.1 開始備份

[root@qfusion1 ~]# xtrabackup --defaults-file=/etc/mysql/my.cnf --user=root --host=127.0.0.1 --password=xxx --port=3306 --extra-lsndir=/opt/full_lsn --stream=xbstream --backup --target-dir=./ | sshpass -p'letsg0' ssh root@qfusion2 'cat -> /data/backup/full_`date +%F_%H-%M-%S`.tar'
...
191217 19:02:45 [01]        ...done
191217 19:02:45 Finished backing up non-InnoDB tables and files
191217 19:02:45 Executing FLUSH NO_WRITE_TO_BINLOG BINARY LOGS
191217 19:02:46 Selecting LSN and binary log position from p_s.log_status
191217 19:02:46 [00] Streaming /var/lib/mysql/archive/mysql-bin.000035 to <STDOUT> up to position 290601
191217 19:02:46 [00]        ...done
191217 19:02:46 [00] Streaming <STDOUT>
191217 19:02:46 [00]        ...done
191217 19:02:46 [00] Streaming <STDOUT>
191217 19:02:46 [00]        ...done
191217 19:02:46 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
xtrabackup: The latest check point (for incremental): '13470381978'
xtrabackup: Stopping log copying thread at LSN 13522426897.
191217 19:02:46 >> log scanned up to (13522598105)

191217 19:02:48 All tables unlocked
191217 19:02:48 [00] Streaming ib_buffer_pool to <STDOUT>
191217 19:02:48 [00]        ...done
191217 19:02:48 Backup created in directory '/root/'
MySQL binlog position: filename 'mysql-bin.000035', position '290601', GTID of the last change 'd3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959'
191217 19:02:48 [00] Streaming <STDOUT>
191217 19:02:48 [00]        ...done
191217 19:02:48 [00] Streaming <STDOUT>
191217 19:02:48 [00]        ...done
xtrabackup: Transaction log of lsn (12972859512) to (13526909324) was copied.
191217 19:02:50 completed OK!

3.2 查看xtrabackup的信息

[root@qfusion2 full]# ls
backup-my.cnf   ibdata1  mysql-bin.000035  mysql.ibd           sbtest  test      undo_002                xtrabackup_checkpoints  xtrabackup_logfile
ib_buffer_pool  mysql    mysql-bin.index   performance_schema  sys     undo_001  xtrabackup_binlog_info  xtrabackup_info         xtrabackup_tablespaces

[root@qfusion2 full]# cat xtrabackup_binlog_info
mysql-bin.000035    290601    d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959

3.3 查看對應的binlog信息

[root@qfusion2 full]# mysqlbinlog -vv mysql-bin.000035 | grep d3eff916-0e03-11ea-a98d-fa84dc05a100 | tail -10
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340970'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340971'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340972'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340973'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340974'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340975'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340976'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340977'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340978'/*!*/;
SET @@SESSION.GTID_NEXT= 'd3eff916-0e03-11ea-a98d-fa84dc05a100:2340979'/*!*/;

[root@qfusion2 full]# mysqlbinlog -vv mysql-bin.000035 --start-position=290601
mysqlbinlog: [Warning] unknown variable 'loose_default-character-set=utf8'.
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 195
#191217 19:02:45 server id 2433306  end_log_pos 124 CRC32 0x6d85a9e8     Start: binlog v 4, server v 8.0.18 created 191217 19:02:45
# Warning: this binlog is either in use or was not closed properly.
BINLOG '
1bX4XQ8aISUAeAAAAHwAAAABAAQAOC4wLjE4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAEwANAAgAAAAABAAEAAAAYAAEGggAAAAICAgCAAAACgoKKioAEjQA
CgHoqYVt
'/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;

3.4 開始恢復

[root@qfusion2 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --prepare --use-memory=5G --target-dir=/data/backup/full/

[root@qfusion2 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --move-back --target-dir=/data/backup/full/

[root@qfusion2 full]# chown -R mysql:mysql /var/lib/mysql/

[root@qfusion2 full]# mysqld_safe --defaults-file=/etc/mysql/my.cnf &

[root@qfusion2 full]# mysql -uroot -pxxx

root@localhost : (none):19: > show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000036
         Position: 195
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340979
1 row in set (0.00 sec)

[root@qfusion2 full]# cat xtrabackup_binlog_info
mysql-bin.000035    290601    d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959

備份恢復完成後,我們可以看到,show master status輸出的GTID信息與xtrabackup_binlog_info文件的GTID是不一致的,對比產品中遇到主鍵衝突的問題中的show master status的GTID和與xtrabackup_binlog_info文件的GTID是一致的;可以確認的事情是恢復之後是有過對數據庫的操作了。此時應該找下原型預研文檔和產品開發人員確認下是否在MySQL恢復之後有其他額外的操作,得到反饋是執行了如下的語句操作:

reset master;
...
set global gtid_purged='xxx'

這些步驟是MySQL 5.7下的搭建備庫必要的操作(下面內容有驗證),因為在5.7備份是沒有binlog的備份,且GTID的持久化是在mysql.gtid_executed表,但是該表不是實時更新的,在主庫的更新策略如下:

  • 主庫開啟binlog時,mysql.gtid_executed表會記錄GTID信息,且在binlog rotation或者server重啟時進行更新。
  • 主庫未開啟binlog時,mysql.gtid_executed表不會記錄任何信息。

那麼MySQL 8.0下的mysql.gtid_executed表的更新策略是什麼樣的呢?此時我們可以查詢官方文檔尋找答案:

簡單翻譯下:如果啟用了二進制日誌(log_bin為ON),則從MySQL 8.0.17起對InnoDB存儲引擎,mysqld以禁用binlog記錄或slave更新日誌記錄的相同方式更新mysql.gtid_executed表,並存儲GTID用於事務提交時的每個事務。但是,在MySQL 8.0.17之前的版本以及其他存儲引擎中,僅在切換binlog或重啟mysqld時,mysqld將所有寫入上一個binlog的事務的GTID寫入mysql.gtid_executed表。

我們也可以測試驗證下:

root@localhost : test:12: > select * from mysql.gtid_executed;show master status;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |              1 |      2764676 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764677 |      2764783 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764784 |      2764893 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764894 |      2765008 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765009 |      2765074 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765075 |      2765143 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765144 |      2765185 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765186 |      2765283 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765284 |      2765384 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765385 |      2765485 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765486 |      2765571 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765572 |      2765652 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765653 |      2765746 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765747 |      2765865 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765866 |      2765948 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765949 |      2766028 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766029 |      2766052 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766053 |      2766053 |
+--------------------------------------+----------------+--------------+
18 rows in set (0.00 sec)

+------------------+-----------+--------------+------------------+------------------------------------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                              |
+------------------+-----------+--------------+------------------+------------------------------------------------+
| mysql-bin.000036 | 450959566 |              |                  | d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2766053 |
+------------------+-----------+--------------+------------------+------------------------------------------------+
1 row in set (0.01 sec)

root@localhost : test:12: > insert into test values(1,1,'1');
Query OK, 1 row affected (0.00 sec)

root@localhost : test:12: > insert into test values(2,2,'2');
Query OK, 1 row affected (0.01 sec)

root@localhost : test:12: > select * from mysql.gtid_executed;show master status;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |              1 |      2764676 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764677 |      2764783 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764784 |      2764893 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2764894 |      2765008 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765009 |      2765074 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765075 |      2765143 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765144 |      2765185 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765186 |      2765283 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765284 |      2765384 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765385 |      2765485 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765486 |      2765571 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765572 |      2765652 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765653 |      2765746 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765747 |      2765865 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765866 |      2765948 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2765949 |      2766028 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766029 |      2766052 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766053 |      2766053 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766054 |      2766054 |
| d3eff916-0e03-11ea-a98d-fa84dc05a100 |        2766055 |      2766055 |
+--------------------------------------+----------------+--------------+
20 rows in set (0.01 sec)

+------------------+-----------+--------------+------------------+------------------------------------------------+
| File             | Position  | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set                              |
+------------------+-----------+--------------+------------------+------------------------------------------------+
| mysql-bin.000036 | 450960258 |              |                  | d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2766055 |
+------------------+-----------+--------------+------------------+------------------------------------------------+
1 row in set (0.00 sec)

3.5 手動刪除備份的binlog文件

[root@qfusion2 full]# ls
backup-my.cnf   mysql-bin.000035    sbtest    undo_002                xtrabackup_logfile
ib_buffer_pool  mysql-bin.index     sys       xtrabackup_binlog_info  xtrabackup_tablespaces
ibdata1         mysql.ibd           test      xtrabackup_checkpoints
mysql           performance_schema  undo_001  xtrabackup_info

[root@qfusion2 full]# mv mysql-bin.* ../

[root@qfusion2 full]# ls
backup-my.cnf   mysql.ibd           test                    xtrabackup_checkpoints
ib_buffer_pool  performance_schema  undo_001                xtrabackup_info
ibdata1         sbtest              undo_002                xtrabackup_logfile
mysql           sys                 xtrabackup_binlog_info  xtrabackup_tablespaces

[root@qfusion2 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --prepare --use-memory=5G --target-dir=/data/backup/full/

[root@qfusion2 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --move-back --target-dir=/data/backup/full/

[root@qfusion2 full]# chown -R mysql:mysql /var/lib/mysql/

[root@qfusion2 full]# mysqld_safe --defaults-file=/etc/mysql/my.cnf &

[root@qfusion2 full]# mysql -uroot -pxxx

root@localhost : (none):46: > show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000003
         Position: 155
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959
1 row in set (0.00 sec)

[root@qfusion2 full]# cat xtrabackup_binlog_info
mysql-bin.000035    290601    d3eff916-0e03-11ea-a98d-fa84dc05a100:1-2340959

在手動移除備份生成的相關binlog文件後,發現事務是回滾到xtrabackup_binlog_info記錄的GTID的位置。與MySQL 5.7下的備份恢復的情況一致;具體可以查看下面的驗證。

四、MySQL 5.7下的物理備份

4.1 測試環境

  • MySQL 5.7.28
  • xtrabackup 2.4.17
  • binlog_format=raw,gtid_mode=on,log-bin=1
  • sysbench 1.0.17

4.2 開始備份

[root@qfusion3 archive]# xtrabackup --defaults-file=/etc/mysql/my.cnf --user=root --host=127.0.0.1 --password=abc123 --port=3306 --extra-lsndir=/opt/full_lsn --stream=xbstream --backup --target-dir=./ | sshpass -p'letsg0' ssh root@qfusion4 'cat -> /data/backup/full_`date +%F_%H-%M-%S`.tar'

4.3 開始恢復

[root@qfusion4 full]# cd /data/backup

[root@qfusion4 full]# xbstream -x < full_2019-12-18_14-06-10.tar -C full

[root@qfusion4 full]# cd full

[root@qfusion4 full]# ls
backup-my.cnf   ibdata1  performance_schema  sys      undo002  undo004  undo006  undo008  undo010  undo012  undo014  undo016                 xtrabackup_checkpoints  xtrabackup_logfile
ib_buffer_pool  mysql    sbtest              undo001  undo003  undo005  undo007  undo009  undo011  undo013  undo015  xtrabackup_binlog_info  xtrabackup_info

[root@qfusion4 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --prepare --use-memory=5G --target-dir=/data/backup/full/

[root@qfusion4 full]# xtrabackup --defaults-file=/etc/mysql/my.cnf --move-back --target-dir=/data/backup/full/

[root@qfusion4 full]# chown -R mysql:mysql /var/lib/mysql/

[root@qfusion4 full]# mysqld_safe --defaults-file=/etc/mysql/my.cnf &

[root@qfusion4 full]# mysql -uroot -pxxx

root@localhost : (none):13: > select * from mysql.gtid_executed;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| 189fb8ac-1fe8-11ea-9b92-faf16b07cf00 |              1 |          535 |
+--------------------------------------+----------------+--------------+
1 row in set (0.01 sec)

root@localhost : (none):13: > show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 154
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: 189fb8ac-1fe8-11ea-9b92-faf16b07cf00:1-535
1 row in set (0.00 sec)

[root@qfusion4 full]# cat xtrabackup_binlog_info
mysql-bin.000003    175325364    189fb8ac-1fe8-11ea-9b92-faf16b07cf00:1-736

可以看到MySQL 5.7版本的備份恢復後的GTID是依賴於mysql.gtid_executed表,但是該表不是實時更新,此時的GTID其實是滯後的,所以也需要額外執行set global gtid_purged=‘xxx’。

五、總結

  • MySQL 8.0下的物理備份生成的GTID值是不可靠,因為備份了最後一個binlog文件,恢復時會recovery到最新的事務。
  • MySQL 8.0下的物理備份會執行FLUSH NO_WRITE_TO_BINLOG BINARY LOGS操作,在有大事務對數據進行修改時,一不小心可能就會出現數據庫hang死,所以建議使用小事務。

Leave a Reply

Your email address will not be published. Required fields are marked *