首页 > 数据库 > MySQL > 正文

MySQL高可用方案MHA在线切换的步骤及原理

2024-07-24 12:31:34
字体:
来源:转载
供稿:网友
      在日常工作中,会碰到如下的场景,如mysql数据库升级,主服务器硬件升级等,这个时候就需要将写操作切换到另外一台服务器上,那么如何进行在线切换呢?同时,要求切换过程短,对业务的影响比较小。
 
       MHA就提供了这样一种优雅的方式,只会堵塞业务0.5~2s的时间,在这段时间内,业务无法读取和写入。
  
集群信息
 
角色                             IP地址                 ServerID      类型
 
Master                         192.168.244.10   1                 写入
 
Candicate master          192.168.244.20   2                 读
 
Slave                           192.168.244.30   3                 读
 
Monitor host                 192.168.244.40                      监控集群组
  
MHA具体的搭建步骤和原理,可参考另外一篇博客:
 
MySQL高可用方案MHA的部署和原理
  
在线切换的步骤
 
1. 关闭MHA监控
 
# masterha_stop --conf=/etc/masterha/app1.cnf
 
2. 在线切换
 
# /usr/local/bin/masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=192.168.244.20 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000
 
其中,
 
      --orig_master_is_new_slave是将原master切换为新主的slave,默认情况下,是不添加的。
 
     --running_updates_limit默认为1s,即如果主从延迟时间(Seconds_Behind_Master),或master show processlist中dml操作大于1s,则不会执行切换。
 
在线切换的输出
 
MySQL高可用方案MHA在线切换的步骤及原理
Tue Apr  ::  - [] MHA::MasterRotate version  ::  - [ ::  - [ ::  - [] * Phase  ::  - [ ::  - [warning] Global configuration  /etc/ ::  - [] Reading application default configuration from /etc/masterha/ ::  - [] Reading server configuration from /etc/masterha/ ::  - [] GTID failover mode =  ::  - [] Current Alive Master: .(.: ::  - [ ::  - []   .(.:)  Version=.--bin:enabledTue Apr  ::  - []     Replicating from .(.: ::  - []     Primary candidate  ::  - []   .(.:)  Version=.--bin:enabledTue Apr  ::  - []     Replicating from .(.:.(:)? (YES/no): yes
Tue Apr  ::  - [] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take   ::  - [ ::  - [ ::  - [] Checking replication health on . ::  - [ ::  - [] Checking replication health on . ::  - [ ::  - [] . ::  - [.(.:+--.(.:+--.(.:.(.:+--.(.:+--.(.:.(.:) to .(.:)? (yes/ ::  - [] Checking whether .(.:) is ok  ::  - [ ::  - [] .(.: ::  - [] .(.: ::  - [] ** Phase  ::  - [ ::  - [] * Phase  ::  - [ ::  - [] Executing master ip online change script to disable  ::  - []   /usr/local/bin/master_ip_online_change --command=stop --orig_master_host=. --=. --orig_master_port= --orig_master_user= --orig_master_password= --new_master_host=. --new_master_ip=. --new_master_port= --new_master_user= --new_master_password= --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slaveTue Apr  ::   ::   Set read_only= ::   ::   . ::  - [ ::  - [ ::  - [ ::  - [ ::  - [] Orig master binlog:pos is mysql-bin.: ::  - []  Waiting to execute all relay logs on .(.: ::  - []  master_pos_wait(mysql-bin.:) completed on .(.:). Executed  ::  - []    ::  - [] Getting new masterTue Apr  ::  - []  mysql-bin.: ::  - [=, MASTER_PORT=, MASTER_LOG_FILE=, MASTER_LOG_POS=, MASTER_USER=, MASTER_PASSWORD=;Tue Apr  ::  - [] Executing master ip online change script to allow  ::  - []   /usr/local/bin/master_ip_online_change --command=start --orig_master_host=. --=. --orig_master_port= --orig_master_user= --orig_master_password= --new_master_host=. --new_master_ip=. --new_master_port= --new_master_user= --new_master_password= --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slaveTue Apr  ::   Set read_only=. on the new master: . ::  - [ ::  - [ ::  - [] * Switching slaves  ::  - [ ::  - [] -- Slave switch on host .(.:) started, pid:  ::  - [ ::  - [] Log messages from . ::  - [ ::  - []  Waiting to execute all relay logs on .(.: ::  - []  master_pos_wait(mysql-bin.:) completed on .(.:). Executed  ::  - []    ::  - []  Resetting slave .(.:) and starting replication from the new master .(.:)..Tue Apr  ::  - [ ::  - [ ::  - [] End of log messages from . ::  - [ ::  - [] -- Slave switch on host .(.: ::  - [ ::  - [ ::  - [ ::  - [ ::  - []  Resetting slave .(.:) and starting replication from the new master .(.:)..Tue Apr  ::  - [ ::  - [ ::  - [ ::  - [ ::  - [] * Phase  ::  - [ ::  - []  .: Resetting slave  ::  - [] Switching master to .(.:) completed successfully.
MySQL高可用方案MHA在线切换的步骤及原理
 
 
MHA在线切换的原理
 
1. 检查当前的配置信息及主从服务器的信息
 
    包括读取MHA的配置文件/etc/masterha/app1.cnf及检查当前slave的健康状态
 
2. 阻止对当前master的更新
 
   主要通过如下步骤:
 
   1> 等待1.5s($time_until_kill_threads*100ms),等待当前连接断开。
 
   2> 执行 read_only=1,阻止新的DML操作
 
   3> 等待0.5s,等待当前DML操作完成。
 
   4> kill掉所有连接。
 
   5> FLUSH NO_WRITE_TO_BINLOG TABLES
 
   6> FLUSH TABLES WITH READ LOCK
 
3. 等待新master执行完所有的relay log
 
Waiting to execute all relay logs on 192.168.244.20(192.168.244.20:3306)..
4. 将新master的read_only设置为off,并添加VIP
 
5. slave切换到新master上。
 
   1> 等待slave(192.168.244.30)应用完原主从复制产生的relay log,然后执行change master操作切换到新master上。
 
   2> 释放原master上加的锁。
 
   3> 因masterha_master_switch命令行中带有--orig_master_is_new_slave参数,故原master也切换为新master的从。
 
6. 清理新master的相关信息。
 
    主要是执行了reset slave all操作,清除之前的复制信息。
  
MHA在线切换需满足的条件
 
MHA在执行在线切换之前,会判断当前的主从复制信息,只有满足了以下条件,才能执行切换动作:
 
1. 所有SLAVE的IO线程和SQL线程都在运行。
 
2. 所有slave的Seconds_Behind_Master小于或等于running_updates_limit的值,该参数如果没有显示指定的话,则默认为1s
 
3. 在master上,通过show processlist输出,没有一个DML操作的时间大于running_updates_limit的值。
  
在线切换时,打开general log,各个服务器的操作信息
 
注:在执行masterha_master_switch命令时,会有两次确认操作
 
1. It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.244.10(192.168
.244.10:3306)? (YES/no):
 
2. Starting master switch from 192.168.244.10(192.168.244.10:3306) to 192.168.244.20(192.168.244.20:3306)? (yes/NO):
 
以下输出中间都有两次空白,其中第一次空白之前的输出对应第一次确认之前,第二次之前的输出对应第二次确认之前。
  
原master 192.168.244.10
 
MySQL高可用方案MHA在线切换的步骤及原理
170412 16:52:38    23 Connect    monitor@node4 on
                   23 Query    set autocommit=1
                   23 Query    SELECT CONNECTION_ID() AS Value170412 16:52:39    24 Connect    monitor@node4 on
                   24 Query    set autocommit=1
                   24 Query    SELECT CONNECTION_ID() AS Value                   24 Query    SET wait_timeout=86400
                   24 Query    SELECT @@global.server_id As Value                   24 Query    SELECT VERSION() AS Value                   24 Query    SELECT @@global.gtid_mode As Value                   24 Query    SHOW GLOBAL VARIABLES LIKE 'log_bin'
                   24 Query    SHOW MASTER STATUS                   24 Query    SELECT @@global.datadir AS Value                   24 Query    SELECT @@global.slave_parallel_workers AS Value                   24 Query    SHOW SLAVE STATUS                   24 Query    SELECT @@global.read_only As Value                   24 Query    SELECT @@global.relay_log_purge As Value           
170412 16:54:06    24 Query    FLUSH NO_WRITE_TO_BINLOG TABLES                   24 Query    SELECT GET_LOCK('MHA_Master_High_Availability_Monitor', '0') AS Value                   24 Query    SHOW PROCESSLIST                   
170412 16:55:51    24 Query    SHOW SLAVE STATUS                   24 Query    CHANGE MASTER TO MASTER_HOST='dummy_host'170412 16:55:52    24 Query    SHOW SLAVE STATUS                   24 Query    RESET SLAVE /*!50516 ALL */
                   24 Query    SELECT RELEASE_LOCK('MHA_Master_High_Availability_Monitor') As Value                   24 Quit    
                   25 Connect    monitor@node4 on
                   25 Query    set autocommit=1
                   25 Query    SELECT CONNECTION_ID() AS Value                   25 Query    SET sql_log_bin=0
                   25 Query    SHOW PROCESSLIST                   25 Query    SELECT @@global.read_only As Value                   25 Query    SET GLOBAL read_only=1
                   25 Query    SELECT @@global.read_only As Value                   25 Query    SHOW PROCESSLIST                   25 Query    SET sql_log_bin=1
                   25 Quit    
                   26 Connect    monitor@node4 on
                   26 Query    set autocommit=1
                   26 Query    SELECT CONNECTION_ID() AS Value                   26 Query    SET wait_timeout=86400
                   26 Query    FLUSH TABLES WITH READ LOCK                   26 Query    SHOW MASTER STATUS170412 16:55:53    26 Query    UNLOCK TABLES                   26 Query    CHANGE MASTER TO MASTER_HOST = '192.168.244.20' MASTER_USER = 'repl' MASTER_PASSWORD = <secret> MASTE
R_PORT = 3306 MASTER_LOG_FILE = 'mysql-bin.000010' MASTER_LOG_POS = 120           26 Query    SET GLOBAL relay_log_purge=0
                   26 Query    START SLAVE                   27 Connect Out    repl@192.168.244.20:3306
                   26 Query    SHOW SLAVE STATUS                   26 Query    SELECT RELEASE_LOCK('MHA_Master_High_Availability_Failover') As Value                   26 Quit
MySQL高可用方案MHA在线切换的步骤及原理
  
新master 192.168.244.20
 
MySQL高可用方案MHA在线切换的步骤及原理
170412 16:52:38    23 Connect    monitor@node4 on
                   23 Query    set autocommit=1
                   23 Query    SELECT CONNECTION_ID() AS Value170412 16:52:39    24 Connect    monitor@node4 on
                   24 Query    set autocommit=1
                   24 Query    SELECT CONNECTION_ID() AS Value                   24 Query    SET wait_timeout=86400
                   24 Query    SELECT @@global.server_id As Value                   24 Query    SELECT VERSION() AS Value                   24 Query    SELECT @@global.gtid_mode As Value                   24 Query    SHOW GLOBAL VARIABLES LIKE 'log_bin'
                   24 Query    SHOW MASTER STATUS                   24 Query    SELECT @@global.datadir AS Value                   24 Query    SELECT @@global.slave_parallel_workers AS Value                   24 Query    SHOW SLAVE STATUS                   24 Query    SELECT @@global.read_only As Value                   24 Query    SELECT @@global.relay_log_purge As Value                   24 Query    SELECT @@global.relay_log_info_repository AS Value                   24 Query    SELECT @@global.datadir AS Value                   24 Query    SELECT @@global.relay_log_info_file AS Value                   24 Query    SHOW SLAVE STATUS                   24 Query    SELECT Repl_slave_priv AS Value FROM mysql.user WHERE user = 'repl'
           
           170412 16:54:06    24 Query    SELECT GET_LOCK('MHA_Master_High_Availability_Failover', '0') AS Value                   24 Query    SHOW SLAVE STATUS                   24 Query    SHOW SLAVE STATUS           
170412 16:55:52    24 Query    SHOW PROCESSLIST                   25 Connect    monitor@node4 on
                   25 Query    set autocommit=1
                   25 Query    SELECT CONNECTION_ID() AS Value                   25 Query    SELECT @@global.read_only As Value                   25 Query    SELECT @@global.read_only As Value                   25 Quit    
                   24 Query    SHOW SLAVE STATUS                   24 Query    SELECT MASTER_POS_WAIT('mysql-bin.000017','120',0) AS Result                   24 Query    STOP SLAVE SQL_THREAD                   24 Query    SHOW SLAVE STATUS        &n

(编辑:武林网)

发表评论 共有条评论
用户名: 密码:
验证码: 匿名发表