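Contents
1. Download
2. Setting up MHA
2.1 System configuration
2.2 Architecture
2.3 Adding SSH public key trust
2.4 Installing the MHA packages
2.5 Manager configuration file
2.6 Checks
2.7 Starting the manager process
2.8 Problems encountered
3. Failover tests
3.1 Normal failover test
3.2 Switch-back test
3.3 Avalanche test
3.4 Failover test with inconsistent master and slaves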
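Download
MHA download link: http://pan.baidu.com/s/1pJkDGX9#dir/path=%2Fmysql%2FHA%2Fmha
Or: https://code.google.com/p/mysql-master-ha/
Add a yum repository (EPEL):
wget http://dl.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
rpm -ivh epel-release-5-4.noarch.rpm
MHA is split into a manager package and a node package. You can download either the source tarballs or the RPM packages, whichever you prefer. Note that the package versions must match your operating system release and kernel.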
Setting up MHA
1. System configuration
The test environment is configured as follows:
| System | Kernel | MySQL version | Memory |
| CentOS release 5.8 | Linux 2.6.18-308.el5xen | MySQL 5.5.35 | 2G |
2. Architecture
Server list:
| IP | Hostname | Role |
| 192.168.2.7 | haproxy001 | manager |
| 192.168.1.241 | hd-dm-test01 | node |
| 192.168.1.242 | hd-dm-test02 | node |
| 192.168.1.243 | hd-dm-test03 | node |
Architecture diagram:
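3. Adding SSH public key trust
(1) Set up public key trust among the node servers, and from the manager server to every node server.
(2) Generate a key pair on each server and exchange the public keys: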
[root@haproxy001 ~]$ ssh-keygen -t rsa
The result:
[root@haproxy001 ~]$ ls -l ~/.ssh/
-rw------- 1 leiche leiche 1675 07-04 10:18 id_rsa
-rw-r--r-- 1 leiche leiche 399 07-04 10:18 id_rsa.pub
-rw-r--r-- 1 leiche leiche 1182 07-04 10:27 known_hosts
Set up public key trust from the manager to the nodes:
[root@haproxy001 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.1.241
[root@haproxy001 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.1.242
[root@haproxy001 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.1.243
Set up public key trust among the node servers. The commands for hd-dm-test01 are shown; repeat them on hd-dm-test02 and hd-dm-test03 toward the other two nodes:
[root@hd-dm-test01 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.1.242
[root@hd-dm-test01 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@172.16.1.243
ssh-copy-id appends the public key to the authorized_keys file under ~/.ssh/ on the target host, creating the file if it does not exist. You can open the file to check:
[root@hd-dm-test03 .ssh]# sudo cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA35i+lWMDeWJYvamobI3OdmSFXLMt4ngsI6aiJCkv2ZMaeoKEzdeK2GWP9P/F7dU2j1DCgKh8yzF9o4E713k5KW92RmeyglDt590R0PPNvGDFAauQ5wEkufylYLbaJ0fwsJPjeM4ZBrT7RqdYGn3lGQFW32Cih9LcV2A+8NkQbrh/kG2wcTiaejFQFgSKn87nsokhg5L/zY6qkeuV0dPfqfpeSI8uhTI+VUC83/5odeunXrPE30o6fZZDpp2oszt2TEElldZ6pS9mL3ZhGuP/o/IPe+w5/cK9J4C815y2mr9Agr/UUQoj1K4WWRn5uZ5y/pyvAAlbPteNVVlx4djoIw== root@haproxy001
...
(3) Check that passwordless login works:
[root@haproxy001 ~]# ssh 172.16.1.243 "/sbin/ifconfig |grep 'inet addr' |head -1"
inet addr:172.16.1.243 Bcast:172.16.1.255 Mask:255.255.255.0
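The trust requirement in steps (1) to (3) amounts to a list of (source, target) pairs: the manager must reach every node, and every node must reach every other node. The sketch below is not part of MHA. The function names list_trust_pairs and check_ssh_from_here are hypothetical helpers, and the host addresses in the examples are the ones used in the commands above.

```shell
#!/bin/bash
# Hypothetical helper: print every "source target" pair that needs key trust.
# First argument is the manager host; the remaining arguments are node hosts.
list_trust_pairs() {
    local manager=$1
    shift
    local src dst
    # The manager must be able to log in to every node.
    for dst in "$@"; do
        echo "$manager $dst"
    done
    # Every node must be able to log in to every other node.
    for src in "$@"; do
        for dst in "$@"; do
            if [ "$src" != "$dst" ]; then
                echo "$src $dst"
            fi
        done
    done
}

# Hypothetical helper: from the current host, verify passwordless login to
# each target. BatchMode=yes makes ssh fail instead of asking for a password.
check_ssh_from_here() {
    local dst rc=0
    for dst in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$dst" true; then
            echo "ok: $dst"
        else
            echo "FAILED: $dst"
            rc=1
        fi
    done
    return $rc
}

# Example: list_trust_pairs haproxy001 172.16.1.241 172.16.1.242 172.16.1.243
# Example: check_ssh_from_here 172.16.1.241 172.16.1.242 172.16.1.243
```

With one manager and three nodes the first function prints nine pairs. Running check_ssh_from_here on each source host in turn covers all of them.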
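4. Installing the MHA packages
Install the manager package on the manager server (192.168.2.7):
rpm -ivh mha4mysql-manager-0.53-0.noarch.rpm
Install the node package on the manager server and on every node server:
sudo rpm -ivh mha4mysql-node-0.53-0.noarch.rpm
Dependencies (install these before the RPMs above, because rpm -ivh does not download missing dependencies):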
sudo yum -y install perl-DBD-MySQL.x86_64
sudo yum -y install perl-Log-Dispatch
sudo yum -y install perl-Config-Tiny
sudo yum -y install perl-Parallel-ForkManager
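Before installing the MHA RPMs it can help to confirm that the Perl modules are actually loadable. The following is a minimal sketch. The function name missing_perl_modules is hypothetical, and the module names in the example are an assumption derived from the package names above.

```shell
#!/bin/bash
# Hypothetical helper: print each Perl module that cannot be loaded.
missing_perl_modules() {
    local mod
    for mod in "$@"; do
        # "perl -MModule -e 1" exits non-zero when the module is not installed.
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "$mod"
        fi
    done
}

# Example (module names assumed from the yum package names above):
# missing_perl_modules DBD::mysql Log::Dispatch Config::Tiny Parallel::ForkManager
```

An empty result means every listed module loaded. Any name printed points at a package that still needs to be installed.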
5. Configuration file
Create the directories on the manager server:
mkdir /etc/masterha
mkdir -p /masterha/app1
Create the configuration file /etc/masterha/app1.cnf:
[server default]
#manager dir
manager_workdir=/masterha/app1
manager_log=/masterha/app1/manager.log
remote_workdir=/masterha/app1
#mysql manager user
user=root
password=123456
#node server user
ssh_user=root
#replication_user
repl_user=repluser
repl_password=hoodong
#checking master every second
ping_interval=1
#promote script
#shutdown_script=""
#master_ip_failover_script="/usr/local/bin/master_ip_failover"
#master_ip_online_change_script=""
#report_script=""
[server1]
hostname=172.16.1.241
master_binlog_dir="/data/mysqllog/3306"
ssh_port=22
candidate_master=1
[server2]
hostname=172.16.1.242
master_binlog_dir="/data/mysqllog/3306"
ssh_port=22
candidate_master=1
[server3]
hostname=172.16.1.243
master_binlog_dir="/data/mysqllog/3306"
ssh_port=22
candidate_master=1
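A quick way to confirm which servers the manager will monitor is to list the hostname of each [serverN] section. This is a sketch only. The function name list_mha_hosts is hypothetical, and it assumes the key=value layout without surrounding spaces that the file above uses.

```shell
#!/bin/bash
# Hypothetical helper: print the hostname of every [serverN] section
# in an MHA configuration file passed as the first argument.
list_mha_hosts() {
    awk -F= '
        # On each section header, remember whether it is a [serverN] block.
        /^\[/ { in_server = ($0 ~ /^\[server[0-9]+\]/) }
        # Inside a [serverN] block, print the value of the hostname key.
        in_server && $1 == "hostname" { print $2 }
    ' "$1"
}

# Example: list_mha_hosts /etc/masterha/app1.cnf
```

Because the file holds plaintext passwords, restricting it with chmod 600 /etc/masterha/app1.cnf is a reasonable precaution.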
6. Checks
Check SSH:
[root@haproxy001 app1]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Sat Jul 5 12:57:24 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jul 5 12:57:24 2014 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Jul 5 12:57:24 2014 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Jul 5 12:57:24 2014 - [info] Starting SSH connection tests..
Sat Jul 5 12:57:26 2014 - [debug]
Sat Jul 5 12:57:24 2014 - [debug] Connecting via SSH from root@172.16.1.241(172.16.1.241:22) to root@172.16.1.242(172.16.1.242:22)..
Sat Jul 5 12:57:24 2014 - [debug] ok.
Sat Jul 5 12:57:24 2014 - [debug] Connecting via SSH from root@172.16.1.241(172.16.1.241:22) to root@172.16.1.243(172.16.1.243:22)..
Sat Jul 5 12:57:26 2014 - [debug] ok.
Sat Jul 5 12:57:26 2014 - [debug]
Sat Jul 5 12:57:24 2014 - [debug] Connecting via SSH from root@172.16.1.242(172.16.1.242:22) to root@172.16.1.241(172.16.1.241:22)..
Sat Jul 5 12:57:25 2014 - [debug] ok.
Sat Jul 5 12:57:25 2014 - [debug] Connecting via SSH from root@172.16.1.242(172.16.1.242:22) to root@172.16.1.243(172.16.1.243:22)..
Sat Jul 5 12:57:26 2014 - [debug] ok.
Sat Jul 5 12:57:27 2014 - [debug]
Sat Jul 5 12:57:25 2014 - [debug] Connecting via SSH from root@172.16.1.243(172.16.1.243:22) to root@172.16.1.241(172.16.1.241:22)..
Sat Jul 5 12:57:26 2014 - [debug] ok.
Sat Jul 5 12:57:26 2014 - [debug] Connecting via SSH from root@172.16.1.243(172.16.1.243:22) to root@172.16.1.242(172.16.1.242:22)..
Sat Jul 5 12:57:27 2014 - [debug] ok.
Sat Jul 5 12:57:27 2014 - [info] All SSH connection tests passed successfully.
Check replication:
[root@haproxy001 app1]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Sat Jul 5 12:57:05 2014 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jul 5 12:57:05 2014 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Jul 5 12:57:05 2014 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Jul 5 12:57:05 2014 - [info] MHA::MasterMonitor version 0.53.
Sat Jul 5 12:57:06 2014 - [info] Dead Servers:
Sat Jul 5 12:57:06 2014 - [info] Alive Servers:
Sat Jul 5 12:57:06 2014 - [info] 172.16.1.241(172.16.1.241:3306)
Sat Jul 5 12:57:06 2014 - [info] 172.16.1.242(172.16.1.242:3306)
Sat Jul 5 12:57:06 2014 - [info] 172.16.1.243(172.16.1.243:3306)
Sat Jul 5 12:57:06 2014 - [info] Alive Slaves:
Sat Jul 5 12:57:06 2014 - [info] 172.16.1.242(172.16.1.242:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Sat Jul 5 12:57:06 2014 - [info] Replicating from 172.16.1.241(172.16.1.241:3306)
Sat Jul 5 12:57:06 2014 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Jul 5 12:57:06 2014 - [info] 172.16.1.243(172.16.1.243:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Sat Jul 5 12:57:06 2014 - [info] Replicating from 172.16.1.241(172.16.1.241:3306)
Sat Jul 5 12:57:06 2014 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Jul 5 12:57:06 2014 - [info] Current Alive Master: 172.16.1.241(172.16.1.241:3306)
Sat Jul 5 12:57:06 2014 - [info] Checking slave configurations..
Sat Jul 5 12:57:06 2014 - [info] Checking replication filtering settings..
Sat Jul 5 12:57:06 2014 - [info] binlog_do_db= , binlog_ignore_db=
Sat Jul 5 12:57:06 2014 - [info] Replication filtering check ok.
Sat Jul 5 12:57:06 2014 - [info] Starting SSH connection tests..
Sat Jul 5 12:57:09 2014 - [info] All SSH connection tests passed successfully.
Sat Jul 5 12:57:09 2014 - [info] Checking MHA Node version..
Sat Jul 5 12:57:09 2014 - [info] Version check ok.
Sat Jul 5 12:57:09 2014 - [info] Checking SSH publickey authentication settings on the current master..
Sat Jul 5 12:57:10 2014 - [info] HealthCheck: SSH to 172.16.1.241 is reachable.
Sat Jul 5 12:57:10 2014 - [info] Master MHA Node version is 0.53.
Sat Jul 5 12:57:10 2014 - [info] Checking recovery script configurations on the current master..
Sat Jul 5 12:57:10 2014 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysqllog/3306 --output_file=/masterha/app1/save_binary_logs_test --manager_version=0.53 --start_file=mysql3306.000008
Sat Jul 5 12:57:10 2014 - [info] Connecting to root@172.16.1.241(172.16.1.241)..
Creating /masterha/app1 if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /data/mysqllog/3306, up to mysql3306.000008
Sat Jul 5 12:57:11 2014 - [info] Master setting check done.
Sat Jul 5 12:57:11 2014 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Jul 5 12:57:11 2014 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=172.16.1.242 --slave_ip=172.16.1.242 --slave_port=3306 --workdir=/masterha/app1 --target_version=5.5.35-log --manager_version=0.53 --relay_log_info=/data/mysqldata/3306/relay-log.info --relay_dir=/data/mysqldata/3306/ --slave_pass=xxx
Sat Jul 5 12:57:11 2014 - [info] Connecting to root@172.16.1.242(172.16.1.242:22)..
Checking slave recovery environment settings..
Opening /data/mysqldata/3306/relay-log.info ... ok.
Relay log found at /data/mysqllog/3306, up to mysql-relay-bin.000005
Temporary relay log file is /data/mysqllog/3306/mysql-relay-bin.000005
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sat Jul 5 12:57:11 2014 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=172.16.1.243 --slave_ip=172.16.1.243 --slave_port=3306 --workdir=/masterha/app1 --target_version=5.5.35-log --manager_version=0.53 --relay_log_info=/data/mysqldata/3306/relay-log.info --relay_dir=/data/mysqldata/3306/ --slave_pass=xxx
Sat Jul 5 12:57:11 2014 - [info] Connecting to root@172.16.1.243(172.16.1.243:22)..
Checking slave recovery environment settings..
Opening /data/mysqldata/3306/relay-log.info ... ok.
Relay log found at /data/mysqllog/3306, up to mysql-relay-bin.000016
Temporary relay log file is /data/mysqllog/3306/mysql-relay-bin.000016
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sat Jul 5 12:57:12 2014 - [info] Slaves settings check done.
Sat Jul 5 12:57:12 2014 - [info]
172.16.1.241 (current master)
+--172.16.1.242
+--172.16.1.243
Sat Jul 5 12:57:12 2014 - [info] Checking replication health on 172.16.1.242..
Sat Jul 5 12:57:12 2014 - [info] ok.
Sat Jul 5 12:57:12 2014 - [info] Checking replication health on 172.16.1.243..
Sat Jul 5 12:57:12 2014 - [info] ok.
Sat Jul 5 12:57:12 2014 - [warning] master_ip_failover_script is not defined.
Sat Jul 5 12:57:12 2014 - [warning] shutdown_script is not defined.
Sat Jul 5 12:57:12 2014 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
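The two checks above can be chained so that the manager is only started when both pass. The sketch below uses hypothetical function names (repl_check_passed, run_mha_checks). It relies only on the two commands and the final "MySQL Replication Health is OK." line shown in the output above.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the text on stdin contains the line
# that a passing masterha_check_repl run prints in the output above.
repl_check_passed() {
    grep -q 'MySQL Replication Health is OK\.'
}

# Hypothetical wrapper: run both checks and stop at the first failure.
run_mha_checks() {
    local conf=$1 out
    masterha_check_ssh --conf="$conf" || return 1
    # Capture the replication check output so it can be shown and inspected.
    out=$(masterha_check_repl --conf="$conf" 2>&1)
    echo "$out"
    echo "$out" | repl_check_passed
}

# Example: run_mha_checks /etc/masterha/app1.cnf && echo "ready to start manager"
```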
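7. Starting the manager process
Start the manager process. The log is written to the manager_log path set in the configuration file (/masterha/app1/manager.log above):
masterha_manager --conf=/etc/masterha/app1.cnf &
View the log: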
Sat Jul 5 17:41:15 2014 - [info] MHA::MasterMonitor version 0.53.
Sat Jul 5 17:41:15 2014 - [info] Dead Servers:
Sat Jul 5 17:41:15 2014 - [info] Alive Servers:
Sat Jul 5 17:41:15 2014 - [info] 172.16.1.241(172.16.1.241:3306)
Sat Jul 5 17:41:15 2014 - [info] 172.16.1.242(172.16.1.242:3306)
Sat Jul 5 17:41:15 2014 - [info] 172.16.1.243(172.16.1.243:3306)
Sat Jul 5 17:41:15 2014 - [info] Alive Slaves:
Sat Jul 5 17:41:15 2014 - [info] 172.16.1.242(172.16.1.242:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Sat Jul 5 17:41:15 2014 - [info] Replicating from 172.16.1.241(172.16.1.241:3306)
Sat Jul 5 17:41:15 2014 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Jul 5 17:41:15 2014 - [info] 172.16.1.243(172.16.1.243:3306) Version=5.5.35-log (oldest major version between slaves) log-bin:enabled
Sat Jul 5 17:41:15 2014 - [info] Replicating from 172.16.1.241(172.16.1.241:3306)
Sat Jul 5 17:41:15 2014 - [info] Primary candidate for the new Master (candidate_master is set)
Sat Jul 5 17:41:15 2014 - [info] Current Alive Master: 172.16.1.241(172.16.1.241:3306)
Sat Jul 5 17:41:15 2014 - [info] Checking slave configurations..
Sat Jul 5 17:41:15 2014 - [info] Checking replication filtering settings..
Sat Jul 5 17:41:15 2014 - [info] binlog_do_db= , binlog_ignore_db=
Sat Jul 5 17:41:15 2014 - [info] Replication filtering check ok.
Sat Jul 5 17:41:15 2014 - [info] Starting SSH connection tests..
Sat Jul 5 17:41:18 2014 - [info] All SSH connection tests passed successfully.
Sat Jul 5 17:41:18 2014 - [info] Checking MHA Node version..
Sat Jul 5 17:41:19 2014 - [info] Version check ok.
Sat Jul 5 17:41:19 2014 - [info] Checking SSH publickey authentication settings on the current master..
Sat Jul 5 17:41:20 2014 - [info] HealthCheck: SSH to 172.16.1.241 is reachable.
Sat Jul 5 17:41:20 2014 - [info] Master MHA Node version is 0.53.
Sat Jul 5 17:41:20 2014 - [info] Checking recovery script configurations on the current master..
Sat Jul 5 17:41:20 2014 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysqllog/3306 --output_file=/masterha/app1/save_binary_logs_test --manager_version=0.53 --start_file=mysql3306.000008
Sat Jul 5 17:41:20 2014 - [info] Connecting to root@172.16.1.241(172.16.1.241)..
Creating /masterha/app1 if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /data/mysqllog/3306, up to mysql3306.000008
Sat Jul 5 17:41:20 2014 - [info] Master setting check done.
Sat Jul 5 17:41:20 2014 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Jul 5 17:41:20 2014 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=172.16.1.242 --slave_ip=172.16.1.242 --slave_port=3306 --workdir=/masterha/app1 --target_version=5.5.35-log --manager_version=0.53 --relay_log_info=/data/mysqldata/3306/relay-log.info --relay_dir=/data/mysqldata/3306/ --slave_pass=xxx
Sat Jul 5 17:41:20 2014 - [info] Connecting to root@172.16.1.242(172.16.1.242:22)..
Checking slave recovery environment settings..
Opening /data/mysqldata/3306/relay-log.info ... ok.
Relay log found at /data/mysqllog/3306, up to mysql-relay-bin.000005
Temporary relay log file is /data/mysqllog/3306/mysql-relay-bin.000005
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sat Jul 5 17:41:21 2014 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=172.16.1.243 --slave_ip=172.16.1.243 --slave_port=3306 --workdir=/masterha/app1 --target_version=5.5.35-log --manager_version=0.53 --relay_log_info=/data/mysqldata/3306/relay-log.info --relay_dir=/data/mysqldata/3306/ --slave_pass=xxx
Sat Jul 5 17:41:21 2014 - [info] Connecting to root@172.16.1.243(172.16.1.243:22)..
Checking slave recovery environment settings..
Opening /data/mysqldata/3306/relay-log.info ... ok.
Relay log found at /data/mysqllog/3306, up to mysql-relay-bin.000016
Temporary relay log file is /data/mysqllog/3306/mysql-relay-bin.000016
Testing mysql connection and privileges.. done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Sat Jul 5 17:41:21 2014 - [info] Slaves settings check done.
Sat Jul 5 17:41:21 2014 - [info]
172.16.1.241 (current master)
+--172.16.1.242
+--172.16.1.243
Sat Jul 5 17:41:21 2014 - [warning] master_ip_failover_script is not defined.
Sat Jul 5 17:41:21 2014 - [warning] shutdown_script is not defined.
Sat Jul 5 17:41:21 2014 - [info] Set master ping interval 1 seconds.
Sat Jul 5 17:41:21 2014 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sat Jul 5 17:41:21 2014 - [info] Starting ping health check on 172.16.1.241(172.16.1.241:3306)..
Sat Jul 5 17:41:21 2014 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
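The last log line, "Ping(SELECT) succeeded", is the sign that the manager has finished its startup checks and is monitoring the master. The sketch below starts the manager detached from the terminal and polls the log for that line. The function names manager_is_monitoring and start_mha_manager are hypothetical, and the ".stdout" file name and the 30-second default are arbitrary assumptions.

```shell
#!/bin/bash
# Hypothetical helper: succeed once the given log file contains the
# "Ping(SELECT) succeeded" line that marks the start of monitoring.
manager_is_monitoring() {
    grep -q 'Ping(SELECT) succeeded' "$1"
}

# Hypothetical wrapper: start the manager in the background with nohup,
# then poll the manager log once per second until monitoring has begun.
# Arguments: config file, manager log path, optional number of tries.
start_mha_manager() {
    local conf=$1 log=$2 tries=${3:-30}
    nohup masterha_manager --conf="$conf" < /dev/null >> "$log.stdout" 2>&1 &
    while [ "$tries" -gt 0 ]; do
        manager_is_monitoring "$log" 2>/dev/null && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Example: start_mha_manager /etc/masterha/app1.cnf /masterha/app1/manager.log
```

A log left over from an earlier run already contains the line, so rotate or truncate it first if you rely on this check. MHA also ships masterha_check_status --conf=/etc/masterha/app1.cnf, which reports whether the manager is running.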