Keepalived Non-Preemptive Mode Explained: Nginx + keepalived in Practice
Background: two HAProxy nodes are made highly available with keepalived.
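Overview: when HAProxy runs in production and the master fails and later recovers, keepalived can behave in one of two ways: preemptive or non-preemptive. In preemptive mode, the backup takes over the service when the master goes down; once the master recovers, the VIP floats back to it and it takes over again. That is one extra, unnecessary VIP switch, which is usually not wanted in production. What production setups usually want is for the recovered former master to come back as backup and not take over the service. This is non-preemptive mode.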
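Preemptive: server1 is master and server2 is backup, and the master's priority is higher than the backup's. After keepalived starts, server1 becomes master and server2 becomes backup. When server1 goes down, server2 takes over the service. When server1 recovers, it takes the service back and becomes master again, and server2 drops back to backup.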
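Non-preemptive: server1 and server2 are both configured as backup, and the nopreempt option is set. Pay attention to the order in which keepalived is started on the servers: the node started first is promoted to master, regardless of priority.
For example, server1 obtains the master role and server2 is backup. When server1 goes down, server2 takes over the service and is promoted to master. When server1 recovers, it comes back as backup and does not contend for server2's master role; server2 stays master.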
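Key point: in non-preemptive mode, state must be BACKUP on both nodes, and nopreempt must be configured.
Note: with this configuration, start order matters. The node whose keepalived starts first obtains the master role, and priority no longer decides it.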
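Summary: in preemptive mode, a MASTER that recovers from a failure takes the VIP back from the BACKUP node. In non-preemptive mode, the recovered MASTER does not take the VIP back from the BACKUP that was promoted to MASTER. Non-preemptive mode requires:
1. state is set to BACKUP on both nodes.
2. nopreempt is added on both nodes.
3. One node has a higher priority than the other.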
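The first two requirements above can be checked mechanically. The following is a minimal sketch, not something shipped with keepalived: `check_nopreempt_conf` and `conf_priority` are hypothetical helper names, and the sketch assumes a configuration file with a single vrrp_instance block, like the ones used in this article.

```shell
#!/bin/sh
# Hypothetical helpers (not part of keepalived). Assumes one vrrp_instance per file.

# conf_priority FILE: print the first "priority N" value found in FILE.
conf_priority() {
    awk '$1 == "priority" { print $2; exit }' "$1"
}

# check_nopreempt_conf FILE: print "ok" if FILE has state BACKUP and nopreempt,
# otherwise print what is missing and return 1.
check_nopreempt_conf() {
    conf="$1"
    rc=0
    # Requirement 1: state must be BACKUP.
    if ! grep -Eq '^[[:space:]]*state[[:space:]]+BACKUP([[:space:]]|$)' "$conf"; then
        echo "state is not BACKUP"
        rc=1
    fi
    # Requirement 2: nopreempt must be present (a commented-out line does not count).
    if ! grep -Eq '^[[:space:]]*nopreempt([[:space:]]|$)' "$conf"; then
        echo "nopreempt is missing"
        rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        echo "ok"
    fi
    return "$rc"
}
```

Running `check_nopreempt_conf /etc/keepalived/keepalived.conf` on each node covers the first two requirements; comparing the `conf_priority` output of the two nodes covers the third.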
How keepalived works
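keepalived provides VRRP and health-check functionality. It can be used just to provide a floating VIP between two machines (the VRRP virtual router function), which is a simple way to build a two-node hot-standby HA setup. keepalived implements high availability on top of VRRP, the Virtual Router Redundancy Protocol, which can be thought of as a protocol for making routers highly available: N routers that provide the same function form a router group with one master and multiple backups. The master holds a VIP that serves the outside (other machines on the router's LAN use this VIP as their default route) and sends multicast advertisements. When the backups stop receiving VRRP packets, they consider the master down and elect one of the backups as the new master according to VRRP priority. This keeps the router highly available.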
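How long a backup waits before declaring the master dead follows from the VRRPv2 specification (RFC 3768): Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time, with Skew_Time = (256 - Priority) / 256 seconds. The sketch below only does this arithmetic; `master_down_interval` is a hypothetical helper name, and the sample values (advert_int 5, priority 50) are the backup node's settings from the configuration later in this article.

```shell
#!/bin/sh
# master_down_interval ADVERT_INT PRIORITY
# Print the VRRPv2 master-down interval in seconds (RFC 3768 formula).
master_down_interval() {
    awk -v adv="$1" -v prio="$2" 'BEGIN {
        skew = (256 - prio) / 256          # Skew_Time: smaller for higher priority
        printf "%.3f\n", 3 * adv + skew    # Master_Down_Interval
    }'
}

master_down_interval 5 50    # prints 15.805
```

A higher priority gives a slightly shorter wait, which is why the higher-priority backup wins when several backups time out together.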
Keepalived non-preemption (nopreempt)
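When the Master has a problem, the Backup is elected as the new Master. When the former Master recovers from the failure, does it become Master again or stay Backup? By default, if non-preemption is not set, the former Master preempts and becomes Master again once it is back, so the whole process involves two switches: Master -> Backup when the primary node fails, and Backup -> Master when it recovers. Frequent switching like this is not acceptable for the business, so we want the recovered Master to come back as Backup, which requires non-preemption.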
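keepalived provides the nopreempt option for this, but it can only be used on a node whose state is BACKUP. What we actually want is for the Master not to preempt, so there is no way around it: the Master's state has to be set to BACKUP as well, that is, both load balancers must set state to BACKUP. Then who becomes Master? When both nodes start at about the same time, priority decides: the node with the higher priority becomes Master and the other stays Backup. If a Master already exists, a node that starts later stays Backup even if its priority is higher.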
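The difference between the two modes comes down to one decision a BACKUP node makes when it hears an advertisement from the current Master. The sketch below models that decision as described for the Backup state in RFC 3768; `backup_decision` is a hypothetical name, and this is a simplified model rather than keepalived's actual code.

```shell
#!/bin/sh
# backup_decision LOCAL_PRIO MASTER_PRIO NOPREEMPT
# NOPREEMPT is "yes" or "no". Prints what a node in BACKUP state does when it
# receives an advertisement from the current master.
backup_decision() {
    local_prio="$1"
    master_prio="$2"
    nopreempt="$3"
    if [ "$nopreempt" = "no" ] && [ "$local_prio" -gt "$master_prio" ]; then
        # Preemption allowed and we outrank the master: take the VIP back.
        echo "preempt"
    else
        # Either nopreempt is set or the master outranks us: keep listening.
        echo "stay-backup"
    fi
}

backup_decision 100 50 no     # prints preempt
backup_decision 100 50 yes    # prints stay-backup
```

The second call is the case this article is after: the recovered node has priority 100, the current master has 50, and nopreempt keeps the recovered node in BACKUP.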
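The keepalived configuration on the master node (non-preemptive) is as follows:
[root@real-server1 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
    router_id real-server1    # identifier of the machine running keepalived
    script_user root
    enable_script_security
}
vrrp_script chk_nginx {
    script "/data/shell/check_nginx_status.sh"    # service check script; remember to make it executable
    interval 2    # interval between script runs, in seconds (default 1)
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens32    # network interface that VRRP runs on and the VIP is bound to
    virtual_router_id 151    # virtual router ID, must match the other node
    priority 100
    nopreempt    # do not preempt
    advert_int 5    # VRRP advertisement (heartbeat) interval in seconds, default 1; must be the same on both nodes
    authentication {
        auth_type PASS    # password authentication between the nodes (change the password in production); at most 8 characters
        auth_pass 1111
    }
    virtual_ipaddress {    # virtual IP address
        192.168.179.199
    }
    track_script {
        chk_nginx
    }
}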
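# You can test this script: start keepalived normally, run pkill nginx, and watch what happens. Check /var/log/messages or systemctl status keepalived to verify that the script works; once it does, reference it in your keepalived configuration file.
[root@real-server1 ~]# cat /data/shell/check_nginx_status.sh
#!/bin/bash
# Count running nginx processes, excluding grep itself and this check script.
nginx_status=$(ps -ef | grep nginx | grep -v grep | grep -v check | wc -l)
# No nginx process left: stop keepalived so the VIP fails over to the other node.
if [ "$nginx_status" -eq 0 ]; then
    systemctl stop keepalived.service
fi
[root@real-server1 ~]# chmod +x /data/shell/check_nginx_status.sh
# The script implements the health check: when the nginx process count is 0, keepalived must switch over so that the VIP floats to the other node. Note that while nginx is down, keepalived cannot stay up: the script runs systemctl stop keepalived.service, so if nginx has not been started, keepalived shuts itself down shortly after it starts.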
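Stopping keepalived from inside the check script is one approach. A common alternative, sketched below under assumptions, is to let the script only report health through its exit status (0 for healthy, non-zero for failed) and leave the state change to keepalived. To my understanding, a failed tracked script with no weight configured puts the VRRP instance into FAULT state so the VIP is released, and the instance recovers once the script succeeds again; verify this against your keepalived version. `count_procs` and `check_nginx` are hypothetical helper names.

```shell
#!/bin/sh
# count_procs NAME: read one process name per line on stdin (the format printed
# by `ps -e -o comm=`) and print how many lines equal NAME exactly.
count_procs() {
    grep -cxF -- "$1"
}

# check_nginx: succeed (status 0) if at least one nginx process is running.
check_nginx() {
    n=$(ps -e -o comm= | count_procs nginx)
    [ "$n" -gt 0 ]
}
```

A check script built this way would end with a call to `check_nginx`, so that the script's exit status is the function's status. Because the script never stops the daemon, keepalived can be started even while nginx is down.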
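# advert_int 5 is the VRRP advertisement (heartbeat) interval in seconds (default 1), i.e. how often the multicast advertisement is sent. The captured multicast packets below carry the virtual router ID (virtual_router_id 151) and the priority (priority 100).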
[root@localhost shell]# tcpdump -i ens32 -nn net 224.0.0.18
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens32, link-type EN10MB (Ethernet), capture size 262144 bytes
10:26:42.910170 IP 192.168.179.102 > 224.0.0.18: VRRPv2, Advertisement, vrid 151, prio 100, authtype simple, intvl 5s, length 20
10:26:47.911831 IP 192.168.179.102 > 224.0.0.18: VRRPv2, Advertisement, vrid 151, prio 100, authtype simple, intvl 5s, length 20
10:26:52.915502 IP 192.168.179.102 > 224.0.0.18: VRRPv2, Advertisement, vrid 151, prio 100, authtype simple, intvl 5s, length 20
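When comparing what the nodes advertise, it can help to reduce the capture to the fields that matter. A small sketch, assuming the VRRPv2 tcpdump line format shown above; `parse_vrrp_advert` is a hypothetical helper name.

```shell
#!/bin/sh
# parse_vrrp_advert: read tcpdump lines on stdin and print
# "source-address vrid priority" for every VRRPv2 advertisement.
parse_vrrp_advert() {
    awk '/VRRPv2, Advertisement/ {
        vrid = ""; prio = ""
        for (i = 1; i < NF; i++) {
            if ($i == "vrid") vrid = $(i + 1)
            if ($i == "prio") prio = $(i + 1)
        }
        gsub(",", "", vrid)    # strip the trailing comma tcpdump prints
        gsub(",", "", prio)
        print $3, vrid, prio
    }'
}
```

For a live capture this could be used as `tcpdump -l -i ens32 -nn net 224.0.0.18 | parse_vrrp_advert`; for the first captured line above it prints `192.168.179.102 151 100`.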
Verify that the configuration is correct
[root@real-server1 ~]# ps -ef | grep keepalived | grep -v grep
root 70592 1 0 20:05 ? 00:00:00 /usr/sbin/keepalived -D
root 70593 70592 0 20:05 ? 00:00:00 /usr/sbin/keepalived -D
root 70594 70592 0 20:05 ? 00:00:00 /usr/sbin/keepalived -D
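# When keepalived starts normally it runs three processes: a parent process that monitors its children, a VRRP child process, and a checkers child process.
Both child processes are supervised by the system watchdog, and each handles its own job. The Healthcheck child checks the health of the servers, for example HTTP or LVS. If the healthcheck process finds that the service on the master is unavailable, it notifies the VRRP child on the same machine, which stops sending advertisements, removes the virtual IP, and switches to BACKUP state.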
Jul 31 20:05:38 real-server1 Keepalived[70591]: Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Jul 31 20:05:38 real-server1 Keepalived[70591]: Opening file '/etc/keepalived/keepalived.conf'.
Jul 31 20:05:38 real-server1 systemd: PID file /var/run/keepalived.pid not readable (yet?) after start.
Jul 31 20:05:38 real-server1 Keepalived[70592]: Starting Healthcheck child process, pid=70593
Jul 31 20:05:38 real-server1 Keepalived[70592]: Starting VRRP child process, pid=70594
Jul 31 20:05:38 real-server1 systemd: Started LVS and VRRP High Availability Monitor.
Jul 31 20:05:38 real-server1 Keepalived_healthcheckers[70593]: Opening file '/etc/keepalived/keepalived.conf'.
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: Registering Kernel netlink reflector
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: Registering Kernel netlink command channel
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: Registering gratuitous ARP shared channel
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: Opening file '/etc/keepalived/keepalived.conf'.
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: VRRP_Instance(VI_1) removing protocol VIPs.
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: Using LinkWatch kernel netlink reflector...
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Jul 31 20:05:38 real-server1 Keepalived_vrrp[70594]: VRRP_Script(chk_nginx) succeeded
# Test whether the health check script works: as the log below shows, the script does its job
[root@real-server1 ~]# pkill nginx
Jul 31 20:06:50 real-server1 Keepalived[70592]: Stopping
Jul 31 20:06:50 real-server1 systemd: Stopping LVS and VRRP High Availability Monitor...
Jul 31 20:06:50 real-server1 Keepalived_vrrp[70594]: VRRP_Instance(VI_1) sent 0 priority
Jul 31 20:06:50 real-server1 Keepalived_vrrp[70594]: VRRP_Instance(VI_1) removing protocol VIPs.
Jul 31 20:06:50 real-server1 Keepalived_healthcheckers[70593]: Stopped
Jul 31 20:06:51 real-server1 Keepalived_vrrp[70594]: Stopped
Jul 31 20:06:51 real-server1 systemd: Stopped LVS and VRRP High Availability Monitor.
Jul 31 20:06:51 real-server1 Keepalived[70592]: Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
The keepalived configuration on the backup node is as follows:
[root@real-server2 ~]# cat /etc/keepalived/keepalived.conf
global_defs {
    router_id real-server2
    script_user root
    enable_script_security
}
vrrp_script chk_nginx {
    script "/data/shell/check_nginx_status.sh"
    interval 2
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens32
    virtual_router_id 151
    priority 50
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.179.199
    }
    track_script {
        chk_nginx
    }
}
Demonstration of non-preemption
# Test on the Master node: kill the nginx process; the health check script then makes the VIP float to the backup node
[root@real-server1 ~]# ip a
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:61:90:c1 brd ff:ff:ff:ff:ff:ff
inet 192.168.179.103/24 brd 192.168.179.255 scope global ens32
valid_lft forever preferred_lft forever
inet 192.168.179.199/32 scope global ens32
valid_lft forever preferred_lft forever
inet 192.168.179.199/24 brd 192.168.179.255 scope global secondary ens32:1
valid_lft forever preferred_lft forever
inet6 fe80::f54d:5639:6237:2d0e/64 scope link
valid_lft forever preferred_lft forever
[root@real-server1 ~]# pkill nginx
# keepalived log on the primary node:
Jul 31 20:27:45 real-server1 Keepalived[72926]: Stopping
Jul 31 20:27:45 real-server1 systemd: Stopping LVS and VRRP High Availability Monitor...
Jul 31 20:27:45 real-server1 Keepalived_vrrp[72928]: VRRP_Instance(VI_1) sent 0 priority
Jul 31 20:27:45 real-server1 Keepalived_vrrp[72928]: VRRP_Instance(VI_1) removing protocol VIPs.
Jul 31 20:27:45 real-server1 Keepalived_healthcheckers[72927]: Stopped
Jul 31 20:27:46 real-server1 Keepalived_vrrp[72928]: Stopped
Jul 31 20:27:46 real-server1 Keepalived[72926]: Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Jul 31 20:27:46 real-server1 systemd: Stopped LVS and VRRP High Availability Monitor.
# Now check on the backup node
[root@real-server2 ~]# ip a
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:a7:ff:f7 brd ff:ff:ff:ff:ff:ff
inet 192.168.179.104/24 brd 192.168.179.255 scope global ens32
valid_lft forever preferred_lft forever
inet 192.168.179.199/32 scope global ens32
valid_lft forever preferred_lft forever
inet6 fe80::831c:6df1:a633:742a/64 scope link
valid_lft forever preferred_lft forever
# keepalived log on the backup node: the failover succeeded
Jul 31 08:27:45 real-server2 Keepalived_vrrp[8961]: VRRP_Instance(VI_1) Transition to MASTER STATE
Jul 31 08:27:50 real-server2 Keepalived_vrrp[8961]: VRRP_Instance(VI_1) Entering MASTER STATE
Jul 31 08:27:50 real-server2 Keepalived_vrrp[8961]: VRRP_Instance(VI_1) setting protocol VIPs.
Jul 31 08:27:50 real-server2 Keepalived_vrrp[8961]: Sending gratuitous ARP on ens32 for 192.168.179.199
Jul 31 08:27:50 real-server2 Keepalived_vrrp[8961]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens32 for 192.168.179.199
Jul 31 08:27:50 real-server2 Keepalived_vrrp[8961]: Sending gratuitous ARP on ens32 for 192.168.179.199
Jul 31 08:27:50 real-server2 Keepalived_vrrp[8961]: Sending gratuitous ARP on ens32 for 192.168.179.199
Jul 31 08:27:50 real-server2 Keepalived_vrrp[8961]: Sending gratuitous ARP on ens32 for 192.168.179.199
Jul 31 08:27:50 real-server2 Keepalived_vrrp[8961]: Sending gratuitous ARP on ens32 for 192.168.179.199
Jul 31 08:27:55 real-server2 Keepalived_vrrp[8961]: Sending gratuitous ARP on ens32 for 192.168.179.199
Jul 31 08:27:55 real-server2 Keepalived_vrrp[8961]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on ens32 for 192.168.179.199
# On the original Master node, start nginx and then keepalived: the node enters BACKUP state and does not preempt
[root@real-server1 ~]# /usr/local/nginx/sbin/nginx
[root@real-server1 ~]# systemctl start keepalived
Jul 31 20:43:33 real-server1 Keepalived[75196]: Starting Healthcheck child process, pid=75197
Jul 31 20:43:33 real-server1 Keepalived[75196]: Starting VRRP child process, pid=75198
Jul 31 20:43:33 real-server1 systemd: Started LVS and VRRP High Availability Monitor.
Jul 31 20:43:33 real-server1 Keepalived_healthcheckers[75197]: Opening file '/etc/keepalived/keepalived.conf'.
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: Registering Kernel netlink reflector
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: Registering Kernel netlink command channel
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: Registering gratuitous ARP shared channel
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: Opening file '/etc/keepalived/keepalived.conf'.
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: VRRP_Instance(VI_1) removing protocol VIPs.
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: Using LinkWatch kernel netlink reflector...
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: VRRP_Instance(VI_1) Entering BACKUP STATE
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Jul 31 20:43:33 real-server1 Keepalived_vrrp[75198]: VRRP_Script(chk_nginx) succeeded
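The `ip a` checks in this walkthrough can be scripted. A minimal sketch, assuming the one-line output format of `ip -o -4 addr show` (address with prefix length in the fourth field); `holds_vip` is a hypothetical helper name.

```shell
#!/bin/sh
# holds_vip VIP: read `ip -o -4 addr show` output on stdin and succeed
# (status 0) if one of the listed addresses equals VIP.
holds_vip() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

On either node, `ip -o -4 addr show dev ens32 | holds_vip 192.168.179.199 && echo "this node holds the VIP"` then tells which side currently owns the address. After the recovery shown above, it should succeed on real-server2 and fail on real-server1.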
To sum up, the configuration above simplifies to the following.
Load balancer A:
vrrp_instance VI_feng
{
    ....
    state BACKUP
    priority 100
    nopreempt
    ....
}
Load balancer B:
vrrp_instance VI_feng
{
    ....
    state BACKUP
    priority 70
    nopreempt
    ....
}
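nopreempt matters on the node with the higher priority, and that node's state must be BACKUP. In other words, for a cluster to be non-preemptive, every node's state has to be set to BACKUP while the priorities still differ as usual, and the higher-priority node gets the nopreempt option (because it is the higher-priority node that would otherwise preempt the VIP). Setting nopreempt on both nodes, as in the simplified configuration, keeps the two files symmetric. Every preemption causes a switch and a VIP move; switching back and forth affects access to the service and interrupts it.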