Theory

Introduction to Keepalived

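Keepalived was originally written for the LVS load balancer, to manage and monitor the state of each service node in an LVS cluster. VRRP support was added later to provide high availability. As a result, besides managing LVS, Keepalived can also serve as a high-availability solution for other services such as Nginx, HAProxy, and MySQL.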

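  Keepalived implements high availability mainly through VRRP. VRRP stands for Virtual Router Redundancy Protocol. It was designed to remove the single point of failure that static routing creates, so that the network keeps running without interruption when an individual node goes down.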

So Keepalived does two jobs: it configures and manages LVS and health-checks the nodes behind it, and it provides high availability for network services.

How Keepalived failover works

Failover between the two nodes of a Keepalived high-availability pair is implemented with VRRP (Virtual Router Redundancy Protocol).

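  While Keepalived is working normally, the Master node keeps sending heartbeat messages (by multicast) to the Backup node to tell it that the Master is still alive. When the Master fails, it can no longer send heartbeats. The Backup stops detecting them, runs its own takeover routine, and takes over the Master's IP resources and services. When the Master recovers, the Backup releases the IP resources and services it took over during the failure and returns to its standby role.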

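What is VRRP

  1. VRRP stands for Virtual Router Redundancy Protocol. It was created to remove the single point of failure of static routing.
  2. VRRP uses an election mechanism to hand the routing task to one of the VRRP routers.
  3. VRRP uses IP multicast (default multicast address 224.0.0.18) for communication between the nodes of a high-availability pair.
  4. In operation the Master sends adverts and the Backup receives them. When the Backup stops receiving adverts from the Master, it starts its takeover routine and takes over the Master's resources. There can be several Backup nodes, elected by priority, but in day-to-day Keepalived operations a single Master/Backup pair is typical.
  5. VRRP adverts can carry authentication data, but the adverts themselves are not encrypted. Keepalived supports the PASS (plaintext password) and AH authentication types, and the plaintext PASS type is still the one usually recommended for the authentication type and password.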
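
The election described in points 2 and 4 can be sketched in a few lines of shell. This is only an illustration under simplifying assumptions: the node names and the `elect_master` helper are hypothetical, and the sketch ignores how real VRRP resolves nodes with equal priority.

```shell
#!/bin/sh
# Illustrative sketch only: pick the node with the highest priority.
# Each argument has the hypothetical form "name:priority".
elect_master() {
    printf '%s\n' "$@" | sort -t: -k2,2nr | head -n 1 | cut -d: -f1
}

# Both nodes are alive: the higher priority wins.
elect_master node-a:100 node-b:98

# node-a has failed, so only node-b takes part in the election.
elect_master node-b:98
```

The first call prints `node-a` and the second prints `node-b`, which mirrors the Master/Backup switch described above.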

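How Keepalived works

The nodes of a Keepalived high-availability pair communicate over VRRP, which uses an election to decide which node is Master and which is Backup. The Master has a higher priority than the Backup, so in normal operation the Master holds all the resources and the Backup waits. When the Master goes down, the Backup takes over its resources and serves traffic in its place.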

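  Within a Keepalived pair, only the Master keeps sending VRRP multicast adverts to tell the Backup that it is still alive, and the Backup does not preempt the Master during that time. When the Master becomes unavailable, that is, when the Backup no longer hears the Master's adverts, the Backup starts the relevant services and takes over the resources to keep the business running. At best, takeover can complete in under 1 second.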
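
How long a Backup waits before declaring the Master dead can be estimated from the VRRPv2 timers in RFC 3768: Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time, where Skew_Time = (256 - Priority) / 256 seconds. The sketch below assumes VRRP version 2, and the helper name `master_down_interval` is made up for illustration.

```shell
#!/bin/sh
# Sketch, assuming the VRRPv2 timer formula from RFC 3768.
# $1 = advertisement interval in seconds (advert_int)
# $2 = priority of the Backup node
master_down_interval() {
    awk -v adv="$1" -v prio="$2" \
        'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

# A Backup with advert_int 1 and priority 98.
master_down_interval 1 98
```

This prints `3.617`. Under these timers, detection alone takes a little over three seconds with a 1-second advert interval, so faster takeover depends on a shorter interval.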

Hands-on walkthrough

Using Keepalived to float a virtual IP between server nodes

  • To keep costs down and the demo simple, I prepared two servers: Master 192.168.56.101 and Backup 192.168.56.102
Installing keepalived and its configuration file layout
# Install keepalived on each server
[root@localhost ~]# yum install keepalived -y

# List the files installed by the package
[root@localhost ~]# rpm -ql keepalived
/etc/keepalived
/etc/keepalived/keepalived.conf (main configuration file)
/etc/sysconfig/keepalived
/usr/bin/genhash
/usr/lib/systemd/system/keepalived.service
/usr/libexec/keepalived
/usr/sbin/keepalived
/usr/share/doc/keepalived-1.3.5
................

# View the configuration file
[root@localhost ~]# cat /etc/keepalived/keepalived.conf 
! Configuration File for keepalived

# Global configuration
global_defs {
   # Notification email addresses
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   # Router ID, a string that identifies this machine
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   # Strict VRRP compliance; turning it off is recommended
#   vrrp_strict
#   vrrp_garp_interval 0
#  vrrp_gna_interval 0
}

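# VRRP instance (the name is arbitrary)
vrrp_instance VI_1 {
    # Marks this instance as the master node
    state MASTER
    # Name of the network interface to use
    interface eth0
    virtual_router_id 51
    # Priority: on failover, the available node with the highest value takes over as master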
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # Virtual IP addresses
    virtual_ipaddress {
        192.168.200.16
        192.168.200.17
        192.168.200.18
    }
}
Master node configuration
global_defs {
   notification_email {
     12345678@qq.com
   }
   notification_email_from 188275291160@163.com
   smtp_server 192.168.56.1
   smtp_connect_timeout 30
   router_id Nginx
 # vrrp_skip_check_adv_addr
 # vrrp_strict
 # vrrp_garp_interval 0
 # vrrp_gna_interval 0
}

vrrp_instance Nginx_1 {
    state MASTER
    # Network interface; find the name with ifconfig
    interface enp0s3
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.103
    }
}
Backup node configuration
! Configuration File for keepalived

global_defs {
   notification_email {
     12345678@qq.com
   }
   notification_email_from 188275291160@163.com
   smtp_server 192.168.56.1
   smtp_connect_timeout 30
   router_id Nginx
 # vrrp_skip_check_adv_addr
 # vrrp_strict
 # vrrp_garp_interval 0
 # vrrp_gna_interval 0
}

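# Use the same VRRP instance name as on the Master
vrrp_instance Nginx_1 {
    state BACKUP
    interface enp0s3
  # virtual_router_id must match the Master's; it marks both nodes as the same VRRP instance
    virtual_router_id 51
  # Priority lower than the Master's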
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.103
    }
}
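
Before starting the two nodes it can help to confirm that both files agree where they must (`virtual_router_id`, `auth_pass`) and differ where they should (`priority`). The following is a rough sketch rather than an official tool: the helpers `get_vrrp_value` and `check_pair` and the temporary file names are hypothetical, and the parsing assumes one `key value` pair per line as in the configurations above.

```shell
#!/bin/sh
# Print the first value of KEY ($2) in the keepalived config FILE ($1).
# Commented-out settings are skipped because their first field is "#".
get_vrrp_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# Succeed when both files share virtual_router_id and auth_pass and the
# first file (the Master) has the higher priority.
check_pair() {
    m=$1
    b=$2
    [ "$(get_vrrp_value "$m" virtual_router_id)" = "$(get_vrrp_value "$b" virtual_router_id)" ] &&
        [ "$(get_vrrp_value "$m" auth_pass)" = "$(get_vrrp_value "$b" auth_pass)" ] &&
        [ "$(get_vrrp_value "$m" priority)" -gt "$(get_vrrp_value "$b" priority)" ]
}

# Demo on two throwaway files that mirror the settings above.
tmp=$(mktemp -d)
printf 'virtual_router_id 51\npriority 100\nauth_pass 1111\n' > "$tmp/master.conf"
printf 'virtual_router_id 51\npriority 98\nauth_pass 1111\n' > "$tmp/backup.conf"
if check_pair "$tmp/master.conf" "$tmp/backup.conf"; then
    echo "pair looks consistent"
fi
rm -r "$tmp"
```

On real nodes the two arguments would be copies of each server's `/etc/keepalived/keepalived.conf`.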
Starting keepalived on the Master
[root@localhost keepalived]# systemctl start keepalived
[root@localhost keepalived]# tail -f /var/log/messages
Aug 15 02:57:08 localhost Keepalived[14810]: Starting VRRP child process, pid=14812
Aug 15 02:57:08 localhost Keepalived_healthcheckers[14811]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 15 02:57:08 localhost Keepalived_vrrp[14812]: Registering Kernel netlink reflector
Aug 15 02:57:08 localhost Keepalived_vrrp[14812]: Registering Kernel netlink command channel
Aug 15 02:57:08 localhost Keepalived_vrrp[14812]: Registering gratuitous ARP shared channel
Aug 15 02:57:08 localhost Keepalived_vrrp[14812]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 15 02:57:08 localhost Keepalived_vrrp[14812]: VRRP_Instance(Nginx_1) removing protocol VIPs.
Aug 15 02:57:08 localhost Keepalived_vrrp[14812]: Using LinkWatch kernel netlink reflector...
Aug 15 02:57:08 localhost Keepalived_vrrp[14812]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]
Aug 15 02:57:09 localhost Keepalived_vrrp[14812]: VRRP_Instance(Nginx_1) Transition to MASTER STATE
Aug 15 02:57:10 localhost Keepalived_vrrp[14812]: VRRP_Instance(Nginx_1) Entering MASTER STATE
Aug 15 02:57:10 localhost Keepalived_vrrp[14812]: VRRP_Instance(Nginx_1) setting protocol VIPs.
Aug 15 02:57:10 localhost Keepalived_vrrp[14812]: Sending gratuitous ARP on enp0s3 for 192.168.56.103
Aug 15 02:57:10 localhost Keepalived_vrrp[14812]: VRRP_Instance(Nginx_1) Sending/queueing gratuitous ARPs on enp0s3 for 192.168.56.103
Aug 15 02:57:10 localhost Keepalived_vrrp[14812]: Sending gratuitous ARP on enp0s3 for 192.168.56.103
Aug 15 02:57:15 localhost Keepalived_vrrp[14812]: VRRP_Instance(Nginx_1) Sending/queueing gratuitous ARPs on enp0s3 for 192.168.56.103
Aug 15 02:57:15 localhost Keepalived_vrrp[14812]: Sending gratuitous ARP on enp0s3 for 192.168.56.103

# ip a (or ip addr show) now lists the virtual IP 192.168.56.103
[root@localhost keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:79:dc:34 brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.101/24 brd 192.168.56.255 scope global noprefixroute dynamic enp0s3
       valid_lft 999sec preferred_lft 999sec
    inet 192.168.56.103/32 scope global enp0s3
       valid_lft forever preferred_lft forever
    inet6 fe80::b1d0:38ea:6339:11f8/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

# ping shows that the virtual IP is now reachable
[root@localhost keepalived]# ping 192.168.56.103
PING 192.168.56.103 (192.168.56.103) 56(84) bytes of data.
64 bytes from 192.168.56.103: icmp_seq=1 ttl=64 time=0.082 ms
64 bytes from 192.168.56.103: icmp_seq=2 ttl=64 time=0.035 ms
64 bytes from 192.168.56.103: icmp_seq=3 ttl=64 time=0.036 ms
64 bytes from 192.168.56.103: icmp_seq=4 ttl=64 time=0.037 ms
64 bytes from 192.168.56.103: icmp_seq=5 ttl=64 time=0.072 ms
64 bytes from 192.168.56.103: icmp_seq=6 ttl=64 time=0.036 ms
Starting keepalived on the Backup
# After keepalived starts, the log shows: VRRP_Instance(Nginx_1) Entering BACKUP STATE
[root@localhost keepalived]# tail -f /var/log/messages
Aug 15 03:03:53 localhost Keepalived_healthcheckers[13529]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 15 03:03:53 localhost Keepalived[13528]: Starting VRRP child process, pid=13530
Aug 15 03:03:53 localhost Keepalived_vrrp[13530]: Registering Kernel netlink reflector
Aug 15 03:03:53 localhost Keepalived_vrrp[13530]: Registering Kernel netlink command channel
Aug 15 03:03:53 localhost Keepalived_vrrp[13530]: Registering gratuitous ARP shared channel
Aug 15 03:03:53 localhost Keepalived_vrrp[13530]: Opening file '/etc/keepalived/keepalived.conf'.
Aug 15 03:03:53 localhost Keepalived_vrrp[13530]: VRRP_Instance(Nginx_1) removing protocol VIPs.
Aug 15 03:03:53 localhost Keepalived_vrrp[13530]: Using LinkWatch kernel netlink reflector...
Aug 15 03:03:53 localhost Keepalived_vrrp[13530]: VRRP_Instance(Nginx_1) Entering BACKUP STATE
Aug 15 03:03:53 localhost Keepalived_vrrp[13530]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]

# While the master is healthy, the virtual IP lives only on the master; otherwise the backup takes it over
[root@localhost keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: enp0s3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:b3:b2:ab brd ff:ff:ff:ff:ff:ff
    inet 192.168.56.102/24 brd 192.168.56.255 scope global noprefixroute dynamic enp0s3
       valid_lft 807sec preferred_lft 807sec
    inet6 fe80::3ddc:9a9a:23b5:44b5/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever

Simulating a Master outage and recovery
# Reboot the master, or run systemctl stop keepalived, to simulate an outage
[root@localhost ~]# reboot

# Check the backup's log: the backup server enters MASTER STATE and binds the virtual IP 192.168.56.103
[root@localhost ~]# tail -f /var/log/messages
Aug 15 20:38:09 localhost Keepalived_vrrp[13605]: VRRP_Instance(Nginx_1) Transition to MASTER STATE
Aug 15 20:38:10 localhost Keepalived_vrrp[13605]: VRRP_Instance(Nginx_1) Entering MASTER STATE
Aug 15 20:38:10 localhost Keepalived_vrrp[13605]: VRRP_Instance(Nginx_1) setting protocol VIPs.
Aug 15 20:38:10 localhost Keepalived_vrrp[13605]: Sending gratuitous ARP on enp0s3 for 192.168.56.103
Aug 15 20:38:10 localhost Keepalived_vrrp[13605]: VRRP_Instance(Nginx_1) Sending/queueing gratuitous ARPs on enp0s3 for 192.168.56.103
Aug 15 20:38:10 localhost Keepalived_vrrp[13605]: Sending gratuitous ARP on enp0s3 for 192.168.56.103

# After the original MASTER recovers, the log shows that an advert with priority 100 arrived from the other server; this node returns to BACKUP and removes the protocol VIPs
[root@localhost ~]# tail -f /var/log/messages
Aug 15 20:42:37 localhost Keepalived_vrrp[13605]: VRRP_Instance(Nginx_1) Received advert with higher priority 100, ours 98
Aug 15 20:42:37 localhost Keepalived_vrrp[13605]: VRRP_Instance(Nginx_1) Entering BACKUP STATE
Aug 15 20:42:37 localhost Keepalived_vrrp[13605]: VRRP_Instance(Nginx_1) removing protocol VIPs.
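
During a test like this it is handy to ask each node directly whether it currently holds the virtual IP instead of reading the logs. A minimal sketch, assuming the address list comes from `ip -o -4 addr show`; the helper name `holds_vip` is made up for illustration.

```shell
#!/bin/sh
# Succeed when the address listing on stdin contains the VIP ($1)
# as an IPv4 address. The trailing "/" keeps 192.168.56.10 from
# matching 192.168.56.103.
holds_vip() {
    grep -qF "inet $1/"
}

# Sample line in the one-line format of `ip -o -4 addr show`,
# using the addresses from this walkthrough.
sample='2: enp0s3    inet 192.168.56.103/32 scope global enp0s3'

if printf '%s\n' "$sample" | holds_vip 192.168.56.103; then
    echo "VIP present"
else
    echo "VIP absent"
fi
```

On a real node the input would come from `ip -o -4 addr show dev enp0s3 | holds_vip 192.168.56.103`, run on the Master and the Backup in turn.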

How to change where keepalived logs are stored

[root@localhost ~]# vim /etc/sysconfig/keepalived 
# Options for keepalived. See `keepalived --help' output and keepalived(8) and
# keepalived.conf(5) man pages for a list of all options. Here are the most
# common ones :
#
# --vrrp               -P    Only run with VRRP subsystem.
# --check              -C    Only run with Health-checker subsystem.
# --dont-release-vrrp  -V    Dont remove VRRP VIPs & VROUTEs on daemon stop.
# --dont-release-ipvs  -I    Dont remove IPVS topology on daemon stop.
# --dump-conf          -d    Dump the configuration data.
# --log-detail         -D    Detailed log messages.
# --log-facility       -S    0-7 Set local syslog facility (default=LOG_DAEMON)
#

# The default is KEEPALIVED_OPTIONS="-D"
KEEPALIVED_OPTIONS="-D -d -S 0"

[root@localhost ~]# vim /etc/rsyslog.conf 
# Save boot messages also to boot.log
local7.*                                                /var/log/boot.log
  # Add this line
local0.*                                             /usr/local/nginx/logs/keepalived.log

[root@localhost ~]# systemctl restart rsyslog
[root@localhost ~]# systemctl restart keepalived
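
What ties the two files together is the facility number: `-S 0` in `KEEPALIVED_OPTIONS` selects a local syslog facility, and the `local0.*` rule in rsyslog.conf routes that facility to the new file. The sketch below spells out that mapping. The helper `syslog_facility` is hypothetical, and the rule that `-S N` corresponds to `localN` is inferred from the `-S 0` / `local0.*` pairing used above.

```shell
#!/bin/sh
# Print the rsyslog facility selected by an option string such as
# "-D -d -S 0". Without -S the help text gives LOG_DAEMON as default.
syslog_facility() {
    facility="daemon"
    prev=""
    for word in $1; do
        if [ "$prev" = "-S" ]; then
            case $word in
                [0-7]) facility="local$word" ;;
            esac
        fi
        prev=$word
    done
    echo "$facility"
}

syslog_facility "-D -d -S 0"
syslog_facility "-D"
```

The first call prints `local0` and the second prints `daemon`, so the rsyslog selector has to change whenever a different number is passed to `-S`.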

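How Keepalived + Nginx high availability works

  • The key question is how keepalived learns that nginx has gone down. One approach is a script that monitors the nginx process: if nginx is down, the script kills keepalived so that the VIPs move to another server, which keeps the service available.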
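[root@localhost shell]# vim nginx_health_check.sh
#!/bin/bash
#
# Stop keepalived when nginx is not running so that the VIP moves to the backup.

# pgrep -x matches the process name exactly. A plain "ps -ef | grep nginx" inside
# this script would also match the script itself, because its file name contains "nginx".
if ! pgrep -x nginx &> /dev/null; then
    killall keepalived
fi

# The check relies on the exit status ($? holds the exit status of the last command or the return value of the last function). When nginx is not running, ps -ef | grep nginx | grep -v grep &> /dev/null returns 1; otherwise it returns 0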
[root@localhost shell]# ps -ef | grep nginx | grep -v grep &> /dev/null
[root@localhost shell]# echo $?
1
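
The decision the health check makes can be separated from the commands it runs, which makes it possible to try the logic without touching nginx or keepalived. A minimal sketch with a hypothetical helper `failover_if_down`, where the check and the action are passed in as commands:

```shell
#!/bin/sh
# Run CHECK ($1); when it fails, run ACTION ($2).
# Both arguments are plain command strings and are split on spaces.
failover_if_down() {
    check_cmd=$1
    action_cmd=$2
    if ! $check_cmd > /dev/null 2>&1; then
        $action_cmd
    fi
}

# Dry run with stand-in commands: `false` plays a failed nginx check
# and `echo` plays the failover action.
failover_if_down false "echo would-stop-keepalived"

# A passing check (`true`) triggers nothing.
failover_if_down true "echo would-stop-keepalived"
```

Only the first call prints `would-stop-keepalived`. In a real deployment the call would look like `failover_if_down "pgrep -x nginx" "killall keepalived"`, run periodically. Keepalived's own `vrrp_script` and `track_script` blocks are one way to schedule such a check.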