LB:(lvs/nginx (http/upstream,stream/upstream))
HP
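HA Cluster (high availability cluster): a cluster is a group of computers that, as a whole, provide a set of network resources to users. Each individual computer is a node of the cluster. The main job of HA cluster software is to automate fault detection and service switchover. An HA cluster with only two nodes is also called a two-node hot standby, in which two servers back each other up. When one server fails, the other takes over its services, so the system keeps serving users without manual intervention. Two-node hot standby is only one kind of HA cluster: an HA cluster can have more than two nodes and offers more, and more advanced, features than hot standby, which makes it easier to keep up with changing requirements.
Resource: a "component" that makes up an HA cluster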
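HA = MTTF / (MTTF + MTTR) * 100%
* MTTF: mean time to failure (average failure-free running time);
* MTTR: mean time to repair;
* SPOF: Single Point of Failure
Common HA targets:
* 99%: no more than 4 days of downtime per year
* 99.9%: no more than 10 hours of downtime per year
* 99.99%: no more than 1 hour of downtime per year
* 99.999%: no more than 6 minutes of downtime per year
**One way to improve availability is to reduce MTTR, that is, to use redundancy**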
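The downtime figures above follow directly from the availability formula. A minimal sketch that recomputes them, assuming a 365-day year; the function names are illustrative only:

```python
# Yearly downtime budget implied by an availability target.
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def availability(mttf: float, mttr: float) -> float:
    """HA = MTTF / (MTTF + MTTR), returned as a fraction between 0 and 1."""
    return mttf / (mttf + mttr)


def downtime_hours_per_year(avail: float) -> float:
    """Hours of downtime per year allowed by the given availability fraction."""
    return (1.0 - avail) * HOURS_PER_YEAR


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999, 0.99999):
        hours = downtime_hours_per_year(target)
        print(f"{target * 100:g}% -> {hours:.2f} h/year ({hours * 60:.1f} min)")
```

The exact values come out slightly below the rounded limits in the list, for example about 3.65 days for 99%.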
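Hardware failures: design defects, wear-out, natural disasters, ...
Software failures: design defects, ...
Dual-master (active/active): active <---> heartbeat <---> active
Master/backup (active/passive): active ----> heartbeat ----> passive
High availability means high availability of the "service". HA Nginx service: VIP / Nginx process [/ shared storage]
When building a cluster, an odd number of nodes is generally used;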
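Fencing (isolation) devices:
* Node fencing: STONITH (Shoot The Other Node In The Head), which cuts the power of the failed device (power cycle);
* Resource fencing: fence, which closes the network port through which the failed device reaches the storage;
quorum:
* with quorum: holds more than 1/2 of the votes;
* without quorum: holds 1/2 of the votes or fewer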
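The quorum rule also explains why an odd number of nodes is preferred. A small sketch, assuming one vote per node; `has_quorum` is a hypothetical helper, not part of any cluster stack:

```python
def has_quorum(votes_held: int, total_votes: int) -> bool:
    """A partition has quorum only with strictly more than half of all votes."""
    if total_votes <= 0 or not 0 <= votes_held <= total_votes:
        raise ValueError("invalid vote counts")
    return 2 * votes_held > total_votes


if __name__ == "__main__":
    # 3 nodes split 2/1: exactly one side keeps quorum.
    print(has_quorum(2, 3), has_quorum(1, 3))
    # 4 nodes split 2/2: neither side has quorum, so the cluster stalls.
    print(has_quorum(2, 4), has_quorum(2, 4))
```

With an even node count, a symmetric network split can leave every partition without quorum, which an odd count avoids.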
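failover: when a resource fails, the operation of moving it to another node; failback: after the failed primary node of a resource has been repaired and is back online, the process of switching the resource back from the node it was moved to;
HA implementations: keepalived (an implementation of the VRRP protocol); AIS (full-featured HA clusters): RHCS (cman), heartbeat, corosync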
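High availability based on the VRRP protocol
VRRP: Virtual Router Redundancy Protocol
Terms:
* Virtual router: Virtual Router
* Virtual router identifier: VRID (1-255)
* Physical routers:
  master: the active device; it sends heartbeat advertisements that tell the backup nodes its working state;
  backup: the standby device;
  priority: decides which device becomes master (higher wins)
* VIP: Virtual IP
* VMAC: Virtual MAC (00-00-5E-00-01-{VRID})
* Advertisement: heartbeat, priority and so on; sent periodically;
* Preemptive and non-preemptive modes;
* Security, authentication (3 modes): no authentication; simple text authentication; MD5
* Working models: master/master, master/backup
keepalived
A software implementation of the VRRP protocol, originally designed to make ipvs services highly available:
* uses VRRP to float the address between nodes;
* generates ipvs rules on the node that holds the VIP (predefined in the configuration file);
* health-checks each RS of the ipvs cluster;
* provides a script-calling interface: executing a script performs whatever the script defines and thereby influences cluster behavior;
Core components of keepalived:
* vrrp stack
* ipvs wrapper
* checkers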
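The VMAC format in the terms above is fixed apart from its last byte, which is the VRID in hexadecimal. A minimal sketch, assuming the IPv4 prefix 00-00-5E-00-01 given in the text; `vrrp_vmac` is an illustrative name:

```python
def vrrp_vmac(vrid: int) -> str:
    """Build the IPv4 VRRP virtual MAC address for a given VRID."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in 1..255")
    # The last byte of the virtual MAC is the VRID written as two hex digits.
    return f"00-00-5E-00-01-{vrid:02X}"


if __name__ == "__main__":
    for vrid in (14, 15, 88):
        print(vrid, vrrp_vmac(vrid))
```

This is only the address arithmetic; whether a node actually answers with the virtual MAC depends on how the VRRP implementation is configured.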
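Preparation before configuring an HA cluster:
* Time must be synchronized on all nodes (ntp, chrony);
* Make sure iptables and SELinux do not get in the way;
* Nodes can reach each other by hostname (not required for keepalived); /etc/hosts is generally used for name resolution;
* The root users of the nodes can communicate with each other over ssh with key-based authentication (not required)
Disable SELinux and iptables, then check their state: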
iptables -vnL
getenforce
Program environment
Main configuration file: /etc/keepalived/keepalived.conf
Main program: /usr/sbin/keepalived
Unit file: keepalived.service
Environment file of the unit file: /etc/sysconfig/keepalived
Configuration file layout:
THE HIERARCHY
GLOBAL CONFIGURATION
  global definitions
  static routes/addresses
VRRPD CONFIGURATION
  vrrp synchronization groups: VRRP sync groups
  vrrp instances: each vrrp instance is one VRRP router;
LVS CONFIGURATION
  virtual server groups
  virtual servers: the vs and rs of the ipvs cluster
Single-master configuration example
node1
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost  # sender address of notification mail
smtp_server 127.0.0.1  # mail server address
smtp_connect_timeout 30  # mail server connection timeout
router_id node1  # identifier of this node (node 1)
vrrp_mcast_group4 224.0.0.80  # IPv4 multicast address for advertisements
}
vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 88
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass fajinide
}
virtual_ipaddress {
172.16.80.80/16 dev eno16777736 label eno16777736:0
}
track_interface {
eno16777736
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}
node2
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost  # sender address of notification mail
smtp_server 127.0.0.1  # mail server address
smtp_connect_timeout 30  # mail server connection timeout
router_id node2  # identifier of this node (node 2)
vrrp_mcast_group4 224.0.0.80  # IPv4 multicast address for advertisements
}
vrrp_instance VI_1 {
state BACKUP
interface eno16777736
virtual_router_id 88
priority 98
advert_int 1  # interval between advertisements
authentication {
auth_type PASS
auth_pass fajinide
}
virtual_ipaddress {
# virtual IP address
172.16.80.80/16 dev eno16777736 label eno16777736:0
}
track_interface {
eno16777736
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}
Configuration syntax:
Configuring a virtual router:
vrrp_instance <STRING> {
....
}
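Instance-specific parameters:
state MASTER|BACKUP: initial state of this node in the virtual router; only one node may be MASTER, all the others should be BACKUP;
interface IFACE_NAME: physical interface bound to this virtual router;
virtual_router_id VRID: unique identifier of this virtual router, range 1-255;
priority 100: priority of this host in the virtual router, range 1-254;
advert_int 1: interval between VRRP advertisements;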
authentication {
auth_type AH|PASS
auth_pass <PASSWORD>
}
virtual_ipaddress {
<IPADDR>/<MASK> brd <IPADDR> dev <STRING> scope <SCOPE> label <LABEL>
192.168.200.17/24 dev eth1
192.168.200.18/24 dev eth2 label eth2:1
}
track_interface {
eth0
eth1
...
}
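The network interfaces to monitor; if one of them fails, the node switches to the FAULT state;
nopreempt: run in non-preemptive mode (the initial state must then be BACKUP);
preempt_delay 300: in preemptive mode, how long a node waits after coming online before it triggers a new election;
Notification scripts:
notify_master <STRING>|<QUOTED-STRING>: script triggered when this node becomes the master;
notify_backup <STRING>|<QUOTED-STRING>: script triggered when this node becomes a backup;
notify_fault <STRING>|<QUOTED-STRING>: script triggered when this node enters the "fault" state;
notify <STRING>|<QUOTED-STRING>: generic notification hook; one script can handle the notifications for all three state transitions above;
Example notification script: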
#!/bin/bash
# Mail a notification whenever this node changes its VRRP state.
contact='root@localhost'
notify() {
    # $1 is the new state: master, backup or fault
    local mailsubject="$(hostname) to be $1, vip floating"
    local mailbody="$(date +'%F %T'): vrrp transition, $(hostname) changed to be $1"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case $1 in
master)
    notify master
    ;;
backup)
    notify backup
    ;;
fault)
    notify fault
    ;;
*)
    echo "Usage: $(basename "$0") {master|backup|fault}"
    exit 1
    ;;
esac
server1 configuration (dual-master model):
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.19
}
vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 14
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97b2
}
virtual_ipaddress {
10.1.0.91/16 dev eno16777736
}
}
vrrp_instance VI_2 {
state BACKUP
interface eno16777736
virtual_router_id 15
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass ab8f07b2
}
virtual_ipaddress {
10.1.0.92/16 dev eno16777736
}
}
server2 configuration:
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node2
vrrp_mcast_group4 224.0.100.19
}
vrrp_instance VI_1 {
state BACKUP
interface eno16777736
virtual_router_id 14
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97b2
}
virtual_ipaddress {
10.1.0.91/16 dev eno16777736
}
}
vrrp_instance VI_2 {
state MASTER
interface eno16777736
virtual_router_id 15
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass ab8f07b2
}
virtual_ipaddress {
10.1.0.92/16 dev eno16777736
}
}
Note: compared with server1, the state and priority values of each instance must be swapped on server2.
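To see why this layout gives each server one VIP in normal operation and both VIPs after a failure, the election can be sketched as follows. The priorities are copied from the two configurations above; `elect_master` and `vip_owners` are hypothetical helpers, and the sketch ignores VRRP tie-breaking because no two priorities in an instance are equal here:

```python
from typing import Dict, Optional, Set

# Priorities copied from the server1 and server2 configurations.
PRIORITIES: Dict[str, Dict[str, int]] = {
    "VI_1": {"server1": 100, "server2": 98},   # VIP 10.1.0.91
    "VI_2": {"server1": 98, "server2": 100},   # VIP 10.1.0.92
}


def elect_master(priorities: Dict[str, int], alive: Set[str]) -> Optional[str]:
    """Return the live node with the highest priority, or None if none is alive."""
    candidates = {node: prio for node, prio in priorities.items() if node in alive}
    if not candidates:
        return None
    return max(candidates, key=lambda node: candidates[node])


def vip_owners(alive: Set[str]) -> Dict[str, Optional[str]]:
    """Map each VRRP instance to the node that would hold its VIP."""
    return {name: elect_master(prio, alive) for name, prio in PRIORITIES.items()}


if __name__ == "__main__":
    print(vip_owners({"server1", "server2"}))  # both nodes up
    print(vip_owners({"server2"}))             # server1 has failed
```

Each instance is elected independently, which is what lets the two nodes act as master for different VIPs at the same time.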
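Virtual server configuration parameters:
virtual_server IP port | virtual_server fwmark int {
...
real_server {
...
}
...
}
Common parameters:
Shared settings:
delay_loop <INT>: interval between service polls;
lb_algo rr|wrr|lc|wlc|lblc|sh|dh: scheduling method;
lb_kind NAT|DR|TUN: cluster type;
persistence_timeout <INT>: persistent connection duration;
protocol TCP: service protocol; the version covered here supports only TCP;
sorry_server <IPADDR> <PORT>: address of the fallback server;
real_server settings:
real_server <IPADDR> <PORT> {
weight <INT>  # weight
notify_up <STRING>|<QUOTED-STRING>  # notification script run when the RS comes online
notify_down <STRING>|<QUOTED-STRING>  # notification script run when the RS goes offline
HTTP_GET|SSL_GET|TCP_CHECK|SMTP_CHECK|MISC_CHECK {...}  # health-check method for this host
}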
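HTTP_GET|SSL_GET: application-layer checks
HTTP_GET|SSL_GET {
url {
path <URL_PATH>: the URL to monitor;
status_code <INT>: response code that counts as healthy;
digest <STRING>: checksum of the response content that counts as healthy;
**Only one of status_code and digest is needed**
}
nb_get_retry <INT>: number of retries; usually should be set
delay_before_retry <INT>: delay before each retry; usually should be set
connect_ip <IP ADDRESS>: which IP address of this RS receives the health-check request
connect_port <PORT>: which port of this RS receives the health-check request
bindto <IP ADDRESS>: source address used when sending the health-check request;
bind_port <PORT>: source port used when sending the health-check request;
connect_timeout <INTEGER>: timeout of the connection request; usually should be set
}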
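How status_code, nb_get_retry and delay_before_retry interact can be sketched like this. It is a simplified model, not keepalived's implementation: it assumes nb_get_retry counts the extra attempts made after the first failure, and `fetch_status` is a hypothetical callable standing in for the actual HTTP request (it returns a status code, or None on connection failure or timeout):

```python
import time
from typing import Callable, Optional


def http_get_check(fetch_status: Callable[[], Optional[int]],
                   expected_status: int = 200,
                   nb_get_retry: int = 3,
                   delay_before_retry: float = 1.0) -> bool:
    """Return True as soon as one attempt yields the expected status code."""
    attempts = 1 + nb_get_retry  # assumption: first try plus the retries
    for attempt in range(attempts):
        if fetch_status() == expected_status:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_before_retry)  # wait before the next retry
    return False


if __name__ == "__main__":
    # A fake RS that fails twice and then answers 200.
    answers = iter([None, 503, 200])
    print(http_get_check(lambda: next(answers), delay_before_retry=0))
```

A real server that fails every attempt would be reported as down only after all retries are used up, which is why the retry and delay settings matter for failover time.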
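TCP_CHECK {
connect_ip <IP ADDRESS>: which IP address of this RS receives the health-check request
connect_port <PORT>: which port of this RS receives the health-check request
bindto <IP ADDRESS>: source address used when sending the health-check request;
bind_port <PORT>: source port used when sending the health-check request;
connect_timeout <INTEGER>: timeout of the connection request;
}
Example: single-master ipvs cluster (lb_kind DR)
! Configuration File for keepalived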
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.19
}
vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 14
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97b2
}
virtual_ipaddress {
10.1.0.93/16 dev eno16777736
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}
virtual_server 10.1.0.93 80 {
delay_loop 3
lb_algo rr
lb_kind DR
protocol TCP
sorry_server 127.0.0.1 80
real_server 10.1.0.69 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 1
nb_get_retry 3
delay_before_retry 1
}
}
real_server 10.1.0.71 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 1
nb_get_retry 3
delay_before_retry 1
}
}
}
! Configuration File for keepalived
global_defs {
notification_email {
root@localhost
}
notification_email_from kaadmin@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.67
}
vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 44
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass f1bf7fde
}
virtual_ipaddress {
172.16.0.80/16 dev eno16777736 label eno16777736:0
}
track_interface {
eno16777736
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}
vrrp_instance VI_2 {
state BACKUP
interface eno16777736
virtual_router_id 45
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass f2bf7ade
}
virtual_ipaddress {
172.16.0.90/16 dev eno16777736 label eno16777736:1
}
track_interface {
eno16777736
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}
virtual_server fwmark 3 {
delay_loop 2
lb_algo rr
lb_kind DR
nat_mask 255.255.0.0
protocol TCP
sorry_server 127.0.0.1 80
real_server 172.16.0.69 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 2
nb_get_retry 3
delay_before_retry 3
}
}
real_server 172.16.0.6 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 2
nb_get_retry 3
delay_before_retry 3
}
}
}
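keepalived can call external helper scripts to monitor resources and adjust the priority dynamically according to the result;
Two steps: (1) define the script in a vrrp_script block; (2) call the script from track_script inside the vrrp_instance;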
vrrp_script <SCRIPT_NAME> {
script ""
interval INT
weight -INT
}
track_script {
SCRIPT_NAME_1
SCRIPT_NAME_2
...
}
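The effect of weight on failover can be worked through with a little arithmetic. The sketch below models the usual rule (a negative weight is subtracted while the script fails, a positive weight is added while it succeeds) and clamps the result to 1..254, which is an assumption about the implementation. The script names and the priorities 100 and 98 are taken from the examples in this article; `effective_priority` is a hypothetical helper:

```python
from typing import Dict, Set


def effective_priority(base: int, weights: Dict[str, int], failed: Set[str]) -> int:
    """Base priority adjusted by the weights of the tracked scripts."""
    prio = base
    for name, weight in weights.items():
        if weight < 0 and name in failed:
            prio += weight        # failing script lowers the priority
        elif weight > 0 and name not in failed:
            prio += weight        # succeeding script raises the priority
    return max(1, min(254, prio))  # assumption: clamped to the valid range


if __name__ == "__main__":
    weights = {"chk_down": -5, "chk_nginx": -5}
    master = effective_priority(100, weights, failed={"chk_nginx"})
    backup = effective_priority(98, weights, failed=set())
    print(master, backup, "backup takes over" if backup > master else "master stays")
```

The weight must be larger than the priority gap between master and backup (here 5 against a gap of 2), otherwise a failed check never causes a switchover.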
Example: highly available nginx service
! Configuration File for keepalived
global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.19
}
vrrp_script chk_down {
script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"
interval 1
weight -5
}
vrrp_script chk_nginx {
script "killall -0 nginx && exit 0 || exit 1"
interval 1
weight -5
}
vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 14
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97b2
}
virtual_ipaddress {
10.1.0.93/16 dev eno16777736
}
track_script {
chk_down
chk_nginx
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}
This article comes from the "guo_ruilin" blog; please keep this source link: http://guoruilin198.blog.51cto.com/12567311/1898629