I. DRBD Basics
1. Storage
DAS: Direct Attached Storage; a block device.
With DAS, the storage architecture is the same as in an ordinary PC: the external storage device hangs directly off the server's internal bus and is part of the server itself. It means attaching a storage device directly to a single computer through an IDE, USB, SATA, SCSI, or SAS interface, or over Fibre Channel.
NAS: Network Attached Storage
A NAS box is essentially a file server, working at the file-system level. What sets NAS apart from a traditional file server or direct-attached storage is that the operating system and software on a NAS device provide only data storage, data access, and the related management functions.
NAS speaks file-level protocols, such as NFS (common on UNIX systems) or SMB (common on Windows systems).
SAN: Storage Area Network
A SAN carries the SCSI protocol over some other network transport. In 1991, IBM introduced ESCON (Enterprise System Connection) with its S/390 servers: a fiber-optic connection between servers and storage with a maximum transfer rate of 17 MB/s. Building on it, IBM later shipped the more capable ESCON Director (an FC switch), creating the earliest SAN systems.
A SAN uses block-level protocols, usually SCSI carried over Fibre Channel or iSCSI. (Various other SAN protocols exist, such as ATA over Ethernet and HyperSCSI, but they are uncommon.)
Cluster file systems (support only a limited number of nodes, at most 16):
GFS2, OCFS2, cLVM2
2. drbd
drbd: a block-device mirroring system that spans hosts.
It mirrors data over the network and works inside the kernel.
Userspace management tools: drbdadm, drbdsetup, drbdmeta
Working characteristics: real-time, transparent, synchronous or asynchronous.
Each drbd device is defined by a "drbd resource":
Name: any ASCII characters except whitespace;
drbd device: /dev/drbd#
major number: 147
minor numbers: assigned from 0 upward
Disk configuration: the disk or partition on each host that backs this drbd device;
Network configuration: the communication properties used for data replication;
Replication modes:
Three protocols: Protocol A, B, C
A: Async (asynchronous)
A write is reported complete once the data has reached the local NIC's send queue.
B: Semi-Sync (semi-synchronous)
A write is reported complete once the data has reached the peer's NIC receive queue.
C: Sync (synchronous)
A write is reported complete only once the data has reached the peer's disk.
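The protocol is selected in the `net` section of the drbd configuration, as the case study below does with `protocol C;`. A minimal sketch of just that section, with the other two choices shown commented out:

```
net {
        protocol C;     # synchronous: a write completes only when it has reached the peer's disk
        # protocol B;   # semi-sync: completes once it has reached the peer's receive queue
        # protocol A;   # async: completes once it has reached the local send queue
}
```

Protocol C trades write latency for the guarantee that a surviving peer always has the data; A and B progressively relax that guarantee for speed.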
drbd working models:
master/slave: primary/secondary
dual master: dual primary; requires an HA cluster and a cluster file system.
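The setup in this post stays single-primary, but for reference, dual-primary mode has to be enabled explicitly in the `net` section; a hedged sketch of the relevant options:

```
net {
        protocol C;               # dual-primary requires the fully synchronous protocol
        allow-two-primaries yes;  # permit both nodes to hold the primary role at once
}
```

Even with this set, concurrent mounting is only safe on top of a cluster file system such as GFS2 or OCFS2, as noted above.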
II. Case Study: Implementing DRBD Storage
drbd: not extremely reliable; data loss is possible.
drbd consists of two parts:
Userspace tools: only loosely tied to the kernel version; any build that works on CentOS 6 and the matching hardware platform will do.
Kernel module: must match the running kernel version exactly. The drbd kernel module was merged into mainline Linux in 2.6.33, so if your kernel is newer than that you only need to install the management tools;
otherwise, you must install both the kernel-module package and the management tools, and the version numbers of the two packages must correspond.
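The version rule above can be checked with a small shell sketch (the 2.6.33 threshold comes from the text; `version_ge` is a helper defined here, not a standard tool):

```shell
#!/bin/sh
# version_ge A B: succeeds when version A >= version B (GNU sort -V does the ordering)
version_ge() {
    [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

kernel=$(uname -r | cut -d- -f1)   # e.g. 2.6.32 on CentOS 6
if version_ge "$kernel" 2.6.33; then
    echo "drbd module is in the mainline kernel: install the userspace tools only"
else
    echo "older kernel: also install a drbd-kmdl package matching $(uname -r)"
fi
```

On the CentOS 6 hosts used below (kernel 2.6.32-431.el6), this takes the second branch, which is why the matching drbd-kmdl package is installed in step 3.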
1. Prerequisites: time synchronization, hostname-based access, and mutual SSH trust between the two nodes
See the earlier posts for the detailed steps.
2. Prepare equal-sized disk partitions on the two nodes. The partitions do not need to be formatted; creating them and getting the kernel to recognize them is enough.
[root@BAIYU_180 ~]# fdisk /dev/sda

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): n
First cylinder (3982-5222, default 3982):
Using default value 3982
Command (m for help): n
First cylinder (4244-5222, default 4244):
Using default value 4244
Last cylinder, +cylinders or +size{K,M,G} (4244-5222, default 5222): +1G
Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
[root@BAIYU_180 ~]# partx -a /dev/sda
BLKPG: Device or resource busy
error adding partition 1
BLKPG: Device or resource busy
error adding partition 2
BLKPG: Device or resource busy
error adding partition 3
BLKPG: Device or resource busy
error adding partition 4
BLKPG: Device or resource busy
error adding partition 5
BLKPG: Device or resource busy
error adding partition 6
[root@BAIYU_180 ~]# partx -a /dev/sda
BLKPG: Device or resource busy
error adding partition 1
BLKPG: Device or resource busy
error adding partition 2
BLKPG: Device or resource busy
error adding partition 3
BLKPG: Device or resource busy
error adding partition 4
BLKPG: Device or resource busy
error adding partition 5
BLKPG: Device or resource busy
error adding partition 6
BLKPG: Device or resource busy
error adding partition 7
BLKPG: Device or resource busy
error adding partition 8
[root@BAIYU_180 ~]# fdisk -l

Disk /dev/sda: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000b52c1

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          66      524288   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              66        1371    10485760   83  Linux
/dev/sda3            1371        2677    10485760   83  Linux
/dev/sda4            2677        5222    20446208    5  Extended
/dev/sda5            2677        3982    10485760   83  Linux
/dev/sda6            3983        4244     2097152   82  Linux swap / Solaris
/dev/sda7            3982        3982         743+  83  Linux
/dev/sda8            4244        4375     1059340   83  Linux
3. Install drbd

[root@node1 ~]# uname -r                   # check the kernel version
2.6.32-431.el6.x86_64
[root@BAIYU_179 ~]# rpm -q centos-release
centos-release-6-5.el6.centos.11.1.x86_64

Download the matching packages:
drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm   # drbd kernel module
drbd-8.4.3-33.el6.x86_64.rpm                       # drbd userspace tools

Install drbd:
[root@BAIYU_179 ~]# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm
warning: drbd-8.4.3-33.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 66534c2b: NOKEY
error: Failed dependencies:
        drbd-kmdl-8.4.3-33.el6 is needed by drbd-8.4.3-33.el6.x86_64
[root@BAIYU_179 ~]# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm
warning: drbd-8.4.3-33.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 66534c2b: NOKEY
Preparing...                ########################################### [100%]
   1:drbd-kmdl-2.6.32-431.el########################################### [ 50%]
   2:drbd                   ########################################### [100%]
4. Configure drbd
Configuration files:
/etc/drbd.conf: main configuration file
/etc/drbd.d/global_common.conf: global settings, plus settings shared by all drbd devices
/etc/drbd.d/*.res: resource definitions
global: global properties that define drbd's own behavior;
common: properties shared by multiple drbd devices;
*.res: per-resource configuration
1) Configure /etc/drbd.d/global_common.conf
[root@BAIYU_180 ~]# vi /etc/drbd.d/global_common.conf
global {                  # global settings
        usage-count no;   # defaults to yes: if this host can reach the Internet, drbd reports installation statistics upstream; set to no to opt out
        # minor-count dialog-refresh disable-ip-verification
}
common {
        handlers {        # event handlers
                # These are EXAMPLE handlers only.
                # They may have severe implications,
                # like hard resetting the node under certain circumstances.
                # Be careful when choosing your poison.

                # pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
                #       what to do when, after a split brain, this node has lost the primary role
                # local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
                #       what to do when a local node suffers an I/O error
                # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
                # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
                # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
                # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
                # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
        }
        startup {         # what a starting node should do about its peer
                # wfc-timeout           (how long to wait for the peer to come online)
                # degr-wfc-timeout      (timeout before degrading)
                # outdated-wfc-timeout  (timeout when the peer's data is outdated)
                # wait-after-sb         (how long to wait after a split brain)
        }
        options {         # synchronization options
                # cpu-mask on-no-data-accessible
        }
        disk {
                on-io-error detach;   # on an I/O error, detach this node's backing device
                # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
                # disk-drain md-flushes resync-rate resync-after al-extents
                # c-plan-ahead c-delay-target c-fill-target c-max-rate
                # c-min-rate disk-timeout
        }
        net {
                protocol C;                     # replication protocol
                cram-hmac-alg "sha1";           # algorithm used to authenticate the peers during replication
                shared-secret "www.magedu.com"; # shared secret
                # protocol timeout max-epoch-size max-buffers unplug-watermark
                # connect-int ping-int sndbuf-size rcvbuf-size ko-count
                # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
                # after-sb-1pri after-sb-2pri always-asbp rr-conflict
                # ping-timeout data-integrity-alg tcp-cork on-congestion
                # congestion-fill congestion-extents csums-alg verify-alg
                # use-rle
        }
        syncer {
                rate 1000M;   # maximum synchronization rate
        }
}
2) Define the resource: the resource name, the drbd device, the backing disk, and the network properties; these four items are the core.

# cd /etc/drbd.d/
# vim mystore.res
resource mystore {                # define a resource with the keyword "resource"
        device    /dev/drbd0;     # the device name this drbd resource appears under
        disk      /dev/sda8;      # the backing disk device to use
        meta-disk internal;       # keep drbd metadata on the backing partition itself; it could also live on an external disk
        on BAIYU_179 {            # "on" names a node and must match its uname -n; define one block per node
                address 192.168.100.179:7789;   # the socket this node listens on; the default port is 7789
        }
        on BAIYU_180 {
                address 192.168.100.180:7789;
        }
}
Copy the files to the other node, so that both nodes have identical configuration:

[root@BAIYU_179 drbd.d]# ls
global_common.conf  global_common.conf.orig  mystore.res  mystore.res.orig
[root@BAIYU_179 drbd.d]# scp global_common.conf mystore.res BAIYU_180:/etc/drbd.d
root@baiyu_180's password:
global_common.conf                          100% 1995     2.0KB/s   00:00
mystore.res                                 100%  675     0.7KB/s   00:00
Initialize the defined resource on each node and start the service:

[root@BAIYU_179 drbd.d]# drbdadm create-md mystore
Writing meta data...
initializing activity log
NOT initializing bitmap
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
New drbd meta data block successfully created.
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
[root@BAIYU_179 lib]# service drbd start
Starting DRBD resources: [
     create res: mystore
   prepare disk: mystore
    adjust disk: mystore
     adjust net: mystore
]
..........
***************************************************************
 DRBD's startup script waits for the peer node(s) to appear.
 - In case this node was already a degraded cluster before the
   reboot the timeout is 0 seconds. [degr-wfc-timeout]
 - If the peer was available before the reboot the timeout will
   expire after 0 seconds. [wfc-timeout]
   (These values are for resource 'mystore'; 0 sec -> wait forever)
 To abort waiting enter 'yes' [ -- ]:[  43]:
 To abort waiting enter 'yes' [ -- ]:[ 215]:   # waiting for the other node; startup only completes after the peer has started too
.
[root@BAIYU_179 lib]# cat /proc/drbd           # check drbd status
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1059268
[root@BAIYU_179 lib]# date
Tue Oct 27 18:53:38 CST 2015
[root@BAIYU_179 lib]# drbd-overview            # check drbd status
  0:mystore/0  Connected Secondary/Secondary Inconsistent/Inconsistent C r-----
The output above shows Secondary/Secondary and Inconsistent/Inconsistent: both nodes are secondary and the data has not been synchronized. One node must be promoted to primary before synchronization starts:

[root@BAIYU_179 lib]# drbdadm primary --force mystore   # promote this node to primary
[root@BAIYU_179 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

This node is now primary and the data is in sync.
Note: only whichever node is primary can mount and use the device; a non-primary node cannot even mount it.
Format and mount drbd0:

[root@BAIYU_179 ~]# mke2fs -t ext4 /dev/drbd0
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
66240 inodes, 264817 blocks
13240 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=272629760
9 block groups
32768 blocks per group, 32768 fragments per group
7360 inodes per group
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376

Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 38 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root@BAIYU_179 ~]# mount /dev/drbd0 /mnt
[root@BAIYU_179 ~]# ls /mnt
lost+found
[root@BAIYU_179 ~]# cp /etc/inittab /mnt
[root@BAIYU_179 ~]# umount /dev/drbd0
[root@BAIYU_179 ~]# drbdadm secondary mystore   # demote to secondary
[root@BAIYU_179 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:49880 nr:0 dw:49880 dr:1377 al:14 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@BAIYU_179 ~]# ssh BAIYU_180               # log in to the other node
Last login: Tue Oct 27 18:03:41 2015 from 192.168.100.88
You have entered the system: 168.100.180
your username is: root
WARNING:Proceed with caution!
Have any questions please contact the system administrator
[root@BAIYU_180 ~]# drbd-overview
  0:mystore/0  Connected Secondary/Secondary UpToDate/UpToDate C r-----
[root@BAIYU_180 ~]# drbdadm primary mystore     # promote to primary
[root@BAIYU_180 ~]# drbd-overview
  0:mystore/0  Connected Primary/Secondary UpToDate/UpToDate C r-----
[root@BAIYU_180 ~]# mount /dev/drbd0 /mnt
[root@BAIYU_180 ~]# ls /mnt                     # the files on drbd0 are still there
inittab  lost+found
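Distilled from the transcript above, a manual role switch between the two nodes boils down to this sequence (run as root on the respective nodes; resource and mount point as in this setup):

```
# On the current primary (here BAIYU_179): release the device, then demote
umount /dev/drbd0
drbdadm secondary mystore

# On the other node (here BAIYU_180): promote, then mount
drbdadm primary mystore
mount /dev/drbd0 /mnt
```

The order matters: drbd refuses to demote a node while the device is mounted, and (in single-primary mode) refuses to promote a node while the peer is still primary.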
And with that, the drbd configuration completed without a hitch.
5. Add drbd as an HA resource
Automatic drbd role switching requires corosync + pacemaker, so next we install and configure corosync and pacemaker.
1) So that the HA configuration goes smoothly, first demote both nodes to secondary, stop the service, and disable it at boot:

[root@BAIYU_179 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:49880 nr:24 dw:49904 dr:1377 al:14 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
[root@BAIYU_179 ~]# ssh BAIYU_180
Last login: Tue Oct 27 21:49:46 2015 from 192.168.100.179
You have entered the system: 168.100.180
your username is: root
WARNING:Proceed with caution!
Have any questions please contact the system administrator
[root@BAIYU_180 ~]# umount /mnt
[root@BAIYU_180 ~]# drbdadm secondary mystore
[root@BAIYU_180 ~]# service drbd stop
Stopping all DRBD resources: .
[root@BAIYU_180 ~]# chkconfig drbd off
[root@BAIYU_180 ~]# exit
logout
Connection to BAIYU_180 closed.
[root@BAIYU_179 ~]# service drbd stop
Stopping all DRBD resources: .
[root@BAIYU_179 ~]# chkconfig drbd off
2) Install, configure, and start corosync + pacemaker + crmsh
See the previous post for the detailed steps.

[root@BAIYU_179 yum.repos.d]# yum install corosync pacemaker -y
[root@BAIYU_179 corosync]# service corosync start; ssh BAIYU_180 service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@BAIYU_179 corosync]# service pacemaker start; ssh BAIYU_180 service pacemaker start
Starting Pacemaker Cluster Manager                         [  OK  ]
Starting Pacemaker Cluster Manager                         [  OK  ]
[root@BAIYU_179 corosync]# crm
crmadmin       crm_error      crm_mon        crm_resource   crm_standby
crm_attribute  crm_failcount  crm_node       crm_shadow     crm_ticket
crm_diff       crm_master     crm_report     crm_simulate   crm_verify
[root@BAIYU_179 corosync]# crm_mon
Attempting connection to the cluster...
Last updated: Tue Oct 27 22:20:21 2015
Last change: Tue Oct 27 22:20:16 2015
Stack: classic openais (with plugin)
Current DC: BAIYU_179 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
0 Resources configured

Online: [ BAIYU_179 BAIYU_180 ]

Install crmsh, then:
[root@BAIYU_180 corosync]# crm
crm(live)# status
Last updated: Tue Oct 27 22:35:54 2015
Last change: Tue Oct 27 22:20:16 2015
Stack: classic openais (with plugin)
Current DC: BAIYU_179 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
0 Resources configured

Online: [ BAIYU_179 BAIYU_180 ]
crm(live)configure# property stonith-enabled=false
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# show
node BAIYU_179
node BAIYU_180
property cib-bootstrap-options: \
        dc-version=1.1.11-97629de \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes=2 \
        stonith-enabled=false \
        no-quorum-policy=ignore
crm(live)configure# verify
crm(live)configure# commit
3) Add drbd to the HA cluster

[root@BAIYU_179 ~]# crm
crm(live)# ra
crm(live)ra# classes
lsb
ocf / heartbeat linbit pacemaker
service
stonith
crm(live)ra# list ocf
CTDB             ClusterMon       Delay            Dummy            Filesystem
HealthCPU        HealthSMART      IPaddr           IPaddr2          IPsrcaddr
LVM              MailTo           Route            SendArp          Squid
Stateful         SysInfo          SystemHealth     VirtualDomain    Xinetd
apache           conntrackd       controld         db2              dhcpd
drbd             ethmonitor       exportfs         iSCSILogicalUnit mysql
named            nfsnotify        nfsserver        nginx            pgsql
ping             pingd            postfix          remote           rsyncd
symlink          tomcat
crm(live)ra# list ocf linbit
drbd
crm(live)ra# info ocf:linbit:drbd          # view the agent's default settings and parameters
Note: a clone resource is first of all a primitive resource.
Clone-specific properties when defining clone resources in pacemaker:
clone-max: the maximum number of clone copies;
clone-node-max: the maximum number of copies run on a single node;
notify: whether to notify the other copies when one copy starts or stops;
master-max: the maximum number of copies promoted to master;
master-node-max: the maximum number of master copies on a single node.
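The post stops before actually defining the cluster resource, so here is a hedged sketch of what the crm configuration for this drbd resource could look like. The resource IDs `mydrbd` and `ms_mydrbd` and the timeout values are illustrative, not from this setup; the agent and its `drbd_resource` parameter are the ocf:linbit:drbd agent shown above, and the master/clone meta attributes are the ones just listed:

```
crm(live)configure# primitive mydrbd ocf:linbit:drbd params drbd_resource=mystore \
        op monitor role=Master interval=10s timeout=20s \
        op monitor role=Slave interval=20s timeout=20s \
        op start timeout=240s op stop timeout=100s
crm(live)configure# ms ms_mydrbd mydrbd meta master-max=1 master-node-max=1 \
        clone-max=2 clone-node-max=1 notify=true
crm(live)configure# verify
crm(live)configure# commit
```

With master-max=1 and clone-max=2, pacemaker runs one copy on each node and promotes exactly one of them to primary, which is the master/slave model described earlier.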
This post is from the "xiexiaojun" blog; please keep this attribution: http://xiexiaojun.blog.51cto.com/2305291/1706913