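1. Distributed lock implementation
A distributed lock can be built on an ephemeral node. Several processes all try to create the ephemeral node /lock, but only one process, P, succeeds. The processes that fail set a watch on /lock, which amounts to waiting for the lock to be released. When P finishes its work and disconnects, /lock is deleted automatically, the waiting processes are notified, and they try again to create /lock to compete for the lock.
Steps:
Open a client and create the ephemeral lock node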
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 0] create -e /lock "lock"
Created /lock
Open a second client and try to create the same ephemeral node. The error shows that /lock already exists
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 2] create -e /lock "lock"
Node already exists: /lock
Close the first client, wait a few seconds, then list the znode tree from the second client. The lock node is gone
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 9] ls /
[zookeeper]
The second client can now create the lock node
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 10] create -e /lock "lock"
Created /lock
These commands simulate, in a simple form, several agents sharing a distributed lock.
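The same sequence can be sketched in Python. This is a toy in-memory model, not a real ZooKeeper client: ToyZooKeeper, NodeExistsError and try_lock are hypothetical names introduced only for illustration, and the model covers nothing beyond "an ephemeral node disappears when its owning session closes".

```python
class NodeExistsError(Exception):
    """Raised when a znode path is already taken (mirrors 'Node already exists')."""


class ToyZooKeeper:
    """In-memory stand-in for a ZooKeeper ensemble; NOT a real client."""

    def __init__(self):
        self.ephemeral_owner = {}  # path -> session id that owns the ephemeral node

    def create_ephemeral(self, session_id, path):
        if path in self.ephemeral_owner:
            raise NodeExistsError(path)
        self.ephemeral_owner[path] = session_id

    def close_session(self, session_id):
        # Closing a session removes every ephemeral node it owns.
        owned = [p for p, s in self.ephemeral_owner.items() if s == session_id]
        for path in owned:
            del self.ephemeral_owner[path]


def try_lock(zk, session_id, path="/lock"):
    """Return True if this session acquired the lock, False if it is already held."""
    try:
        zk.create_ephemeral(session_id, path)
        return True
    except NodeExistsError:
        return False


zk = ToyZooKeeper()
first = try_lock(zk, "client-1")    # first client creates /lock
second = try_lock(zk, "client-2")   # second client is rejected
zk.close_session("client-1")        # first client quits, /lock disappears
retry = try_lock(zk, "client-2")    # second client can now take the lock
print(first, second, retry)
```

The model leaves out watches: a real waiting client would be notified of the deletion instead of retrying blindly.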
2. Master-Worker implementation
The first session creates an ephemeral node named /master
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 3] create -e /master "master1.example.com:2223"
Created /master
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 4] ls /
[zookeeper, master]
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 5] get /master
"master1.example.com:2223"
cZxid = 0x20000000f
ctime = Wed Mar 16 11:28:18 CST 2016
mZxid = 0x20000000f
mtime = Wed Mar 16 11:28:18 CST 2016
pZxid = 0x20000000f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x3537d63b3340001
dataLength = 26
numChildren = 0
Now suppose another process, acting as the backup master, tries to create the master node. It is told that /master already exists
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 12] create -e /master "master2.example.com:2223"
Node already exists: /master
The primary master could crash at any moment, and the backup master should then take over right away, so the backup needs to set a watch on /master
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 13] stat /master true
cZxid = 0x20000000f
ctime = Wed Mar 16 11:28:18 CST 2016
mZxid = 0x20000000f
mtime = Wed Mar 16 11:28:18 CST 2016
pZxid = 0x20000000f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x3537d63b3340001
dataLength = 26
numChildren = 0
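The stat command returns the node's attributes and can also watch the node; the trailing true argument sets the watch. (This is the older zkCli syntax; newer releases use stat -w /master.) Now suppose the primary master (the first session) crashes and disconnects. The second session is notified that /master was deleted and takes over as primary master.
Quit the client on the primary node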
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 7] quit
Quitting...
2016-03-16 11:31:48,954 [myid:] - INFO [main:ZooKeeper@684] - Session: 0x3537d63b3340001 closed
2016-03-16 11:31:48,955 [myid:] - INFO [main-EventThread:ClientCnxn$EventThread@512] - EventThread shut down
[root@zookeeper1 ~]#
Meanwhile, the backup node receives the watch notification
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 14]
WATCHER::
WatchedEvent state:SyncConnected type:NodeDeleted path:/master
Listing the znode tree shows that /master is gone, so the backup creates it and becomes the master
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 14] ls /
[zookeeper]
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 15] create -e /master "master2.example.com:2223"
Created /master
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 16] ls /
[zookeeper, master]
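The failover above can be sketched in Python with a toy in-memory model. Nothing here talks to a real ZooKeeper server: ToyZooKeeper, watch_deleted and run_for_master are hypothetical names, and the model only assumes that ephemeral nodes vanish with their session and that a watch fires once.

```python
class ToyZooKeeper:
    """In-memory stand-in for the ensemble: ephemeral nodes plus one-shot watches."""

    def __init__(self):
        self.nodes = {}    # path -> (owner session id, data)
        self.watches = {}  # path -> callbacks to fire once when the node is deleted

    def create_ephemeral(self, session_id, path, data):
        if path in self.nodes:
            return False  # corresponds to "Node already exists"
        self.nodes[path] = (session_id, data)
        return True

    def watch_deleted(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def close_session(self, session_id):
        owned = [p for p, (owner, _) in self.nodes.items() if owner == session_id]
        for path in owned:
            del self.nodes[path]
            # Watches are one-shot: remove them before firing.
            for callback in self.watches.pop(path, []):
                callback(path)


def run_for_master(zk, session_id, address, log):
    """Try to become master; if the role is taken, watch /master and retry on deletion."""
    if zk.create_ephemeral(session_id, "/master", address):
        log.append(f"{session_id} is master")
    else:
        log.append(f"{session_id} is backup")
        zk.watch_deleted(
            "/master", lambda _path: run_for_master(zk, session_id, address, log)
        )


zk = ToyZooKeeper()
log = []
run_for_master(zk, "session-1", "master1.example.com:2223", log)
run_for_master(zk, "session-2", "master2.example.com:2223", log)
zk.close_session("session-1")  # the primary crashes or quits
print(log)
print(zk.nodes["/master"])
```

Because the retry goes through run_for_master again, a backup that loses the race to another backup simply sets a fresh watch and keeps waiting.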
3. Workers, Tasks and Assignments
First create the parent nodes for workers, tasks and assignments: /workers, /tasks and /assign. (Note that these are persistent nodes.)
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 0] create /workers ""
Created /workers
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 1] create /tasks ""
Created /tasks
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 2] create /assign ""
Created /assign
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 3] ls /
[zookeeper, workers, tasks, assign]
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 4]
The master now needs to watch /workers and /tasks for child changes so that it can assign tasks to workers
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 4] ls /workers true
[]
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 5] ls /tasks true
[]
Worker role
Open another session and assume one worker is now available
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 18] create -e /workers/worker1.example.com "worker1.example.com:2224"
Created /workers/worker1.example.com
The master is notified that the children of /workers changed
WATCHER::
WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/workers
To receive assignments, the worker creates the node /assign/worker1.example.com and watches it for child changes
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 20] create /assign/worker1.example.com ""
Created /assign/worker1.example.com
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 21] ls /assign/worker1.example.com true
[]
Client role
Now suppose a client submits a task. It must also watch the task node, because the client needs to know whether its task has been executed and whether it succeeded
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 0] create -s /tasks/task- "cmd"
Created /tasks/task-0000000000
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 1] ls /tasks/task-0000000000 true
[]
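Here we created a persistent sequential node, so ZooKeeper appended an increasing sequence number, 0000000000, to the node name. The master now learns that a new task was submitted and will assign it to worker1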
WATCHER::
WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/tasks
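The suffix in task-0000000000 comes from the -s flag: ZooKeeper appends a 10-digit, zero-padded counter to the requested name. A minimal Python sketch of that naming follows. ToyParent and sequential_name are hypothetical names for illustration; in a real ensemble the server maintains the counter for each parent node.

```python
def sequential_name(prefix, counter):
    """Append a 10-digit, zero-padded counter to the requested prefix."""
    return f"{prefix}{counter:010d}"


class ToyParent:
    """Tracks the next sequence number for one parent znode (toy model only)."""

    def __init__(self):
        self.next_seq = 0
        self.children = []

    def create_sequential(self, prefix):
        name = sequential_name(prefix, self.next_seq)
        self.next_seq += 1  # the counter only grows, so names are never reused
        self.children.append(name)
        return name


tasks = ToyParent()
first = tasks.create_sequential("task-")
second = tasks.create_sequential("task-")
print(first, second)
```

Since the names sort in creation order, the master can process tasks oldest first by sorting the children of /tasks.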
The master then checks the new task and the available workers, and assigns the task to a worker
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 7] ls /tasks
[task-0000000000]
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 8] ls /workers
[worker1.example.com]
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 9] create /assign/worker1.example.com/task-0000000000 ""
Created /assign/worker1.example.com/task-0000000000
The worker is notified of the task assigned to it and checks its assignments
WATCHER::
WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/assign/worker1.example.com
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 24] ls /assign/worker1.example.com
[task-0000000000]
Once the worker finishes the task, it adds a status node under the corresponding task
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 25] create /tasks/task-0000000000/status "done"
Created /tasks/task-0000000000/status
The client is then notified and checks the result of the task
WATCHER::
WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/tasks/task-0000000000
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 2] get /tasks/task-0000000000
"cmd"
cZxid = 0x20000001d
ctime = Wed Mar 16 13:28:05 CST 2016
mZxid = 0x20000001d
mtime = Wed Mar 16 13:28:05 CST 2016
pZxid = 0x20000001f
cversion = 1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 5
numChildren = 1
[zk: 127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184(CONNECTED) 3] get /tasks/task-0000000000/status
"done"
cZxid = 0x20000001f
ctime = Wed Mar 16 13:35:33 CST 2016
mZxid = 0x20000001f
mtime = Wed Mar 16 13:35:33 CST 2016
pZxid = 0x20000001f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 6
numChildren = 0
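The client now knows the result of the task. Here it is "done", meaning the task ran successfully.
That is the core mechanism of the Master-Worker architecture. It is only a simulation, but it helps in understanding how Master-Worker works and lays good groundwork for studying a code implementation later.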
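As a step toward a code implementation, the whole exchange can be replayed with a toy in-memory znode tree in Python. This is a sketch under stated assumptions, not a real client: ToyTree and the three callbacks are hypothetical names, ephemeral nodes are not modelled, watches are one-shot and never re-registered, and the master simply picks the first worker.

```python
from collections import deque


class ToyTree:
    """In-memory znode tree with one-shot child watches (toy model only)."""

    def __init__(self):
        self.data = {"/": ""}
        self.child_watches = {}  # parent path -> callbacks, fired once
        self.seq = {}            # parent path -> next sequence number
        self.pending = deque()   # notifications not yet delivered

    def create(self, path, data="", sequential=False):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.data:
            raise KeyError(f"no parent node: {parent}")
        if sequential:
            n = self.seq.get(parent, 0)
            self.seq[parent] = n + 1
            path = f"{path}{n:010d}"
        if path in self.data:
            raise KeyError(f"node already exists: {path}")
        self.data[path] = data
        # Child watches are one-shot: remove them and queue the notification.
        for callback in self.child_watches.pop(parent, []):
            self.pending.append((callback, parent))
        return path

    def children(self, path, watch=None):
        if watch is not None:
            self.child_watches.setdefault(path, []).append(watch)
        prefix = path.rstrip("/") + "/"
        names = [p[len(prefix):] for p in self.data if p.startswith(prefix)]
        return sorted(n for n in names if n and "/" not in n)

    def deliver(self):
        """Deliver queued notifications, including ones triggered on the way."""
        while self.pending:
            callback, path = self.pending.popleft()
            callback(path)


tree = ToyTree()
events = []
for parent_path in ("/workers", "/tasks", "/assign"):
    tree.create(parent_path)


def on_workers_changed(path):
    events.append("master sees " + ", ".join(tree.children(path)))


def on_new_task(path):
    # Master: assign every task to the first worker (no load balancing here).
    worker = tree.children("/workers")[0]
    for task in tree.children(path):
        events.append(f"master assigns {task} to {worker}")
        tree.create(f"/assign/{worker}/{task}")


def on_assignment(path):
    # Worker: run each assigned task, then publish its status under the task node.
    for task in tree.children(path):
        events.append(f"worker runs {task}")
        tree.create(f"/tasks/{task}/status", "done")


def on_task_done(path):
    # Client: read the status node that the worker created.
    events.append(f"client reads status {tree.data[path + '/status']}")


tree.children("/workers", watch=on_workers_changed)   # master
tree.children("/tasks", watch=on_new_task)            # master
tree.create("/workers/worker1.example.com", "worker1.example.com:2224")
tree.create("/assign/worker1.example.com")
tree.children("/assign/worker1.example.com", watch=on_assignment)  # worker
task_path = tree.create("/tasks/task-", "cmd", sequential=True)    # client
tree.children(task_path, watch=on_task_done)                       # client
tree.deliver()
print("\n".join(events))
```

Notifications are queued and delivered later rather than fired inside create, which lets the client set its watch before the status node appears, as it does in the session above.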
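4. Summary: the sections above covered three main ZooKeeper usage patterns. In brief:
1) Distributed lock implementation
All clients try to create the ephemeral lock node /lock. Whoever creates it first holds the lock, whoever holds the lock releases it when done, and if the holder crashes the lock is released automatically.
2) Master-Worker implementation
This is a commonly used primary/backup scheme. The primary starts and takes the ephemeral node /master. The backup starts, cannot take /master, and watches it instead. When the primary crashes, the backup is notified and immediately takes /master.
3) Workers, tasks and assignments
Create the /workers, /tasks and /assign nodes
The master watches /workers and /tasks
When a worker registers, the master is notified
The worker creates its node under /assign and watches it
The client creates a task node and watches it
The master sees the new task, determines the available workers, and assigns the task to a worker
The worker sees the assignment, processes the task, and adds a status node
The client sees that the task is finished and reads the result.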
Original: http://www.cnblogs.com/run4life/p/5327231.html