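Every node in a cluster must have a unique name. By default Consul uses the machine's hostname as the node name, but it can be overridden with the -node command-line option.
Building the cluster
Starting the first node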
[root@consul-1 ~]# consul agent -server -bootstrap-expect 1 -data-dir /tmp/consul -node=agent-one -bind=192.168.0.149 -config-dir /etc/consul.d/
==> WARNING: BootstrapExpect Mode is specified as 1; this is the same as Bootstrap mode.
==> WARNING: Bootstrap mode enabled! Do not enable unless necessary
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
    Node name: 'agent-one'
    Datacenter: 'dc1'
    Server: true (bootstrap: true)
    Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
    Cluster Addr: 192.168.0.149 (LAN: 8301, WAN: 8302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
    Atlas: <disabled>
==> Log data will now stream in as it occurs:
    2016/03/23 17:34:33 [INFO] serf: EventMemberJoin: agent-one 192.168.0.149
    2016/03/23 17:34:33 [INFO] serf: EventMemberJoin: agent-one.dc1 192.168.0.149
    2016/03/23 17:34:33 [INFO] raft: Node at 192.168.0.149:8300 [Follower] entering Follower state
    2016/03/23 17:34:33 [INFO] consul: adding LAN server agent-one (Addr: 192.168.0.149:8300) (DC: dc1)
    2016/03/23 17:34:33 [INFO] consul: adding WAN server agent-one.dc1 (Addr: 192.168.0.149:8300) (DC: dc1)
    2016/03/23 17:34:33 [ERR] agent: failed to sync remote state: No cluster leader
    2016/03/23 17:34:35 [WARN] raft: Heartbeat timeout reached, starting election
    2016/03/23 17:34:35 [INFO] raft: Node at 192.168.0.149:8300 [Candidate] entering Candidate state
    2016/03/23 17:34:36 [INFO] raft: Election won. Tally: 1
    2016/03/23 17:34:36 [INFO] raft: Node at 192.168.0.149:8300 [Leader] entering Leader state
    2016/03/23 17:34:36 [INFO] consul: cluster leadership acquired
    2016/03/23 17:34:36 [INFO] consul: New leader elected: agent-one
    2016/03/23 17:34:36 [INFO] raft: Disabling EnableSingleNode (bootstrap)
    2016/03/23 17:34:36 [INFO] agent: Synced service 'consul'
    2016/03/23 17:34:37 [INFO] agent: Synced service 'web'
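-server | Run the agent in server mode |
-bootstrap-expect | Number of server nodes expected to join |
-data-dir | Directory where the agent stores its data |
-node | Name of the node |
-bind | IP address to bind to |
-config-dir | Directory containing configuration files |
NOTE: The leader election is triggered only once the expected number of server nodes has joined.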
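The same settings can also be kept in a configuration file instead of being passed as flags. The following is a minimal sketch, assuming the standard agent configuration keys (server, bootstrap_expect, data_dir, node_name, bind_addr); the file name server.json, the helper write_server_config, and the scratch directory are made up for illustration.

```shell
#!/bin/sh
# Sketch: write an agent configuration file that mirrors the flags used above.
# write_server_config and server.json are hypothetical names.
write_server_config() {
    conf_dir="$1"
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/server.json" <<'EOF'
{
  "server": true,
  "bootstrap_expect": 1,
  "data_dir": "/tmp/consul",
  "node_name": "agent-one",
  "bind_addr": "192.168.0.149"
}
EOF
}

# Demo: write into a scratch directory rather than /etc/consul.d.
demo_dir="$(mktemp -d)"
write_server_config "$demo_dir"
cat "$demo_dir/server.json"
```

With such a file in place, the agent could be started with just consul agent -config-dir pointing at that directory.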
Starting the second node
[root@consul-2 ~]# consul agent -data-dir /tmp/consul -node=a2 -bind=192.168.100.103 -config-dir /etc/consul.d
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
    Node name: 'a2'
    Datacenter: 'dc1'
    Server: false (bootstrap: false)
    Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
    Cluster Addr: 192.168.100.103 (LAN: 8301, WAN: 8302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
    Atlas: <disabled>
==> Log data will now stream in as it occurs:
    2016/03/18 21:51:55 [INFO] serf: EventMemberJoin: a2 192.168.100.103
    2016/03/18 21:51:55 [ERR] agent: failed to sync remote state: No known Consul servers
    2016/03/18 21:52:17 [ERR] agent: failed to sync remote state: No known Consul servers
    2016/03/18 21:52:22 [INFO] agent.rpc: Accepted client: 127.0.0.1:42288
    2016/03/18 21:52:38 [ERR] agent: failed to sync remote state: No known Consul servers
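Two agents are now running: the one on consul-1 acts as a server and the one on consul-2 acts as a client. They do not know about each other yet; each is the only member of its own single-node cluster.
This can be confirmed with consul members: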
[root@consul-1 ~]# consul members
Node       Address             Status  Type    Build  Protocol  DC
agent-one  192.168.0.149:8301  alive   server  0.6.4  2         dc1
[root@consul-2 ~]# consul members
Node       Address             Status  Type    Build  Protocol  DC
agent-two  192.168.0.161:8301  alive   client  0.6.4  2         dc1
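Joining the cluster
Joining consul-1 to consul-2
A node joins a cluster by contacting any one of its existing members. Once it has joined, an election is triggered and a leader is chosen.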
[root@consul-1 ~]# consul join 192.168.0.161
Successfully joined cluster by contacting 1 nodes.
After the join succeeds, check the output on the server side:
2016/03/23 18:17:10 [INFO] serf: EventMemberJoin: agent-one 192.168.0.149
2016/03/23 18:17:10 [INFO] consul: adding server agent-one (Addr: 192.168.0.149:8300) (DC: dc1)
2016/03/23 18:17:10 [INFO] consul: New leader elected: agent-one
2016/03/23 18:17:12 [INFO] agent: Synced node info
Check the member list on both nodes; the two agents now see each other:
[root@consul-1 ~]# consul members
Node       Address             Status  Type    Build  Protocol  DC
agent-one  192.168.0.149:8301  alive   server  0.6.4  2         dc1
agent-two  192.168.0.161:8301  alive   client  0.6.4  2         dc1
[root@consul-2 ~]# consul members
Node       Address             Status  Type    Build  Protocol  DC
agent-one  192.168.0.149:8301  alive   server  0.6.4  2         dc1
agent-two  192.168.0.161:8301  alive   client  0.6.4  2         dc1
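When scripting around the member list, it can be handy to count how many members of a given type are alive. The sketch below assumes the column layout shown above (Node, Address, Status, Type, ...); count_alive is a hypothetical helper, and the sample data is copied from the output above rather than taken from a live agent.

```shell
#!/bin/sh
# count_alive TYPE: read `consul members` output on stdin and print how many
# members of the given type (server or client) have the status "alive".
# Assumes Status is column 3 and Type is column 4, as in the output above.
count_alive() {
    awk -v type="$1" 'NR > 1 && $3 == "alive" && $4 == type { n++ } END { print n + 0 }'
}

# Sample taken from the transcript above, so the sketch runs without Consul.
sample='Node       Address             Status  Type    Build  Protocol  DC
agent-one  192.168.0.149:8301  alive   server  0.6.4  2         dc1
agent-two  192.168.0.161:8301  alive   client  0.6.4  2         dc1'

printf '%s\n' "$sample" | count_alive server
printf '%s\n' "$sample" | count_alive client
```

On a real node the input would come from the CLI instead, for example consul members | count_alive server.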
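Leaving the cluster
An agent can be stopped gracefully with Ctrl+C, or it can be stopped by killing the process. The difference is that a gracefully stopped node tells the other members that it is leaving, whereas a killed node is only noticed to be gone and is marked as failed by the other members.
Health checks
Health checks are a critical mechanism for keeping requests away from services that are not running properly.
As with services, there are two ways to define a check:
·Through a configuration file
·Through the HTTP API
Defining checks
In the configuration directory of the second node (consul-2 is a server at this point), create two files: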
[root@consul-2 ~]# echo '{"check": {"name": "ping",
> "script": "ping -c1 google.com >/dev/null", "interval": "30s"}}' \
> >/etc/consul.d/ping.json
[root@consul-2 ~]# echo '{"service": {"name": "web", "tags": ["rails"], "port": 80,
> "check": {"script": "curl localhost >/dev/null 2>&1", "interval": "10s"}}}' \
> >/etc/consul.d/web.json
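For both checks, a non-zero exit status from the script marks the check as unhealthy (exit code 1 is reported as warning, any other non-zero code as critical).
Reloading the configuration
Send the SIGHUP signal to the agent process to make it reload its configuration.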
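To see how a script's exit status translates into a check state, the mapping can be tried out locally. This is only a sketch of the convention for script checks (0 is passing, 1 is warning, anything else is critical); run_check is a hypothetical helper and does not talk to Consul at all.

```shell
#!/bin/sh
# run_check CMD: run CMD through sh -c, discard its output, and print the
# state a script check would report for that exit status.
run_check() {
    if sh -c "$1" >/dev/null 2>&1; then
        status=0
    else
        status=$?
    fi
    case "$status" in
        0) echo passing ;;
        1) echo warning ;;
        *) echo critical ;;
    esac
}

run_check 'true'      # exit status 0
run_check 'exit 1'    # exit status 1
run_check 'exit 2'    # any other non-zero status
```

The same helper can be pointed at the real check commands, for example run_check 'curl localhost', to preview what the agent is likely to report.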
echo '{"check": {"name": "ping","script": "ping -c1 soft.dog >/dev/null", "interval": "30s"}}' > /etc/consul.d/ping.json
echo '{"service": {"name": "web", "tags": ["rails"], "port": 80,"check": {"script": "curl localhost >/dev/null 2>&1", "interval": "10s"}}}' > /etc/consul.d/web.json
cat /etc/consul.d/ping.json
{"check": {"name": "ping","script": "ping -c1 soft.dog >/dev/null", "interval": "30s"}}
cat /etc/consul.d/web.json
{"service": {"name": "web", "tags": ["rails"], "port": 80,"check": {"script": "curl localhost >/dev/null 2>&1", "interval": "10s"}}}
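The signal itself can be sent with kill. The following is a small sketch, not a definitive procedure: reload_agent is a hypothetical helper, and how the agent's PID is obtained (for example with pidof consul) is an assumption about the host.

```shell
#!/bin/sh
# reload_agent PID: ask the process with the given PID to reload its
# configuration by sending it SIGHUP. Returns 2 when no PID is given,
# otherwise the exit status of kill.
reload_agent() {
    pid="$1"
    if [ -z "$pid" ]; then
        echo "usage: reload_agent PID" >&2
        return 2
    fi
    kill -HUP "$pid"
}
```

On the node this might be called as reload_agent "$(pidof consul)". The CLI also provides consul reload, which asks the local agent to reload without looking up a PID.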
The log output on consul-2 now shows the following:
2016/03/23 19:26:39 [INFO] agent: Synced service 'web'
2016/03/23 19:26:40 [INFO] agent: Synced check 'ping'
2016/03/23 19:26:42 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:26:53 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:27:03 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:27:10 [INFO] agent: Synced check 'ping'
2016/03/23 19:27:13 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:27:23 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:27:33 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:27:43 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:27:50 [WARN] agent: Check 'ping' is now warning
2016/03/23 19:27:50 [INFO] agent: Synced check 'ping'
2016/03/23 19:27:54 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:28:04 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:28:14 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:28:20 [INFO] agent: Synced check 'ping'
2016/03/23 19:28:24 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:28:34 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:28:44 [WARN] agent: Check 'service:web' is now critical
2016/03/23 19:28:54 [WARN] agent: Check 'service:web' is now critical
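After the reload, both check definitions have been loaded successfully.
The ping check is healthy because the ping target is reachable. The web check, on the other hand, is reported as unhealthy: no web service is actually running locally, so nothing is listening on port 80 and curl gets no content back.
Checking the status
The HTTP API can be used to query the state of the checks: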
[root@consul-2 ~]# curl http://localhost:8500/v1/health/state/critical
[{"Node":"agent-two","CheckID":"service:web","Name":"Service 'web' check","Status":"critical","Notes":"","Output":"","ServiceID":"web","ServiceName":"web","CreateIndex":1206,"ModifyIndex":1206}]
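For a quick overview it is often enough to pull just the check IDs out of that response. The sketch below does this with grep and cut and assumes the compact JSON layout shown above (no space after the colon); check_ids is a hypothetical helper, and a proper JSON tool would be more robust.

```shell
#!/bin/sh
# check_ids: read a /v1/health/state/<state> response on stdin and print the
# value of every "CheckID" field, one per line.
# Assumes compact JSON such as "CheckID":"service:web".
check_ids() {
    grep -o '"CheckID":"[^"]*"' | cut -d'"' -f4
}

# Sample response copied from the output above, so no agent is required.
check_ids <<'EOF'
[{"Node":"agent-two","CheckID":"service:web","Name":"Service 'web' check","Status":"critical","Notes":"","Output":"","ServiceID":"web","ServiceName":"web","CreateIndex":1206,"ModifyIndex":1206}]
EOF
```

Against a live agent the helper would be fed from curl, for example curl -s http://localhost:8500/v1/health/state/critical | check_ids.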
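This query can be run on any node.
As with services, health checks can also be added, removed, and modified dynamically through the HTTP API.
This article is from the blog "翟军铭的linux博客"; please keep this attribution: http://zhaijunming5.blog.51cto.com/10668883/1754445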
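As a rough illustration of the dynamic route, a script check can be registered by sending a JSON body to the agent's /v1/agent/check/register endpoint. This is a sketch under assumptions: the Name/Script/Interval field names are those of the Consul version used in this post (0.6.4) and may differ in newer releases, and check_payload, register_check, and AGENT_URL are hypothetical names. The payload builder does no JSON escaping, so the values must not contain double quotes.

```shell
#!/bin/sh
# check_payload NAME SCRIPT INTERVAL: print a JSON body describing a script
# check. No escaping is performed on the arguments.
check_payload() {
    printf '{"Name": "%s", "Script": "%s", "Interval": "%s"}\n' "$1" "$2" "$3"
}

# register_check NAME SCRIPT INTERVAL: PUT the payload to the local agent.
# AGENT_URL is a hypothetical variable; this function is defined here but
# not called, so the sketch runs without an agent.
register_check() {
    check_payload "$@" | curl -s -X PUT --data-binary @- \
        "${AGENT_URL:-http://localhost:8500}/v1/agent/check/register"
}

# Show the body that would be sent for the ping check used earlier.
check_payload ping "ping -c1 soft.dog >/dev/null" 30s
```

A check registered this way can later be removed through /v1/agent/check/deregister/ followed by the check ID.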