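Pod health can be checked with two kinds of probes: livenessProbe and readinessProbe.
I. Probe types
livenessProbe: determines whether the container is alive (Running state). If the probe finds the container unhealthy, kubelet kills the container and then handles it according to the container's restart policy. If a container has no livenessProbe configured, kubelet treats the liveness result as always Success.
readinessProbe: determines whether the container has finished starting (Ready state) and can accept requests. If the probe fails, the Pod is marked as not ready, and the endpoints controller removes the Pod's address from the endpoints of every Service that selects it.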
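II. The three livenessProbe check methods:
kubelet runs the livenessProbe periodically to diagnose the health of the container.
1. exec: runs a command inside the container. If the command exits with return code 0, the container is considered healthy.
2. tcpSocket: performs a TCP check against the container's IP address and port. If a TCP connection can be established, the container is considered healthy.
3. httpGet: sends an HTTP GET request to the container's IP address, port, and path. If the response status code is greater than or equal to 200 and less than 400, the container is considered healthy.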
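The three success criteria above can be sketched in a few lines of Python. This is only an illustrative model, not kubelet's implementation: the function names `exec_check`, `tcp_check`, and `http_status_is_healthy` are made up for this example, and the exec check runs the command on the local machine rather than inside a container.

```python
import socket
import subprocess
import sys


def exec_check(command, timeout=1.0):
    """exec rule: healthy when the command exits with return code 0 in time."""
    try:
        result = subprocess.run(
            command,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.TimeoutExpired, OSError):
        # A timeout or a command that cannot be started counts as a failure.
        return False
    return result.returncode == 0


def tcp_check(host, port, timeout=1.0):
    """tcpSocket rule: healthy when a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable host, or timeout.
        return False


def http_status_is_healthy(status_code):
    """httpGet rule: healthy when 200 <= status code < 400."""
    return 200 <= status_code < 400


if __name__ == "__main__":
    # A command that exits with 0 passes the exec rule.
    print(exec_check([sys.executable, "-c", "pass"], timeout=10))
    # A 404 response fails the httpGet rule.
    print(http_status_is_healthy(404))
```

Each function returns a plain boolean, which keeps the pass or fail rule of each method separate from the scheduling parameters (delay, period, timeout) discussed later.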
III. Using the exec liveness probe

    [root@kub_master k8s]# mkdir healthy
    [root@kub_master k8s]# cd healthy/

1. Create the Pod

    [root@kub_master healthy]# vim pod_nginx_exec.yaml
    [root@kub_master healthy]# cat pod_nginx_exec.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: liveness-exec
    spec:
      containers:
        - name: nginx
          image: 192.168.0.212:5000/nginx:1.13
          ports:
            - containerPort: 80
          args:
            - /bin/sh
            - -c
            - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
          livenessProbe:
            exec:
              command:
                - cat
                - /tmp/healthy
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 1

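The probe runs the command "cat /tmp/healthy" to decide whether the container is healthy. When the container starts it creates /tmp/healthy and deletes it 30 seconds later. The probe first runs 5 seconds after startup and then every 5 seconds, so it succeeds while the file exists and starts failing once the file is removed. After three consecutive failures (the default failure threshold, shown as #failure=3 in the kubectl describe output), kubelet kills the container and restarts it.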
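The timing described above can be worked through with a small simulation. It is a simplified sketch under stated assumptions: the function name `first_kill_time` is hypothetical, the first probe is assumed to fire exactly at the initial delay, probes are assumed to run at exact multiples of the period (the real kubelet adds some jitter), and a probe that fires at the same instant the file is removed is assumed to still see the file.

```python
def first_kill_time(remove_at, initial_delay, period, failure_threshold=3):
    """Return the time (seconds after container start) of the probe that
    reaches the failure threshold, in a jitter-free model of the exec probe."""
    if period <= 0:
        raise ValueError("period must be positive")
    consecutive_failures = 0
    t = initial_delay
    while True:
        # Assumption: a probe exactly at remove_at still finds the file.
        file_exists = t <= remove_at
        if file_exists:
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= failure_threshold:
                return t
        t += period


# Values from pod_nginx_exec.yaml: file removed at 30s, delay 5s, period 5s.
print(first_kill_time(remove_at=30, initial_delay=5, period=5))  # 45
```

In this model the probes at 35s, 40s, and 45s fail, so the threshold is reached about 45 seconds after startup. The model covers only the time until the threshold is reached, not the additional time kubelet then needs to stop and recreate the container, so the restart seen in a real cluster happens somewhat later.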
2. Test

    [root@kub_master healthy]# kubectl create -f pod_nginx_exec.yaml
    pod "liveness-exec" created
    [root@kub_master healthy]# kubectl get pods
    NAME                             READY     STATUS    RESTARTS   AGE
    busybox2                         1/1       Running   4          4h
    liveness-exec                    1/1       Running   0          8s
    mysql-ms870                      1/1       Running   0          3h
    mysql-wp-3651026459-v31gc        1/1       Running   0          5h
    myweb-5zpl5                      1/1       Running   0          3h
    myweb-g09wf                      1/1       Running   0          3h
    wp-deployment-3182043070-2jmkb   1/1       Running   0          5h
    wp-deployment-3182043070-r7bmq   1/1       Running   0          5h
    [root@kub_master healthy]# kubectl describe pod liveness-exec
    Name:           liveness-exec
    Namespace:      default
    Node:           192.168.0.184/192.168.0.184
    Start Time:     Sat, 26 Sep 2020 23:34:22 +0800
    Labels:         <none>
    Status:         Running
    IP:             172.16.46.4
    Controllers:    <none>
    Containers:
      nginx:
        Container ID:       docker://c12b52e50014546302596a2e235f2d701471e7cf681d993653123fe62ab0e746
        Image:              192.168.0.212:5000/nginx:1.13
        Image ID:           docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
        Port:               80/TCP
        Args:
          /bin/sh
          -c
          touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
        State:              Running
          Started:          Sat, 26 Sep 2020 23:34:23 +0800
        Ready:              True
        Restart Count:      0
        Liveness:           exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
        Volume Mounts:      <none>
        Environment Variables:      <none>
    Conditions:
      Type          Status
      Initialized   True
      Ready         True
      PodScheduled  True
    No volumes.
    QoS Class:      BestEffort
    Tolerations:    <none>
    Events:
      FirstSeen  LastSeen  Count  From                     SubObjectPath           Type      Reason     Message
      ---------  --------  -----  ----                     -------------           --------  ------     -------
      25s        25s       1      {default-scheduler }                             Normal    Scheduled  Successfully assigned liveness-exec to 192.168.0.184
      25s        25s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Pulling    pulling image "192.168.0.212:5000/nginx:1.13"
      25s        25s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Pulled     Successfully pulled image "192.168.0.212:5000/nginx:1.13"
      24s        24s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Created    Created container with docker id c12b52e50014; Security:[seccomp=unconfined]
      24s        24s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Started    Started container with docker id c12b52e50014

    [root@kub_master healthy]# kubectl describe pod liveness-exec
    Name:           liveness-exec
    Namespace:      default
    Node:           192.168.0.184/192.168.0.184
    Start Time:     Sat, 26 Sep 2020 23:34:22 +0800
    Labels:         <none>
    Status:         Running
    IP:             172.16.46.4
    Controllers:    <none>
    Containers:
      nginx:
        Container ID:       docker://26ad03feb4f89e4a16d7a826adcbae2c433d200aa204d416e71a922472794172
        Image:              192.168.0.212:5000/nginx:1.13
        Image ID:           docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
        Port:               80/TCP
        Args:
          /bin/sh
          -c
          touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
        State:              Running
          Started:          Sat, 26 Sep 2020 23:35:37 +0800
        Last State:         Terminated
          Reason:           Error
          Exit Code:        137
          Started:          Sat, 26 Sep 2020 23:34:23 +0800
          Finished:         Sat, 26 Sep 2020 23:35:37 +0800
        Ready:              True
        Restart Count:      1
        Liveness:           exec [cat /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
        Volume Mounts:      <none>
        Environment Variables:      <none>
    Conditions:
      Type          Status
      Initialized   True
      Ready         True
      PodScheduled  True
    No volumes.
    QoS Class:      BestEffort
    Tolerations:    <none>
    Events:
      FirstSeen  LastSeen  Count  From                     SubObjectPath           Type      Reason     Message
      ---------  --------  -----  ----                     -------------           --------  ------     -------
      1m         1m        1      {default-scheduler }                             Normal    Scheduled  Successfully assigned liveness-exec to 192.168.0.184
      1m         1m        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Pulling    pulling image "192.168.0.212:5000/nginx:1.13"
      1m         1m        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Pulled     Successfully pulled image "192.168.0.212:5000/nginx:1.13"
      1m         1m        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Created    Created container with docker id c12b52e50014; Security:[seccomp=unconfined]
      1m         1m        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Started    Started container with docker id c12b52e50014
      42s        32s       3      {kubelet 192.168.0.184}  spec.containers{nginx}  Warning   Unhealthy  Liveness probe failed: cat: /tmp/healthy: No such file or directory
      2s         2s        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Killing    Killing container with docker id c12b52e50014: pod "liveness-exec_default(c00e9e7c-000d-11eb-8a8e-fa163e38ad0d)" container "nginx" is unhealthy, it will be killed and re-created.
      2s         2s        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Pulled     Container image "192.168.0.212:5000/nginx:1.13" already present on machine
      2s         2s        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Created    Created container with docker id 26ad03feb4f8; Security:[seccomp=unconfined]
      2s         2s        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Started    Started container with docker id 26ad03feb4f8
    [root@kub_master healthy]# kubectl get pods
    NAME                             READY     STATUS    RESTARTS   AGE
    busybox2                         1/1       Running   4          4h
    liveness-exec                    1/1       Running   1          1m
    mysql-ms870                      1/1       Running   0          3h
    mysql-wp-3651026459-v31gc        1/1       Running   0          5h
    myweb-5zpl5                      1/1       Running   0          3h
    myweb-g09wf                      1/1       Running   0          3h
    wp-deployment-3182043070-2jmkb   1/1       Running   0          5h
    wp-deployment-3182043070-r7bmq   1/1       Running   0          5h

IV. Using the httpGet liveness probe
1. Create the Pod

    [root@kub_master healthy]# vim pod_nginx_httpGet.yaml
    [root@kub_master healthy]# cat pod_nginx_httpGet.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: liveness-httpget
    spec:
      containers:
        - name: nginx
          image: 192.168.0.212:5000/nginx:1.13
          ports:
            - containerPort: 80
          livenessProbe:
            httpGet:
              path: /index.html
              port: 80
            initialDelaySeconds: 3
            timeoutSeconds: 3
    [root@kub_master healthy]# kubectl create -f pod_nginx_httpGet.yaml
    pod "liveness-httpget" created
    [root@kub_master healthy]# kubectl get pods -o wide
    NAME               READY     STATUS    RESTARTS   AGE       IP            NODE
    liveness-httpget   1/1       Running   0          11s       172.16.46.2   192.168.0.184
    [root@kub_master healthy]# kubectl describe pod liveness-httpget
    Name:           liveness-httpget
    Namespace:      default
    Node:           192.168.0.184/192.168.0.184
    Start Time:     Sat, 26 Sep 2020 23:48:31 +0800
    Labels:         <none>
    Status:         Running
    IP:             172.16.46.2
    Controllers:    <none>
    Containers:
      nginx:
        Container ID:       docker://2212178360bc1615e54e4cd0f3901b6a61de29868759af7efd5b0bf2874ea97a
        Image:              192.168.0.212:5000/nginx:1.13
        Image ID:           docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
        Port:               80/TCP
        State:              Running
          Started:          Sat, 26 Sep 2020 23:48:31 +0800
        Ready:              True
        Restart Count:      0
        Liveness:           http-get http://:80/index.html delay=3s timeout=3s period=10s #success=1 #failure=3
        Volume Mounts:      <none>
        Environment Variables:      <none>
    Conditions:
      Type          Status
      Initialized   True
      Ready         True
      PodScheduled  True
    No volumes.
    QoS Class:      BestEffort
    Tolerations:    <none>
    Events:
      FirstSeen  LastSeen  Count  From                     SubObjectPath           Type      Reason     Message
      ---------  --------  -----  ----                     -------------           --------  ------     -------
      48s        48s       1      {default-scheduler }                             Normal    Scheduled  Successfully assigned liveness-httpget to 192.168.0.184
      48s        48s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Pulled     Container image "192.168.0.212:5000/nginx:1.13" already present on machine
      48s        48s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Created    Created container with docker id 2212178360bc; Security:[seccomp=unconfined]
      48s        48s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Started    Started container with docker id 2212178360bc

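kubelet periodically sends an HTTP GET request for /index.html to port 80 on the Pod's IP (the probe sets no host, which is why the describe output shows http://:80/index.html) to check the health of the application in the container.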
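For illustration, a comparable check can be written with Python's standard library. This is a sketch of the same success rule (status code from 200 up to but not including 400), not kubelet's own code; the function name `http_get_probe` is an assumption made for this example, and the opener deliberately ignores proxy settings so the request goes straight to the target.

```python
import urllib.error
import urllib.request

# Build an opener that bypasses any proxy configured in the environment.
_direct_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_get_probe(url, timeout=3.0):
    """Return True when a GET request to url ends with 200 <= status < 400."""
    try:
        with _direct_opener.open(url, timeout=timeout) as response:
            # urllib follows redirects, so this is the final status code.
            return 200 <= response.status < 400
    except urllib.error.HTTPError:
        # urllib raises HTTPError for 4xx and 5xx responses, for example a
        # 404 once index.html has been moved away.
        return False
    except OSError:
        # Connection refused, timeout, or unreachable host.
        return False


# Example call against the Pod IP from the output above (needs cluster access):
# http_get_probe("http://172.16.46.2:80/index.html", timeout=3.0)
```

The `timeout` argument plays the role of timeoutSeconds: a request that does not complete in time is reported as a failed probe rather than raising an exception.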
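2. Test
The output above shows that the container is healthy. Next, enter the container, move index.html out of the web root, and check whether the probe detects the failure.
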
    [root@kub_master healthy]# kubectl exec -it liveness-httpget bash
    root@liveness-httpget:/# ls /usr/share/nginx/html/
    50x.html  index.html
    root@liveness-httpget:/# mv /usr/share/nginx/html/index.html /tmp
    root@liveness-httpget:/# ls /usr/share/nginx/html/
    50x.html
    root@liveness-httpget:/# exit
    exit
    [root@kub_master healthy]# kubectl describe pod liveness-httpget
    Name:           liveness-httpget
    Namespace:      default
    Node:           192.168.0.184/192.168.0.184
    Start Time:     Sat, 26 Sep 2020 23:48:31 +0800
    Labels:         <none>
    Status:         Running
    IP:             172.16.46.2
    Controllers:    <none>
    Containers:
      nginx:
        Container ID:       docker://2e4358ea39f3967afd15988e9d731105b901e353e44f904769cacadf3910ec33
        Image:              192.168.0.212:5000/nginx:1.13
        Image ID:           docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
        Port:               80/TCP
        State:              Running
          Started:          Sat, 26 Sep 2020 23:55:51 +0800
        Last State:         Terminated
          Reason:           Completed
          Exit Code:        0
          Started:          Sat, 26 Sep 2020 23:48:31 +0800
          Finished:         Sat, 26 Sep 2020 23:55:51 +0800
        Ready:              True
        Restart Count:      1
        Liveness:           http-get http://:80/index.html delay=3s timeout=3s period=10s #success=1 #failure=3
        Volume Mounts:      <none>
        Environment Variables:      <none>
    Conditions:
      Type          Status
      Initialized   True
      Ready         True
      PodScheduled  True
    No volumes.
    QoS Class:      BestEffort
    Tolerations:    <none>
    Events:
      FirstSeen  LastSeen  Count  From                     SubObjectPath           Type      Reason     Message
      ---------  --------  -----  ----                     -------------           --------  ------     -------
      8m         8m        1      {default-scheduler }                             Normal    Scheduled  Successfully assigned liveness-httpget to 192.168.0.184
      8m         8m        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Created    Created container with docker id 2212178360bc; Security:[seccomp=unconfined]
      8m         8m        1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Started    Started container with docker id 2212178360bc
      8m         45s       2      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Pulled     Container image "192.168.0.212:5000/nginx:1.13" already present on machine
      1m         45s       3      {kubelet 192.168.0.184}  spec.containers{nginx}  Warning   Unhealthy  Liveness probe failed: HTTP probe failed with statuscode: 404
      45s        45s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Killing    Killing container with docker id 2212178360bc: pod "liveness-httpget_default(ba0c49a6-000f-11eb-8a8e-fa163e38ad0d)" container "nginx" is unhealthy, it will be killed and re-created.
      45s        45s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Created    Created container with docker id 2e4358ea39f3; Security:[seccomp=unconfined]
      45s        45s       1      {kubelet 192.168.0.184}  spec.containers{nginx}  Normal    Started    Started container with docker id 2e4358ea39f3
    [root@kub_master healthy]# kubectl get pods -o wide
    NAME               READY     STATUS    RESTARTS   AGE       IP            NODE
    liveness-httpget   1/1       Running   1          8m        172.16.46.2   192.168.0.184

V. Using the tcpSocket liveness probe
1. Create the Pod

    [root@kub_master healthy]# vim pod_nginx_tcpSocket.yaml
    [root@kub_master healthy]# cat pod_nginx_tcpSocket.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: liveness-tcpsocket
    spec:
      containers:
        - name: nginx
          image: 192.168.0.212:5000/nginx:1.13
          ports:
            - containerPort: 80
          livenessProbe:
            tcpSocket:
              port: 80
            initialDelaySeconds: 3
            timeoutSeconds: 3
    [root@kub_master healthy]# kubectl create -f pod_nginx_tcpSocket.yaml
    pod "liveness-tcpsocket" created
    [root@kub_master healthy]# kubectl get pods -o wide
    NAME                 READY     STATUS    RESTARTS   AGE       IP            NODE
    liveness-httpget     1/1       Running   1          14m       172.16.46.2   192.168.0.184
    liveness-tcpsocket   1/1       Running   0          15s       172.16.81.3   192.168.0.212
    [root@kub_master healthy]# kubectl describe pod liveness-tcpsocket
    Name:           liveness-tcpsocket
    Namespace:      default
    Node:           192.168.0.212/192.168.0.212
    Start Time:     Sun, 27 Sep 2020 00:03:13 +0800
    Labels:         <none>
    Status:         Running
    IP:             172.16.81.3
    Controllers:    <none>
    Containers:
      nginx:
        Container ID:       docker://edfb206ebf8036227b960e0bab346c55c065dae10e8c0f1b447ffef0b74fa01e
        Image:              192.168.0.212:5000/nginx:1.13
        Image ID:           docker-pullable://192.168.0.212:5000/nginx@sha256:e4f0474a75c510f40b37b6b7dc2516241ffa8bde5a442bde3d372c9519c84d90
        Port:               80/TCP
        State:              Running
          Started:          Sun, 27 Sep 2020 00:03:13 +0800
        Ready:              True
        Restart Count:      0
        Liveness:           tcp-socket :80 delay=3s timeout=3s period=10s #success=1 #failure=3
        Volume Mounts:      <none>
        Environment Variables:      <none>
    Conditions:
      Type          Status
      Initialized   True
      Ready         True
      PodScheduled  True
    No volumes.
    QoS Class:      BestEffort
    Tolerations:    <none>
    Events:
      FirstSeen  LastSeen  Count  From                     SubObjectPath           Type      Reason     Message
      ---------  --------  -----  ----                     -------------           --------  ------     -------
      39s        39s       1      {default-scheduler }                             Normal    Scheduled  Successfully assigned liveness-tcpsocket to 192.168.0.212
      39s        39s       1      {kubelet 192.168.0.212}  spec.containers{nginx}  Normal    Pulling    pulling image "192.168.0.212:5000/nginx:1.13"
      39s        39s       1      {kubelet 192.168.0.212}  spec.containers{nginx}  Normal    Pulled     Successfully pulled image "192.168.0.212:5000/nginx:1.13"
      39s        39s       1      {kubelet 192.168.0.212}  spec.containers{nginx}  Normal    Created    Created container with docker id edfb206ebf80; Security:[seccomp=unconfined]
      39s        39s       1      {kubelet 192.168.0.212}  spec.containers{nginx}  Normal    Started    Started container with docker id edfb206ebf80

2. Check port 80

    [root@kub_master healthy]# curl -I 172.16.81.3:80
    HTTP/1.1 200 OK
    Server: nginx/1.13.12
    Date: Sat, 26 Sep 2020 16:08:46 GMT
    Content-Type: text/html
    Content-Length: 612
    Last-Modified: Mon, 09 Apr 2018 16:01:09 GMT
    Connection: keep-alive
    ETag: "5acb8e45-264"
    Accept-Ranges: bytes

The probe checks health by opening a TCP connection to port 80 on the Pod's IP; the curl request above confirms that the port accepts connections.
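3. Parameters
Every probe type accepts the parameters initialDelaySeconds and timeoutSeconds. Both are optional (the defaults are 0 and 1 second respectively), but they are usually set explicitly, as in the examples above. Their meanings are:
initialDelaySeconds: how long to wait after the container starts before running the first health check, in seconds.
timeoutSeconds: how long to wait for a response after the health check request is sent, in seconds. A timeout counts as a failed probe. Once the number of consecutive failures reaches the failure threshold (3 by default, shown as #failure=3 in the describe output), kubelet considers the container unable to serve and restarts it.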
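As a rough sketch of how these settings interact, the snippet below models the probe parameters and the consecutive-failure rule. The class `ProbeSettings` and the function `restart_index` are hypothetical names for this example; the default values are those documented for Kubernetes probes and agree with the describe output above (period=10s, #success=1, #failure=3 when not set in the manifest).

```python
from dataclasses import dataclass


@dataclass
class ProbeSettings:
    """Probe parameters with their Kubernetes defaults (illustrative model)."""
    initial_delay_seconds: int = 0
    period_seconds: int = 10
    timeout_seconds: int = 1
    success_threshold: int = 1
    failure_threshold: int = 3


def restart_index(results, settings):
    """Return the index of the probe attempt that triggers a restart.

    results holds one boolean per attempt; False stands for either a failed
    check or a timeout. Returns None if the threshold is never reached.
    """
    consecutive_failures = 0
    for index, succeeded in enumerate(results):
        # Any success resets the count; only consecutive failures add up.
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= settings.failure_threshold:
            return index
    return None


settings = ProbeSettings(initial_delay_seconds=3, timeout_seconds=3)
# Two failures, a success, then three failures: the restart is triggered by
# the seventh attempt (index 6), not by the first two failures.
print(restart_index([True, False, False, True, False, False, False], settings))  # 6
```

A single slow response therefore does not restart the container by itself; it only contributes one failure toward the threshold.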
VI. Using the httpGet readiness probe
1. Create an RC and its corresponding Service

    [root@kub_master healthy]# vim rc_nginx_readiness.yaml
    [root@kub_master healthy]# cat rc_nginx_readiness.yaml
    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: readiness-httpget
    spec:
      replicas: 2
      selector:
        app: readiness
      template:
        metadata:
          labels:
            app: readiness
        spec:
          containers:
            - name: readiness
              image: 192.168.0.212:5000/nginx:1.13
              ports:
                - containerPort: 80
              readinessProbe:
                httpGet:
                  path: /test.html
                  port: 80
                initialDelaySeconds: 3
                timeoutSeconds: 3
    [root@kub_master healthy]# kubectl create -f rc_nginx_readiness.yaml
    replicationcontroller "readiness-httpget" created
    [root@kub_master healthy]# kubectl get rc -o wide
    NAME                DESIRED   CURRENT   READY     AGE       CONTAINER(S)   IMAGE(S)                        SELECTOR
    readiness-httpget   2         2         0         9s        readiness      192.168.0.212:5000/nginx:1.13   app=readiness
    [root@kub_master healthy]# kubectl expose rc readiness-httpget --port=80
    service "readiness-httpget" exposed
    [root@kub_master healthy]# kubectl get svc readiness-httpget
    NAME                CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
    readiness-httpget   192.168.99.82   <none>        80/TCP    22s

2. View the Service details

    [root@kub_master healthy]# kubectl describe svc readiness-httpget
    Name:                   readiness-httpget
    Namespace:              default
    Labels:                 app=readiness
    Selector:               app=readiness
    Type:                   ClusterIP
    IP:                     192.168.99.82
    Port:                   <unset> 80/TCP
    Endpoints:
    Session Affinity:       None
    No events.
    [root@kub_master healthy]# kubectl get pods -o wide
    NAME                      READY     STATUS    RESTARTS   AGE       IP            NODE
    liveness-httpget          1/1       Running   1          40m       172.16.46.2   192.168.0.184
    liveness-tcpsocket        1/1       Running   0          25m       172.16.81.3   192.168.0.212
    readiness-httpget-5nlxb   0/1       Running   0          3m        172.16.81.4   192.168.0.212
    readiness-httpget-xbsv7   0/1       Running   0          3m        172.16.46.3   192.168.0.184

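The readiness-httpget Service has no backend endpoints even though both Pods were created. The two Pods are Running but not ready (0/1) because /test.html does not exist yet, so the readiness probe fails.
# Enter the container of one Pod and create the page that the probe requests

    [root@kub_master healthy]# kubectl exec -it readiness-httpget-xbsv7 bash
    root@readiness-httpget-xbsv7:/usr/share/nginx/html# echo "ok">test.html
    root@readiness-httpget-xbsv7:/usr/share/nginx/html# exit
    exit
    # Check the Service again: the IP of the ready Pod now appears as an endpoint
    [root@kub_master healthy]# kubectl describe svc readiness-httpget
    Name:                   readiness-httpget
    Namespace:              default
    Labels:                 app=readiness
    Selector:               app=readiness
    Type:                   ClusterIP
    IP:                     192.168.99.82
    Port:                   <unset> 80/TCP
    Endpoints:              172.16.46.3:80
    Session Affinity:       None
    No events.
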
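The effect of readiness on the Service can be summarized with a tiny model: only Pods that currently pass their readiness probe are listed as endpoints. The names `PodStatus` and `service_endpoints` are invented for this sketch, which mirrors the observed result rather than the actual endpoints controller logic.

```python
from dataclasses import dataclass


@dataclass
class PodStatus:
    name: str
    ip: str
    ready: bool  # True once the readiness probe succeeds


def service_endpoints(pods, port=80):
    """List the ip:port pairs of the Pods that are ready to receive traffic."""
    return [f"{pod.ip}:{port}" for pod in pods if pod.ready]


# State after test.html was created only in readiness-httpget-xbsv7.
pods = [
    PodStatus("readiness-httpget-5nlxb", "172.16.81.4", ready=False),
    PodStatus("readiness-httpget-xbsv7", "172.16.46.3", ready=True),
]
print(service_endpoints(pods))  # ['172.16.46.3:80']
```

The other Pod, readiness-httpget-5nlxb, keeps running but stays out of the endpoint list until its own /test.html exists and its probe succeeds.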
Source: https://www.cnblogs.com/jiawei2527/p/13737455.html