Kubernetes Health Checks: livenessProbe (Liveness Check)
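Three Kubernetes probe mechanisms
Kubernetes supports a liveness probe (livenessProbe) and a readiness probe (readinessProbe); newer releases also provide a startupProbe. Each probe can use any of the following three check mechanisms.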
1. exec
Runs a command inside the container and treats an exit status of 0 as success. Example:
exec:
  command:
  - cat
  - /tmp/healthy
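To make the exit-status rule concrete, here is a minimal shell sketch of what an exec check boils down to. The helper name exec_probe is made up for this illustration, and the temporary file only stands in for /tmp/healthy; the kubelet runs the command inside the container rather than through a wrapper like this.

```shell
#!/bin/sh
# exec_probe is a hypothetical helper, not part of Kubernetes or kubectl.
# It runs the given command and prints "success" when the exit status is 0
# and "failure" otherwise, which is how an exec probe is judged.
exec_probe() {
    if "$@" >/dev/null 2>&1; then
        echo "success"
    else
        echo "failure"
    fi
}

# Demo: a temporary file stands in for /tmp/healthy.
probe_file=$(mktemp)
exec_probe cat "$probe_file"    # the file exists, so cat exits with 0
rm -f "$probe_file"
exec_probe cat "$probe_file"    # the file is gone, so cat exits non-zero
```

Any command can be used this way, as long as it exits with 0 only when the container is healthy.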
2. tcpSocket
Tries to open a TCP connection to the given port; the check succeeds if the connection can be established. Example:
tcpSocket:
  port: 8080
initialDelaySeconds: 15
periodSeconds: 20
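3. httpGet
Sends an HTTP GET request and checks the response; any status code from 200 up to, but not including, 400 counts as success. Example: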
...
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
    httpHeaders:
    - name: X-Custom-Header
      value: Awesome
  initialDelaySeconds: 3
  periodSeconds: 3
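- initialDelaySeconds is the number of seconds to wait after the container starts before the first probe runs.
- periodSeconds is the interval, in seconds, between probes.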
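For an httpGet check, the result depends on the HTTP status code of the response: a code of at least 200 and below 400 is a success, anything else is a failure. The sketch below only illustrates that classification. The function name http_probe_result is an assumption made for this example, and no real HTTP request is sent.

```shell
#!/bin/sh
# http_probe_result is a hypothetical helper for illustration only.
# It classifies an HTTP status code the way an httpGet probe does:
# 200 <= code < 400 is a success, everything else is a failure.
http_probe_result() {
    code=$1
    if [ "$code" -ge 200 ] && [ "$code" -lt 400 ]; then
        echo "success"
    else
        echo "failure"
    fi
}

# Classify a few sample status codes.
for sample in 200 204 404 503; do
    echo "$sample -> $(http_probe_result "$sample")"
done
```

This means a /healthz handler can signal an unhealthy application simply by answering with a 4xx or 5xx status code.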
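Liveness probes
A liveness probe lets you define your own criterion for deciding whether a container is healthy. If the probe fails, Kubernetes restarts the container. As an example, create the following Pod: its startup command first creates the file /tmp/healthy and deletes it 30 seconds later. In this setup, the container is considered healthy as long as /tmp/healthy exists and faulty otherwise.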
[root@k8s-master ~]# cat liveness.yml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-pod
  namespace: default
spec:
  restartPolicy: OnFailure
  containers:
  - name: myapp
    image: busybox
    imagePullPolicy: IfNotPresent
    command: ["/bin/sh","-c","touch /tmp/healthy;sleep 30;rm -rf /tmp/healthy;sleep 3600"]
    livenessProbe:
      exec:
        command: ["test","-e","/tmp/healthy"]
      initialDelaySeconds: 5
      periodSeconds: 5
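The livenessProbe section defines how the liveness probe is run:
- Probe method: the test command checks whether /tmp/healthy exists. If the command succeeds and returns 0, Kubernetes treats this probe as successful; if it returns a non-zero value, this probe has failed.
- initialDelaySeconds: 5 starts probing 5 seconds after the container starts. Set this value according to how long the application needs to start up. For example, if an application normally takes 30 seconds to start, initialDelaySeconds should be greater than 30.
- periodSeconds: 5 runs the probe every 5 seconds. If 3 consecutive probes fail (the default failure threshold), Kubernetes kills and restarts the container.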
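The interplay of these settings can be worked out with a small simulation. The sketch below is an idealized model, not kubelet code: it assumes probes fire at exactly initialDelaySeconds, then every periodSeconds, and that a probe running at the very second the file is removed still sees it. The function name restart_decision_time and its argument order are made up for this example.

```shell
#!/bin/sh
# restart_decision_time is a hypothetical helper, not a Kubernetes API.
# Arguments: initialDelaySeconds periodSeconds failureThreshold removed_at
# Assumptions: probes run at initial_delay, initial_delay + period, ...
# and a probe fails only when it runs strictly after removed_at seconds.
# Prints the time (seconds after container start) of the probe that
# completes the required number of consecutive failures.
restart_decision_time() {
    t=$1
    period=$2
    threshold=$3
    removed_at=$4
    failures=0
    while [ "$failures" -lt "$threshold" ]; do
        if [ "$t" -gt "$removed_at" ]; then
            failures=$((failures + 1))
        else
            failures=0
        fi
        # Move on to the next probe unless the threshold was just reached.
        if [ "$failures" -lt "$threshold" ]; then
            t=$((t + period))
        fi
    done
    echo "$t"
}

# Values from liveness.yml: delay 5s, period 5s, 3 failures, file removed at 30s.
restart_decision_time 5 5 3 30
```

With the values from liveness.yml this prints 45: the probes at 35, 40 and 45 seconds fail, and the third failure triggers the restart. In a real cluster the old container may keep running somewhat longer, plausibly because it is given a termination grace period before it is forcibly killed.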
Now create the Pod liveness-pod:
[root@k8s-master ~]# kubectl apply -f liveness.yml
pod/liveness-pod created
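According to the manifest, /tmp/healthy exists for the first 30 seconds, so the test command returns 0 and the liveness probe succeeds. During this time, the Events section of kubectl describe pod liveness-pod shows only normal entries.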
[root@k8s-master ~]# kubectl describe pod liveness-pod
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/liveness-pod to k8s-master
Normal Pulled 12s kubelet, k8s-master Container image "busybox" already present on machine
Normal Created 11s kubelet, k8s-master Created container myapp
Normal Started 11s kubelet, k8s-master Started container myapp
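After about 35 seconds, the Events show that /tmp/healthy no longer exists and the liveness probe has started to fail. Once three consecutive probes have failed, the container is killed and restarted.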
[root@k8s-master ~]# kubectl describe pod liveness-pod
Command:
/bin/sh
-c
touch /tmp/healthy;sleep 30;rm -rf /tmp/healthy;sleep 3600
State: Running
Started: Fri, 06 Nov 2020 15:40:59 +0800
Last State: Terminated
Reason: Error
Exit Code: 137 # the container exited; with restartPolicy OnFailure, a container that terminates with a non-zero exit code is restarted
Started: Fri, 06 Nov 2020 15:39:45 +0800
Finished: Fri, 06 Nov 2020 15:40:59 +0800
Ready: True
Restart Count: 1 # the container has been restarted once
Liveness: exec [test -e /tmp/healthy] delay=5s timeout=1s period=5s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dfmvc (ro)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/liveness-pod to k8s-master
Warning Unhealthy 55s (x3 over 65s) kubelet, k8s-master Liveness probe failed:
Normal Killing 55s kubelet, k8s-master Container myapp failed liveness probe, will be restarted # the liveness check failed
Normal Pulled 25s (x2 over 100s) kubelet, k8s-master Container image "busybox" already present on machine
Normal Created 25s (x2 over 99s) kubelet, k8s-master Created container myapp
Normal Started 25s (x2 over 99s) kubelet, k8s-master Started container myapp
[root@k8s-master ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
liveness-pod 1/1 Running 3 4m20s 10.244.0.27 k8s-master <none> <none>
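The exit code 137 in the describe output follows the usual convention that a process terminated by a signal is reported as 128 plus the signal number; 137 is 128 + 9, where 9 is SIGKILL. The sketch below reproduces this on any Linux shell without Kubernetes. The function name exit_status_after_sigkill is made up for this example.

```shell
#!/bin/sh
# exit_status_after_sigkill is a hypothetical helper name.
# It starts a child shell that sends SIGKILL (signal 9) to itself and
# prints the exit status that the parent shell reports for it.
exit_status_after_sigkill() {
    status=0
    sh -c 'kill -9 $$' || status=$?
    echo "$status"
}

status=$(exit_status_after_sigkill)
echo "exit status: $status, signal number: $((status - 128))"
```

Because the exit code is non-zero, the restartPolicy OnFailure in liveness.yml allows the container to be restarted.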
Besides liveness probes, the Kubernetes health check mechanism also includes readiness probes, which are covered in the next post.