Deploying Rancher v2.4.4 with Docker, and Troubleshooting
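Contents
- Deploying Rancher v2.4.4 with Docker, and Troubleshooting
  - 1. Environment
  - 2. Pull the Rancher v2.4.4 image
  - 3. Deploy Rancher
  - 4. Access Rancher
    - 4.1. Set the language to Chinese
    - 4.2. Set the site URL
    - 4.3. The Rancher home page
  - 5. Add a Kubernetes cluster
    - 5.1. Click Add Cluster
    - 5.2. Choose to import an existing Kubernetes cluster
    - 5.3. Enter a cluster name and click Create
    - 5.4. Save the import commands
    - 5.5. Run the command on the master to import the cluster into Rancher
    - 5.6. View the system pods
    - 5.7. How to upgrade a pod
    - 5.8. How to run commands in a pod
  - 6. Troubleshooting
    - 6.1. The cattle-cluster-agent-c66cd4f58-xhfhs pod stays in ContainerCreating
    - 6.2. Fixing the red items on the Rancher dashboard
      - 6.2.1. Problem description
      - 6.2.2. Solution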
1. Environment
Operating system: CentOS 7.5
Docker: 19.03.13
Rancher: v2.4.4
2. Pull the Rancher v2.4.4 image
    [root@bogon ~]# docker pull rancher/rancher:v2.4.4
    v2.4.4: Pulling from rancher/rancher
    23884877105a: Pull complete
    bc38caa0f5b9: Pull complete
    2910811b6c42: Pull complete
    36505266dcc6: Pull complete
    99447ff7670f: Pull complete
    879c87dc86fd: Pull complete
    5b954e5aebf8: Pull complete
    664e1faf26b5: Pull complete
    bf7ac75d932b: Pull complete
    7e972d16ff5b: Pull complete
    08314b1e671c: Pull complete
    d5ce20b3d070: Pull complete
    20e75cd9c8e9: Pull complete
    80daa2770be8: Pull complete
    7fb927855713: Pull complete
    af20d79674f1: Pull complete
    d6a9086242eb: Pull complete
    887a8f050cee: Pull complete
    834df47e622f: Pull complete
    Digest: sha256:cd9c4574606eb88d63dd9c84e6a7f4ee9998c1f0f4e4ee323cae884c95769041
    Status: Downloaded newer image for rancher/rancher:v2.4.4
    docker.io/rancher/rancher:v2.4.4
3. Deploy Rancher
1. Prepare the mount points

    [root@bogon ~]# mkdir -p /docker_volume/rancher_home/rancher
    [root@bogon ~]# mkdir -p /docker_volume/rancher_home/auditlog

2. Start Rancher

    [root@bogon ~]# docker run -d --restart=unless-stopped -p 80:80 -p 443:443 \
        -v /docker_volume/rancher_home/rancher:/var/lib/rancher \
        -v /docker_volume/rancher_home/auditlog:/var/log/auditlog \
        --name rancher rancher/rancher:v2.4.4
    30329e53ae9f388a1a11ddb43e6f52e24616dbd41d2a0987a7446ebfac72817d

3. Check the container status. Rancher takes roughly two minutes to finish starting.

    [root@bogon ~]# docker ps
    CONTAINER ID   IMAGE                    COMMAND           CREATED         STATUS         PORTS                                      NAMES
    30329e53ae9f   rancher/rancher:v2.4.4   "entrypoint.sh"   2 minutes ago   Up 2 minutes   0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp   rancher

4. Follow the logs

    [root@bogon ~]# docker logs -f rancher
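Since the container shows as "Up" well before the web UI actually answers, it can help to poll the HTTPS port instead of re-running docker ps by hand. The following is only a sketch: wait_for_url is a hypothetical helper (not part of Docker or Rancher), and it assumes the root URL returns a non-error status once the UI is ready.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 5).
wait_for_url() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-5}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # --insecure: the freshly started container serves a self-signed certificate.
    # -f makes curl fail on HTTP errors (status 400 and above).
    if curl --insecure -sf -o /dev/null "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)" >&2
  return 1
}
```

With the address used in the next section, a call might look like `wait_for_url https://192.168.81.250/ 30 5`, which gives up after about two and a half minutes.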
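4. Access Rancher
https://192.168.81.250/ Default username/password: admin/admin
On first login, Rancher asks you to set a new password.
4.1. Set the language to Chinese
4.2. Set the site URL
4.3. The Rancher home page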
5. Add a Kubernetes cluster
5.1. Click Add Cluster
5.2. Choose to import an existing Kubernetes cluster
5.3. Enter a cluster name and click Create
5.4. Save the import commands
    kubectl create clusterrolebinding cluster-admin-binding --clusterrole cluster-admin --user [USER_ACCOUNT]
    kubectl apply -f https://192.168.81.180/v3/import/hktmls54zxmfwrlqjw4gcjmct5r7z9869pvjblk4j6b46bdcl275xx.yaml
    curl --insecure -sfL https://192.168.81.180/v3/import/hktmls54zxmfwrlqjw4gcjmct5r7z9869pvjblk4j6b46bdcl275xx.yaml | kubectl apply -f -
5.5. Run the command on the master to import the cluster into Rancher
    [root@k8s-master ~]# kubectl apply -f https://192.168.81.180/v3/import/hktmls54zxmfwrlqjw4gcjmct5r7z9869pvjblk4j6b46bdcl275xx.yaml
    Unable to connect to the server: x509: certificate is valid for 127.0.0.1, 172.17.0.2, not 192.168.81.180

This error appears because Rancher is serving its default self-signed certificate, which does not list 192.168.81.180 as a valid address, so kubectl refuses the connection. Downloading the manifest with curl --insecure skips certificate verification and works around it:

    [root@k8s-master ~]# curl --insecure -sfL https://192.168.81.180/v3/import/hktmls54zxmfwrlqjw4gcjmct5r7z9869pvjblk4j6b46bdcl275xx.yaml | kubectl apply -f -
    clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver created
    clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master created
    namespace/cattle-system created
    serviceaccount/cattle created
    clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding created
    secret/cattle-credentials-bc8df60 created
    clusterrole.rbac.authorization.k8s.io/cattle-admin created
    deployment.apps/cattle-cluster-agent created
    daemonset.apps/cattle-node-agent created

Wait patiently for the pods to start:

    [root@k8s-master ~]# kubectl get pod -n cattle-system

Once all pods in the cattle-system namespace are running, the cluster is connected to Rancher.
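Rather than re-running kubectl get pod until everything is up, the wait can be scripted with kubectl rollout status. This is a minimal sketch: wait_for_agents is a hypothetical helper name, and the workload names are taken from the apply output above.

```shell
# Hypothetical helper: block until both Rancher agents have rolled out,
# or fail once the timeout (default 300s) is exceeded.
wait_for_agents() {
  local timeout="${1:-300s}"
  kubectl -n cattle-system rollout status deployment/cattle-cluster-agent --timeout="$timeout" &&
    kubectl -n cattle-system rollout status daemonset/cattle-node-agent --timeout="$timeout"
}
```

Calling `wait_for_agents 120s` returns a non-zero status if either agent is still not ready after two minutes, which is a good moment to look at section 6.1.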
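After the cluster has been imported, Rancher shows its monitoring dashboard.
Some items on it appear in red; section 6.2 explains how to fix that.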
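5.6. View the system pods
Click the cluster, then System.
5.7. How to upgrade a pod
Select the pod to upgrade, click the three-dot menu, then Upgrade.
Change the image to the new version.
5.8. How to run commands in a pod
Select the pod, then More, then Execute Command.
This opens a command line inside the pod.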
6. Troubleshooting
6.1. The cattle-cluster-agent-c66cd4f58-xhfhs pod stays in ContainerCreating
The cattle-cluster-agent-c66cd4f58-xhfhs pod stays in the ContainerCreating state, so the cluster cannot connect to Rancher.
Event output:
    Events:
      Type     Reason           Age                  From                Message
      Normal   Scheduled        25m                  default-scheduler   Successfully assigned cattle-system/cattle-cluster-agent-c66cd4f58-xhfhs to k8s-node1
      Warning  NetworkNotReady  25s (x753 over 25m)  kubelet, k8s-node1  network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Judging from the events, this is a Kubernetes networking problem.
Sure enough, one of the network pods was down.
Solution:
Check which node the pod is running on:

    [root@k8s-master ~]# kubectl get pod -n kube-system -o wide
    NAME                                       READY   STATUS    RESTARTS   AGE    IP               NODE         NOMINATED NODE   READINESS GATES
    calico-kube-controllers-5b8b769fcd-ktgpz   1/1     Running   1          121m   10.100.235.199   k8s-master   <none>           <none>
    calico-node-4nsml                          1/1     Running   0          13m    192.168.81.190   k8s-node1    <none>           <none>
    calico-node-dlkgp                          1/1     Running   1          121m   192.168.81.180   k8s-master   <none>           <none>

On that node, load the image archive:

    [root@k8s-node1 k8s_1_18_6_image]# docker load -i cni.tar.gz

Restart the pod by deleting it so that it is recreated:

    kubectl delete pod calico-node-q4c7n -n kube-system

This resolves the problem.
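To spot the broken pod and its node without scanning the table by eye, the listing can be filtered with awk. This is a rough sketch: not_ready_pods is a hypothetical helper, and it assumes the column layout shown above (READY in column 2, STATUS in column 3, NODE in column 7), which other kubectl versions may not keep.

```shell
# Hypothetical helper: read "kubectl get pod -o wide --no-headers" output on
# stdin and print "<pod> <node>" for every pod that is not fully ready.
not_ready_pods() {
  awk '{
    # READY looks like "1/1": ready containers / total containers.
    split($2, ready, "/")
    if (ready[1] != ready[2] || $3 != "Running") print $1, $7
  }'
}
```

It would be used as `kubectl get pod -n kube-system -o wide --no-headers | not_ready_pods`. Note that pods of finished jobs (STATUS Completed) are also listed by this filter.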
6.2. Fixing the red items on the Rancher dashboard
6.2.1. Problem description
The dashboard turns red because the cluster's health check ports are not open. It does not affect normal use.
The kubectl get cs command shows the health status of the cluster components. When every component is reported as healthy the dashboard is normal; any other result is shown in red on the Rancher dashboard.
[Screenshot: kubectl get cs output for a healthy cluster]
What the red dashboard looks like:
[Screenshot: Rancher dashboard with items shown in red]
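The same check can be reduced to a list of the components that are not healthy. A minimal sketch, assuming the STATUS column of kubectl get cs reads "Healthy" for a working component; unhealthy_components is a hypothetical helper name.

```shell
# Hypothetical helper: read "kubectl get cs --no-headers" output on stdin and
# print the name of every component whose STATUS column is not "Healthy".
unhealthy_components() {
  awk '$2 != "Healthy" { print $1 }'
}
```

Used as `kubectl get cs --no-headers | unhealthy_components`, it prints nothing when the cluster is healthy.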
6.2.2. Solution
Edit the kube-scheduler manifest and comment out the line containing port=0:

    [root@k8s-master ~]# vim /etc/kubernetes/manifests/kube-scheduler.yaml

Edit the kube-controller-manager manifest and comment out the line containing port=0:

    [root@k8s-master ~]# vim /etc/kubernetes/manifests/kube-controller-manager.yaml
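If you prefer not to edit the files by hand, the same change can be made with sed. This is a sketch under a few assumptions: the argument appears in the manifest as a list item of the form `- --port=0`, GNU sed is available (as on CentOS), and comment_out_port_zero is a hypothetical helper. The backup is written to a separate directory on purpose, so that no extra file is left in the manifests directory.

```shell
# Hypothetical helper: comment out the "- --port=0" argument in a manifest.
# Arguments: path to the manifest, directory for the backup copy
# (choose a directory outside /etc/kubernetes/manifests).
comment_out_port_zero() {
  local manifest="$1"
  local backup_dir="$2"
  cp -p "$manifest" "$backup_dir/$(basename "$manifest").orig" || return 1
  # Keep the indentation (\1) and put "# " in front of the argument (\2).
  sed -i 's/^\([[:space:]]*\)\(- --port=0\)[[:space:]]*$/\1# \2/' "$manifest"
}
```

For example: `comment_out_port_zero /etc/kubernetes/manifests/kube-scheduler.yaml /root`, and the same for kube-controller-manager.yaml. Running it twice is harmless, because an already commented line no longer matches.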
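The change takes effect without a manual restart: kubelet notices the modified manifests and recreates the two static pods on its own. Once ports 10251 and 10252 are listening, the dashboard no longer shows red.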
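To confirm on the master that both ports are answering, the health endpoints can be queried directly. A small sketch, assuming the components serve /healthz over plain HTTP on localhost; check_healthz is a hypothetical helper name.

```shell
# Hypothetical helper: query /healthz on each given local port and report
# the result; returns non-zero if any port does not answer successfully.
check_healthz() {
  local port rc=0
  for port in "$@"; do
    if curl -sf -o /dev/null "http://127.0.0.1:${port}/healthz"; then
      echo "port ${port}: ok"
    else
      echo "port ${port}: not responding"
      rc=1
    fi
  done
  return "$rc"
}
```

Run `check_healthz 10251 10252` on the master a short while after saving the manifests, since the pods need a moment to come back.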