Ceph Troubleshooting (1): HEALTH_WARN: clock skew detected on mon
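A cluster reports HEALTH_WARN: clock skew detected on mon for one of two reasons: either the NTP service is not running on a mon node, or the clock drift threshold that Ceph applies to the mons is too small (the default for mon clock drift allowed is 0.05 seconds).
Check them in that order: the NTP service first, then the threshold.
1. Confirm that the NTP service is working properly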
$ systemctl status ntpd
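If ntpd is not installed, see the article "Centos7 搭建NTP服务器及客户端同步时间" (Setting up an NTP server and client time synchronization on CentOS 7) for installation steps.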
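A running ntpd does not by itself mean the clock is synchronized. As a minimal sketch, the helper below reads `ntpq -pn` output on stdin and prints the offset (in milliseconds) of the peer marked with `*`, which is the source ntpd has currently selected. The function name `synced_offset_ms` is hypothetical and introduced here only for illustration.

```shell
# Print the offset (ms) of the peer that ntpd is currently synced to.
# Input: `ntpq -pn` output on stdin. The offset is the 9th column.
# Exits non-zero if no peer line starts with '*', meaning ntpd has not
# selected a sync source yet.
synced_offset_ms() {
    awk '/^\*/ { print $9; found = 1 } END { exit !found }'
}
```

On each mon node this could be run as `ntpq -pn | synced_offset_ms`. A non-zero exit status suggests ntpd is running but has not yet picked a time source.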
2. Adjust the clock drift threshold in the Ceph configuration
On the deploy node, edit the configuration file to raise the clock drift threshold:
$ vim /ceph-install/ceph.conf
Add the following under the [global] section (mon clock drift allowed is given in seconds):
mon clock drift allowed = 2
mon clock drift warn backoff = 30
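Before pushing the file, it can help to confirm that the new values really sit under [global]. The sketch below is one way to read them back with awk. `conf_get_global` is a hypothetical helper name, not part of Ceph. Ceph accepts spaces or underscores in option names, so the helper normalizes the name to underscores before comparing.

```shell
# Print the value of an option from the [global] section of a ceph.conf-style file.
# Usage: conf_get_global <file> <option_name_with_underscores>
conf_get_global() {
    local file="$1" key="$2"
    awk -F= -v want="$key" '
        # Track the current section; only [global] is of interest.
        /^[ \t]*\[/ { in_global = ($0 ~ /^[ \t]*\[global\]/); next }
        in_global && NF >= 2 {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim surrounding whitespace
            gsub(/[ -]/, "_", name)             # "mon clock drift" -> "mon_clock_drift"
            if (name == want) {
                val = $2
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                print val
            }
        }
    ' "$file"
}
```

For example, `conf_get_global /ceph-install/ceph.conf mon_clock_drift_allowed` should print `2` if the lines above were saved under [global].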
Push the configuration file to the mon nodes that need it:
[root@cephnode01 my-cluster]# ceph-deploy --overwrite-conf config push cephnode{01,02,03}
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (2.0.1): /usr/bin/ceph-deploy --overwrite-conf config push cephnode01 cephnode02 cephnode03
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : True
[ceph_deploy.cli][INFO ] subcommand : push
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x14296c8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] client : ['cephnode01', 'cephnode02', 'cephnode03']
[ceph_deploy.cli][INFO ] func : <function config at 0x13bbaa0>
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.config][DEBUG ] Pushing config to cephnode01
[cephnode01][DEBUG ] connected to host: cephnode01
[cephnode01][DEBUG ] detect platform information from remote host
[cephnode01][DEBUG ] detect machine type
[cephnode01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to cephnode02
[cephnode02][DEBUG ] connected to host: cephnode02
[cephnode02][DEBUG ] detect platform information from remote host
[cephnode02][DEBUG ] detect machine type
[cephnode02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to cephnode03
[cephnode03][DEBUG ] connected to host: cephnode03
[cephnode03][DEBUG ] detect platform information from remote host
[cephnode03][DEBUG ] detect machine type
[cephnode03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
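Here the file is pushed to cephnode01, cephnode02, and cephnode03. Then restart the mon service on each of these nodes so the new settings take effect: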
$ systemctl restart ceph-mon.target
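The restart has to happen on every mon node, not only on the deploy node. A minimal sketch of doing this from one host follows, assuming passwordless SSH to each node as root. `restart_mons` and the `RUNNER` variable are hypothetical names used only for this illustration; setting `RUNNER=echo` turns the function into a dry run that prints what would be executed.

```shell
# Restart ceph-mon.target on each given host, one at a time.
# RUNNER defaults to ssh; set RUNNER=echo for a dry run.
# Stops at the first host where the restart command fails.
restart_mons() {
    local runner="${RUNNER:-ssh}" node
    for node in "$@"; do
        $runner "$node" systemctl restart ceph-mon.target || return 1
    done
}
```

A dry run would be `RUNNER=echo restart_mons cephnode01 cephnode02 cephnode03`. Dropping `RUNNER=echo` performs the actual restarts. The function restarts nodes back to back, so on a production cluster it may be safer to check quorum between nodes by hand.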
3. Verify
[root@cephnode01 my-cluster]# ceph -s
  cluster:
    id: 406e0c23-755f-4378-bbc9-13548c4d3d64
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum cephnode01,cephnode02,cephnode03 (age 8m)
    mgr: cephnode01(active, since 71m), standbys: cephnode03, cephnode02
    mds: 3 up:standby
    osd: 3 osds: 3 up (since 11m), 3 in (since 47m)
    rgw: 1 daemon active (cephnode01)

  task status:

  data:
    pools: 4 pools, 128 pgs
    objects: 187 objects, 1.2 KiB
    usage: 3.0 GiB used, 12 GiB / 15 GiB avail
    pgs: 128 active+clean
A health status of HEALTH_OK indicates that the problem is resolved.
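The warning may not clear the instant the mons come back, so it can be convenient to test for it in a script rather than reading the full status by eye. The sketch below assumes the health text is piped in on stdin, for example from `ceph health detail`. `has_clock_skew` is a hypothetical helper name. It matches either the `MON_CLOCK_SKEW` health check code or the phrase "clock skew".

```shell
# Return 0 if the health text on stdin still reports a monitor clock skew,
# 1 otherwise. Matching is case-insensitive.
has_clock_skew() {
    grep -qiE 'MON_CLOCK_SKEW|clock skew'
}
```

One possible use is `ceph health detail | has_clock_skew && echo "clock skew still reported"`. If the warning persists, `ceph time-sync-status` can show the skew each mon currently sees.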