Ceph Troubleshooting (1) - HEALTH_WARN: clock skew detected on mon

December 21

Author: admin | Category: Application Management

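The cluster status HEALTH_WARN: clock skew detected on mon has two possible causes: either the NTP service is not running on a mon node, or the clock drift threshold that Ceph applies to the mons is set too low.

Troubleshoot in that order: rule out the first cause before moving on to the second.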

1. Confirm that the NTP service is working

$ systemctl status ntpd

If ntpd is not installed, see the following article for installation steps: Setting Up an NTP Server and Client Time Synchronization on CentOS 7
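A running ntpd does not by itself prove that the clock is synchronized, so it can help to look at the offset ntpd reports as well. The sketch below assumes ntpd with the ntpq tool installed; the function names selected_peer_offset_ms and offset_within_ms are hypothetical helpers, not part of ntp or Ceph. The 50 ms limit in the usage comment corresponds to Ceph's default mon clock drift allowed of 0.05 seconds.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting `ntpq -pn` output (not part of ntp or Ceph).

# Read `ntpq -pn` output on stdin and print the offset column (column 9,
# in milliseconds) of the peer ntpd has selected; ntpq marks that line
# with a leading '*'. Prints nothing if no peer is selected.
selected_peer_offset_ms() {
    awk '/^\*/ { print $9; exit }'
}

# Succeed (exit status 0) if the absolute value of offset_ms ($1) is at
# most limit_ms ($2). An empty offset counts as a failure.
offset_within_ms() {
    awk -v o="$1" -v l="$2" 'BEGIN {
        if (o == "") exit 1
        if (o < 0) o = -o
        exit !(o <= l)
    }'
}

# Usage on a mon node (assumes ntpd is running):
#   offset=$(ntpq -pn | selected_peer_offset_ms)
#   if offset_within_ms "$offset" 50; then
#       echo "offset ${offset} ms is within 50 ms"
#   else
#       echo "offset too large, or no peer selected"
#   fi
```

If no line starts with '*', ntpd has not yet selected a time source, and the node should be treated as unsynchronized.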

2. Adjust the clock drift threshold in the Ceph configuration

On the deploy node, edit the configuration file to adjust the clock drift threshold:

$ vim /ceph-install/ceph.conf

Add the following under the [global] section:

mon clock drift allowed = 2
mon clock drift warn backoff = 30

Push the configuration file to the mon nodes that need it:

[root@cephnode01 my-cluster]# ceph-deploy --overwrite-conf config push cephnode{01,02,03}
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy --overwrite-conf config push cephnode01 cephnode02 cephnode03
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : True
[ceph_deploy.cli][INFO  ]  subcommand                    : push
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x14296c8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  client                        : ['cephnode01', 'cephnode02', 'cephnode03']
[ceph_deploy.cli][INFO  ]  func                          : <function config at 0x13bbaa0>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.config][DEBUG ] Pushing config to cephnode01
[cephnode01][DEBUG ] connected to host: cephnode01 
[cephnode01][DEBUG ] detect platform information from remote host
[cephnode01][DEBUG ] detect machine type
[cephnode01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to cephnode02
[cephnode02][DEBUG ] connected to host: cephnode02 
[cephnode02][DEBUG ] detect platform information from remote host
[cephnode02][DEBUG ] detect machine type
[cephnode02][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.config][DEBUG ] Pushing config to cephnode03
[cephnode03][DEBUG ] connected to host: cephnode03 
[cephnode03][DEBUG ] detect platform information from remote host
[cephnode03][DEBUG ] detect machine type
[cephnode03][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf

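The command above pushes the configuration to cephnode01, cephnode02, and cephnode03. Next, restart the mon service on each of those nodes: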

$ systemctl restart ceph-mon.target
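If the warning persists after the restart, it may be worth confirming that the pushed file on each mon node really contains the new values. The following is a minimal sketch; conf_value is a hypothetical helper, and it deliberately ignores section headers, so it simply returns the first matching key anywhere in the file. It relies on the fact that Ceph treats spaces and underscores in option names as equivalent.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key from a ceph.conf-style file.
# Assumption: section headers are ignored; the first matching key wins.
conf_value() {
    file=$1
    # Normalize the requested key: spaces become underscores.
    key=$(printf '%s' "$2" | tr ' ' '_')
    awk -F= -v want="$key" '
        /^[ \t]*[#;]/ { next }               # skip comment lines
        NF >= 2 {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim the value
            gsub(/ +/, "_", k)               # normalize spaces in the key
            if (k == want) { print v; exit }
        }
    ' "$file"
}

# Usage on a mon node:
#   conf_value /etc/ceph/ceph.conf "mon clock drift allowed"
#   conf_value /etc/ceph/ceph.conf mon_clock_drift_warn_backoff
```

An empty result means the key was not found, which suggests the push did not reach that node or the file was overwritten afterwards.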

3. Verify

[root@cephnode01 my-cluster]# ceph -s
  cluster:
    id:     406e0c23-755f-4378-bbc9-13548c4d3d64
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum cephnode01,cephnode02,cephnode03 (age 8m)
    mgr: cephnode01(active, since 71m), standbys: cephnode03, cephnode02
    mds:  3 up:standby
    osd: 3 osds: 3 up (since 11m), 3 in (since 47m)
    rgw: 1 daemon active (cephnode01)
 
  task status:
 
  data:
    pools:   4 pools, 128 pgs
    objects: 187 objects, 1.2 KiB
    usage:   3.0 GiB used, 12 GiB / 15 GiB avail
    pgs:     128 active+clean

If the output shows HEALTH_OK, the problem is resolved.
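For a scripted check, the health text can be searched for the skew warning directly. This is a rough sketch: reports_clock_skew is a hypothetical helper that only matches the text "clock skew" or the health code MON_CLOCK_SKEW, and the usage lines assume the ceph CLI is available with an admin keyring. The monitors re-check their clocks periodically, so the warning may take a few minutes to clear after the restart.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the health text on stdin
# mentions a clock skew, either as free text or as the MON_CLOCK_SKEW code.
reports_clock_skew() {
    grep -qiE 'clock skew|MON_CLOCK_SKEW'
}

# Usage on a node with the ceph CLI and an admin keyring:
#   if ceph health detail | reports_clock_skew; then
#       echo "clock skew is still being reported"
#   fi
#   ceph time-sync-status   # per-mon skew as measured by the monitors
```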
