TROUBLESHOOTING Automatic Ceph service reloading

When to do that

Diagnosis:

The punchplatform storage cluster (Ceph) is intermittently unavailable. The issue has recently been occurring in the morning, close to 6 AM.

In the system logs, other services are reloaded just before and/or after the Ceph error.

Examples of logs:

May 28 06:25:02 server6 systemd[1]: Reloading LSB: Apache2 web server.
May 28 06:25:02 server6 apache2[21527]:  * Reloading Apache httpd web server apache2
May 28 06:25:02 server6 apache2[21527]:  *
May 28 06:25:02 server6 systemd[1]: Reloaded LSB: Apache2 web server.
May 28 06:25:02 server6 ceph-mon[1790]: 2018-05-28 06:25:02.419028 7fac930dd700 -1 received  signal: Hangup from  PID: 21546 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
May 28 06:25:02 server6 ceph-mgr[6394]: 2018-05-28 06:25:02.419759 7ff2ce737700 -1 received  signal: Hangup from  PID: 21546 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
May 28 06:25:02 server6 ceph-osd[7600]: 2018-05-28 06:25:02.421308 7f107347b700 -1 received  signal: Hangup from  PID: 21546 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0
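To check whether a host shows the same pattern, the log can be filtered for Hangup signals received by Ceph daemons. The following is a minimal sketch: the function name `ceph_hangup_times` is hypothetical, and it assumes syslog lines in the same layout as the examples above (month, day, and time in the first three fields).

```shell
#!/bin/sh
# Count Hangup signals received by Ceph daemons, grouped by timestamp.
# Reads the log files given as arguments, or standard input if none is given.
# Assumes the syslog layout shown in the examples above.
ceph_hangup_times() {
    # Keep only lines logged by a ceph-* daemon that report a Hangup signal,
    # then print the timestamp fields and count identical timestamps.
    grep -E 'ceph-[a-z]+\[[0-9]+\]: .*received +signal: Hangup' "$@" \
        | awk '{ print $1, $2, $3 }' \
        | sort | uniq -c
}
```

For example, `ceph_hangup_times /var/log/syslog` (the log path is an assumption and may differ on your hosts). A cluster of hits at the same time every morning matches the symptom described here.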

Cause / Explanations

Since Ubuntu 16.04, the operating system runs a daily job that upgrades the system. This includes reloading processes.

In most cases, this behaviour is safe and desirable, because it keeps the system secure by applying security patches. However, restarting a daemon in charge of storage management (Ceph, for instance) is very harmful.

What to do

We recommend disabling the automatic update of Ubuntu. To do so, edit the file /etc/apt/apt.conf.d/10periodic:

$ sudo vim /etc/apt/apt.conf.d/10periodic
# replace the line
APT::Periodic::Update-Package-Lists "1";
# with
APT::Periodic::Update-Package-Lists "0";
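When the change has to be applied on several servers, the same edit can be scripted instead of done in an editor. This is a minimal sketch, not an official procedure: the function name `disable_periodic_update` is hypothetical, and it assumes the setting is written on a single line, as shown above.

```shell
#!/bin/sh
# Set APT::Periodic::Update-Package-Lists to "0" in the given file.
# A backup of the original file is kept next to it with a .bak suffix.
disable_periodic_update() {
    conf_file="$1"
    # Replace only the quoted numeric value; the rest of the line is preserved.
    sed -i.bak -E \
        's/^([[:space:]]*APT::Periodic::Update-Package-Lists[[:space:]]+)"[0-9]+";/\1"0";/' \
        "$conf_file"
}
```

Run as root, `disable_periodic_update /etc/apt/apt.conf.d/10periodic` applies the change. Afterwards, `apt-config dump | grep Periodic` prints the effective periodic settings, which lets you confirm that the value is now "0".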