TROUBLESHOOTING Automatic Ceph service reloading
When to do this
Diagnostic:
The punchplatform storage cluster (Ceph) becomes intermittently unavailable. The issue tends to occur in the morning, around 6 AM.
In the system logs, other services are reloaded just before and/or after the Ceph error.
Examples of logs:
    May 28 06:25:02 server6 systemd[1]: Reloading LSB: Apache2 web server.
    May 28 06:25:02 server6 apache2[21527]: * Reloading Apache httpd web server apache2
    May 28 06:25:02 server6 apache2[21527]: *
    May 28 06:25:02 server6 systemd[1]: Reloaded LSB: Apache2 web server.
    May 28 06:25:02 server6 ceph-mon[1790]: 2018-05-28 06:25:02.419028 7fac930dd700 -1 received signal: Hangup from PID: 21546 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0
    May 28 06:25:02 server6 ceph-mgr[6394]: 2018-05-28 06:25:02.419759 7ff2ce737700 -1 received signal: Hangup from PID: 21546 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0
    May 28 06:25:02 server6 ceph-osd[7600]: 2018-05-28 06:25:02.421308 7f107347b700 -1 received signal: Hangup from PID: 21546 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw UID: 0
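To check whether a node shows the same pattern, a search like the following may help. It is only a minimal sketch: the function name `find_ceph_hangups` is hypothetical, and the log file to inspect (for example /var/log/syslog) is an assumption that depends on how logging is configured on your servers.

```shell
# Hypothetical helper: print the log lines where a Ceph daemon reports
# that it received a Hangup signal.
# $1 is the log file to scan (assumed to use the syslog format shown above).
find_ceph_hangups() {
    log_file="$1"
    # Match lines such as:
    #   ceph-mon[1790]: ... received signal: Hangup from PID: ...
    grep -E 'ceph-(mon|mgr|mds|osd)\[[0-9]+\]: .*received signal: Hangup' "$log_file"
}

# Example call (the path is an assumption):
#   find_ceph_hangups /var/log/syslog
```

If the matching lines all carry a timestamp close to 6 AM and sit next to reload messages from other services, the node is probably affected by the issue described here.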
Cause / Explanation
Since Ubuntu 16.04, a daemon runs daily to upgrade the system. This includes reloading processes: in the logs above, the Ceph daemons receive a Hangup signal sent by `killall -q -1`.
In most cases this behaviour is desirable, because it keeps the system secure by patching vulnerabilities. However, restarting a daemon that manages storage, such as Ceph, is very harmful.
What to do
We recommend disabling Ubuntu's automatic updates. To do so, edit the file /etc/apt/apt.conf.d/10periodic:
    $ sudo vim /etc/apt/apt.conf.d/10periodic
    # replace
    APT::Periodic::Update-Package-Lists "1";
    # by
    APT::Periodic::Update-Package-Lists "0";
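The same change can be scripted instead of made in an editor, which may be convenient when several servers are affected. The following is a minimal sketch under stated assumptions: the function name `disable_periodic_update` is hypothetical, and it assumes the file contains the setting exactly as written above (value "1", one setting per line).

```shell
# Hypothetical helper: switch APT::Periodic::Update-Package-Lists from "1" to "0".
# $1 is the configuration file to modify; a copy is kept next to it as <file>.bak.
disable_periodic_update() {
    conf_file="$1"
    # Rewrite only the Update-Package-Lists line and leave other settings untouched.
    sed -i.bak 's/^\(APT::Periodic::Update-Package-Lists\) "1";/\1 "0";/' "$conf_file"
}

# Example call, to be run as root:
#   disable_periodic_update /etc/apt/apt.conf.d/10periodic
```

Afterwards, `apt-config dump | grep Periodic` should show the value that APT actually reads. The backup file makes it easy to restore the original setting if needed.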