[PATCH] udev-rules: Restart kdump service on cpu ADD/REMOVE events

Vivek Goyal vgoyal at redhat.com
Fri Sep 5 20:16:25 UTC 2014


This patch restarts the kdump service on cpu add/remove events instead of
cpu online/offline events.

Some people have complained that when they run cpu online/offline tests
at high frequency, kdump restarts just as often and systemd disables the
service. As a temporary fix, we committed a patch to never disable the
kdump service.

In general it probably is a good idea to restart kdump service on cpu
add/remove events.

Toshi Kani confirmed the following.

- The file /sys/devices/system/cpu/cpuX/crash_notes is created before the
  ADD event goes out. That means we can not miss creating ELF notes for a
  newly added cpu.

- For a REMOVE event, files under /sys/devices/system/cpu/cpuX/ are removed
  first and then the REMOVE event goes out. That means we will remove the
  ELF note header for the removed cpu.

- There are some race conditions, for example a cpu is removed but the
  system crashes before the kdump service restarts. In that case vmcore.c
  has to be robust enough to inspect ELF notes and discard empty ones.

  Also it is possible that after cpu remove, the crash notes memory got
  reused for something else, and after a crash vmcore.c might see random
  data. It does basic size checks and discards ELF notes if the checks
  don't pass.

  The above race conditions can happen even with the OFFLINE event and
  there is no good way to remove them altogether. So making vmcore.c more
  robust is the right solution here.
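
For illustration only, here is a rough sketch of the kind of sanity check
described above: walk a per-cpu note buffer, stop at an empty note, and stop
when the recorded sizes would overrun the buffer. This is not the code in
vmcore.c. The function name valid_notes_size() is hypothetical, and the sketch
assumes little-endian notes with the usual 12-byte note header (n_namesz,
n_descsz, n_type) and 4-byte padding of name and descriptor.

```python
import struct

# ELF note header: n_namesz, n_descsz, n_type (little-endian assumed here).
NHDR = struct.Struct("<III")


def _align4(n):
    """Round n up to the next multiple of 4, as ELF note fields are padded."""
    return (n + 3) & ~3


def valid_notes_size(buf):
    """Return how many leading bytes of buf hold well-formed ELF notes.

    Hypothetical helper, not taken from the kernel. A result of 0 means the
    buffer should be discarded entirely.
    """
    offset = 0
    while offset + NHDR.size <= len(buf):
        namesz, descsz, _ntype = NHDR.unpack_from(buf, offset)
        if namesz == 0:
            break  # empty note: nothing valid starts here
        note_size = NHDR.size + _align4(namesz) + _align4(descsz)
        if offset + note_size > len(buf):
            break  # sizes overrun the buffer: likely reused memory
        offset += note_size
    return offset
```

A buffer that is all zeroes (cpu removed, notes never filled in) and a buffer
whose sizes point past its end (memory reused for something else) both yield
0 in this sketch, which corresponds to the two cases discussed above.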

Signed-off-by: Vivek Goyal <vgoyal at redhat.com>
---
 98-kexec.rules |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: kexec-tools-fedora/98-kexec.rules
===================================================================
--- kexec-tools-fedora.orig/98-kexec.rules	2014-06-03 13:19:04.813120747 -0400
+++ kexec-tools-fedora/98-kexec.rules	2014-09-04 10:59:59.093304225 -0400
@@ -1,4 +1,4 @@
-SUBSYSTEM=="cpu", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
-SUBSYSTEM=="cpu", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"
+SUBSYSTEM=="cpu", ACTION=="add", PROGRAM="/bin/systemctl try-restart kdump.service"
+SUBSYSTEM=="cpu", ACTION=="remove", PROGRAM="/bin/systemctl try-restart kdump.service"
 SUBSYSTEM=="memory", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
 SUBSYSTEM=="memory", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"

