[PATCH] udev-rules: Restart kdump service on cpu ADD/REMOVE events
WANG Chao
chaowang at redhat.com
Mon Sep 15 13:00:10 UTC 2014
On 09/05/14 at 04:16pm, Vivek Goyal wrote:
> This patch changes restart of kdump service from cpu online/offline events
> to cpu add/remove events.
>
> Some people have complained that they are running cpu online/offline tests
> at high frequency and kdump restarts at high frequency and systemd disables
> the service. As a temporary fix, we committed a patch to never disable
> kdump service.
>
> In general it probably is a good idea to restart kdump service on cpu
> add/remove events.
>
> Toshi Kani confirmed the following.
>
> - File for /sys/devices/system/cpu/cpuX/crash_notes will be created first
> before ADD event goes out. That means we can not miss creating ELF notes
> for newly created cpu.
>
> - For REMOVE event files under /sys/devices/system/cpu/cpuX/ are removed
> first and then REMOVE event goes out. That means we will remove the elf
> note header for removed cpu.
>
> - There are some race conditions like a cpu is removed but system crashes
> before kdump service restarts. In that case vmcore.c has to be more robust
> to be able to inspect elf notes and discard empty ones.
>
> Also it is possible that after cpu remove, crash notes memory got reused
> for something else and after crash vmcore.c might see some random data.
> It does basic size checks and discards elf notes if checks don't pass.
>
> Above race conditions can happen even with OFFLINE event and there is
> no good way to remove these altogether. So making vmcore.c more robust
> is the right solution here.
>
> Signed-off-by: Vivek Goyal <vgoyal at redhat.com>
Restarting the kdump service on ADD/REMOVE seems more reliable. And
because vmcore can discard empty notes at runtime, we don't have to
rebuild the ELF notes.
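
For illustration, the kind of size check described in the patch description could look roughly like the sketch below. It is not the vmcore.c code. It assumes a little-endian note header of three 32-bit words (n_namesz, n_descsz, n_type) with name and descriptor padded to 4 bytes, and the names usable_note_bytes and make_note are hypothetical helpers made up for this example.

```python
import struct

# Assumed layout: n_namesz, n_descsz, n_type as little-endian 32-bit words.
NHDR = struct.Struct("<III")


def align4(n):
    """Round n up to the next multiple of 4, as ELF note fields are padded."""
    return (n + 3) & ~3


def usable_note_bytes(buf):
    """Return how many leading bytes of buf hold well-formed ELF notes.

    Scanning stops at the first empty note (n_namesz == 0) or at the first
    note whose declared sizes would run past the end of the buffer.
    """
    used = 0
    while used + NHDR.size <= len(buf):
        namesz, descsz, _ntype = NHDR.unpack_from(buf, used)
        if namesz == 0:
            # Empty slot, e.g. zeroed crash_notes of a cpu that never crashed.
            break
        size = NHDR.size + align4(namesz) + align4(descsz)
        if used + size > len(buf):
            # Declared sizes overrun the buffer: treat as random data.
            break
        used += size
    return used


def make_note(name, desc, ntype=1):
    """Build one padded note for testing (hypothetical helper)."""
    name = name + b"\0"
    return (NHDR.pack(len(name), len(desc), ntype)
            + name.ljust(align4(len(name)), b"\0")
            + desc.ljust(align4(len(desc)), b"\0"))
```

A buffer of zeros yields 0 usable bytes, so an empty note is simply dropped, and a header with implausibly large sizes is rejected by the bounds check.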
Acked-by: WANG Chao <chaowang at redhat.com>
> ---
> 98-kexec.rules | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Index: kexec-tools-fedora/98-kexec.rules
> ===================================================================
> --- kexec-tools-fedora.orig/98-kexec.rules 2014-06-03 13:19:04.813120747 -0400
> +++ kexec-tools-fedora/98-kexec.rules 2014-09-04 10:59:59.093304225 -0400
> @@ -1,4 +1,4 @@
> -SUBSYSTEM=="cpu", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
> -SUBSYSTEM=="cpu", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"
> +SUBSYSTEM=="cpu", ACTION=="add", PROGRAM="/bin/systemctl try-restart kdump.service"
> +SUBSYSTEM=="cpu", ACTION=="remove", PROGRAM="/bin/systemctl try-restart kdump.service"
> SUBSYSTEM=="memory", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
> SUBSYSTEM=="memory", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"
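
To make the behaviour change concrete, here is a small model of which events trigger a restart before and after the patch. It only mimics the SUBSYSTEM/ACTION match from the four rules above and is not how udev evaluates rules. OLD_RULES, NEW_RULES and triggers_restart are names invented for this sketch.

```python
# (subsystem, action) pairs that run "systemctl try-restart kdump.service".
OLD_RULES = [
    ("cpu", "online"),
    ("cpu", "offline"),
    ("memory", "online"),
    ("memory", "offline"),
]

NEW_RULES = [
    ("cpu", "add"),
    ("cpu", "remove"),
    ("memory", "online"),
    ("memory", "offline"),
]


def triggers_restart(rules, subsystem, action):
    """Return True if an event with this subsystem and action matches a rule."""
    return (subsystem, action) in rules


# A cpu online/offline stress test no longer restarts kdump with the new rules.
events = [("cpu", "offline"), ("cpu", "online")] * 3
old_restarts = sum(triggers_restart(OLD_RULES, s, a) for s, a in events)
new_restarts = sum(triggers_restart(NEW_RULES, s, a) for s, a in events)
```

With the old rules every event in the loop restarts the service. With the new rules none does, while cpu add/remove and memory online/offline still do.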