Currently, kdump restarts multiple times on memory / CPU hotplug, and may trigger an initramfs rebuild if any configuration has been modified. Besides, simply restarting the service via systemctl takes a while as well, because kdumpctl does a lot of config checking. udev does not appreciate this: it wants everything to be done in a very short time.
kdumpctl may also get killed several times, because systemd kills the previous service start process when a later restart request arrives. This may introduce extra risk too.
Simply adding reload support and using systemctl reload will not work well either, because systemd ignores later reload requests, so kdump may get reloaded before the hotplug has finished.
This patch series should solve the problem by applying the following fixes:
- Add reload support as a fast path to reload kdump resources, bypassing most checking, so the reload process is faster and never triggers an initramfs rebuild.
- Let the reload request run in async mode, so udev never gets blocked.
- Throttle udev events to avoid unnecessary kdump reloads and make sure kdump reloads after udev has settled.
Update from V2:
- Use "-ne" instead of "!=" for comparing return values of functions in kdumpctl. "!=" is used more for pattern matching.
- Remove the duplicated "is-system-running" check in kdump-udev-throttler.
- Add a more detailed commit message about the reason to use --no-block in the udev rules.
Update from V1:
- Add cover letter.
- Add the --no-block option to systemd-run. Starting from systemd-v220, systemd-run runs in synchronous mode by default, so --no-block must be given explicitly to run in async mode.
Kairui Song (3):
  kdumpctl: Add reload support
  Rewrite kdump's udev rules
  Throttle kdump reload request triggered by udev event

 98-kexec.rules       | 16 +++++++++---
 kdump-udev-throttler | 47 +++++++++++++++++++++++++++++++++++
 kdump.service        |  1 +
 kdumpctl             | 59 +++++++++++++++++++++++++++++++++++++-------
 kexec-tools.spec     |  3 +++
 5 files changed, 113 insertions(+), 13 deletions(-)
 create mode 100755 kdump-udev-throttler
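The V2 changelog entry about "-ne" versus "!=" can be illustrated with a small bash sketch. The helper names below are hypothetical and not part of kdumpctl; the point is only that "-ne" compares integers, while inside [[ ]] an unquoted right-hand side of "!=" is treated as a glob pattern.

```shell
#!/bin/bash
# Hypothetical helpers, for illustration only (not taken from kdumpctl).

# Integer comparison: both operands are parsed as numbers.
differs_numeric() { [ "$1" -ne "$2" ]; }

# Inside [[ ]], an unquoted right-hand side of != is a glob pattern.
differs_pattern() { [[ $1 != $2 ]]; }

# 10 and 1 are different integers.
differs_numeric 10 1 && echo "10 -ne 1: values differ"

# The pattern 1* matches the string 10, so != reports "not different".
differs_pattern 10 '1*' || echo "10 != 1*: pattern matched"
```

For checking "$?" the integer comparison states the intent directly, which is presumably why the series settled on "-ne".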
Add reload support to kdumpctl. Reload simply unloads the currently loaded kexec crash kernel and initramfs, and loads them again.
Changes in /etc/sysconfig/kdump take effect with kdumpctl reload, but reloading does not check the content of /etc/kdump.conf and won't rebuild anything. Reload is fast: the only time-consuming part of kdumpctl reload is loading the kernel and initramfs with kexec, which is always necessary.
Signed-off-by: Kairui Song <kasong@redhat.com>
---
 kdump.service |  1 +
 kdumpctl      | 59 +++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/kdump.service b/kdump.service
index 5144597..f888dd6 100644
--- a/kdump.service
+++ b/kdump.service
@@ -7,6 +7,7 @@ DefaultDependencies=no
 Type=oneshot
 ExecStart=/usr/bin/kdumpctl start
 ExecStop=/usr/bin/kdumpctl stop
+ExecReload=/usr/bin/kdumpctl reload
 RemainAfterExit=yes
 StartLimitInterval=0

diff --git a/kdumpctl b/kdumpctl
index ece406f..6236f21 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -283,6 +283,16 @@ get_pcs_cluster_modified_files()

 setup_initrd()
 {
+	KDUMP_BOOTDIR=$(check_boot_dir "${KDUMP_BOOTDIR}")
+
+	if [ -z "$KDUMP_KERNELVER" ]; then
+		kdump_kver=`uname -r`
+	else
+		kdump_kver=$KDUMP_KERNELVER
+	fi
+
+	kdump_kernel="${KDUMP_BOOTDIR}/${KDUMP_IMG}-${kdump_kver}${KDUMP_IMG_EXT}"
+
 	DEFAULT_INITRD="${KDUMP_BOOTDIR}/initramfs-`uname -r`.img"
 	DEFAULT_INITRD_BAK="${KDUMP_BOOTDIR}/.initramfs-`uname -r`.img.default"
 	if [ $DEFAULT_DUMP_MODE == "fadump" ]; then
@@ -533,16 +543,8 @@ check_rebuild()
 	local _force_no_rebuild force_no_rebuild="0"
 	local ret system_modified="0"

-	KDUMP_BOOTDIR=$(check_boot_dir "${KDUMP_BOOTDIR}")
-
-	if [ -z "$KDUMP_KERNELVER" ]; then
-		kdump_kver=`uname -r`
-	else
-		kdump_kver=$KDUMP_KERNELVER
-	fi
-
-	kdump_kernel="${KDUMP_BOOTDIR}/${KDUMP_IMG}-${kdump_kver}${KDUMP_IMG_EXT}"
 	setup_initrd
+
 	if [ $? -ne 0 ]; then
 		return 1
 	fi
@@ -1004,6 +1006,42 @@ start()
 	echo "Starting kdump: [OK]"
 }

+reload()
+{
+	check_current_status
+	if [ $? -ne 0 ]; then
+		echo "Kdump is not running: [WARNING]"
+		return 0
+	fi
+
+	if [ $DEFAULT_DUMP_MODE == "fadump" ]; then
+		stop_fadump
+	else
+		stop_kdump
+	fi
+
+	if [ $? -ne 0 ]; then
+		echo "Stopping kdump: [FAILED]"
+		return 1
+	fi
+
+	echo "Stopping kdump: [OK]"
+
+	setup_initrd
+	if [ $? -ne 0 ]; then
+		echo "Starting kdump: [FAILED]"
+		return 1
+	fi
+
+	start_dump
+	if [ $? -ne 0 ]; then
+		echo "Starting kdump: [FAILED]"
+		return 1
+	fi
+
+	echo "Starting kdump: [OK]"
+}
+
 stop_fadump()
 {
 	echo 0 > $FADUMP_REGISTER_SYS_NODE
@@ -1087,6 +1125,9 @@ main ()
 		esac
 		exit $EXIT_CODE
 		;;
+	reload)
+		reload
+		;;
 	restart)
 		stop
 		start
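As a rough illustration of what the kdump (non-fadump) reload path amounts to, the sketch below only prints the two kexec invocations instead of running them. The helper name, the paths and the command line are assumptions made for this example; the real commands are assembled inside kdumpctl's stop_kdump and start_dump helpers, which this patch does not show.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints the kexec commands instead of running
# them. Paths and arguments are illustrative, not taken from kdumpctl.
reload_plan() {
    local kernel=$1 initrd=$2 cmdline=$3
    # Step 1: unload the currently loaded crash kernel and initramfs.
    printf 'kexec -p -u\n'
    # Step 2: load the same images again; nothing is checked or rebuilt.
    printf 'kexec -p --initrd=%s --command-line="%s" %s\n' \
        "$initrd" "$cmdline" "$kernel"
}

# Example invocation with made-up paths.
reload_plan /boot/vmlinuz-example /boot/initramfs-examplekdump.img "irqpoll nr_cpus=1"
```

With the ExecReload line added to kdump.service, the same path should also be reachable through "systemctl reload kdump.service".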
According to udev's man page, PROGRAM is used either to determine the device's name or to decide whether the device matches the rule, so we should use RUN instead. Meanwhile, both RUN and PROGRAM only accept very short-running foreground tasks, but a kdump restart may take a long time if there are any device changes that lead to an image rebuild, which may lead to buggy behavior.
On the other hand, memory / CPU hotplug should never trigger an initramfs rebuild.
To solve this problem, we use the newly introduced "kdumpctl reload" instead, and use systemd-run to create a transient service unit for the reload and run it in no-block mode, so udev won't be blocked by anything.
We need to make systemd-run execute in non-blocking mode and not synchronously wait for the operation to finish, because udev expects the command line in RUN to finish immediately. However, kdumpctl reload may take 0.5-1s for an ordinary reload, or even longer on some machines. So we give systemd-run an explicit --no-block option to run in non-blocking mode. Without --no-block, systemd-run will verify, enqueue and wait for the operation to finish. With the --no-block option, systemd-run will only verify and enqueue the unit, then return. In this way, we make sure the command is executed asynchronously, and the status is monitored and logged by systemd, which is reliable and non-blocking.
Another thing to mention is that --no-block is only needed from systemd-v220 on. Before v220, systemd-run uses non-blocking mode by default and the --no-block option is not available.
Also reformat the udev rules into a more maintainable format.
Signed-off-by: Kairui Song <kasong@redhat.com>
---
 98-kexec.rules | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/98-kexec.rules b/98-kexec.rules
index e32ee13..a866cf9 100644
--- a/98-kexec.rules
+++ b/98-kexec.rules
@@ -1,4 +1,12 @@
-SUBSYSTEM=="cpu", ACTION=="add", PROGRAM="/bin/systemctl try-restart kdump.service"
-SUBSYSTEM=="cpu", ACTION=="remove", PROGRAM="/bin/systemctl try-restart kdump.service"
-SUBSYSTEM=="memory", ACTION=="online", PROGRAM="/bin/systemctl try-restart kdump.service"
-SUBSYSTEM=="memory", ACTION=="offline", PROGRAM="/bin/systemctl try-restart kdump.service"
+SUBSYSTEM=="cpu", ACTION=="add", GOTO="kdump_reload"
+SUBSYSTEM=="cpu", ACTION=="remove", GOTO="kdump_reload"
+SUBSYSTEM=="memory", ACTION=="online", GOTO="kdump_reload"
+SUBSYSTEM=="memory", ACTION=="offline", GOTO="kdump_reload"
+
+GOTO="kdump_reload_end"
+
+LABEL="kdump_reload"
+
+RUN+="/usr/bin/systemd-run --no-block /usr/bin/kdumpctl reload"
+
+LABEL="kdump_reload_end"
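The version dependency described in the commit message could be handled by a wrapper along the following lines. This is only a sketch: both helper names are hypothetical, the threshold of 220 is taken from the commit message rather than verified independently, and the parser assumes the first line of "systemctl --version" looks like "systemd 239 (...)".

```shell
#!/bin/bash
# Hypothetical helpers, not part of this patch.

# Read `systemctl --version` output on stdin and print the numeric version,
# assuming the first line has the form "systemd <number> ...".
parse_systemd_version() {
    awk 'NR == 1 { print $2 }'
}

# Print "--no-block" for versions where, per the commit message, systemd-run
# is synchronous by default (v220 and later); print nothing otherwise.
no_block_flag() {
    local version=$1
    if [ "$version" -ge 220 ]; then
        echo "--no-block"
    fi
}

# Example with canned version strings instead of a live systemctl call.
for line in "systemd 219" "systemd 239 (239-58.el8)"; do
    ver=$(echo "$line" | parse_systemd_version)
    echo "v$ver: systemd-run $(no_block_flag "$ver") /usr/bin/kdumpctl reload"
done
```

The udev rule in this patch hardcodes --no-block instead, which is simpler as long as the package only targets systemd versions that support the option.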
Previously, kdump would restart / reload many times on hotplug events, especially memory hotplug events. Hotplugged memory may generate many udev events, as memory is managed and hotplugged in small chunks by the kernel.
This results in unnecessary system workload and actually a longer delay for both the kdump reload and the hotplug event, as either udev gets blocked or kdumpctl waits for other triggered operations.
To fix this, introduce kdump-udev-throttler as an agent which is called by udev and merges concurrent kdump restart requests. Tested with a Hyper-V VM which previously failed due to udev timeouts; no new issues found.
Signed-off-by: Kairui Song <kasong@redhat.com>
---
 98-kexec.rules       |  2 +-
 kdump-udev-throttler | 47 ++++++++++++++++++++++++++++++++++++++++++++
 kexec-tools.spec     |  3 +++
 3 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100755 kdump-udev-throttler

diff --git a/98-kexec.rules b/98-kexec.rules
index a866cf9..2f88c77 100644
--- a/98-kexec.rules
+++ b/98-kexec.rules
@@ -7,6 +7,6 @@ GOTO="kdump_reload_end"

 LABEL="kdump_reload"

-RUN+="/usr/bin/systemd-run --no-block /usr/bin/kdumpctl reload"
+RUN+="/usr/bin/systemd-run --no-block /usr/lib/udev/kdump-udev-throttler"

 LABEL="kdump_reload_end"
diff --git a/kdump-udev-throttler b/kdump-udev-throttler
new file mode 100755
index 0000000..6cbb99a
--- /dev/null
+++ b/kdump-udev-throttler
@@ -0,0 +1,47 @@
+#!/bin/bash
+# This util helps to reduce the workload of kdump service restarting
+# on udev event. When hotplugging memory / CPU, multiple udev
+# events may be triggered concurrently, and obviously, we don't want
+# to restart kdump service for each event.
+
+# This script will be called by udev, and make sure kdump service is
+# restarted after all events we are watching are settled.
+
+# On each call, this script will try to acquire the $throttle_lock
+# The first instance acquired the file lock will keep waiting for events
+# to settle and then reload kdump. Other instances will just exit
+# In this way, we can make sure kdump service is restarted immediately
+# and for exactly once after udev events are settled.
+
+
+throttle_lock="/var/lock/kdump-udev-throttle"
+interval=2
+
+# Don't reload kdump service if kdump service is not started by systemd
+systemctl is-active kdump.service &>/dev/null || exit 0
+
+exec 9>$throttle_lock
+if [ $? -ne 0 ]; then
+	echo "Failed to create the lock file! Fallback to non-throttled kdump service restart"
+	/bin/kdumpctl reload
+	exit 1
+fi
+
+flock -n 9
+if [ $? -ne 0 ]; then
+	echo "Throttling kdump restart for concurrent udev event"
+	exit 0
+fi
+
+# Wait for at least 1 second, at most 4 seconds for udev to settle
+# Ideally we will have a less than 1 second lag between udev events settle
+# and kdump reload
+sleep 1 && udevadm settle --timeout 3
+
+# Release the lock, /bin/kdumpctl will block and make the process
+# holding two locks at the same time and we might miss some events
+exec 9>&-
+
+/bin/kdumpctl reload
+
+exit 0
diff --git a/kexec-tools.spec b/kexec-tools.spec
index 6330534..91d322f 100644
--- a/kexec-tools.spec
+++ b/kexec-tools.spec
@@ -29,6 +29,7 @@ Source24: kdump.sysconfig.ppc64le
 Source25: kdumpctl.8
 Source26: live-image-kdump-howto.txt
 Source27: early-kdump-howto.txt
+Source28: kdump-udev-throttler

 #######################################
 # These are sources for mkdumpramfs
@@ -171,6 +172,7 @@ install -m 755 %{SOURCE23} $RPM_BUILD_ROOT%{_prefix}/lib/kdump/kdump-lib-initram
 # For s390x the ELF header is created in the kdump kernel and therefore kexec
 # udev rules are not required
 install -m 644 %{SOURCE14} $RPM_BUILD_ROOT%{_udevrulesdir}/98-kexec.rules
+install -m 755 %{SOURCE28} $RPM_BUILD_ROOT%{_udevrulesdir}/../kdump-udev-throttler
 %endif
 install -m 644 %{SOURCE15} $RPM_BUILD_ROOT%{_mandir}/man5/kdump.conf.5
 install -m 644 %{SOURCE16} $RPM_BUILD_ROOT%{_unitdir}/kdump.service
@@ -298,6 +300,7 @@ done
 %config(noreplace,missingok) %{_sysconfdir}/kdump.conf
 %ifnarch s390x
 %config %{_udevrulesdir}
+%{_udevrulesdir}/../kdump-udev-throttler
 %endif
 %{dracutlibdir}/modules.d/*
 %dir %{_localstatedir}/crash
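The merging behaviour of the throttler can be tried out in isolation with a sketch like the one below. The function name, the temporary lock file and the echo stand-ins are assumptions for this demo; it follows the same flock pattern as kdump-udev-throttler but replaces "udevadm settle" and "kdumpctl reload" with harmless placeholders.

```shell
#!/bin/bash
# Demo of the "first caller wins" throttle. Names and paths are hypothetical;
# the real logic lives in kdump-udev-throttler.
throttled_run() {
    local lock=$1 settle=$2
    shift 2
    (
        # Open the lock file on fd 9; the subshell closes it on exit.
        exec 9>"$lock" || exit 2
        # Non-blocking attempt: if it fails, another caller owns this burst.
        if ! flock -n 9; then
            echo "throttled"
            exit 0
        fi
        # Stand-in for "sleep 1 && udevadm settle": wait while holding the lock.
        sleep "$settle"
        # Release before the slow action so later events can queue a new run.
        exec 9>&-
        "$@"
    )
}

lock=$(mktemp)
throttled_run "$lock" 1 echo "reload for first event" &
sleep 0.3   # give the first caller time to take the lock
throttled_run "$lock" 0 echo "reload for second event"
wait
rm -f "$lock"
```

Assuming the first caller gets the lock within the 0.3 second head start, the second call should be merged away and only one reload message printed. Releasing the lock before the action means an event arriving during the reload starts a fresh throttled run rather than being dropped.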
On 10/30/18 at 06:48pm, Kairui Song wrote:
Ack
Thanks
Dave