Currently kdump uses a wrapper around the dracut emergency service and emergency shell. This has several problems:
- If dracut-initqueue falls back to emergency mode due to a timeout, and the failure action is set to dump_to_rootfs, kdump will try to start initqueue again, leading to a second timeout.
- Dracut's emergency shell has many built-in actions, such as performing dracut's emergency_action, asking for the root password, and generating an rdsosreport.
This series removes the emergency wrapper and uses a standalone kdump emergency shell. Kdump always performs final_action after the kdump shell exits, so a simplified version works fine.
Update from V3:
- Fix an initqueue dead loop in patch 1/3, found by Coiby. Checking is-active || is-failed is not valid here since initqueue might be stuck in the activating state, so parse the service status instead.
Update from V2: Fix a redundant string escaping issue found by Coiby.
Update from V1: Fix initqueue status detection in patch 1/3, thanks to Coiby.
Kairui Song (3):
  Don't try to restart dracut-initqueue if it's already there
  Remove the kdump error handler isolation wrapper
  Use a customized emergency shell
 dracut-kdump-emergency.service     | 25 +++++++++----------
 dracut-kdump-error-handler.service | 33 ------------------------
 dracut-module-setup.sh             |  1 -
 kdump-lib-initramfs.sh             | 40 +++++++++++++++++++++++++-----
 kexec-tools.spec                   |  2 --
 5 files changed, 46 insertions(+), 55 deletions(-)
 delete mode 100644 dracut-kdump-error-handler.service
kdump's dump_to_rootfs tries to start initqueue unconditionally. dump_to_rootfs runs after systemd isolates to the emergency target, so this is currently acceptable.
But there is a problem when initqueue starts the emergency action because of an initqueue timeout: dump_to_rootfs will start initqueue again and hit the timeout a second time.
So the following patch removes the previous isolation wrapper, and this patch detects the service status here. The previous isolation made the detection impossible; with it gone, the detection is valid and helps prevent a double timeout or hang.
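As a rough illustration of the status check described above, here is a minimal sketch that parses the "Active:" line out of systemctl status text. The helper names parse_active_state and should_start_initqueue and the two sample strings are hypothetical and made up for this example; the patch itself inlines the sed expression in dump_to_rootfs. The sketch uses POSIX character classes instead of \s and \S so that it does not depend on GNU sed.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): read `systemctl status` text
# on stdin and print the state word that follows "Active:".
parse_active_state() {
    sed -n 's/^[[:space:]]*Active: \([^[:space:]]*\)[[:space:]].*$/\1/p'
}

# Hypothetical helper: succeed only when the unit is plainly inactive, which
# is the only case where dump_to_rootfs should start dracut-initqueue itself.
should_start_initqueue() {
    [ "$(parse_active_state)" = "inactive" ]
}

# Illustrative status text, written by hand rather than captured from a
# real system.
sample_inactive='     Loaded: loaded (dracut-initqueue.service; static)
     Active: inactive (dead)'
sample_activating='     Loaded: loaded (dracut-initqueue.service; static)
     Active: activating (start) since Mon 2021-04-26 09:00:00 UTC'

for sample in "$sample_inactive" "$sample_activating"; do
    if printf '%s\n' "$sample" | should_start_initqueue; then
        echo "start dracut-initqueue"
    else
        echo "leave dracut-initqueue alone"
    fi
done
```

A unit stuck in the activating state is neither active nor failed, so a test built only on those two outcomes would still try to start it; comparing the parsed word against "inactive" avoids that.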
Signed-off-by: Kairui Song <kasong@redhat.com>
---
 kdump-lib-initramfs.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/kdump-lib-initramfs.sh b/kdump-lib-initramfs.sh
index 5cb0223..482839a 100755
--- a/kdump-lib-initramfs.sh
+++ b/kdump-lib-initramfs.sh
@@ -218,8 +218,11 @@ save_opalcore_fs() {
 dump_to_rootfs()
 {
 
-    dinfo "Trying to bring up rootfs device"
-    systemctl start dracut-initqueue
+    if [[ $(systemctl status dracut-initqueue | sed -n "s/^\s*Active: \(\S*\)\s.*$/\1/p") == "inactive" ]]; then
+        dinfo "Trying to bring up initqueue for rootfs mount"
+        systemctl start dracut-initqueue
+    fi
+
     dinfo "Waiting for rootfs mount, will timeout after 90 seconds"
     systemctl start sysroot.mount
The wrapper was introduced in commit 002337c. According to the commit message, the only use of the wrapper is when dracut-initqueue calls "systemctl start emergency" directly. In that case, emergency is started, but not in isolation mode, which means dracut-initqueue is still running. On the other hand, emergency will call "systemctl start dracut-initqueue" again when the default action is dump_to_rootfs.
systemd would then block on the second dracut-initqueue start, waiting for the first instance to exit, which leaves us hanging.
In the previous commit we added initqueue status detection to dump_to_rootfs, so now it will not hang even without the wrapper.
In fact, even with the wrapper, emergency might still hang for about 30s: when dracut called the emergency service because initqueue timed out, dump_to_rootfs would try to start initqueue again and time out again. With the wrapper removed we can avoid both kinds of hang, because without the isolation we can detect the initqueue service status correctly in that case.
Also remove the invalid header comments in the service file, since the service is not part of the systemd code, and sync the service spec with dracut.
Signed-off-by: Kairui Song <kasong@redhat.com>
---
 dracut-kdump-emergency.service     | 25 +++++++++++-----------
 dracut-kdump-error-handler.service | 33 ------------------------------
 dracut-module-setup.sh             |  1 -
 kexec-tools.spec                   |  2 --
 4 files changed, 12 insertions(+), 49 deletions(-)
 delete mode 100644 dracut-kdump-error-handler.service
diff --git a/dracut-kdump-emergency.service b/dracut-kdump-emergency.service
index e023284..f2f6fad 100644
--- a/dracut-kdump-emergency.service
+++ b/dracut-kdump-emergency.service
@@ -1,27 +1,26 @@
-# This file is part of systemd.
-#
-# systemd is free software; you can redistribute it and/or modify it
-# under the terms of the GNU Lesser General Public License as published by
-# the Free Software Foundation; either version 2.1 of the License, or
-# (at your option) any later version.
-
-# This service will be placed in kdump initramfs and replace both the systemd
-# emergency service and dracut emergency shell. IOW, any emergency will be
-# kick this service and in turn isolating to kdump error handler.
+# This service will run the real kdump error handler code. Executing the
+# failure action configured in kdump.conf
 
 [Unit]
-Description=Kdump Emergency
+Description=Kdump Error Handler
 DefaultDependencies=no
-IgnoreOnIsolate=yes
+After=systemd-vconsole-setup.service
+Wants=systemd-vconsole-setup.service
 
 [Service]
-ExecStart=/usr/bin/systemctl --no-block isolate kdump-error-handler.service
+Environment=HOME=/
+Environment=DRACUT_SYSTEMD=1
+Environment=NEWROOT=/sysroot
+WorkingDirectory=/
+ExecStart=/bin/kdump-error-handler.sh
+ExecStopPost=-/bin/rm -f -- /.console_lock
 Type=oneshot
 StandardInput=tty-force
 StandardOutput=inherit
 StandardError=inherit
 KillMode=process
 IgnoreSIGPIPE=no
+TasksMax=infinity
 
 # Bash ignores SIGTERM, so we send SIGHUP instead, to ensure that bash
 # terminates cleanly.
diff --git a/dracut-kdump-error-handler.service b/dracut-kdump-error-handler.service
deleted file mode 100644
index a23b75e..0000000
--- a/dracut-kdump-error-handler.service
+++ /dev/null
@@ -1,33 +0,0 @@
-# This file is part of systemd.
-#
-# systemd is free software; you can redistribute it and/or modify it
-# under the terms of the GNU Lesser General Public License as published by
-# the Free Software Foundation; either version 2.1 of the License, or
-# (at your option) any later version.
-
-# This service will run the real kdump error handler code. Executing the
-# failure action configured in kdump.conf
-
-[Unit]
-Description=Kdump Error Handler
-DefaultDependencies=no
-After=systemd-vconsole-setup.service
-Wants=systemd-vconsole-setup.service
-AllowIsolate=yes
-
-[Service]
-Environment=HOME=/
-Environment=DRACUT_SYSTEMD=1
-Environment=NEWROOT=/sysroot
-WorkingDirectory=/
-ExecStart=/bin/kdump-error-handler.sh
-Type=oneshot
-StandardInput=tty-force
-StandardOutput=inherit
-StandardError=inherit
-KillMode=process
-IgnoreSIGPIPE=no
-
-# Bash ignores SIGTERM, so we send SIGHUP instead, to ensure that bash
-# terminates cleanly.
-KillSignal=SIGHUP
diff --git a/dracut-module-setup.sh b/dracut-module-setup.sh
index 80df59c..daad7f1 100755
--- a/dracut-module-setup.sh
+++ b/dracut-module-setup.sh
@@ -973,7 +973,6 @@ install() {
     inst "$moddir/kdump-capture.service" "$systemdsystemunitdir/kdump-capture.service"
     systemctl -q --root "$initdir" add-wants initrd.target kdump-capture.service
     inst "$moddir/kdump-error-handler.sh" "/usr/bin/kdump-error-handler.sh"
-    inst "$moddir/kdump-error-handler.service" "$systemdsystemunitdir/kdump-error-handler.service"
     # Replace existing emergency service and emergency target
     cp "$moddir/kdump-emergency.service" "$initdir/$systemdsystemunitdir/emergency.service"
     cp "$moddir/kdump-emergency.target" "$initdir/$systemdsystemunitdir/emergency.target"
diff --git a/kexec-tools.spec b/kexec-tools.spec
index b33c5ce..1b214e5 100644
--- a/kexec-tools.spec
+++ b/kexec-tools.spec
@@ -49,7 +49,6 @@ Source101: dracut-module-setup.sh
 Source102: dracut-monitor_dd_progress
 Source103: dracut-kdump-error-handler.sh
 Source104: dracut-kdump-emergency.service
-Source105: dracut-kdump-error-handler.service
 Source106: dracut-kdump-capture.service
 Source107: dracut-kdump-emergency.target
 Source108: dracut-early-kdump.sh
@@ -226,7 +225,6 @@ cp %{SOURCE101} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpb
 cp %{SOURCE102} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE102}}
 cp %{SOURCE103} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE103}}
 cp %{SOURCE104} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE104}}
-cp %{SOURCE105} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE105}}
 cp %{SOURCE106} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE106}}
 cp %{SOURCE107} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE107}}
 chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE100}}
Use a modified and minimized version of the emergency shell. The differences between this kdump shell and the dracut emergency shell are:
- Kdump shell won't generate an rdsosreport automatically
- Customized prompts
- Never asks for the root password
- Won't tangle with dracut's emergency_action. If emergency_action is set, the dracut emergency shell will perform dracut's emergency_action instead of kdump's final_action on exit.
- If rd.shell=no is set, the kdump shell will still work; the dracut emergency shell won't, even if the kdump failure_action is set to shell.
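The shell added by this patch prints its banner to every console listed in /proc/consoles by reading the first field of each line. A minimal sketch of just that lookup follows; the helper name list_console_devices is hypothetical and is not part of the patch, and it only prints device paths instead of writing to them.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the /dev path for each
# console named in a /proc/consoles-style file. The first whitespace-separated
# field of every line is the console device name.
list_console_devices() {
    consoles_file=${1:-/proc/consoles}
    while read -r _tty _rest; do
        # Skip empty lines so no bare "/dev/" path is printed.
        [ -n "$_tty" ] && printf '/dev/%s\n' "$_tty"
    done < "$consoles_file"
    return 0
}

# Only read the real file when it is available on this system.
if [ -r /proc/consoles ]; then
    list_console_devices
fi
```

Reading the list at run time means the banner reaches whichever consoles the kernel registered, for example both a serial console and a virtual terminal, without hard-coding device names.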
Signed-off-by: Kairui Song <kasong@redhat.com>
---
 kdump-lib-initramfs.sh | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/kdump-lib-initramfs.sh b/kdump-lib-initramfs.sh
index 482839a..e02fdeb 100755
--- a/kdump-lib-initramfs.sh
+++ b/kdump-lib-initramfs.sh
@@ -233,10 +233,35 @@ dump_to_rootfs()
 
 kdump_emergency_shell()
 {
-    echo "PS1=\"kdump:\\\${PWD}# \"" >/etc/profile
-    ddebug "Switching to dracut emergency..."
-    /bin/dracut-emergency
-    rm -f /etc/profile
+    ddebug "Switching to kdump emergency shell..."
+
+    [ -f /etc/profile ] && . /etc/profile
+    export PS1='kdump:${PWD}# '
+
+    . /lib/dracut-lib.sh
+    if [ -f /dracut-state.sh ]; then
+        . /dracut-state.sh 2>/dev/null
+    fi
+
+    source_conf /etc/conf.d
+
+    type plymouth >/dev/null 2>&1 && plymouth quit
+
+    source_hook "emergency"
+    while read _tty rest; do
+        (
+        echo
+        echo
+        echo 'Entering kdump emergency mode.'
+        echo 'Type "journalctl" to view system logs.'
+        echo 'Type "rdsosreport" to generate a sosreport, you can then'
+        echo 'save it elsewhere and attach it to a bug report.'
+        echo
+        echo
+        ) > /dev/$_tty
+    done < /proc/consoles
+    sh -i -l
+    /bin/rm -f -- /.console_lock
 }
 
 do_failure_action()
On Mon, Apr 26, 2021 at 05:09:54PM +0800, Kairui Song wrote:
> Currently kdump uses a wrapper around the dracut emergency service and emergency shell. This has several problems:
> - If dracut-initqueue falls back to emergency mode due to a timeout, and the failure action is set to dump_to_rootfs, kdump will try to start initqueue again, leading to a second timeout.
> - Dracut's emergency shell has many built-in actions, such as performing dracut's emergency_action, asking for the root password, and generating an rdsosreport.
> This series removes the emergency wrapper and uses a standalone kdump emergency shell. Kdump always performs final_action after the kdump shell exits, so a simplified version works fine.
> Update from V3:
> - Fix an initqueue dead loop in patch 1/3, found by Coiby. Checking is-active || is-failed is not valid here since initqueue might be stuck in the activating state, so parse the service status instead.
> Update from V2: Fix a redundant string escaping issue found by Coiby.
> Update from V1: Fix initqueue status detection in patch 1/3, thanks to Coiby.
> Kairui Song (3):
>   Don't try to restart dracut-initqueue if it's already there
>   Remove the kdump error handler isolation wrapper
>   Use a customized emergency shell
>  dracut-kdump-emergency.service     | 25 +++++++++----------
>  dracut-kdump-error-handler.service | 33 ------------------------
>  dracut-module-setup.sh             |  1 -
>  kdump-lib-initramfs.sh             | 40 +++++++++++++++++++++++++-----
>  kexec-tools.spec                   |  2 --
>  5 files changed, 46 insertions(+), 55 deletions(-)
>  delete mode 100644 dracut-kdump-error-handler.service
> --
> 2.30.2
This patch set looks good to me.
Acked-by: Coiby Xu <coxu@redhat.com>