Hi, All
This patchset introduces a new kdump emergency service that overrides the existing emergency.service. When a fatal error occurs, this emergency service is triggered and systemd isolates to the emergency path.
The kdump emergency service reads kdump.conf and acts according to the configured "default action" (reboot/poweroff/halt/shell/dump_to_rootfs).
Along with this patchset, kdump-capture.service is introduced as a service unit to run kdump.sh. When kdump-capture.service fails, systemd isolates to the kdump emergency service. I copied all the dependencies from dracut-pre-pivot.service to kdump-capture.service so that kdump.sh is called in the correct time window.
WANG Chao (5):
  mkdumprd: changes for mount point and options in 2nd kernel
  cleanup: extract functions from kdump.sh to kdump-lib.sh
  Introduce kdump error handling service
  Introduce kdump capture service
  disable dracut-emergency.service the whole time

 dracut-kdump-capture.service   |  32 ++++++++
 dracut-kdump-emergency.service |  32 ++++++++
 dracut-kdump-error-handler.sh  |  10 +++
 dracut-kdump.sh                | 161 +----------------------------------------
 dracut-module-setup.sh         |   7 +-
 kdump-lib.sh                   | 161 +++++++++++++++++++++++++++++++++++++++++
 kexec-tools.spec               |   7 +-
 mkdumprd                       |  25 ++++---
 8 files changed, 264 insertions(+), 171 deletions(-)
 create mode 100644 dracut-kdump-capture.service
 create mode 100644 dracut-kdump-emergency.service
 create mode 100755 dracut-kdump-error-handler.sh
This patch does the following two changes in 2nd kernel:
a). dump target is mounted under /sysroot
b). append "x-initrd.mount" to the mount options.
With a). we don't need to track what we've mounted in 2nd kernel. We can just umount recursively every mount in /sysroot by command:
umount -R /sysroot
This is very convenient, because it's hard to track what we've mounted once we're in the error handling path. So mounting everything under /sysroot is reasonable and practical for us.
With b). the mount unit becomes required by "initrd-root-fs.target" rather than by "local-fs.target" as it used to be.
The difference between "initrd-root-fs.target" and "local-fs.target" is that the former has OnFailureIsolate=yes but the latter does not. From systemd.unit(5):
OnFailureIsolate= Takes a boolean argument. If true, the unit listed in OnFailure= will be enqueued in isolation mode, i.e. all units that are not its dependency will be stopped. If this is set, only a single unit may be listed in OnFailure=. Defaults to false.
With OnFailure=emergency.target and OnFailureIsolate=yes, when an error occurs during boot, systemd will "isolate" to emergency.target, which means all services will be stopped except the dependencies of emergency.target. The systemd boot process is thus interrupted, leaving emergency.target as the only thing running. Without the isolate, systemd will continue to boot and emergency.target will be interrupted.
Both a) and b) are prerequisites for introducing kdump error handling later.
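For illustration only, here is a rough sketch of how a) and b) change the mount entry. build_mntopts is a hypothetical helper made up for this example (the patch does the same thing inline in to_mount() using findmnt output), and the "ro" to "rw" rewrite is written in plain POSIX sh rather than the bash substitution the patch uses, so it is slightly stricter about what counts as a leading "ro".

```shell
# build_mntopts TARGET FSTYPE OPTIONS
# Hypothetical helper: prints the "target fstype options" string that
# would be used for the dump target in the 2nd kernel.
build_mntopts() {
    _target=$1
    _fstype=$2
    _options=$3

    # a) relocate the mount point under /sysroot
    _target="/sysroot$_target"

    # mount read-write in the 2nd kernel: turn a leading "ro" into "rw"
    case "$_options" in
        ro)   _options="rw" ;;
        ro,*) _options="rw,${_options#ro,}" ;;
    esac

    # b) mark the mount so its unit is pulled in by initrd-root-fs.target
    _options="$_options,x-initrd.mount"

    echo "$_target $_fstype $_options"
}

build_mntopts /var/crash ext4 ro,relatime
```

With everything relocated like this, a single "umount -R /sysroot" covers both the root mount and the dump target.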
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 mkdumprd | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/mkdumprd b/mkdumprd
index b49b74f..a38d190 100644
--- a/mkdumprd
+++ b/mkdumprd
@@ -97,19 +97,20 @@ target_is_root() {
 
 # caller should ensure $1 is valid and mounted in 1st kernel
 to_mount() {
-    local _dev=$1 _s _t _o _mntopts _pdev
-
-    _s=$(findmnt -k -f -n -r -o SOURCE $_dev)
-    _t=$(findmnt -k -f -n -r -o TARGET,FSTYPE $_dev)
-    _o=$(findmnt -k -f -n -r -o OPTIONS $_dev)
-    _o=${_o/#ro/rw} #mount fs target as rw in 2nd kernel
-    # "nofail" mount could be run later than kdump.sh. So we don't pass nofail
-    # for short term.
-    #_o="${_o},nofail" #with nofail set, systemd won't block for mount failure
-    _mntopts="$_t $_o"
+    local _dev=$1 _source _target _fstype _options _mntopts _pdev
+
+    _source=$(findmnt -k -f -n -r -o SOURCE $_dev)
+    _target=$(findmnt -k -f -n -r -o TARGET $_dev)
+    # mount under /sysroot in 2nd kernel, and we umount -R /sysroot before exit
+    _target="/sysroot$_target"
+    _fstype=$(findmnt -k -f -n -r -o FSTYPE $_dev)
+    _options=$(findmnt -k -f -n -r -o OPTIONS $_dev)
+    _options=${_options/#ro/rw} #mount fs target as rw in 2nd kernel
+    _options="$_options,x-initrd.mount"
+    _mntopts="$_target $_fstype $_options"
     #for non-nfs _dev converting to use udev persistent name
-    if [ -b "$_s" ]; then
-        _pdev="$(get_persistent_dev $_s)"
+    if [ -b "$_source" ]; then
+        _pdev="$(get_persistent_dev $_source)"
         if [ $? -ne 0 ]; then
             return 1
         fi
On Thu, May 08, 2014 at 07:37:13PM +0800, WANG Chao wrote:
> This patch does the following two changes in 2nd kernel:
> a). dump target is mounted under /sysroot
> b). append "x-initrd.mount" to the mount options.
>
> With a). we don't need to track what we've mounted in 2nd kernel. We can just umount recursively every mount in /sysroot by command:
>
> umount -R /sysroot
Why do we need to "umount" everything under /sysroot?
> It's very convenient to do so, because it's hard to track what we've mounted when we're in error handling path. So mount everything under /sysroot is reasonable and practical for us.
>
> With b). the mount unit becomes required by "initrd-root-fs.target" rather than it used to be "local-fs.target".
>
> The difference between "initrd-root-fs.target" and "local-fs.target" is, the former has OnFailureIsolate=yes but the later does not. From system.unit(5):
I don't understand why we need this. initrd-root-fs.target should depend only on the root mount unit and no other mount unit. And if one can't mount root, that's a fatal failure and launching the emergency shell in "Isolate" mode makes sense.
But not reaching local-fs.target should not be a fatal failure. One should still be able to continue with boot. And to me it makes sense that in this case the emergency shell is launched without isolating the rest of the services.
That also means that if we fail to mount an nfs or non-root file system, we will fall into emergency shell with sysroot mounted? That's what we want.
> OnFailureIsolate= Takes a boolean argument. If true, the unit listed in OnFailure= will be enqueued in isolation mode, i.e. all units that are not its dependency will be stopped. If this is set, only a single unit may be listed in OnFailure=. Defaults to false.
>
> With OnFailure=emergency.target and OnFailureIsolate=yes, when error occurred during the boot, systemd will "isolate" to emergency.target, that means all the service will be stopped except the dependencies of emergency.target. So that systemd boot process will be interrupt and leave emergency.target as the only one running. Without the isolate, systemd will continue to boot and emergency.target will be interrupted.
What do you mean by "emergency.target will be interrupted"?
Thanks
Vivek
kexec mailing list kexec@lists.fedoraproject.org https://lists.fedoraproject.org/mailman/listinfo/kexec
On 05/13/14 at 10:50pm, Vivek Goyal wrote:
> On Thu, May 08, 2014 at 07:37:13PM +0800, WANG Chao wrote:
> > This patch does the following two changes in 2nd kernel:
> > a). dump target is mounted under /sysroot
> > b). append "x-initrd.mount" to the mount options.
> >
> > With a). we don't need to track what we've mounted in 2nd kernel. We can just umount recursively every mount in /sysroot by command:
> >
> > umount -R /sysroot
>
> Why do we need to "umount" everything under /sysroot?
The dump target is mounted under /sysroot/ in the 2nd kernel, say at "/sysroot/var/crash/".
To keep the error handler simple, I don't want to look through /etc/fstab to find out where the dump target is mounted. Having the dump target mounted under /sysroot/ makes life much easier: before reboot or poweroff, we can simply run umount -R /sysroot to unmount both /sysroot and the dump target.
> > It's very convenient to do so, because it's hard to track what we've mounted when we're in error handling path. So mount everything under /sysroot is reasonable and practical for us.
As I said here as well ...
> > With b). the mount unit becomes required by "initrd-root-fs.target" rather than it used to be "local-fs.target".
> >
> > The difference between "initrd-root-fs.target" and "local-fs.target" is, the former has OnFailureIsolate=yes but the later does not. From system.unit(5):
>
> I don't understand why do we need this. initrd-root-fs.target should be dependent only on root mount unit and no other mount unit. And if one can't mount root, that's a fatal failure and launching emergency shell in "Isolate" mode makes sense.
No, not with "x-initrd.mount":
# ls -l /run/systemd/generator/initrd-root-fs.target.wants/
total 0
lrwxrwxrwx 1 root 0 36 May 14 13:58 sysroot.mount -> /run/systemd/generator/sysroot.mount
> But not reaching local-fs.target should not be a fatal failure. One should still be able to continue with boot. And to me it makes sense that in this case emergecny shell is launched without isolating rest of the serivces.
We must isolate to emergency on any error that is critical for initrd.target, so that we avoid any further systemd unit execution. initrd-cleanup.service, as one of those units, would otherwise switch root.
Let's take local-fs.target for example:
local-fs.target fails and emergency.service is triggered. Because sysinit.target "Conflicts=" with emergency.service, sysinit.target will be stopped. basic.target "Requires=" sysinit.target, so it will in turn be stopped too. initrd.target "Requires=" basic.target, and it will be stopped too. Then initrd-cleanup.service comes into the picture because it runs "After" initrd.target. Since initrd.target is considered done from the perspective of initrd-cleanup.service, initrd-cleanup.service will be started and will switch root.
That's why we need to isolate on any error. We want systemd to stop completely except for the emergency services.
These dependencies are complicated and some of the systemd terms are not easy to comprehend, so it's possible I made a mistake here. But the bottom line is that with "local-fs.target" it doesn't work out and systemd switches root, while "initrd-root-fs.target" (x-initrd.mount) works flawlessly.
> That also means that if we fail to mount an nfs or non-root file system, we will fall into emergency shell with sysroot mounted? That's what we want.
No. None of these are related...
> > OnFailureIsolate= Takes a boolean argument. If true, the unit listed in OnFailure= will be enqueued in isolation mode, i.e. all units that are not its dependency will be stopped. If this is set, only a single unit may be listed in OnFailure=. Defaults to false.
> >
> > With OnFailure=emergency.target and OnFailureIsolate=yes, when error occurred during the boot, systemd will "isolate" to emergency.target, that means all the service will be stopped except the dependencies of emergency.target. So that systemd boot process will be interrupt and leave emergency.target as the only one running. Without the isolate, systemd will continue to boot and emergency.target will be interrupted.
>
> What do you mean by emergency.target will be interrupted.
I explained above why it doesn't work out if we don't isolate.
Thanks
WANG Chao
In a later patch, the kdump error handler script can reuse some of the code in kdump.sh. So put the common functions and variables in kdump-lib.sh.
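As a rough, standalone sketch of the kind of shared config parsing that moves into kdump-lib.sh: parse_default_action below is a hypothetical, trimmed-down function written for this example, not part of the patch. It only handles the "default" directive and strips inline comments itself instead of calling strip_comments; the "reboot" fallback mirrors the DEFAULT_ACTION="reboot -f" initial value.

```shell
# parse_default_action CONF_FILE
# Hypothetical helper: prints the configured default action name,
# falling back to "reboot" when no valid "default" line is present.
parse_default_action() {
    _conf=$1
    _action="reboot"

    while read -r _opt _val; do
        _val=${_val%%#*}    # drop an inline comment
        _val=${_val%% *}    # keep only the first word
        case "$_opt" in
            default)
                case "$_val" in
                    reboot|halt|poweroff|shell|dump_to_rootfs)
                        _action=$_val
                        ;;
                esac
                ;;
        esac
    done < "$_conf"

    echo "$_action"
}

# Try it against a throwaway config file.
conf=$(mktemp)
printf '%s\n' 'path /var/crash' 'default halt  # stop after a failed dump' > "$conf"
parse_default_action "$conf"
rm -f "$conf"
```

Both kdump.sh and the error handler can then source one file and call one function, instead of each carrying its own copy of the loop.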
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 dracut-kdump.sh | 152 +---------------------------------------------------
 kdump-lib.sh    | 161 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 163 insertions(+), 150 deletions(-)

diff --git a/dracut-kdump.sh b/dracut-kdump.sh
index cb13d92..1960b7e 100755
--- a/dracut-kdump.sh
+++ b/dracut-kdump.sh
@@ -9,24 +9,6 @@ if [ -f "$initdir/lib/dracut/no-emergency-shell" ]; then
 fi
 
 set -o pipefail
-KDUMP_PATH="/var/crash"
-CORE_COLLECTOR=""
-DEFAULT_CORE_COLLECTOR="makedumpfile -l --message-level 1 -d 31"
-DMESG_COLLECTOR="/sbin/vmcore-dmesg"
-DEFAULT_ACTION="reboot -f"
-DATEDIR=`date +%Y.%m.%d-%T`
-HOST_IP='127.0.0.1'
-DUMP_INSTRUCTION=""
-SSH_KEY_LOCATION="/root/.ssh/kdump_id_rsa"
-KDUMP_SCRIPT_DIR="/kdumpscripts"
-DD_BLKSIZE=512
-FINAL_ACTION="reboot -f"
-DUMP_RETVAL=0
-conf_file="/etc/kdump.conf"
-KDUMP_PRE=""
-KDUMP_POST=""
-MOUNTS=""
-
 export PATH=$PATH:$KDUMP_SCRIPT_DIR
 
 do_dump()
@@ -43,27 +25,6 @@ do_dump()
     return $_ret
 }
 
-do_umount()
-{
-    if [ -n "$MOUNTS" ]; then
-        for mount in $MOUNTS; do
-            ismounted $mount && umount -R $mount
-        done
-    fi
-}
-
-do_final_action()
-{
-    do_umount
-    eval $FINAL_ACTION
-}
-
-do_default_action()
-{
-    wait_for_loginit
-    eval $DEFAULT_ACTION
-}
-
 do_kdump_pre()
 {
     if [ -n "$KDUMP_PRE" ]; then
@@ -83,39 +44,6 @@ add_dump_code()
     DUMP_INSTRUCTION=$1
 }
 
-# dump_fs <mount point| device>
-dump_fs()
-{
-    local _dev=$(findmnt -k -f -n -r -o SOURCE $1)
-    local _mp=$(findmnt -k -f -n -r -o TARGET $1)
-
-    echo "kdump: dump target is $_dev"
-
-    if [ -z "$_mp" ]; then
-        echo "kdump: error: Dump target $_dev is not mounted."
-        return 1
-    fi
-    MOUNTS="$MOUNTS $_mp"
-
-    # Remove -F in makedumpfile case. We don't want a flat format dump here.
-    [[ $CORE_COLLECTOR = *makedumpfile* ]] && CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e "s/-F//g"`
-
-    echo "kdump: saving to $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/"
-
-    mount -o remount,rw $_mp || return 1
-    mkdir -p $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR || return 1
-
-    save_vmcore_dmesg_fs ${DMESG_COLLECTOR} "$_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/"
-
-    echo "kdump: saving vmcore"
-    $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete || return 1
-    mv $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore
-    sync
-
-    echo "kdump: saving vmcore complete"
-    return 0
-}
-
 dump_raw()
 {
     local _raw=$1
@@ -165,21 +93,6 @@ dump_ssh()
     return 0
 }
 
-save_vmcore_dmesg_fs() {
-    local _dmesg_collector=$1
-    local _path=$2
-
-    echo "kdump: saving vmcore-dmesg.txt"
-    $_dmesg_collector /proc/vmcore > ${_path}/vmcore-dmesg-incomplete.txt
-    _exitcode=$?
-    if [ $_exitcode -eq 0 ]; then
-        mv ${_path}/vmcore-dmesg-incomplete.txt ${_path}/vmcore-dmesg.txt
-        echo "kdump: saving vmcore-dmesg.txt complete"
-    else
-        echo "kdump: saving vmcore-dmesg.txt failed"
-    fi
-}
-
 save_vmcore_dmesg_ssh() {
     local _dmesg_collector=$1
     local _path=$2
@@ -218,61 +131,7 @@ get_host_ip()
 
 read_kdump_conf()
 {
-    if [ ! -f "$conf_file" ]; then
-        echo "kdump: $conf_file not found"
-        return
-    fi
-
-    # first get the necessary variables
-    while read config_opt config_val;
-    do
-        # remove inline comments after the end of a directive.
-        config_val=$(strip_comments $config_val)
-        case "$config_opt" in
-        path)
-            KDUMP_PATH="$config_val"
-            ;;
-        core_collector)
-            [ -n "$config_val" ] && CORE_COLLECTOR="$config_val"
-            ;;
-        sshkey)
-            if [ -f "$config_val" ]; then
-                SSH_KEY_LOCATION=$config_val
-            fi
-            ;;
-        kdump_pre)
-            KDUMP_PRE="$config_val"
-            ;;
-        kdump_post)
-            KDUMP_POST="$config_val"
-            ;;
-        fence_kdump_args)
-            FENCE_KDUMP_ARGS="$config_val"
-            ;;
-        fence_kdump_nodes)
-            FENCE_KDUMP_NODES="$config_val"
-            ;;
-        default)
-            case $config_val in
-                shell)
-                    DEFAULT_ACTION="_emergency_shell kdump"
-                    ;;
-                reboot)
-                    DEFAULT_ACTION="do_umount; reboot -f"
-                    ;;
-                halt)
-                    DEFAULT_ACTION="do_umount; halt -f"
-                    ;;
-                poweroff)
-                    DEFAULT_ACTION="do_umount; poweroff -f"
-                    ;;
-                dump_to_rootfs)
-                    DEFAULT_ACTION="dump_fs $NEWROOT"
-                    ;;
-            esac
-            ;;
-        esac
-    done < $conf_file
+    get_kdump_confs
 
     # rescan for add code for dump target
     while read config_opt config_val;
@@ -290,7 +149,7 @@ read_kdump_conf()
             add_dump_code "dump_ssh $SSH_KEY_LOCATION $config_val"
             ;;
         esac
-    done < $conf_file
+    done < $KDUMP_CONF
 }
 
 fence_kdump_notify()
@@ -303,13 +162,6 @@ fence_kdump_notify()
 read_kdump_conf
 fence_kdump_notify
 
-if [ -z "$CORE_COLLECTOR" ];then
-    CORE_COLLECTOR=$DEFAULT_CORE_COLLECTOR
-    if is_ssh_dump_target || is_raw_dump_target; then
-        CORE_COLLECTOR="$CORE_COLLECTOR -F"
-    fi
-fi
-
 get_host_ip
 if [ $? -ne 0 ]; then
     echo "kdump: get_host_ip exited with non-zero status!"
diff --git a/kdump-lib.sh b/kdump-lib.sh
index a20c6e8..5acf6f3 100755
--- a/kdump-lib.sh
+++ b/kdump-lib.sh
@@ -138,3 +138,164 @@ check_save_path_fs()
     fi
 }
 
+
+#
+# Below functions and variables are meant to be used in 2nd kernel
+#
+
+KDUMP_PATH="/var/crash"
+CORE_COLLECTOR="makedumpfile -l --message-level 1 -d 31"
+DMESG_COLLECTOR="/sbin/vmcore-dmesg"
+DEFAULT_ACTION="reboot -f"
+SSH_KEY_LOCATION="/root/.ssh/kdump_id_rsa"
+KDUMP_PRE=""
+KDUMP_POST=""
+DATEDIR=`date +%Y.%m.%d-%T`
+HOST_IP='127.0.0.1'
+DUMP_INSTRUCTION=""
+KDUMP_SCRIPT_DIR="/kdumpscripts"
+DD_BLKSIZE=512
+KDUMP_CONF="/etc/kdump.conf"
+NEWROOT="/sysroot"
+
+get_kdump_confs()
+{
+    local config_opt config_val
+    local user_specified_cc
+
+    while read config_opt config_val;
+    do
+        # remove inline comments after the end of a directive.
+        config_val=$(strip_comments $config_val)
+        case "$config_opt" in
+            path)
+                KDUMP_PATH="$config_val"
+            ;;
+            core_collector)
+                CORE_COLLECTOR="$config_val"
+                user_specified_cc=yes
+            ;;
+            sshkey)
+                SSH_KEY_LOCATION=$config_val
+            ;;
+            kdump_pre)
+                KDUMP_PRE="$config_val"
+            ;;
+            kdump_post)
+                KDUMP_POST="$config_val"
+            ;;
+            fence_kdump_args)
+                FENCE_KDUMP_ARGS="$config_val"
+            ;;
+            fence_kdump_nodes)
+                FENCE_KDUMP_NODES="$config_val"
+            ;;
+            default)
+                case $config_val in
+                    shell)
+                        DEFAULT_ACTION="kdump_emergency_shell"
+                    ;;
+                    reboot)
+                        DEFAULT_ACTION="do_umount; reboot -f"
+                    ;;
+                    halt)
+                        DEFAULT_ACTION="do_umount; halt -f"
+                    ;;
+                    poweroff)
+                        DEFAULT_ACTION="do_umount; poweroff -f"
+                    ;;
+                    dump_to_rootfs)
+                        DEFAULT_ACTION="dump_to_rootfs"
+                    ;;
+                esac
+            ;;
+        esac
+    done < $KDUMP_CONF
+
+    if is_ssh_dump_target || is_raw_dump_target; then
+        if [ -z "$user_specified_cc" ]; then
+            CORE_COLLECTOR="$CORE_COLLECTOR -F"
+        fi
+    fi
+}
+
+# dump_fs <mount point| device>
+dump_fs()
+{
+
+    local _dev=$(findmnt -k -f -n -r -o SOURCE $1)
+    local _mp=$(findmnt -k -f -n -r -o TARGET $1)
+
+    echo "kdump: dump target is $_dev"
+
+    if [ -z "$_mp" ]; then
+        echo "kdump: error: Dump target $_dev is not mounted."
+        return 1
+    fi
+
+    # Remove -F in makedumpfile case. We don't want a flat format dump here.
+    [[ $CORE_COLLECTOR = *makedumpfile* ]] && CORE_COLLECTOR=`echo $CORE_COLLECTOR | sed -e "s/-F//g"`
+
+    echo "kdump: saving to $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/"
+
+    mount -o remount,rw $_mp || return 1
+    mkdir -p $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR || return 1
+
+    save_vmcore_dmesg_fs ${DMESG_COLLECTOR} "$_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/"
+
+    echo "kdump: saving vmcore"
+    $CORE_COLLECTOR /proc/vmcore $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete || return 1
+    mv $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore-incomplete $_mp/$KDUMP_PATH/$HOST_IP-$DATEDIR/vmcore
+    sync
+
+    echo "kdump: saving vmcore complete"
+}
+
+save_vmcore_dmesg_fs() {
+    local _dmesg_collector=$1
+    local _path=$2
+
+    echo "kdump: saving vmcore-dmesg.txt"
+    $_dmesg_collector /proc/vmcore > ${_path}/vmcore-dmesg-incomplete.txt
+    _exitcode=$?
+    if [ $_exitcode -eq 0 ]; then
+        mv ${_path}/vmcore-dmesg-incomplete.txt ${_path}/vmcore-dmesg.txt
+        echo "kdump: saving vmcore-dmesg.txt complete"
+    else
+        echo "kdump: saving vmcore-dmesg.txt failed"
+    fi
+}
+
+dump_to_rootfs()
+{
+
+    systemctl start dracut-initqueue
+    systemctl start sysroot.mount
+
+    dump_fs $NEWROOT
+}
+
+kdump_emergency_shell()
+{
+    echo "PS1=\"kdump:\\\${PWD}# \"" >/etc/profile
+    /bin/dracut-emergency
+    rm -f /etc/profile
+}
+
+do_umount()
+{
+    umount -Rf /sysroot
+}
+
+do_default_action()
+{
+    echo "Kdump: Error Occured, doing default action"
+    eval $DEFAULT_ACTION
+}
+
+
+do_final_action()
+{
+    do_umount
+    reboot -f
+}
Currently we disable the stock emergency.service, because interrupting the boot process would stop us from entering the kdump capture script. But this isn't a good solution: when a mandatory target/service fails and emergency is disabled, systemd has no service to run and appears to hang.
A better approach is to write our own error handling service and act differently according to the default action configured in kdump.conf.
This patch introduces a kdump error handling script and an emergency service. This emergency.service unit overrides the existing one when the kdump initramfs is built, and when it is started it calls the error handling script.
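To make the intended control flow easier to follow, here is a dry-run sketch of the handler. It is not the script from this patch: run_error_handler and final_action are hypothetical names, every real action is replaced by an echo so nothing is unmounted or rebooted, and a case statement stands in for the eval of $DEFAULT_ACTION.

```shell
# Stubs: print what would be done instead of doing it.
do_umount()    { echo "umount -R /sysroot"; }
final_action() { do_umount; echo "reboot -f"; }

# run_error_handler ACTION
# ACTION is the "default" value read from kdump.conf.
run_error_handler() {
    case "$1" in
        shell)          echo "start emergency shell" ;;
        dump_to_rootfs) echo "dump vmcore to /sysroot" ;;
        halt|poweroff)  do_umount; echo "$1 -f"; return 0 ;;
        reboot)         ;;   # nothing extra: the final action reboots
        *)              echo "unknown default action: $1" >&2; return 1 ;;
    esac
    # Reached once the shell exits or the rootfs dump finishes.
    final_action
}

run_error_handler dump_to_rootfs
```

In the real handler, halt and poweroff never return, which is why the sketch returns early for them instead of falling through to the final reboot.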
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 dracut-kdump-emergency.service | 32 ++++++++++++++++++++++++++++++++
 dracut-kdump-error-handler.sh  | 10 ++++++++++
 dracut-module-setup.sh         |  3 +++
 kexec-tools.spec               |  5 ++++-
 4 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 dracut-kdump-emergency.service
 create mode 100755 dracut-kdump-error-handler.sh

diff --git a/dracut-kdump-emergency.service b/dracut-kdump-emergency.service
new file mode 100644
index 0000000..a29b14a
--- /dev/null
+++ b/dracut-kdump-emergency.service
@@ -0,0 +1,32 @@
+# This file is part of systemd.
+#
+# systemd is free software; you can redistribute it and/or modify it
+# under the terms of the GNU Lesser General Public License as published by
+# the Free Software Foundation; either version 2.1 of the License, or
+# (at your option) any later version.
+
+# See systemd.special(7) for details
+
+[Unit]
+Description=Kdump Error Handler
+DefaultDependencies=no
+After=systemd-vconsole-setup.service
+Wants=systemd-vconsole-setup.service
+
+[Service]
+Environment=HOME=/
+Environment=DRACUT_SYSTEMD=1
+Environment=NEWROOT=/sysroot
+WorkingDirectory=/
+ExecStart=/bin/kdump-error-handler.sh
+ExecStopPost=-/usr/bin/systemctl --fail --no-block default
+Type=oneshot
+StandardInput=tty-force
+StandardOutput=inherit
+StandardError=inherit
+KillMode=process
+IgnoreSIGPIPE=no
+
+# Bash ignores SIGTERM, so we send SIGHUP instead, to ensure that bash
+# terminates cleanly.
+KillSignal=SIGHUP
diff --git a/dracut-kdump-error-handler.sh b/dracut-kdump-error-handler.sh
new file mode 100755
index 0000000..2c55b04
--- /dev/null
+++ b/dracut-kdump-error-handler.sh
@@ -0,0 +1,10 @@
+#!/bin/sh
+
+. /lib/kdump-lib.sh
+
+set -o pipefail
+export PATH=$PATH:$KDUMP_SCRIPT_DIR
+
+get_kdump_confs
+do_default_action
+do_final_action
diff --git a/dracut-module-setup.sh b/dracut-module-setup.sh
index 2a16900..0a03bfa 100755
--- a/dracut-module-setup.sh
+++ b/dracut-module-setup.sh
@@ -546,6 +546,9 @@ install() {
     inst "/sbin/vmcore-dmesg" "/sbin/vmcore-dmesg"
     inst_hook pre-pivot 9999 "$moddir/kdump.sh"
     inst "/lib/kdump/kdump-lib.sh" "/lib/kdump-lib.sh"
+    inst "$moddir/kdump-error-handler.sh" "/usr/bin/kdump-error-handler.sh"
+    # Replace existing emergency service
+    cp "$moddir/kdump-emergency.service" "$initdir/$systemdsystemunitdir/emergency.service"
 
     # Check for all the devices and if any device is iscsi, bring up iscsi
     # target. Ideally all this should be pushed into dracut iscsi module
diff --git a/kexec-tools.spec b/kexec-tools.spec
index a1490db..0e6d25c 100644
--- a/kexec-tools.spec
+++ b/kexec-tools.spec
@@ -34,6 +34,8 @@ Source22: kdump-dep-generator.sh
 Source100: dracut-kdump.sh
 Source101: dracut-module-setup.sh
 Source102: dracut-monitor_dd_progress
+Source103: dracut-kdump-error-handler.sh
+Source104: dracut-kdump-emergency.service
 
 Requires(post): systemd-units
 Requires(preun): systemd-units
@@ -188,7 +190,8 @@ mkdir -p -m755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpba
 cp %{SOURCE100} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE100}}
 cp %{SOURCE101} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE101}}
 cp %{SOURCE102} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE102}}
-
+cp %{SOURCE103} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE103}}
+cp %{SOURCE104} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE104}}
 chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE100}}
 chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE101}}
This patch introduces a new kdump-capture.service which is used to run kdump.sh.
kdump-capture.service has OnFailure=emergency.target and OnFailureIsolate=yes set. When kdump.sh fails, the kdump emergency service is triggered and we enter the error handling path.
In the 2nd kernel, the default target for systemd is initrd.target, so we put kdump-capture.service in initrd.target.wants/ and that way systemd starts kdump-capture as part of the boot process.
kdump.sh used to run in the dracut-pre-pivot hook. Now kdump-capture.service is ordered after dracut-pre-pivot.service, and its other dependencies are all copied from dracut-pre-pivot.service. So the start point of kdump.sh will be almost the same as it used to be.
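The behavioural change in kdump.sh is that it no longer runs the default action itself on failure. It just exits non-zero, and systemd follows OnFailure= for the failed unit. A minimal sketch of that pattern follows. capture_vmcore and run_capture are hypothetical names, and DUMP_CMD is an assumption made only so the example can be run harmlessly.

```shell
# Stand-in for the real dump step. DUMP_CMD defaults to "false" so the
# failure path is what you get unless a command is supplied.
capture_vmcore() {
    ${DUMP_CMD:-false}
}

run_capture() {
    if ! capture_vmcore; then
        echo "kdump: saving vmcore failed" >&2
        # As the script's exit status, a non-zero value marks the
        # oneshot unit failed, which is what triggers OnFailure=.
        return 1
    fi
    echo "kdump: saving vmcore complete"
}

DUMP_CMD=true
run_capture
```

Keeping the decision about what to do next out of kdump.sh means there is exactly one place, the emergency service, that implements the default action.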
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 dracut-kdump-capture.service | 32 ++++++++++++++++++++++++++++++++
 dracut-kdump.sh              |  5 ++---
 dracut-module-setup.sh       |  4 +++-
 kexec-tools.spec             |  2 ++
 4 files changed, 39 insertions(+), 4 deletions(-)
 create mode 100644 dracut-kdump-capture.service

diff --git a/dracut-kdump-capture.service b/dracut-kdump-capture.service
new file mode 100644
index 0000000..bf5675b
--- /dev/null
+++ b/dracut-kdump-capture.service
@@ -0,0 +1,32 @@
+# This file is part of systemd.
+#
+# systemd is free software; you can redistribute it and/or modify it
+# under the terms of the GNU Lesser General Public License as published by
+# the Free Software Foundation; either version 2.1 of the License, or
+# (at your option) any later version.
+
+# See systemd.special(7) for details
+
+[Unit]
+Description=Kdump Capture Service
+After=initrd.target initrd-parse-etc.service sysroot.mount
+After=dracut-initqueue.service dracut-pre-mount.service dracut-mount.service dracut-pre-pivot.service
+Before=initrd-cleanup.service
+ConditionPathExists=/etc/initrd-release
+OnFailure=emergency.target
+OnFailureIsolate=yes
+
+[Service]
+Environment=DRACUT_SYSTEMD=1
+Environment=NEWROOT=/sysroot
+Type=oneshot
+ExecStart=/bin/kdump.sh
+StandardInput=null
+StandardOutput=syslog
+StandardError=syslog+console
+KillMode=process
+RemainAfterExit=yes
+
+# Bash ignores SIGTERM, so we send SIGHUP instead, to ensure that bash
+# terminates cleanly.
+KillSignal=SIGHUP
diff --git a/dracut-kdump.sh b/dracut-kdump.sh
index 1960b7e..d092e04 100755
--- a/dracut-kdump.sh
+++ b/dracut-kdump.sh
@@ -165,8 +165,7 @@ fence_kdump_notify
 get_host_ip
 if [ $? -ne 0 ]; then
     echo "kdump: get_host_ip exited with non-zero status!"
-    do_default_action
-    do_final_action
+    exit 1
 fi
 
 if [ -z "$DUMP_INSTRUCTION" ]; then
@@ -188,7 +187,7 @@ if [ $? -ne 0 ]; then
 fi
 
 if [ $DUMP_RETVAL -ne 0 ]; then
-    do_default_action
+    exit 1
 fi
 
 do_final_action
diff --git a/dracut-module-setup.sh b/dracut-module-setup.sh
index 0a03bfa..1babd55 100755
--- a/dracut-module-setup.sh
+++ b/dracut-module-setup.sh
@@ -544,8 +544,10 @@ install() {
     inst "/bin/cut" "/bin/cut"
     inst "/sbin/makedumpfile" "/sbin/makedumpfile"
     inst "/sbin/vmcore-dmesg" "/sbin/vmcore-dmesg"
-    inst_hook pre-pivot 9999 "$moddir/kdump.sh"
     inst "/lib/kdump/kdump-lib.sh" "/lib/kdump-lib.sh"
+    inst "$moddir/kdump.sh" "/usr/bin/kdump.sh"
+    inst "$moddir/kdump-capture.service" "$systemdsystemunitdir/kdump-capture.service"
+    ln_r "$systemdsystemunitdir/kdump-capture.service" "$systemdsystemunitdir/initrd.target.wants/kdump-capture.service"
     inst "$moddir/kdump-error-handler.sh" "/usr/bin/kdump-error-handler.sh"
     # Replace existing emergency service
     cp "$moddir/kdump-emergency.service" "$initdir/$systemdsystemunitdir/emergency.service"
diff --git a/kexec-tools.spec b/kexec-tools.spec
index 0e6d25c..ad0cf99 100644
--- a/kexec-tools.spec
+++ b/kexec-tools.spec
@@ -36,6 +36,7 @@ Source101: dracut-module-setup.sh
 Source102: dracut-monitor_dd_progress
 Source103: dracut-kdump-error-handler.sh
 Source104: dracut-kdump-emergency.service
+Source105: dracut-kdump-capture.service
 
 Requires(post): systemd-units
 Requires(preun): systemd-units
@@ -192,6 +193,7 @@ cp %{SOURCE101} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpb
 cp %{SOURCE102} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE102}}
 cp %{SOURCE103} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE103}}
 cp %{SOURCE104} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE104}}
+cp %{SOURCE105} $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE105}}
 chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE100}}
 chmod 755 $RPM_BUILD_ROOT/etc/kdump-adv-conf/kdump_dracut_modules/99kdumpbase/%{remove_dracut_prefix %{SOURCE101}}
dracut-emergency.service conflicts with emergency.service:
"Conflicts=emergency.service emergency.target"
We should always disable dracut-emergency.service, because if it gets started, our emergency.service will be interrupted and stopped.
dracut-emergency.service also isn't useful anymore, since we have introduced our own error handler.
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 dracut-kdump.sh | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/dracut-kdump.sh b/dracut-kdump.sh
index d092e04..eb4ab42 100755
--- a/dracut-kdump.sh
+++ b/dracut-kdump.sh
@@ -4,10 +4,6 @@ exec &> /dev/console
 . /lib/dracut-lib.sh
 . /lib/kdump-lib.sh
 
-if [ -f "$initdir/lib/dracut/no-emergency-shell" ]; then
-    rm -f -- $initdir/lib/dracut/no-emergency-shell
-fi
-
 set -o pipefail
 export PATH=$PATH:$KDUMP_SCRIPT_DIR