This patch set implements firmware-assisted dump support for the kdump service. Firmware-assisted dump support depends on the existing kdump infrastructure (kdump scripts) present in userland to save the dump to disk. Though the existing kdump script will work seamlessly, it still needs to be modified to make it aware of the presence of the firmware-assisted dump feature during service start and stop. These changes were tested successfully on a Power box with Fedora 19.
Changes in v3:
1. Split a few functions for readability.
2. Added a cleanup patch to remove the unnecessary "function" keyword.
---
Hari Bathini (8):
      kdump: Modify status routine to check for firmware-assisted dump
      kdump: Modify kdump script to start the firmware assisted dump.
      kdump: Modify kdump script to stop firmware assisted dump
      kdump: Take a backup of original default initrd before rebuilding.
      kdump: Rebuild default initrd for firmware assisted dump
      kdump: Get rid of "function" keyword from all functions
      kdump: Check for /proc/vmcore existence before capturing the vmcore.
      kdump: Add firmware-assisted dump howto document
 dracut-kdump.sh  |    3
 fadump-howto.txt |  428 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 kdumpctl         |  308 +++++++++++++++++++++++++++++++++------
 3 files changed, 691 insertions(+), 48 deletions(-)
 create mode 100644 fadump-howto.txt
This patch enables kdump script to check if firmware-assisted dump is enabled or not by reading value from '/sys/kernel/fadump_enabled'. The determine_dump_mode() routine sets dump_mode to 'fadump', if fadump is enabled. By default, dump_mode is set to 'kdump' mode.
Modify status routine to check if firmware assisted dump is registered or not by reading value from '/sys/kernel/fadump_registered' file. If it is set to '1' then return status=0 else return status=1.
0 <= Firmware-assisted dump is enabled and running
1 <= Firmware-assisted dump is enabled but not running
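A minimal sketch of the two checks described above, not the code from the patch itself. The function names `dump_mode_from` and `fadump_status_from` are hypothetical, and both take the node path as an argument (an assumption made here) so they can be tried against ordinary files instead of sysfs.

```shell
#!/bin/bash
# Illustrative only: these names are hypothetical, not part of kdumpctl.

# Print "fadump" if the given node exists and contains 1, else "kdump".
dump_mode_from() {
	local node="$1" val=""
	[ -f "$node" ] && val=$(cat "$node" 2>/dev/null)
	if [ "$val" = "1" ]; then
		echo "fadump"
	else
		echo "kdump"
	fi
}

# Return 0 if the given node contains 1 (registered), 1 otherwise.
fadump_status_from() {
	local val
	val=$(cat "$1" 2>/dev/null) || return 1
	[ "$val" = "1" ]
}
```

On a real system these would presumably be called with '/sys/kernel/fadump_enabled' and '/sys/kernel/fadump_registered' respectively.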
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
---
 kdumpctl |   64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 61 insertions(+), 3 deletions(-)
diff --git a/kdumpctl b/kdumpctl
index 481ffed..0009031 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -9,6 +9,8 @@ MKDUMPRD="/sbin/mkdumprd -f"
 SAVE_PATH=/var/crash
 SSH_KEY_LOCATION="/root/.ssh/kdump_id_rsa"
 DUMP_TARGET=""
+FADUMP_ENABLED_SYS_NODE="/sys/kernel/fadump_enabled"
+FADUMP_REGISTER_SYS_NODE="/sys/kernel/fadump_registered"

 . /lib/kdump/kdump-lib.sh

@@ -24,6 +26,19 @@ single_instance_lock()
 	flock 9
 }

+determine_dump_mode()
+{
+	# kdump shall be the default dump mode
+	dump_mode="kdump"
+
+	# Check if firmware-assisted dump is enabled
+	# if yes, set the dump mode as fadump
+	if is_fadump_capable; then
+		echo "Using dump mode fadump"
+		dump_mode="fadump"
+	fi
+}
+
 # remove_cmdline_param <kernel cmdline> <param1> [<param2>] ... [<paramN>]
 # Remove a list of kernel parameters from a given kernel cmdline and print the result.
 # For each "arg" in the removing params list, "arg" and "arg=xxx" will be removed if exists.
@@ -372,6 +387,25 @@ function propagate_ssh_key()
 }

+is_fadump_capable()
+{
+	# Check if firmware-assisted dump is enabled
+	# if no, fallback to kdump check
+	if [ -f $FADUMP_ENABLED_SYS_NODE ]; then
+		rc=`cat $FADUMP_ENABLED_SYS_NODE`
+		[ $rc -eq 1 ] && return 0
+	fi
+	return 1
+}
+
+check_current_fadump_status()
+{
+	# Check if firmware-assisted dump has been registered.
+	rc=`cat $FADUMP_REGISTER_SYS_NODE`
+	[ $rc -eq 1 ] && return 0
+	return 1
+}
+
 function check_current_kdump_status()
 {
 	rc=`cat /sys/kernel/kexec_crash_loaded`
@@ -382,6 +416,17 @@ function check_current_kdump_status()
 	fi
 }

+check_current_status()
+{
+	if [ $dump_mode == "fadump" ]; then
+		check_current_fadump_status
+	else
+		check_current_kdump_status
+	fi
+
+	return $?
+}
+
 function save_raw()
 {
 	local kdump_dir
@@ -526,6 +571,16 @@ function check_kdump_feasibility()
 	fi
 }

+check_dump_feasibility()
+{
+	if [ $dump_mode != "kdump" ]; then
+		return 0
+	fi
+
+	check_kdump_feasibility
+	return $?
+}
+
 function start()
 {
 	check_config
@@ -543,13 +598,13 @@ function start()
 		return 1
 	fi

-	check_kdump_feasibility
+	check_dump_feasibility
 	if [ $? -ne 0 ]; then
 		echo "Starting kdump: [FAILED]"
 		return 1
 	fi

-	check_current_kdump_status
+	check_current_status
 	if [ $? == 0 ]; then
 		echo "Kdump already running: [WARNING]"
 		return 0
@@ -597,6 +652,9 @@ fi

 main ()
 {
+	# Determine if the dump mode is kdump or fadump
+	determine_dump_mode
+
 	case "$1" in
 	  start)
 		if [ -s /proc/vmcore ]; then
@@ -611,7 +669,7 @@ main ()
 		;;
 	  status)
 		EXIT_CODE=0
-		check_current_kdump_status
+		check_current_status
 		case "$?" in
 		  0)
 			echo "Kdump is operational"
On 02/27/14 at 01:50pm, Hari Bathini wrote:
> @@ -24,6 +26,19 @@ single_instance_lock()
>  	flock 9
>  }
>
> +determine_dump_mode()
> +{
> +	# kdump shall be the default dump mode
> +	dump_mode="kdump"

Since dump_mode will be used here and there, how about adding a global variable at the top of this file, like FADUMP_ENABLED_SYS_NODE etc.?

> +
> +	# Check if firmware-assisted dump is enabled
> +	# if yes, set the dump mode as fadump
> +	if is_fadump_capable; then
> +		echo "Using dump mode fadump"
> +		dump_mode="fadump"
> +	fi
> +}

Thanks
Dave
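One way the suggestion in this reply might look, as a rough sketch rather than anything taken from the thread: `DEFAULT_DUMP_MODE` is a hypothetical name, and the sysfs check is inlined here only to keep the example self-contained.

```shell
#!/bin/bash
# Hypothetical layout: declare the dump mode next to the other globals.
FADUMP_ENABLED_SYS_NODE="/sys/kernel/fadump_enabled"
DEFAULT_DUMP_MODE="kdump"	# assumed name, not from the patch
dump_mode=$DEFAULT_DUMP_MODE

determine_dump_mode() {
	# Start from the global default, then override if fadump is enabled.
	dump_mode=$DEFAULT_DUMP_MODE
	if [ -f "$FADUMP_ENABLED_SYS_NODE" ] &&
	   [ "$(cat "$FADUMP_ENABLED_SYS_NODE" 2>/dev/null)" = "1" ]; then
		dump_mode="fadump"
	fi
}
```

With this layout, any function that runs before determine_dump_mode() would still see a defined dump_mode.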
During service kdump start, if firmware-assisted dump is not enabled, fall back to starting the existing kexec-based kdump. If firmware-assisted dump is enabled but not running, start it by echoing 1 to the '/sys/kernel/fadump_registered' file.
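The registration step described above can be sketched as follows. This is an illustration under assumptions, not the patch's start_fadump(): the name `try_register_fadump` is hypothetical, and the node path is passed as an argument so the function can be tried on a regular file.

```shell
#!/bin/bash
# Illustrative only: try_register_fadump is a hypothetical name.
# Writes 1 to the node, then reads it back to confirm registration.
try_register_fadump() {
	local node="$1"
	[ -w "$node" ] || return 1
	echo 1 > "$node" || return 1
	[ "$(cat "$node" 2>/dev/null)" = "1" ]
}
```

Reading the value back matters because a write to a sysfs node can be accepted by the shell while the kernel still declines the request.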
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
 kdumpctl |   26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/kdumpctl b/kdumpctl
index 0009031..b993e1b 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -581,6 +581,29 @@ check_dump_feasibility()
 	return $?
 }

+start_fadump()
+{
+	echo 1 > $FADUMP_REGISTER_SYS_NODE
+	if ! check_current_fadump_status; then
+		echo "fadump: failed to register"
+		return 1
+	fi
+
+	echo "fadump: registered successfully"
+	return 0
+}
+
+start_dump()
+{
+	if [ $dump_mode == "fadump" ]; then
+		start_fadump
+	else
+		load_kdump
+	fi
+
+	return $?
+}
+
 function start()
 {
 	check_config
@@ -622,7 +645,8 @@ function start()
 		echo "Starting kdump: [FAILED]"
 		return 1
 	fi
-	load_kdump
+
+	start_dump
 	if [ $? != 0 ]; then
 		echo "Starting kdump: [FAILED]"
 		return 1
During service kdump stop, if firmware-assisted dump is enabled and running, stop it by echoing 0 to the '/sys/kernel/fadump_registered' file.
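A minimal sketch of the un-registration step, assuming the same read-back approach as registration. The name `try_unregister_fadump` is hypothetical and the node path is a parameter only so the example can run against a temporary file.

```shell
#!/bin/bash
# Illustrative only: try_unregister_fadump is a hypothetical name.
try_unregister_fadump() {
	local node="$1"
	# Only write 0 when the node currently reads 1.
	if [ "$(cat "$node" 2>/dev/null)" = "1" ]; then
		echo 0 > "$node" || return 1
	fi
	# Success means the node no longer reads 1; an unreadable node is
	# treated as "not registered" in this sketch.
	[ "$(cat "$node" 2>/dev/null)" != "1" ]
}
```

Skipping the write when nothing is registered keeps a repeated stop request harmless.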
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
---
 kdumpctl |   39 +++++++++++++++++++++++++++++++++------
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/kdumpctl b/kdumpctl
index b993e1b..bbc2779 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -655,18 +655,45 @@ function start()
 	echo "Starting kdump: [OK]"
 }

-function stop()
+stop_fadump()
+{
+	check_current_fadump_status && echo 0 > $FADUMP_REGISTER_SYS_NODE
+	if check_current_fadump_status; then
+		echo "fadump: failed to un-register"
+		return 1
+	fi
+
+	echo "fadump: un-registered successfully"
+	return 0
+}
+
+stop_kdump()
 {
 	$KEXEC -p -u 2>/dev/null
-	if [ $? == 0 ]; then
-		echo "kexec: unloaded kdump kernel"
-		echo "Stopping kdump: [OK]"
-		return 0
-	else
+	if [ $? != 0 ]; then
 		echo "kexec: failed to unloaded kdump kernel"
+		return 1
+	fi
+
+	echo "kexec: unloaded kdump kernel"
+	return 0
+}
+
+function stop()
+{
+	if [ $dump_mode == "fadump" ]; then
+		stop_fadump
+	else
+		stop_kdump
+	fi
+
+	if [ $? != 0 ]; then
 		echo "Stopping kdump: [FAILED]"
 		return 1
 	fi
+
+	echo "Stopping kdump: [OK]"
+	return 0
 }

 if [ ! -f "$KDUMP_CONFIG_FILE" ]; then
Take a backup of the original initrd when fadump is used for the first time or when the user has switched from kdump to fadump. This allows us to fall back to the original initrd when the kdump service fails to rebuild the fadump-ready default initrd. Also, if the user switches from fadump to kdump, the original initrd will be restored when the kdump script is run for the first time after the switch.
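The backup and restore rule described above can be sketched in one function. This is a simplification under assumptions: the helper name `sync_initrd_backup` is hypothetical, it takes the mode and initrd path as arguments, and it combines in one place the two steps that the patch performs in separate routines.

```shell
#!/bin/bash
# Illustrative only: sync_initrd_backup MODE DEFAULT_INITRD (hypothetical).
sync_initrd_backup() {
	local mode="$1" initrd="$2"
	local bak="$initrd.default.bak"
	if [ "$mode" = "fadump" ]; then
		# First use of fadump: keep a copy of the original initrd.
		[ -e "$bak" ] || cp "$initrd" "$bak"
	elif [ -e "$bak" ]; then
		# Back in kdump mode: restore the original initrd.
		mv "$bak" "$initrd"
	fi
}
```

The presence of the backup file doubles as the marker that fadump has been used before, which is why it is removed (via mv) on the switch back to kdump.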
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
---
 kdumpctl |  102 ++++++++++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 83 insertions(+), 19 deletions(-)

diff --git a/kdumpctl b/kdumpctl
index bbc2779..2eb52a5 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -15,6 +15,7 @@ FADUMP_REGISTER_SYS_NODE="/sys/kernel/fadump_registered"
 . /lib/kdump/kdump-lib.sh

 standard_kexec_args="-p"
+declare -i image_time

 if [ -f /etc/sysconfig/kdump ]; then
 	. /etc/sysconfig/kdump
@@ -31,12 +32,25 @@ determine_dump_mode()
 	# kdump shall be the default dump mode
 	dump_mode="kdump"

+	if [ -z "$KDUMP_KERNELVER" ]; then
+		kdump_kver=`uname -r`
+	else
+		kdump_kver=$KDUMP_KERNELVER
+	fi
+
+	default_initrd="${KDUMP_BOOTDIR}/initramfs-${kdump_kver}.img"
+	kdump_initrd="${KDUMP_BOOTDIR}/initramfs-${kdump_kver}kdump.img"
+	default_initrd_bak="$default_initrd.default.bak"
+
 	# Check if firmware-assisted dump is enabled
 	# if yes, set the dump mode as fadump
 	if is_fadump_capable; then
 		echo "Using dump mode fadump"
 		dump_mode="fadump"
 	fi
+
+	# Handle dump mode switch from kdump to fadump OR fadump to kdump
+	handle_dump_mode_switch
 }

 # remove_cmdline_param <kernel cmdline> <param1> [<param2>] ... [<paramN>]
@@ -83,6 +97,10 @@ function save_core()

 function rebuild_initrd()
 {
+	if [ $dump_mode == "fadump" ]; then
+		backup_default_initrd
+	fi
+
 	$MKDUMPRD $kdump_initrd $kdump_kver
 	if [ $? != 0 ]; then
 		echo "mkdumprd: failed to make kdump initrd" >&2
@@ -112,6 +130,61 @@ function check_executable()
 	done
 }

+backup_default_initrd()
+{
+	# Check if backup initrd is already present. If not, then
+	# this is the first time fadump is being used OR user
+	# has switched from kdump to fadump.
+	# Take a backup of the original default initrd before
+	# we rebuild default initrd for fadump support.
+	if [ ! -e $default_initrd_bak ];then
+		echo "Backing up default initrd"
+		cp $default_initrd $default_initrd_bak
+		sync
+	fi
+}
+
+handle_dump_mode_switch()
+{
+	if [ $dump_mode == "fadump" ]; then
+		if [ -e $kdump_initrd ];then
+			# This means user has switched from kdump to fadump.
+			# Remove kdump initrd which is no longer needed
+			rm -f $kdump_initrd
+		fi
+	else
+		if [ -e $default_initrd_bak ];then
+			# !fadump and original initrd backup file exists.
+			# This means user has switched from fadump to kdump.
+			# Restore the original default initrd.
+			mv $default_initrd_bak $default_initrd
+			sync
+		fi
+	fi
+}
+
+find_initrd_image_time()
+{
+	image_time=0
+
+	# Check to see if dependent files have been modified
+	# since last build of the image file
+	if [ $dump_mode == "fadump" ]; then
+		# If this is the 1st time we are using fadump then let image_time be
+		# zero to force rebuild initrd. The non-existance of backup initrd
+		# means this is the 1st time fadump is being used. If it exists then
+		# return the image time of default initrd.
+		if [ -e $default_initrd_bak ]; then
+			image_time=`stat -c "%Y" $default_initrd 2>/dev/null`
+		fi
+	else
+		if [ -f $kdump_initrd ]; then
+			image_time=`stat -c "%Y" $kdump_initrd 2>/dev/null`
+			return
+		fi
+	fi
+}
+
 function check_config()
 {
 	local nr
@@ -171,14 +244,7 @@ function check_rebuild()
 	local extra_modules modified_files=""
 	local _force_rebuild force_rebuild="0"

-	if [ -z "$KDUMP_KERNELVER" ]; then
-		kdump_kver=`uname -r`
-	else
-		kdump_kver=$KDUMP_KERNELVER
-	fi
-
 	kdump_kernel="${KDUMP_BOOTDIR}/${KDUMP_IMG}-${kdump_kver}${KDUMP_IMG_EXT}"
-	kdump_initrd="${KDUMP_BOOTDIR}/initramfs-${kdump_kver}kdump.img"

 	_force_rebuild=`grep ^force_rebuild $KDUMP_CONFIG_FILE 2>/dev/null`
 	if [ $? -eq 0 ]; then
@@ -193,13 +259,9 @@ function check_rebuild()
 	extra_modules=`grep ^extra_modules $KDUMP_CONFIG_FILE`
 	[ -n "$extra_modules" ] && force_rebuild="1"

-	#check to see if dependent files has been modified
-	#since last build of the image file
-	if [ -f $kdump_initrd ]; then
-		image_time=`stat -c "%Y" $kdump_initrd 2>/dev/null`
-	else
-		image_time=0
-	fi
+	# Find initrd image time based on whether dependent files have been
+	# modified since last build of the image file
+	find_initrd_image_time

 	#also rebuild when cluster conf is changed and fence kdump is enabled.
 	check_fence_kdump $image_time && modified_files="cluster-cib"
@@ -225,12 +287,14 @@ function check_rebuild()
 		fi
 	done

-	if [ $image_time -eq 0 ]; then
-		echo -n "No kdump initial ramdisk found."; echo
-	elif [ "$force_rebuild" != "0" ]; then
-		echo -n "Force rebuild $kdump_initrd"; echo
+	if [ $dump_mode == "kdump" ]; then
+		if [ $image_time -eq 0 ]; then
+			echo -n "No kdump initial ramdisk found."; echo
+		elif [ "$force_rebuild" != "0" ]; then
+			echo -n "Force rebuild $kdump_initrd"; echo
+		fi
 	elif [ -n "$modified_files" ]; then
-		echo "Detected change(s) the following file(s):"
+		echo "Detected change(s) in the following file(s):"
 		echo -n " "; echo "$modified_files" | sed 's/\s/\n /g'
 	else
 		return 0
On 02/27/14 at 01:51pm, Hari Bathini wrote:
> @@ -225,12 +287,14 @@ function check_rebuild()
>  		fi
>  	done
>
> -	if [ $image_time -eq 0 ]; then
> -		echo -n "No kdump initial ramdisk found."; echo
> -	elif [ "$force_rebuild" != "0" ]; then
> -		echo -n "Force rebuild $kdump_initrd"; echo
> +	if [ $dump_mode == "kdump" ]; then
> +		if [ $image_time -eq 0 ]; then
> +			echo -n "No kdump initial ramdisk found."; echo
> +		elif [ "$force_rebuild" != "0" ]; then
> +			echo -n "Force rebuild $kdump_initrd"; echo
> +		fi
>  	elif [ -n "$modified_files" ]; then

The above if elif logic looks different from original ones?

> -		echo "Detected change(s) the following file(s):"
> +		echo "Detected change(s) in the following file(s):"
>  		echo -n " "; echo "$modified_files" | sed 's/\s/\n /g'
>  	else
>  		return 0
kexec mailing list kexec@lists.fedoraproject.org https://lists.fedoraproject.org/mailman/listinfo/kexec
On 04/01/2014 03:28 PM, Dave Young wrote:
On 02/27/14 at 01:51pm, Hari Bathini wrote:
>> @@ -225,12 +287,14 @@ function check_rebuild()
>>  		fi
>>  	done
>>
>> -	if [ $image_time -eq 0 ]; then
>> -		echo -n "No kdump initial ramdisk found."; echo
>> -	elif [ "$force_rebuild" != "0" ]; then
>> -		echo -n "Force rebuild $kdump_initrd"; echo
>> +	if [ $dump_mode == "kdump" ]; then
>> +		if [ $image_time -eq 0 ]; then
>> +			echo -n "No kdump initial ramdisk found."; echo
>> +		elif [ "$force_rebuild" != "0" ]; then
>> +			echo -n "Force rebuild $kdump_initrd"; echo
>> +		fi
>>  	elif [ -n "$modified_files" ]; then
>
> The above if elif logic looks different from original ones?

Messages are put under "kdump" dump mode if condition because:
1. there is no separate initial ramdisk for fadump as in the case of
   "kdump" mode.
2. there is a message in rebuild_fadump_initrd() that conveys the same.

Thanks
Hari

>> -		echo "Detected change(s) the following file(s):"
>> +		echo "Detected change(s) in the following file(s):"
>>  		echo -n " "; echo "$modified_files" | sed 's/\s/\n /g'
>>  	else
>>  		return 0
On 04/03/14 at 04:20pm, Hari Bathini wrote:
On 04/01/2014 03:28 PM, Dave Young wrote:
On 02/27/14 at 01:51pm, Hari Bathini wrote:
>>> @@ -225,12 +287,14 @@ function check_rebuild()
>>>  		fi
>>>  	done
>>>
>>> -	if [ $image_time -eq 0 ]; then
>>> -		echo -n "No kdump initial ramdisk found."; echo
>>> -	elif [ "$force_rebuild" != "0" ]; then
>>> -		echo -n "Force rebuild $kdump_initrd"; echo
>>> +	if [ $dump_mode == "kdump" ]; then
>>> +		if [ $image_time -eq 0 ]; then
>>> +			echo -n "No kdump initial ramdisk found."; echo
>>> +		elif [ "$force_rebuild" != "0" ]; then
>>> +			echo -n "Force rebuild $kdump_initrd"; echo
>>> +		fi
>>>  	elif [ -n "$modified_files" ]; then
>>
>> The above if elif logic looks different from original ones?
>
> Messages are put under "kdump" dump mode if condition because:
> 1. there is no separate initial ramdisk for fadump as in the case of
>    "kdump" mode..
> 2. there is a message in rebuild_fadump_initrd() that conveys the same..

For the case below:
[[ $dump_mode == "kdump" ]] && [[ $image_time -ne 0 ]] && [[ "$force_rebuild" == "0" ]] && [[ -n "$modified_files" ]]

The kdump image will not be rebuilt, right? Am I missing something?

> Thanks
> Hari
>
>>> -		echo "Detected change(s) the following file(s):"
>>> +		echo "Detected change(s) in the following file(s):"
>>>  		echo -n " "; echo "$modified_files" | sed 's/\s/\n /g'
>>>  	else
>>>  		return 0
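The case raised in this reply can be traced with a small sketch. It mirrors only the branch structure of the quoted hunk; `rebuild_branch` and the labels it prints are hypothetical, and it says nothing about the code that follows the hunk, which is not shown in this thread.

```shell
#!/bin/bash
# Illustrative only: prints which branch of the quoted if/elif is taken.
# It does not rebuild anything.
rebuild_branch() {
	local dump_mode="$1" image_time="$2" force_rebuild="$3" modified_files="$4"
	if [ "$dump_mode" = "kdump" ]; then
		if [ "$image_time" -eq 0 ]; then
			echo "no-initrd"
		elif [ "$force_rebuild" != "0" ]; then
			echo "force-rebuild"
		else
			echo "kdump-no-message"
		fi
	elif [ -n "$modified_files" ]; then
		echo "modified-files"
	else
		echo "return-0"
	fi
}

# The case from the reply: kdump mode, existing initrd, no forced
# rebuild, but modified files present.
rebuild_branch kdump 1393400000 0 "/etc/kdump.conf"   # prints kdump-no-message
```

Under this reading, the "Detected change(s)" message is only reachable when dump_mode is not "kdump", which is the difference from the original chain that the review points at.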
On 04/04/2014 10:59 AM, Dave Young wrote:
On 04/03/14 at 04:20pm, Hari Bathini wrote:
On 04/01/2014 03:28 PM, Dave Young wrote:
On 02/27/14 at 01:51pm, Hari Bathini wrote:
Take a backup of original initrd when fadump is used first time or when user has switched from kdump to fadump. This will allow us to fall back to original initrd when kdump service fails to rebuild the fadump ready default initrd. Also, if the user switches from fadump to kdump, then the original initrd will be restored when kdump script is run first time after the switch.
Signed-off-by: Mahesh Salgaonkarmahesh@linux.vnet.ibm.com Signed-off-by: Hari Bathinihbathini@linux.vnet.ibm.com
kdumpctl | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 83 insertions(+), 19 deletions(-)
diff --git a/kdumpctl b/kdumpctl index bbc2779..2eb52a5 100755 --- a/kdumpctl +++ b/kdumpctl @@ -15,6 +15,7 @@ FADUMP_REGISTER_SYS_NODE="/sys/kernel/fadump_registered" . /lib/kdump/kdump-lib.sh standard_kexec_args="-p" +declare -i image_time if [ -f /etc/sysconfig/kdump ]; then . /etc/sysconfig/kdump @@ -31,12 +32,25 @@ determine_dump_mode() # kdump shall be the default dump mode dump_mode="kdump"
- if [ -z "$KDUMP_KERNELVER" ]; then
kdump_kver=`uname -r`
- else
kdump_kver=$KDUMP_KERNELVER
- fi
- default_initrd="${KDUMP_BOOTDIR}/initramfs-${kdump_kver}.img"
- kdump_initrd="${KDUMP_BOOTDIR}/initramfs-${kdump_kver}kdump.img"
- default_initrd_bak="$default_initrd.default.bak"
- # Check if firmware-assisted dump is enabled # if yes, set the dump mode as fadump if is_fadump_capable; then echo "Using dump mode fadump" dump_mode="fadump" fi
- # Handle dump mode switch from kdump to fadump OR fadump to kdump
- handle_dump_mode_switch } # remove_cmdline_param <kernel cmdline> <param1> [<param2>] ... [<paramN>]
@@ -83,6 +97,10 @@ function save_core() function rebuild_initrd() {
- if [ $dump_mode == "fadump" ]; then
backup_default_initrd
- fi
- $MKDUMPRD $kdump_initrd $kdump_kver if [ $? != 0 ]; then echo "mkdumprd: failed to make kdump initrd" >&2
@@ -112,6 +130,61 @@ function check_executable() done } +backup_default_initrd() +{
- # Check if backup initrd is already present. If not, then
- # this is the first time fadump is being used OR user
- # has switched from kdump to fadump.
- # Take a backup of the original default initrd before
- # we rebuild default initrd for fadump support.
- if [ ! -e $default_initrd_bak ];then
echo "Backing up default initrd"
cp $default_initrd $default_initrd_bak
sync
- fi
+}
+handle_dump_mode_switch() +{
- if [ $dump_mode == "fadump" ]; then
if [ -e $kdump_initrd ];then
# This means user has switched from kdump to fadump.
# Remove kdump initrd which is no longer needed
rm -f $kdump_initrd
fi
- else
if [ -e $default_initrd_bak ];then
# !fadump and original initrd backup file exists.
# This means user has switched from fadump to kdump.
# Restore the original default initrd.
mv $default_initrd_bak $default_initrd
sync
fi
- fi
+}
+find_initrd_image_time() +{
- image_time=0
- # Check to see if dependent files have been modified
- # since last build of the image file
- if [ $dump_mode == "fadump" ]; then
# If this is the 1st time we are using fadump then let image_time be
# zero to force rebuild initrd. The non-existance of backup initrd
# means this is the 1st time fadump is being used. If it exists then
# return the image time of default initrd.
if [ -e $default_initrd_bak ]; then
image_time=`stat -c "%Y" $default_initrd 2>/dev/null`
fi
- else
if [ -f $kdump_initrd ]; then
image_time=`stat -c "%Y" $kdump_initrd 2>/dev/null`
return
fi
- fi
+}
- function check_config() { local nr
@@ -171,14 +244,7 @@ function check_rebuild() local extra_modules modified_files="" local _force_rebuild force_rebuild="0"
- if [ -z "$KDUMP_KERNELVER" ]; then
kdump_kver=`uname -r`
- else
kdump_kver=$KDUMP_KERNELVER
- fi
- kdump_kernel="${KDUMP_BOOTDIR}/${KDUMP_IMG}-${kdump_kver}${KDUMP_IMG_EXT}"
- kdump_initrd="${KDUMP_BOOTDIR}/initramfs-${kdump_kver}kdump.img" _force_rebuild=`grep ^force_rebuild $KDUMP_CONFIG_FILE 2>/dev/null` if [ $? -eq 0 ]; then
@@ -193,13 +259,9 @@ function check_rebuild() extra_modules=`grep ^extra_modules $KDUMP_CONFIG_FILE` [ -n "$extra_modules" ] && force_rebuild="1"
- #check to see if dependent files has been modified
- #since last build of the image file
- if [ -f $kdump_initrd ]; then
image_time=`stat -c "%Y" $kdump_initrd 2>/dev/null`
- else
image_time=0
- fi
- # Find initrd image time based on whether dependent files have been
- # modified since last build of the image file
- find_initrd_image_time #also rebuild when cluster conf is changed and fence kdump is enabled. check_fence_kdump $image_time && modified_files="cluster-cib"
@@ -225,12 +287,14 @@ function check_rebuild() fi done
- if [ $image_time -eq 0 ]; then
echo -n "No kdump initial ramdisk found."; echo
- elif [ "$force_rebuild" != "0" ]; then
echo -n "Force rebuild $kdump_initrd"; echo
- if [ $dump_mode == "kdump" ]; then
if [ $image_time -eq 0 ]; then
echo -n "No kdump initial ramdisk found."; echo
elif [ "$force_rebuild" != "0" ]; then
echo -n "Force rebuild $kdump_initrd"; echo
elif [ -n "$modified_files" ]; thenfi
The above if elif logic looks different from original ones?
Messages are put under the "kdump" dump mode if condition because:
1. there is no separate initial ramdisk for fadump, as there is in the case of "kdump" mode.
2. there is a message in rebuild_fadump_initrd() that conveys the same.
For the case below: [[ $dump_mode == "kdump" ]] && [[ $image_time -ne 0 ]] && [[ "$force_rebuild" == "0" ]] && [[ -n "$modified_files" ]]
The kdump image will not be rebuilt, right? Am I missing something?
Oops! Right, I missed that. Thanks for pointing it out. I will update accordingly.
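A minimal sketch of how the rebuild decision could be kept independent of the dump mode, so that modified files trigger a rebuild in both modes and only the messages differ. The helper name rebuild_reason and its argument order are hypothetical and not part of kdumpctl; only the three conditions (image_time, force_rebuild, modified_files) come from the patch under discussion.

```shell
# Hypothetical helper, not part of kdumpctl.
# Usage: rebuild_reason <image_time> <force_rebuild> <modified_files>
# Prints why a rebuild is needed and returns 0, or returns 1 when the
# existing initrd is still up to date.
rebuild_reason()
{
	if [ "$1" -eq 0 ]; then
		echo "missing"
	elif [ "$2" != "0" ]; then
		echo "forced"
	elif [ -n "$3" ]; then
		echo "modified"
	else
		return 1
	fi
}
```

A caller could then print mode-specific messages from the returned reason and invoke rebuild_initrd whenever the helper returns 0, regardless of whether dump_mode is "kdump" or "fadump".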
Thanks Hari
echo "Detected change(s) the following file(s):"
echo -n " "; echo "$modified_files" | sed 's/\s/\n /g' else return 0echo "Detected change(s) in the following file(s):"
The current kdump infrastructure builds a separate initrd, which kexec-tools then loads into memory for use by the kdump kernel. Firmware assisted dump (FADUMP), however, does not use the kexec-based approach. After a crash, firmware reboots the partition and loads the grub boot loader as in the normal boot process. Hence, in the FADUMP approach, the second kernel (after the crash) always uses the default (OS built) initrd. So, to support FADUMP, the dump capturing steps must be added to the default initrd.
The current kdumpctl script implementation already has the code to build an initrd using mkdumprd. This patch uses the new '--rebuild' option introduced in dracut to incrementally build the initramfs image. A backup of the default initrd image is taken before rebuilding it with fadump support. If this operation fails, the default initrd image is restored.
The kexec-tools package in rhel7 is now enhanced to insert an out-of-tree kdump module for dracut, which is responsible for adding vmcore capture steps into the initrd if dracut is invoked with the "IN_KDUMP" environment variable set to 1. The mkdumprd script exports the "IN_KDUMP=1" environment variable before invoking dracut to build the kdump initrd. This patch relies on this existing mechanism of the kdump init script.
The dracut patch that introduces the '--rebuild' option has been accepted upstream (commit id: 659dc319d950999f8d191a81fdc4d3114e9213de).
Signed-off-by: Mahesh Salgaonkar mahesh@linux.vnet.ibm.com Signed-off-by: Hari Bathini hbathini@linux.vnet.ibm.com --- kdumpctl | 47 +++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 43 insertions(+), 4 deletions(-)
diff --git a/kdumpctl b/kdumpctl index 2eb52a5..d58d64c 100755 --- a/kdumpctl +++ b/kdumpctl @@ -95,17 +95,46 @@ function save_core() fi }
-function rebuild_initrd() +rebuild_fadump_initrd() { - if [ $dump_mode == "fadump" ]; then - backup_default_initrd + if [ ! -s "$default_initrd" ]; then + echo "No default initrd found to rebuild for fadump support!" + return 1 + fi + backup_default_initrd + + echo "Rebuilding $default_initrd with fadump support" + $MKDUMPRD --rebuild $default_initrd --kver $kdump_kver + if [ $? != 0 ]; then + echo "mkdumprd: failed to make initrd with fadump support" >&2 + restore_default_initrd + return 1 fi
+ return 0 +} + +rebuild_kdump_initrd() +{ + echo "Rebuilding $kdump_initrd" $MKDUMPRD $kdump_initrd $kdump_kver if [ $? != 0 ]; then echo "mkdumprd: failed to make kdump initrd" >&2 return 1 fi + + return 0 +} + +function rebuild_initrd() +{ + if [ $dump_mode == "fadump" ]; then + rebuild_fadump_initrd + else + rebuild_kdump_initrd + fi + + return $? }
#$1: the files to be checked with IFS=' ' @@ -163,6 +192,17 @@ handle_dump_mode_switch() fi }
+restore_default_initrd() +{ + # We have failed to rebuild initrd for fadump support. + # Restore the original default initrd. + if [ -f $default_initrd_bak ];then + echo "Restored default initrd" + mv $default_initrd_bak $default_initrd + sync + fi +} + find_initrd_image_time() { image_time=0 @@ -300,7 +340,6 @@ function check_rebuild() return 0 fi
- echo "Rebuilding $kdump_initrd" rebuild_initrd return $? }
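The backup, rebuild and restore-on-failure flow described for this patch can be sketched in isolation as follows. This is only an illustration under assumptions: rebuild_with_backup is a hypothetical name, and the build command is passed in as arguments so the flow can be tried without mkdumprd. The ".default.bak" suffix follows the default_initrd_bak naming used in the earlier patch.

```shell
# Hypothetical helper, not part of kdumpctl.
# Usage: rebuild_with_backup <initrd> <build command> [args...]
# The build command is invoked with the initrd path appended.
rebuild_with_backup()
{
	_initrd=$1
	shift
	_backup="$_initrd.default.bak"

	if [ ! -s "$_initrd" ]; then
		echo "No default initrd found to rebuild" >&2
		return 1
	fi

	# Back up only once, so the backup always holds the original image.
	if [ ! -e "$_backup" ]; then
		cp "$_initrd" "$_backup" || return 1
	fi

	if ! "$@" "$_initrd"; then
		echo "Rebuild failed, restoring $_initrd" >&2
		mv "$_backup" "$_initrd"
		return 1
	fi
	return 0
}
```

Quoting the path variables, which the patch does not do, keeps the sketch safe for paths that contain spaces.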
This cleanup patch removes the unnecessary "function" keyword from all function definitions in the kdumpctl script.
Signed-off-by: Hari Bathini hbathini@linux.vnet.ibm.com --- kdumpctl | 38 +++++++++++++++++++------------------- 1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/kdumpctl b/kdumpctl index d58d64c..367e1e3 100755 --- a/kdumpctl +++ b/kdumpctl @@ -56,7 +56,7 @@ determine_dump_mode() # remove_cmdline_param <kernel cmdline> <param1> [<param2>] ... [<paramN>] # Remove a list of kernel parameters from a given kernel cmdline and print the result. # For each "arg" in the removing params list, "arg" and "arg=xxx" will be removed if exists. -function remove_cmdline_param() +remove_cmdline_param() { local cmdline=$1 shift @@ -70,7 +70,7 @@ function remove_cmdline_param() echo $cmdline }
-function save_core() +save_core() { coredir="/var/crash/`date +"%Y-%m-%d-%H:%M"`"
@@ -126,7 +126,7 @@ rebuild_kdump_initrd() return 0 }
-function rebuild_initrd() +rebuild_initrd() { if [ $dump_mode == "fadump" ]; then rebuild_fadump_initrd @@ -138,7 +138,7 @@ function rebuild_initrd() }
#$1: the files to be checked with IFS=' ' -function check_exist() +check_exist() { for file in $1; do if [ ! -f "$file" ]; then @@ -149,7 +149,7 @@ function check_exist() }
#$1: the files to be checked with IFS=' ' -function check_executable() +check_executable() { for file in $1; do if [ ! -x "$file" ]; then @@ -225,7 +225,7 @@ find_initrd_image_time() fi }
-function check_config() +check_config() { local nr
@@ -262,7 +262,7 @@ function check_config()
# check_fence_kdump <image timestamp> # return 0 if fence_kdump is configured and kdump initrd needs to be rebuilt -function check_fence_kdump() +check_fence_kdump() { local image_time=$1 local cib_time @@ -279,7 +279,7 @@ function check_fence_kdump() return 0 }
-function check_rebuild() +check_rebuild() { local extra_modules modified_files="" local _force_rebuild force_rebuild="0" @@ -346,7 +346,7 @@ function check_rebuild()
# This function check iomem and determines if we have more than # 4GB of ram available. Returns 1 if we do, 0 if we dont -function need_64bit_headers() +need_64bit_headers() { return `tail -n 1 /proc/iomem | awk '{ split ($1, r, "-"); \ print (strtonum("0x" r[2]) > strtonum("0xffffffff")); }'` @@ -355,7 +355,7 @@ function need_64bit_headers() # Load the kdump kerel specified in /etc/sysconfig/kdump # If none is specified, try to load a kdump kernel with the same version # as the currently running kernel. -function load_kdump() +load_kdump() { MEM_RESERVED=$(cat /sys/kernel/kexec_crash_size) if [ $MEM_RESERVED -eq 0 ] @@ -408,7 +408,7 @@ function load_kdump() fi }
-function check_ssh_config() +check_ssh_config() { while read config_opt config_val; do # remove inline comments after the end of a directive. @@ -441,7 +441,7 @@ function check_ssh_config() return 0 }
-function check_ssh_target() +check_ssh_target() { local _ret ssh -q -i $SSH_KEY_LOCATION -o BatchMode=yes $DUMP_TARGET mkdir -p $SAVE_PATH @@ -453,7 +453,7 @@ function check_ssh_target() return 0 }
-function propagate_ssh_key() +propagate_ssh_key() { check_ssh_config if [ $? -ne 0 ]; then @@ -509,7 +509,7 @@ check_current_fadump_status() return 1 }
-function check_current_kdump_status() +check_current_kdump_status() { rc=`cat /sys/kernel/kexec_crash_loaded` if [ $rc == 1 ]; then @@ -530,7 +530,7 @@ check_current_status() return $? }
-function save_raw() +save_raw() { local kdump_dir local raw_target @@ -641,7 +641,7 @@ selinux_relabel() # is 1 and SetupMode is 0, then secure boot is being enforced. # # Assume efivars is mounted at /sys/firmware/efi/efivars. -function is_secure_boot_enforced() +is_secure_boot_enforced() { local secure_boot_file setup_mode_file local secure_boot_byte setup_mode_byte @@ -661,7 +661,7 @@ function is_secure_boot_enforced() return 1 }
-function check_kdump_feasibility() +check_kdump_feasibility() { if is_secure_boot_enforced; then echo "Secure Boot is Enabled. Kdump service can't be started. Disable Secure Boot and retry" @@ -707,7 +707,7 @@ start_dump() return $? }
-function start() +start() { check_config if [ $? -ne 0 ]; then @@ -782,7 +782,7 @@ stop_kdump() return 0 }
-function stop() +stop() { if [ $dump_mode == "fadump" ]; then stop_fadump
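For reference, a small illustration of the definition style this cleanup converges on. The function name show_mode is made up for the example. The "name()" form is the one defined by POSIX sh, while the "function" keyword is a bash/ksh extension, so dropping it changes nothing about how the functions behave.

```shell
# Portable definition style, as used throughout kdumpctl after this patch.
# show_mode is a hypothetical example function.
show_mode()
{
	echo "dump mode: ${1:-kdump}"
}
```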
The script dracut-kdump.sh is responsible for capturing vmcore during second kernel boot. Currently this script gets installed into kdump initrd as part of kdumpbase dracut module. Since it's always installed into kdump initrd, this script assumes that '/proc/vmcore' will always be present when it is invoked.
With fadump support, 'dracut-kdump.sh' script also gets installed into default initrd to capture vmcore generated by firmware assisted dump. Thus in fadump case, the same initrd is going to be used for normal boot as well as boot after system crash. Hence a check is required to see if '/proc/vmcore' file exists before executing steps to capture vmcore. This check will help to bypass the vmcore capture steps during normal boot process.
Signed-off-by: Mahesh Salgaonkar mahesh@linux.vnet.ibm.com Signed-off-by: Hari Bathini hbathini@linux.vnet.ibm.com --- dracut-kdump.sh | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/dracut-kdump.sh b/dracut-kdump.sh index d9e65ac..9634740 100755 --- a/dracut-kdump.sh +++ b/dracut-kdump.sh @@ -1,5 +1,8 @@ #!/bin/sh
+# continue only if /proc/vmcore is present. +[ ! -f /proc/vmcore ] && return + exec &> /dev/console . /lib/dracut-lib.sh . /lib/kdump-lib.sh
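A hedged sketch of the same guard written as a function, so it can be exercised outside the initrd. The names should_capture_vmcore and kdump_hook are hypothetical, and the optional path argument exists only for testing. Note that a bare top-level "return", as in the patch, only works when the script is sourced rather than executed; the sketch assumes the hook is sourced by dracut.

```shell
# Hypothetical helpers, not part of dracut-kdump.sh.
# The optional argument lets the check run against a path other
# than /proc/vmcore.
should_capture_vmcore()
{
	[ -f "${1:-/proc/vmcore}" ]
}

kdump_hook()
{
	# Normal boot: no vmcore is exported, so skip the capture steps.
	should_capture_vmcore "$1" || return 0
	echo "capturing vmcore"
}
```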
This patch adds fadump howto document to kexec-tools. The document is prepared in reference to kexec-kdump-howto.txt document.
Signed-off-by: Mahesh Salgaonkar mahesh@linux.vnet.ibm.com Signed-off-by: Hari Bathini hbathini@linux.vnet.ibm.com --- fadump-howto.txt | 428 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 428 insertions(+) create mode 100644 fadump-howto.txt
diff --git a/fadump-howto.txt b/fadump-howto.txt new file mode 100644 index 0000000..30748af --- /dev/null +++ b/fadump-howto.txt @@ -0,0 +1,428 @@ +Firmware assisted dump (fadump) HOWTO + +Introduction + +Firmware assisted dump is a new feature in the 3.4 mainline kernel supported +only on powerpc architecture. The goal of firmware-assisted dump is to enable +the dump of a crashed system, and to do so from a fully-reset system, and to +minimize the total elapsed time until the system is back in production use. A +complete documentation on implementation can be found at +Documentation/powerpc/firmware-assisted-dump.txt in upstream linux kernel tree +from 3.4 version and above. + +Please note that the firmware-assisted dump feature is only available on Power6 +and above systems with recent firmware versions. + +Overview + +Fadump + +Fadump is a robust kernel crash dumping mechanism to get reliable kernel crash +dump with assistance from firmware. This approach does not use kexec, instead +firmware assists in booting the kdump kernel while preserving memory contents. +Unlike kdump, the system is fully reset, and loaded with a fresh copy of the +kernel. In particular, PCI and I/O devices are reinitialized and are in a +clean, consistent state. This second kernel, often called a capture kernel, +boots with very little memory and captures the dump image. + +The first kernel registers the sections of memory with the Power firmware for +dump preservation during OS initialization. These registered sections of memory +are reserved by the first kernel during early boot. When a system crashes, the +Power firmware fully resets the system, preserves all the system memory +contents, and saves the low memory (boot memory of size larger of 5% of system +RAM or 256MB) of RAM to the previously registered region. It will also save +system registers, and hardware PTE's. + +Fadump is supported only on ppc64 platform. The standard kernel and capture +kernel are one and the same on ppc64. 
+ +If you're reading this document, you should already have kexec-tools +installed. If not, you can install it via the following command: + + # yum install kexec-tools + +Fadump Operational Flow: + +Like kdump, fadump also exports the ELF formatted kernel crash dump through +/proc/vmcore. Hence existing kdump infrastructure can be used to capture fadump +vmcore. The idea is to keep the functionality transparent to end user. From +user perspective there is no change in the way kdump init script works. + +However, unlike kdump, fadump does not pre-load kdump kernel and initrd into +reserved memory, instead it always uses default OS initrd during second boot +after crash. Hence, for fadump, we rebuild the new kdump initrd and replace it +with default initrd. Before replacing existing default initrd we take a backup +of original default initrd which is restored back when user decides to switch +to kdump. The dracut package has been enhanced to rebuild the default initrd +with vmcore capture steps as per /etc/kdump.conf + +The control flow of fadump works as follows: +01. System panics. +02. At the crash, kernel informs power firmware that kernel has crashed. +03. Firmware takes the control and reboots the entire system preserving + only the memory (resets all other devices). +04. The reboot follows the normal booting process (non-kexec). +05. The boot loader loads the default kernel and initrd from /boot +06. The default initrd loads and runs /init +07. dracut-kdump.sh script present in fadump aware default initrd checks if + '/proc/vmcore' file exists before executing steps to capture vmcore. + (This check will help to bypass the vmcore capture steps during normal boot + process.) +09. Captures dump according to /etc/kdump.conf +10. Is dump capture successful (yes goto 12, no goto 11) +11. Perform the default action specified in /etc/kdump.conf (Default action + is reboot, if unspecified) +12. 
Reboot + + +How to configure fadump: + +Again, we assume if you're reading this document, you should already have +kexec-tools installed. If not, you install it via the following command: + + # yum install kexec-tools + +To be able to do much of anything interesting in the way of debug analysis, +you'll also need to install the kernel-debuginfo package, of the same arch +as your running kernel, and the crash utility: + + # yum --enablerepo=*debuginfo install kernel-debuginfo.$(uname -m) crash + +Next up, we need to modify some boot parameters to enable firmware assisted +dump. With the help of grubby, it's very easy to append "fadump=on" to the end +of your kernel boot parameters. Optionally, user can also append +'fadump_reserve_mem=X' kernel cmdline to specify size of the memory to reserve +for boot memory dump preservation. + + # grubby --args="fadump=on" --update-kernel=/boot/vmlinuz-`uname -r` + +The term 'boot memory' means size of the low memory chunk that is required for +a kernel to boot successfully when booted with restricted memory. By default, +the boot memory size will be the larger of 5% of system RAM or 256MB. +Alternatively, user can also specify boot memory size through boot parameter +'fadump_reserve_mem=' which will override the default calculated size. Use this +option if default boot memory size is not sufficient for second kernel to boot +successfully. + +After making said changes, reboot your system, so that the specified memory is +reserved and left untouched by the normal system. Take note that the output of +'free -m' will show X MB less memory than without this parameter, which is +expected. If you see OOM (Out Of Memory) error messages while loading capture +kernel, then you should bump up the memory reservation size. 
+ +Now that you've got that reserved memory region set up, you want to turn on +the kdump init script: + + # chkconfig kdump on + +Then, start up kdump as well: + + # systemctl start kdump.service + +This should turn on the firmware assisted functionality in kernel by +echo'ing 1 to /sys/kernel/fadump_registered, leaving the system ready +to capture a vmcore upon crashing. To test this out, you can force-crash +your system by echo'ing a c into /proc/sysrq-trigger: + + # echo c > /proc/sysrq-trigger + +You should see some panic output, followed by the system reset and booting into +fresh copy of kernel. When default initrd loads and runs /init, vmcore should +be copied out to disk (by default, in /var/crash/YYYY-MM-DD-HH:MM/vmcore), +then the system rebooted back into your normal kernel. + +Once back to your normal kernel, you can use the previously installed crash +utility in conjunction with the previously installed kernel-debuginfo to +perform postmortem analysis: + + # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux + /var/crash/2006-08-23-15:34/vmcore + + crash> bt + +and so on... + +Saving vmcore-dmesg.txt +---------------------- +Kernel log buffers are among the most important information available +in vmcore. Now before saving vmcore, kernel log buffers are extracted +from /proc/vmcore and saved into a file vmcore-dmesg.txt. After +vmcore-dmesg.txt, vmcore is saved. Destination disk and directory for +vmcore-dmesg.txt is same as vmcore. Note that kernel log buffers will +not be available if dump target is raw device. + +Dump Triggering methods: + +This section talks about the various ways, other than a Kernel Panic, in which +fadump can be triggered. The following methods assume that fadump is configured +on your system, with the scripts enabled as described in the section above. + +1) AltSysRq C + +FAdump can be triggered with the combination of the 'Alt','SysRq' and 'C' +keyboard keys. 
Please refer to the following link for more details: + +http://kbase.redhat.com/faq/FAQ_43_5559.shtm + +In addition, on PowerPC boxes, fadump can also be triggered via Hardware +Management Console(HMC) using 'Ctrl', 'O' and 'C' keyboard keys. + +2) Kernel OOPs + +If we want to generate a dump everytime the Kernel OOPses, we can achieve this +by setting the 'Panic On OOPs' option as follows: + + # echo 1 > /proc/sys/kernel/panic_on_oops + +3) PowerPC specific methods: + +On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger(if +XMON is configured). To configure XMON one needs to compile the kernel with +the CONFIG_XMON and CONFIG_XMON_DEFAULT options, or by compiling with +CONFIG_XMON and booting the kernel with xmon=on option. + +Following are the ways to remotely issue a soft reset on PowerPC boxes, which +would drop you to XMON. Pressing a 'X' (capital alphabet X) followed by an +'Enter' here will trigger the dump. + +3.1) HMC + +Hardware Management Console(HMC) available on Power4 and Power5 machines allow +partitions to be reset remotely. This is specially useful in hang situations +where the system is not accepting any keyboard inputs. + +Once you have HMC configured, the following steps will enable you to trigger +fadump via a soft reset: + +On Power4 + Using GUI + + * In the right pane, right click on the partition you wish to dump. + * Select "Operating System->Reset". + * Select "Soft Reset". + * Select "Yes". + + Using HMC Commandline + + # reset_partition -m <machine> -p <partition> -t soft + +On Power5 + Using GUI + + * In the right pane, right click on the partition you wish to dump. + * Select "Restart Partition". + * Select "Dump". + * Select "OK". + + Using HMC Commandline + + # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar + +3.2) Blade Management Console for Blade Center + +To initiate a dump operation, go to Power/Restart option under "Blade Tasks" in +the Blade Management Console. 
Select the corresponding blade for which you want +to initiate the dump and then click "Restart blade with NMI". This issues a +system reset and invokes xmon debugger. + + +Advanced Setups: + +In addition to being able to capture a vmcore to your system's local file +system, fadump can be configured to capture a vmcore to a number of other +locations, including a raw disk partition, a dedicated file system, an NFS +mounted file system, or a remote system via ssh/scp. Additional options +exist for specifying the relative path under which the dump is captured, +what to do if the capture fails, and for compressing and filtering the dump +(so as to produce smaller, more manageable, vmcore files). + +In theory, dumping to a location other than the local file system should be +safer than fadump's default setup, as it's possible the default setup will try +dumping to a file system that has become corrupted. The raw disk partition and +dedicated file system options allow you to still dump to the local system, +but without having to remount your possibly corrupted file system(s), +thereby decreasing the chance a vmcore won't be captured. Dumping to an +NFS server or remote system via ssh/scp also has this advantage, as well +as allowing for the centralization of vmcore files, should you have several +systems from which you'd like to obtain vmcore files. Of course, note that +these configurations could present problems if your network is unreliable. + +Advanced setups are configured via modifications to /etc/kdump.conf, +which out of the box, is fairly well documented itself. Any alterations to +/etc/kdump.conf should be followed by a restart of the kdump service, so +the changes can be incorporated in the fadump aware default initrd. Restarting +the kdump service is as simple as '/sbin/systemctl restart kdump.service'. 
+ + +Note that kdump.conf is used as a configuration mechanism for capturing dump +files from the initramfs (in the interests of safety), the root file system is +mounted, and the init process is started, only as a last resort if the +initramfs fails to capture the vmcore. As such, configuration made in +/etc/kdump.conf is only applicable to capture recorded in the initramfs. If +for any reason the init process is started on the root file system, only a +simple copying of the vmcore from /proc/vmcore to /var/crash/$DATE/vmcore will +be performed. + +For both local filesystem and nfs dump the dump target must be mounted before +building fadump aware initramfs. That means one needs to put an entry for the +dump file system in /etc/fstab so that after reboot when kdump service starts, +it can find the dump target and build initramfs instead of failing. +Usually the dump target should be used only for fadump. If you worry about +someone uses the filesystem for something else other than dumping vmcore +you can mount it as read-only. Mkdumprd will still remount it as read-write +for creating dump directory and will move it back to read-only afterwards. + +Raw partition + +Raw partition dumping requires that a disk partition in the system, at least +as large as the amount of memory in the system, be left unformatted. Assuming +/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with +'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly +onto partition /dev/vg/lv_kdump. Restart the kdump service via +'/sbin/systemctl restart kdump.service' to commit this change to your fadump +aware default initrd. Dump target should be persistent device name, such as lvm +or device mapper canonical name. + +Dedicated file system + +Similar to raw partition dumping, you can format a partition with the file +system of your choice, Again, it should be at least as large as the amount +of memory in the system. 
Assuming /dev/vg/lv_kdump has been +formatted ext4, specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a +vmcore file will be copied onto the file system after it has been mounted. +Dumping to a dedicated partition has the advantage that you can dump multiple +vmcores to the file system, space permitting, without overwriting previous ones, +as would be the case in a raw partition setup. Restart the kdump service via +'/sbin/systemctl restart kdump.service' to commit this change to +your fadump aware default initrd. Note that for local file systems ext4 and +ext2 are supported as dumpable targets. Kdump will not prevent you from +specifying other filesystems, and they will most likely work, but their +operation cannot be guaranteed. For instance specifying a vfat filesystem or +msdos filesystem will result in a successful load of the kdump service, but +during crash recovery, the dump will fail if the system has more than 2GB of +memory (since vfat and msdos filesystems do not support more than 2GB files). +Be careful of your filesystem selection when using this target. + +It is recommended to use persistent device names or UUID/LABEL for file system +dumps. One example of persistent device is /dev/vg/<devname>. + +NFS mount + +Dumping over NFS requires an NFS server configured to export a file system +with full read/write access for the root user. All operations done within +the fadump aware default initrd are done as root, and to write out a vmcore +file, we obviously must be able to write to the NFS mount. Configuring an NFS +server is outside the scope of this document, but either the no_root_squash +or anonuid options on the NFS server side are likely of interest to permit +the fadump aware default initrd operations to write to the NFS mount as root. 
+ +Assuming you're exporting /dump on the machine nfs-server.example.com, +once the mount is properly configured, specify it in kdump.conf, via +'nfs nfs-server.example.com:/dump'. The server portion can be specified either +by host name or IP address. Following a system crash, the fadump aware default +initrd will mount the NFS mount and copy out the vmcore to your NFS server. +Restart the kdump service via '/sbin/systemctl restart kdump.service' to commit +this change to your fadump aware initrd. + +Remote system via ssh/scp + +Dumping over ssh/scp requires setting up passwordless ssh keys for every +machine you wish to have dump via this method. First up, configure kdump.conf +for ssh/scp dumping, adding a config line of 'ssh user@server', where 'user' +can be any user on the target system you choose, and 'server' is the host +name or IP address of the target system. Using a dedicated, restricted user +account on the target system is recommended, as there will be keyless ssh +access to this account. + +Once kdump.conf is appropriately configured, issue the command +'kdumpctl propagate' to automatically set up the ssh host keys and transmit +the necessary bits to the target server. You'll have to type in 'yes' +to accept the host key for your target server if this is the first time +you've connected to it, and then input the target system user's password +to send over the necessary ssh key file. Restart the kdump service via +'/sbin/systemctl restart kdump.service' to commit this change to the fadump +aware default initrd. + +Path + +By default, local file system vmcore files are written to /var/crash/%DATE +on the local system, ssh/scp dumps to /var/crash/%HOST-%DATE on the target +system, dedicated file system partition dumps to ./var/crash/%DATE, and +NFS dumps to ./var/crash/%HOST-%DATE, the latter two both relative to +their respective mount points within the fadump initrd (usually /mnt). 
The +'/var/crash' portion of the path can be overridden using kdump.conf's 'path' +variable, should you wish to write the vmcore out to a different location. For +example, 'path /data/coredumps' would lead to vmcore files being written to +/data/coredumps/%DATE if you were dumping to your local file system. Note +that the path option is ignored if your kdump configuration results in the +core being saved from the initscripts in the root filesystem. + +Kdump Post-Capture Executable + +It is possible to specify a custom script or binary you wish to run following +an attempt to capture a vmcore. The executable is passed an exit code from +the capture process, which can be used to trigger different actions from +within your post-capture executable. + +Kdump Pre-Capture Executable + +It is possible to specify a custom script or binary you wish to run before +capturing a vmcore. Exit status of this binary is interpreted: +0 - continue with dump process as usual +non 0 - reboot the system + +Extra Binaries + +If you have specific binaries or scripts you want to have made available +within your fadump aware default initrd, you can specify them by their full +path, and they will be included in your initrd, along with all dependent +libraries. This may be particularly useful for those running post-capture +scripts that rely on other binaries. + +Extra Modules + +By default, only the bare minimum of kernel modules will be included in your +fadump aware default initrd. Should you wish to capture your vmcore files to a +non-boot-path storage device, such as an iscsi target disk or clustered file +system, you may need to manually specify additional kernel modules to load into +your fadump aware default initrd. + +Default action +============== +Default action specifies what to do when dump to configured dump target +fails. By default, default action is "reboot" and that is system reboots +if attempt to save dump to dump target fails. 
+ +There are other default actions available though. + +- dump_to_rootfs + This option tries to mount root and save dump on root filesystem + in a path specified by "path". This option will generally make + sense when dump target is not root filesystem. For example, if + dump is being saved over network using "ssh" then one can specify + default to "dump_to_rootfs" to try saving dump to root filesystem + if dump over network fails. + +- shell + Drop into a shell session inside initramfs. +- halt + Halt system after failure +- poweroff + Poweroff system after failure. + +Compression and filtering + +Refer "Compression and filtering" section in "kexec-kdump-howto.txt" document. +Compression and filtering are same for kdump & fadump. + + +Notes on rootfs mount: +Dracut is designed to mount rootfs by default. If rootfs mounting fails it +will refuse to go on. So fadump leaves rootfs mounting to dracut currently. +We make the assumtion that proper root= cmdline is being passed to dracut +initramfs for the time being. If you need modify "KDUMP_COMMANDLINE=" in +/etc/sysconfig/kdump, you will need to make sure that appropriate root= +options are copied from /proc/cmdline. In general it is best to append +command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing +the original command line completely.
On 02/27/14 at 01:51pm, Hari Bathini wrote:
This patch adds fadump howto document to kexec-tools. The document is prepared in reference to kexec-kdump-howto.txt document.
Signed-off-by: Mahesh Salgaonkar mahesh@linux.vnet.ibm.com Signed-off-by: Hari Bathini hbathini@linux.vnet.ibm.com
fadump-howto.txt | 428 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 428 insertions(+) create mode 100644 fadump-howto.txt
diff --git a/fadump-howto.txt b/fadump-howto.txt new file mode 100644 index 0000000..30748af --- /dev/null +++ b/fadump-howto.txt @@ -0,0 +1,428 @@ +Firmware assisted dump (fadump) HOWTO
+Introduction
+Firmware assisted dump is a new feature in the 3.4 mainline kernel supported +only on powerpc architecture. The goal of firmware-assisted dump is to enable +the dump of a crashed system, and to do so from a fully-reset system, and to +minimize the total elapsed time until the system is back in production use. A +complete documentation on implementation can be found at +Documentation/powerpc/firmware-assisted-dump.txt in upstream linux kernel tree +from 3.4 version and above.
+Please note that the firmware-assisted dump feature is only available on Power6 +and above systems with recent firmware versions.
+Overview
+Fadump
+Fadump is a robust kernel crash dumping mechanism to get reliable kernel crash +dump with assistance from firmware. This approach does not use kexec, instead +firmware assists in booting the kdump kernel while preserving memory contents. +Unlike kdump, the system is fully reset, and loaded with a fresh copy of the +kernel. In particular, PCI and I/O devices are reinitialized and are in a +clean, consistent state. This second kernel, often called a capture kernel, +boots with very little memory and captures the dump image.
+The first kernel registers the sections of memory with the Power firmware for +dump preservation during OS initialization. These registered sections of memory +are reserved by the first kernel during early boot. When a system crashes, the +Power firmware fully resets the system, preserves all the system memory +contents, save the the low memory (boot memory of size larger of 5% of system
s/the the/the
+RAM or 256MB) of RAM to the previous registered region. It will also save +system registers, and hardware PTE's.
+Fadump is supported only on ppc64 platform. The standard kernel and capture +kernel are one and the same on ppc64.
+If you're reading this document, you should already have kexec-tools +installed. If not, you can install it via the following command:
- # yum install kexec-tools
+Fadump Operational Flow:
+Like kdump, fadump also exports the ELF formatted kernel crash dump through +/proc/vmcore. Hence existing kdump infrastructure can be used to capture fadump +vmcore. The idea is to keep the functionality transparent to end user. From +user perspective there is no change in the way kdump init script works.
+However, unlike kdump, fadump does not pre-load kdump kernel and initrd into +reserved memory, instead it always uses default OS initrd during second boot +after crash. Hence, for fadump, we rebuild the new kdump initrd and replace it +with default initrd. Before replacing existing default initrd we take a backup +of original default initrd which is restored back when user decides to switch +to kdump. The dracut package has been enhanced to rebuild the default initrd +with vmcore capture steps as per /etc/kdump.conf
+The control flow of fadump works as follows:
+01. System panics.
+02. At the crash, kernel informs power firmware that kernel has crashed.
+03. Firmware takes the control and reboots the entire system preserving
+    only the memory (resets all other devices).
+04. The reboot follows the normal booting process (non-kexec).
+05. The boot loader loads the default kernel and initrd from /boot
+06. The default initrd loads and runs /init
+07. dracut-kdump.sh script present in fadump aware default initrd checks if
+    '/proc/vmcore' file exists before executing steps to capture vmcore.
+    (This check will help to bypass the vmcore capture steps during normal boot
+     process.)
+09. Captures dump according to /etc/kdump.conf
+10. Is dump capture successful (yes goto 12, no goto 11)
+11. Perform the default action specified in /etc/kdump.conf (Default action
+    is reboot, if unspecified)
+12. Reboot
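Step 07 in the flow above can be pictured as a small guard at the top of the capture script. The sketch below is only an illustration and is not the code from dracut-kdump.sh in this series; the function name should_capture_vmcore and its optional path argument are hypothetical and exist mainly so the check can be exercised without crashing a machine.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when a vmcore file is present.
# The path defaults to /proc/vmcore; the argument is an assumption
# added here for testing, not something the patch defines.
should_capture_vmcore() {
    vmcore_path="${1:-/proc/vmcore}"
    [ -e "$vmcore_path" ]
}

# Example use at the top of a capture script:
#   should_capture_vmcore || exit 0   # normal boot, nothing to capture
```

On a normal boot the guard fails and the capture steps are skipped; after a crash the file exists and capture proceeds.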
+How to configure fadump:
+Again, we assume if you're reading this document, you should already have +kexec-tools installed. If not, you can install it via the following command:
- # yum install kexec-tools
+To be able to do much of anything interesting in the way of debug analysis, +you'll also need to install the kernel-debuginfo package, of the same arch +as your running kernel, and the crash utility:
- # yum --enablerepo=*debuginfo install kernel-debuginfo.$(uname -m) crash
+Next up, we need to modify some boot parameters to enable firmware assisted +dump. With the help of grubby, it's very easy to append "fadump=on" to the end +of your kernel boot parameters. Optionally, user can also append +'fadump_reserve_mem=X' kernel cmdline to specify size of the memory to reserve +for boot memory dump preservation.
- # grubby --args="fadump=on" --update-kernel=/boot/vmlinuz-`uname -r`
+The term 'boot memory' means size of the low memory chunk that is required for +a kernel to boot successfully when booted with restricted memory. By default, +the boot memory size will be the larger of 5% of system RAM or 256MB. +Alternatively, user can also specify boot memory size through boot parameter +'fadump_reserve_mem=' which will override the default calculated size. Use this +option if default boot memory size is not sufficient for second kernel to boot +successfully.
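The default described above (the larger of 5% of system RAM or 256MB) can be estimated with a few lines of shell. This is a rough sketch using integer arithmetic in MB; the function name default_boot_mem_mb is hypothetical, and the kernel's own calculation may round or align the value differently.

```shell
#!/bin/sh
# default_boot_mem_mb TOTAL_RAM_MB
# Prints the larger of 5% of TOTAL_RAM_MB and 256, in MB.
# Hypothetical helper for estimation only; not part of kexec-tools.
default_boot_mem_mb() {
    total_mb="$1"
    five_pct=$(( total_mb * 5 / 100 ))
    if [ "$five_pct" -gt 256 ]; then
        echo "$five_pct"
    else
        echo 256
    fi
}

# Example: default_boot_mem_mb 65536
```

If the estimate looks too small for the second kernel to boot, a larger size can be requested with the 'fadump_reserve_mem=' boot parameter mentioned above.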
+After making said changes, reboot your system, so that the specified memory is +reserved and left untouched by the normal system. Take note that the output of +'free -m' will show X MB less memory than without this parameter, which is +expected. If you see OOM (Out Of Memory) error messages while loading capture +kernel, then you should bump up the memory reservation size.
+Now that you've got that reserved memory region set up, you want to turn on +the kdump init script:
- # chkconfig kdump on
should be changed to systemctl enable kdump.service
+Then, start up kdump as well:
- # systemctl start kdump.service
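Once the service has started, the two sysfs files used by this series (/sys/kernel/fadump_enabled and /sys/kernel/fadump_registered) can be inspected by hand. The helper below is a minimal sketch of such a check; the name fadump_flag_set is hypothetical and it is not the status routine from kdumpctl.

```shell
#!/bin/sh
# Hypothetical helper: return 0 when the given file is readable and
# contains exactly "1", non-zero otherwise.
fadump_flag_set() {
    flag_file="$1"
    [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]
}

# Example use on a real system:
#   fadump_flag_set /sys/kernel/fadump_enabled    && echo "fadump enabled"
#   fadump_flag_set /sys/kernel/fadump_registered && echo "fadump registered"
```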
+This should turn on the firmware assisted functionality in kernel by +echo'ing 1 to /sys/kernel/fadump_registered, leaving the system ready +to capture a vmcore upon crashing. To test this out, you can force-crash +your system by echo'ing a c into /proc/sysrq-trigger:
- # echo c > /proc/sysrq-trigger
+You should see some panic output, followed by the system reset and booting into +fresh copy of kernel. When default initrd loads and runs /init, vmcore should +be copied out to disk (by default, in /var/crash/YYYY-MM-DD-HH:MM/vmcore),
The date format is date +%Y.%m.%d-%T which would be expanded to: %Y.%m.%d-%H:%M:%S
+then the system rebooted back into your normal kernel.
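Going by the review comment above about the date format, the dump directory name could be formed as sketched below. The variable names are illustrative only and are not taken from the kdump scripts.

```shell
#!/bin/sh
# Build a dump directory name using the format quoted in the review:
# date +%Y.%m.%d-%T, where %T is shorthand for %H:%M:%S.
crash_root="/var/crash"    # default location named in the howto
dump_dir="$crash_root/$(date +%Y.%m.%d-%T)"
echo "$dump_dir"
```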
+Once back to your normal kernel, you can use the previously installed crash +utility in conjunction with the previously installed kernel-debuginfo to +perform postmortem analysis:
- # crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux
- /var/crash/2006-08-23-15:34/vmcore
- crash> bt
+and so on...
+Saving vmcore-dmesg.txt +---------------------- +Kernel log buffers are among the most important information available +in vmcore. Before saving vmcore, kernel log buffers are extracted +from /proc/vmcore and saved into a file vmcore-dmesg.txt. After +vmcore-dmesg.txt, vmcore is saved. The destination disk and directory for +vmcore-dmesg.txt are the same as for vmcore. Note that kernel log buffers will +not be available if the dump target is a raw device.
+Dump Triggering methods:
+This section talks about the various ways, other than a Kernel Panic, in which +fadump can be triggered. The following methods assume that fadump is configured +on your system, with the scripts enabled as described in the section above.
+1) AltSysRq C
+FAdump can be triggered with the combination of the 'Alt','SysRq' and 'C' +keyboard keys. Please refer to the following link for more details:
+http://kbase.redhat.com/faq/FAQ_43_5559.shtm
+In addition, on PowerPC boxes, fadump can also be triggered via Hardware +Management Console(HMC) using 'Ctrl', 'O' and 'C' keyboard keys.
+2) Kernel OOPs
+If we want to generate a dump every time the Kernel OOPses, we can achieve this +by setting the 'Panic On OOPs' option as follows:
- # echo 1 > /proc/sys/kernel/panic_on_oops
+3) PowerPC specific methods:
+On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger(if +XMON is configured). To configure XMON one needs to compile the kernel with +the CONFIG_XMON and CONFIG_XMON_DEFAULT options, or by compiling with +CONFIG_XMON and booting the kernel with xmon=on option.
+Following are the ways to remotely issue a soft reset on PowerPC boxes, which +would drop you to XMON. Pressing a 'X' (capital alphabet X) followed by an +'Enter' here will trigger the dump.
+3.1) HMC
+Hardware Management Console (HMC) available on Power4 and Power5 machines allows +partitions to be reset remotely. This is especially useful in hang situations +where the system is not accepting any keyboard inputs.
+Once you have HMC configured, the following steps will enable you to trigger +fadump via a soft reset:
+On Power4
- Using GUI
- In the right pane, right click on the partition you wish to dump.
- Select "Operating System->Reset".
- Select "Soft Reset".
- Select "Yes".
- Using HMC Commandline
- # reset_partition -m <machine> -p <partition> -t soft
+On Power5
- Using GUI
- In the right pane, right click on the partition you wish to dump.
- Select "Restart Partition".
- Select "Dump".
- Select "OK".
- Using HMC Commandline
- # chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar
+3.2) Blade Management Console for Blade Center
+To initiate a dump operation, go to Power/Restart option under "Blade Tasks" in +the Blade Management Console. Select the corresponding blade for which you want +to initiate the dump and then click "Restart blade with NMI". This issues a +system reset and invokes xmon debugger.
+Advanced Setups:
Are these duplicate chunks really necessary? I would suggest changing this file to cover fadump only and keeping the general part in the original howto, so that we only need to update one file in the future. I mean, but am not limited to, the sections below about kdump configuration related stuff.
+In addition to being able to capture a vmcore to your system's local file +system, fadump can be configured to capture a vmcore to a number of other +locations, including a raw disk partition, a dedicated file system, an NFS +mounted file system, or a remote system via ssh/scp. Additional options +exist for specifying the relative path under which the dump is captured, +what to do if the capture fails, and for compressing and filtering the dump +(so as to produce smaller, more manageable, vmcore files).
+In theory, dumping to a location other than the local file system should be +safer than fadump's default setup, as it's possible the default setup will try +dumping to a file system that has become corrupted. The raw disk partition and +dedicated file system options allow you to still dump to the local system, +but without having to remount your possibly corrupted file system(s), +thereby decreasing the chance a vmcore won't be captured. Dumping to an +NFS server or remote system via ssh/scp also has this advantage, as well +as allowing for the centralization of vmcore files, should you have several +systems from which you'd like to obtain vmcore files. Of course, note that +these configurations could present problems if your network is unreliable.
+Advanced setups are configured via modifications to /etc/kdump.conf, +which out of the box, is fairly well documented itself. Any alterations to +/etc/kdump.conf should be followed by a restart of the kdump service, so +the changes can be incorporated in the fadump aware default initrd. Restarting +the kdump service is as simple as '/sbin/systemctl restart kdump.service'.
+Note that kdump.conf is used as a configuration mechanism for capturing dump +files from the initramfs (in the interests of safety), the root file system is +mounted, and the init process is started, only as a last resort if the +initramfs fails to capture the vmcore. As such, configuration made in +/etc/kdump.conf is only applicable to capture recorded in the initramfs. If +for any reason the init process is started on the root file system, only a +simple copying of the vmcore from /proc/vmcore to /var/crash/$DATE/vmcore will +be performed.
+For both local filesystem and nfs dump the dump target must be mounted before +building fadump aware initramfs. That means one needs to put an entry for the +dump file system in /etc/fstab so that after reboot when kdump service starts, +it can find the dump target and build initramfs instead of failing. +Usually the dump target should be used only for fadump. If you are worried about +someone using the filesystem for something other than dumping vmcore, +you can mount it as read-only. Mkdumprd will still remount it as read-write +for creating the dump directory and will move it back to read-only afterwards.
+Raw partition
+Raw partition dumping requires that a disk partition in the system, at least +as large as the amount of memory in the system, be left unformatted. Assuming +/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with +'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly +onto partition /dev/vg/lv_kdump. Restart the kdump service via +'/sbin/systemctl restart kdump.service' to commit this change to your fadump +aware default initrd. Dump target should be persistent device name, such as lvm +or device mapper canonical name.
+Dedicated file system
+Similar to raw partition dumping, you can format a partition with the file +system of your choice. Again, it should be at least as large as the amount +of memory in the system. Assuming /dev/vg/lv_kdump has been +formatted ext4, specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a +vmcore file will be copied onto the file system after it has been mounted. +Dumping to a dedicated partition has the advantage that you can dump multiple +vmcores to the file system, space permitting, without overwriting previous ones, +as would be the case in a raw partition setup. Restart the kdump service via +'/sbin/systemctl restart kdump.service' to commit this change to +your fadump aware default initrd. Note that for local file systems ext4 and +ext2 are supported as dumpable targets. Kdump will not prevent you from +specifying other filesystems, and they will most likely work, but their +operation cannot be guaranteed. For instance specifying a vfat filesystem or +msdos filesystem will result in a successful load of the kdump service, but +during crash recovery, the dump will fail if the system has more than 2GB of +memory (since vfat and msdos filesystems do not support more than 2GB files). +Be careful of your filesystem selection when using this target.
+It is recommended to use persistent device names or UUID/LABEL for file system +dumps. One example of persistent device is /dev/vg/<devname>.
+NFS mount
+Dumping over NFS requires an NFS server configured to export a file system +with full read/write access for the root user. All operations done within +the fadump aware default initrd are done as root, and to write out a vmcore +file, we obviously must be able to write to the NFS mount. Configuring an NFS +server is outside the scope of this document, but either the no_root_squash +or anonuid options on the NFS server side are likely of interest to permit +the fadump aware default initrd operations to write to the NFS mount as root.
+Assuming you're exporting /dump on the machine nfs-server.example.com, +once the mount is properly configured, specify it in kdump.conf, via +'nfs nfs-server.example.com:/dump'. The server portion can be specified either +by host name or IP address. Following a system crash, the fadump aware default +initrd will mount the NFS mount and copy out the vmcore to your NFS server. +Restart the kdump service via '/sbin/systemctl restart kdump.service' to commit +this change to your fadump aware initrd.
+Remote system via ssh/scp
+Dumping over ssh/scp requires setting up passwordless ssh keys for every +machine you wish to have dump via this method. First up, configure kdump.conf +for ssh/scp dumping, adding a config line of 'ssh user@server', where 'user' +can be any user on the target system you choose, and 'server' is the host +name or IP address of the target system. Using a dedicated, restricted user +account on the target system is recommended, as there will be keyless ssh +access to this account.
+Once kdump.conf is appropriately configured, issue the command +'kdumpctl propagate' to automatically set up the ssh host keys and transmit +the necessary bits to the target server. You'll have to type in 'yes' +to accept the host key for your target server if this is the first time +you've connected to it, and then input the target system user's password +to send over the necessary ssh key file. Restart the kdump service via +'/sbin/systemctl restart kdump.service' to commit this change to the fadump +aware default initrd.
+Path
+By default, local file system vmcore files are written to /var/crash/%DATE +on the local system, ssh/scp dumps to /var/crash/%HOST-%DATE on the target +system, dedicated file system partition dumps to ./var/crash/%DATE, and +NFS dumps to ./var/crash/%HOST-%DATE, the latter two both relative to +their respective mount points within the fadump initrd (usually /mnt). The +'/var/crash' portion of the path can be overridden using kdump.conf's 'path' +variable, should you wish to write the vmcore out to a different location. For +example, 'path /data/coredumps' would lead to vmcore files being written to +/data/coredumps/%DATE if you were dumping to your local file system. Note +that the path option is ignored if your kdump configuration results in the +core being saved from the initscripts in the root filesystem.
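The dump target lines quoted in the sections above can be collected into one kdump.conf sketch. Device, host and user names are the illustrative ones used in the text; the assumption here is that only one target is uncommented at a time, with 'path' set alongside it where wanted.

```
# /etc/kdump.conf sketch: pick ONE dump target, then optionally set path.

#raw /dev/vg/lv_kdump
#ext4 /dev/vg/lv_kdump
#nfs nfs-server.example.com:/dump
#ssh user@server

# Replace the default /var/crash portion of the destination
path /data/coredumps
```

As noted above, any change to this file needs '/sbin/systemctl restart kdump.service' so that the fadump aware default initrd is rebuilt.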
+Kdump Post-Capture Executable
+It is possible to specify a custom script or binary you wish to run following +an attempt to capture a vmcore. The executable is passed an exit code from +the capture process, which can be used to trigger different actions from +within your post-capture executable.
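A post-capture executable might branch on the exit code it receives, roughly as sketched here. This assumes the exit code arrives as the first argument, which the text above does not spell out; the function name post_capture_report is hypothetical.

```shell
#!/bin/sh
# Hypothetical post-capture logic: report on the capture status.
# Assumption: the capture exit code is passed as the first argument.
post_capture_report() {
    capture_status="$1"
    if [ "$capture_status" -eq 0 ]; then
        echo "vmcore capture succeeded"
    else
        echo "vmcore capture failed with status $capture_status"
    fi
}

# Example use inside a post-capture script:
#   post_capture_report "$1"
```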
+Kdump Pre-Capture Executable
+It is possible to specify a custom script or binary you wish to run before +capturing a vmcore. Exit status of this binary is interpreted: +0 - continue with dump process as usual +non 0 - reboot the system
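The exit-status rule above can be illustrated with a small pre-capture check. The flag file path and the function name pre_capture_check are made up for this sketch and are not part of kexec-tools.

```shell
#!/bin/sh
# Hypothetical pre-capture check: return 0 to continue with the dump,
# non-zero to request a reboot instead.
pre_capture_check() {
    skip_flag="${1:-/etc/kdump/skip-dump}"   # illustrative path only
    if [ -e "$skip_flag" ]; then
        return 1    # non 0: reboot the system
    fi
    return 0        # 0: continue with dump process as usual
}

# Example use as the last line of a pre-capture script:
#   pre_capture_check; exit $?
```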
+Extra Binaries
+If you have specific binaries or scripts you want to have made available +within your fadump aware default initrd, you can specify them by their full +path, and they will be included in your initrd, along with all dependent +libraries. This may be particularly useful for those running post-capture +scripts that rely on other binaries.
+Extra Modules
+By default, only the bare minimum of kernel modules will be included in your +fadump aware default initrd. Should you wish to capture your vmcore files to a +non-boot-path storage device, such as an iscsi target disk or clustered file +system, you may need to manually specify additional kernel modules to load into +your fadump aware default initrd.
+Default action +============== +Default action specifies what to do when dump to configured dump target +fails. By default, the default action is "reboot", that is, the system reboots +if the attempt to save the dump to the dump target fails.
+There are other default actions available though.
+- dump_to_rootfs
- This option tries to mount root and save dump on root filesystem
- in a path specified by "path". This option will generally make
- sense when dump target is not root filesystem. For example, if
- dump is being saved over network using "ssh" then one can specify
- default to "dump_to_rootfs" to try saving dump to root filesystem
- if dump over network fails.
+- shell
- Drop into a shell session inside initramfs.
+- halt
- Halt system after failure
+- poweroff
- Poweroff system after failure.
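The ssh example given for dump_to_rootfs above would look roughly like this in kdump.conf. The user and server names are placeholders, and the fragment is a sketch rather than a tested configuration.

```
# /etc/kdump.conf sketch: dump over ssh, fall back to the root filesystem
ssh user@server
path /var/crash
default dump_to_rootfs
```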
+Compression and filtering
+Refer to the "Compression and filtering" section in the "kexec-kdump-howto.txt" document. +Compression and filtering are the same for kdump & fadump.
+Notes on rootfs mount: +Dracut is designed to mount rootfs by default. If rootfs mounting fails it +will refuse to go on. So fadump leaves rootfs mounting to dracut currently. +We make the assumption that proper root= cmdline is being passed to dracut +initramfs for the time being. If you need to modify "KDUMP_COMMANDLINE=" in +/etc/sysconfig/kdump, you will need to make sure that appropriate root= +options are copied from /proc/cmdline. In general it is best to append +command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing +the original command line completely.
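If KDUMP_COMMANDLINE does have to be replaced, the root= option can be picked out of the running command line first. The helper below is a minimal sketch; extract_root_arg is a hypothetical name, and it relies on plain word splitting, so it assumes the command line contains no quoted arguments with spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the first root=... word from a kernel
# command line string and return 0, or return 1 if none is found.
extract_root_arg() {
    for word in $1; do
        case "$word" in
            root=*)
                echo "$word"
                return 0
                ;;
        esac
    done
    return 1
}

# Example use on a running system:
#   extract_root_arg "$(cat /proc/cmdline)"
```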
kexec mailing list kexec@lists.fedoraproject.org https://lists.fedoraproject.org/mailman/listinfo/kexec
On 02/27/14 at 01:50pm, Hari Bathini wrote:
This patch set implements firmware-assisted dump support for kdump service. Firmware-assisted dump support depends on existing kdump infrastructure (kdump scripts) present in userland to save dump to the disk. Though the existing kdump script will work seamlessly, it still needs to be modified to make it aware of the presence of the firmware-assisted dump feature during service start and stop. These changes were tested successfully on a Power box with Fedora 19.
Changes in v3:
- Split few functions for readability.
- Added a cleanup patch to remove unnecessary "function" keyword.
Hello Hari
Sorry for jumping in late. The patches do not apply on top of the latest kdumpctl; it has probably changed since your post. Please see a few comments on the code itself in my other reply.
Thanks Dave
On Tue, Apr 01, 2014 at 06:02:47PM +0800, Dave Young wrote:
Hello Hari
Sorry for jumping in late. The patches do not apply on top of the latest kdumpctl; it has probably changed since your post. Please see a few comments on the code itself in my other reply.
I think very soon two more patch series will go in: Bao's cleanup for determining disk based on "path" and Martin's patches for generic cluster support.
Hari, wait a little for Chao to process these two patch series and rebase your patch series on top of above two and repost.
Thanks Vivek
On 04/01/14 at 10:58am, Vivek Goyal wrote:
I think very soon two more patch series will go in: Bao's cleanup for determining disk based on "path" and Martin's patches for generic cluster support.
Hari, wait a little for Chao to process these two patch series and rebase your patch series on top of above two and repost.
Hi, Hari
I've pushed both the patchset from Bao and Martin on master branch. You can pull these latest changes and rebase your patch series.
Thanks WANG Chao
On 04/24/2014 09:10 AM, WANG Chao wrote:
Hi, Hari
I've pushed both the patchset from Bao and Martin on master branch. You can pull these latest changes and rebase your patch series.
Thanks WANG Chao
Chao, thanks for the update. Rebased and posted the updated patches.
Thanks Hari