This is a patchset to add fence kdump support.
In a cluster environment, fence kdump is used to notify all the other nodes that the current node has crashed, so that it is not fenced off before the dump completes.
The patchset has the following features:
1. Rebuild the kdump initrd when the fence kdump config or the cluster
   configuration is newer than the initrd.
2. Set up the working environment that fence kdump requires in the 2nd kernel.
3. fence_kdump_send notifies the other nodes before the dumping process
   starts, so that the crashed node is not fenced off.
4. Add kdump-in-cluster-environment.txt.
v2->v3:
From Vivek:
(4/6) kdump.sh: make var nodes local
(6/6) module-setup: avoid adding duplicates instead of removing them at last

WANG Chao (5):
  kdump-lib: add common variables and function for fence kdump
  kdumpctl: rebuild kdump initramfs if cluster or fence_kdump config is changed.
  kdump.sh: send fence kdump message to other nodes in the cluster
  module-setup.sh: setup fence kdump environment
  module-setup: remove duplicated ip= line

arthur (1):
  doc: Add kdump-in-cluster-environment.txt

 dracut-kdump.sh                  | 15 +++++++++++
 dracut-module-setup.sh           | 54 ++++++++++++++++++++++++++++++++++++--
 kdump-in-cluster-environment.txt | 56 ++++++++++++++++++++++++++++++++++++++++
 kdump-lib.sh                     | 17 +++++++++++-
 kdumpctl                         | 26 +++++++++++++++++++
 kexec-tools.spec                 |  3 +++
 6 files changed, 168 insertions(+), 3 deletions(-)
 create mode 100644 kdump-in-cluster-environment.txt
From: arthur zzou@redhat.com
Since kdump already supports dumping in a cluster environment, this patch adds a HOWTO file to the RPM package that describes how to configure kdump in a cluster environment.
Signed-off-by: arthur <zzou@redhat.com>
---
 kdump-in-cluster-environment.txt | 66 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)
 create mode 100644 kdump-in-cluster-environment.txt

diff --git a/kdump-in-cluster-environment.txt b/kdump-in-cluster-environment.txt
new file mode 100644
index 0000000..c27a5d7
--- /dev/null
+++ b/kdump-in-cluster-environment.txt
@@ -0,0 +1,66 @@
+Kdump-in-cluster-environment HOWTO
+
+Introduction
+
+Kdump is a kexec based crash dumping mechanism for Linux. This document
+illustrates how to configure kdump in cluster environment to allow the kdump
+crash recovery service to complete without being preempted by traditional power
+fencing methods.
+
+Overview
+
+Kexec/Kdump
+
+Details about Kexec/Kdump are available in Kexec-Kdump-howto file and will not
+be described here.
+
+fence_kdump
+
+fence_kdump is an I/O fencing agent to be used with the kdump crash recovery
+service. When the fence_kdump agent is invoked, it will listen for a message
+from the failed node that acknowledges that the failed node is executing the
+kdump crash kernel. Note that fence_kdump is not a replacement for traditional
+fencing methods. The fence_kdump agent can only detect that a node has entered
+the kdump crash recovery service. This allows the kdump crash recovery service
+to complete without being preempted by traditional power fencing methods.
+
+fence_kdump_send
+
+fence_kdump_send is a utility used to send messages that acknowledge that the
+node itself has entered the kdump crash recovery service. The fence_kdump_send
+utility is typically run in the kdump kernel after a cluster node has
+encountered a kernel panic. Once the cluster node has entered the kdump crash
+recovery service, fence_kdump_send will periodically send messages to all
+cluster nodes. When the fence_kdump agent receives a valid message from the
+failed node, fencing is complete.
+
+How to configure cluster environment:
+
+If we want to use kdump in cluster environment, fence-agents-kdump should be
+installed on every node in the cluster. You can achieve this via the following
+command:
+
+    # yum install -y fence-agents-kdump
+
+Next is to add fence_kdump to the cluster. Assume that the cluster consists
+of three nodes, node1, node2 and node3, and uses Pacemaker to perform
+resource management and pcs as cli configuration tool.
+
+With pcs it is easy to add a stonith resource to the cluster. For example, add
+a stonith resource named mykdumpfence with fence type of fence_kdump via the
+following commands:
+
+    # pcs stonith create mykdumpfence fence_kdump \
+      pcmk_host_check=static-list pcmk_host_list="node1 node2 node3"
+    # pcs stonith update mykdumpfence pcmk_monitor_action=metadata --force
+    # pcs stonith update mykdumpfence pcmk_status_action=metadata --force
+    # pcs stonith update mykdumpfence pcmk_reboot_action=off --force
+
+Then enable stonith
+    # pcs property set stonith-enabled=true
+
+How to configure kdump:
+
+Actually there is nothing special in configuration between normal kdump and
+cluster environment kdump. So please refer to Kexec-Kdump-howto file for more
+information.
Add the following common variables and function:

$FENCE_KDUMP_CONFIG: configuration file /etc/sysconfig/fence_kdump
$FENCE_KDUMP_NODES: configuration file /etc/fence_kdump_nodes
$FENCE_KDUMP_SEND: executable /usr/libexec/fence_kdump_send
is_fence_kdump(): used to determine if the system is in a cluster and
                  configured with fence_kdump.
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 kdump-lib.sh | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/kdump-lib.sh b/kdump-lib.sh
index e73ac09..aac0c5f 100755
--- a/kdump-lib.sh
+++ b/kdump-lib.sh
@@ -1,8 +1,12 @@
 #!/bin/sh
 #
-# Kdump common functions
+# Kdump common variables and functions
 #
 
+FENCE_KDUMP_CONFIG="/etc/sysconfig/fence_kdump"
+FENCE_KDUMP_SEND="/usr/libexec/fence_kdump_send"
+FENCE_KDUMP_NODES="/etc/fence_kdump_nodes"
+
 is_ssh_dump_target()
 {
     grep -q "^ssh[[:blank:]].*@" /etc/kdump.conf
@@ -22,3 +26,14 @@ strip_comments()
 {
     echo $@ | sed -e 's/\(.*\)#.*/\1/'
 }
+
+# Check if fence kdump is configured in cluster
+is_fence_kdump()
+{
+    # no pcs or fence_kdump_send executables installed?
+    type -P pcs > /dev/null || return 1
+    [ -x $FENCE_KDUMP_SEND ] || return 1
+
+    # fence kdump not configured?
+    (pcs cluster cib | grep -q 'type="fence_kdump"') &> /dev/null || return 1
+}
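For anyone who wants to try the detection step without a running cluster, here is a minimal sketch of the same grep-based check, fed from stdin instead of `pcs cluster cib`. The helper name cib_has_fence_kdump and the sample CIB fragment are made up for illustration and are not part of the patch.

```shell
# Sketch only: cib_has_fence_kdump is a hypothetical helper, not part of the patch.
# It reads CIB XML on stdin and succeeds if a fence_kdump stonith resource is present.
cib_has_fence_kdump()
{
    grep -q 'type="fence_kdump"'
}

# Made-up CIB fragment, only for illustration
sample_cib='<cib><configuration><resources>
<primitive class="stonith" id="mykdumpfence" type="fence_kdump"/>
</resources></configuration></cib>'

if printf '%s\n' "$sample_cib" | cib_has_fence_kdump; then
    echo "fence_kdump configured"
else
    echo "fence_kdump not configured"
fi
```

On a real node the input would come from `pcs cluster cib`, as is_fence_kdump() does above.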
If the system is configured with fence kdump, we need to update the kdump initramfs whenever the cluster config or the fence_kdump config is newer than it.
In RHEL7, the cluster config is no longer kept locally but stored remotely. Fortunately, we can use the pcs tool to retrieve the XML-based config and parse the last-changed time from it.
/etc/sysconfig/fence_kdump is used to configure runtime arguments for fence_kdump_send, so we have to pass these arguments to the 2nd kernel.
When the cluster config or /etc/sysconfig/fence_kdump is newer than the local kdump initramfs, we must rebuild the initramfs to pick up the changes in the cluster.
For example:
Detected change(s) the following file(s):
  cluster-cib /etc/sysconfig/fence_kdump
Rebuilding /boot/initramfs-xxxkdump.img
[..]
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 kdumpctl | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/kdumpctl b/kdumpctl
index 46ae633..abcdffd 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -132,6 +132,25 @@ function check_config()
     return 0
 }
 
+# check_fence_kdump <image timestamp>
+# return 0 if fence_kdump is configured and kdump initrd needs to be rebuilt
+function check_fence_kdump()
+{
+    local image_time=$1
+    local cib_time
+
+    is_fence_kdump || return 1
+
+    cib_time=`pcs cluster cib | xmllint --xpath 'string(/cib/@cib-last-written)' - | \
+        xargs -0 date +%s --date`
+
+    if [ -z $cib_time -o $cib_time -le $image_time ]; then
+        return 1
+    fi
+
+    return 0
+}
+
 function check_rebuild()
 {
     local extra_modules modified_files=""
@@ -167,6 +186,9 @@ function check_rebuild()
         image_time=0
     fi
 
+    #also rebuild when cluster conf is changed and fence kdump is enabled.
+    check_fence_kdump $image_time && modified_files="cluster-cib"
+
     EXTRA_BINS=`grep ^kdump_post $KDUMP_CONFIG_FILE | cut -d\  -f2`
     CHECK_FILES=`grep ^kdump_pre $KDUMP_CONFIG_FILE | cut -d\  -f2`
     EXTRA_BINS="$EXTRA_BINS $CHECK_FILES"
@@ -174,6 +196,10 @@ function check_rebuild()
     EXTRA_BINS="$EXTRA_BINS $CHECK_FILES"
     files="$KDUMP_CONFIG_FILE $kdump_kernel $EXTRA_BINS"
 
+    if [ -f $FENCE_KDUMP_CONFIG ]; then
+        files="$files $FENCE_KDUMP_CONFIG"
+    fi
+
     check_exist "$files" && check_executable "$EXTRA_BINS"
     [ $? -ne 0 ] && return 1
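The rebuild decision in check_fence_kdump() comes down to comparing two epoch timestamps. A small sketch of that comparison follows, assuming both values are already in seconds since the epoch. The helper name cib_newer_than_image is hypothetical, and treating an empty CIB time as "no rebuild" is an assumption of this sketch.

```shell
# Hypothetical helper, not part of the patch.
# Usage: cib_newer_than_image <image epoch> <cib epoch>
# Succeeds only when the CIB timestamp is known and newer than the image.
cib_newer_than_image()
{
    local image_time="$1" cib_time="$2"

    # Unknown CIB time: nothing to compare against, so do not ask for a rebuild
    [ -n "$cib_time" ] || return 1

    [ "$cib_time" -gt "$image_time" ]
}

cib_newer_than_image 1390880000 1390880964 && echo "rebuild needed"
cib_newer_than_image 1390880964 1390880000 || echo "initramfs is up to date"
```

Splitting the empty check from the numeric comparison keeps `[` from ever seeing an empty operand.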
In the 2nd kernel, to prevent the crashed system from being fenced off, a fence kdump message must be sent periodically to the other nodes in the cluster before the dumping process starts.
We preserve every node's name in /etc/fence_kdump_nodes in the initrd, so we parse this file and send the notification to each of them.
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 dracut-kdump.sh | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/dracut-kdump.sh b/dracut-kdump.sh
index 4d8616f..d9e65ac 100755
--- a/dracut-kdump.sh
+++ b/dracut-kdump.sh
@@ -287,6 +287,21 @@ read_kdump_conf()
     done < $conf_file
 }
 
+fence_kdump_notify()
+{
+    local nodes
+
+    if [ -f $FENCE_KDUMP_NODES ]; then
+        if [ -f $FENCE_KDUMP_CONFIG ]; then
+            . $FENCE_KDUMP_CONFIG
+        fi
+
+        read nodes < $FENCE_KDUMP_NODES
+        $FENCE_KDUMP_SEND $FENCE_KDUMP_OPTS $nodes &
+    fi
+}
+
+fence_kdump_notify
 read_kdump_conf
 
 if [ -z "$CORE_COLLECTOR" ];then
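To see what fence_kdump_notify() would run, without sending anything, a dry-run sketch like the one below may help. The helper name fence_kdump_cmdline is hypothetical. The option string "-i 5" is only an example value and assumes fence_kdump_send accepts an interval option; check fence_kdump_send(8) on your system.

```shell
# Hypothetical dry-run helper, not part of the patch: prints the command
# line instead of executing it.
# Usage: fence_kdump_cmdline <nodes file> [options string]
fence_kdump_cmdline()
{
    local nodes_file="$1" opts="$2" nodes

    [ -f "$nodes_file" ] || return 1

    # The node list is stored on a single line, separated by spaces
    read -r nodes < "$nodes_file"

    # $opts and $nodes are left unquoted on purpose so they split into words
    echo /usr/libexec/fence_kdump_send $opts $nodes
}

nodes_file=$(mktemp)
echo " node2 node3" > "$nodes_file"
fence_kdump_cmdline "$nodes_file" "-i 5"
rm -f "$nodes_file"
```

The leading space in the nodes file mirrors what kdump_check_fence_kdump writes; `read` strips it.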
This patch sets up the fence kdump environment when building the kdump initrd:
1. Check if the system is in a cluster and fence_kdump is configured.
2. Get all the nodes in the cluster and pass them to the 2nd kernel via
   /etc/fence_kdump_nodes.
3. Set up the network interface which will be used by the fence kdump
   notifier in the 2nd kernel.
4. Install the fence kdump notifier (/usr/libexec/fence_kdump_send) into the
   initrd.
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 dracut-module-setup.sh | 45 +++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/dracut-module-setup.sh b/dracut-module-setup.sh
index c013430..02f0280 100755
--- a/dracut-module-setup.sh
+++ b/dracut-module-setup.sh
@@ -20,6 +20,10 @@ depends() {
         _dep="$_dep drm"
     fi
 
+    if is_fence_kdump; then
+        _dep="$_dep network"
+    fi
+
     echo $_dep
     return 0
 }
@@ -234,9 +238,14 @@ kdump_install_net() {
     fi
 
     kdump_setup_netdev "${_netdev}"
+
     #save netdev used for kdump as cmdline
-    echo "kdumpnic=${_netdev}" > ${initdir}/etc/cmdline.d/60kdumpnic.conf
-    echo "bootdev=${_netdev}" > ${initdir}/etc/cmdline.d/70bootdev.conf
+    #fence kdump would override bootdev and kdumpnic, we should avoid that.
+    if [ ! -f ${initdir}/etc/cmdline.d/60kdumpnic.conf ] &&
+       [ ! -f ${initdir}/etc/cmdline.d/70bootdev.conf ]; then
+        echo "kdumpnic=${_netdev}" > ${initdir}/etc/cmdline.d/60kdumpnic.conf
+        echo "bootdev=${_netdev}" > ${initdir}/etc/cmdline.d/70bootdev.conf
+    fi
 }
 
 #install kdump.conf and what user specifies in kdump.conf
@@ -263,6 +272,7 @@ kdump_install_conf() {
         esac
     done < /etc/kdump.conf
 
+    kdump_check_fence_kdump
     inst "/tmp/$$-kdump.conf" "/etc/kdump.conf"
     rm -f /tmp/$$-kdump.conf
 }
@@ -393,6 +403,37 @@ kdump_check_iscsi_targets () {
 }
 
+# setup fence_kdump in cluster
+# setup proper network and install needed files
+# also preserve '[node list]' for 2nd kernel /etc/fence_kdump_nodes
+kdump_check_fence_kdump () {
+    local nodes
+    is_fence_kdump || return 1
+
+    # get cluster nodes from cluster cib, get interface and ip address
+    nodelist=`pcs cluster cib | xmllint --xpath "/cib/status/node_state/@uname" -`
+
+    # nodelist is formed as 'uname="node1" uname="node2" ... uname="nodeX"'
+    # we need to convert each to node1, node2 ... nodeX in each iteration
+    for node in ${nodelist}; do
+        # convert $node from 'uname="nodeX"' to 'nodeX'
+        eval $node
+        nodename=$uname
+        # Skip its own node name
+        if [ "$nodename" = `hostname` ]; then
+            continue
+        fi
+        nodes="$nodes $nodename"
+
+        kdump_install_net $nodename
+    done
+    echo
+
+    echo "$nodes" > ${initdir}/$FENCE_KDUMP_NODES
+    dracut_install $FENCE_KDUMP_SEND
+    dracut_install -o $FENCE_KDUMP_CONFIG
+}
+
 install() {
     kdump_install_conf
     >"$initdir/lib/dracut/no-emergency-shell"
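The node list handling in kdump_check_fence_kdump can be exercised on its own. Below is a sketch that converts the 'uname="nodeX"' attributes with parameter expansion rather than eval. This is an alternative written for illustration, not what the patch does, and the helper name other_cluster_nodes is hypothetical.

```shell
# Hypothetical helper, not part of the patch.
# Usage: other_cluster_nodes <attribute list> <own node name>
# Turns 'uname="node1" uname="node2" ...' into a space-separated list of
# node names, leaving out the local node.
other_cluster_nodes()
{
    local attrs="$1" self="$2" attr name nodes=""

    for attr in $attrs; do
        # Strip the leading uname=" and the trailing quote
        name=${attr#uname=\"}
        name=${name%\"}
        [ "$name" = "$self" ] && continue
        nodes="${nodes:+$nodes }$name"
    done
    echo "$nodes"
}

other_cluster_nodes 'uname="node1" uname="node2" uname="node3"' node2
```

Avoiding eval means text coming out of the CIB is never executed as shell code.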
In the remote dump case, if fence kdump is configured, chances are that the same network interface will be set up more than once: one time for the network dump, and the other times for fence kdump. As a result we end up with two or more duplicate ip= configurations in 40ip.conf.
These are exact duplicates. However, dracut will refuse to continue and raise a fatal error if there are duplicate configurations for the same interface, so we have to avoid adding these duplicates.
Signed-off-by: WANG Chao <chaowang@redhat.com>
---
 dracut-module-setup.sh | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/dracut-module-setup.sh b/dracut-module-setup.sh
index 02f0280..18821f7 100755
--- a/dracut-module-setup.sh
+++ b/dracut-module-setup.sh
@@ -182,7 +182,7 @@ kdump_setup_znet() {
 # Setup dracut to bringup a given network interface
 kdump_setup_netdev() {
     local _netdev=$1
-    local _static _proto
+    local _static _proto _ip_conf _ip_opts _ifname_opts
 
     if [ "$(uname -m)" = "s390x" ]; then
         kdump_setup_znet $_netdev
@@ -196,7 +196,14 @@ kdump_setup_netdev() {
         _proto=dhcp
     fi
 
-    echo " ip=${_static}$_netdev:${_proto}" > ${initdir}/etc/cmdline.d/40ip.conf
+    _ip_conf="${initdir}/etc/cmdline.d/40ip.conf"
+    _ip_opts=" ip=${_static}$_netdev:${_proto}"
+
+    # dracut doesn't allow duplicated configuration for same NIC, even they're exactly the same.
+    # so we have to avoid adding duplicates
+    if [ ! -f $_ip_conf ] || ! grep -q $_ip_opts $_ip_conf; then
+        echo "$_ip_opts" >> $_ip_conf
+    fi
 
     if kdump_is_bridge "$_netdev"; then
         kdump_setup_bridge "$_netdev"
@@ -207,7 +214,8 @@ kdump_setup_netdev() {
     elif kdump_is_vlan "$_netdev"; then
         kdump_setup_vlan "$_netdev"
     else
-        echo " ifname=$_netdev:$(kdump_get_mac_addr $_netdev)" >> ${initdir}/etc/cmdline.d/40ip.conf
+        _ifname_opts=" ifname=$_netdev:$(kdump_get_mac_addr $_netdev)"
+        echo "$_ifname_opts" >> $_ip_conf
     fi
 
     kdump_setup_dns "$_netdev"
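The "append only if not already present" idea from this patch can be tried against a scratch file. A minimal sketch follows; the helper name append_once and the interface names are made up, and it uses a whole-line, fixed-string grep, which is somewhat stricter than the `grep -q` in the patch.

```shell
# Hypothetical helper, not part of the patch.
# Usage: append_once <file> <line>
# Appends <line> to <file> unless an identical line is already there.
append_once()
{
    local file="$1" line="$2"

    # -x matches whole lines only, -F treats the text as a fixed string
    if [ ! -f "$file" ] || ! grep -qxF -- "$line" "$file"; then
        echo "$line" >> "$file"
    fi
}

conf=$(mktemp)
append_once "$conf" " ip=eth0:dhcp"
append_once "$conf" " ip=eth0:dhcp"   # identical line, nothing is added
append_once "$conf" " ip=eth1:dhcp"
cat "$conf"
rm -f "$conf"
```

Only one line per interface ends up in the scratch file, which is the state dracut expects for 40ip.conf.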
On Tue, Jan 28, 2014 at 11:49:24AM +0800, WANG Chao wrote:
> This is a patchset to add fence kdump support.
>
> In cluster environment, fence kdump is used to notify all the other nodes that current is crashed and stop from being fenced off.
>
> The patchset has the following features:
> - rebuild kdump initrd regarding timestamp of fence kdump config or cluster configuration.
> - setup a required working environment for fence kdump in 2nd kernel.
> - fence_kdump_send notify other nodes to stop the crashed one being fenced off before dumping process.
> - add kdump-in-cluster-environment.txt
Ack to the series.
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Chao, can you please modify the 5th patch in the series and put some comments where you try to avoid overwriting bootdev and kdumpnic?

Thanks
Vivek
kexec mailing list kexec@lists.fedoraproject.org https://lists.fedoraproject.org/mailman/listinfo/kexec