These patches add support for loading, in the kdump kernel, the module for the watchdog that is active in the primary kernel. The patches have been tested with the mei_me and iTCO watchdogs.
Pratyush Anand (2):
  Watchdog: Load module for only active watchdog in kdump kernel
  kdumpctl: force rebuild in case of dynamic system modification

 dracut-module-setup.sh | 19 +++++++++++++------
 kdump-lib.sh           | 39 +++++++++++++++++++++++++++++++++++++++
 kdumpctl               | 23 +++++++++++++++++++++++
 3 files changed, 75 insertions(+), 6 deletions(-)
Currently we only take care to load iTCO_wdt, and only if that module was loaded in the primary kernel. This new approach capitalizes on recent kernel changes, which are the following:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=65...
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=90...
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=33...

The above kernel patches fix/add two major aspects:
1) They fix the parent of watchdog_device so that /sys/class/watchdog/watchdogn/device is populated.
2) They add some sysfs device attributes so that we can read the watchdog status.
With the above support, now we can find out whether a watchdog is active or not. If it is active, we can also find out the driver/module responsible for that watchdog device.
The proposed patch uses the above kernel support to load the relevant wdt modules in the kdump initramfs only for the active device.
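As a rough illustration of the sysfs lookup described above, the following sketch lists each active watchdog together with the modalias of its parent device. The function name list_active_wdt and its optional root-directory argument are assumptions made here for illustration (the argument only exists so the function can be tried on a fake tree); the state and device/modalias attributes are the ones this patch reads.

```shell
# Sketch only: print "<watchdog name> <modalias>" for every active watchdog.
# $1 (optional, hypothetical): sysfs class directory, defaults to the real one.
list_active_wdt() {
    wdt_root="${1:-/sys/class/watchdog}"
    for dev in "$wdt_root"/*; do
        # Kernels without the new attributes have no "state" file; skip those.
        [ -r "$dev/state" ] || continue
        [ "$(cat "$dev/state")" = "active" ] || continue
        if [ -r "$dev/device/modalias" ]; then
            modalias=$(cat "$dev/device/modalias")
        else
            modalias="unknown"
        fi
        echo "${dev##*/} $modalias"
    done
    return 0
}
```

The modalias printed for a device can then be resolved to module names with modprobe -R, as the patch does.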
Testing: It has been tested with kernel-4.5.0-0.rc1.git2.1.fc24 and fc23 user space.
- Added RuntimeWatchdogSec=40s in /etc/systemd/system.conf
- Modified cmdline for crashkernel=256M
- systemctl enable kdump
- [temporarily changed /etc/kdump.conf to stop at dracut shell]
- restart
- echo c > /proc/sysrq-trigger

On dracut shell I can see
kdump:/# cat /sys/class/watchdog/watchdog0/identity
iTCO_wdt
kdump:/# cat /sys/class/watchdog/watchdog0/state
active
Tested also with correct /etc/kdump.conf and it was able to save the vmcore.
Assumption: Both the watchdog and kdump daemons are managed by systemd. systemd starts the watchdog daemon before kdump. If a user changes the watchdog status afterwards, then he/she will have to execute `kdumpctl restart`.

Limitations: This patch will be able to recognize an active wdt only if its driver has been written using the watchdog-core framework and registered with watchdog_class.
Signed-off-by: Pratyush Anand <panand@redhat.com>
---
 dracut-module-setup.sh | 19 +++++++++++++------
 kdump-lib.sh           | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+), 6 deletions(-)
diff --git a/dracut-module-setup.sh b/dracut-module-setup.sh
index 4cd7107c4a35..32032304476e 100755
--- a/dracut-module-setup.sh
+++ b/dracut-module-setup.sh
@@ -706,10 +706,17 @@ install() {
 }
 
 installkernel() {
-    wdt=$(lsmod|cut -f1 -d' '|grep "wdt$")
-    if [ -n "$wdt" ]; then
-        [ "$wdt" = "iTCO_wdt" ] && instmods lpc_ich &&
-            echo "rd.driver.pre=lpc_ich,iTCO_wdt " >> ${initdir}/etc/cmdline.d/00-wdt.conf
-        instmods $wdt
-    fi
+    wdtcmdline=$(get_wdt_cmdline)
+
+    if [ "$?" = "0" ]; then
+        echo $wdtcmdline >> ${initdir}/etc/cmdline.d/00-wdt.conf
+        IFS='=,' read -ra modarr <<< "$wdtcmdline"
+        count='1'
+        while [ "$count" -lt ${#modarr[@]} ]
+        do
+            instmods ${modarr[$count]}
+            count=`expr $count + 1`
+        done
+    fi
+    return 0
 }
diff --git a/kdump-lib.sh b/kdump-lib.sh
index 4d3420652b2f..f60909bca047 100755
--- a/kdump-lib.sh
+++ b/kdump-lib.sh
@@ -230,3 +230,42 @@ is_hostname()
     fi
     echo $1 | grep -q "[a-zA-Z]"
 }
+
+get_wdt_cmdline()
+{
+    local wdtcmdline=""
+    wdtcls=/sys/class/watchdog
+    cd $wdtcls
+    for dir in */; do
+        cd $dir
+        active=`[ -f state ] && cat state`
+        if [ "$active" = "active" ]; then
+            # device/modalias will return driver of this device
+            wdtdrv=`cat device/modalias`
+            # There can be more than one module represnted by same
+            # modalias. Currently load all of them.
+            # TODO: Need to find a way to avoid any unwanted module
+            # represented by modalias
+            wdtdrv=`modprobe -R $wdtdrv | tr "\n" "," | sed 's/.$//'`
+            wdtcmdline="rd.driver.pre=$wdtdrv"
+            # however in some cases, we also need to check that if
+            # there is a specific driver for the parent bus/device.
+            # In such cases we also need to enable driver for parent
+            # bus/device.
+            wdtppath="device/..";
+            while [ -f "$wdtppath/modalias" ]
+            do
+                wdtpdrv=`cat $wdtppath/modalias`
+                wdtpdrv=`modprobe -R $wdtpdrv | tr "\n" "," | sed 's/.$//'`
+                wdtcmdline="$wdtcmdline,$wdtpdrv"
+                wdtppath="$wdtppath/.."
+            done
+            echo "$wdtcmdline"
+            return 0
+        fi
+        cd ..
+    done
+
+    echo "$wdtcmdline"
+    return 1
+}
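For readers following the installkernel() hunk: it splits the rd.driver.pre=... string on "=" and "," and skips index 0 (the key) before calling instmods on each entry. A minimal sketch of the same splitting without bash arrays is shown below. The function name modules_from_cmdline is hypothetical and not part of the patch; it also skips empty entries, which could presumably appear if an alias resolves to no module.

```shell
# Sketch only: print one module name per line from "rd.driver.pre=a,b,c".
# Returns 1 if the argument is not an rd.driver.pre assignment.
modules_from_cmdline() {
    arg="$1"
    case "$arg" in
        rd.driver.pre=*) list="${arg#rd.driver.pre=}" ;;
        *) return 1 ;;
    esac
    # Split on commas only; set -f keeps the shell from glob-expanding names.
    old_ifs="$IFS"
    set -f
    IFS=','
    for mod in $list; do
        # Skip empty fields such as the one produced by "a,,b".
        [ -n "$mod" ] && echo "$mod"
    done
    IFS="$old_ifs"
    set +f
    return 0
}
```

Each printed name could then be handed to instmods, in the way the loop in the patch does.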
Hi, Pratyush
Thanks for the effort. Let's cc people who may be interested in the problem.
On 02/02/16 at 10:58am, Pratyush Anand wrote:
[...]
I have several things to make clear:

* Previously we tried to add it as a dracut module in upstream dracut:
  https://www.mail-archive.com/initramfs@vger.kernel.org/msg03299.html
  In dracut there is a simple watchdog module which is used for testing purposes only. Now that your kernel patches give us a reliable way to get the live watchdog module in hostonly mode, it might be better to retry enhancing the dracut watchdog module. Kdump uses dracut --hostonly, so we can add wdt module handling for hostonly mode. That way we can also add another dracut kernel cmdline option such as rd.nowdt, so that one can disable the wdt functionality in the initramfs if he/she wants the original behavior.

* Previously we planned to enable the wdt only when driver initialization cannot stop it, e.g. when the wdt status in the 2nd kernel is still active after insmod. But if we copy systemd.conf from the 1st kernel to the initrd, the wdt will be enabled by default. Do you think that is a better way? Maybe it can address the concern from Prarit?

* What if one does not use systemd as the wdt daemon in the 1st kernel? Then in the 2nd kernel the behavior will be just what we planned: insmod, the driver stops the wdt, but we still need to kick it when the driver fails to stop it?
2.5.0
_______________________________________________
kexec mailing list
kexec@lists.fedoraproject.org
http://lists.fedoraproject.org/admin/lists/kexec@lists.fedoraproject.org
Thanks Dave
On Thu, Feb 18, 2016 at 08:44:25PM +0800, Dave Young wrote:
Hi, Pratyush
Thanks for the effort. Let's cc people who may be interested in the problem.
Hi,
Just to clarify things, these patches build on top of the kernel patches you posted a couple of weeks ago, right? And these changes in particular make sure the right hardware watchdog driver is included in the kdump initrd and is correctly loaded and enabled in the second kernel?

The only thing I might add to appease Prarit is perhaps a kdump.conf option to disable loading (or, better yet, inclusion) of the watchdog driver in the second kernel. Unless he is ok with my other suggestion of adding some logic to makedumpfile instead.
Cheers, Don
Hi Don,
Thanks for your feedback.
On 18/02/2016:10:27:26 AM, Don Zickus wrote:
On Thu, Feb 18, 2016 at 08:44:25PM +0800, Dave Young wrote:
Hi, Pratyush
Thanks for the effort. Let's cc people who may be interested in the problem.
Hi,
Just to clarify things, these patches build on top of the kernel patches you posted a couple of weeks ago right? And these changes in particular make sure the right hardware watchdog driver is loaded into the kdump initrd and correctly load and enabled in the second kernel?
Yes.
The only thing I might add to appease Prarit, is to perhaps add a kdump.conf option to disable loading or (better yet inclusion) of the watchdog driver
I think rd.nowdt suggestion from Dave is to take care of the above requirement.
in the second kernel. Unless he is ok with my other suggestion of adding some logic to makedumpfile instead.
Hmm, I am not sure that would cover it. Other core_collector options, such as scp, could go wrong as well.
~Pratyush
Hi Dave,
Thanks for your review and inputs.
On 18/02/2016:08:44:25 PM, Dave Young wrote:
I have several thing to make clear:
- Previously we tried to add it as a dracut module in upstream dracut.
https://www.mail-archive.com/initramfs@vger.kernel.org/msg03299.html In dracut there is a simple watchdog module which is used for testing purpose only. We have a reliable way to get hostonly live watchdog module based on your kernel patches, so it might be better to retry to enhance the dracut watchdog module. Kdump use dracut --hostonly, we can add using wdt module for hostonly mode. In this way we can add another dracut kernel cmdline like rd.nowdt so that one can disable wdt functionality in initramfs if he/she want the original behavior.
It's a good idea to implement a switch that controls whether the active wdt module is inserted in the kdump kernel. Since I do not know dracut well, let me check my understanding from an implementation perspective:
- The user can pass rd.nowdt through KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump.
- We will have an upstream dracut change which checks that, if rd.nowdt was *not* passed on the command line, the modules from rd.driver.pre are added.
- Previously we planned to only enabling wdt when driver initialization can not
stop wdt eg. in 2nd kernel the wdt status is active after insmod. But if we copy systemd.conf from 1st kernel to initrd, it will by default enable the wdt. Do you think it is a better way? Maybe it can address the concern from Prarit?
I think it's better to keep the wdt in the same state it had in the primary kernel (if rd.nowdt was not passed). I am not sure we need to implement the systemd.conf copy; I think that is already done currently. On both Fedora and RHEL, as you can see in the commit log, the wdt is active in the kdump kernel if it was active in the primary kernel.

kdump:/# cat /sys/class/watchdog/watchdog0/state
active
- What if in 1st kernel one do not use systemd as wdt daemon? Then in 2nd
kernel the behavior will be just like what we planned, insmod, driver stop the wdt, but still need kick it when driver failed to stop the wdt?
If it is not systemd and a custom wdt application, then I think user will need to pass name of the application in /etc/kdump.conf:extra_bins and the kick command in /etc/kdump.conf:kdump_pre.
~Pratyush
Hi, Pratyush
Ccing harald for dracut discussion
On 02/18/16 at 09:02pm, Pratyush Anand wrote:
Hi Dave,
Thanks for your review and inputs.
On 18/02/2016:08:44:25 PM, Dave Young wrote:
I have several thing to make clear:
- Previously we tried to add it as a dracut module in upstream dracut.
https://www.mail-archive.com/initramfs@vger.kernel.org/msg03299.html In dracut there is a simple watchdog module which is used for testing purpose only. We have a reliable way to get hostonly live watchdog module based on your kernel patches, so it might be better to retry to enhance the dracut watchdog module. Kdump use dracut --hostonly, we can add using wdt module for hostonly mode. In this way we can add another dracut kernel cmdline like rd.nowdt so that one can disable wdt functionality in initramfs if he/she want the original behavior.
Its a good idea for implementing a switch, which will help to insert or not to insert active wdt module in kdump kernel. Since, I do not have much idea about dracut, so to understand it better for implementation perspective:
- user can pass rd.nowdt through KDUMP_COMMANDLINE_APPEND of /etc/sysconfig/kdump
Right
- We will have an upstream dracut change, which will check that if rd.nowdt was *not* passed in command line then, add modules from rd.driver.pre.
I meant about moving patch 1/2 to dracut, not only about the new param.
Dracut git is here: git://git.kernel.org/pub/scm/boot/dracut/dracut.git
There is already a dracut module named 04watchdog; see dracut/modules.d/04watchdog

But it is used only as a test for now. It might need more investigation to see whether we can add code into the module. 04watchdog does not use systemd, and it hardcodes the modules to load, like below:
modprobe ib700wdt
modprobe i6300esb
- Previously we planned to only enabling wdt when driver initialization can not
stop wdt eg. in 2nd kernel the wdt status is active after insmod. But if we copy systemd.conf from 1st kernel to initrd, it will by default enable the wdt. Do you think it is a better way? Maybe it can address the concern from Prarit?
I think, its better to keep same state of wdt what was in primary kernel (if rd.nowdt was not passed). I am not sure if we need to implement systemd.conf copy, I think that is already done currently. Both on fedora and RHEL as you can see in commit log, wdt is active in kdump kernel if it was active in primary kernel.
What I meant is that I want to know whether this is a new design you arrived at while working on the patch, though it sounds reasonable as well.

The systemd dracut module will copy system.conf, and that is exactly the reason why the wdt is active in the 2nd kernel. If nobody copied it, then for iTCO_wdt the state should not be active, because the driver will disable it during module init.

According to our previous discussion, we will load the wdt driver and only kick it when the driver cannot disable it. For example, drop the wdt setup from system.conf in the kdump kernel; if the state is still active after insmod, then we will enable it and kick the wdt.
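The check described here (load the driver, then kick only if the watchdog is still active) might be sketched as follows. The function name wdt_action_after_load and its directory argument are assumptions for illustration only. The sketch just reads the state attribute and reports a decision; how the kick itself is delivered is left out.

```shell
# Sketch only: decide what to do once the wdt driver has been loaded.
# $1 (optional, hypothetical): sysfs directory of the watchdog device.
# Prints "kick" if the watchdog is still active, "skip" otherwise.
wdt_action_after_load() {
    wdt_dir="${1:-/sys/class/watchdog/watchdog0}"
    if [ -r "$wdt_dir/state" ] && [ "$(cat "$wdt_dir/state")" = "active" ]; then
        echo "kick"
    else
        echo "skip"
    fi
}
```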
One case, as we discussed before: if the wdt driver fails to disable the wdt, it will still be active. Does systemd still work as expected then?

kdump:/# cat /sys/class/watchdog/watchdog0/state
active
- What if in 1st kernel one do not use systemd as wdt daemon? Then in 2nd
kernel the behavior will be just like what we planned, insmod, driver stop the wdt, but still need kick it when driver failed to stop the wdt?
If it is not systemd and a custom wdt application, then I think user will need to pass name of the application in /etc/kdump.conf:extra_bins and the kick command in /etc/kdump.conf:kdump_pre.
OK, that makes sense, but we need to tell the user about it when we add the wdt into the initrd.
Thanks Dave
On 02/19/16 at 08:00pm, Dave Young wrote:
Hi, Pratyush
Ccing harald for dracut discussion
Seems the code part was dropped
Here is the thread for Harald to get the details: https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/...
There were some kernel patches for us to get the right kernel modules from the wdt device being used:

http://www.spinics.net/lists/linux-watchdog/msg07294.html
http://www.spinics.net/lists/linux-watchdog/msg07282.html
Thanks Dave
Hi Dave,
On 19/02/2016:08:00:23 PM, Dave Young wrote:
Hi, Pratyush
Ccing harald for dracut discussion
On 02/18/16 at 09:02pm, Pratyush Anand wrote:
Hi Dave,
Thanks for your review and inputs.
On 18/02/2016:08:44:25 PM, Dave Young wrote:
I have several thing to make clear:
- Previously we tried to add it as a dracut module in upstream dracut.
https://www.mail-archive.com/initramfs@vger.kernel.org/msg03299.html In dracut there is a simple watchdog module which is used for testing purpose only. We have a reliable way to get hostonly live watchdog module based on your kernel patches, so it might be better to retry to enhance the dracut watchdog module. Kdump use dracut --hostonly, we can add using wdt module for hostonly mode. In this way we can add another dracut kernel cmdline like rd.nowdt so that one can disable wdt functionality in initramfs if he/she want the original behavior.
Its a good idea for implementing a switch, which will help to insert or not to insert active wdt module in kdump kernel. Since, I do not have much idea about dracut, so to understand it better for implementation perspective:
- user can pass rd.nowdt through KDUMP_COMMANDLINE_APPEND of /etc/sysconfig/kdump
Right
- We will have an upstream dracut change, which will check that if rd.nowdt was *not* passed in command line then, add modules from rd.driver.pre.
I meant about moving patch 1/2 to dracut, not only about the new param.
If I understood it correctly:
- a dracut service (04watchdog) will check if it is --hostonly mode
- if yes, then it will check if it is !rd.nowdt
- if yes, then update rd.driver.pre with the active wdt module name (as we are doing in this patch).
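The three steps above can be written down as a small decision function. This is only a sketch of the proposed flow, not existing dracut code: rd.nowdt is the parameter suggested in this thread, and the function name wdt_driver_pre and its three arguments are hypothetical.

```shell
# Sketch only: emit an rd.driver.pre= line for the active wdt modules, or
# return 1 when the watchdog should be left out of the initramfs.
# $1: "yes" when building in hostonly mode (hypothetical flag)
# $2: kernel command line to inspect
# $3: comma-separated list of active wdt modules
wdt_driver_pre() {
    hostonly="$1"
    cmdline="$2"
    modules="$3"
    [ "$hostonly" = "yes" ] || return 1
    [ -n "$modules" ] || return 1
    # Pad with spaces so that only a whole-word rd.nowdt matches.
    case " $cmdline " in
        *" rd.nowdt "*) return 1 ;;
    esac
    echo "rd.driver.pre=$modules"
}
```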
@Harald: Please let me know your opinion about the above proposed changes in upstream dracut.

Also, is --hostonly mode used by any application other than kdump? If yes, would it be wise to keep the same implementation for them as well? If not, how can we detect that it is kdump that is using dracut?
Dracut git is here: git://git.kernel.org/pub/scm/boot/dracut/dracut.git
There already a dracut module named 04watchdog, see dracut/modules.d/04watchdog
But it is used as test only for now. It might need more investigation if we can
Sure, will do some experiment to verify the possibility.
add code into the module. In 04watchdog it does not use systemd also it hardcode to load modules like below: modprobe ib700wdt modprobe i6300esb
I think the hardcoded part will remain as it is. We need to make changes only when /dev/watchdog exists.
- Previously we planned to only enabling wdt when driver initialization can not
stop wdt eg. in 2nd kernel the wdt status is active after insmod. But if we copy systemd.conf from 1st kernel to initrd, it will by default enable the wdt. Do you think it is a better way? Maybe it can address the concern from Prarit?
I think, its better to keep same state of wdt what was in primary kernel (if rd.nowdt was not passed). I am not sure if we need to implement systemd.conf copy, I think that is already done currently. Both on fedora and RHEL as you can see in commit log, wdt is active in kdump kernel if it was active in primary kernel.
What I meant is want to know if it is a new design while working on the patch. though It sounds reasonable as well.
Well, I did not make any change to the design in this patch. It seems that systemd behaves like that. It also seemed more reasonable to me, because this way we also get the benefit of the watchdog in the kdump kernel: if something goes wrong during the dump process, the watchdog will restart the machine (if it was active in the primary kernel as well).
systemd dracut module will copy system.conf, it is exactly the reason why it is active in 2nd kernel. If nobody copied it then for iTCO_wdt the state should not be active because driver will disable it during module init.
Yes, exactly.
According our previous discuss we will load the wdt driver, only kick it when driver can not disable it. For example drop the wdt setup in system.conf in kdump kernel, if the state is still active after insmod then we will enable it and kick wdt.
Yes, this is the step I left out (dropping the wdt setup in system.conf in the kdump kernel), and that was a mistake. But on second thought the current form seems more appealing: I do not see any side effect, and I do see an advantage in having the watchdog enabled in the kdump kernel.
One case is like we talked before if wdt driver failed to disable the wdt then it will still be active, then does systemd still work as expected?
OK, I will get back on this soon.
kdump:/# cat /sys/class/watchdog/watchdog0/state active
- What if in 1st kernel one do not use systemd as wdt daemon? Then in 2nd
kernel the behavior will be just like what we planned, insmod, driver stop the wdt, but still need kick it when driver failed to stop the wdt?
If it is not systemd and a custom wdt application, then I think user will need to pass name of the application in /etc/kdump.conf:extra_bins and the kick command in /etc/kdump.conf:kdump_pre.
OK, it makes sense, but we need to tell the user that when we add the wdt to the initrd.
Yes, that we need to document.
Thanks for your feedback.
~Pratyush
Hi, Pratyush
On 02/22/16 at 05:59pm, Pratyush Anand wrote:
Hi Dave,
On 19/02/2016:08:00:23 PM, Dave Young wrote:
Hi, Pratyush
Ccing harald for dracut discussion
On 02/18/16 at 09:02pm, Pratyush Anand wrote:
Hi Dave,
Thanks for your review and inputs.
On 18/02/2016:08:44:25 PM, Dave Young wrote:
I have several things to clarify:
- Previously we tried to add it as a dracut module in upstream dracut.
https://www.mail-archive.com/initramfs@vger.kernel.org/msg03299.html In dracut there is a simple watchdog module which is used for testing purposes only. We now have a reliable way to get the hostonly live watchdog module based on your kernel patches, so it might be better to retry enhancing the dracut watchdog module. Kdump uses dracut --hostonly, so we can add use of the wdt module for hostonly mode. This way we can add another dracut kernel cmdline option like rd.nowdt so that one can disable the wdt functionality in the initramfs if he/she wants the original behavior.
It's a good idea to implement a switch which controls whether or not the active wdt module is inserted in the kdump kernel. Since I do not know much about dracut, let me check my understanding from an implementation perspective:
- The user can pass rd.nowdt through KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump
Right
- We will have an upstream dracut change which checks whether rd.nowdt was *not* passed on the command line and, if so, adds modules from rd.driver.pre.
I meant moving patch 1/2 to dracut, not only the new param.
If I understood it correctly:
- a dracut service (04watchdog) will check whether it is in --hostonly mode
- if yes, then it will check if it is !rd.nowdt
- if yes, then update rd.driver.pre with the active wdt module name (as we are doing in this patch).
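The three steps above could be sketched roughly as follows. This is not the code from the patch or from dracut: the function names and the sysfs-root parameter are made up for illustration, and the sketch assumes the owning module can be read from the device/driver/module symlink under /sys/class/watchdog/watchdogN, which the kernel changes mentioned in the cover letter are meant to make available. The hostonly check is left to the caller.

```shell
#!/bin/sh
# Sketch only: names below are hypothetical, not taken from the patchset.

# Succeeds if rd.nowdt appears as a word in the given kernel command line.
has_nowdt() {
	case " $1 " in
		*" rd.nowdt "*) return 0 ;;
	esac
	return 1
}

# Prints the module name behind every watchdog whose state is "active".
# $1 is the sysfs root (defaults to /sys) so the function can be tried
# against a fake tree.
active_wdt_modules() {
	sysroot="${1:-/sys}"
	for dir in "$sysroot"/class/watchdog/*; do
		[ -f "$dir/state" ] || continue
		[ "$(cat "$dir/state")" = "active" ] || continue
		# Assumption: this symlink points at /sys/module/<name>.
		modlink="$dir/device/driver/module"
		[ -L "$modlink" ] || continue
		basename "$(readlink "$modlink")"
	done
}

# Prints an rd.driver.pre= argument for the active watchdog modules,
# or fails if no watchdog is active.
wdt_cmdline() {
	mods=$(active_wdt_modules "$1" | sort -u | paste -sd, -)
	[ -n "$mods" ] || return 1
	echo "rd.driver.pre=$mods"
}
```

A caller would skip the whole thing when has_nowdt succeeds for the current command line, and otherwise write the output of wdt_cmdline into a cmdline.d snippet inside the initramfs.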
@Harald: Please let me know your opinion about the above proposed changes in upstream dracut.
Also, is --hostonly mode used by any application other than kdump? If yes, would it be wise to keep the same implementation for them as well? If not, how can we detect that it is kdump that is using dracut?
--hostonly is a general option which is not only for kdump; that is why I think we should do it in dracut. It may make sense to add hostonly wdt logic so that it also helps kexec reboot.
Dracut git is here: git://git.kernel.org/pub/scm/boot/dracut/dracut.git
There is already a dracut module named 04watchdog; see dracut/modules.d/04watchdog
But it is used for testing only for now. It might need more investigation whether we can add code into the module.
Sure, I will do some experiments to verify the possibility.
04watchdog does not use systemd, and it hardcodes the modules to load, like below:
modprobe ib700wdt
modprobe i6300esb
I think that hardcoded part will remain as it is. We need to make changes only for the case when /dev/watchdog exists.
- Previously we planned to enable the wdt only when driver initialization cannot stop it, e.g. when the wdt status in the 2nd kernel is still active after insmod. But if we copy system.conf from the 1st kernel to the initrd, it will enable the wdt by default. Do you think it is a better way? Maybe it can address the concern from Prarit?
I think it's better to keep the wdt in the same state it had in the primary kernel (if rd.nowdt was not passed). I am not sure whether we need to implement the system.conf copy; I think that is already done currently. On both Fedora and RHEL, as you can see in the commit log, the wdt is active in the kdump kernel if it was active in the primary kernel.
What I meant is that I wanted to know whether this is a new design introduced while working on the patch, though it sounds reasonable as well.
Well, I did not make any change to the design in this patch. It seems that systemd behaves like that. It then seemed more reasonable to me as well, because this way we also get the benefit of the watchdog in the kdump kernel. If something goes wrong during the dump process, the watchdog will restart the machine (if it was active in the primary kernel as well).
Thanks, I agree that the current approach in the patchset is good enough.
The systemd dracut module copies system.conf; that is exactly why the wdt is active in the 2nd kernel. If nobody copied it, then for iTCO_wdt the state should not be active, because the driver disables it during module init.
Yes, exactly.
According to our previous discussion, we will load the wdt driver and only kick it when the driver cannot disable it. For example, drop the wdt setup from system.conf in the kdump kernel; if the state is still active after insmod, then we enable it and kick the wdt.
Yes, this is what I did not include (dropping the wdt setup from system.conf in the kdump kernel), and it was a mistake. But on second thought, the current form seems more appealing: I do not see any side effect, and I do see an advantage in having the watchdog enabled in the kdump kernel.
One case, as we discussed before: if the wdt driver fails to disable the wdt, it will still be active. Does systemd still work as expected then?
OK, I will get back on this soon.
kdump:/# cat /sys/class/watchdog/watchdog0/state
active
- What if one does not use systemd as the wdt daemon in the 1st kernel? Then in the 2nd kernel the behavior will be just as we planned: insmod, the driver stops the wdt, but we still need to kick it when the driver fails to stop the wdt?
If it is not systemd but a custom wdt application, then I think the user will need to pass the name of the application in /etc/kdump.conf:extra_bins and the kick command in /etc/kdump.conf:kdump_pre.
OK, it makes sense, but we need to tell the user that when we add the wdt to the initrd.
Yes, that we need to document.
Thanks for your feedback.
~Pratyush
Thanks Dave
Hi Dave,
On 23/02/2016:09:14:10 AM, Dave Young wrote:
Hi, Pratyush
On 02/22/16 at 05:59pm, Pratyush Anand wrote:
Also, is --hostonly mode used by any application other than kdump? If yes, would it be wise to keep the same implementation for them as well? If not, how can we detect that it is kdump that is using dracut?
--hostonly is a general option which is not only for kdump; that is why I think we should do it in dracut. It may make sense to add hostonly wdt logic so that it also helps kexec reboot.
Let's say we have another (non-kdump) application which creates an initramfs with `dracut --hostonly`. What I wanted to confirm is that, with the proposed implementation, the initramfs of that non-kdump application will also load drivers for the active watchdog. Will that be acceptable to other applications?
Well, I did not make any change to the design in this patch. It seems that systemd behaves like that. It then seemed more reasonable to me as well, because this way we also get the benefit of the watchdog in the kdump kernel. If something goes wrong during the dump process, the watchdog will restart the machine (if it was active in the primary kernel as well).
Thanks, I agree that the current approach in the patchset is good enough.
Thanks :-)
One case, as we discussed before: if the wdt driver fails to disable the wdt, it will still be active. Does systemd still work as expected then?
OK, I will get back on this soon.
I changed probe in iTCO_wdt.c so that it does not explicitly stop the wdt. I kicked the watchdog, restarted kdump and crashed the system. I made the kdump kernel stop at the dracut shell and then waited for a sufficient time, but did not see the system reboot via the watchdog, so it seems that some other operation in probe stopped the watchdog. So the exact situation you were suggesting is difficult to reproduce (at least) with iTCO. However, I am wondering why any driver would not ensure that the wdt is explicitly stopped. If it is left running by any chance, won't that be an unstable situation where the system could reset before an application starts kicking its wdt? So, in my understanding, if a watchdog driver does not explicitly stop the wdt in probe(), it's a bug in the driver which needs to be fixed.
~Pratyush
On 02/24/16 at 08:31pm, Pratyush Anand wrote:
Hi Dave,
On 23/02/2016:09:14:10 AM, Dave Young wrote:
Hi, Pratyush
On 02/22/16 at 05:59pm, Pratyush Anand wrote:
Also, is --hostonly mode used by any application other than kdump? If yes, would it be wise to keep the same implementation for them as well? If not, how can we detect that it is kdump that is using dracut?
--hostonly is a general option which is not only for kdump; that is why I think we should do it in dracut. It may make sense to add hostonly wdt logic so that it also helps kexec reboot.
Let's say we have another (non-kdump) application which creates an initramfs with `dracut --hostonly`. What I wanted to confirm is that, with the proposed implementation, the initramfs of that non-kdump application will also load drivers for the active watchdog. Will that be acceptable to other applications?
Hostonly means packing only the things useful for the current host, so I think each dracut module can check the setup currently in use and do the same in the initramfs.
Although systemd supports the watchdog, it is useless if the kernel hangs in the initramfs phase, because we do not add watchdog drivers to the initrd. So I think it is also useful for the 1st kernel's normal boot.
For people who do not want it, we can add a cmdline param to disable the behavior.
Well, I did not make any change to the design in this patch. It seems that systemd behaves like that. It then seemed more reasonable to me as well, because this way we also get the benefit of the watchdog in the kdump kernel. If something goes wrong during the dump process, the watchdog will restart the machine (if it was active in the primary kernel as well).
Thanks, I agree that the current approach in the patchset is good enough.
Thanks :-)
One case, as we discussed before: if the wdt driver fails to disable the wdt, it will still be active. Does systemd still work as expected then?
OK, I will get back on this soon.
I changed probe in iTCO_wdt.c so that it does not explicitly stop the wdt. I kicked the watchdog, restarted kdump and crashed the system. I made the kdump kernel stop at the dracut shell and then waited for a sufficient time, but did not see the system reboot via the watchdog, so it seems that some other operation in probe stopped the watchdog. So the exact situation you were suggesting is difficult to reproduce (at least) with iTCO.
We need to keep it in mind. Since it is not reproducible, I think we can leave it as is and address problems in the future if any arise.
However, I am wondering why any driver would not ensure that the wdt is explicitly stopped. If it is left running by any chance, won't that be an unstable situation where the system could reset before an application starts kicking its wdt? So, in my understanding, if a watchdog driver does not explicitly stop the wdt in probe(), it's a bug in the driver which needs to be fixed.
I'm not sure if it is a bug, Don may have more thoughts about it.
Thanks Dave
There could be some dynamic system modification which may affect the kdump kernel boot process. For example, if the status of a watchdog device is changed by a user, then the initramfs must be rebuilt on the basis of the new watchdog status.
This patch adds a path to check for such dynamic system modifications. While doing that, it also adds a checker for watchdog status modification.
Testing:
-------------------------------------------------------
Initramfs    wdt state                   Result
             Prev     Current
-------------------------------------------------------
Not Exist    NA       X                  Rebuild
Exist        Inact    Inact              No Rebuild
Exist        Inact    Act                Force Rebuild
Exist        Act      Inact              Force Rebuild
Exist        Act      Act(Same wdt)      No Rebuild
Exist        Act      Act(Diff wdt)      Force Rebuild
Signed-off-by: Pratyush Anand <panand@redhat.com>
---
 kdumpctl | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)
diff --git a/kdumpctl b/kdumpctl
index 9f7e56b1a524..58d6ccf4b90d 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -322,6 +322,27 @@ setup_target_initrd()
 	fi
 }
 
+is_wdt_modified()
+{
+	[ -f $TARGET_INITRD ] || return 0
+
+	wdtcmdline=$(get_wdt_cmdline)
+	if [ "$?" = "0" ];then
+		lsinitrd -f etc/cmdline.d/00-wdt.conf $TARGET_INITRD | grep $wdtcmdline &> /dev/null
+	else
+		[ "$(lsinitrd -f etc/cmdline.d/00-wdt.conf $TARGET_INITRD)" = "" ]
+	fi
+
+	return $?
+}
+
+is_system_modified()
+{
+	is_wdt_modified || return 1
+
+	return 0
+}
+
 check_rebuild()
 {
 	local extra_modules modified_files=""
@@ -383,6 +404,8 @@ check_rebuild()
 		fi
 	done
 
+	is_system_modified || force_rebuild="1"
+
 	#check if target initrd has fadump support
 	if [ "$DEFAULT_DUMP_MODE" = "fadump" ] && [ -f "$TARGET_INITRD" ]; then
 		initramfs_has_fadump=`lsinitrd -m $TARGET_INITRD | grep ^kdumpbase$ | wc -l`
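For readers who want the test matrix from the commit message in executable form, it can be modelled as a small pure function. This is only an illustration of the table's Result column, not the logic of is_wdt_modified itself (which compares against the 00-wdt.conf stored in the initramfs); the function name and its argument convention are assumptions made for this sketch.

```shell
#!/bin/sh
# needs_rebuild INITRD_EXISTS PREV_WDT CUR_WDT   (hypothetical helper)
#   INITRD_EXISTS: "yes" if the target initramfs is present
#   PREV_WDT:      wdt module recorded in the initramfs, "" if it was inactive
#   CUR_WDT:       wdt module active now, "" if inactive
# Returns 0 when a rebuild is needed, 1 otherwise.
needs_rebuild() {
	# No initramfs yet: it has to be built regardless of wdt state.
	[ "$1" = "yes" ] || return 0
	# Otherwise rebuild only when the recorded and current wdt differ.
	[ "$2" != "$3" ]
}
```

For example, `needs_rebuild yes "" iTCO_wdt` succeeds (Inact to Act, force rebuild), while `needs_rebuild yes iTCO_wdt iTCO_wdt` fails (same wdt, no rebuild).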
Hi, Pratyush
Ccing people who may be interested in the issue.
I will read and comment on the code details after I finish reading them, but I would first like to comment on the things on my mind.
On 02/02/16 at 10:58am, Pratyush Anand wrote:
There could be some dynamic system modification which may affect the kdump kernel boot process. For example, if the status of a watchdog device is changed by a user, then the initramfs must be rebuilt on the basis of the new watchdog status.
This patch adds a path to check for such dynamic system modifications. While doing that, it also adds a checker for watchdog status modification.
It still only checks when the kdump service restarts; one needs to restart the kdump service manually, or the check will run at the next reboot. The user needs to be aware that the wdt setup influences the kdump service. Maybe we should document it explicitly and tell the user that kdump is adding the wdt.
Or we should introduce a way to automatically detect the change; I am not sure how to do it or whether it is worth it.
Thanks Dave
Hi Dave,
On 18/02/2016:08:56:57 PM, Dave Young wrote:
Hi, Pratyush
Ccing people who may be interested in the issue.
I will read and comment on the code details after I finish reading them, but I would first like to comment on the things on my mind.
On 02/02/16 at 10:58am, Pratyush Anand wrote:
There could be some dynamic system modification which may affect the kdump kernel boot process. For example, if the status of a watchdog device is changed by a user, then the initramfs must be rebuilt on the basis of the new watchdog status.
This patch adds a path to check for such dynamic system modifications. While doing that, it also adds a checker for watchdog status modification.
It still only checks when the kdump service restarts; one needs to restart the kdump service manually, or the check will run at the next reboot. The user needs to be aware that the wdt setup influences the kdump service. Maybe we should document it explicitly and tell the user that kdump is adding the wdt.
Yes, it can be documented.
Or we should introduce a way to automatically detect the change; I am not sure how to do it or whether it is worth it.
I had thought of it, but did not come up with any good idea for doing it. However, we can keep it as an open point and think about it as a future enhancement.
~Pratyush
Hi, Pratyush
Or we should introduce a way to automatically detect the change; I am not sure how to do it or whether it is worth it.
I had thought of it, but did not come up with any good idea for doing it. However, we can keep it as an open point and think about it as a future enhancement.
Agreed.
Thanks Dave