On 12/20/18 at 03:31pm, lijiang wrote:
On 2018-12-20 at 13:22, Baoquan He wrote:
> On 12/20/18 at 01:06pm, Lianbo Jiang wrote:
What:
>> By default, early kdump reboots the system after capturing the vmcore.
>> If the problematic system is continuously crashing due to some issue
>> during early boot stage, the system may fall into infinite loop restart
>> like this:
>>
>> boot -----> crash -----> early kdump (dump vmcore)
>> ^ |
>> '.........(reboot).............'
>
> This doesn't explain why early kdump falls into an infinite loop.
>
Thanks for your comment.
Why:
Originally, early kdump and normal kdump were not distinguished. Early kdump
only loads the kernel and initramfs as early as possible; it uses the same
logic as normal kdump to capture the vmcore and execute the default/final
action.
Plus: early kdump captures crashes that happen during kernel initialization,
possibly when only the pid 0 process exists, so the crash could always occur
on that machine.
For normal kdump, the default/final action is to reboot after capturing the
vmcore, so early kdump takes the same action. When the problematic system
crashes at the early boot stage, it switches to the kdump-capture service,
dumps the vmcore, and then reboots. The system then boots, crashes, and dumps
the vmcore again and again.
>>
How:
To fix it, a system crash at the early stage is captured only by early kdump,
and the rest is captured by normal kdump. That is to say, when normal kdump
...
> >> But now, the system crash at early stage is only captured by early kdump,
> >
> > Does this 'now' mean the current code, or the code after this patch is
> > applied?
> >
>
> After this patch is applied.
>
> >> and the rest is captured by normal kdump. That is to say, when the normal
> >> kdump service starts, it reloads the kdump kernel and overrides early
> >> kdump. This makes it possible to control the logic of early kdump and
> >> normal kdump separately in the final action (which is called by
> >> kdump-capture.service). For example, early kdump always passes
> >> 'rd.earlykdump' to the second kernel when early kdump is enabled, but
> >> normal kdump never passes 'rd.earlykdump' to the second kernel. So they
> >> can be distinguished in the second kernel.
> >>
> >> Signed-off-by: Lianbo Jiang <lijiang(a)redhat.com>
> >> ---
> >> dracut-early-kdump.sh | 2 +-
> >> dracut-kdump-error-handler.sh | 7 ++++++-
> >> dracut-kdump.sh | 6 +++++-
> >> kdump-lib-initramfs.sh | 16 +++++++++++++---
> >> kdump.sysconfig | 2 +-
> >> kdump.sysconfig.aarch64 | 2 +-
> >> kdump.sysconfig.i386 | 2 +-
> >> kdump.sysconfig.ppc64 | 2 +-
> >> kdump.sysconfig.ppc64le | 2 +-
> >> kdump.sysconfig.s390x | 2 +-
> >> kdump.sysconfig.x86_64 | 2 +-
> >> kdumpctl | 15 +++++++++++++--
> >> 12 files changed, 45 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/dracut-early-kdump.sh b/dracut-early-kdump.sh
> >> index 34a9909..a799dbf 100755
> >> --- a/dracut-early-kdump.sh
> >> +++ b/dracut-early-kdump.sh
> >> @@ -58,7 +58,7 @@ early_kdump_load()
> >> fi
> >>
> >> $KEXEC ${EARLY_KEXEC_ARGS} $standard_kexec_args \
> >> - --command-line="$EARLY_KDUMP_CMDLINE" \
> >> + --command-line="$EARLY_KDUMP_CMDLINE rd.earlykdump" \
> >> --initrd=$EARLY_KDUMP_INITRD $EARLY_KDUMP_KERNEL
> >> if [ $? == 0 ]; then
> >> echo "kexec: loaded early-kdump kernel"
> >> diff --git a/dracut-kdump-error-handler.sh b/dracut-kdump-error-handler.sh
> >> index 2f0f1d1..4f0e58c 100755
> >> --- a/dracut-kdump-error-handler.sh
> >> +++ b/dracut-kdump-error-handler.sh
> >> @@ -1,5 +1,6 @@
> >> #!/bin/sh
> >>
> >> +. /lib/dracut-lib.sh
> >> . /lib/kdump-lib-initramfs.sh
> >>
> >> set -o pipefail
> >> @@ -7,4 +8,8 @@ export PATH=$PATH:$KDUMP_SCRIPT_DIR
> >>
> >> get_kdump_confs
> >> do_default_action
> >> -do_final_action
> >> +if getargbool 0 rd.earlykdump; then
> >> + do_earlykdump_final_action
> >> +else
> >> + do_final_action
> >> +fi
> >
> > Can we do it like this?
> > if getargbool 0 rd.earlykdump; then
> > FINAL_ACTION="systemctl poweroff"
> > fi
> > do_final_action
> >
>
> Sure, of course.
>
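The simplification agreed above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not the patch itself:
choose_final_action and its flag argument are hypothetical names, and the
action is printed rather than executed so the snippet is safe to run anywhere.
In the real scripts the flag would come from "getargbool 0 rd.earlykdump".

```shell
#!/bin/sh
# Default final action, same value as in kdump-lib-initramfs.sh.
FINAL_ACTION="systemctl reboot -f"

# Hypothetical helper (not part of the patch).
# $1 is "1" when rd.earlykdump was seen on the command line, "0" otherwise.
# Prints the action that would be run; nothing is executed here.
choose_final_action() {
    action="$FINAL_ACTION"
    if [ "$1" = "1" ]; then
        # Power off instead of rebooting, to break the crash/reboot loop.
        action="systemctl poweroff"
    fi
    printf '%s\n' "$action"
}

choose_final_action 1
choose_final_action 0
```

Keeping a single do_final_action and only overriding the variable means there
is one code path to maintain, which seems to be the point of the suggestion.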
> >> diff --git a/dracut-kdump.sh b/dracut-kdump.sh
> >> index b75c2a5..119b006 100755
> >> --- a/dracut-kdump.sh
> >> +++ b/dracut-kdump.sh
> >> @@ -201,4 +201,8 @@ if [ $DUMP_RETVAL -ne 0 ]; then
> >> exit 1
> >> fi
> >>
> >> -do_final_action
> >> +if getargbool 0 rd.earlykdump; then
> >> + do_earlykdump_final_action
> >> +else
> >> + do_final_action
> >> +fi
> >> diff --git a/kdump-lib-initramfs.sh b/kdump-lib-initramfs.sh
> >> index 7ba99b6..f95cbcf 100755
> >> --- a/kdump-lib-initramfs.sh
> >> +++ b/kdump-lib-initramfs.sh
> >> @@ -6,7 +6,7 @@ KDUMP_PATH="/var/crash"
> >> CORE_COLLECTOR=""
> >> DEFAULT_CORE_COLLECTOR="makedumpfile -l --message-level 1 -d 31"
> >> DMESG_COLLECTOR="/sbin/vmcore-dmesg"
> >> -DEFAULT_ACTION="systemctl reboot -f"
> >> +DEFAULT_ACTION=""
> >> DATEDIR=`date +%Y-%m-%d-%T`
> >> HOST_IP='127.0.0.1'
> >> DUMP_INSTRUCTION=""
> >> @@ -14,6 +14,7 @@ SSH_KEY_LOCATION="/root/.ssh/kdump_id_rsa"
> >> KDUMP_SCRIPT_DIR="/kdumpscripts"
> >> DD_BLKSIZE=512
> >> FINAL_ACTION="systemctl reboot -f"
> >> +EARLYKDUMP_FINAL_ACTION="systemctl poweroff"
> >> KDUMP_CONF="/etc/kdump.conf"
> >> KDUMP_PRE=""
> >> KDUMP_POST=""
> >> @@ -155,11 +156,20 @@ kdump_emergency_shell()
> >>
> >> do_default_action()
> >> {
> >> - echo "Kdump: Executing default action $DEFAULT_ACTION"
> >> - eval $DEFAULT_ACTION
> >> + if [ -z "$DEFAULT_ACTION" ]; then
> >> + echo "Kdump: default action string is null."
> >> + else
> >> + echo "Kdump: Executing default action $DEFAULT_ACTION"
> >> + eval $DEFAULT_ACTION
> >> + fi
> >> }
> >>
> >> do_final_action()
> >> {
> >> eval $FINAL_ACTION
> >> }
> >> +
> >> +do_earlykdump_final_action()
> >> +{
> >> + eval $EARLYKDUMP_FINAL_ACTION
> >> +}
> >> diff --git a/kdump.sysconfig b/kdump.sysconfig
> >> index ffe1df8..b011c1c 100644
> >> --- a/kdump.sysconfig
> >> +++ b/kdump.sysconfig
> >> @@ -17,7 +17,7 @@ KDUMP_COMMANDLINE=""
> >> # This variable lets us remove arguments from the current kdump commandline
> >> # as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
> >> # NOTE: some arguments such as crashkernel will always be removed
> >> -KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet"
> >> +KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet rd.earlykdump"
> >
> > I don't understand. Early kdump gets the normal kernel's cmdline and then
> > adds rd.earlykdump, while normal kdump reloads and gets the normal kernel's
> > cmdline again. Why do you need to remove it here? Does normal kdump
> > inherit early kdump's cmdline?
> >
>
> rd.earlykdump is added to the kernel command line in grub.cfg, so early kdump
> and normal kdump both get the same parameters from /proc/cmdline in the first
> kernel.
>
> Early kdump passes rd.earlykdump to the second kernel, but normal kdump doesn't
> need it, so normal kdump has to remove rd.earlykdump.
>
> That way early kdump and normal kdump can be distinguished in the second
> kernel, which helps to control the logic of the kdump capture service, for
> example the default action/final action.
>
> Thanks.
>
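To make the removal step concrete, here is a minimal sketch of stripping one
parameter from a command line string. strip_cmdline_arg is a hypothetical
helper written for this illustration; it is not the function kdumpctl uses to
process KDUMP_COMMANDLINE_REMOVE, and it assumes parameters are separated by
whitespace with no quoted values containing spaces.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration only).
# $1: a kernel command line, $2: the parameter name to drop.
# Prints $1 without any word equal to $2 or starting with "$2=".
# The body runs in a subshell so "set -f" and the temporary
# variables do not leak into the caller.
strip_cmdline_arg() (
    set -f                      # no pathname expansion while splitting $1
    out=""
    for word in $1; do
        case "$word" in
            "$2"|"$2"=*) ;;     # drop "name" and "name=value"
            *) out="${out:+$out }$word" ;;
        esac
    done
    printf '%s\n' "$out"
)

strip_cmdline_arg "ro quiet rd.earlykdump crashkernel=256M" rd.earlykdump
```

With rd.earlykdump removed from what normal kdump passes on, only a capture
kernel loaded by early kdump still carries the flag.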
> >>
> >> # This variable lets us append arguments to the current kdump commandline
> >> # after processed by KDUMP_COMMANDLINE_REMOVE
> >> diff --git a/kdump.sysconfig.aarch64 b/kdump.sysconfig.aarch64
> >> index 0a6b14c..b8b8865 100644
> >> --- a/kdump.sysconfig.aarch64
> >> +++ b/kdump.sysconfig.aarch64
> >> @@ -17,7 +17,7 @@ KDUMP_COMMANDLINE=""
> >> # This variable lets us remove arguments from the current kdump commandline
> >> # as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
> >> # NOTE: some arguments such as crashkernel will always be removed
> >> -KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet"
> >> +KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet rd.earlykdump"
> >>
> >> # This variable lets us append arguments to the current kdump commandline
> >> # after processed by KDUMP_COMMANDLINE_REMOVE
> >> diff --git a/kdump.sysconfig.i386 b/kdump.sysconfig.i386
> >> index 18c407e..b9a2835 100644
> >> --- a/kdump.sysconfig.i386
> >> +++ b/kdump.sysconfig.i386
> >> @@ -17,7 +17,7 @@ KDUMP_COMMANDLINE=""
> >> # This variable lets us remove arguments from the current kdump commandline
> >> # as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
> >> # NOTE: some arguments such as crashkernel will always be removed
> >> -KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet"
> >> +KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet rd.earlykdump"
> >>
> >> # This variable lets us append arguments to the current kdump commandline
> >> # after processed by KDUMP_COMMANDLINE_REMOVE
> >> diff --git a/kdump.sysconfig.ppc64 b/kdump.sysconfig.ppc64
> >> index 55a01cc..e1ea9c6 100644
> >> --- a/kdump.sysconfig.ppc64
> >> +++ b/kdump.sysconfig.ppc64
> >> @@ -17,7 +17,7 @@ KDUMP_COMMANDLINE=""
> >> # This variable lets us remove arguments from the current kdump commandline
> >> # as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
> >> # NOTE: some arguments such as crashkernel will always be removed
> >> -KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet"
> >> +KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet rd.earlykdump"
> >>
> >> # This variable lets us append arguments to the current kdump commandline
> >> # after processed by KDUMP_COMMANDLINE_REMOVE
> >> diff --git a/kdump.sysconfig.ppc64le b/kdump.sysconfig.ppc64le
> >> index 55a01cc..e1ea9c6 100644
> >> --- a/kdump.sysconfig.ppc64le
> >> +++ b/kdump.sysconfig.ppc64le
> >> @@ -17,7 +17,7 @@ KDUMP_COMMANDLINE=""
> >> # This variable lets us remove arguments from the current kdump commandline
> >> # as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
> >> # NOTE: some arguments such as crashkernel will always be removed
> >> -KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet"
> >> +KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet rd.earlykdump"
> >>
> >> # This variable lets us append arguments to the current kdump commandline
> >> # after processed by KDUMP_COMMANDLINE_REMOVE
> >> diff --git a/kdump.sysconfig.s390x b/kdump.sysconfig.s390x
> >> index b3aec3c..4c5fc00 100644
> >> --- a/kdump.sysconfig.s390x
> >> +++ b/kdump.sysconfig.s390x
> >> @@ -17,7 +17,7 @@ KDUMP_COMMANDLINE=""
> >> # This variable lets us remove arguments from the current kdump commandline
> >> # as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
> >> # NOTE: some arguments such as crashkernel will always be removed
> >> -KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet"
> >> +KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet rd.earlykdump"
> >>
> >> # This variable lets us append arguments to the current kdump commandline
> >> # after processed by KDUMP_COMMANDLINE_REMOVE
> >> diff --git a/kdump.sysconfig.x86_64 b/kdump.sysconfig.x86_64
> >> index f269d02..3cccbfc 100644
> >> --- a/kdump.sysconfig.x86_64
> >> +++ b/kdump.sysconfig.x86_64
> >> @@ -17,7 +17,7 @@ KDUMP_COMMANDLINE=""
> >> # This variable lets us remove arguments from the current kdump commandline
> >> # as taken from either KDUMP_COMMANDLINE above, or from /proc/cmdline
> >> # NOTE: some arguments such as crashkernel will always be removed
> >> -KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet"
> >> +KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet rd.earlykdump"
> >>
> >> # This variable lets us append arguments to the current kdump commandline
> >> # after processed by KDUMP_COMMANDLINE_REMOVE
> >> diff --git a/kdumpctl b/kdumpctl
> >> index fe6af22..a8fd53d 100755
> >> --- a/kdumpctl
> >> +++ b/kdumpctl
> >> @@ -26,7 +26,7 @@ image_time=0
> >> standard_kexec_args="-p"
> >>
> >> # Some default values in case /etc/sysconfig/kdump doesn't include
> >> -KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug"
> >> +KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug rd.earlykdump"
> >>
> >> if [ -f /etc/sysconfig/kdump ]; then
> >> . /etc/sysconfig/kdump
> >> @@ -942,6 +942,11 @@ check_default_config()
> >> fi
> >> }
> >>
> >> +check_rd_earlykdump()
> >> +{
> >> + egrep "rd.earlykdump" /proc/cmdline
> >> +}
> >> +
> >> start()
> >> {
> >> check_dump_feasibility
> >> @@ -969,7 +974,13 @@ start()
> >> check_current_status
> >> if [ $? == 0 ]; then
> >> echo "Kdump already running: [WARNING]"
> >> - return 0
> >> + check_rd_earlykdump
> >> + #if earlykdump loaded, it will stop and start.
> >> + if [ $? -eq 0 ]; then
> >> + stop
> >> + else
> >> + return 0
> >> + fi
> >> fi
> >>
> >> if check_ssh_config; then
> >> --
> >> 2.17.1
> >>
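One note on check_rd_earlykdump() in the kdumpctl hunk: egrep prints the
matching line to stdout, and the unescaped "." in the pattern matches any
character. A whole-word test avoids both. The sketch below is only a
suggestion; cmdline_has_arg is a hypothetical name, and it takes the command
line as an argument so it can be tried without touching /proc/cmdline. It
deliberately matches only the bare word, so "rd.earlykdump=1" would need
extra handling.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration only).
# Succeeds (returns 0) when $2 appears as a whole,
# space-separated word in the command line string $1.
cmdline_has_arg() {
    case " $1 " in
        *" $2 "*) return 0 ;;
    esac
    return 1
}

# Intended use in the first kernel (command substitution strips the
# trailing newline of /proc/cmdline):
#   cmdline_has_arg "$(cat /proc/cmdline)" rd.earlykdump && echo "early kdump enabled"
```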