Re: [PATCH] Avoid falling into infinite loop restart when using a problematic system

Tuesday, 25 December 2018

On 12/24/18 at 12:17pm, Dave Young wrote:
...
 On 12/21/18 at 07:38pm, Dave Young wrote:
 > On 12/21/18 at 04:47pm, Buland Singh wrote:
 > > On 12/21/18 4:14 PM, Dave Young wrote:
 > > > On 12/21/18 at 12:59pm, Buland Singh wrote:
 > > > > On 12/21/18 12:29 PM, Kairui Song wrote:
 > > > > > Hi, Dave, Lianbo
 > > > > > 
 > > > > > My concern is that a crash loop may generate tons of dump cores and
 > > > > > fill up the dump target, which may be a larger potential risk.
 > > > > > Otherwise I think it's good to leave it as it is.
 > > > > > 
 > > > > > On Fri, Dec 21, 2018 at 2:05 PM lijiang <lijiang(a)redhat.com> wrote:
 > > > > > > 
 > > > > > > On 2018-12-21 10:49, Dave Young wrote:
 > > > > > > > + more people
 > > > > > > > On 12/20/18 at 04:49pm, lijiang wrote:
 > > > > > > > > On 2018-12-20 13:57, Dave Young wrote:
 > > > > > > > > > On 12/20/18 at 01:06pm, Lianbo Jiang wrote:
 > > > > > > > > > > By default, early kdump reboots the system after capturing the vmcore.
 > > > > > > > > > > If the problematic system is continuously crashing due to some issue
 > > > > > > > > > > during early boot stage, the system may fall into infinite loop restart
 > > > > > > > > > > like this:
 > > > > > > > > > > 
 > > > > > > > > > >       boot -----> crash -----> early kdump (dump vmcore)
 > > > > > > > > > >         ^                             |
 > > > > > > > > > >         '.........(reboot).............'
 > > > > > > > > > > 
 > > > > > > > > > > But now, a system crash at early stage is only captured by early kdump,
 > > > > > > > > > > and the rest is captured by normal kdump. That is to say, when the normal kdump
 > > > > > > > > > > service starts, it will load the kdump kernel again and override early kdump. It is
 > > > > > > > > > > helpful to control the logic of early kdump and normal kdump separately
 > > > > > > > > > > in the final action (it is called by kdump-capture.service). For example,
 > > > > > > > > > > early kdump always passes 'rd.earlykdump' to the second kernel when
 > > > > > > > > > > early kdump is enabled, but normal kdump doesn't pass 'rd.earlykdump'
 > > > > > > > > > > to the second kernel at any time. So they can be distinguished in the
 > > > > > > > > > > second kernel.
 > > > > > > > > > 
 > > > > > > > > > Hmm, I'm confused about the param passing above.
 > > > > > > > > > 
 > > > > > > > > 
 > > > > > > > > I copied some text from another email, please refer to this:
 > > > > > > > > [--->
 > > > > > > > > The rd.earlykdump is added to the kernel command line in grub.cfg. However, early kdump
 > > > > > > > > and normal kdump get the same parameters from /proc/cmdline in the first kernel.
 > > > > > > > > 
 > > > > > > > > Early kdump passes rd.earlykdump to the second kernel, but normal kdump doesn't
 > > > > > > > > need it, so normal kdump needs to remove rd.earlykdump.
 > > > > > > > > 
 > > > > > > > > That way early kdump and normal kdump can be distinguished in the second kernel. It helps
 > > > > > > > > to control the logic of the kdump capture service. For example: default action/final action.
 > > > > > > > > ]
 > > > > > > > 
 > > > > > > > The description is confusing, for example "early kdump passes ... to the second
 > > > > > > > kernel". What really happens is that a person adds the
 > > > > > > > param to the 1st kernel cmdline, and kexec-tools inherits it and passes it to the 2nd
 > > > > > > > kernel.
 > > > > > > > 
 > > > > > > 
 > > > > > > Yes. Good point. Thanks for your explanation.
 > > > > > > 
 > > > > > > > Anyway this is patch log issue.
 > > > > > > > 
 > > > > > > > > 
 > > > > > > > > > Early or non-early is just about the service loading phase, in
 > > > > > > > > 
 > > > > > > > > Yes. This patch uses the same method as you said. When the normal kdump service starts,
 > > > > > > > > it will reload. At the same time, early kdump will be overwritten by normal kdump.
 > > > > > > > 
 > > > > > > > Probably "early kdump load" is better wording than "early kdump".
 > > > > > > > 
 > > > > > > > > 
 > > > > > > > > > initramfs or not. I notice dracut/systemd will print some messages saying
 > > > > > > > > > they are running in initramfs, so probably you can check how to detect it
 > > > > > > > > > in the same way; if this is not initramfs then just unload before the
 > > > > > > > > > check in kdump loading.
 > > > > > > > > > 
 > > > > > > > > > The picture like below:
 > > > > > > > > > 
 > > > > > > > > > Kernel boot ->
 > > > > > > > > > 
 > > > > > > > > >      initramfs ---
 > > > > > > > > >           early kdump load
 > > > > > > > > > ---- Mark A  ----
 > > > > > > > > >      initramfs switch root
 > > > > > > > > > 
 > > > > > > > > >           system startup (real root fs)
 > > > > > > > > >                  service a
 > > > > > > > > >                  service b ... (eg. networking etc.)
 > > > > > > > > >                  kdump service start
 > > > > > > > > > -----Mark B -----
 > > > > > > > > >                           load kdump kernel again
 > > > > > > > > > 
 > > > > > > > > > 
 > > > > > > > > > The problem will happen between Mark A and Mark B. During this period
 > > > > > > > > > there could be a repeated crash -> earlykdump_load loop. There might be some
 > > > > > > > > > random crashes as well during the real root fs service startup,
 > > > > > > > > > for example if some network workload causes a panic after the network is ready.
 > > > > > > > > > That may not be 100% reproducible, so it seems we still need to
 > > > > > > > > > make the poweroff configurable. eg.
 > > > > > > > > > 
 > > > > > > > > > default is poweroff, but one can choose another action if needed.
 > > > > > > > > > 
 > > > > > > > > 
 > > > > > > > > Yes, default is poweroff for early kdump. If the kdump capture service
 > > > > > > > > hits an error or enters the emergency service, one can choose the default
 > > > > > > > > action (configure default=xxx in kdump.conf).
 > > > > > > > 
 > > > > > > > For the default action, as opposed to the final action, if you hardcode it then even if
 > > > > > > > one sets default to reboot it will still poweroff.
 > > > > > > > 
 > > > > > > 
 > > > > > > If really needed, that can be improved.
 > > > > > > 
 > > > > > > > [snip]
 > > > > > > > 
 > > > > > > > > > > 
 > > > > > > > > > > +check_rd_earlykdump()
 > > > > > > > > > > +{
 > > > > > > > > > > +    egrep "rd.earlykdump" /proc/cmdline
 > > > > > > > > > > +}
 > > > > > > > > > > +
 > > > > > > > > > >    start()
 > > > > > > > > > >    {
 > > > > > > > > > >      check_dump_feasibility
 > > > > > > > > > > @@ -969,7 +974,13 @@ start()
 > > > > > > > > > >      check_current_status
 > > > > > > > > > >      if [ $? == 0 ]; then
 > > > > > > > > > >              echo "Kdump already running: [WARNING]"
 > > > > > > > > > > -          return 0
 > > > > > > > > > > +          check_rd_earlykdump
 > > > > > > > > > > +          #if earlykdump loaded, it will stop and start.
 > > > > > > > > > > +          if [ $? -eq 0 ]; then
 > > > > > > > > > > +                  stop
 > > > > > > > > > 
 > > > > > > > > > kdumpctl start can be run not only by system startup services, one can also
 > > > > > > > > > run it manually or from a udev rule.
 > > > > > > > > > 
 > > > > > > > > > Checking the kernel cmdline does not seem to be enough.
 > > > > > > > > > 
 > > > > > > > > 
 > > > > > > > > Here it means that if kdump has been loaded, check whether early kdump did it.
 > > > > > > > > If yes, let normal kdump load it again, otherwise no need to do anything.
 > > > > > > > 
 > > > > > > > As we discussed you do not get my points here :)
 > > > > > > > 
 > > > > > > > check_rd_earlykdump will always be true once the kernel has booted, so there is
 > > > > > > > no way to tell the first normal kdump load from the later ones.
 > > > > > > > 
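To illustrate the point above: a check that only reads the boot command line gives the same answer every time it is called within one boot, so it cannot tell a first "kdumpctl start" from a later one. The sketch below is not part of the patch. It is a hypothetical helper, cmdline_has_arg, that takes the command line as a string and matches the parameter name exactly. The egrep in the patch treats '.' as a wildcard and would also match longer names such as rd.earlykdump.noreboot.

```shell
# Hypothetical helper, not from the patch: succeed if the exact kernel
# parameter "$2" (bare, or in name=value form) appears in the command
# line string "$1". The function body is a subshell, so "set -f" and
# the loop variable do not leak into the caller.
cmdline_has_arg() (
    set -f                      # no globbing while splitting on spaces
    for arg in $1; do
        case $arg in
            "$2"|"$2"=*) exit 0 ;;
        esac
    done
    exit 1
)
```

On a live system it could be called as: cmdline_has_arg "$(cat /proc/cmdline)" rd.earlykdump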
 > > > > > > > The early boot time panic this patch addresses is the 100%
 > > > > > > > reproducible panic. For this kind of panic the admin should be able to see
 > > > > > > > it when he boots the machine. So rethinking about this, the best way may
 > > > > > > > be just a wontfix.
 > > > > > > > 
 > > > > > > > Let's consider below use cases:
 > > > > > > > 
 > > > > > > > * First install:
 > > > > > > > 
 > > > > > > > install os ->
 > > > > > > > ---A
 > > > > > > >       reboot ->
 > > > > > > >           ...
 > > > > > > > ---B
 > > > > > > >           kdump service start
 > > > > > > >              -> create kdump initrd
 > > > > > > >           ...
 > > > > > > >           boot finished
 > > > > > > >           recreate default initrd and enable early kdump
 > > > > > > >           goto A
 > > > > > > > 
 > > > > > > > Panic happened between A and B. If it is predictable, eg. 100%
 > > > > > > > reproducible, then the admin should already see it, and he/she can take control
 > > > > > > > and stop the repeating crash/kdump loop.
 > > > > > > > If the panic is not 100% reproducible then using reboot as final action is just fine.
 > > > > > > > 
 > > > > > > > * Other use cases eg. updating kernel or some other components:
 > > > > > > > It is similar to the install os use case, because if one updates the kernel
 > > > > > > > or critical components it is likely they need to regenerate the kdump initrd
 > > > > > > > and then repack it into the early kdump default initrd. In this case the admin
 > > > > > > > should be able to see the panic loop and handle it.
 > > > > > > > Also if the panic is not 100% reproducible then we are just fine.
 > > > > > > > 
 > > > > > > > If we choose to split early and late kdump load, there could be other
 > > > > > > > side effects, and it makes the logic even more complicated.
 > > > > > > > 
 > > > > > > > So...  the better way may be to just leave it as is, and maybe add some
 > > > > > > > documentation.
 > > > > > > > 
 > > > > > > 
 > > > > > > It is a good way to document the risks that may exist.
 > > > > > > 
 > > > > > > Just like public transportation: we all know that it has risks, but we still choose it.
 > > > > > > 
 > > > > > > Thanks.
 > > > > > > 
 > > > > > > > Thoughts?
 > > > > > > > 
 > > > > > > > Thanks
 > > > > > > > Dave
 > > > > > > > 
 > > > > > 
 > > > > 
 > > > > Hello Dave et al.
 > > > > 
 > > > > Kindly check the condition and assumptions below and share your thoughts.
 > > > 
 > > > Buland, thank you for the reply.
 > > > 
 > > > > 
 > > > > Condition:
 > > > > 
 > > > > [0x1] System is up and running with the kernel version X.
 > > > > [0x2] Admin performed the kernel Y upgrade.
 > > > > [0x3] Running kernel X crashed.
 > > > > [0x4] Normal kdump rebooted the system and captured the kernel crash dump of the kernel X.
 > > > > [0x5] System rebooted with the newly installed kernel Y.
 > > > > [0x6] Let's assume that for some unknown reason the booting kernel Y also crashed (assume that the panic is 100% reproducible).
 > > > > [0x7] Early kdump started dumping the kernel crash dump of the booting kernel Y.
 > > > > [0x8] Early kdump rebooted the system and it got stuck in the loop.
 > > > > 
 > > > > Assumption: 1
 > > > > 
 > > > > What if the problematic system is in a data center and the admin is not aware of this situation?
 > > > > 
 > > > > [0x1] The dump target will be filled with multiple copies of the kernel crash dump?
 > > > > 
 > > > >        [ 1 kernel crash dump of the kernel X and 'n' kernel crash dumps of the kernel Y]
 > > > > 
 > > > > [0x2] The system will reboot in the loop?
 > > > 
 > > > It is hard to define. For the original kdump service without early kdump
 > > > load, it is also possible that after one replaces a kernel the new kernel panics
 > > > during boot phase just after kdump gets loaded.
 > > > 
 > > > The admin should at least test a reboot after updating a kernel?
 > > 
 > > Agree, but not sure if all the admins will follow this rule :)
 > > 
 > > 
 > > > > Assumption: 2
 > > > > 
 > > > > What if the dump target is on the local disk?
 > > > > 
 > > > > [0x1] Admin needs to power off the system manually to retrieve the kernel crash dump of the kernel X and Y from the resume mode.
 > > > > [0x2] Admin needs to remove the multiple copies of the kernel crash dump of the booting kernel Y.
 > > > > [0x3] Admin might get confused while differentiating between the kernel crash dump of the kernel X and Kernel Y.
 > > > 
 > > > As for the worst case we all admit this is a problem, and we are more than
 > > > happy to make things easier for the admin and fix it :)
 > > > 
 > > > As I said we are exploring this problem to see if we have a good
 > > > solution.
 > > > 
 > > > But if we can assume predictable panics can be avoided then the situation
 > > > will be better.
 > > 
 > > One suggestion, can we have a separate default behavior for normal kdump and early kdump?
 > 
 > Yes if we can.
 > 
 > Kdump service start, either early or late, just uses a syscall to load
 > another kernel/initrd into pre-reserved memory.
 > 
 > So once the kdump service has started we cannot differentiate whether the kdump kernel was
 > loaded early or late.

 But still not sure about it.  As we can see, for normal kdump there will
 be a similar issue, eg. between C and D if a reproducible panic
 also happens every time during the late boot phase.

 So it is hard to define this as early kdump only, it is just more likely for
 early kdump.

 ---A
 initramfs kdump load
    switch root
       other services startup
 ---B
       kdump service start
 ---C
       other services start up
       ...
 ---D
       boot finished

 If we consider this an early kdump only/must-fix issue, thinking about
 it we should split it into two issues:

 1. how to determine early load and then reload during kdumpctl start

 If the "kdump" service is not active but a kdump kernel is loaded then
 it should have been "early loaded".  Then it will look something like this:

 kdumpctl start()
   if kdump is loaded:
   	if systemd kdump.service is active 
Typo, I mean the inverse logic:
	if ! systemd kdump.service is active

> 		# early loaded
> 		stop and continue to load again
> 	else
> 		print a warning service is already running and then
> 		return
>   else
> 	go ahead to load and start
> 
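The corrected pseudocode above could be sketched in shell roughly as follows. This is only an illustration under stated assumptions, not the kdumpctl implementation: the function names are made up for the example, /sys/kernel/kexec_crash_loaded is used as the "is a kdump kernel loaded" test, and "systemctl is-active" stands in for the "kdump.service is active" test. The function only prints the decision (load, reload or skip) so the logic can be read and tested without touching a real system.

```shell
# Sketch only; names below are hypothetical, not kdumpctl functions.
# The sysfs path can be overridden, which makes the logic testable.
KEXEC_CRASH_LOADED=${KEXEC_CRASH_LOADED:-/sys/kernel/kexec_crash_loaded}

# Succeed if a crash (kdump) kernel is currently loaded.
kdump_is_loaded() {
    [ "$(cat "$KEXEC_CRASH_LOADED" 2>/dev/null)" = "1" ]
}

# Succeed if the kdump systemd unit is already active.
kdump_service_active() {
    systemctl is-active --quiet kdump.service
}

# Print what "kdumpctl start" should do:
#   load   - nothing loaded yet, go ahead to load and start
#   reload - loaded but service not active: assume early load,
#            stop and load again
#   skip   - loaded and service active: warn "already running"
decide_start_action() {
    if ! kdump_is_loaded; then
        echo load
    elif kdump_service_active; then
        echo skip
    else
        echo reload
    fi
}
```

A caller would then branch on the printed word, for example: case "$(decide_start_action)" in reload) stop ;; esac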
> 2. how to set the reboot action for early and late loading:
> 
> use a cmdline param like rd.earlykdump.noreboot,  default value is true, if
> one wants the alternative he/she can add rd.earlykdump.noreboot=0 
> 
> Adding extra config options is also a choice but seems much more complicated,
> not only for the final action but also the default action;  for the default action, in case
> of a partially saved vmcore it will still occupy the whole disk.  So it seems we
> should just replace reboot with shutdown in all cases.
> 
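A rough sketch of how the proposed switch could be read in the second kernel. rd.earlykdump.noreboot is only the name suggested above and does not exist in kexec-tools or dracut as far as this thread shows. The helper name earlykdump_final_action is likewise made up. The command line is passed in as a string so the parsing can be tried without rebooting.

```shell
# Hypothetical: map the proposed rd.earlykdump.noreboot parameter to a
# final action. Default is "noreboot is true", i.e. poweroff; only an
# explicit rd.earlykdump.noreboot=0 selects reboot. If the parameter
# appears more than once, the last occurrence wins.
earlykdump_final_action() (
    set -f                      # no globbing while splitting on spaces
    value=1
    for arg in $1; do
        case $arg in
            rd.earlykdump.noreboot)   value=1 ;;
            rd.earlykdump.noreboot=*) value=${arg#*=} ;;
        esac
    done
    if [ "$value" = "0" ]; then
        echo reboot
    else
        echo poweroff
    fi
)
```

Usage on a running system would look like: earlykdump_final_action "$(cat /proc/cmdline)"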
> > 
> > > 
> > > Eg:
> > > 
> > > In /etc/kdump.conf file.
> > > default  reboot     (Default action to be taken by normal kdump and early kdump if dumping fails)
> > > ndefault reboot     (Default action to be taken once the dumping is successful by normal kdump)
> > > edefault poweroff   (Default action to be taken once the dumping is successful by early kdump)
> > > 
> > > Note: Create 'ndefault' & 'edefault' options identical to 'default' option so that an admin can alter the behavior as per the requirement.
> > > 
> > > [0x1] If kernel crash dump is captured by normal kdump then 'reboot' the system (by default) or take action as per 'ndefault' value.
> > > [0x2] If kernel crash dump is captured by early kdump then 'poweroff' the system (by default) or take action as per 'edefault' value.
> > > 
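For illustration, the proposal above could be read from the config file roughly like this. 'ndefault' and 'edefault' are only the option names suggested in this mail, they are not existing kdump.conf options, and the helper name kdump_final_action is made up for the sketch. The built-in fallbacks follow the proposal: reboot for normal kdump, poweroff for early kdump.

```shell
# Hypothetical: pick the action to take after a successful dump.
#   $1 = "early" or "normal"
#   $2 = config file (defaults to /etc/kdump.conf)
kdump_final_action() (
    mode=$1
    conf=${2:-/etc/kdump.conf}
    case $mode in
        early)  key=edefault; action=poweroff ;;
        normal) key=ndefault; action=reboot ;;
        *) echo "usage: kdump_final_action early|normal [conf]" >&2
           exit 2 ;;
    esac
    if [ -r "$conf" ]; then
        # "name value" lines; the last matching line wins.
        while read -r name value _; do
            if [ "$name" = "$key" ] && [ -n "$value" ]; then
                action=$value
            fi
        done < "$conf"
    fi
    echo "$action"
)
```

With a file containing "edefault halt", kdump_final_action early prints halt, while kdump_final_action normal still prints reboot.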
> > 
> > We can introduce something so that the early kdump service loads and sets
> > poweroff as default,  but when the normal kdump service starts we have two
> > choices:
> > 
> > 1. go ahead and use the early loaded setup without reload, like we have in
> > the code now. In this case normal kdump will also use poweroff no matter
> > what is set in /etc/kdump.conf
> > 
> > 2. reload kernel/initrd with the normal setup in /etc/kdump.conf.  In this
> > case, we need to know whether this is a late service startup, because if we
> > blindly reload then it will affect later udev triggered kdump restarts.
> > 
> > For example when manually starting the service with kdumpctl start, originally
> > it will just print a msg that the service is already running.  But now
> > "start" == "stop, then start". This may not be a big problem but
> > looks odd.
> > 
> > Correcting myself about the udev hotplug triggered events: it should be ok
> > because they call restart, which means stop then start, so this change
> > will not affect them.
> > 
> > BTW, the default action here should be "final action".  There is a
> > "default" action that can be configured in kdump.conf, which is what the kdump
> > kernel does after vmcore saving failed; it is more like a failsafe
> > action.  But the default "default" is also "reboot".  When kdump
> > successfully saves a vmcore it will go to "final action" (==reboot)
> > which is not configurable.
> > 
> > > --
> > > Buland
> > > 
> > > 
> > 
> > Thanks
> > Dave
