On 12/25/18 at 10:24am, lijiang wrote:
On 2018-12-24 at 12:17, Dave Young wrote:
> On 12/21/18 at 07:38pm, Dave Young wrote:
>> On 12/21/18 at 04:47pm, Buland Singh wrote:
>>> On 12/21/18 4:14 PM, Dave Young wrote:
>>>> On 12/21/18 at 12:59pm, Buland Singh wrote:
>>>>> On 12/21/18 12:29 PM, Kairui Song wrote:
>>>>>> Hi, Dave, Lianbo
>>>>>>
>>>>>> My concern is that a crash loop may generate tons of dump cores, and
>>>>>> the dump target may get filled up by them, which may be a larger
>>>>>> potential risk. Otherwise I think it's good to leave it as it is.
>>>>>>
>>>>>> On Fri, Dec 21, 2018 at 2:05 PM lijiang <lijiang(a)redhat.com> wrote:
>>>>>>>
>>>>>>> On 2018-12-21 at 10:49, Dave Young wrote:
>>>>>>>> + more people
>>>>>>>> On 12/20/18 at 04:49pm, lijiang wrote:
>>>>>>>>> On 2018-12-20 at 13:57, Dave Young wrote:
>>>>>>>>>> On 12/20/18 at 01:06pm, Lianbo Jiang wrote:
>>>>>>>>>>> By default, early kdump reboots the system after capturing the vmcore.
>>>>>>>>>>> If the problematic system is continuously crashing due to some issue
>>>>>>>>>>> during early boot stage, the system may fall into infinite loop restart
>>>>>>>>>>> like this:
>>>>>>>>>>>
>>>>>>>>>>> boot -----> crash -----> early kdump (dump vmcore)
>>>>>>>>>>>  ^                                    |
>>>>>>>>>>>  '..............(reboot)..............'
>>>>>>>>>>>
>>>>>>>>>>> But now, the system crash at early stage is only captured by early kdump,
>>>>>>>>>>> and the rest is captured by normal kdump. That is to say, when normal kdump
>>>>>>>>>>> service starts, it will load it again and override early kdump. It is
>>>>>>>>>>> helpful to control the logic of early kdump and normal kdump separately
>>>>>>>>>>> in final action (it is called by kdump-capture.service). For example,
>>>>>>>>>>> early kdump always passes the 'rd.earlykdump' to the second kernel when
>>>>>>>>>>> early kdump is enabled, but normal kdump doesn't pass the 'rd.earlykdump'
>>>>>>>>>>> to the second kernel at any time. So they can be distinguished in the
>>>>>>>>>>> second kernel.
>>>>>>>>>>
>>>>>>>>>> Hmm, I'm confused about the param passing above.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I copied some messages from another email, please refer to this one:
>>>>>>>>> [--->
>>>>>>>>> The rd.earlykdump is added to kernel command line in grub.cfg. However, early kdump
>>>>>>>>> and normal kdump can get the same parameters from /proc/cmdline in the first kernel.
>>>>>>>>>
>>>>>>>>> Early kdump passes the rd.earlykdump to the second kernel, but normal kdump doesn't
>>>>>>>>> need it, so normal kdump needs to remove the rd.earlykdump.
>>>>>>>>>
>>>>>>>>> This makes it possible to distinguish early kdump and normal kdump in the second
>>>>>>>>> kernel. It helps to control the logic of kdump capture service. For example:
>>>>>>>>> default action/final action.
>>>>>>>>> ]
>>>>>>>>
>>>>>>>> The description is confusing, for example "early kdump passes ... to the
>>>>>>>> second kernel". The real thing is that one person adds the param in the
>>>>>>>> 1st kernel cmdline, and kexec-tools takes/inherits it and passes it to
>>>>>>>> the 2nd kernel.
>>>>>>>>
>>>>>>>
>>>>>>> Yes. Good point. Thanks for your explanation.
>>>>>>>
>>>>>>>> Anyway this is patch log issue.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Early or non early just means about the service loading phase, in
>>>>>>>>>
>>>>>>>>> Yes. This patch uses the same method you described. When normal kdump service starts,
>>>>>>>>> it will reload. At the same time, early kdump will be overwritten by normal kdump.
>>>>>>>>
>>>>>>>> Probably "early kdump load" is better wording than "early kdump".
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> initramfs or not. I notice dracut/systemd will print some message about
>>>>>>>>>> whether they are running in initramfs, so probably you can check how to
>>>>>>>>>> get it the same way; if this is not initramfs then just unload before the
>>>>>>>>>> check in kdump loading.
>>>>>>>>>>
>>>>>>>>>> The picture like below:
>>>>>>>>>>
>>>>>>>>>> Kernel boot ->
>>>>>>>>>>
>>>>>>>>>> initramfs ---
>>>>>>>>>>     early kdump load
>>>>>>>>>>     ---- Mark A ----
>>>>>>>>>>     initramfs switch root
>>>>>>>>>>
>>>>>>>>>> system startup (real root fs)
>>>>>>>>>>     service a
>>>>>>>>>>     service b ... (eg. networking etc.)
>>>>>>>>>>     kdump service start
>>>>>>>>>>     ----- Mark B -----
>>>>>>>>>>     load kdump kernel again
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The problem will happen between Mark A and Mark B. During this period,
>>>>>>>>>> there could be a repeated crash -> earlykdump_load loop. There might be some
>>>>>>>>>> random crashes as well during the real root fs service startup,
>>>>>>>>>> for example after network is ready if some network workload causes a
>>>>>>>>>> panic. It may not be 100% reproducible, so it seems we still need to
>>>>>>>>>> make the poweroff configurable, eg.
>>>>>>>>>>
>>>>>>>>>> default is poweroff, but one can choose another action.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, the default is poweroff for early kdump, unless the kdump capture
>>>>>>>>> service hits an error or enters the emergency service; for that case one
>>>>>>>>> can choose the default action (configure default=xxx in kdump.conf).
>>>>>>>>
>>>>>>>> For default action instead of final action, if you hardcode it, then even if
>>>>>>>> one sets default to reboot it will still poweroff.
>>>>>>>>
>>>>>>>
>>>>>>> If really needed, that can be improved.
>>>>>>>
>>>>>>>> [snip]
>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> +check_rd_earlykdump()
>>>>>>>>>>> +{
>>>>>>>>>>> +    egrep "rd.earlykdump" /proc/cmdline
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>  start()
>>>>>>>>>>>  {
>>>>>>>>>>>      check_dump_feasibility
>>>>>>>>>>> @@ -969,7 +974,13 @@ start()
>>>>>>>>>>>      check_current_status
>>>>>>>>>>>      if [ $? == 0 ]; then
>>>>>>>>>>>          echo "Kdump already running: [WARNING]"
>>>>>>>>>>> -        return 0
>>>>>>>>>>> +        check_rd_earlykdump
>>>>>>>>>>> +        #if earlykdump loaded, it will stop and start.
>>>>>>>>>>> +        if [ $? -eq 0 ]; then
>>>>>>>>>>> +            stop
>>>>>>>>>>
>>>>>>>>>> kdumpctl start can be run not only by system startup services; one can also
>>>>>>>>>> run it manually or in a udev rule.
>>>>>>>>>>
>>>>>>>>>> The checking of kernel cmdline seems not enough.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Here it means that if kdump has been loaded, check whether early kdump did it.
>>>>>>>>> If yes, let normal kdump load it again, otherwise no need to do anything.
>>>>>>>>
>>>>>>>> As we discussed, you did not get my point here :)
>>>>>>>>
>>>>>>>> check_rd_earlykdump will always be true once the kernel has booted, so there
>>>>>>>> is no way to tell the first normal kdump load from a later one.
>>>>>>>>
>>>>>>>> The early boot time panic this patch addresses is the 100%
>>>>>>>> reproducible panic; for this kind of panic the admin should be able to see
>>>>>>>> it when he boots the machine. So rethinking about this, the best way may
>>>>>>>> be just a wontfix.
>>>>>>>>
>>>>>>>> Let's consider below use cases:
>>>>>>>>
>>>>>>>> * First install:
>>>>>>>>
>>>>>>>> install os ->
>>>>>>>> ---A
>>>>>>>> reboot ->
>>>>>>>> ...
>>>>>>>> ---B
>>>>>>>> kdump service start
>>>>>>>> -> create kdump initrd
>>>>>>>> ...
>>>>>>>> boot finished
>>>>>>>> recreate default initrd and enable early kdump
>>>>>>>> goto A
>>>>>>>>
>>>>>>>> If a panic happens between A and B and it is predictable, eg. 100%
>>>>>>>> reproducible, then the admin should already see it, and he/she can take
>>>>>>>> control and stop the repeating crash/kdump loop.
>>>>>>>> If the panic is not 100% reproducible then using reboot as the final
>>>>>>>> action is just fine.
>>>>>>>>
>>>>>>>> * Other use cases eg. updating kernel or some other components:
>>>>>>>> It is similar to the install os use case because if one updates the kernel
>>>>>>>> or critical components it is likely they need to regenerate the kdump initrd,
>>>>>>>> and then repack it into the early kdump default initrd; in this case the admin
>>>>>>>> should be able to see the panic loop and handle it.
>>>>>>>> Also if the panic is not 100% reproducible then we are just fine.
>>>>>>>>
>>>>>>>> If we choose to split early and late kdump load, there could be other
>>>>>>>> side effects, and it would make the logic even more complicated.
>>>>>>>>
>>>>>>>> So... the better way may be to just leave it as is, and maybe add some
>>>>>>>> documentation.
>>>>>>>>
>>>>>>>
>>>>>>> It is a good way to document the risks that may exist.
>>>>>>>
>>>>>>> Just like public transportation: we all know that it has risks, but we
>>>>>>> still choose it.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Dave
>>>>>>>>
>>>>>>
>>>>>
>>>>> Hello Dave et al.
>>>>>
>>>>> Kindly check the below condition, assumption and share your thoughts.
>>>>
>>>> Buland, thank you for the reply.
>>>>
>>>>>
>>>>> Condition:
>>>>>
>>>>> [0x1] System is up and running with the kernel version X.
>>>>> [0x2] Admin performed the kernel Y upgrade.
>>>>> [0x3] Running kernel X crashed.
>>>>> [0x4] Normal kdump rebooted the system and captured the kernel crash dump of the kernel X.
>>>>> [0x5] System rebooted with the newly installed kernel Y.
>>>>> [0x6] Let's assume that due to some unknown reason the booting kernel Y also crashed (assume that the panic is 100% reproducible).
>>>>> [0x7] Early kdump started dumping the kernel crash dump of the booting kernel Y.
>>>>> [0x8] Early kdump rebooted the system and got stuck in the loop.
>>>>>
>>>>> Assumption: 1
>>>>>
>>>>> What if the problematic system is in a data center and the admin is not aware of this situation?
>>>>>
>>>>> [0x1] The dump target will be filled with multiple copies of the kernel crash dump?
>>>>>
>>>>> [ 1 kernel crash dump of the kernel X and 'n' kernel crash dumps of the kernel Y]
>>>>>
>>>>> [0x2] The system will reboot in the loop?
>>>>
>>>> It is hard to define. For the original kdump service without early kdump
>>>> load, it is also possible that after one replaces a kernel the new kernel
>>>> panics during boot phase just after kdump gets loaded.
>>>>
>>>> Admin at least should test a reboot after updating a kernel?
>>>
>>> Agree, but not sure if all the admins will follow this rule :)
>>>
>>>
>>>>> Assumption: 2
>>>>>
>>>>> What if the dump target is on the local disk?
>>>>>
>>>>> [0x1] Admin needs to power off the system manually to retrieve the kernel crash dump of the kernel X and Y from the resume mode.
>>>>> [0x2] Admin needs to remove the multiple copies of the kernel crash dump of the booting kernel Y.
>>>>> [0x3] Admin might get confused while differentiating between the kernel crash dump of the kernel X and kernel Y.
>>>>
>>>> As for the worst case, we all admit this is a problem; we are more than
>>>> happy to make it easier for the admin and fix it :)
>>>>
>>>> As I said, we are exploring this problem to see if we have a good
>>>> solution.
>>>>
>>>> But if we can assume predictable panics can be avoided then the situation
>>>> will be better.
>>>
>>> One suggestion, can we have a separate default behavior for normal kdump and early kdump?
>>
>> Yes if we can.
>>
>> Whether the kdump service starts early or late, we just use a syscall to load
>> another kernel/initrd into pre-reserved memory.
>>
>> So once the kdump service has started we cannot differentiate whether the
>> kdump kernel was loaded early or late.
>
> But I'm still not sure about it. As we can see, for normal kdump a
> similar issue exists, eg. between C and D if a reproducible panic
> also happens every time during the late boot phase.
>
> So it is hard to define this as early kdump only; it is just more likely for
> early kdump.
>
> ---A
> initramfs kdump load
> switch root
> other services startup
> ---B
> kdump service start
> ---C
> other services start up
> ...
> ---D
> boot finished
>
> If we consider this as an early kdump only/must-fix issue, thinking about
> it we should split it into two issues:
>
> 1. how to determine early load and then reload on kdumpctl start
>
> If the "kdump" service is not active but a kdump kernel is loaded then
> it should have been "early loaded". Then it will be something like this:
>
Thanks for your suggestion.
If we execute the command: systemctl restart kdump.service and then start
kdump.service again, it won't exactly distinguish early kdump and normal kdump.
I'm also looking for other solutions.

As we discussed in the meeting, I meant "is not active"; the pseudo code below
is missing a "!".
It should be good if there is no better way.
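
A minimal shell sketch of that check with the missing "!" applied, i.e. "kdump kernel loaded but kdump.service not active" is treated as early loaded. This is not the actual kdumpctl code: the function name decide_start_action and its 0/1 arguments are hypothetical, chosen so the decision can be exercised without a live system, and the two probe helpers assume that /sys/kernel/kexec_crash_loaded and "systemctl is-active" are available on the machine.

```shell
# Sketch only, not the real kdumpctl start().
# $1: 1 if a crash kernel is already loaded, else 0
# $2: 1 if kdump.service is active, else 0
# Prints the action start() would take: reload, skip or load.
decide_start_action() {
    local loaded=$1
    local active=$2
    if [ "$loaded" -eq 1 ]; then
        if [ "$active" -ne 1 ]; then
            # Loaded while the service is NOT active: assume early loaded,
            # so stop and load again with the normal setup.
            echo reload
        else
            # Service already active: only warn and return.
            echo skip
        fi
    else
        echo load
    fi
}

# Probes for a live system (assumed interfaces, see the note above).
crash_kernel_loaded() {
    [ "$(cat /sys/kernel/kexec_crash_loaded 2>/dev/null)" = "1" ] && echo 1 || echo 0
}
kdump_service_active() {
    systemctl is-active --quiet kdump.service 2>/dev/null && echo 1 || echo 0
}
```

On a real system the call would look like: decide_start_action "$(crash_kernel_loaded)" "$(kdump_service_active)". As noted above, this still cannot cover every restart/start sequence.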
>
> > kdumpctl start()
> > if kdump is loaded:
> > if systemd kdump.service is active
> > # early loaded
> > stop and continue to load again
> > else
> > print a warning service is already running and then
> > return
> > else
> > go ahead to load and start
> >
> > 2. how to set the reboot action for early and late loading:
> >
> > use a cmdline option like rd.earlykdump.noreboot, default value is true; if
> > one wants the alternative he/she can add rd.earlykdump.noreboot=0
> >
> > Adding extra config options is also a choice but seems much more complicated,
> > not only for final action but also for default action; for default action in
> > case of a partially saved vmcore it will still occupy the whole disk. So it
> > seems we should just replace reboot with shutdown in all cases.
> >
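
A small shell sketch of how the proposed rd.earlykdump.noreboot switch could be read. The parameter exists only as a suggestion in this thread, and the function names earlykdump_noreboot and earlykdump_final_action are hypothetical. The command line is passed in as a string so the logic can be tried without touching /proc/cmdline.

```shell
# Sketch only: rd.earlykdump.noreboot is a proposal, not an existing option.
# $1: kernel command line string (e.g. the contents of /proc/cmdline)
# Prints 1 (do not reboot, the proposed default) unless the command line
# contains exactly rd.earlykdump.noreboot=0, in which case it prints 0.
earlykdump_noreboot() {
    case " $1 " in
        *" rd.earlykdump.noreboot=0 "*) echo 0 ;;
        *) echo 1 ;;
    esac
}

# Map the switch to the final action after a successful early dump.
earlykdump_final_action() {
    if [ "$(earlykdump_noreboot "$1")" -eq 1 ]; then
        echo poweroff
    else
        echo reboot
    fi
}
```

On a real system this would be called as: earlykdump_final_action "$(cat /proc/cmdline)".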
> >>
> >>>
> >>> Eg:
> >>>
> >>> In /etc/kdump.conf file.
> >>> default reboot (Default action to be taken by normal kdump and early kdump if dumping fails)
> >>> ndefault reboot (Default action to be taken once the dumping is successful by normal kdump)
> >>> edefault poweroff (Default action to be taken once the dumping is successful by early kdump)
> >>>
> >>> Note: Create 'ndefault' & 'edefault' options identical to 'default' option so that an admin can alter the behavior as per the requirement.
> >>>
> >>> [0x1] If kernel crash dump is captured by normal kdump then 'reboot' the system (by default) or take action as per 'ndefault' value.
> >>> [0x2] If kernel crash dump is captured by early kdump then 'poweroff' the system (by default) or take action as per 'edefault' value.
> >>>
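
For illustration, the lookup Buland describes could be sketched in shell as below. 'ndefault' and 'edefault' are only proposed in this thread and are not existing kdump.conf options; the function name final_action_for and its arguments are hypothetical as well.

```shell
# Sketch only: ndefault/edefault are proposed options, not real ones.
# $1: "normal" or "early" (which kdump load captured the dump)
# $2: path to a kdump.conf style file ("option value" per line)
# Prints the configured action, or the proposed built-in default.
final_action_for() {
    local mode=$1
    local conf=$2
    local key fallback value
    case "$mode" in
        early)  key=edefault; fallback=poweroff ;;
        normal) key=ndefault; fallback=reboot ;;
        *) return 1 ;;
    esac
    # Take the last matching line; commented lines ("#edefault ...") do
    # not match because the first field must equal the key exactly.
    value=$(awk -v k="$key" '$1 == k { v = $2 } END { print v }' "$conf" 2>/dev/null)
    echo "${value:-$fallback}"
}
```

A call such as final_action_for early /etc/kdump.conf would fall back to poweroff when no 'edefault' line is present.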
> >>
> >> We can introduce something so that the early kdump service loads and sets
> >> poweroff as default, but when the normal kdump service starts we have two
> >> choices:
> >>
> >> 1. go ahead and use the early loaded setup without reload like we have in
> >> the code now; in this case normal kdump will also use poweroff no matter
> >> what is set in /etc/kdump.conf
> >>
> >> 2. reload kernel/initrd with the normal setup in /etc/kdump.conf. In this
> >> way, we need to know if this is a late service startup, because if we
> >> blindly reload then it will affect later udev triggered kdump restart.
> >>
> >> For example, manually starting the service with kdumpctl start originally
> >> just prints a msg that the service is already running. But now
> >> "start" == "stop, then start"; this may not be a big problem but
> >> looks odd.
> >>
> >> To correct myself about the udev hotplug triggered events: it should be ok
> >> because they call restart, which means stop then start, so this change
> >> will not affect them.
> >>
> >> BTW, the default action here should be "final action". There is a
> >> "default" action that can be configured in kdump.conf, which is what the
> >> kdump kernel does after vmcore saving failed; it is more like a failsafe
> >> action. But the default "default" is also "reboot". When kdump
> >> successfully saves a vmcore it will go to "final action" (==reboot)
> >> which is not configurable.
> >>
> >>> --
> >>> Buland
> >>>
> >>>
> >>
> >> Thanks
> >> Dave