On Wed, Sep 21, 2022 at 12:45:15PM +0530, Hari Bathini wrote:
On 21/09/22 10:09 am, Coiby Xu wrote:
>Hi Hari,
>
>On Fri, Sep 16, 2022 at 07:07:24PM +0530, Hari Bathini wrote:
>>Since commit c5bdd2d8f195 ("kdump-lib: use non-debug kernels first"),
>>the non-debug kernel is preferred over the debug variant as the dump
>>capture kernel, to reduce memory consumption. This works fine for kdump
>>as the capture kernel is loaded using kexec.
>>
>>In case of fadump, the regular boot loader is used to load the capture
>>kernel. So, the default kernel needs to be used as the capture kernel as
>>well. But with commit c5bdd2d8f195, the initrd of a different kernel is
>>made dump capture capable, breaking fadump's ability to capture the dump
>>properly. Fix this by sticking with the debug variant in case of
>>fadump.
>
>If I understand the commit message correctly, the problem is for fadump,
>the dump capture kernel is a debug kernel but the initrd matches a
>normal kernel.
>My only concern is whether the default crashkernel value for fadump, i.e.
>4G-16G:768M,16G-64G:1G,64G-128G:2G,128G-1T:4G,1T-2T:6G,2T-4T:12G,4T-8T:20G,8T-16T:36G,16T-32T:64G,32T-64T:128G,64T-:180G
>works for the debug kernel on POWER machines, especially virtual machines
>that may have less system memory. If there is an OOM issue, we need to
>fix this mismatch between the fadump kernel and initrd in a different way,
>for example, by instructing fadump to load the normal kernel instead.
Unlike kdump, there is no such thing as loading the kernel for fadump.
On crash, the system goes through the grub bootloader, so the kernel that
is the default boot entry in grub is the one that gets booted. fadump
therefore works on the assumption that the default boot entry is
configured with fadump (be it the debug variant or the regular one).
Thanks for the explanation!
As for the OOM issues,
the existing warning ("Using debug kernel, you may need to set a larger
crashkernel than the default value.") in kdumpctl should be sufficient?
Thanks to Xiaoying, who has confirmed that no OOM issue is observed with
crashkernel=768M for the debug kernel. So my concern is not a valid one.
I've merged the patch. Thanks!
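
For anyone reading the range syntax in the default crashkernel value quoted
above, here is a rough sketch of how it maps system memory to a reservation.
It is only an illustration in Python, not the kernel's or kdumpctl's actual
parser; the names parse_size and reserved_for are made up for this example,
and the optional @offset suffix is not handled. It assumes the documented
semantics that the start of a range is inclusive and the end is exclusive.

```python
import re

# Binary unit multipliers for the size suffixes used in the value.
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}

# Default fadump crashkernel value as quoted in this thread.
DEFAULT_FADUMP = (
    "4G-16G:768M,16G-64G:1G,64G-128G:2G,128G-1T:4G,1T-2T:6G,2T-4T:12G,"
    "4T-8T:20G,8T-16T:36G,16T-32T:64G,32T-64T:128G,64T-:180G"
)


def parse_size(text):
    """Convert a size such as '768M' or '4G' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def reserved_for(spec, system_ram):
    """Return the bytes reserved for system_ram, or 0 if no range matches."""
    for entry in spec.split(","):
        range_part, size = entry.split(":")
        start, end = range_part.split("-")
        low = parse_size(start)
        high = parse_size(end) if end else None  # empty end means open-ended
        # Start is inclusive, end is exclusive.
        if system_ram >= low and (high is None or system_ram < high):
            return parse_size(size)
    return 0


if __name__ == "__main__":
    ram = 8 * UNITS["G"]
    print(reserved_for(DEFAULT_FADUMP, ram) // UNITS["M"], "MiB reserved")
```

With this reading, a machine with 4G up to (but not including) 16G of memory
gets the 768M reservation that was tested, and anything below 4G gets none.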
Thanks
Hari
--
Best regards,
Coiby