On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
> On 11/22/17 at 11:25am, Baoquan He wrote:
> > On 11/22/17 at 11:17am, Dave Young wrote:
> > > Hi Baoquan,
> > >
> > > On 11/21/17 at 05:07pm, Baoquan He wrote:
> > > > In commit:
> > > >
> > > > commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for
x86_64 in case running out of vectors")
> > > >
> > > > ... function check_kdump_cpus() was introduced to check if number of
cpu
> > > > is enough to boot to provide enough interrupt vectors.
> > > >
> > > > As we know, there are 256 interrupt vectors on each cpu to correspond
to
> > > > irq. Except of those system reserved vectors, only part of 256 will
be
> > > > used for device. So if there are many IO devices on a system, E.g
more
> > > > than 256 irqs are needed to map, then specifying 'nr_cpus=1'
in kernel
> > > > cmdline may not work. Especially on large NUMA system, with multiple
> > > > cpus and multiple devices.
> > > >
> > > > However, usually firmware engineer will consider the irq number
> > > > corresponding to IO devices and CPU number, so it's not a problem
for
> > > > normal kernel. In kdump kernel, it may be a problem since we default
to
> > > > specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because
for many
> > > > devices which is not needed for vmcore dumping, drivers related to
them
> > > > are not added to kdump kernel, then the message 'Warning:
nr_cpus=1 may
> > > > not be enough for kdump boot' could be false positive to a large
extent.
> > > > So in this patch, change 'Warning: ...' to 'Note:
...' so that QA or
> > > > customers won't open bug for a false positive report.
> > > >
> > > > Signed-off-by: Baoquan He <bhe(a)redhat.com>
> > > > ---
> > > > kdumpctl | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/kdumpctl b/kdumpctl
> > > > index d4e00503a221..012369da8879 100755
> > > > --- a/kdumpctl
> > > > +++ b/kdumpctl
> > > > @@ -171,7 +171,7 @@ check_kdump_cpus()
> > > > return
> > > > fi
> > > >
> > > > - echo -n "Warning: nr_cpus=1 may not be enough for kdump
boot,"
> > > > + echo -n "[Note]: nr_cpus=1 may not be enough for kdump
boot,"
> > >
> > > It looks better to remove the brackets.
> >
> > OK, will change. thx
> >
> > >
> > > > echo " try nr_cpus=$nr_min or larger instead"
> > >
> > > this sentence can also be changed to for example:
> > > try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
> >
> > This is one sentence with the above 'Note' one since the above
printing
> > uses 'echo -n'.
>
> Even if it is one sentence, it still sounds like leading user to use nr_cpu=X
> without a test. It should be better to notify people to use larger
> nr_cpus only when nr_cpus=1 does not work in a test.
>
> Original sentence like below, it looks to suggest people to use
> nr_cpus=$nr_min once he see this note message.
>
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger
instead"
This happens when kernel boot if kdump.service is enabled by default or
run systemctl restart kdump. People see the 'Note' message, and they may
do a test to check it. And the former half sentence has told if
'nr_cpus=1 may not be enough', then .... Are you suggesting to change it
as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger
instead"
Otherwise the original half one and the adding one you suggested are
duuplicate. I can change it as above one.
Hmm, seems I did not describe it clearly, let me retry :)
I means original "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or
larger instead"
seems suggest people to use nr_cpus=$nr_min once they see this note
message even without a test. So maybe say it more clear like below:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in
case nr_cpus=1 does not work"
About the nr_cpus=1 test in 1st kernel, it can prove the algorithm is
correct if 1st kernel can not boot with nr_cpus=1
Since xunlei did not find a machine to do a real test, so it is better
to test it since we see such message now on a real machine.
Thanks
Baoquan
> >
> > >
> > > > }
> > > >
> > > > --
> > > > 2.5.5
> > > > _______________________________________________
> > > > kexec mailing list -- kexec(a)lists.fedoraproject.org
> > > > To unsubscribe send an email to kexec-leave(a)lists.fedoraproject.org
> > >
> > > Thanks
> > > Dave
>
> Thanks
> Dave
Thanks
Dave