On Wed, Nov 22, 2017 at 12:43 PM, Dave Young <dyoung(a)redhat.com> wrote:
On 11/22/17 at 03:09pm, Dave Young wrote:
> On 11/22/17 at 02:34pm, Baoquan He wrote:
> > On 11/22/17 at 01:30pm, Dave Young wrote:
> > > On 11/22/17 at 01:18pm, Baoquan He wrote:
> > > > On 11/22/17 at 12:40pm, Dave Young wrote:
> > > > > On 11/22/17 at 11:25am, Baoquan He wrote:
> > > > > > On 11/22/17 at 11:17am, Dave Young wrote:
> > > > > > > Hi Baoquan,
> > > > > > >
> > > > > > > On 11/21/17 at 05:07pm, Baoquan He wrote:
> > > > > > > > In commit:
> > > > > > > >
> > > > > > > > commit 2040103bd7 ("kdumpctl: sanity check
of nr_cpus for x86_64 in case running out of vectors")
> > > > > > > >
> > > > > > > > ... function check_kdump_cpus() was introduced to
check if number of cpu
> > > > > > > > is enough to boot to provide enough interrupt
vectors.
> > > > > > > >
> > > > > > > > As we know, there are 256 interrupt vectors on
each cpu to correspond to
> > > > > > > > irq. Except of those system reserved vectors, only
part of 256 will be
> > > > > > > > used for device. So if there are many IO devices
on a system, E.g more
> > > > > > > > than 256 irqs are needed to map, then specifying
'nr_cpus=1' in kernel
> > > > > > > > cmdline may not work. Especially on large NUMA
system, with multiple
> > > > > > > > cpus and multiple devices.
> > > > > > > >
> > > > > > > > However, usually firmware engineer will consider
the irq number
> > > > > > > > corresponding to IO devices and CPU number, so
it's not a problem for
> > > > > > > > normal kernel. In kdump kernel, it may be a
problem since we default to
> > > > > > > > specify 'nr_cpus=1' in kernel cmdline. Or
maybe not, because for many
> > > > > > > > devices which is not needed for vmcore dumping,
drivers related to them
> > > > > > > > are not added to kdump kernel, then the message
'Warning: nr_cpus=1 may
> > > > > > > > not be enough for kdump boot' could be false
positive to a large extent.
> > > > > > > > So in this patch, change 'Warning: ...' to
'Note: ...' so that QA or
> > > > > > > > customers won't open bug for a false positive
report.
> > > > > > > >
> > > > > > > > Signed-off-by: Baoquan He <bhe(a)redhat.com>
> > > > > > > > ---
> > > > > > > > kdumpctl | 2 +-
> > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/kdumpctl b/kdumpctl
> > > > > > > > index d4e00503a221..012369da8879 100755
> > > > > > > > --- a/kdumpctl
> > > > > > > > +++ b/kdumpctl
> > > > > > > > @@ -171,7 +171,7 @@ check_kdump_cpus()
> > > > > > > > return
> > > > > > > > fi
> > > > > > > >
> > > > > > > > - echo -n "Warning: nr_cpus=1 may not
be enough for kdump boot,"
> > > > > > > > + echo -n "[Note]: nr_cpus=1 may not be
enough for kdump boot,"
> > > > > > >
> > > > > > > It looks better to remove the brackets.
> > > > > >
> > > > > > OK, will change. thx
> > > > > >
> > > > > > >
> > > > > > > > echo " try nr_cpus=$nr_min or larger
instead"
> > > > > > >
> > > > > > > this sentence can also be changed to for example:
> > > > > > > try nr_cpus=$nr_min or larger instead if nr_cpus=1 does
not work.
> > > > > >
> > > > > > This is one sentence with the above 'Note' one since
the above printing
> > > > > > uses 'echo -n'.
> > > > >
> > > > > Even if it is one sentence, it still sounds like leading user to
use nr_cpu=X
> > > > > without a test. It should be better to notify people to use
larger
> > > > > nr_cpus only when nr_cpus=1 does not work in a test.
> > > > >
> > > > > Original sentence like below, it looks to suggest people to use
> > > > > nr_cpus=$nr_min once he see this note message.
> > > > >
> > > > I don't get what you mean. The origin sentence is:
> > > >
> > > > "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min
or larger instead"
> > > >
> > > > This happens when kernel boot if kdump.service is enabled by default
or
> > > > run systemctl restart kdump. People see the 'Note' message,
and they may
> > > > do a test to check it. And the former half sentence has told if
> > > > 'nr_cpus=1 may not be enough', then .... Are you suggesting to
change it
> > > > as
> > > >
> > > > "If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min
or larger instead"
> > > >
> > > > Otherwise the original half one and the adding one you suggested are
> > > > duuplicate. I can change it as above one.
> > >
> > > Hmm, seems I did not describe it clearly, let me retry :)
> > >
> > > I means original "nr_cpus=1 may not be enough for kdump boot, try
nr_cpus=$nr_min or larger instead"
> > > seems suggest people to use nr_cpus=$nr_min once they see this note
> > > message even without a test. So maybe say it more clear like below:
> > > "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or
larger instead in case nr_cpus=1 does not work"
> >
> > This looks too long when print in screen. I personaly think the old
> > one is good enough, or adjust as I said. But I am fine if you insist.
>
> I did not notice the 'echo -n' so original message is over 80 chars
> already..
If this go out of kdumpctl, then we do not need the message, the
standalone script just tell the estimated nr_cpus number. Opinion?
If we still add this message in kdumpctl then this message should better
to be in 2 lines.
I also agree that the existing statement needs to be improved:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or
larger instead"
Here is some background on a similar issue I saw:
While debugging an issue with kdump kernel boot, which lead to a
"kernel BUG at arch/x86/kernel/apic/io_apic.c",
I realized that the number of irqs is limited to 256 in the x86_64
kdump kernel boot, thus causing
MSI allocations for some of the drivers to fail as a number of drivers
would be trying to acquire MSI interrupts and some or the other driver
would fail
while allocating the MSIs.
In my view, a extension to the existing NOTE or a new NOTE, would
further help which mentions that the number of irqs might be limited
due to the nr_cpus being equal to 1,
thus causing some the drivers's probe() to fail or cause a kernel
panic or lead to a BUG().
However, the text 'nr_cpus=1' is confusing to me as well , as a user
might have boot'ed a kdump kernel with 'nr_cpus>1', so we can try and
generalize that
further..
Regards,
Bhupesh
> > > seems suggest people to use nr_cpus=$nr_min once they
see this note
> > > message even without a test. So maybe say it more clear like below:
> > > "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or
larger instead in case nr_cpus=1 does not work"
>>
>>
>> >
>> >
>> > >
>> > > About the nr_cpus=1 test in 1st kernel, it can prove the algorithm is
>> > > correct if 1st kernel can not boot with nr_cpus=1
>> > >
>> > > Since xunlei did not find a machine to do a real test, so it is better
>> > > to test it since we see such message now on a real machine.
>> >
>> > Well, I checked the formula corresponding to kernel code, and think
it's
>> > reasonable. It's not a precise calculation, just a estimation value. So
>> > I personally think the testing in 1st kernel doesn't make sense, since
>> > 'nr_cpus=1' working in 1st kernel wont' tell the calculation not
good,
>> > and 'nr_cpus=1' not working in 1st kernel won't prove it's
very precise.
>> > My personal thought. If you insist, I can take a test.
>>
>> It is an estimation, but personally I think a test is still worth..
>>
>> We missed another thing about this nr_cpus issue, that is for example
>> one already set nr_cpus=4 and 4>=nr_min, but we still tell that
>> "nr_cpus=1 may not be enough" it sounds odd.
>>
>> How about move this function and warning out of kdumpctl as another
>> script like /usr/bin/kdump_estimate_nr_cpus.sh, if people need they can
>> call it by themselves, we just document it in kdump howto but do not run
>> it in kdumpctl automatically?
>>
>>
>> >
>> > > >
>> > > > > >
>> > > > > > >
>> > > > > > > > }
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > 2.5.5
>> > > > > > > >
_______________________________________________
>> > > > > > > > kexec mailing list --
kexec(a)lists.fedoraproject.org
>> > > > > > > > To unsubscribe send an email to
kexec-leave(a)lists.fedoraproject.org
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > > Dave
>> > > > >
>> > > > > Thanks
>> > > > > Dave
>> > >
>> > > Thanks
>> > > Dave
>>
>> Thanks
>> Dave
> _______________________________________________
> kexec mailing list -- kexec(a)lists.fedoraproject.org
> To unsubscribe send an email to kexec-leave(a)lists.fedoraproject.org