On 11/22/17 at 02:34pm, Baoquan He wrote:
On 11/22/17 at 01:30pm, Dave Young wrote:
On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote: > In commit: > > commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors") > > ... function check_kdump_cpus() was introduced to check if number of cpu > is enough to boot to provide enough interrupt vectors. > > As we know, there are 256 interrupt vectors on each cpu to correspond to > irq. Except of those system reserved vectors, only part of 256 will be > used for device. So if there are many IO devices on a system, E.g more > than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel > cmdline may not work. Especially on large NUMA system, with multiple > cpus and multiple devices. > > However, usually firmware engineer will consider the irq number > corresponding to IO devices and CPU number, so it's not a problem for > normal kernel. In kdump kernel, it may be a problem since we default to > specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many > devices which is not needed for vmcore dumping, drivers related to them > are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may > not be enough for kdump boot' could be false positive to a large extent. > So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or > customers won't open bug for a false positive report. > > Signed-off-by: Baoquan He bhe@redhat.com > --- > kdumpctl | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kdumpctl b/kdumpctl > index d4e00503a221..012369da8879 100755 > --- a/kdumpctl > +++ b/kdumpctl > @@ -171,7 +171,7 @@ check_kdump_cpus() > return > fi > > - echo -n "Warning: nr_cpus=1 may not be enough for kdump boot," > + echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
> echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
This happens when kernel boot if kdump.service is enabled by default or run systemctl restart kdump. People see the 'Note' message, and they may do a test to check it. And the former half sentence has told if 'nr_cpus=1 may not be enough', then .... Are you suggesting to change it as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original half one and the adding one you suggested are duuplicate. I can change it as above one.
Hmm, seems I did not describe it clearly, let me retry :)
I means original "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead" seems suggest people to use nr_cpus=$nr_min once they see this note message even without a test. So maybe say it more clear like below: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in case nr_cpus=1 does not work"
This looks too long when print in screen. I personaly think the old one is good enough, or adjust as I said. But I am fine if you insist.
I did not notice the 'echo -n' so original message is over 80 chars already..
About the nr_cpus=1 test in 1st kernel, it can prove the algorithm is correct if 1st kernel can not boot with nr_cpus=1
Since xunlei did not find a machine to do a real test, so it is better to test it since we see such message now on a real machine.
Well, I checked the formula corresponding to kernel code, and think it's reasonable. It's not a precise calculation, just a estimation value. So I personally think the testing in 1st kernel doesn't make sense, since 'nr_cpus=1' working in 1st kernel wont' tell the calculation not good, and 'nr_cpus=1' not working in 1st kernel won't prove it's very precise. My personal thought. If you insist, I can take a test.
It is an estimation, but personally I think a test is still worth..
We missed another thing about this nr_cpus issue, that is for example one already set nr_cpus=4 and 4>=nr_min, but we still tell that "nr_cpus=1 may not be enough" it sounds odd.
How about move this function and warning out of kdumpctl as another script like /usr/bin/kdump_estimate_nr_cpus.sh, if people need they can call it by themselves, we just document it in kdump howto but do not run it in kdumpctl automatically?
> } > > -- > 2.5.5 > _______________________________________________ > kexec mailing list -- kexec@lists.fedoraproject.org > To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
Thanks Dave
Thanks Dave
Thanks Dave