On 11/22/17 at 01:30pm, Dave Young wrote:
On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check whether the number of CPUs is enough to provide the interrupt vectors needed to boot.
As we know, each CPU has 256 interrupt vectors that IRQs map to. Apart from the system-reserved vectors, only part of the 256 can be used for devices. So if a system has many IO devices, e.g. more than 256 IRQs need to be mapped, then specifying 'nr_cpus=1' on the kernel cmdline may not work, especially on a large NUMA system with multiple CPUs and many devices.
However, firmware engineers usually take the number of IRQs for IO devices and the number of CPUs into account, so this is not a problem for the normal kernel. In the kdump kernel it may be a problem, since we specify 'nr_cpus=1' on the kernel cmdline by default. Or maybe not: drivers for the many devices that are not needed for vmcore dumping are not added to the kdump kernel, so the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be a false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bugs for a false positive report.
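To make the vector arithmetic above concrete, here is a minimal sketch of such an estimate. It is not the code in check_kdump_cpus(): the helper name estimate_min_cpus and the figure of 200 usable device vectors per CPU are assumptions for illustration only, since the real number depends on how many of the 256 vectors the kernel reserves.

```shell
# Hypothetical helper, NOT the kdumpctl implementation.
# $1 = number of IRQs that need a vector
# $2 = device vectors usable per CPU (assumed value, see above)
estimate_min_cpus() {
    local irqs=$1 per_cpu=$2
    # Ceiling division: round up so a partly used CPU still counts
    local n=$(( (irqs + per_cpu - 1) / per_cpu ))
    # At least one CPU is always needed
    [ "$n" -lt 1 ] && n=1
    echo "$n"
}

# 300 IRQs with an assumed 200 usable vectors per CPU -> prints 2
estimate_min_cpus 300 200
```

Like the check discussed here, this is only an estimate: it counts every IRQ, including those of devices whose drivers never get loaded in the kdump kernel.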
Signed-off-by: Baoquan He <bhe@redhat.com>
 kdumpctl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kdumpctl b/kdumpctl
index d4e00503a221..012369da8879 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -171,7 +171,7 @@ check_kdump_cpus()
 return
 fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
+ echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
This sentence could also be changed to, for example: "try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work".
It forms one sentence with the 'Note' part above, since that part is printed with 'echo -n'.
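For reference, a minimal sketch of how the two echo calls produce a single line. It assumes a bash-like shell where 'echo -n' suppresses the trailing newline; print_note is a hypothetical wrapper, and nr_min is passed in as a placeholder rather than computed as it is in check_kdump_cpus().

```shell
# Sketch only: print_note is a hypothetical wrapper and nr_min is a
# placeholder argument, not the value kdumpctl computes.
print_note() {
    local nr_min=$1
    # -n suppresses the trailing newline, so the next echo continues the line
    echo -n "Note: nr_cpus=1 may not be enough for kdump boot,"
    echo " try nr_cpus=$nr_min or larger instead"
}

print_note 2
```

Because both halves end up on the same line, any rewording has to read well as one sentence and stay short enough for the console.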
Even if it is one sentence, it still sounds like it is leading the user to use nr_cpus=X without a test. It would be better to tell people to use a larger nr_cpus only when nr_cpus=1 does not work in a test.
The original sentence, quoted below, seems to suggest that people use nr_cpus=$nr_min as soon as they see this note message.
I don't get what you mean. The original sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
This is printed at boot if kdump.service is enabled by default, or when 'systemctl restart kdump' is run. People see the 'Note' message, and they may run a test to check it. And the first half of the sentence already says 'nr_cpus=1 may not be enough', then .... Are you suggesting changing it to
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original first half and the addition you suggested are duplicates. I can change it to the one above.
Hmm, seems I did not describe it clearly, let me retry :)
I mean the original "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead" seems to suggest that people use nr_cpus=$nr_min as soon as they see this note message, even without a test. So maybe say it more clearly, like below: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in case nr_cpus=1 does not work"
This looks too long when printed on screen. I personally think the old one is good enough, or adjust it as I said. But I am fine if you insist.
About the nr_cpus=1 test in the 1st kernel: it can prove the algorithm is correct if the 1st kernel cannot boot with nr_cpus=1.
Since xunlei did not find a machine to do a real test, it is better to test it now that we see this message on a real machine.
Well, I checked the formula against the corresponding kernel code, and I think it's reasonable. It's not a precise calculation, just an estimate. So I personally think testing in the 1st kernel doesn't make sense, since 'nr_cpus=1' working in the 1st kernel won't show the calculation is bad, and 'nr_cpus=1' not working in the 1st kernel won't prove it's precise. That's my personal thought. If you insist, I can run a test.
}
--
2.5.5
Thanks Dave