In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com --- kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot," + echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot," echo " try nr_cpus=$nr_min or larger instead" }
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min instead"
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
Thanks Dave
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
This happens when kernel boot if kdump.service is enabled by default or run systemctl restart kdump. People see the 'Note' message, and they may do a test to check it. And the former half sentence has told if 'nr_cpus=1 may not be enough', then .... Are you suggesting to change it as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original half one and the adding one you suggested are duuplicate. I can change it as above one.
Thanks Baoquan
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
Thanks Dave
On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
I tried to make a better printing message, while it's a little awkward to print too much to customer who may not care about the principal, and it's also very hard to tell clearly without too few words.
This happens when kernel boot if kdump.service is enabled by default or run systemctl restart kdump. People see the 'Note' message, and they may do a test to check it. And the former half sentence has told if 'nr_cpus=1 may not be enough', then .... Are you suggesting to change it as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original half one and the adding one you suggested are duuplicate. I can change it as above one.
Thanks Baoquan
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
Thanks Dave
On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
This happens when kernel boot if kdump.service is enabled by default or run systemctl restart kdump. People see the 'Note' message, and they may do a test to check it. And the former half sentence has told if 'nr_cpus=1 may not be enough', then .... Are you suggesting to change it as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original half one and the adding one you suggested are duuplicate. I can change it as above one.
Hmm, seems I did not describe it clearly, let me retry :)
I means original "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead" seems suggest people to use nr_cpus=$nr_min once they see this note message even without a test. So maybe say it more clear like below: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in case nr_cpus=1 does not work"
About the nr_cpus=1 test in 1st kernel, it can prove the algorithm is correct if 1st kernel can not boot with nr_cpus=1
Since xunlei did not find a machine to do a real test, so it is better to test it since we see such message now on a real machine.
Thanks Baoquan
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
Thanks Dave
Thanks Dave
On 11/22/17 at 01:30pm, Dave Young wrote:
On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
This happens when kernel boot if kdump.service is enabled by default or run systemctl restart kdump. People see the 'Note' message, and they may do a test to check it. And the former half sentence has told if 'nr_cpus=1 may not be enough', then .... Are you suggesting to change it as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original half one and the adding one you suggested are duuplicate. I can change it as above one.
Hmm, seems I did not describe it clearly, let me retry :)
I means original "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead" seems suggest people to use nr_cpus=$nr_min once they see this note message even without a test. So maybe say it more clear like below: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in case nr_cpus=1 does not work"
This looks too long when print in screen. I personaly think the old one is good enough, or adjust as I said. But I am fine if you insist.
About the nr_cpus=1 test in 1st kernel, it can prove the algorithm is correct if 1st kernel can not boot with nr_cpus=1
Since xunlei did not find a machine to do a real test, so it is better to test it since we see such message now on a real machine.
Well, I checked the formula corresponding to kernel code, and think it's reasonable. It's not a precise calculation, just a estimation value. So I personally think the testing in 1st kernel doesn't make sense, since 'nr_cpus=1' working in 1st kernel wont' tell the calculation not good, and 'nr_cpus=1' not working in 1st kernel won't prove it's very precise. My personal thought. If you insist, I can take a test.
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
Thanks Dave
Thanks Dave
On 11/22/17 at 02:34pm, Baoquan He wrote:
On 11/22/17 at 01:30pm, Dave Young wrote:
On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote: > In commit: > > commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors") > > ... function check_kdump_cpus() was introduced to check if number of cpu > is enough to boot to provide enough interrupt vectors. > > As we know, there are 256 interrupt vectors on each cpu to correspond to > irq. Except of those system reserved vectors, only part of 256 will be > used for device. So if there are many IO devices on a system, E.g more > than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel > cmdline may not work. Especially on large NUMA system, with multiple > cpus and multiple devices. > > However, usually firmware engineer will consider the irq number > corresponding to IO devices and CPU number, so it's not a problem for > normal kernel. In kdump kernel, it may be a problem since we default to > specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many > devices which is not needed for vmcore dumping, drivers related to them > are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may > not be enough for kdump boot' could be false positive to a large extent. > So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or > customers won't open bug for a false positive report. > > Signed-off-by: Baoquan He bhe@redhat.com > --- > kdumpctl | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/kdumpctl b/kdumpctl > index d4e00503a221..012369da8879 100755 > --- a/kdumpctl > +++ b/kdumpctl > @@ -171,7 +171,7 @@ check_kdump_cpus() > return > fi > > - echo -n "Warning: nr_cpus=1 may not be enough for kdump boot," > + echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
> echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
This happens when kernel boot if kdump.service is enabled by default or run systemctl restart kdump. People see the 'Note' message, and they may do a test to check it. And the former half sentence has told if 'nr_cpus=1 may not be enough', then .... Are you suggesting to change it as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original half one and the adding one you suggested are duuplicate. I can change it as above one.
Hmm, seems I did not describe it clearly, let me retry :)
I means original "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead" seems suggest people to use nr_cpus=$nr_min once they see this note message even without a test. So maybe say it more clear like below: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in case nr_cpus=1 does not work"
This looks too long when print in screen. I personaly think the old one is good enough, or adjust as I said. But I am fine if you insist.
I did not notice the 'echo -n' so original message is over 80 chars already..
About the nr_cpus=1 test in 1st kernel, it can prove the algorithm is correct if 1st kernel can not boot with nr_cpus=1
Since xunlei did not find a machine to do a real test, so it is better to test it since we see such message now on a real machine.
Well, I checked the formula corresponding to kernel code, and think it's reasonable. It's not a precise calculation, just a estimation value. So I personally think the testing in 1st kernel doesn't make sense, since 'nr_cpus=1' working in 1st kernel wont' tell the calculation not good, and 'nr_cpus=1' not working in 1st kernel won't prove it's very precise. My personal thought. If you insist, I can take a test.
It is an estimation, but personally I think a test is still worth..
We missed another thing about this nr_cpus issue, that is for example one already set nr_cpus=4 and 4>=nr_min, but we still tell that "nr_cpus=1 may not be enough" it sounds odd.
How about move this function and warning out of kdumpctl as another script like /usr/bin/kdump_estimate_nr_cpus.sh, if people need they can call it by themselves, we just document it in kdump howto but do not run it in kdumpctl automatically?
> } > > -- > 2.5.5 > _______________________________________________ > kexec mailing list -- kexec@lists.fedoraproject.org > To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
Thanks Dave
Thanks Dave
Thanks Dave
On 11/22/17 at 03:09pm, Dave Young wrote:
On 11/22/17 at 02:34pm, Baoquan He wrote:
On 11/22/17 at 01:30pm, Dave Young wrote:
On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote: > Hi Baoquan, > > On 11/21/17 at 05:07pm, Baoquan He wrote: > > In commit: > > > > commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors") > > > > ... function check_kdump_cpus() was introduced to check if number of cpu > > is enough to boot to provide enough interrupt vectors. > > > > As we know, there are 256 interrupt vectors on each cpu to correspond to > > irq. Except of those system reserved vectors, only part of 256 will be > > used for device. So if there are many IO devices on a system, E.g more > > than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel > > cmdline may not work. Especially on large NUMA system, with multiple > > cpus and multiple devices. > > > > However, usually firmware engineer will consider the irq number > > corresponding to IO devices and CPU number, so it's not a problem for > > normal kernel. In kdump kernel, it may be a problem since we default to > > specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many > > devices which is not needed for vmcore dumping, drivers related to them > > are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may > > not be enough for kdump boot' could be false positive to a large extent. > > So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or > > customers won't open bug for a false positive report. > > > > Signed-off-by: Baoquan He bhe@redhat.com > > --- > > kdumpctl | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/kdumpctl b/kdumpctl > > index d4e00503a221..012369da8879 100755 > > --- a/kdumpctl > > +++ b/kdumpctl > > @@ -171,7 +171,7 @@ check_kdump_cpus() > > return > > fi > > > > - echo -n "Warning: nr_cpus=1 may not be enough for kdump boot," > > + echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot," > > It looks better to remove the brackets.
OK, will change. thx
> > > echo " try nr_cpus=$nr_min or larger instead" > > this sentence can also be changed to for example: > try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
This happens when kernel boot if kdump.service is enabled by default or run systemctl restart kdump. People see the 'Note' message, and they may do a test to check it. And the former half sentence has told if 'nr_cpus=1 may not be enough', then .... Are you suggesting to change it as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original half one and the adding one you suggested are duuplicate. I can change it as above one.
Hmm, seems I did not describe it clearly, let me retry :)
I means original "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead" seems suggest people to use nr_cpus=$nr_min once they see this note message even without a test. So maybe say it more clear like below: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in case nr_cpus=1 does not work"
This looks too long when print in screen. I personaly think the old one is good enough, or adjust as I said. But I am fine if you insist.
I did not notice the 'echo -n' so original message is over 80 chars already..
If this go out of kdumpctl, then we do not need the message, the standalone script just tell the estimated nr_cpus number. Opinion?
If we still add this message in kdumpctl then this message should better to be in 2 lines.
About the nr_cpus=1 test in 1st kernel, it can prove the algorithm is correct if 1st kernel can not boot with nr_cpus=1
Since xunlei did not find a machine to do a real test, so it is better to test it since we see such message now on a real machine.
Well, I checked the formula corresponding to kernel code, and think it's reasonable. It's not a precise calculation, just a estimation value. So I personally think the testing in 1st kernel doesn't make sense, since 'nr_cpus=1' working in 1st kernel wont' tell the calculation not good, and 'nr_cpus=1' not working in 1st kernel won't prove it's very precise. My personal thought. If you insist, I can take a test.
It is an estimation, but personally I think a test is still worth..
We missed another thing about this nr_cpus issue, that is for example one already set nr_cpus=4 and 4>=nr_min, but we still tell that "nr_cpus=1 may not be enough" it sounds odd.
How about move this function and warning out of kdumpctl as another script like /usr/bin/kdump_estimate_nr_cpus.sh, if people need they can call it by themselves, we just document it in kdump howto but do not run it in kdumpctl automatically?
> > > } > > > > -- > > 2.5.5 > > _______________________________________________ > > kexec mailing list -- kexec@lists.fedoraproject.org > > To unsubscribe send an email to kexec-leave@lists.fedoraproject.org > > Thanks > Dave
Thanks Dave
Thanks Dave
Thanks Dave
On Wed, Nov 22, 2017 at 12:43 PM, Dave Young dyoung@redhat.com wrote:
On 11/22/17 at 03:09pm, Dave Young wrote:
On 11/22/17 at 02:34pm, Baoquan He wrote:
On 11/22/17 at 01:30pm, Dave Young wrote:
On 11/22/17 at 01:18pm, Baoquan He wrote:
On 11/22/17 at 12:40pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote: > On 11/22/17 at 11:17am, Dave Young wrote: > > Hi Baoquan, > > > > On 11/21/17 at 05:07pm, Baoquan He wrote: > > > In commit: > > > > > > commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors") > > > > > > ... function check_kdump_cpus() was introduced to check if number of cpu > > > is enough to boot to provide enough interrupt vectors. > > > > > > As we know, there are 256 interrupt vectors on each cpu to correspond to > > > irq. Except of those system reserved vectors, only part of 256 will be > > > used for device. So if there are many IO devices on a system, E.g more > > > than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel > > > cmdline may not work. Especially on large NUMA system, with multiple > > > cpus and multiple devices. > > > > > > However, usually firmware engineer will consider the irq number > > > corresponding to IO devices and CPU number, so it's not a problem for > > > normal kernel. In kdump kernel, it may be a problem since we default to > > > specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many > > > devices which is not needed for vmcore dumping, drivers related to them > > > are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may > > > not be enough for kdump boot' could be false positive to a large extent. > > > So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or > > > customers won't open bug for a false positive report. > > > > > > Signed-off-by: Baoquan He bhe@redhat.com > > > --- > > > kdumpctl | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/kdumpctl b/kdumpctl > > > index d4e00503a221..012369da8879 100755 > > > --- a/kdumpctl > > > +++ b/kdumpctl > > > @@ -171,7 +171,7 @@ check_kdump_cpus() > > > return > > > fi > > > > > > - echo -n "Warning: nr_cpus=1 may not be enough for kdump boot," > > > + echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot," > > > > It looks better to remove the brackets. > > OK, will change. thx > > > > > > echo " try nr_cpus=$nr_min or larger instead" > > > > this sentence can also be changed to for example: > > try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work. > > This is one sentence with the above 'Note' one since the above printing > uses 'echo -n'.
Even if it is one sentence, it still sounds like leading user to use nr_cpu=X without a test. It should be better to notify people to use larger nr_cpus only when nr_cpus=1 does not work in a test.
Original sentence like below, it looks to suggest people to use nr_cpus=$nr_min once he see this note message.
I don't get what you mean. The origin sentence is:
"nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
This happens when kernel boot if kdump.service is enabled by default or run systemctl restart kdump. People see the 'Note' message, and they may do a test to check it. And the former half sentence has told if 'nr_cpus=1 may not be enough', then .... Are you suggesting to change it as
"If nr_cpus=1 does not work for kdump boot, try nr_cpus=$nr_min or larger instead"
Otherwise the original half one and the adding one you suggested are duuplicate. I can change it as above one.
Hmm, seems I did not describe it clearly, let me retry :)
I means original "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead" seems suggest people to use nr_cpus=$nr_min once they see this note message even without a test. So maybe say it more clear like below: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in case nr_cpus=1 does not work"
This looks too long when print in screen. I personaly think the old one is good enough, or adjust as I said. But I am fine if you insist.
I did not notice the 'echo -n' so original message is over 80 chars already..
If this go out of kdumpctl, then we do not need the message, the standalone script just tell the estimated nr_cpus number. Opinion?
If we still add this message in kdumpctl then this message should better to be in 2 lines.
I also agree that the existing statement needs to be improved: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
Here is some background on a similar issue I saw: While debugging an issue with kdump kernel boot, which lead to a "kernel BUG at arch/x86/kernel/apic/io_apic.c", I realized that the number of irqs is limited to 256 in the x86_64 kdump kernel boot, thus causing MSI allocations for some of the drivers to fail as a number of drivers would be trying to acquire MSI interrupts and some or the other driver would fail while allocating the MSIs.
In my view, a extension to the existing NOTE or a new NOTE, would further help which mentions that the number of irqs might be limited due to the nr_cpus being equal to 1, thus causing some the drivers's probe() to fail or cause a kernel panic or lead to a BUG().
However, the text 'nr_cpus=1' is confusing to me as well , as a user might have boot'ed a kdump kernel with 'nr_cpus>1', so we can try and generalize that further..
Regards, Bhupesh
seems suggest people to use nr_cpus=$nr_min once they see this note message even without a test. So maybe say it more clear like below: "nr_cpus=1 may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead in case nr_cpus=1 does not work"
About the nr_cpus=1 test in 1st kernel, it can prove the algorithm is correct if 1st kernel can not boot with nr_cpus=1
Since xunlei did not find a machine to do a real test, so it is better to test it since we see such message now on a real machine.
Well, I checked the formula corresponding to kernel code, and think it's reasonable. It's not a precise calculation, just a estimation value. So I personally think the testing in 1st kernel doesn't make sense, since 'nr_cpus=1' working in 1st kernel wont' tell the calculation not good, and 'nr_cpus=1' not working in 1st kernel won't prove it's very precise. My personal thought. If you insist, I can take a test.
It is an estimation, but personally I think a test is still worth..
We missed another thing about this nr_cpus issue, that is for example one already set nr_cpus=4 and 4>=nr_min, but we still tell that "nr_cpus=1 may not be enough" it sounds odd.
How about move this function and warning out of kdumpctl as another script like /usr/bin/kdump_estimate_nr_cpus.sh, if people need they can call it by themselves, we just document it in kdump howto but do not run it in kdumpctl automatically?
> > > > > > } > > > > > > -- > > > 2.5.5 > > > _______________________________________________ > > > kexec mailing list -- kexec@lists.fedoraproject.org > > > To unsubscribe send an email to kexec-leave@lists.fedoraproject.org > > > > Thanks > > Dave
Thanks Dave
Thanks Dave
Thanks Dave
kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
BTW, Baoquan, as we know this message appears in a test on a 4T machine, have you get chance to test nr_cpus=1 on 1st kernel, does it work?
On 11/22/17 at 12:41pm, Dave Young wrote:
On 11/22/17 at 11:25am, Baoquan He wrote:
On 11/22/17 at 11:17am, Dave Young wrote:
Hi Baoquan,
On 11/21/17 at 05:07pm, Baoquan He wrote:
In commit:
commit 2040103bd7 ("kdumpctl: sanity check of nr_cpus for x86_64 in case running out of vectors")
... function check_kdump_cpus() was introduced to check if number of cpu is enough to boot to provide enough interrupt vectors.
As we know, there are 256 interrupt vectors on each cpu to correspond to irq. Except of those system reserved vectors, only part of 256 will be used for device. So if there are many IO devices on a system, E.g more than 256 irqs are needed to map, then specifying 'nr_cpus=1' in kernel cmdline may not work. Especially on large NUMA system, with multiple cpus and multiple devices.
However, usually firmware engineer will consider the irq number corresponding to IO devices and CPU number, so it's not a problem for normal kernel. In kdump kernel, it may be a problem since we default to specify 'nr_cpus=1' in kernel cmdline. Or maybe not, because for many devices which is not needed for vmcore dumping, drivers related to them are not added to kdump kernel, then the message 'Warning: nr_cpus=1 may not be enough for kdump boot' could be false positive to a large extent. So in this patch, change 'Warning: ...' to 'Note: ...' so that QA or customers won't open bug for a false positive report.
Signed-off-by: Baoquan He bhe@redhat.com
kdumpctl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kdumpctl b/kdumpctl index d4e00503a221..012369da8879 100755 --- a/kdumpctl +++ b/kdumpctl @@ -171,7 +171,7 @@ check_kdump_cpus() return fi
- echo -n "Warning: nr_cpus=1 may not be enough for kdump boot,"
- echo -n "[Note]: nr_cpus=1 may not be enough for kdump boot,"
It looks better to remove the brackets.
OK, will change. thx
echo " try nr_cpus=$nr_min or larger instead"
this sentence can also be changed to for example: try nr_cpus=$nr_min or larger instead if nr_cpus=1 does not work.
This is one sentence with the above 'Note' one since the above printing uses 'echo -n'.
}
-- 2.5.5 _______________________________________________ kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
Thanks Dave
kexec mailing list -- kexec@lists.fedoraproject.org To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
BTW, Baoquan, as we know this message appears in a test on a 4T machine, have you get chance to test nr_cpus=1 on 1st kernel, does it work?
I didn't try.