On 01/09/2017 at 11:10 AM, Dave Young wrote:
On 01/06/17 at 05:10pm, Xunlei Pang wrote:
> On 01/06/2017 at 04:46 PM, Dave Young wrote:
>> Hi, Xunlei
>> Thanks for the patch; a few comments inline.
>> On 01/06/17 at 02:37pm, Xunlei Pang wrote:
>>> Check the number of cpus for x86_64 kdump kernel to boot with.
>>> We met an issue for x86_64: kdump runs out of vectors with the
>>> default "nr_cpus=1", when requesting tons of irqs.
>>> This patch detects such a situation and warns users about the risk.
>>> Signed-off-by: Xunlei Pang <xlpang(a)redhat.com>
>>> - When detecting risky cpu vectors, we just warn users instead of
>>> modifying "nr_cpus=X" forcibly.
>>> - Improved code comments.
>>> - Replaced nr_old with nr_origin, and improved some logic.
>>> kdumpctl | 81
>>> 1 file changed, 81 insertions(+)
>>> diff --git a/kdumpctl b/kdumpctl
>>> index b2068cc..b6fc1f9 100755
>>> --- a/kdumpctl
>>> +++ b/kdumpctl
>>> @@ -105,6 +105,85 @@ append_cmdline()
>>> echo $cmdline
>>> +# Check the number of cpus for kdump kernel to boot with.
>>> +# We met an issue for x86_64: kdump runs out of vectors with
>>> +# "nr_cpus=1" when requesting tons of irqs, so here we check
>>> +# "nr_cpus=X" and warn users if kdump probably can't work.
>>> + local nr nr_search nr_origin nr_min nr_max
>>> + local arch=$(uname -m) cmdline=$KDUMP_COMMANDLINE
>>> + # Special treatment for x86_64 only currently.
>>> + if [ $arch != "x86_64" ]; then
>>> + return
>>> + fi
>>> + # We only care about "nr_cpus=X" format for x86.
>>> + nr_search=$(echo $cmdline | grep -o "nr_cpus=[0-9]*" | wc -l)
>>> + if [ $nr_search -eq 0 ] ; then
>>> + # Do not need to process if no valid "nr_cpus=X" specified.
>> This comment does not seem necessary.
> ok, will remove it.
>>> + return
>>> + fi
>>> + # Get value X of "nr_cpus=X"
>>> + nr_search=$(echo $cmdline | grep -o "nr_cpus=[0-9]*" | cut -d "=" -f2 | grep "[0-9]" | sort)
>> Is it ok to check $nr_search -eq 0 here and drop the previous chunk?
> It's on purpose; the behavior is a little different if there is an invalid "nr_cpus=".
How about checking only for "nr_cpus=1", since that is the default we set? If
someone sets a different value, they have presumably tested it. Then we do not
need to check these corner cases, and all the error handling can be dropped.
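
To make the idea concrete, here is a minimal standalone sketch of the two
pieces such a simplified check would need: matching only the default
"nr_cpus=1", and estimating the cpus required from the irq count. The helper
names has_default_nr_cpus and estimate_min_cpus are hypothetical and do not
exist in kdumpctl; the figure of 192 device vectors percpu is the assumption
explained in the patch comments below, not a kernel constant.

```shell
#!/bin/bash

# Hypothetical helper: succeed only if the command line contains "nr_cpus=1"
# as a whole word, so that e.g. "nr_cpus=16" does not match.
has_default_nr_cpus() {
    local cmdline=$1
    echo "$cmdline" | grep -E -q '(^|[[:space:]])nr_cpus=1([[:space:]]|$)'
}

# Hypothetical helper: estimate the cpus needed for a given number of irqs,
# assuming 192 usable device vectors percpu (an assumption, see above).
estimate_min_cpus() {
    local nr_irqs=$1 per_cpu=192 nr_min

    # Ceiling division: cpus needed if every irq takes exactly one vector.
    nr_min=$(( (nr_irqs + per_cpu - 1) / per_cpu ))

    if [ "$nr_min" -gt 1 ]; then
        nr_min=$(( nr_min + 1 ))            # one spare cpu for safety
        nr_min=$(( nr_min + nr_min % 2 ))   # round up to an even number
    fi
    echo "$nr_min"
}
```

With these definitions, "estimate_min_cpus 193" prints 4: two cpus from the
ceiling division, one spare, then rounded up to an even number. Anchoring the
match on whitespace or line boundaries on both sides is a design choice of
this sketch; the pattern in the patch below only checks the trailing side.
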
What do you think of the following change?
kdumpctl | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)
diff --git a/kdumpctl b/kdumpctl
index b2068cc..4411ec5 100755
@@ -105,6 +105,78 @@ append_cmdline()
+# Check the number of cpus for kdump kernel to boot with.
+# We met an issue for x86_64: kdump runs out of vectors with
+# "nr_cpus=1" when requesting tons of irqs, so here we check
+# "nr_cpus=1" and warn users if kdump probably can't work.
+ local nr_origin nr_min nr_max
+ local arch=$(uname -m) cmdline=$KDUMP_COMMANDLINE_APPEND
+ # Special treatment for x86_64 only currently.
+ if [ $arch != "x86_64" ]; then
+ # We only care about the default "nr_cpus=1".
+ echo $cmdline | grep -E -q "nr_cpus=1 |nr_cpus=1$"
+ if [ $? -ne 0 ]; then
+ # Online cpus in first kernel.
+ nr_max=$(grep -c '^processor' /proc/cpuinfo)
+ # Estimate the minimal number of cpus required by device interrupts.
+ nr_min=$(ls /proc/irq/ -l | grep ^d | wc -l)
+ # The x86 architecture defines 256 vectors percpu. The vectors that can
+ # be allocated to io devices start at FIRST_EXTERNAL_VECTOR (see kernel
+ # code), and some high-numbered ones are consumed by system interrupts.
+ # As a result, the vectors for io devices lie within
+ # [FIRST_EXTERNAL_VECTOR, FIRST_SYSTEM_VECTOR), with one known exception:
+ # 0x80 within the range is reserved as the syscall vector.
+ # FIRST_EXTERNAL_VECTOR is invariably 32, while FIRST_SYSTEM_VECTOR can
+ # vary between kernel versions. E.g. FIRST_SYSTEM_VECTOR is 0xef (with
+ # CONFIG_X86_LOCAL_APIC on) for linux-4.10, which leaves 17 vectors
+ # reserved. Since that number may increase in the future, and to cover
+ # the special vectors, we leave some slack and assume 32 of the vectors
+ # from FIRST_EXTERNAL_VECTOR upward are reserved. The max number of
+ # vectors for device interrupts percpu is then: (256-32)-32=192.
+ # For "nr_cpus=1", irqs and vectors have a 1:1 mapping.
+ nr_min=$(($nr_min + 192 - 1))
+ nr_min=$(($nr_min / 192))
+ if [ $nr_min -gt 1 ]; then
+ # The system seems to have tons of interrupts. Interrupts with
+ # multiple-cpu affinity can consume multiple vectors (i.e. 1:M mapping),
+ # one vector for each cpu within the affinity mask. Fortunately, with
+ # x2apic, which is widely used on large modern machines, boot and device
+ # bringup use a single cpu for interrupt affinity by default to minimize
+ # vector pressure.
+ # For further safety, we add one more cpu and round up to an even
+ # number, which is commonly used.
+ nr_min=$(($nr_min + 1))
+ nr_min=$(($nr_min + $nr_min % 2))
+ if [ $nr_min -gt $nr_max ]; then
+ if [ $nr_origin -ge $nr_min ]; then
+ echo "Warning: nr_cpus=$nr_origin may not be enough for kdump boot, try nr_cpus=$nr_min or larger instead"
# This function performs a series of edits on the command line.
# Store the final result in global $KDUMP_COMMANDLINE.
@@ -134,6 +206,8 @@ prepare_cmdline()