On 01/09/2017 at 11:10 AM, Dave Young wrote:
On 01/06/17 at 05:10pm, Xunlei Pang wrote:
> On 01/06/2017 at 04:46 PM, Dave Young wrote:
>> Hi, Xunlei
>> Thanks for the patch, a few comments replied inline.
>> On 01/06/17 at 02:37pm, Xunlei Pang wrote:
>>> Check the number of cpus for x86_64 kdump kernel to boot with.
>>> We met an issue for x86_64: kdump runs out of vectors with the
>>> default "nr_cpus=1", when requesting tons of irqs.
>>> This patch detects such situation and warns users about the risk.
>>> Signed-off-by: Xunlei Pang <xlpang(a)redhat.com>
>>> - When detecting risky cpu vectors, we just warn users instead of
>>> modifying "nr_cpus=X" forcely.
>>> - Improved code comments.
>>> - Replaced nr_old with nr_origin, and improved some logic.
>>> kdumpctl | 81
>>> 1 file changed, 81 insertions(+)
>>> diff --git a/kdumpctl b/kdumpctl
>>> index b2068cc..b6fc1f9 100755
>>> --- a/kdumpctl
>>> +++ b/kdumpctl
>>> @@ -105,6 +105,85 @@ append_cmdline()
>>> echo $cmdline
>>> +# Check the number of cpus for kdump kernel to boot with.
>>> +# We met an issue for x86_64: kdump runs out of vectors with
>>> +# "nr_cpus=1" when requesting tons of irqs, so here we check
>>> +# "nr_cpus=X" and warn users if kdump probably can't work.
>>> + local nr nr_search nr_origin nr_min nr_max
>>> + local arch=$(uname -m) cmdline=$KDUMP_COMMANDLINE
>>> + # Special treatment for x86_64 only currently.
>>> + if [ $arch != "x86_64" ]; then
>>> + return
>>> + fi
>>> + # We only care about "nr_cpus=X" format for x86.
>>> + nr_search=$(echo $cmdline | grep -o "nr_cpus=[0-9]*" | wc -l)
>>> + if [ $nr_search -eq 0 ] ; then
>>> + # Do not need to process if no valid "nr_cpus=X" specified.
>> This comment sounds not necessary..
> ok, will remove it.
>>> + return
>>> + fi
>>> + # Get value X of "nr_cpus=X"
>>> + nr_search=$(echo $cmdline | grep -o "nr_cpus=[0-9]*" | cut -d
"=" -f2 | grep "[0-9]" | sort)
>> Is it ok to check $nr_search -eq 0 here and drop the previous chunk?
> It's on purpose, a little different if there is wrong "nr_cpus="
How about only check for "nr_cpus=1" as we set it as default, if one set
as other value, they must have tested it. So that we do not need these
corner cases checking, then all the error handling can be dropped.
ok, we can peek sysconfig/kdump, if there is nr_cpus=1, then activate the check.