On 01/03/2017 at 02:41 PM, Baoquan He wrote:
Hi Xunlei,
As we discussed in the meeting, I personally tend to agree that we should
not do it this way, mainly because it's a corner case and there is no clear
criterion for deciding which systems will hit it. The adjustment would
therefore apply to every system and may bring maintenance cost later,
though that is hard to predict now. Maybe we can fix it later, when a
customer has a specific request for it.
Warning the user of the danger or letting the customer adjust nr_cpus
manually, either is fine with me.
Hi Baoquan,
Thanks for the comments.
Yes, I'd prefer printing a warning when the service restarts to alert users.
Regards,
Xunlei
Thanks
Baoquan
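
The warning-only option discussed above could look roughly like the following. This is a minimal sketch, not code from kdumpctl: the helper names get_min_nr_cpus and warn_if_nr_cpus_low are hypothetical, and the required cpu count is assumed to be supplied by the caller. It leaves the command line untouched and only reports. It sorts with "sort -n" because a plain lexical sort would order 10 before 2.

```shell
#!/bin/bash
# Hypothetical helper (not part of kdumpctl): print the smallest positive
# X among all "nr_cpus=X" parameters in a command line, or nothing.
get_min_nr_cpus()
{
	local cmdline=$1

	echo "$cmdline" | grep -oE 'nr_cpus=[0-9]+' | cut -d= -f2 \
		| sort -n | grep -v '^0*$' | head -n 1
}

# Hypothetical helper: warn on stderr, without changing anything, when the
# configured nr_cpus is below the required count passed in $2.
# Returns 1 if a warning was printed, 0 otherwise.
warn_if_nr_cpus_low()
{
	local cmdline=$1 required=$2 current

	current=$(get_min_nr_cpus "$cmdline")
	# Nothing to check when no usable "nr_cpus=X" is present.
	if [ -z "$current" ]; then
		return 0
	fi

	if [ "$current" -lt "$required" ]; then
		echo "Warning: nr_cpus=$current may be too small for kdump," \
			"consider nr_cpus=$required" >&2
		return 1
	fi
	return 0
}
```

A caller would pass the kdump command line and its own estimate, for example: warn_if_nr_cpus_low "$KDUMP_COMMANDLINE" 4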
On 12/23/16 at 04:02pm, Xunlei Pang wrote:
> Adjust the number of cpus for x86_64 kdump kernel to boot with.
> We met an issue for x86_64: kdump runs out of vectors with the
> default "nr_cpus=1", when requesting tons of irqs.
>
> This patch detects such a situation and ensures the minimum number of
> cpus to be used by kdump.
>
> Signed-off-by: Xunlei Pang <xlpang(a)redhat.com>
> ---
> kdumpctl | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 77 insertions(+)
>
> diff --git a/kdumpctl b/kdumpctl
> index 507f2dc..b633c34 100755
> --- a/kdumpctl
> +++ b/kdumpctl
> @@ -105,6 +105,81 @@ append_cmdline()
> echo $cmdline
> }
>
> +# Adjust the number of cpus for kdump kernel to boot with.
> +# We met an issue for x86_64: kdump runs out of vectors with
> +# "nr_cpus=1" when requesting tons of irqs, so we adjust the
> +# "nr_cpus=X" here to make kdump work properly.
> +try_to_adjust_kdump_cpus()
> +{
> + local nr_old nr_min nr_max nr
> + local arch=$(uname -m) cmdline=$KDUMP_COMMANDLINE
> +
> + # Special treatment for x86_64 only currently.
> + if [ $arch != "x86_64" ]; then
> + return
> + fi
> +
> + # We only care about "nr_cpus=X" format for x86.
> + nr_old=$(echo $cmdline | grep -o "nr_cpus=[0-9]*" | wc -l)
> + if [ $nr_old -eq 0 ] ; then
> + # Do not need to process if no valid "nr_cpus=X" specified.
> + return
> + fi
> +
> + # Get value X of "nr_cpus=X"
> + nr_old=$(echo $cmdline | grep -o "nr_cpus=[0-9]*" | cut -d "=" -f2 | grep "[0-9]" | sort)
> + # In case there are multiple "nr_cpus=X", get the minimum value.
> + for nr in $nr_old; do
> + if [ $nr -gt 0 ]; then
> + nr_old=$nr
> + break
> + fi
> + done
> + if [ -z "$nr_old" ]; then
> + echo "Warning: Wrong \"nr_cpus=\" kernel cmdline detected"
> + return
> + fi
> +
> + # Online cpus in first kernel.
> + nr_max=$(nproc)
> +
> + # Estimated minimum cpus required by irqs (vectors); we roughly
> + # use 256-32 (see kernel FIRST_EXTERNAL_VECTOR) = 224 as the max
> + # number of vectors that can be allocated to io devices per cpu.
> + nr_min=$(ls /proc/irq/ -l | grep ^d | wc -l)
> + # As nr_min is a ballpark figure, also some high-numbered vectors
> + # are consumed by the kernel(see FIRST_SYSTEM_VECTOR), we need a
> + # variance for safety.
> + #
> + # We got a large machine with 240 cpus, 6TB memory, 8 iommus, and
> + # 132 irqs under /proc/irq/, it boots successfully with "nr_cpus=1".
> + # (224-132)=92, choose 64 as the variance seems ok?
> + nr_min=$(($nr_min + 64))
> +
> + nr_min=$(($nr_min + 224 - 1))
> + nr_min=$(($nr_min / 224))
> + if [ $nr_min -gt 1 ]; then
> + # The system seems to have tons of irqs, add one more cpu
> + # for further safety.
> + nr_min=$(($nr_min + 1))
> + # Round up to an even number.
> + nr_min=$(($nr_min + $nr_min % 2))
> + fi
> +
> + if [ $nr_min -gt $nr_max ]; then
> + nr_min=$nr_max
> + fi
> +
> + if [ $nr_old -ge $nr_min ]; then
> + return
> + fi
> +
> + cmdline=$(remove_cmdline_param "$cmdline" nr_cpus)
> + cmdline="${cmdline} nr_cpus=${nr_min}"
> + KDUMP_COMMANDLINE=$cmdline
> + echo "CPU vector under pressure with \"nr_cpus=$nr_old\", replaced it with \"nr_cpus=$nr_min\""
> +}
> +
> # This function performs a series of edits on the command line
> # Store the final result in global $KDUMP_COMMANDLINE.
> prepare_cmdline()
> @@ -134,6 +209,8 @@ prepare_cmdline()
> fi
>
> KDUMP_COMMANDLINE=$cmdline
> +
> + try_to_adjust_kdump_cpus
> }
>
>
> --
> 1.8.3.1
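
To make the arithmetic in the patch easier to check, here is a standalone sketch of the estimate. It assumes a hypothetical function estimate_min_cpus that takes the irq count and the online cpu count as arguments instead of reading /proc/irq/ and nproc; the constants 224 and 64 are the ones used in the patch.

```shell
#!/bin/bash
# Hypothetical standalone version of the estimate used in the patch.
# $1: number of irq directories under /proc/irq/
# $2: number of online cpus in the first kernel
estimate_min_cpus()
{
	local irqs=$1 online=$2
	local per_cpu=224 margin=64 nr_min

	# Ceiling division of (irqs + margin) by the vectors usable per cpu.
	nr_min=$(( (irqs + margin + per_cpu - 1) / per_cpu ))

	if [ "$nr_min" -gt 1 ]; then
		# Add one spare cpu, then round up to an even number.
		nr_min=$(( nr_min + 1 ))
		nr_min=$(( nr_min + nr_min % 2 ))
	fi

	# Never ask for more cpus than are online.
	if [ "$nr_min" -gt "$online" ]; then
		nr_min=$online
	fi

	echo "$nr_min"
}

estimate_min_cpus 132 240   # the 240-cpu machine from the patch comment
estimate_min_cpus 500 240   # a made-up system with many irqs
estimate_min_cpus 500 2     # same irq count, but only 2 cpus online
```

With these inputs the function prints 1, 4 and 2. Because of the spare cpu and the rounding to an even number, the result jumps straight from 1 to 4 once the irq count exceeds 160.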