We hit a problem on AMD machines: when kdump is configured with "nr_cpus=4" and the crash happens on a CPU other than cpu0, the kdump kernel fails to boot and the machine eventually resets.

After some debugging, we found that it gets stuck in the kernel path
    do_boot_cpu() -> ... -> wakeup_secondary_cpu_via_init():
        apic_icr_write(APIC_INT_LEVELTRIG|APIC_INT_ASSERT|APIC_DM_INIT, phys_apicid);
that is, it gets stuck while sending INIT from the AP to the BP and then resets, which is exactly the situation "disable_cpu_apicid=X" is meant to avoid. Printing the value of @phys_apicid showed that it was the "apicid" value rather than the "initial apicid" value shown by /proc/cpuinfo.
As described in the x86 specification: "In MP systems, the local APIC ID is also used as a processor ID by the BIOS and the operating system. Some processors permit software to modify the APIC ID. However, the ability of software to modify the APIC ID is processor model specific. Because of this, operating system software should avoid writing to the local APIC ID register. The value returned by bits 31-24 of the EBX register (when the CPUID instruction is executed with a source operand value of 1 in the EAX register) is always the Initial APIC ID (determined by the platform initialization). This is true even if software has changed the value in the Local APIC ID register."
From kernel commit 151e0c7de ("x86, apic, kexec: Add disable_cpu_apicid kernel parameter"), we can see that generic_processor_info() compares a) @apicid and b) read_apic_id() against @disabled_cpu_apicid.

a) @apicid, which is the @phys_apicid mentioned above, comes from the following call trace (on the problematic AMD machine):
    generic_processor_info+0x37/0x300
    acpi_register_lapic+0x30/0x90
    acpi_parse_lapic+0x40/0x50
    acpi_table_parse_entries_array+0x171/0x1de
    acpi_boot_init+0xed/0x50f
The value of @apicid (from the ACPI MADT) is equal to the "apicid" value shown by /proc/cpuinfo, as confirmed by our debug printk.

b) read_apic_id() gets its value from the LAPIC ID register, which is "apicid" as well.
The value of "initial apicid", by contrast, comes from the CPUID instruction.

One example of "apicid" and "initial apicid" for cpu0 from /proc/cpuinfo on an AMD machine:
    apicid          : 32
    initial apicid  : 0
Therefore, we should pass the /proc/cpuinfo "apicid" value to "disable_cpu_apicid=X".
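To check whether a given machine is affected, the two fields can be compared per CPU. The following is only a minimal sketch: the helper name list_apicid_mismatches and its optional file argument are hypothetical and not part of kdumpctl. It assumes the cpuinfo layout shown above, where "apicid" is printed before "initial apicid" for each processor.

```shell
#!/bin/sh
# Hypothetical helper (not part of kdumpctl): print one line for each CPU
# whose "apicid" differs from its "initial apicid".
# $1: a cpuinfo-format file; defaults to /proc/cpuinfo.
list_apicid_mismatches()
{
    # Split each line on the colon, swallowing the padding around it,
    # so that $1 is the field name and $2 is the value.
    awk -F'[ \t]*:[ \t]*' '
        $1 == "processor"      { cpu = $2 }
        $1 == "apicid"         { apicid[cpu] = $2 }
        $1 == "initial apicid" {
            if (apicid[cpu] != $2)
                printf "cpu%s apicid=%s initial_apicid=%s\n", cpu, apicid[cpu], $2
        }
    ' "${1:-/proc/cpuinfo}"
}
```

Running list_apicid_mismatches with no argument reads /proc/cpuinfo; empty output means the two values agree on every CPU, which is what we usually see on Intel machines.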
We never hit this issue before because we usually tested with "nr_cpus=1", mostly on Intel machines, where "apicid" and "initial apicid" have the same value in most cases.
Signed-off-by: Xunlei Pang xlpang@redhat.com
---
 kdumpctl | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kdumpctl b/kdumpctl
index 4d6b3e8..46b65d2 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -77,15 +77,15 @@ remove_cmdline_param()
 }
 
 #
-# This function returns the "initial apicid" of the
-# boot cpu (cpu 0) if present.
+# This function returns the "apicid" of the boot
+# cpu (cpu 0) if present.
 #
-get_bootcpu_initial_apicid()
+get_bootcpu_apicid()
 {
     awk '                                                       \
         BEGIN { CPU = "-1"; }                                   \
         $1=="processor" && $2==":"      { CPU = $NF; }          \
-        CPU=="0" && /initial apicid/    { print $NF; }          \
+        CPU=="0" && /^apicid/           { print $NF; }          \
         '                                                       \
         /proc/cpuinfo
 }
@@ -206,7 +206,7 @@ prepare_cmdline()
 
 	cmdline="${cmdline} ${KDUMP_COMMANDLINE_APPEND}"
 
-	id=`get_bootcpu_initial_apicid`
+	id=`get_bootcpu_apicid`
 	if [ ! -z ${id} ] ; then
 		cmdline=`append_cmdline "${cmdline}" disable_cpu_apicid ${id}`
 	fi
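The "^" anchor in the new pattern matters: an unanchored /apicid/ would also match the "initial apicid" line, and the function would print two values for cpu 0. A small sketch for trying the new matching logic against a saved cpuinfo file is given here; the name get_bootcpu_apicid_from and its file argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical test variant of get_bootcpu_apicid() from the patch above:
# same awk rules, but the input file is a parameter so that the function
# can be tried against a saved copy of /proc/cpuinfo.
# $1: a cpuinfo-format file.
get_bootcpu_apicid_from()
{
    awk '
        BEGIN { CPU = "-1" }
        # Remember which processor block we are currently in.
        $1 == "processor" && $2 == ":" { CPU = $NF }
        # "^apicid" matches the "apicid" line only; the "initial apicid"
        # line does not start with "apicid", so it is skipped.
        CPU == "0" && /^apicid/        { print $NF }
    ' "$1"
}
```

For the cpu0 example in the patch description (apicid 32, initial apicid 0), this variant would print 32, which is the value the kernel compares against @disabled_cpu_apicid.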
Cc'ing HATAYAMA Daisuke, who introduced the "disable_cpu_apicid=X" kernel parameter.
On 05/09/2017 at 07:52 PM, Xunlei Pang wrote:
[...]
Pang,
Thanks for cc'ing me.
-----Original Message-----
From: Xunlei Pang [mailto:xlpang@redhat.com]
Sent: Tuesday, May 9, 2017 8:52 PM
To: kexec@lists.fedoraproject.org
Cc: Xunlei Pang xlpang@redhat.com
Subject: [PATCH] kdumpctl: use "apicid" other than "initial apicid"
[...]
For my understanding, could you show me the following information on the AMD machines?
- dmesg | grep "ACPI: LAPIC"
- /proc/cpuinfo
_______________________________________________
kexec mailing list -- kexec@lists.fedoraproject.org
To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
On 05/10/2017 at 09:54 AM, Hatayama, Daisuke wrote:
For my understanding, could you show me the following information on the AMD machines?
- dmesg | grep "ACPI: LAPIC"
- /proc/cpuinfo
# dmesg | grep "ACPI: LAPIC"
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x11] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x12] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x13] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x14] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x15] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x16] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x17] enabled)
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
# cat /proc/cpuinfo (there are 8 cpus, paste 4 cpus here)
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX(tm)-8350 Eight-Core Processor
stepping        : 0
microcode       : 0x600084f
cpu MHz         : 4000.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 16
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips        : 7982.77
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX(tm)-8350 Eight-Core Processor
stepping        : 0
microcode       : 0x600084f
cpu MHz         : 4000.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 1
cpu cores       : 4
apicid          : 17
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips        : 7982.77
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 2
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX(tm)-8350 Eight-Core Processor
stepping        : 0
microcode       : 0x600084f
cpu MHz         : 4000.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 2
cpu cores       : 4
apicid          : 18
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips        : 7982.77
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro

processor       : 3
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX(tm)-8350 Eight-Core Processor
stepping        : 0
microcode       : 0x600084f
cpu MHz         : 4000.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 19
initial apicid  : 3
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
bogomips        : 7982.77
TLB size        : 1536 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro
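Note that dmesg prints lapic_id in hexadecimal while /proc/cpuinfo prints "apicid" in decimal, so lapic_id[0x10] above is the same value as "apicid : 16" for cpu 0. A rough sketch for converting the dmesg lines so that they can be compared directly follows; the helper name lapic_ids_decimal is made up for this illustration and the sed pattern assumes the exact "ACPI: LAPIC (...)" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read "ACPI: LAPIC" lines on stdin and print
# "acpi_id lapic_id" pairs in decimal, one pair per line.
# Lines in any other format (for example LAPIC_NMI) are ignored.
lapic_ids_decimal()
{
    sed -n 's/.*ACPI: LAPIC (acpi_id\[\(0x[0-9a-fA-F]*\)\] lapic_id\[\(0x[0-9a-fA-F]*\)\] enabled).*/\1 \2/p' |
    while read -r acpi lapic; do
        # Shell arithmetic accepts 0x-prefixed constants and prints decimal.
        echo "$(($acpi)) $(($lapic))"
    done
}
```

It would be used as: dmesg | grep "ACPI: LAPIC" | lapic_ids_decimal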
-----Original Message----- From: Xunlei Pang [mailto:xpang@redhat.com]
Thanks for this information.

I had mistakenly assumed that the MADT lists the initial APIC ID, at least for the BSP. I cannot recall exactly why I came to that wrong understanding, but looking back at Intel's Architectures Software Developer's Manual, I found the description "5. As part of the boot-strap code, the BSP creates an ACPI table and/or an MP table and adds its initial APIC ID to these tables as appropriate." in 8.4.3 MP Initialization Protocol Algorithm for MP Systems, so I guess this was probably the reason.

Then, on this system, cpu0 has 16 as its APIC ID. Is this the same system as the one mentioned in the patch description? The patch description says that the APIC ID of cpu0 is 32. Or, in the worst case, could the APIC ID change at each boot or at each kdump kexec? The latter would mean that disable_cpu_apicid doesn't work well on such a system.
On 05/10/2017 at 12:16 PM, Hatayama, Daisuke wrote:
I had mistakenly assumed that the MADT lists the initial APIC ID, at least for the BSP. I cannot recall exactly why I came to that wrong understanding, but looking back at Intel's Architectures Software Developer's Manual, I found the description "5. As part of the boot-strap code, the BSP creates an ACPI table and/or an MP table and adds its initial APIC ID to these tables as appropriate." in 8.4.3 MP Initialization Protocol Algorithm for MP Systems, so I guess this was probably the reason.
I couldn't find an Intel machine whose "apicid" and "initial apicid" differ, so it's hard to verify that.

Maybe it's different for AMD: I tested three different AMD machines, and on all of them the apicid from the ACPI table has the same value as the /proc/cpuinfo "apicid".

For AMD:
1) apicid is initialized by init_amd():
       c->apicid = hard_smp_processor_id(); // calls read_apic_id()
2) initial apicid is initialized by generic_identify():
       c->initial_apicid = (cpuid_ebx(1) >> 24) & 0xFF;

To be safe, maybe I could apply this patch only on AMD machines?
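An AMD-only variant could look roughly like the sketch below. This is only an illustration of the idea, not a proposed final patch: the function name get_bootcpu_apicid_by_vendor and its file argument are hypothetical, and whether gating on vendor_id is the right approach is still an open question here.

```shell
#!/bin/sh
# Hypothetical sketch: pick the value for disable_cpu_apicid based on the
# vendor of cpu 0. AMD ("AuthenticAMD") gets "apicid"; every other vendor
# keeps the previous behaviour and gets "initial apicid".
# $1: a cpuinfo-format file; defaults to /proc/cpuinfo.
get_bootcpu_apicid_by_vendor()
{
    awk -F'[ \t]*:[ \t]*' '
        $1 == "processor"      { cpu = $2 }
        # Only look at the fields that belong to cpu 0.
        cpu != "0"             { next }
        $1 == "vendor_id"      { vendor = $2 }
        $1 == "apicid"         { apicid = $2 }
        $1 == "initial apicid" { initial = $2 }
        END {
            id = (vendor == "AuthenticAMD") ? apicid : initial
            # Print nothing when the field is missing, so that the caller
            # can test for an empty result as prepare_cmdline() does.
            if (id != "")
                print id
        }
    ' "${1:-/proc/cpuinfo}"
}
```

On the AMD FX-8350 machine above this would print 16; on a non-AMD machine it would print the "initial apicid" of cpu 0, as before the patch.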
Then, on this system, cpu0 has 16 as its APIC ID. Is this the same system as the one mentioned in the patch description? The patch description says that the APIC ID of cpu0 is 32. Or, in the worst case, could the APIC ID change at each boot or at each kdump kexec? The latter would mean that disable_cpu_apicid doesn't work well on such a system.
Sorry, I got them from two different AMD machines. The APIC ID stays the same across reboots.
Regards, Xunlei
On 05/10/2017 at 06:44 PM, Hatayama, Daisuke wrote:
On 05/10/2017 at 12:16 PM, Hatayama, Daisuke wrote:
-----Original Message----- From: Xunlei Pang [mailto:xpang@redhat.com] On 05/10/2017 at 09:54 AM, Hatayama, Daisuke wrote:
Pang,
Thanks for cc'ing to me.
-----Original Message----- From: Xunlei Pang [mailto:xlpang@redhat.com] Sent: Tuesday, May 9, 2017 8:52 PM To: kexec@lists.fedoraproject.org Cc: Xunlei Pang xlpang@redhat.com Subject: [PATCH] kdumpctl: use "apicid" other than "initial apicid"
We met a problem on AMD machines, when using "nr_cpus=4" for kdump, and crash happens on cpus other than cpu0, kdump kernel will fail to boot and eventually reset.
After some debugging, we found that it stuck at the kernel path do_boot_cpu()-> ... ->wakeup_secondary_cpu_via_init(): apic_icr_write(APIC_INT_LEVELTRIG|APIC_INT_ASSERT|APIC_DM_INIT, phys_apicid); that is, it stuck at sending INIT from AP to BP and reset, which is actually what "disable_cpu_apicid=X" tries to solve. Printing the value of @phys_apicid showed that it was the value of "apicid" other that of "initial apicid" showed by /proc/cpuinfo.
As described in x86 specification: "In MP systems, the local APIC ID is also used as a processor ID by the BIOS and the operating system. Some processors permit software to modify the APIC ID. However, the ability of software to modify the APIC ID is processor model specific. Because of this, operating system software should avoid writing to the local APIC ID register. The value returned
by
bits 31-24 of the EBX register (when the CPUID instruction is executed
with
a source operand value of 1 in the EAX register) is always the Initial APIC
ID
(determined by the platform initialization). This is true even if software has changed the value in the Local APIC ID register."
From kernel commit 151e0c7de ("x86, apic, kexec: Add disable_cpu_apicid kernel parameter"), we can see that generic_processor_info() compares both a) @apicid and b) read_apic_id() with @disabled_cpu_apicid.
a) @apicid, which is the @phys_apicid mentioned above, comes from the following call trace (on the problematic AMD machine):
  generic_processor_info+0x37/0x300
  acpi_register_lapic+0x30/0x90
  acpi_parse_lapic+0x40/0x50
  acpi_table_parse_entries_array+0x171/0x1de
  acpi_boot_init+0xed/0x50f
The value of @apicid (from the ACPI MADT) is equal to the value of "apicid" shown by /proc/cpuinfo, as confirmed by our debug printk.
b) read_apic_id() gets the value from the LAPIC ID register, which is "apicid" as well.
The value of "initial apicid", in contrast, comes from the cpuid instruction.
One example of "apicid" and "initial apicid" of cpu0 from /proc/cpuinfo on an AMD machine:
  apicid          : 32
  initial apicid  : 0
Therefore, we should assign the /proc/cpuinfo "apicid" to "disable_cpu_apicid=X".
We have not hit this issue before because we usually tested with "nr_cpus=1", and mostly on Intel machines, where "apicid" and "initial apicid" have the same value in most cases.
For my understanding, could you show me the following information on the AMD machines?
- dmesg | grep "ACPI: LAPIC"
- /proc/cpuinfo
# dmesg | grep "ACPI: LAPIC"
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x11] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x12] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x13] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x14] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x15] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x16] enabled)
[ 0.000000] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x17] enabled)
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])

# cat /proc/cpuinfo (there are 8 cpus, paste 4 cpus here)
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 21
model           : 2
model name      : AMD FX(tm)-8350 Eight-Core Processor
stepping        : 0
microcode       : 0x600084f
cpu MHz         : 4000.000
cache size      : 2048 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 16
initial apicid  : 0
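For anyone who wants to check their own machine for the same mismatch, here is a minimal sketch that lists every processor whose "apicid" differs from its "initial apicid". The function name list_apicid_mismatch and the optional file argument are hypothetical, added only so the parsing can be tried on a saved copy of /proc/cpuinfo; they are not part of kdumpctl.

```shell
#!/bin/bash
# Hypothetical helper (not part of kdumpctl): print each processor whose
# "apicid" differs from its "initial apicid".
# $1 is an optional cpuinfo-style file; it defaults to /proc/cpuinfo.
list_apicid_mismatch()
{
    # Split each line on the colon, trimming the surrounding blanks,
    # so that $1 is the field name and $2 is its value.
    awk -F'[ \t]*:[ \t]*' '
        $1 == "processor"      { cpu = $2 }
        $1 == "apicid"         { apicid[cpu] = $2 }
        $1 == "initial apicid" {
            if (apicid[cpu] != $2)
                printf "cpu%s: apicid=%s initial_apicid=%s\n", cpu, apicid[cpu], $2
        }
    ' "${1:-/proc/cpuinfo}"
}
```

This relies on "apicid" being printed before "initial apicid" for each processor, as in the output above. On a machine where the two values always agree it prints nothing.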
Thanks for this information.
I had wrongly assumed that the MADT lists the initial APIC ID, at least for the BSP. I cannot recall exactly why I understood it that way, but looking back at Intel's Architectures Software Developer's Manual, I found the description "5. As part of the boot-strap code, the BSP creates an ACPI table and/or an MP table and adds its initial APIC ID to these tables as appropriate." in section 8.4.3, "MP Initialization Protocol Algorithm for MP Systems", so I guess this was probably the reason.
I couldn't find an Intel machine with different "apicid" and "initial apicid", so it's hard to verify that.
Maybe it's different for AMD: I tested three different AMD machines, and on each of them the apicid from the ACPI table has the same value as the /proc/cpuinfo "apicid".
For AMD:
- apicid is initialized by init_amd(): c->apicid = hard_smp_processor_id(); // calls read_apic_id()
- initial apicid is initialized by generic_identify(): c->initial_apicid = (cpuid_ebx(1) >> 24) & 0xFF;
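To compare the MADT values with /proc/cpuinfo by hand, note that dmesg prints lapic_id in hex while /proc/cpuinfo prints "apicid" in decimal (0x10 is 16). A rough sketch of the conversion follows; the function name lapic_ids_decimal is hypothetical, and the pattern assumes the "ACPI: LAPIC (acpi_id[...] lapic_id[...] enabled)" message format quoted in this thread, which other kernel versions may print differently.

```shell
#!/bin/bash
# Hypothetical helper: read dmesg-style text on stdin and print the
# lapic_id of every enabled "ACPI: LAPIC" entry in decimal.
# LAPIC_NMI lines do not match the pattern and are skipped.
lapic_ids_decimal()
{
    sed -n 's/.*ACPI: LAPIC (acpi_id\[0x[0-9a-fA-F]*] lapic_id\[\(0x[0-9a-fA-F]*\)] enabled).*/\1/p' |
    while read -r hex; do
        # printf accepts the 0x prefix and prints the decimal value
        printf '%d\n' "$hex"
    done
}
```

Usage would be something like `dmesg | lapic_ids_decimal`; the resulting list can then be compared against the "apicid" lines of /proc/cpuinfo.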
I'm not saying that Intel machines work like this. I'm just explaining how I got it wrong... It is correct to pass the local "apicid" from /proc/cpuinfo to the disable_cpu_apicid parameter, because the MADT lists local APIC IDs, which are not necessarily the initial ones.
Maybe I can apply this patch only for AMD machines for safety?
I don't think such a limitation is necessary, because your patch has no additional impact on systems where the local apicid is equal to the initial apicid.
Ok, thanks for the explanation.
Then, in this system, cpu0 has 16 as its APIC ID. Is this the same system as the one you mentioned in the patch description? The patch description says that the APIC ID of cpu0 is 32. Or, in the worst case, could the APIC ID change at each boot or at each kdump kexec? The latter would mean that disable_cpu_apicid doesn't work well on such a system.
Sorry, I got them from two different AMD machines; the APIC ID stays the same across reboots.
So, on the AMD machines, the BSP's local APIC ID is unchanged until boot time of the kdump 2nd kernel. Then, disable_cpu_apicid works well on them.
The condition for disable_cpu_apicid to work well is that BSP's local APIC ID is kept unchanged until boot time of the kdump 2nd kernel.
I think it is necessary to confirm when the local APIC ID can change in general, and whether the BSP's local APIC ID in particular can change. I have no idea about these now.
However, honestly, I guess such case is actually unlikely to happen except for some bug...
Yes, I personally don't care about it, the Spec also recommends that software doesn't touch the value of the local APIC ID.
Thanks, Xunlei
On Tue, May 09, 2017 at 07:52:09PM +0800, Xunlei Pang wrote:
From kernel commit 151e0c7de("x86, apic, kexec: Add disable_cpu_apicid kernel parameter"), we can see in generic_processor_info(), it uses a)read_apic_id() and b)@apicid to compare with @disabled_cpu_apicid.
Do you plan to clarify the kernel documentation:
Documentation/admin-guide/kernel-parameters.txt?
Thanks
Jerry
Signed-off-by: Xunlei Pang xlpang@redhat.com
---
 kdumpctl | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kdumpctl b/kdumpctl
index 4d6b3e8..46b65d2 100755
--- a/kdumpctl
+++ b/kdumpctl
@@ -77,15 +77,15 @@ remove_cmdline_param()
 }
 
 #
-# This function returns the "initial apicid" of the
-# boot cpu (cpu 0) if present.
+# This function returns the "apicid" of the boot
+# cpu (cpu 0) if present.
 #
-get_bootcpu_initial_apicid()
+get_bootcpu_apicid()
 {
     awk '                                                       \
         BEGIN { CPU = "-1"; }                                   \
         $1=="processor" && $2==":"      { CPU = $NF; }          \
-        CPU=="0" && /initial apicid/    { print $NF; }          \
+        CPU=="0" && /^apicid/           { print $NF; }          \
         '                                                       \
         /proc/cpuinfo
 }
@@ -206,7 +206,7 @@ prepare_cmdline()
 
        cmdline="${cmdline} ${KDUMP_COMMANDLINE_APPEND}"
 
-       id=`get_bootcpu_initial_apicid`
+       id=`get_bootcpu_apicid`
        if [ ! -z ${id} ] ; then
                cmdline=`append_cmdline "${cmdline}" disable_cpu_apicid ${id}`
        fi
-- 
1.8.3.1
_______________________________________________
kexec mailing list -- kexec@lists.fedoraproject.org
To unsubscribe send an email to kexec-leave@lists.fedoraproject.org
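For reviewers who want to try the new awk filter outside of kdumpctl, the following is a standalone sketch of the patched function. The optional file argument is an assumption added here for testing against a saved cpuinfo file; the version in the patch always reads /proc/cpuinfo.

```shell
#!/bin/bash
# Standalone sketch of the patched get_bootcpu_apicid().
# Assumption: $1 is an optional cpuinfo-style file (not in the real patch);
# it defaults to /proc/cpuinfo.
get_bootcpu_apicid()
{
    awk '
        BEGIN { CPU = "-1" }
        # Track which processor block we are in.
        $1 == "processor" && $2 == ":" { CPU = $NF }
        # /^apicid/ is anchored, so "initial apicid" lines do not match.
        CPU == "0" && /^apicid/        { print $NF }
    ' "${1:-/proc/cpuinfo}"
}
```

With the cpu0 values quoted earlier in this thread (apicid 16, initial apicid 0) this prints 16, which is the value that should end up in "disable_cpu_apicid=X".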
On 05/13/2017 at 04:35 AM, Jerry Hoemann wrote:
On Tue, May 09, 2017 at 07:52:09PM +0800, Xunlei Pang wrote:
From kernel commit 151e0c7de("x86, apic, kexec: Add disable_cpu_apicid kernel parameter"), we can see in generic_processor_info(), it uses a)read_apic_id() and b)@apicid to compare with @disabled_cpu_apicid.
Do you plan to clarify the kernel documentation:
Documentation/admin-guide/kernel-parameters.txt?
Yes, will do after this patch is finalized.
Regards, Xunlei
Thanks
Jerry
On 05/09/2017 at 07:52 PM, Xunlei Pang wrote:
Therefore, we should assign the /proc/cpuinfo "apicid" to "disable_cpu_apicid=X".
We've never met such issue before, because we usually tested "nr_cpus=1", and mostly on Intel machines, and "apicid" and "initial apicid" have the same value in most cases on Intel machines.
According to the previous discussions, it looks correct for this patch to use "apicid".
Ping Dave, can we have this one now?
Regards, Xunlei
Hi Xunlei, On 07/13/17 at 11:50am, Xunlei Pang wrote:
On 05/09/2017 at 07:52 PM, Xunlei Pang wrote:
Therefore, we should assign the /proc/cpuinfo "apicid" to "disable_cpu_apicid=X".
We've never met such issue before, because we usually tested "nr_cpus=1", and mostly on Intel machines, and "apicid" and "initial apicid" have the same value in most cases on Intel machines.
According to the previous discussions, it looks correct for this patch to use "apicid".
Ping Dave, can we have this one now?
I do not know much about the x86 specifics, so I will leave that to you. Since Hatayama is also fine with it:
Acked-by: Dave Young dyoung@redhat.com
Qiao, can you help to do more testing on both Intel and AMD machines before we apply it?
Thanks Dave