Hi, Just a quick question. To determine how many CPUs/cores/channels the kernel is configured to support, do I need to look at the kernel source to check that all of the cores I have can be used, or is there something else available that will tell me without going to the source?
regards, Steve
On Tue, Jun 14, 2022 at 5:17 PM Stephen Morris samorris@netspace.net.au wrote:
Just a quick question. To determine how many cpu's/cores/channels the kernel is configured to support, do I need to look at the kernel source to determine if all of the cores I have are capable of being used, or is there something else available to tell me without going to the source?
The values the kernel was configured with are in /boot/config-<version>. If I understand your question correctly, you are looking for CONFIG_NR_CPUS, which is 8192 on my machine.
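A rough sketch of that check in Python, for anyone who wants to script it. The helper names `parse_nr_cpus` and `running_kernel_nr_cpus` are made up for illustration, and the path assumes the distribution installs /boot/config-<version> for the running kernel, with the version string being the one `uname -r` reports.

```python
import os
import re
from pathlib import Path


def parse_nr_cpus(config_text):
    """Return CONFIG_NR_CPUS from kernel config text, or None if it is absent."""
    # Anchor on the exact option name so related options such as
    # CONFIG_NR_CPUS_DEFAULT are not matched by accident.
    match = re.search(r"^CONFIG_NR_CPUS=(\d+)$", config_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def running_kernel_nr_cpus():
    """Look up CONFIG_NR_CPUS for the running kernel (assumed path, see above)."""
    config_path = Path("/boot") / f"config-{os.uname().release}"
    try:
        return parse_nr_cpus(config_path.read_text())
    except OSError:
        # File missing or unreadable: report "unknown" rather than guessing.
        return None


if __name__ == "__main__":
    print("kernel configured for up to:", running_kernel_nr_cpus())
    print("logical CPUs seen by Python:", os.cpu_count())
```

If the first number is at least as large as the second, the kernel build itself is not what limits the number of usable CPUs.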
On 15/6/22 09:44, Jerry James wrote:
The values the kernel was configured with are in /boot/config-<version>. If I understand your question correctly, you are looking for CONFIG_NR_CPUS, which is 8192 on my machine.
Thanks Jerry, I checked that file and mine is the same as yours, so I don't need to modify the kernel to use all 16 cores/32 channels for my cpu.
regards, Steve
On Tue, Jun 14, 2022 at 05:44:27PM -0600, Jerry James wrote:
The values the kernel was configured with are in /boot/config-<version>. If I understand your question correctly, you are looking for CONFIG_NR_CPUS, which is 8192 on my machine.
You can also look at /proc/cpuinfo for what's actually detected — or run `lscpu` for a more human-readable view (especially when there are a lot of identical cores!)
Or, `cpu-x` for a GUI view with a lot of detail.
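When /proc/cpuinfo lists many identical entries, it can help to collapse them into counts. The following is only a sketch: `summarize_cpuinfo` is a hypothetical helper, and it assumes the x86 layout of /proc/cpuinfo, where each logical CPU is a blank-line-separated block carrying `processor`, `physical id` and `core id` fields. Other architectures may not provide those fields.

```python
from pathlib import Path


def summarize_cpuinfo(text):
    """Count logical CPUs and distinct physical cores in /proc/cpuinfo text."""
    logical = 0
    cores = set()
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # A physical core is identified by its (socket, core) pair; SMT
        # siblings share the same pair and so collapse into one entry.
        if "physical id" in fields and "core id" in fields:
            cores.add((fields["physical id"], fields["core id"]))
    return {"logical_cpus": logical, "physical_cores": len(cores)}


if __name__ == "__main__":
    try:
        print(summarize_cpuinfo(Path("/proc/cpuinfo").read_text()))
    except OSError as err:
        print("could not read /proc/cpuinfo:", err)
```

`lscpu` reports the same information on its "Thread(s) per core", "Core(s) per socket" and "Socket(s)" lines.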
On 22/6/22 23:54, Matthew Miller wrote:
You can also look at /proc/cpuinfo for what's actually detected — or run `lscpu` for a more human-readable view (especially when there are a lot of identical cores!)
Or, `cpu-x` for a GUI view with a lot of detail.
Thanks Greg. I installed cpu-x and tried all the commands. What makes the output of the first two difficult to read, from my perspective, is that the cpu I have has 32 threads, all of which are the same, so the first two commands list all 32. I ran the cpu-x benchmarks for random numbers, and what was interesting was that the results for 32 threads were only around 16 times the result for 1 thread, which is probably to be expected given the cpu has 16 cores.
regards, Steve
On Wed, Jun 29, 2022 at 5:43 AM Stephen Morris samorris@netsace.net.au wrote:
On 22/6/22 23:54, Matthew Miller wrote:
[...]
Or, `cpu-x` for a GUI view with a lot of detail.
Thanks Greg. I installed cpu-x and tried all the commands. What makes the output of the first two difficult to read, from my perspective, is that the cpu I have has 32 threads, all of which are the same, so the first two commands list all 32. I ran the cpu-x benchmarks for random numbers, and what was interesting was that the results for 32 threads were only around 16 times the result for 1 thread, which is probably to be expected given the cpu has 16 cores.
My experience was that disabling hyperthreading didn't reduce throughput. These multi-core systems generally do better with integer workloads; I think some have one f.p. unit per core. There can be very counterintuitive performance changes due to CPU cache issues and communications overhead. My experience is mostly with I/O intensive workloads. We generally found it best to limit those tasks to a fraction of the cores so background tasks (job control/monitoring, backups, etc.) didn't stall. After a big effort to make efficient use of all the cores you may encounter thermal throttling. It was better to adjust the workload to avoid thermal issues: more consistent throughput and fewer issues with background tasks.
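One way to keep a job on a fraction of the cores is CPU affinity. This is a minimal sketch, not necessarily how it was done in the setup described above: `pick_cpus` and `limit_to_fraction` are hypothetical names, the 0.5 default is arbitrary, and `os.sched_getaffinity` / `os.sched_setaffinity` are only available on Linux. The selection simply takes the lowest-numbered CPUs and ignores which logical CPUs share a physical core.

```python
import os


def pick_cpus(available, fraction):
    """Return the lowest-numbered share of the given CPU ids (at least one)."""
    ordered = sorted(available)
    count = max(1, int(len(ordered) * fraction))
    return set(ordered[:count])


def limit_to_fraction(fraction=0.5):
    """Pin the current process to a fraction of the CPUs it may run on.

    Child processes started afterwards inherit the restriction, which
    leaves the remaining CPUs free for background tasks.
    """
    chosen = pick_cpus(os.sched_getaffinity(0), fraction)
    os.sched_setaffinity(0, chosen)
    return chosen
```

Calling `limit_to_fraction(0.5)` at the start of a job on a 32-thread machine would restrict it to CPUs 0 through 15. The `taskset` command does the same thing from a shell.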
If you have cache stalls in the algorithm/benchmark and/or I/O sections that have to be waited on, then hyperthreading will usually help.
If the code is a nice tight loop that correctly/fully uses the cpu with minimal cache stalls, then hyperthreading will hurt.
I was doing some benchmarks and kind of accidentally found out that if you run a single-cpu benchmark on an idle system (36 real cores), disabling hyperthreading on the system and/or pinning the benchmark to a cpu whose specific hyperthread is offline (finding the cpu in /sys and echoing 0 to online to disable it) resulted in a consistent, measurable speed-up (1-2%, I think, was the amount). The cost of the ht simply existing was 1-2% less work being done by the primary thread, because the ht caused the primary thread to be inefficient in some manner (stolen cache and/or stolen cpu cycles).
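The /sys step mentioned above can be sketched roughly as follows. This is an illustration rather than the exact procedure used in those benchmarks: the helper names are hypothetical, the files are the standard sysfs CPU topology entries under /sys/devices/system/cpu, and writing to `online` needs root.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as "0,16" or "0-3,8" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def smt_siblings(cpu):
    """Return the other logical CPUs that share a physical core with `cpu`."""
    path = CPU_ROOT / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    return [c for c in parse_cpu_list(path.read_text()) if c != cpu]


def set_online(cpu, online):
    """Write 1 or 0 to the CPU's `online` file (root only).

    cpu0 frequently has no `online` file and cannot be taken offline.
    """
    (CPU_ROOT / f"cpu{cpu}" / "online").write_text("1" if online else "0")
```

To repeat the experiment for, say, CPU 4, one would call `set_online(sib, False)` for each entry in `smt_siblings(4)`, pin the benchmark to CPU 4, and bring the siblings back with `set_online(sib, True)` afterwards.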
On the newer cpus Intel is using thermal throttling to determine how hard it can push the turboboost frequencies, so thermal throttling is expected and is not a concern. When I was testing several different sockets+memory, each cpu/socket seemed to thermal throttle at a different frequency (all well above the rated frequency, but some sockets were always a few % faster than others and reached a few % higher frequencies before thermal throttling). I also noticed that the frequency the cpu could consistently sustain at the edge of thermal throttling seemed to differ (as would be expected) based on the benchmark. Simple benchmarks went all of the way to the max turboboost frequency (+600MHz), while other benchmarks only allowed +200MHz over rated before overheating and reducing turboboost.
users mailing list -- users@lists.fedoraproject.org To unsubscribe send an email to users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 29/6/22 21:29, George N. White III wrote:
My experience was that disabling hyperthreading didn't reduce throughput. These multi-core systems generally do better with integer workloads; I think some have one f.p. unit per core. There can be very counterintuitive performance changes due to CPU cache issues and communications overhead. My experience is mostly with I/O intensive workloads. We generally found it best to limit those tasks to a fraction of the cores so background tasks (job control/monitoring, backups, etc.) didn't stall. After a big effort to make efficient use of all the cores you may encounter thermal throttling. It was better to adjust the workload to avoid thermal issues: more consistent throughput and fewer issues with background tasks.
Thanks George, I have experienced the same things myself. Having installed cpu-x for the first time, I was just trying it out. Its default benchmark mode is 1 minute and 1 thread, with slow random number generation and fast random number generation; I haven't done any investigation to determine exactly what that means. So I ran a test on the default settings, then ran a test on 32 threads, which returned a result around 16 times that of a single thread. I was not surprised by that: it is a 16 core dual channel cpu, which Windows and Fedora therefore consider to have 32 logical cores, and no matter how many channels are available for each core, the total throughput is only what a core is capable of producing.
regards, Steve
On 29/6/22 22:14, Roger Heflin wrote:
[...]
On the newer cpus Intel is using thermal throttling to determine how hard it can push the turboboost frequencies, so thermal throttling is expected and is not a concern. When I was testing several different sockets+memory, each cpu/socket seemed to thermal throttle at a different frequency (all well above the rated frequency, but some sockets were always a few % faster than others and reached a few % higher frequencies before thermal throttling). I also noticed that the frequency the cpu could consistently sustain at the edge of thermal throttling seemed to differ (as would be expected) based on the benchmark. Simple benchmarks went all of the way to the max turboboost frequency (+600MHz), while other benchmarks only allowed +200MHz over rated before overheating and reducing turboboost.
I have an AMD Ryzen 9 5980X cpu with overclocking not set in the motherboard BIOS, so I haven't done any tests on how far I can push the cpu before thermal limitations start to kick in. Also, I had only just installed cpu-x for the first time and was just testing it out to see what it produced.
I also have an ASUS Nvidia RTX 3080 graphics card with 12GB of memory, on which I have run the 3D Mark and Unigine Heaven benchmarks to see what they produce. What was interesting with the 3D Mark test was that either 3D Mark was doing something funny with its 2nd graphics test or the graphics card couldn't handle it: instead of showing the video like it did for the first test, it started to show the same video, then almost immediately changed to what I can only describe as floating bubbles on a black background. With the Unigine benchmark running, the graphics card (as measured by Unigine) sat at around 78C to 79C, which is well within tolerances.
regards, Steve