I am running Fedora 9 x86 64 bit. What is the kernel timetick per thread? How many threads per second does the kernel run?
Steve
Probably not quite what you are asking but here goes: http://kerneltrap.org/node/464
Run for a few seconds: $ vmstat 1
Look at system|in = interrupts per second. This is approximately the timer Hz value.
From the kernel config parameter HZ_1000 etc.: $ getconf CLK_TCK
DaveT.
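A hedged note on the commands above: as far as I know, getconf CLK_TCK reports USER_HZ, the tick unit exposed to user space, which is 100 on x86 regardless of the HZ value the kernel was built with. The sketch below compares that value with CONFIG_HZ read from the kernel config file. The /boot/config-<release> path is an assumption about where the distribution installs the config, and the function names are made up for illustration.

```python
import os
import re


def user_hz():
    """Return the value that `getconf CLK_TCK` prints (USER_HZ)."""
    return os.sysconf("SC_CLK_TCK")


def parse_config_hz(config_text):
    """Extract CONFIG_HZ=<n> from kernel config text, or None if absent."""
    match = re.search(r"^CONFIG_HZ=(\d+)$", config_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def kernel_config_hz(path=None):
    """Read CONFIG_HZ from the installed kernel config (assumed location)."""
    path = path or "/boot/config-" + os.uname().release
    try:
        with open(path) as config:
            return parse_config_hz(config.read())
    except OSError:
        return None  # config file missing or unreadable


if __name__ == "__main__":
    print("USER_HZ (getconf CLK_TCK):", user_hz())
    print("CONFIG_HZ (kernel build):", kernel_config_hz())
```

If the two numbers differ, the second one is the kernel's own tick rate.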
Is there any way to set the ticks without rebuilding the kernel?
Steve
On Mon, 2008-11-17 at 12:49 -0500, Steve West wrote:
> Is there any way to set the ticks without rebuilding the kernel?
Nope. HZ is a static definition. (#define'ed in linux/include/asm-$ARCH/param.h)
- Gilboa
Perhaps if you explained what you are trying to achieve people might be able to help you get there.
poc
I have an application/service that has 1000 or so threads. Most of these are TCP/IP socket accept and connect. I want to be able to run all the threads in a second or so to achieve reasonable throughput. I would like the kernel to run 1000 threads per second. Right now I think it is set for 100 ticks per second in F9 x86 64-bit.
Steve
On Mon, 2008-11-17 at 14:45 -0500, Steve West wrote:
> I would like the kernel to run 1000 threads per second. Right now I think it is set for 100 ticks per second in f9 x86 64bit.
The ticks matter when the threads are competing for CPU, but it looks like in your case they'll mostly be waiting for socket calls (during which the scheduler will hand off to another thread anyway), so increasing the timeslice frequency is probably not going to make a difference. Hard to know without testing, of course.
poc
David Timms wrote:
> from the kernel config parameter HZ_1000 etc: getconf CLK_TCK
I am surprised that the distribution kernel is set so low, and I would have thought it was a tickless kernel by default now.
> The ticks matter when the threads are competing for cpu, but it looks like in your case they'll mostly be waiting for socket calls (during which the scheduler will hand off to another thread anyway), so increasing the timeslice frequency is probably not going to make a difference.
Yes, you are correct: under "NORMAL" circumstances 100 ticks per second would be OK. But if I design for the worst case, where all threads are running, I need 1000 ticks per second or response will not be good. I did not want to build a custom kernel, but it looks like I may have to in order to achieve the design goal.
Steve
On 17Nov2008 15:19, Steve West steve@cyglan.com wrote:
> But if I design for worst case where all threads are running I need 1000 ticks per second, or response will not be good.
Do you really expect all 1000 to be CPU bound? Most machines would give terrible response no matter the tick rate.
As I recall, the tick rate only matters when the threads are, as Patrick said, _competing_ for CPU. Most TCP handlers will wake up, accept the connection, maybe send some data, and then _block_. Really fast. Well under 1/1000th of a second. The tick rate is how long a single thread can use the CPU flat out before a new scheduling decision is made. If, like most handlers, your threads wake up, do something, then block (including reading from disk) then it's not such an issue.
Can you elaborate on your app? Is it really going to try to run 1000 CPU-bound threads? That would be pretty unusual.
Cheers,
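To put rough numbers on the point above, here is a back-of-envelope sketch using a deliberately simplified round-robin model: every runnable thread gets exactly one tick before the next is picked. That is an assumption for illustration only; as I understand it, the real scheduler in 2.6.2x kernels does not hand out fixed per-tick slices. The function names are hypothetical.

```python
import math


def round_robin_wait(threads, hz, cores=1):
    """Seconds a CPU-bound thread waits for its next turn in the toy model."""
    slice_seconds = 1.0 / hz
    # Number of other threads queued on the same core ahead of this one.
    ahead = math.ceil(threads / cores) - 1
    return ahead * slice_seconds


def cpu_share(threads, cores=1):
    """Fraction of one core each CPU-bound thread gets; independent of HZ."""
    return min(1.0, cores / threads)


if __name__ == "__main__":
    for hz in (100, 1000):
        wait = round_robin_wait(1000, hz)
        print(f"HZ={hz}: wait ~{wait:.3f}s, share {cpu_share(1000):.3%}")
```

In this model a higher HZ shortens the wait between turns, but each of 1000 CPU-bound threads still gets only a thousandth of a core, which is why response would be poor at any tick rate.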
Maybe you are correct; I did not take into account that TCP would receive/send/block in under 1/1000th of a second. If that is the case then 100 ticks per second would be OK. I will have to do some testing to evaluate this.
Steve
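One possible starting point for that kind of test, as a minimal sketch: start many threads that each block for a while and measure the wall-clock time for all of them to finish. Here time.sleep stands in for a blocking socket call, which is an assumption made for illustration; a real test would use the actual service and real connections.

```python
import threading
import time


def run_blocking_threads(count, block_seconds):
    """Start `count` threads that each block once; return total elapsed time."""

    def worker():
        # Stand-in for a blocking accept()/recv(): the thread gives up the
        # CPU immediately, so it does not use up a time slice while waiting.
        time.sleep(block_seconds)

    threads = [threading.Thread(target=worker) for _ in range(count)]
    start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - start


if __name__ == "__main__":
    elapsed = run_blocking_threads(1000, 0.05)
    print(f"1000 blocking threads finished in {elapsed:.2f}s")
```

If the threads had to take turns, the total would approach count * block_seconds; because they all block at the same time, it should stay far below that.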
On Tue, 2008-11-18 at 07:51 +1100, Cameron Simpson wrote:
> Can you elaborate on your app; is it really going to try to run 1000 CPU bound threads?
Agreed. If you really do have 1000 CPU bound threads competing for the CPU, you should be looking at a redesign.
Separate user interaction into a high priority thread pool, and dispatch long running computational tasks to a second, lower priority thread pool. If there is a lot of separate I/O going on as well, put it either into the high priority thread pool, or a medium priority pool in the middle.
The lower priority computational pool should be sized on the order of 1x to 2x the number of cores, depending on whether it's strictly computation or there's some I/O in the mix. If there is a lot of I/O blocking there, you might go even larger on that pool. You'll want to think about how much memory those computational threads need, to avoid thrashing the paging system.
If you're trying to service 1000's of sockets, you should at least take a look at select(2) / asynchronous I/O as a way to cut down on the number of threads and avoid the overhead of many context switches. Remember that each thread has its own stack, so a thousand 8k stacks chews up 8M of memory (not sure what the standard stack size is these days).
Wayne.
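The sizing rule and the stack arithmetic above can be sketched as follows. This is only an illustration: Python's ThreadPoolExecutor has no notion of thread priority, so the high/low priority split is not shown, and checksum is a hypothetical stand-in for a long-running computational task.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def compute_pool_size(cores=None, io_mix=False):
    """1x the core count for pure computation, 2x if some I/O is mixed in."""
    cores = cores or os.cpu_count() or 1
    return cores * 2 if io_mix else cores


def stack_memory_bytes(threads, stack_bytes):
    """Memory reserved for stacks if every thread gets `stack_bytes`."""
    return threads * stack_bytes


def checksum(n):
    """Hypothetical CPU-bound job: sum of squares below n."""
    return sum(i * i for i in range(n))


def run_jobs(jobs, io_mix=False):
    """Run the jobs on a pool sized from the core count, keeping input order."""
    size = compute_pool_size(io_mix=io_mix)
    with ThreadPoolExecutor(max_workers=size) as pool:
        return list(pool.map(checksum, jobs))


if __name__ == "__main__":
    print("pool size:", compute_pool_size())
    print("1000 x 8k stacks:", stack_memory_bytes(1000, 8 * 1024), "bytes")
    print(run_jobs([10, 100, 1000]))
```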
On 17Nov2008 21:24, Wayne Feick waf@brunz.org wrote:
| If you're trying to service 1000's of sockets, you should at least take
| a look at select(2)
Select() should be replaced with epoll() on modern Linux. Select() has O(n) performance with the number of file descriptors (because the bitmap must be scanned for bits).
| / asynchronous I/O as a way to cut down on the
| number of threads and avoid the overhead or many context switches.
I seem to recall that lots of threads (up to the point of "too many") beats asynch I/O these days.
But this may well be too low level for Steve's task.
Steve, do a bit of load testing and see where things start to get bad.
Cheers,
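For reference, a minimal sketch of the epoll() approach mentioned above, using Python's select.epoll wrapper (Linux only). The helper name ready_sockets is made up, and a real server would create the epoll object once and keep it for its lifetime rather than rebuilding it per call as this short example does.

```python
import select
import socket


def ready_sockets(socks, timeout=1.0):
    """Return the sockets from `socks` that have data waiting to be read."""
    poller = select.epoll()
    by_fd = {sock.fileno(): sock for sock in socks}
    try:
        for fd in by_fd:
            poller.register(fd, select.EPOLLIN)
        # poll() returns only the descriptors with pending events, so the
        # caller does not scan every descriptor as it would with select().
        events = poller.poll(timeout)
        return [by_fd[fd] for fd, mask in events if mask & select.EPOLLIN]
    finally:
        poller.close()


if __name__ == "__main__":
    pairs = [socket.socketpair() for _ in range(3)]
    pairs[1][1].sendall(b"hello")
    readable = ready_sockets([local for local, _ in pairs])
    print("readable sockets:", len(readable))
```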
Have you considered using a far smaller number of threads and select/poll to wait on the sockets, instead of allocating a thread per client?
In my experience, unless your server code is I/O bound (in which case, you can always use asyncio and/or I/O worker threads), a single thread can max out the bandwidth of 1Gbps line and handle 100's of clients.
- Gilboa
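A rough sketch of the single-thread approach, using Python's selectors module (which, as far as I know, picks epoll on Linux when available). The serve_once helper and the upper-casing "protocol" are invented for illustration; a real server would also register its listening socket and accept new clients in the same loop.

```python
import selectors
import socket


def serve_once(selector, timeout=1.0):
    """Service every connection that is readable now; return how many."""
    handled = 0
    for key, _mask in selector.select(timeout):
        conn = key.fileobj
        data = conn.recv(4096)
        if data:
            conn.sendall(data.upper())  # toy reply: echo in upper case
        else:
            # Empty read means the peer closed the connection.
            selector.unregister(conn)
            conn.close()
        handled += 1
    return handled


if __name__ == "__main__":
    selector = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(5)]
    for server_end, _ in pairs:
        selector.register(server_end, selectors.EVENT_READ)
    for index, (_, client_end) in enumerate(pairs):
        client_end.sendall(b"request %d" % index)
    handled = 0
    while handled < len(pairs):
        handled += serve_once(selector)
    print([client_end.recv(4096) for _, client_end in pairs])
```

One thread services all five clients here; the same loop scales to many more connections without adding threads.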
Steve West wrote:
> I did not want to build a custom kernel, but it looks like I may have to to achieve the design goal.
That depends on whether your design goal is to have the ticks at 1000 or to have the system respond properly. Those are not the same thing.
Before you start building kernels, run vmstat and look at the context switching and interrupt rates. I would expect them to be over 1000 under load, indicating that something else is limiting response.
Then look at and understand the tunable parameters in the /proc/sys/kernel/sched_* area. I found major changes in that area, and traded a small bit of total performance for far better response. I posted many things to the kernel mailing list, but the meaning of the bits in the "features" has changed since 2.6.18 or so, and what I did doesn't work the same way. Lots of room to tune there, though, before going down the road to maintaining your own config.
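For anyone who wants those rates from inside a test harness rather than from vmstat, here is a small sketch that samples the ctxt and intr counters in /proc/stat, which to my knowledge are the counters behind the vmstat cs and in columns. The function names are hypothetical.

```python
import time


def parse_counters(stat_text):
    """Pull the cumulative context-switch and interrupt totals from /proc/stat."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # "ctxt <total>" and "intr <total> <per-IRQ counts...>"
        if fields and fields[0] in ("ctxt", "intr"):
            counters[fields[0]] = int(fields[1])
    return counters


def read_counters(path="/proc/stat"):
    with open(path) as stat_file:
        return parse_counters(stat_file.read())


def rates(interval=1.0):
    """Return events per second for each counter over `interval` seconds."""
    before = read_counters()
    time.sleep(interval)
    after = read_counters()
    return {name: (after[name] - before[name]) / interval for name in before}


if __name__ == "__main__":
    for name, per_second in rates().items():
        print(f"{name}: {per_second:.0f}/s")
```

Running this while the service is under load gives the same picture as watching vmstat 1.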