Why does disk I/O slow down a CPU bound task?
Dave Johansen
davejohansen at gmail.com
Tue Mar 31 19:21:55 UTC 2015
On Tue, Mar 31, 2015 at 10:44 AM, Richard W.M. Jones <rjones at redhat.com>
wrote:
> On Tue, Mar 31, 2015 at 08:32:16AM -0700, Dave Johansen wrote:
> > I am not familiar with the low level details of disk I/O but I'm sure that
> > they are far more complicated than my basic assumptions, but my concern is
> > how can a disk-bound process steal cycles from a CPU-bound one that is not
> > accessing the disk at all. The lwn.net articles that drago01 linked to helped
> > shed some light on what's going on, but it sounds like there is still some
> > potential work that could be done to help improve the situation.
>
> When you run the cpu load test program, where do you write the
> statistics to?
>
I had been redirecting it directly to disk.
> For a fair test you probably want to change the program so it stores
> them in preallocated memory and prints them at the end of the test.
>
You're right, that is a problem: my "purely CPU bound task" was
actually writing to disk every 10 seconds. I've attached an updated
version that pre-allocates a vector and stores the results there so they
can be dumped when the user presses Ctrl-C. With this update, the "CPU
bound task" should only be using CPU and existing memory, but I still see the
same slowdown in the "CPU bound task" when the disk I/O is happening.
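
One caveat with the attached version is that it dumps the results from inside
the signal handler, and iostream calls are not async-signal-safe. A minimal
sketch of a variant where the handler only sets a flag and the dump happens
after the loop exits is below. The names g_stop, on_sigint,
run_until_interrupted and dump are hypothetical and the per-cycle "sample" is
a stand-in, so treat it as an illustration rather than the attached program:

```cpp
#include <csignal>
#include <cstddef>
#include <iostream>
#include <vector>

// Set by the handler and polled by the work loop. A volatile sig_atomic_t
// is a type that a signal handler is allowed to write to.
volatile std::sig_atomic_t g_stop = 0;

// The handler does nothing except record that Ctrl-C was pressed.
void on_sigint(int) { g_stop = 1; }

// Runs "work cycles" until SIGINT arrives or max_cycles is reached, storing
// one sample per cycle, and returns the number of cycles that ran.
// The caller is assumed to have reserved space in samples beforehand.
std::size_t run_until_interrupted(std::vector<long> &samples, std::size_t max_cycles)
{
    std::signal(SIGINT, on_sigint);
    std::size_t cycles = 0;
    while (!g_stop && cycles < max_cycles) {
        samples.push_back(static_cast<long>(cycles)); // stand-in for real stats
        ++cycles;
    }
    return cycles;
}

// Called from normal context after the loop, so using iostream is fine here.
void dump(const std::vector<long> &samples, std::ostream &out)
{
    for (std::size_t i = 0; i < samples.size(); ++i)
        out << samples[i] << '\n';
}
```

The cost is one extra flag check per work cycle, which should be negligible
next to the 300000 multiplications in each cycle.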
-------------- next part --------------
#include <iostream>
#include <vector>
#include <math.h>
#include <signal.h>
#include <time.h>
#include <stdlib.h>

// One 10-second summary of the time between "work cycles"
struct Stats {
    Stats(const timespec &t_, const int n_, const double mean_, const double stdDev_) :
        t(t_.tv_sec),
        n(n_),
        mean(mean_),
        stdDev(stdDev_)
    {
    }

    time_t t;
    int n;
    double mean, stdDev;
};

long do_work(long pseed)
{
    // Just do a bunch of computations to use up CPU
    for (int bnum = 0; bnum < 300000; ++bnum)
        pseed = pseed * 1103515245 + 12345;
    return pseed;
}

// Buffer the results
std::vector<Stats> g_stats;

// Dump the results when Ctrl-C is pressed
void handle_signal(int n)
{
    for (std::vector<Stats>::const_iterator iter = g_stats.begin(); iter != g_stats.end(); ++iter)
        std::cout << iter->t << ' ' << iter->n << ' ' << iter->mean << ' ' << iter->stdDev << std::endl;
    exit(0);
}

int main()
{
    // Track the time between "work cycles"
    timespec t, lastT;
    clock_gettime(CLOCK_REALTIME, &lastT);
    // And the next time an output should happen
    time_t nextOutputT = lastT.tv_sec + 10;

    // Pre-allocate the buffer for the stats for > 1 day
    g_stats.reserve(10000);
    // And create the handler for dumping the stats
    signal(SIGINT, &handle_signal);

    int n = 0;
    double mean = 0;
    double m2 = 0;
    int sum = 0;
    // Loop for a long time instead of infinitely so the compiler won't optimize away anything
    while (n < 1000000000) {
        // Do some work
        sum += do_work(lastT.tv_nsec);

        // Get the current time
        int retVal = clock_gettime(CLOCK_REALTIME, &t);
        if (retVal == 0) {
            // And calculate the statistics of the time between "work cycles" (in nanoseconds)
            long dT = (t.tv_sec - lastT.tv_sec) * 1000000000 + (t.tv_nsec - lastT.tv_nsec);
            ++n;
            double delta = dT - mean;
            mean += delta / n;
            m2 += delta * (dT - mean);
        } else {
            std::cerr << "Error getting time: " << retVal << std::endl;
            return -1;
        }

        // If it's time to output the statistics
        if (t.tv_sec >= nextOutputT) {
            // Then convert m2 to the sample standard deviation and store them
            if (n > 1)
                m2 = sqrt(m2 / (n - 1));
            g_stats.push_back(Stats(t, n, mean, m2));

            // Reset the statistics
            n = 0;
            mean = 0;
            m2 = 0;

            // And record the next time an output should happen
            nextOutputT = t.tv_sec + 10;
        }

        // Save the current time for calculating time between "work cycles"
        lastT = t;
    }

    // Return the value so the compiler won't optimize away anything
    return sum;
}
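
The mean and m2 updates in the loop above follow Welford's online algorithm
for a running mean and variance. A minimal sketch that isolates that update in
a small struct is below. RunningStats and its member functions are
hypothetical names, not part of the attached program, and the sketch keeps the
standard deviation separate from the m2 accumulator instead of overwriting it:

```cpp
#include <cmath>

// Hypothetical helper, not part of the attached program.
struct RunningStats {
    long n;       // number of samples seen so far
    double mean;  // running mean of the samples
    double m2;    // running sum of squared deviations from the mean

    RunningStats() : n(0), mean(0.0), m2(0.0) {}

    // Fold one sample into the running mean and m2 (Welford's update).
    void add(double x)
    {
        ++n;
        const double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    // Sample standard deviation; 0 when fewer than two samples exist.
    double stdDev() const
    {
        return n > 1 ? std::sqrt(m2 / (n - 1)) : 0.0;
    }

    // Start a new measurement window.
    void reset()
    {
        n = 0;
        mean = 0.0;
        m2 = 0.0;
    }
};
```

With something like this, each 10-second window would call add(dT) once per
work cycle, read mean and stdDev() when the window closes, and then call
reset().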