Why does disk I/O slow down a CPU bound task?

Dave Johansen davejohansen at gmail.com
Tue Mar 31 19:21:55 UTC 2015


On Tue, Mar 31, 2015 at 10:44 AM, Richard W.M. Jones <rjones at redhat.com>
wrote:

> On Tue, Mar 31, 2015 at 08:32:16AM -0700, Dave Johansen wrote:
> > I am not familiar with the low level details of disk I/O, and I'm sure
> > they are far more complicated than my basic assumptions, but my concern
> > is how a disk-bound process can steal cycles from a CPU-bound one that
> > is not accessing the disk at all. The lwn.net articles that drago01
> > linked to helped shed some light on what's going on, but it sounds like
> > there is still some potential work that could be done to help improve
> > the situation.
>
> When you run the cpu load test program, where do you write the
> statistics to?
>

I had been redirecting it directly to disk.


> For a fair test you probably want to change the program so it stores
> them in preallocated memory and prints them at the end of the test.
>

You're right, that was a problem: my "purely CPU bound task" was actually
writing to disk every 10 seconds. I've attached an updated version that
pre-allocates a vector and stores the results there so they can be dumped
when the user presses Ctrl-C. With this update, the "CPU bound task" should
only be using CPU and pre-allocated memory, but I still see the same
slowdown in the "CPU bound task" when the disk I/O is happening.
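
For anyone who wants to reproduce the competing disk load, a trivial writer
along these lines should be enough (just a sketch, not necessarily what I was
running; the output path and sizes are arbitrary):

#include <cstdio>
#include <cstring>
#include <unistd.h>

int main()
{
  // 1 MB buffer of zeros to write repeatedly (size is arbitrary)
  static char buf[1 << 20];
  memset(buf, 0, sizeof(buf));

  // Arbitrary output path; any file on the disk under test will do
  FILE *f = fopen("/tmp/diskload.dat", "wb");
  if (!f)
    return 1;

  // Keep writing and periodically syncing so the data actually reaches the disk
  for (;;) {
    if (fwrite(buf, 1, sizeof(buf), f) != sizeof(buf))
      break;
    fflush(f);
    fsync(fileno(f));
    // Rewind after ~1 GB so the file doesn't fill the disk
    if (ftell(f) > (1L << 30))
      rewind(f);
  }

  fclose(f);
  return 1;
}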
-------------- next part --------------
#include <iostream>
#include <vector>
#include <math.h>
#include <signal.h>
#include <time.h>
#include <stdlib.h>

struct Stats {
  Stats(const timespec &t_, const int n_, const double mean_, const double stdDev_) :
    t(t_.tv_sec),
    n(n_),
    mean(mean_),
    stdDev(stdDev_)
  {
  }

  time_t t;
  int n;
  double mean, stdDev;
};

unsigned long do_work(unsigned long pseed)
{
  // Just do a bunch of computations to use up CPU
  // (unsigned arithmetic so the repeated multiply/add wraps instead of overflowing)
  for (int bnum = 0; bnum < 300000; ++bnum)
    pseed = pseed * 1103515245 + 12345;
  return pseed;
}

// Buffer the results
std::vector<Stats> g_stats;

// Dump the results when Ctrl-C is pressed
void handle_signal(int n)
{
  for (std::vector<Stats>::const_iterator iter = g_stats.begin(); iter != g_stats.end(); ++iter)
    std::cout << iter->t << ' ' << iter->n << ' ' << iter->mean << ' ' << iter->stdDev << std::endl;
  exit(0);
}

int main()
{
  // Track the time between "work cycles"
  timespec t, lastT;
  clock_gettime(CLOCK_REALTIME, &lastT);
  // And the next time an output should happen
  time_t nextOutputT = lastT.tv_sec + 10;

  // Pre-allocate the buffer for the stats for > 1 day
  g_stats.reserve(10000);
  // And create the handler for dumping the stats
  signal(SIGINT, &handle_signal);

  int n = 0;
  double mean = 0;
  double m2 = 0;
  unsigned long sum = 0;
  // Loop for a long time instead of infinitely so the compiler won't optimize away anything
  while (n < 1000000000) {
    // Do some work
    sum += do_work(lastT.tv_nsec);

    // Get the current time
    int retVal = clock_gettime(CLOCK_REALTIME, &t);
    if (retVal == 0) {
      // And update the running mean/variance of the time between "work cycles"
      long dT = (t.tv_sec - lastT.tv_sec) * 1000000000 + (t.tv_nsec - lastT.tv_nsec);
      ++n;
      double delta = dT - mean;
      mean += delta / n;
      m2 += delta * (dT - mean);
    } else {
      std::cerr << "Error getting time: " << retVal << std::endl;
      return -1;
    }

    // If it's time to output the statistics
    if (t.tv_sec >= nextOutputT) {
      // Then convert m2 into a sample standard deviation and record the stats
      if (n > 1)
        m2 = sqrt(m2 / (n - 1));
      g_stats.push_back(Stats(t, n, mean, m2));

      // Reset the statistics
      n = 0;
      mean = 0;
      m2 = 0;

      // And record the next time an output should happen
      nextOutputT = t.tv_sec + 10;
    }

    // Save the current time for calculating time between "work cycles"
    lastT = t;
  }

  // Return the value so the compiler won't optimize away anything
  return static_cast<int>(sum);
}
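
In case it saves anyone a step: the attachment builds with something like
"g++ -O2 cpu_bound.cpp -o cpu_bound" (the file name is just whatever you save
it as, and older glibc may also need -lrt for clock_gettime). Run it, start
the disk load, and press Ctrl-C to dump the buffered statistics to stdout.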

