Why does disk I/O slow down a CPU bound task?

Dave Johansen davejohansen at gmail.com
Wed Apr 1 17:34:17 UTC 2015


On Wed, Apr 1, 2015 at 8:18 AM, Stephen John Smoogen <smooge at gmail.com>
wrote:

>
>
> On 31 March 2015 at 22:53, Dave Johansen <davejohansen at gmail.com> wrote:
>
>> On Tue, Mar 31, 2015 at 3:26 PM, Richard W.M. Jones <rjones at redhat.com>
>> wrote:
>>
>>> On Tue, Mar 31, 2015 at 12:21:55PM -0700, Dave Johansen wrote:
>>> > You're right, that is a problem, because my "purely CPU bound task" was
>>> > actually writing to disk every 10 seconds, so I've attached an updated
>>> > version that pre-allocates a vector and stores the results there so they
>>> > can be dumped when the user presses Ctrl-C. With this update, the "CPU
>>> > bound task" should only be using CPU and existing memory, but I still see
>>> > the same slowdown in the "CPU bound task" when the disk I/O is happening.
>>>
>>> For the definitive test, you might want to add a call to mlockall()
>>> into your program.  It is supposed to lock every page of your process
>>> into RAM (just in case it is being swapped out, and hence using I/O).
>>>
>>> I guess you will also need to run the CPU test program as root.
>>
>>
>> I added the call to mlockall() (it did have to be run as root) on an F21
>> machine with no swap and the slowdown was still visible in the "CPU bound
>> task".
>>
>>
> The slowdown isn't going to go away with mlockall etc. All that does is
> give you a better idea of where the slowdown is by removing various
> sources of noise. There isn't going to be any quick fix for the slowdown.
> At best you can figure out the points in the kernel that might need
> fixing, but realize this is a problem which has been around for at least
> a decade and a half. It has been looked at in various forms over and over
> again by kernel developers. If you are interested in figuring it out and
> helping them, expect a long, hard slog rather than a quick or easy fix.
>

I'm ok with it taking a while to look into this issue. The kernel is a
complex system and this is an issue that seems to touch more than one of
the most complex parts, so I imagine that analyzing this and understanding
the source of the issue will take quite a bit of time. I'm hoping to be
able to help improve the kernel in the long term, and that's why I am trying
to make as simple a reproducer as possible. It seems like I'm starting
to reach the point where as much of the noise as possible has been removed:
the effect is small, but the issue is still distinctly reproducible.
So my next question is: what's next? Is running the "CPU bound task" under
something like perf record, once while doing the disk I/O and once without
it, the right thing? Or is there a better option that can help isolate
where the issue is happening?
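
For anyone following along without the attachment, here is a minimal sketch of the kind of reproducer described in this thread. It is not the attached program: the names burn_cpu, lock_memory and time_work_units, the xorshift workload and the sample counts are all assumptions made for illustration. It pre-allocates the result vector, optionally calls mlockall() as suggested above, and times fixed units of CPU work without touching the disk.

```cpp
#include <sys/mman.h>   // mlockall, MCL_CURRENT, MCL_FUTURE

#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical unit of pure CPU work: a fixed number of xorshift steps.
// It reads and writes only registers, so it needs no I/O and no new memory.
std::uint64_t burn_cpu(std::uint64_t iterations) {
    std::uint64_t x = 88172645463325252ULL;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
    }
    return x;
}

// Ask the kernel to keep all current and future pages of this process in
// RAM. Returns false if the call fails, which is likely without root or a
// sufficient RLIMIT_MEMLOCK.
bool lock_memory() {
    return mlockall(MCL_CURRENT | MCL_FUTURE) == 0;
}

// Time `samples` work units of `iterations` steps each. The result vector
// is allocated and zero-filled up front, so the timed loop itself performs
// no allocation and no system calls other than reading the clock.
std::vector<double> time_work_units(std::size_t samples,
                                    std::uint64_t iterations) {
    std::vector<double> seconds(samples, 0.0);
    volatile std::uint64_t sink = 0;  // keeps the work from being optimized away
    for (std::size_t i = 0; i < samples; ++i) {
        auto start = std::chrono::steady_clock::now();
        sink = burn_cpu(iterations);
        auto stop = std::chrono::steady_clock::now();
        seconds[i] = std::chrono::duration<double>(stop - start).count();
    }
    (void)sink;
    return seconds;
}
```

A main() would call lock_memory() once, then time_work_units() with a fixed iteration count, and print the samples only after the timed loop has finished. Running that once on an idle machine and once while the disk I/O load is active gives two sets of per-unit timings to compare.
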
Thanks,
Dave