NUMA, migrate/N, and tuned-adm

David Timothy Strauss david at davidstrauss.net
Mon Dec 16 07:44:45 UTC 2013


We're running Fedora 19 with the 3.11.10-200.fc19.x86_64 kernel (just
the normal RPM) on large servers (128GB RAM over two NUMA regions,
each with one hex-core processor) with a large number of processes
(more than 700, a couple hundred of which are active fairly
frequently). We encounter situations where the system gets
overwhelmed with migrate/N tasks from the kernel, based on what we've
seen in top.

Here's what we've tried:
 * tuned-adm on latency-performance and virtual-host profiles
 * kernel.sched_migration_cost_ns=5000000 (which tuned will do for
   those profiles in v3.3/Fedora 20)
 * numad
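
To confirm that the sysctl setting above is actually in effect after a
profile switch, the value can be read back from procfs. This is a
minimal Python sketch for illustration only: the helper names
sysctl_path and read_sysctl are made up, and it assumes the knob is
exposed under /proc/sys/kernel/ on the running kernel.

```python
import os


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under the procfs sysctl tree."""
    return os.path.join(root, *name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a stripped string, or None if absent.

    Assumption: the name contains no dots other than the separators,
    which holds for kernel.sched_migration_cost_ns.
    """
    try:
        with open(sysctl_path(name, root)) as f:
            return f.read().strip()
    except FileNotFoundError:
        return None
```

Calling read_sysctl("kernel.sched_migration_cost_ns") should return the
string "5000000" if the setting took effect; None means the kernel does
not expose the knob at that path.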

Here's what we've used for analysis:
 * powertop
 * top/htop
 * perf record -a -g
 * SystemTap with code to print out migrations occurring
 * numatop
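
As a lighter-weight cross-check on the tools above, per-task migration
counters can also be sampled from procfs. The following is a rough
Python sketch, assuming the kernel is built with scheduler debugging so
that /proc/<pid>/sched exists and contains an se.nr_migrations line.
The function names are hypothetical, and only each process's main
thread is counted.

```python
import os
from collections import Counter


def parse_nr_migrations(sched_text):
    """Extract the se.nr_migrations counter from /proc/<pid>/sched text."""
    for line in sched_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "se.nr_migrations":
            return int(value.strip())
    return None  # field missing (e.g. scheduler debugging not enabled)


def snapshot(proc_root="/proc"):
    """Return {pid: (comm, nr_migrations)} for every readable process."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
            with open(os.path.join(proc_root, entry, "sched")) as f:
                count = parse_nr_migrations(f.read())
        except OSError:
            continue  # process exited or file not readable
        if count is not None:
            result[int(entry)] = (comm, count)
    return result


def migrations_by_comm(before, after):
    """Sum the per-process increase in migrations, grouped by command name."""
    totals = Counter()
    for pid, (comm, count) in after.items():
        # A process absent from the first snapshot started in between,
        # so its whole counter falls inside the interval.
        previous = before[pid][1] if pid in before else 0
        totals[comm] += max(count - previous, 0)
    return totals
```

Taking one snapshot() before a Chef run and one during it, then calling
migrations_by_comm(before, after).most_common(10), gives a ranking of
the commands that migrated most over that interval.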

All we know is that the migration storms correlate with concurrent
Chef runs verifying/configuring containers on the system.

Obviously, Chef invokes many things. But most of the migrations we
see are for Chef, the Ruby interpreter, the sh interpreter, and Munin.
Responsiveness returns to normal after SystemTap reports a large set
of chef-solo migrations, presumably at their completion.
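
With two NUMA regions, it may also help to separate migrations that
cross nodes from those that stay within one. A minimal Python sketch of
that classification, assuming the CPU-to-node layout is read from the
sysfs cpulist files under /sys/devices/system/node/ and that the source
and destination CPU numbers come from whatever tracing is in use; the
function names are hypothetical.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a sysfs CPU list such as '0-5,12-17' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        low, sep, high = part.partition("-")
        cpus.update(range(int(low), int(high if sep else low) + 1))
    return cpus


def load_node_map(sys_root="/sys/devices/system/node"):
    """Return {cpu: node} built from each node directory's cpulist file."""
    cpu_to_node = {}
    for path in glob.glob(os.path.join(sys_root, "node[0-9]*", "cpulist")):
        name = os.path.basename(os.path.dirname(path))
        match = re.fullmatch(r"node(\d+)", name)
        if match is None:
            continue
        with open(path) as f:
            for cpu in parse_cpulist(f.read()):
                cpu_to_node[cpu] = int(match.group(1))
    return cpu_to_node


def crosses_node(cpu_to_node, src_cpu, dst_cpu):
    """True if a migration from src_cpu to dst_cpu changes NUMA node."""
    return cpu_to_node[src_cpu] != cpu_to_node[dst_cpu]
```

A storm dominated by cross-node moves would point more toward NUMA
placement, while mostly intra-node moves would point more toward
ordinary load balancing.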

David Strauss
Pantheon Systems
Fedora Server Working Group

