[fedora-virt] tuning fdatasync for kvm?

Bill McGonigle bill at bfccomputing.com
Tue Apr 13 14:33:32 UTC 2010


Hi, all,

The subject line is a bit of a guess.  I'm running the preview release 
on an updated F12, with a Vista guest (no virtio drivers yet).  After 
an initial OS install, the software updates have been running for a bit 
more than 12 hours, and the system's hard drives have been thrashing 
heartily the whole time.

I've got a data-journaled ext4 on luks on raid-1.  That's clearly not a 
write-optimized stack, but performance has been pretty good in the past 
with KVM & Vista and fine for normal system operations.  When I check 
iotop while it's thrashing, the I/O is all in [jbd2/dm-2-8] and [kdmflush].

Looking for the blocked tasks (below), I think I see kvm is 
fdatasync'ing, on what I'd presume is a very frequent basis.  I noticed 
that fsync was replaced with fdatasync not too long ago, but that should 
have helped performance, not clobbered it (I think...).  Ideally I could 
tell kvm to fdatasync every 5 seconds or something like that and get 
batched writes.  I've tried switching the I/O scheduler to cfq, deadline, 
and noop, with no big difference.
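
To make the batching idea concrete, here is a minimal sketch of the difference 
between syncing after every write and syncing once per batch.  It is not how 
qemu-kvm issues its syncs; write_blocks, sync_every and the 4 KiB block size 
are hypothetical names and values chosen for illustration, and batching is 
done by write count rather than by a 5-second timer to keep it deterministic.

```python
import os
import tempfile


def write_blocks(path, n_blocks, sync_every, block_size=4096):
    """Write n_blocks zero-filled blocks to path, calling fdatasync after
    every sync_every writes. Returns the number of fdatasync calls made."""
    syncs = 0
    pending = 0
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(n_blocks):
            os.write(fd, block)
            pending += 1
            if pending == sync_every:
                os.fdatasync(fd)  # flush data (and only the metadata needed to read it)
                syncs += 1
                pending = 0
        if pending:
            # Flush whatever is left over from an incomplete final batch.
            os.fdatasync(fd)
            syncs += 1
    finally:
        os.close(fd)
    return syncs


def demo():
    """Compare the sync count for per-write and batched flushing."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "blocks.img")
        per_write = write_blocks(path, 64, sync_every=1)
        batched = write_blocks(path, 64, sync_every=16)
        return per_write, batched
```

On a data-journaled ext4 each of those fdatasync calls can force a journal 
commit, so fewer calls should mean fewer forced commits, at the cost of a 
larger window of unsynced guest data if the host goes down.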

Does this seem sensible or am I totally off-base here?  I'm rebuilding 
this virtual machine after a virtio driver install went south, so I'd 
like to at least have a usable solution (but not libeatmydata) without 
them.  I also run OSes with no virtio driver support at times.

Thanks,
-Bill

-----

qemu-system-x86-0.12.3-6.fc12.x86_64
kernel-2.6.32.11-99.fc12.x86_64

-----

SysRq : Show Blocked State
   task                        PC stack   pid father
kdmflush      D ffff8802218d4880     0  1818      2 0x00000080
  ffff8802243a9d40 0000000000000046 ffff8802243a9ca0 ffffffffa006b33d
  ffff8802228d1a70 ffff8802228d1a70 ffff8802243a9fd8 ffff8802243a9fd8
  ffff880223c21b38 000000000000f980 0000000000015740 ffff880223c21b38
Call Trace:
  [<ffffffffa006b33d>] ? raid1_unplug+0x29/0x2d [raid1]
  [<ffffffff8107c359>] ? ktime_get_ts+0x85/0x8e
  [<ffffffff81454c1d>] io_schedule+0x43/0x5d
  [<ffffffff81378b9e>] dm_wait_for_completion+0xf7/0x129
  [<ffffffff81050c59>] ? default_wake_function+0x0/0x14
  [<ffffffff81379824>] dm_flush+0x59/0x5e
  [<ffffffff813798ea>] dm_wq_work+0xc1/0x173
  [<ffffffff8107027c>] worker_thread+0x1a9/0x237
  [<ffffffff81379829>] ? dm_wq_work+0x0/0x173
  [<ffffffff8107488b>] ? autoremove_wake_function+0x0/0x39
  [<ffffffff810700d3>] ? worker_thread+0x0/0x237
  [<ffffffff8107459e>] kthread+0x7f/0x87
  [<ffffffff81012d6a>] child_rip+0xa/0x20
  [<ffffffff8107451f>] ? kthread+0x0/0x87
  [<ffffffff81012d60>] ? child_rip+0x0/0x20
jbd2/dm-2-8   D 0000000000000002     0  1915      2 0x00000080
  ffff88022586fbe0 0000000000000046 ffff8802218d4800 ffff880222eee880
  ffff88022586fba0 ffffffff8137a489 ffff88022586ffd8 ffff88022586ffd8
  ffff880222c71b38 000000000000f980 0000000000015740 ffff880222c71b38
Call Trace:
  [<ffffffff8137a489>] ? dm_table_unplug_all+0x58/0xc0
  [<ffffffff8107c359>] ? ktime_get_ts+0x85/0x8e
  [<ffffffff8114132a>] ? sync_buffer+0x0/0x44
  [<ffffffff8114132a>] ? sync_buffer+0x0/0x44
  [<ffffffff81454c1d>] io_schedule+0x43/0x5d
  [<ffffffff8114136a>] sync_buffer+0x40/0x44
  [<ffffffff81455170>] __wait_on_bit+0x48/0x7b
  [<ffffffff81455211>] out_of_line_wait_on_bit+0x6e/0x79
  [<ffffffff8114132a>] ? sync_buffer+0x0/0x44
  [<ffffffff810748c4>] ? wake_bit_function+0x0/0x33
  [<ffffffff8114128d>] __wait_on_buffer+0x24/0x26
  [<ffffffff811c6cc9>] wait_on_buffer+0x3d/0x41
  [<ffffffff811c79d2>] jbd2_journal_commit_transaction+0xb5c/0x1102
  [<ffffffff81048024>] ? pick_next_task_fair+0xdb/0xec
  [<ffffffff810653cd>] ? try_to_del_timer_sync+0x73/0x81
  [<ffffffff811cdf31>] kjournald2+0xc6/0x203
  [<ffffffff8107488b>] ? autoremove_wake_function+0x0/0x39
  [<ffffffff811cde6b>] ? kjournald2+0x0/0x203
  [<ffffffff8107459e>] kthread+0x7f/0x87
  [<ffffffff81012d6a>] child_rip+0xa/0x20
  [<ffffffff8107451f>] ? kthread+0x0/0x87
  [<ffffffff81012d60>] ? child_rip+0x0/0x20
qemu-kvm      D ffff8801d1a8fdc0     0 17177      1 0x00000080
  ffff8801d1a8fd78 0000000000000086 0000000000000002 0000000000000046
  ffff8801d1a8fcd8 ffffffff81045a01 ffff8801d1a8ffd8 ffff8801d1a8ffd8
  ffff8802143ac9f8 000000000000f980 0000000000015740 ffff8802143ac9f8
Call Trace:
  [<ffffffff81045a01>] ? task_rq_unlock+0x11/0x13
  [<ffffffff811cd8b3>] jbd2_log_wait_commit+0xc6/0x119
  [<ffffffff8107488b>] ? autoremove_wake_function+0x0/0x39
  [<ffffffff811c5b98>] jbd2_journal_stop+0x205/0x235
  [<ffffffff811c6b08>] jbd2_journal_force_commit+0x28/0x2c
  [<ffffffff811a911c>] ext4_force_commit+0x27/0x2d
  [<ffffffff8118fd57>] ext4_sync_file+0x16f/0x174
  [<ffffffff8113e610>] vfs_fsync_range+0x82/0xb2
  [<ffffffff8113e6aa>] vfs_fsync+0x1d/0x1f
  [<ffffffff8113e6e0>] do_fsync+0x34/0x4b
  [<ffffffff8113e70a>] sys_fdatasync+0x13/0x17
  [<ffffffff81011d32>] system_call_fastpath+0x16/0x1b
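
For reading dumps like the one above, a small sketch that pulls out each 
blocked task and its reliable stack frames (the ones not prefixed with "?") 
may help.  The regular expressions are an assumption based only on the line 
layout shown in this trace, and blocked_tasks is a hypothetical helper name; 
other kernel versions may format these lines differently.

```python
import re

# Task header, e.g. "qemu-kvm      D ffff8801d1a8fdc0     0 17177      1 0x00000080"
# Columns: name, state (D), PC, stack, pid, father, flags.
TASK_RE = re.compile(r"^(\S+)\s+D\s+[0-9a-f]+\s+\d+\s+(\d+)\s+\d+\s+0x[0-9a-f]+$")

# Stack frame, e.g. "[<ffffffff8113e70a>] sys_fdatasync+0x13/0x17";
# a leading "?" marks a frame the kernel is not sure about.
FRAME_RE = re.compile(r"^\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def blocked_tasks(dump):
    """Return a list of {"name", "pid", "frames"} dicts, one per D-state task,
    keeping only frames that are not marked unreliable with "?"."""
    tasks = []
    for raw in dump.splitlines():
        line = raw.strip()
        task = TASK_RE.match(line)
        if task:
            tasks.append({"name": task.group(1),
                          "pid": int(task.group(2)),
                          "frames": []})
            continue
        frame = FRAME_RE.match(line)
        if frame and tasks and frame.group(1) is None:
            tasks[-1]["frames"].append(frame.group(2))
    return tasks


# A few lines copied from the qemu-kvm entry in the trace above.
SAMPLE = """\
qemu-kvm      D ffff8801d1a8fdc0     0 17177      1 0x00000080
Call Trace:
  [<ffffffff81045a01>] ? task_rq_unlock+0x11/0x13
  [<ffffffff811cd8b3>] jbd2_log_wait_commit+0xc6/0x119
  [<ffffffff8113e70a>] sys_fdatasync+0x13/0x17
"""
```

Calling blocked_tasks(SAMPLE) gives one entry for qemu-kvm whose frames 
include jbd2_log_wait_commit and sys_fdatasync, which is the path that 
suggests the guest's writes are waiting on journal commits.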


-- 
Bill McGonigle, Owner
BFC Computing, LLC
http://bfccomputing.com/
Telephone: +1.603.448.4440
Email, IM, VOIP: bill at bfccomputing.com
VCard: http://bfccomputing.com/vcard/bill.vcf
Social networks: bill_mcgonigle/bill.mcgonigle

