Hello,
I'm a little bit confused about the io_timeout option for sanlock. I
increased the io_timeout to 30 seconds, but it seems like the overall
initialization becomes slower now.
libvirtd is the client through the sanlock plugin.
sanlock runs as "sanlock daemon -R 1 -o 30"
Restarting sanlock + libvirtd takes about 60 seconds before libvirtd
acquires the lease (or at least, before libvirtd starts responding).
After a reboot, for some reason, this delay increases up to 360 seconds...
I have no idea why this would take longer...
The libvirtd guys don't seem to know why this happens... so I hope to find
an answer on this list... I didn't find any other timeouts that are
configurable. From the source code, it looks like most of the timeouts are
based on the io_timeout.
Best regards,
Frido
---------- Forwarded message ----------
From: Daniel P. Berrange <berrange(a)redhat.com>
Date: Tue, Mar 13, 2012 at 3:54 PM
Subject: Re: [libvirt-users] libvirt with sanlock
To: Frido Roose <fr.roose(a)gmail.com>
Cc: libvirt-users(a)redhat.com
On Tue, Mar 13, 2012 at 03:42:36PM +0100, Frido Roose wrote:
Hello,
I configured libvirtd with the sanlock lock manager plugin:
# rpm -qa | egrep "libvirt-0|sanlock-[01]"
libvirt-lock-sanlock-0.9.4-23.el6_2.4.x86_64
sanlock-1.8-2.el6.x86_64
libvirt-0.9.4-23.el6_2.4.x86_64
# egrep -v "^#|^$" /etc/libvirt/qemu-sanlock.conf
auto_disk_leases = 1
disk_lease_dir = "/var/lib/libvirt/sanlock"
host_id = 4
# mount | grep sanlock
/dev/mapper/kvm--shared-sanlock on /var/lib/libvirt/sanlock type gfs2 (rw,noatime,hostdata=jid=0)
# cat /etc/sysconfig/sanlock
SANLOCKOPTS="-R 1 -o 30"
I increased the sanlock io_timeout to 30 seconds (default = 10), because
the sanlock dir is on a GFS2 volume and can be blocked for some time while
fencing and journal recovery take place.
With the default sanlock io timeout, I get lease timeouts because IO is
blocked:
Mar 5 15:37:14 raiti sanlock[5858]: 3318 s1 check_our_lease warning 79 last_success 3239
Mar 5 15:37:15 raiti sanlock[5858]: 3319 s1 check_our_lease failed 80
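For anyone reading these two log lines: the first number after the PID looks like sanlock's own timestamp in seconds, and the number after "warning"/"failed" looks like the age of the last successful renewal, since 3318 - 3239 = 79. The lease is reported as failed at 80, which with the default io_timeout of 10 would be consistent with a threshold of 8 * io_timeout, but that is only an inference from this log. Below is a minimal Python sketch for checking that arithmetic. It assumes the log format is exactly as in the two lines above, and parse_lease_line is a hypothetical helper, not part of sanlock.

```python
import re

# Pattern inferred only from the two check_our_lease lines quoted above
# (an assumption, not a documented sanlock log format).
LEASE_RE = re.compile(
    r"sanlock\[\d+\]: (?P<now>\d+) (?P<space>\S+) check_our_lease "
    r"(?P<state>warning|failed) (?P<age>\d+)(?: last_success (?P<last>\d+))?"
)


def parse_lease_line(line):
    """Return the numeric fields of a check_our_lease line, or None if no match."""
    m = LEASE_RE.search(line)
    if m is None:
        return None
    rec = {
        "now": int(m["now"]),      # timestamp printed by sanlock
        "space": m["space"],       # lockspace label, e.g. "s1"
        "state": m["state"],       # "warning" or "failed"
        "age": int(m["age"]),      # seconds since the last successful renewal
    }
    if m["last"] is not None:
        rec["last_success"] = int(m["last"])
    return rec


if __name__ == "__main__":
    line = ("Mar 5 15:37:14 raiti sanlock[5858]: 3318 s1 "
            "check_our_lease warning 79 last_success 3239")
    rec = parse_lease_line(line)
    # The reported age should equal now - last_success.
    print(rec["now"] - rec["last_success"] == rec["age"])
```

Running the same helper over a longer log would show how long IO was blocked before each failure, which may help in choosing an io_timeout value.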
So far, all fine, but when I restart sanlock and libvirtd, it takes about
2 * 30 seconds = 1 minute before libvirtd is usable. "virsh list" hangs
during this time. I can still live with that...
But it gets worse after a reboot, when running "virsh list" takes a
couple of minutes (about 5 minutes) before it responds. After this
initial delay, virsh responds normally, so it looks like an
initialization issue to me.
Is this a configuration issue, a bug, or expected behavior?
Each libvirtd instance has a lease that it owns. When restarting libvirtd
it tries to acquire this lease. I don't really understand why, but sanlock
sometimes has to wait a very long time between starting & completing its
lease acquisition. You'll probably have to ask the sanlock developers for
an explanation of why.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|