I'm a bit confused about sanlock's io_timeout option.  I increased io_timeout from the default of 10 seconds to 30 seconds, and overall initialization now seems slower.
libvirtd is the client, through its sanlock lock manager plugin.
sanlock runs as "sanlock daemon -R 1 -o 30".

Restarting sanlock + libvirtd takes about 60 seconds before libvirtd acquires the lease (or at least, before libvirtd starts responding).
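
To put a firmer number on that delay than "about 60 seconds", something like the following rough sketch could be run right after the restart. The function name and its parameters are made up for illustration; it simply retries a command until it exits successfully and reports the elapsed time.

```python
import subprocess
import time


def seconds_until_responsive(cmd, attempt_timeout=10.0, poll_interval=1.0,
                             give_up_after=600.0):
    """Retry cmd until it exits with status 0; return elapsed seconds.

    Returns None if cmd never succeeds within give_up_after seconds.
    All names and defaults here are illustrative, not part of libvirt
    or sanlock.
    """
    start = time.monotonic()
    while True:
        if time.monotonic() - start >= give_up_after:
            return None
        try:
            result = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=attempt_timeout,  # a hanging command is killed and retried
            )
            if result.returncode == 0:
                return time.monotonic() - start
        except subprocess.TimeoutExpired:
            pass  # still hanging; try again after a short pause
        time.sleep(poll_interval)
```

For example, calling seconds_until_responsive(["virsh", "list"]) immediately after restarting sanlock and libvirtd would give the time until "virsh list" first answers, to within one polling interval.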

After a reboot, this delay increases to as much as 360 seconds, and I have no idea why it takes longer.

The libvirt developers don't seem to know why this happens, so I hope to find an answer on this list.  I didn't find any other configurable timeouts.  From the source code, it looks like most of the timeouts are derived from io_timeout.
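
As a rough illustration of that scaling, the sketch below parses the two check_our_lease log lines quoted in the forwarded message and checks the arithmetic in them. The multiple of 8 is only inferred from those two lines together with the default io_timeout of 10 seconds, and the linear scaling to 30 seconds is an assumption, not something confirmed against the sanlock source.

```python
import re

# The two log lines from the forwarded message, with the wrapped line rejoined.
LOG_LINES = [
    "Mar  5 15:37:14 raiti sanlock[5858]: 3318 s1 check_our_lease warning 79 last_success 3239",
    "Mar  5 15:37:15 raiti sanlock[5858]: 3319 s1 check_our_lease failed 80",
]

PATTERN = re.compile(
    r"sanlock\[\d+\]: (?P<now>\d+) \S+ check_our_lease "
    r"(?P<state>warning|failed) (?P<gap>\d+)(?: last_success (?P<last>\d+))?"
)


def parse(line):
    """Extract the timestamp, state and renewal gap from one log line."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a check_our_lease line: %r" % line)
    return {
        "now": int(m["now"]),
        "state": m["state"],
        "gap": int(m["gap"]),
        "last_success": int(m["last"]) if m["last"] else None,
    }


warning, failed = (parse(line) for line in LOG_LINES)

# The reported gap is the time since the last successful lease renewal.
last_success = warning["last_success"]
assert warning["now"] - last_success == warning["gap"]
assert failed["now"] - last_success == failed["gap"]

DEFAULT_IO_TIMEOUT = 10   # seconds, the default mentioned in the thread
NEW_IO_TIMEOUT = 30       # seconds, the value set with "-o 30"

# ASSUMPTION: the failure threshold is a fixed multiple of io_timeout.
observed_multiple = failed["gap"] // DEFAULT_IO_TIMEOUT
scaled_failure_gap = observed_multiple * NEW_IO_TIMEOUT

print("lease failed after", failed["gap"], "seconds without renewal")
print("observed multiple of io_timeout:", observed_multiple)
print("same multiple at io_timeout=30:", scaled_failure_gap, "seconds")
```

If other internal delays scale with io_timeout in the same way, tripling io_timeout would triple them as well, which would at least be consistent with the slower startup.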

Best regards,

---------- Forwarded message ----------
From: Daniel P. Berrange <berrange@redhat.com>
Date: Tue, Mar 13, 2012 at 3:54 PM
Subject: Re: [libvirt-users] libvirt with sanlock
To: Frido Roose <fr.roose@gmail.com>
Cc: libvirt-users@redhat.com

On Tue, Mar 13, 2012 at 03:42:36PM +0100, Frido Roose wrote:
> Hello,
> I configured libvirtd with the sanlock lock manager plugin:
> # rpm -qa | egrep "libvirt-0|sanlock-[01]"
> libvirt-lock-sanlock-0.9.4-23.el6_2.4.x86_64
> sanlock-1.8-2.el6.x86_64
> libvirt-0.9.4-23.el6_2.4.x86_64
> # egrep -v "^#|^$" /etc/libvirt/qemu-sanlock.conf
> auto_disk_leases = 1
> disk_lease_dir = "/var/lib/libvirt/sanlock"
> host_id = 4
> # mount | grep sanlock
> /dev/mapper/kvm--shared-sanlock on /var/lib/libvirt/sanlock type gfs2 (rw,noatime,hostdata=jid=0)
> # cat /etc/sysconfig/sanlock
> SANLOCKOPTS="-R 1 -o 30"
> I increased the sanlock io_timeout to 30 seconds (default = 10), because
> the sanlock dir is on a GFS2 volume and can be blocked for some time while
> fencing and journal recovery takes place.
> With the default sanlock io timeout, I get lease timeouts because IO is
> blocked:
>    Mar  5 15:37:14 raiti sanlock[5858]: 3318 s1 check_our_lease warning 79 last_success 3239
>    Mar  5 15:37:15 raiti sanlock[5858]: 3319 s1 check_our_lease failed 80
> So far, all fine, but when I restart sanlock and libvirtd, it takes about 2
> * 30 seconds = 1 minute before libvirtd is usable.  "virsh list" hangs
> during this time.  I can still live with that...
> But it gets worse after a reboot, when running a "virsh list" even takes a
> couple of minutes (like about 5 minutes) before it responds.  After this
> initial time, virsh is responding normally, so it looks like an
> initialization issue to me.
> Is this a configuration issue, a bug, or expected behavior?

Each libvirtd instance has a lease that it owns. When restarting libvirtd
it tries to acquire this lease. I don't really understand why, but sanlock
sometimes has to wait a very long time between starting & completing its
lease acquisition.  You'll probably have to ask the sanlock developers for
an explanation of why.

|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|