On Tue, Mar 13, 2012 at 5:08 PM, David Teigland <teigland(a)redhat.com> wrote:
> > I increased the sanlock io_timeout to 30 seconds (default = 10), because
> > the sanlock dir is on a GFS2 volume and can be blocked for some time while
> > fencing and journal recovery takes place.
This is one of the reasons why you should not put sanlock leases on gfs2.
They should be put directly on a shared block device.
> I'm a little bit confused about the io_timeout option for sanlock. I
> increased the io_timeout to 30 seconds, but it seems like the overall
> initialization becomes slower now.
> libvirtd is the client through the sanlock plugin.
> sanlock runs as "sanlock daemon -R 1 -o 30"
>
> Restarting sanlock + libvirtd takes about 60 seconds before libvirtd
> acquires the lease (or at least, before libvirtd starts responding).
>
> After a reboot, for some reason, this delay increases up to 360 seconds...
> I have no idea why this would take longer...
>
> The libvirtd guys don't seem to know why this happens... so I hope to find
> an answer on this list... I didn't find any other timeouts that are
> configurable. From the source code, it looks like most of the timeouts are
> based on the io_timeout.
Yes, all the timeouts are derived from the io_timeout and are dictated by
the recovery requirements and the algorithm the host_id leases are based
on: "Light-Weight Leases for Storage-Centric Coordination" by Gregory
Chockler and Dahlia Malkhi.
Here are the actual equations copied from sanlock_internal.h.
"delta" refers to host_id leases that take a long time to acquire at startup
"free" corresponds to starting up after a clean shutdown
"held" corresponds to starting up after an unclean shutdown
You should find that with 30 sec io timeout these come out to 1 min / 6 min
which you see when starting after a clean / unclean shutdown.
Thanks! This information explains the differences in delay I encounter
between a clean and unclean situation.
The reboot delay was effectively after a fence operation, so an unclean
restart.
I guess having a delta_acquire_held_min of 300 seconds is to be sure that
no host with this host_id would acquire the lock in the meantime.
I'm not sure if this still makes sense on top of GFS2, but that reminds me
of the fact that you said sanlock was meant to be used with a block device.
Maybe this is something that the libvirtd devs need to be aware of. I'll
start a discussion about this on the libvirt list, as the person who tried
to help me didn't understand the delays either.
* io_timeout_seconds: defined by us
*
* id_renewal_seconds: defined by us
*
* id_renewal_fail_seconds: defined by us
*
* watchdog_fire_timeout: /dev/watchdog will fire without being petted this long
* = 60 constant
*
* host_dead_seconds: the length of time from the last successful host_id
* renewal until that host is killed by its watchdog.
* = id_renewal_fail_seconds + watchdog_fire_timeout
*
* delta_large_delay: from the algorithm
* = id_renewal_seconds + (6 * io_timeout_seconds)
*
* delta_short_delay: from the algorithm
* = 2 * io_timeout_seconds
*
* delta_acquire_held_max: max time it can take to successfully
* acquire a non-free delta lease
* = io_timeout_seconds (read) +
* max(delta_large_delay, host_dead_seconds) +
* io_timeout_seconds (read) +
* io_timeout_seconds (write) +
* delta_short_delay +
* io_timeout_seconds (read)
*
* delta_acquire_held_min: min time it can take to successfully
* acquire a non-free delta lease
* = max(delta_large_delay, host_dead_seconds)
*
* delta_acquire_free_max: max time it can take to successfully
* acquire a free delta lease.
* = io_timeout_seconds (read) +
* io_timeout_seconds (write) +
* delta_short_delay +
* io_timeout_seconds (read)
*
* delta_acquire_free_min: min time it can take to successfully
* acquire a free delta lease.
* = delta_short_delay
*
* delta_renew_max: max time it can take to successfully
* renew a delta lease.
* = io_timeout_seconds (read) +
* io_timeout_seconds (write)
*
* delta_renew_min: min time it can take to successfully
* renew a delta lease.
* = 0
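
For anyone who wants to plug in their own numbers, the equations above can be evaluated with a short script. This is only a sketch, not sanlock code: the function name derive_timeouts and the dictionary layout are made up for illustration. The header comment only says that id_renewal_seconds and id_renewal_fail_seconds are "defined by us", so they are plain parameters here. The example run assumes 2 * io_timeout and 8 * io_timeout for them; that is a guess which happens to reproduce the 300 second delta_acquire_held_min mentioned above, so check the values your sanlock version actually uses.

```python
def derive_timeouts(io_timeout, id_renewal, id_renewal_fail, watchdog_fire=60):
    """Evaluate the delta lease timing equations quoted above (seconds).

    watchdog_fire defaults to the constant 60 given in the comment block.
    """
    host_dead = id_renewal_fail + watchdog_fire
    delta_large_delay = id_renewal + 6 * io_timeout
    delta_short_delay = 2 * io_timeout
    # Wait that dominates acquiring a lease that was not released cleanly.
    held_wait = max(delta_large_delay, host_dead)

    return {
        "host_dead_seconds": host_dead,
        "delta_large_delay": delta_large_delay,
        "delta_short_delay": delta_short_delay,
        # read + wait + read + write + short delay + read
        "delta_acquire_held_max": (io_timeout + held_wait + io_timeout
                                   + io_timeout + delta_short_delay
                                   + io_timeout),
        "delta_acquire_held_min": held_wait,
        # read + write + short delay + read
        "delta_acquire_free_max": (io_timeout + io_timeout
                                   + delta_short_delay + io_timeout),
        "delta_acquire_free_min": delta_short_delay,
        # read + write
        "delta_renew_max": io_timeout + io_timeout,
        "delta_renew_min": 0,
    }


if __name__ == "__main__":
    io = 30
    # ASSUMPTION: renewal = 2 * io_timeout, renewal fail = 8 * io_timeout.
    # These ratios are not stated in the quoted comment block.
    timeouts = derive_timeouts(io, id_renewal=2 * io, id_renewal_fail=8 * io)
    for name, seconds in timeouts.items():
        print(f"{name:24s} {seconds:4d} s")
```

With those assumed ratios and io_timeout = 30, the script gives delta_acquire_free_min = 60 and delta_acquire_held_min = 300, with delta_acquire_held_max = 480 as the upper bound after an unclean shutdown.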
_______________________________________________
sanlock-devel mailing list
sanlock-devel(a)lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/sanlock-devel