On Sun, Mar 02, 2014 at 04:33:24AM -0500, Saggi Mizrahi wrote:
> I think the problem is that we are even trying to "fence".
> What we really want is for sanlock to try to release a HostID lease.
> IIRC if sanlock can't do it (kill all the related processes etc)
> it fences the host.

I don't yet understand what the desired outcome in each situation is, i.e.
when do we want the host to be reset, when do we want to migrate vms, when
do we want to suspend or kill vms so we can release the leases, etc. The
mechanisms should mostly exist to do what we want, but how and when to
apply them is unclear to me.

> If we make a unique ID (call it instance) for every time we acquire
> a hostID lease (but use the same when we renew) we could have the
> message address the lockspace and an instance. This means that if
> the host was able to release everything the message would no longer
> apply to it. Coupled with the host ID, we can see that the
> hostID/instanceID pair has changed and clear the message.

The instanceID sounds a lot like the existing "host id generation", which
is incremented each time that a host_id is acquired. The host messages
are already addressed to a specific host_id/host_generation, so if a host
loses its host_id lease, returns, and reacquires it (with a new generation
number), the previous message will no longer apply.
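
To make the host_id/generation addressing concrete, here is a rough toy
model in Python. It is only a sketch of the idea described above, not
sanlock code: the class and method names (Lockspace, HostMessage,
acquire_host_id, message_applies) are made up for illustration, and the
starting generation value is an assumption.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class HostMessage:
    """A message addressed to one host_id at one specific generation."""
    host_id: int
    generation: int
    text: str


class Lockspace:
    """Toy lockspace tracking the current generation of each host_id.

    Hypothetical model for illustration; not sanlock's data structures.
    """

    def __init__(self) -> None:
        self._generation: Dict[int, int] = {}  # host_id -> current generation

    def acquire_host_id(self, host_id: int) -> int:
        # Every acquisition increments the generation. A renewal would
        # not call this, so the generation stays the same across renewals.
        gen = self._generation.get(host_id, 0) + 1
        self._generation[host_id] = gen
        return gen

    def message_applies(self, msg: HostMessage) -> bool:
        # A message only applies while the addressed generation is current.
        return self._generation.get(msg.host_id) == msg.generation
```

In this model, a message created for host_id 1 at the generation returned
by the first acquire_host_id(1) applies until host 1 reacquires the
host_id; after that, message_applies() returns False for the old message
without anyone having to clear it explicitly.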