I'm currently working on the sanlock fencing feature for oVirt.
I have some questions and suggestions regarding the patch.
> A host can send a predefined msg_num to another host.
> The host messages are sent from one host to another via
> a lockspace that both hosts are using. If no lockspace
> name is specified, the sanlock daemon will search for a
> common lockspace to use. (N.B. hosts do not necessarily
> use the same host_id in all lockspaces, so not specifying
> the lockspace could result in targeting the wrong host.)
I think that making the lockspace a required parameter makes
more sense and will avoid fatal errors.
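To illustrate the risk, here is a toy model in Python. It is not
sanlock code, and the lockspace names, host names and host_id
values are made up. The same host_id can belong to different
hosts in different lockspaces, so the target depends on which
lockspace the daemon happens to pick:

```python
# Toy model, not sanlock code. Lockspace names, host names and
# host_id values are hypothetical.
# lockspace name -> {host_id: host name}
lockspaces = {
    "ls_a": {1: "host-x", 2: "host-y"},
    "ls_b": {1: "host-y", 2: "host-x"},
}

def target_of(lockspace, host_id):
    """Return the host that owns host_id in the given lockspace."""
    return lockspaces[lockspace][host_id]

def possible_targets(host_id):
    """Hosts that host_id may refer to if no lockspace is given."""
    return {hosts[host_id] for hosts in lockspaces.values()
            if host_id in hosts}

# With an explicit lockspace the target is unambiguous.
explicit = target_of("ls_a", 2)
# Without one, host_id 2 may refer to either host.
ambiguous = possible_targets(2)
```

With a required lockspace the caller always names the pair
(lockspace, host_id), which identifies exactly one host.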
> The lockspace used to transmit the message may or may not
> have any other relation to the message itself.
> A host can send one message to a one other host at a time.
Can we increase this number, to simplify the (unlikely) case
where more than one host needs to be fenced?
> The message is placed in the sending host's delta lease,
> and remains there for two renewals. When the receiving
> host renews its own delta lease, it checks the delta leases
> of all other hosts, and sees itself addressed in the sending
> host's lease. It then processes the message from the
> sending host.
Why did you choose to keep the message for two renewals?
We would like to have this value configurable, to make
it easier to solve issues in the field.
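As a rough sketch of what I mean, here is a toy Python model of
the sending host's delta lease. It is not sanlock code: the class
and the hold_renewals parameter are hypothetical, and the expiry
logic is only my reading of the description above.

```python
import errno

WD_RESET = 1  # msg_num defined in the patch

class DeltaLease:
    """Toy model of the sending host's delta lease (not sanlock code)."""

    def __init__(self, hold_renewals=2):
        # hold_renewals is the hypothetical configurable value;
        # 2 matches the behavior described in the patch.
        self.hold_renewals = hold_renewals
        self.message = None
        self.remaining = 0

    def send(self, msg_num, target_host_id):
        """Place a message in the lease, or fail if one is active."""
        if self.message is not None:
            return -errno.EBUSY
        self.message = (msg_num, target_host_id)
        self.remaining = self.hold_renewals
        return 0

    def renew(self):
        """One lease renewal; drop the message once it has expired."""
        if self.message is not None:
            self.remaining -= 1
            if self.remaining <= 0:
                self.message = None

lease = DeltaLease(hold_renewals=3)
first = lease.send(WD_RESET, target_host_id=2)
busy = lease.send(WD_RESET, target_host_id=3)   # still active
for _ in range(3):
    lease.renew()
after = lease.send(WD_RESET, target_host_id=3)  # slot is free again
```

A larger value keeps the message visible longer, at the cost of
a longer wait before the next message can be sent.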
I think we have to handle the (unlikely) case where a host
loses its lease without seeing the WD_RESET message, then
acquires the lease again (not sure if this is possible in vdsm
currently). In this case the fencing host may wrongly assume
that the host was fenced.
What if we leave the WD_RESET message in place until the fencing
host sends a WD_UNRESET message?
This way I can send a WD_RESET message and wait for some
renewals, ensuring that the host has either lost its lease or
will *not* get a new lease until I decide to allow it to get one.
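A minimal Python sketch of the semantics I have in mind. This is
a toy model, not sanlock code: WD_UNRESET does not exist today,
its value here is made up, and the lease check is an assumption
about how the daemon could enforce it.

```python
WD_RESET = 1      # defined in the patch
WD_UNRESET = 2    # proposed here; the value is hypothetical

class FencingModel:
    """Toy model of a sticky WD_RESET (not sanlock code)."""

    def __init__(self):
        self.fenced = set()   # host_ids with a pending WD_RESET

    def send(self, msg_num, host_id):
        """Record or clear a fence request for host_id."""
        if msg_num == WD_RESET:
            self.fenced.add(host_id)
        elif msg_num == WD_UNRESET:
            self.fenced.discard(host_id)
        else:
            raise ValueError("unknown msg_num")

    def may_acquire_lease(self, host_id):
        """A host may get a lease only when no WD_RESET is pending."""
        return host_id not in self.fenced

model = FencingModel()
model.send(WD_RESET, 2)
blocked = model.may_acquire_lease(2)    # fenced, no new lease
model.send(WD_UNRESET, 2)
allowed = model.may_acquire_lease(2)    # fence lifted
```

The point is that the fence stays in effect regardless of how
many renewals pass, until the fencing host explicitly lifts it.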
We may have a case where we cannot access a host, so we fence
it to ensure that it cannot access the storage, but the host
never sees the fence request. In that case we need to keep it in
"fenced" mode until we can reboot the host using power
management.
> If a message is currently active in a lockspace, the
> sending host_message call will return -EBUSY. After two
> renewals (around 40 seconds), another message may be sent.
> An optional host generation can be included, in which
> case the receiving host_id will accept the message only
> if its current generation matches.
> The single msg_num defined here is WD_RESET (1), which
> means that the host receiving the message should use
> its watchdog device to reset itself as soon as possible.
> The WD_RESET message has no effect on any lockspaces
> or resources that may exist. Existing lockspaces and
> resources continue to operate as usual until the reset.
> (A watchdog reset due to "standard" lockspace failure
> could in fact occur before the watchdog reset caused
> by the host message.)
> Because host messages may not be received if the
> destination host fails, or looses storage access,
> there are no guaranteed times associated with the
> delivery, processing or effect of a host message.
> Guaranteed times for another host being dead should
> continue to be based on either acquiring a resource,
> or sanlock_get_hosts().
What would be the best way to detect that the host was fenced?
> TODO: will be adding another msg_num to cause the
> destination to use /proc/sysrq-trigger to reboot itself.
> (After setting up the watchdog to reset the machine in
> case the sysrq mechanism fails.) The sysrq reboot is
> immediate, whereas the watchdog takes a minute to reset.
Maybe use WD_FENCE, and let sanlock use the best available
method for fencing?
Can we customize the actions taken by sanlock when it receives
the WD_RESET message? For example, running a script after
the message is received?
What would be the best way to detect if a host supports
the new fencing feature - check its sanlock version?