[PATCH v2 0/6] Add fence kdump support

Vivek Goyal vgoyal at redhat.com
Mon Jan 27 15:08:46 UTC 2014


On Mon, Jan 27, 2014 at 09:22:39AM +0100, Marek Grac wrote:
> On 01/24/2014 03:48 PM, Vivek Goyal wrote:
> >But dump time varies based on machine type. So if you add a machine
> >with a large amount of memory to the cluster, it could easily take
> >30 minutes to dump.
> >
> >And there is no documentation which explains how much time it will take
> >to dump. Nobody knows.
> Yes, that's true.
> 
> >I am sorry, I still don't understand how this timeout logic works.
> >
> >- Is it a tick based mechanism where 60 seconds represents the interval
> >   in which at least one tick should be received?
> >
> >- Or is it an absolute upper limit on the time in which the dump should
> >   be completed?
> Problem is that I did not describe it precisely enough, so don't
> worry. There are two timeouts:
> * fence_kdump
>     - tick based mechanism, 60 seconds for valid message, upper
> bound is infinity
>     - usable everywhere even without cluster
> 
> * using fence_kdump with pacemaker/corosync cluster
>     - the most usual combination
>     - cluster has its upper limit in which fencing has to be
> finished, otherwise it is considered to have failed
>     - this timeout has to be set to a value which is system-dependent

So what's the default value of this system dependent timeout?

>     - fencing will fail if previous timeout is not enough and if no
> message was received in 60 seconds (fence_kdump fails -> fencing
> fails)

IIUC, you are saying that there are two timeouts in effect. One says that
a message should be received from fence_kdump every 60 seconds. The other
timeout is a global upper limit set by the cluster admin, and the dump
should finish within that time.

So the first, tick-based fence_kdump timeout should not be a problem. The
only problem will be this absolute upper limit timeout for the cluster. I
am curious to know what's the default value for this timeout.
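
To check my reading of the two timeouts, here is a rough Python sketch of
how I understand them to interact. It is only a model of the description
in this thread, not fence_kdump or pacemaker code. The function name, its
arguments, and the 120 second cluster limit in the usage note are all
hypothetical; the 60 second tick value is the one quoted above.

```python
def fencing_result(message_times, dump_end, tick_timeout=60, cluster_timeout=None):
    """Model the two timeouts as described in this thread (an assumption,
    not the real fence_kdump logic).

    message_times   -- seconds since fencing started at which a valid
                       message arrived from the dumping node
    dump_end        -- second at which the dump finishes
    tick_timeout    -- largest allowed gap between messages
    cluster_timeout -- absolute upper limit set by the cluster admin,
                       None meaning "no upper bound"
    """
    # Only messages sent while the dump is running matter; the end of the
    # dump is treated as the final checkpoint.
    checkpoints = [t for t in sorted(message_times) if t <= dump_end]
    checkpoints.append(dump_end)

    # Tick based timeout: fails as soon as one gap exceeds tick_timeout.
    tick_failure = None
    last_seen = 0
    for t in checkpoints:
        if t - last_seen > tick_timeout:
            tick_failure = last_seen + tick_timeout
            break
        last_seen = t

    # Absolute timeout: fails if the dump is still running at the limit.
    cluster_failure = None
    if cluster_timeout is not None and dump_end > cluster_timeout:
        cluster_failure = cluster_timeout

    failures = [
        (when, why)
        for when, why in ((tick_failure, "tick timeout"),
                          (cluster_failure, "cluster timeout"))
        if when is not None
    ]
    if not failures:
        return "fenced"
    # Whichever timeout expires first decides the outcome.
    return min(failures)[1]
```

With a 30 minute dump that sends a message every 10 seconds,
fencing_result(range(10, 1800, 10), 1800) never trips the tick timeout,
but the same call with a made-up cluster_timeout=120 fails on the
absolute limit. That is the case I am worried about.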

Thanks
Vivek

