On Thu, 2014-06-12 at 16:11 +0200, Miloslav Trmač wrote:
2014-06-12 16:03 GMT+02:00 Stephen Gallagher
<sgallagh(a)redhat.com>:
> On 06/11/2014 10:16 PM, Simo Sorce wrote:
> > Btw, I am not sure I understand why a crash would be resolved by a
> > deploy, there is quite a difference between an error in deploying
> > and a runtime error a while after successfully deployed.
> >
>
> That's a case where resetError() is the more likely answer (or a
> package update fixing the crash bug). I can add a separate Crashed
> state if you really want it, but it seems superfluous to me.
>
I’d rather not; by my yardstick of “would the user treat the system
differently in the two states?”, there is no difference between
{crashed,failed with an error message} {while starting {for the first time,
for the $Nth time}, after successful start and running for some time}: the
next thing to do is to review any logs applicable, and the ways to remedy
are to either change the configuration (resetError()), to update to a fixed
version (vaguely ~deploy), or to repair a truly broken system (e.g.
lost/corrupted files) by manual action we aren’t making easier.

It seems you consider the system always unavailable after a crash, but
that is not the case with monitored daemons that are automatically
restarted. You may certainly want to be notified, but if the service is
running (once the daemon has been automatically restarted), there is
no error state to actually fix (of course you eventually want to update
packages, but it may take quite a while before updates are available).
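
To make the distinction concrete, here is a rough sketch of the behaviour described above. It is not the actual role API: the Role class, the State names, and on_daemon_exit() are hypothetical, and reset_error() is only a stand-in for the resetError() call mentioned earlier in the thread (which state it should return to is an assumption).

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names, for illustration only.
    READY = "ready"
    RUNNING = "running"
    ERROR = "error"


class Role:
    """Hypothetical role object; not the real API."""

    def __init__(self):
        self.state = State.RUNNING
        self.notifications = []

    def on_daemon_exit(self, restarted_ok):
        # Always record the crash so that an admin can be notified.
        self.notifications.append("daemon crashed")
        if not restarted_ok:
            # Only a failure that was not recovered leaves something to fix.
            self.state = State.ERROR

    def reset_error(self):
        # Stand-in for resetError(): assumed to return the role to a
        # state from which the admin can retry after reviewing the logs.
        if self.state is State.ERROR:
            self.state = State.READY
```

With this model, a crash followed by a successful automatic restart produces a notification but leaves the role in the running state; only an unrecovered failure moves it to the error state.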
Simo.
--
Simo Sorce * Red Hat, Inc * New York