Sarat,
I've been using RHQ for the past couple of days to integrate our
application. For a newbie like me, RHQ was very easy to learn and find my way around (the UI is
very intuitive). I was able to get it working with my application very quickly.
Thanks to you guys for coming up with such a great application.
Thanks
Now I have a situation where I need to demonstrate that RHQ
raises an alert for the rule "If connectivity to an external interface fails 3 times in
the past 5 minutes, raise a medium-priority alert to a given list of users." To demonstrate
this, I have instrumented my code and exposed a JMX attribute to hold this value. The
value of the JMX attribute is incremented every time we have a connectivity timeout with
the external interface.
RHQ was able to fire an alert when this counter reached 3, but since
my application never resets the value, the alert was never able to recover from this
situation. For example: after 3 connection timeouts, connectivity to the external interface
resumed, but the counter stays at 3 for the next metric collection. The next time RHQ
queries the metric, it sees that the value still matches the rule and fires the alert again.
What is the best way to handle this kind of situation?
Just send a 1 when the connection is not available and a 0 otherwise.
Then define an alert that triggers on value > 0.5 and define a dampening rule for N
occurrences in X minutes (in your case, 3 occurrences in 5 minutes).
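
On the application side, this could look roughly like the sketch below. It is only an
illustration, not code from your application: the names ConnectivityDemo,
ConnectivityStatus, getConnectionFailed and the ObjectName
"com.example.myapp:type=ConnectivityStatus" are all made up, and the wrapper class exists
only to keep the example in one file. The attribute reports the outcome of the most
recent connection attempt instead of a running total.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.ObjectName;

// Hypothetical wrapper so the sketch fits in one file; in a real project the
// interface and the implementation would normally live in their own files.
class ConnectivityDemo {

    /** Standard MBean interface; JMX requires it to be public. */
    public interface ConnectivityStatusMBean {
        /** 1 if the most recent connection attempt timed out, 0 otherwise. */
        int getConnectionFailed();
    }

    /** Holds the outcome of the last attempt, not a cumulative count. */
    public static final class ConnectivityStatus implements ConnectivityStatusMBean {
        private final AtomicInteger failed = new AtomicInteger(0);

        /** Call after a successful exchange with the external interface. */
        public void recordSuccess() {
            failed.set(0);
        }

        /** Call wherever the old counter used to be incremented. */
        public void recordTimeout() {
            failed.set(1);
        }

        @Override
        public int getConnectionFailed() {
            return failed.get();
        }
    }

    /** Registers the bean with the platform MBean server (name is an assumption). */
    static ConnectivityStatus register() throws Exception {
        ConnectivityStatus status = new ConnectivityStatus();
        ObjectName name = new ObjectName("com.example.myapp:type=ConnectivityStatus");
        ManagementFactory.getPlatformMBeanServer().registerMBean(status, name);
        return status;
    }
}
```

Because the value drops back to 0 as soon as a connection attempt succeeds, the alert
condition stops matching on the next collection and the alert can recover. Keep in mind
that RHQ only sees the value at collection time, so the collection interval for this
metric should be short enough to catch the failures you care about.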
Hope that helps
Heiko