Nagios checks for datanommer/fedmsg

Kevin Fenzi kevin at scrye.com
Mon Apr 29 23:40:09 UTC 2013


On Mon, 29 Apr 2013 13:55:21 -0400
Ralph Bean <rbean at redhat.com> wrote:

> I came up with a draft of a nagios/nrpe check for datanommer/fedmsg
> 
>     https://gist.github.com/ralphbean/5482129
> 
> It queries the datanommer DB and asks for the time since the latest
> message in a particular category (e.g., bodhi, buildsys/koji, askbot).
> 
> We could configure it to raise a warning if, for example:
> 
> - It hasn't seen a buildsys message in the last 30 minutes.
> - It hasn't seen a bodhi message in the last 6 hours.
> - It hasn't seen a fedoratagger message in the last 2 months.
> 
> Since nagios alerts affect lots of people, it should probably be
> discussed here or in the infra meeting before being rolled out.
> 
> Pierre pointed out in channel that this approach assumes that there
> *must* be bodhi activity within such and such amount of time or else
> something is wrong.  This could be problematic.  There are times like
> the holidays in December when fewer people are contributing to
> Fedora, in which case this plugin could throw false positives.
> Accordingly, we would need to set the WARN and CRIT thresholds to be
> generously long.

Yeah, this seems like it would work, but we may have to tweak the
warning/critical thresholds.
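
To make the threshold idea concrete, here is a minimal sketch of the
comparison such a check would do. It is not the code from the gist: the
function name, the None convention, and the critical threshold are
assumptions for illustration, and the datanommer query itself is left
out. Only the Nagios plugin exit codes (0/1/2/3) are standard.

```python
from datetime import datetime, timedelta

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_staleness(last_seen, now, warn, crit):
    """Map the age of a category's latest message to a Nagios status.

    last_seen: datetime of the newest message in the category, or None
               if nothing was ever recorded (assumed convention).
    warn/crit: timedelta thresholds, with warn <= crit.
    Returns an (exit_code, message) tuple.
    """
    if last_seen is None:
        return UNKNOWN, "UNKNOWN: no messages recorded for this category"
    age = now - last_seen
    if age >= crit:
        return CRITICAL, "CRITICAL: last message seen %s ago" % age
    if age >= warn:
        return WARNING, "WARNING: last message seen %s ago" % age
    return OK, "OK: last message seen %s ago" % age


if __name__ == "__main__":
    now = datetime(2013, 4, 29, 23, 40)
    # 30 minutes is the buildsys example from the proposal;
    # the 2 hour critical value is made up for this sketch.
    code, message = check_staleness(
        last_seen=now - timedelta(minutes=45),
        now=now,
        warn=timedelta(minutes=30),
        crit=timedelta(hours=2),
    )
    print(message)
    raise SystemExit(code)
```

Keeping the comparison separate from the DB query would make it easy
to tune the per-category thresholds without touching the rest.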

I suppose another, more complex approach would be to periodically
check the last fedmsg from service XYZ, then pull the same info from
another source (directly from the service, perhaps), and alert if the
two don't match?

I.e., I haven't seen a fedmsg from buildsys in 30 minutes... and yet
when I look at the completed tasks page I see one from 1 minute ago:
alert?

But scraping that data from each service could get really complex. :(
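
Setting the scraping problem aside, the comparison itself could look
roughly like the sketch below. Everything here is hypothetical: the
function name, the grace period (to allow for a message still being in
flight), and the assumption that both timestamps have already been
fetched somehow and share the same clock.

```python
from datetime import datetime, timedelta

# Standard Nagios plugin exit codes (only the two used here).
OK, CRITICAL = 0, 2


def cross_check(last_fedmsg, last_service_event, now, grace):
    """Alert only when the service shows activity the bus never reported.

    last_fedmsg:        datetime of the newest fedmsg seen, or None.
    last_service_event: datetime of the newest event according to the
                        service itself, or None if it has been idle.
    grace:              timedelta a message may take to show up.
    Returns an (exit_code, message) tuple.
    """
    if last_service_event is None:
        return OK, "OK: service reports no activity"
    if last_fedmsg is not None and last_fedmsg >= last_service_event:
        return OK, "OK: latest service event is covered by a fedmsg"
    if now - last_service_event < grace:
        return OK, "OK: latest event may still be in flight"
    return CRITICAL, "CRITICAL: service event at %s has no fedmsg" % (
        last_service_event,
    )


if __name__ == "__main__":
    now = datetime(2013, 4, 29, 23, 40)
    # The case described above: no fedmsg for 30 minutes, but the
    # service shows a completed task from 1 minute ago.
    code, message = cross_check(
        last_fedmsg=now - timedelta(minutes=30),
        last_service_event=now - timedelta(minutes=1),
        now=now,
        grace=timedelta(seconds=30),
    )
    print(message)
    raise SystemExit(code)
```

A quiet service never alerts with this approach, which would sidestep
the holiday false-positive concern, at the cost of needing a second
data source per service.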

kevin