Well, the most numerous log entry is nagios asking for
11731659 GET /voting/ HTTP/1.0
which somehow gets put in both the access log and the error log. This
is coming primarily from nagios, from noc2 and publictest7. It would
seem we test 3 times a second per app server.
On bapp01, in a typical hour, we see the following number of requests
per IP address:
1447 10.5.126.51
1389 192.168.1.7
1376 192.168.1.63
1346 192.168.1.14
1289 192.168.1.52
1286 192.168.1.25
1280 192.168.1.12
This is a lot more than the other app servers get per hour:
bapp01.phx2.fedoraproject.org  9413
app01.phx2.fedoraproject.org   2369
app04.phx2.fedoraproject.org   2366
app02.phx2.fedoraproject.org   2399
app03.phx2.fedoraproject.org   2372
app6.fedoraproject.org         1381
app5.fedoraproject.org         1429
So there must be a weighting issue going on somewhere.
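To put a number on the imbalance, here is a minimal sketch in Python that
uses only the hourly figures quoted above. The variable and function names
are made up for illustration. It checks that the per-IP counts on bapp01 add
up to the bapp01 total, and works out how many times more requests bapp01
gets than each of the other hosts.

```python
# Hourly request counts, copied from the figures quoted above.
requests_per_hour = {
    "bapp01.phx2.fedoraproject.org": 9413,
    "app01.phx2.fedoraproject.org": 2369,
    "app04.phx2.fedoraproject.org": 2366,
    "app02.phx2.fedoraproject.org": 2399,
    "app03.phx2.fedoraproject.org": 2372,
    "app6.fedoraproject.org": 1381,
    "app5.fedoraproject.org": 1429,
}

# Per-source-IP counts seen on bapp01 in the same hour.
bapp01_by_ip = [1447, 1389, 1376, 1346, 1289, 1286, 1280]


def ratios_to(baseline_host, counts):
    """Return baseline_count / other_count for every other host."""
    base = counts[baseline_host]
    return {
        host: base / n for host, n in counts.items() if host != baseline_host
    }


per_ip_total = sum(bapp01_by_ip)
ratios = ratios_to("bapp01.phx2.fedoraproject.org", requests_per_hour)

if __name__ == "__main__":
    print("sum of per-IP counts on bapp01:", per_ip_total)
    for host, ratio in sorted(ratios.items()):
        print(f"bapp01 gets {ratio:.1f}x the requests of {host}")
```

The per-IP counts sum to exactly the bapp01 total, so those seven addresses
account for all of its traffic in that hour. bapp01 comes out at roughly 4
times app01 through app04 and between 6.5 and 7 times app5 and app6.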
I think the problem is in haproxy with the following:
listen voting 0.0.0.0:10007
    balance hdr(appserver)
    server app1 app1:80 check inter 10s rise 2 fall 4
    server app2 app2:80 check inter 10s rise 2 fall 4
    server app3 app3:80 check inter 10s rise 2 fall 4
    server app4 app4:80 check inter 10s rise 2 fall 4
    server app5 app5:80 backup check inter 15s rise 2 fall 4
    server app6 app6:80 backup check inter 15s rise 2 fall 4
    server app7 app7:80 check inter 15s rise 2 fall 4
    server bapp1 bapp1:80 backup check inter 2s rise 2 fall 4
    option httpchk GET /voting/
If I am reading this right, the check runs every 2 seconds on bapp1
versus every 10 to 15 seconds for the others.
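As a rough sanity check on that reading, the sketch below (Python, with
made-up names) turns each "inter" value into an expected number of health
checks per hour. It rests on two assumptions that the logs do not confirm:
that each of the seven source IPs seen on bapp01 is a separate haproxy
instance running its own checks, and that every check fires exactly on its
interval, which makes the result an upper bound.

```python
# Check intervals in seconds, taken from the "inter" values in the
# listen block above.
check_interval_s = {
    "app1": 10,
    "app2": 10,
    "app3": 10,
    "app4": 10,
    "app5": 15,
    "app6": 15,
    "app7": 15,
    "bapp1": 2,
}

# ASSUMPTION: one independent checker per source IP seen on bapp01.
ASSUMED_CHECKERS = 7


def checks_per_hour(interval_s, checkers=1):
    """Upper bound on checks per hour for one server at a given interval."""
    return checkers * 3600 // interval_s


expected = {
    name: checks_per_hour(interval, ASSUMED_CHECKERS)
    for name, interval in check_interval_s.items()
}

if __name__ == "__main__":
    for name, total in expected.items():
        print(f"{name}: at most {total} checks/hour from "
              f"{ASSUMED_CHECKERS} checkers")
```

Under those assumptions a 2 second interval gives at most 1800 checks per
hour per checker (12600 across seven), a 10 second interval at most 360
(2520 across seven), and a 15 second interval at most 240 (1680 across
seven). The observed counts are all lower than these bounds, but they fall
in the same order and at roughly similar ratios, which fits the idea that
the check interval rather than real traffic explains the difference.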
--
Stephen J Smoogen.
"The core skill of innovators is error recovery, not failure avoidance."
Randy Nelson, President of Pixar University.
"Let us be kind, one to another, for most of us are fighting a hard
battle." -- Ian MacLaren