Unplanned Proxy Outage: - 2011-08-19 16:30 UTC

Kevin Fenzi kevin at scrye.com
Mon Aug 22 15:38:05 UTC 2011


On Fri, 19 Aug 2011 19:45:45 -0700
Toshio Kuratomi <a.badger at gmail.com> wrote:

...snip...

> Action Items
> ============
> 
> There are some open questions to try to resolve:
> 
> * Why did proxy01 and proxy02 die?  A brief look at the logs has not
>   revealed a cause for this.

I can't find any cause here. The logs just stop; both machines were
locked up hard. ;( 

As a side note: libvirt/kvm supports a watchdog device. We could
possibly set up a watchdog on all our guests so they at least reboot if
they become unresponsive. Of course, that could lead to problems if they
get stuck in a reboot/lockup cycle. 
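
For illustration, here is a minimal sketch of what adding a watchdog to a
guest definition could look like. libvirt's domain XML has a <watchdog>
device element with "model" and "action" attributes; the add_watchdog
helper, the sample domain XML, and the choice of i6300esb/reset are
assumptions made for the example, not our actual guest configuration.

```python
import xml.etree.ElementTree as ET


def add_watchdog(domain_xml: str, model: str = "i6300esb", action: str = "reset") -> str:
    """Return the domain XML with a <watchdog> device added if none is present."""
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        # A domain without a <devices> section gets one created.
        devices = ET.SubElement(root, "devices")
    if devices.find("watchdog") is None:
        ET.SubElement(devices, "watchdog", {"model": model, "action": action})
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed guest definition used only for this example.
SAMPLE = "<domain type='kvm'><name>proxy01</name><devices/></domain>"

if __name__ == "__main__":
    print(add_watchdog(SAMPLE))
```

The guest would also need a watchdog daemon running inside it to feed the
device, otherwise nothing ever trips. If the reboot/lockup cycle is the
worry, an action such as "pause" or "poweroff" instead of "reset" is one
option to consider, at the cost of needing someone to bring the guest back.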

> * Why didn't app06 take up any of the slack when haproxy started
>   passing traffic to the backups?

Yeah, all I can think of is that it was too slow to answer and haproxy
didn't want to add it. 
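
To make that guess concrete, here is a small sketch of how a
rise/fall-style health check behaves when a backend answers slowly. It
is a toy model, not haproxy's code: the server_state function, the 2
second timeout, and the response times below are all made up for the
example, and we have not confirmed this is what happened to app06.

```python
def server_state(check_times, timeout=2.0, rise=2, fall=3, start_up=False):
    """Replay health-check response times and return the final up/down state.

    check_times: response time of each check in seconds, None for no answer.
    A down server needs `rise` consecutive good checks to come up; an up
    server needs `fall` consecutive bad checks to go down.
    """
    up = start_up
    streak = 0  # consecutive results that contradict the current state
    for t in check_times:
        ok = t is not None and t <= timeout
        if ok != up:
            streak += 1
            if streak >= (rise if ok else fall):
                up = ok
                streak = 0
        else:
            streak = 0
    return up


if __name__ == "__main__":
    # A loaded backend that answers in time only every other check
    # never strings together enough good checks to be brought up.
    print(server_state([3.0, 1.0, 3.5, 1.5, 4.0]))
```

A server that is only intermittently fast enough never reaches the
required run of good checks, so it stays out of rotation even though it
is technically alive.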

> We have identified one means of mitigating this in the future:
> 
> If we ran internal DNS for phx2 then we could have
> admin.fedoraproject.org resolve to different proxy servers (using
> internal ip addresses for the proxies inside of PHX2).  This should
> remove the SPOF on proxy01.  We have not yet determined whether we'd
> need to run more proxy servers inside of PHX2 or if hairpinning would
> not be an issue if we used proxy servers outside of phx2.

Well, we do run DNS there, so we can tweak it. :) 
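
As a rough sketch of what listing several proxies in DNS buys us: a
client that gets more than one address back can fall through to the
next one when the first does not answer. The connect_first_available
helper and the 10.0.0.x addresses are hypothetical, and whether a given
client actually retries across addresses like this is an assumption that
would need checking per application.

```python
import socket


def connect_first_available(addresses, port=443, timeout=3.0,
                            connect=socket.create_connection):
    """Try each address in order; return (address, connection) for the first that answers."""
    last_error = None
    for addr in addresses:
        try:
            return addr, connect((addr, port), timeout=timeout)
        except OSError as exc:
            # Remember the failure and move on to the next proxy.
            last_error = exc
    raise ConnectionError(f"no proxy reachable: {last_error}")


if __name__ == "__main__":
    # Hypothetical internal proxy addresses, for illustration only.
    proxies = ["10.0.0.1", "10.0.0.2"]
    try:
        addr, conn = connect_first_available(proxies, timeout=1.0)
        print("connected via", addr)
        conn.close()
    except ConnectionError as exc:
        print(exc)
```

The catch is the timeout: a hard-locked proxy does not refuse
connections, it just never answers, so every request pays the full
timeout before moving on unless the dead address is pulled from DNS.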

Hairpinning only comes into play if we try to list a phx2 external IP
in there. The problem with listing another external proxy is that it's
likely to be slow: the request would need to go all the way out, then
back in to fas. 

We could run another proxy that's just internal to phx2. 
That seems like sort of overkill though. ;( 

I think I might sit down and draw up our proxy/app/fas/etc setup and
perhaps we can look at a picture and see how we can simplify it or make
it more robust. 

kevin