[freeze break request] Switch phx2 app servers back to proxy01.

Toshio Kuratomi a.badger at gmail.com
Fri Dec 6 18:37:08 UTC 2013


On Fri, Dec 06, 2013 at 01:06:06PM -0500, Ricky Elrod wrote:
> After talking in #fedora-noc, we would like to make phx2 app servers
> talk to other app servers via proxy01, rather than (potentially) going
> out to a server across the country for the benefit of load balancing.
> 
> The issue this solves is that because apps hosted on admin.fp.o talk to
> each other using the same admin.fp.o roundrobin that users use, when an
> application hits a random proxy and that proxy has gone down for some
> reason, the application would block its thread waiting for a response.
> Eventually this would happen on enough applications that we would get a
> flood of alerts, and ultimately cause downtime.
> 
> By using (only) proxy01, it means re-adding a single point of failure,
> but it seems likely that the case of "proxy01 is down so everything else
> is down" would mean that phx2 was having a network issue anyway, meaning
> we'd be in the same position (i.e., app servers unreachable). It seems
> unlikely that proxy01 will just die at random, and if it does then that
> is a whole new issue that we should address. So yes, the single point of
> failure is bad, but it seems marginally better than what we have been
> seeing lately.
> 
> I would like +1's to push this to puppet, which effectively reverts
> 1bac8c9a and 23ceebd5.
> 

+1

Do we also need to do this for apps.fedoraproject.org? And for
koji.fedoraproject.org?

If we're planning on sticking with this long term, perhaps we should have
an SOP for switching the hosts file entries to another phx2 proxy when we
have planned updates/outages of the proxies, so we don't interrupt service.
(In the same vein, we could keep another proxy's IP address commented out in
the puppet manifest so people can readily see what to switch to.)
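
As a rough sketch, the commented-out fallback could look something like
this in manifests/services/phx.pp. The active address is the one from the
quoted diff; the fallback value is only a placeholder, not a real proxy
address, and would need to be filled in with whichever phx2 proxy is chosen.

```puppet
host { 'admin.fedoraproject.org':
  ensure => present,
  # Address taken from the quoted diff.
  ip     => '10.5.126.52',
  # Fallback for planned updates/outages of the proxy above: comment out
  # the line above and uncomment this one.
  # ADDRESS_OF_ANOTHER_PHX2_PROXY is a hypothetical placeholder.
  #ip    => 'ADDRESS_OF_ANOTHER_PHX2_PROXY',
}
```

Keeping both lines next to each other means the switch is a one-line
change that can be written up in the SOP.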

-Toshio

> [codeblock at lockbox01 puppet]$ git show
> commit f0445cffd64d3db980a4b689517fd4b95f6e7686
> Author: Ricky Elrod <codeblock at lockbox01.phx2.fedoraproject.org>
> Date:   Fri Dec 6 17:55:38 2013 +0000
> 
>     Make phx2 boxen use proxy01 for admin.fp.o again
> 
> diff --git a/manifests/services/phx.pp b/manifests/services/phx.pp
> index 72c24be..317bfc2 100644
> --- a/manifests/services/phx.pp
> +++ b/manifests/services/phx.pp
> @@ -11,11 +11,10 @@ class phx {
>    }
>    case $environment {
>      'production' : {
> -#        host { 'admin.fedoraproject.org':
> -          #ip => '10.5.126.52',
> -          #ip => '66.35.62.166',
> -#          ensure => absent,
> -#        }
> +        host { 'admin.fedoraproject.org':
> +          ip => '10.5.126.52',
> +          ensure => present,
> +        }
>          host { 'cvs.fedoraproject.org':
>              ip => '10.5.125.151',
>              host_aliases => ['cvs']
> 
