[PATCH] haproxy & mirrorlist processes (root cause, more fixes)

Wed May 12 04:20:58 UTC 2010

On Tue, May 11, 2010 at 02:48:23PM -0500, Matt Domsch wrote:
> The mirrorlists are falling over - haproxy keeps marking app servers
> as down, and some requests are getting HTTP 503 Server Temporarily
> Unavailable responses.  This happens every 10 minutes, for 2-3
> minutes, as several thousand EC2 instances request the mirrorlist
> again.
> 
> For reference, we're seeing a spike of over 2000 simultaneous requests
> across our 6 proxy and 4 app servers, occurring every 10 minutes,
> dropping back down to under 20 simultaneous requests in between.
> 
> Trying out several things.
> 
> 1) increase number of mirrorlist WSGI processes on each app server
>    from 45 to 100.  This is the maximum number of simultaneous
>    mirrorlist requests that each server can serve.  I've tried this
>    value on app01, and running this many still keeps the
>    mirrorlist_server back end (which fork()s on each connection)
>    humming right along.  I think this is safe.  Increasing much beyond
>    this though, the app servers will start to swap, which we must
>    avoid.  We can watch the swapping, and if it starts, lower this
>    value somewhat.  The value was 6 just a few days ago, which wasn't
>    working either.
> 
>    This gives us 400 slots to work with on the app servers.

I don't have to do this anymore.  I found the source of the problem
(the CPU on the app servers was at 100% utilization, due to another
bug in MM's handling of user input).  Fixing that, we don't need
nearly as many workers.

> 
> 2) try limiting the number of connections from each proxy server to
>    each app server, to 25 per.  Right now we're seeing a max of
>    between 60 and 135 simultaneous requests from each proxy server to
>    each app server.  All those over 25 will get queued by haproxy and
>    then served as app server instances become available.  I did this
>    on proxy03, and it really helped out the app servers and kept them
>    humming.  There were still some longish response times (some >30
>    seconds).
> 
>    We're still oversubscribing app server slots here though, but
>    oddly, not by as much as you'd think, as proxy03 is taking 40% of
>    the incoming requests itself for some reason.

I'm not going to do this either.
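
For reference, the oversubscription arithmetic behind items 1 and 2 can
be sketched as below.  It uses only the figures quoted above (6 proxies,
4 app servers, 100 workers per app server, 25 connections per proxy per
app server) and assumes the limit would have been applied on all six
proxies; the function and variable names are made up for illustration.

```python
def capacity(proxies=6, app_servers=4, workers_per_app=100, per_proxy_limit=25):
    """Compare worker slots with the connections haproxy would let through."""
    total_slots = app_servers * workers_per_app       # worker processes overall
    inflight_per_app = proxies * per_proxy_limit      # max concurrent per app server
    total_inflight = inflight_per_app * app_servers   # max concurrent overall
    return {
        'total_slots': total_slots,
        'inflight_per_app': inflight_per_app,
        'total_inflight': total_inflight,
        # ratio > 1.0 means more connections admitted than workers to serve them
        'oversubscription': inflight_per_app / float(workers_per_app),
    }

if __name__ == '__main__':
    for name, value in sorted(capacity().items()):
        print('%s = %s' % (name, value))
```

With those numbers each app server could still be handed 150 concurrent
requests against 100 workers, which is the oversubscription mentioned in
item 2.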

> 
> 3) bump the haproxy timeout up to 60 seconds.  5 seconds (the global
>    default) is way too low when we get the spikes.  This was causing
>    haproxy to think app servers were down, and start sending load to
>    the other app servers, which would then overload, and then start
>    sending to the first backup server, ...  Let's be nicer.  If during
>    a spike it takes 60 seconds to get an answer, or be told HTTP 503,
>    so be it.

I did bump the timeout to 30 seconds instead of 60.  We'll see how
that works.

 
> 4) have haproxy use all the backup servers when all the app servers
>    are marked down.  Right now it sends all the requests to a single
>    backup server, and if that's down, all to the next backup server,
>    etc.  We know one server can't handle the load (even 4 aren't
>    really), so don't overload a single backup either.

Done.

> 
> 5) the default mirrorlist_server listen backlog is only 5, meaning
>    that at most 5 WSGI clients get queued up if all the children are
>    busy.  To handle spikes, bump that to 300 (though it's limited by
>    the kernel to 128 by default).  This was the intent, but the code was buggy.

Will do.
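
As an illustration of the backlog setting in item 5, here is a minimal
sketch written for current Python 3.  It assumes the server is built on
the standard socketserver classes, whose request_queue_size attribute
defaults to 5 and is passed to listen(); whether mirrorlist_server.py is
structured this way is an assumption, and the class and handler names
are hypothetical.

```python
import socketserver

class EchoHandler(socketserver.StreamRequestHandler):
    """Placeholder handler: echo one line back to the client."""
    def handle(self):
        self.wfile.write(self.rfile.readline())

class BigBacklogServer(socketserver.ForkingMixIn,
                       socketserver.UnixStreamServer):
    # socketserver hands this value to listen(); the stock value is 5.
    request_queue_size = 300

def somaxconn():
    """Return the kernel's cap on listen backlogs, or None if unreadable."""
    try:
        with open('/proc/sys/net/core/somaxconn') as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

The kernel silently truncates the requested backlog to the somaxconn
value, so asking for 300 only helps up to that cap unless the sysctl is
raised as well.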

 
> 6) bug fix to mirrorlist_server to not ignore SIGCHLD.  Amazing this
>    ever worked in the first place.  This should resolve the problem
>    where mirrorlist_server slows down and memory grows over time.

Will do.
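
For readers unfamiliar with SIGCHLD handling in a forking server, one
common pattern is sketched below in current Python 3.  This is not the
mirrorlist_server.py fix itself, only a generic example of reaping
exited children without blocking; the function names are hypothetical.

```python
import os
import signal

def reap_children(signum=None, frame=None):
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break   # this process has no children at all
        if pid == 0:
            break   # children exist, but none has exited yet
        reaped.append(pid)
    return reaped

def install_sigchld_handler():
    """Run reap_children() whenever the kernel reports a child exit."""
    signal.signal(signal.SIGCHLD, reap_children)
```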

The root cause of the CPU utilization by mirrorlist_client.wsgi was
that mirrorlist_server.py couldn't deal with malformed arch=i386%80%E2
style query strings, and was crashing.  mirrorlist_client.wsgi would
then spin forever waiting for the server to respond, which it never
would.  This patch addresses the malformed input.

From 621af2882e984459a23d1e7af3ef1854ea6f0ba1 Mon Sep 17 00:00:00 2001
From: Matt Domsch <Matt_Domsch at dell.com>
Date: Tue, 11 May 2010 22:57:19 -0500
Subject: [PATCH 1/2] mirrorlist_client: sanitize input into UTF-8

Users may put all sorts of odd URL-escaped characters onto the URLs.
When this happens, mirrorlist_server.py child thread would crash
trying to write the byte string with escaped '\x80' characters inside
it into the error report header.  When the child crashes,
mirrorlist_client.wsgi would spin forever eating 100% CPU waiting for
a response from the server that would never come.

This patch sanitizes the input query args into a byte string that only
contains UTF-8 characters.  This fixes this particular source of crash
in the server and subsequent hang of the client.  It would still be
good if the client could time out or otherwise recognize when the
server is never coming back.
---
 mirrorlist-server/mirrorlist_client.wsgi |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/mirrorlist-server/mirrorlist_client.wsgi b/mirrorlist-server/mirrorlist_client.wsgi
index 9dbdf7a..dd6d1d9 100755
--- a/mirrorlist-server/mirrorlist_client.wsgi
+++ b/mirrorlist-server/mirrorlist_client.wsgi
@@ -83,6 +83,12 @@ def request_setup(request):
         pathinfo = request.environ['PATH_INFO']
     if scriptname ==  '/metalink' or pathinfo == '/metalink':
         d['metalink'] = True
+
+    for k, v in d.iteritems():
+        try:
+            d[k] = unicode(v, 'utf8', 'ignore').encode('utf8')
+        except:
+            pass
     return d
 
 def accept_encoding_gzip(request):
-- 
1.7.0.1
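
To show what the sanitizing step does to the malformed input, here is a
small sketch for current Python 3 (the patch itself targets Python 2,
where unicode(v, 'utf8', 'ignore').encode('utf8') plays the same role).
The helper names are made up for illustration.

```python
from urllib.parse import unquote_to_bytes

def sanitize_utf8(raw):
    """Drop any bytes that do not form valid UTF-8; return UTF-8 bytes."""
    return raw.decode('utf-8', 'ignore').encode('utf-8')

def sanitize_args(d):
    """Return a copy of d with byte-string values sanitized.

    Non-bytes values (such as the metalink flag) are passed through
    untouched.
    """
    clean = {}
    for k, v in d.items():
        clean[k] = sanitize_utf8(v) if isinstance(v, bytes) else v
    return clean

if __name__ == '__main__':
    raw = unquote_to_bytes('i386%80%E2')   # the arch value from the bad query
    print(repr(raw), '->', repr(sanitize_utf8(raw)))
```

The stray 0x80 and the truncated 0xE2 lead byte are dropped, leaving
plain i386.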


The second patch addresses the lack of a timeout:

From 4976394a2f988843baf1d3e490dcc5dd3e74b1ea Mon Sep 17 00:00:00 2001
From: Matt Domsch <Matt_Domsch at dell.com>
Date: Tue, 11 May 2010 23:14:15 -0500
Subject: [PATCH 2/2] mirrorlist_client: add 60sec timeout to reading from the server

---
 mirrorlist-server/mirrorlist_client.wsgi |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/mirrorlist-server/mirrorlist_client.wsgi b/mirrorlist-server/mirrorlist_client.wsgi
index dd6d1d9..cc4416c 100755
--- a/mirrorlist-server/mirrorlist_client.wsgi
+++ b/mirrorlist-server/mirrorlist_client.wsgi
@@ -10,8 +10,10 @@ from string import zfill, atoi, strip, replace
 from paste.wsgiwrappers import *
 import gzip
 import cStringIO
+from datetime import datetime, timedelta
 
 socketfile = '/var/run/mirrormanager/mirrorlist_server.sock'
+request_timeout = 60 # seconds
 
 def get_mirrorlist(d):
     try:
@@ -37,9 +39,10 @@ def get_mirrorlist(d):
         readlen = len(resultsize)
     resultsize = atoi(resultsize)
     
+    expiry = datetime.utcnow() + timedelta(seconds=request_timeout)
     readlen = 0
     p = ''
-    while readlen < resultsize:
+    while readlen < resultsize and datetime.utcnow() < expiry:
         p += s.recv(resultsize - readlen)
         readlen = len(p)
         
-- 
1.7.0.1
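
A possible follow-up to the read loop above, sketched for current
Python 3 and not part of either patch: recv() returns an empty string
once the peer has closed the socket, so a loop that only checks the
deadline can still spin until the deadline passes.  The variant below
stops as soon as the peer goes away and also bounds each recv() with a
socket timeout.  recv_exact is a hypothetical name.

```python
import socket
import time

def recv_exact(sock, size, timeout=60.0):
    """Read exactly `size` bytes, or raise if the peer closes or time runs out."""
    deadline = time.monotonic() + timeout
    chunks = []
    received = 0
    while received < size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise socket.timeout('no complete reply within %.1f s' % timeout)
        sock.settimeout(remaining)        # bound this single recv() call
        chunk = sock.recv(size - received)
        if not chunk:                     # empty read: the peer closed the socket
            raise ConnectionError('peer closed after %d of %d bytes'
                                  % (received, size))
        chunks.append(chunk)
        received += len(chunk)
    return b''.join(chunks)
```

A caller would catch both exceptions and return an error response
instead of waiting on a server child that has already died.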



I plan to hotfix both mirrorlist_client.wsgi and mirrorlist_server.py
on all the app servers with the patches described above.

+1?

Thanks,
Matt

-- 
Matt Domsch
Technology Strategist
Dell | Office of the CTO