awstats runs
by Stephen John Smoogen
Looking over the configurations of awstats on log01, we copy the data
over once per day from each of the servers and then run awstats on the
server. However, we also seem to run the program against that once-a-day
data every hour. Am I seeing this correctly, or is there something I'm
missing? I am just trying to figure out why the box is always running
bzcat and awstats :)
--
Stephen J Smoogen.
“The core skill of innovators is error recovery, not failure avoidance.”
Randy Nelson, President of Pixar University.
"We have a strategic plan. It's called doing things."
— Herb Kelleher, founder Southwest Airlines
change request, create maps for ppc64
by Dennis Gilmore
I want to apply the following patch to make ppc64 a valid arch.
I noticed that there were no el6 maps for ppc64. For el-6 we are shipping
ppc64, i386, and x86_64; this is because RHEL switched the base userland
from 32-bit to 64-bit.
Dennis
diff --git a/modules/maps/files/parse9.pl b/modules/maps/files/parse9.pl
index 5d35e01..2cfd65d 100755
--- a/modules/maps/files/parse9.pl
+++ b/modules/maps/files/parse9.pl
@@ -47,6 +47,8 @@ sub valid_arch {
return 1;
} elsif("$arch" eq "ppc") {
return 1;
+ } elsif("$arch" eq "ppc64") {
+ return 1;
} elsif("$arch" eq "ia64") {
return 1;
} elsif("$arch" eq "sparc") {
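The elsif chain the patch extends is effectively a whitelist membership
test. As a minimal sketch of the same idea in Python (the arch names are
just the ones visible in the hunk; the full list in parse9.pl has more):

```python
# Hypothetical sketch: the valid_arch() elsif chain as a set lookup.
VALID_ARCHES = {"ppc", "ppc64", "ia64", "sparc"}

def valid_arch(arch):
    """Return True if `arch` is a recognized architecture."""
    return arch in VALID_ARCHES
```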
Change Request: Docs sync removal
by Mike McGrath
There's a cron job running on the app servers that shouldn't be, and it's
eating up lots of I/O time. The docsSync script should be running only on
bapp1; it looks like it made it onto the rest of the app servers at some
point and was removed in puppet but not removed from the servers.
Can I get 2 +1's to remove this cron job?
-Mike
[PATCH] avoid \xZZ in mirrorlist urls
by Matt Domsch
We are getting some mirrorlist requests with escape characters in them
such as \xe2 . While I've taken steps to deal with these in the
mirrorlist code, at least one client makes such a request hourly, and
they are causing the mirrorlist WSGI process to spin. I can't
recreate the failure, even using the same request URI, and the fixes
I've tried haven't avoided them all.
I'd like to block such requests at the proxy, to prevent them from
making it all the way to MM. It's a hack, but I'm at a loss for
another solution right now.
diff --git a/modules/mirrormanager/templates/mirrormanager-mirrorlist.conf.erb b/modules/mirrormanager/templates/mirrormanager-mirrorlist.conf.erb
index e52c926..95792fe 100644
--- a/modules/mirrormanager/templates/mirrormanager-mirrorlist.conf.erb
+++ b/modules/mirrormanager/templates/mirrormanager-mirrorlist.conf.erb
@@ -17,6 +17,10 @@ RewriteEngine On
RewriteCond %{QUERY_STRING} repo=epel-5&arch=\$basea\$
RewriteRule ^/mirrorlist - [F]
# END hack
+# BEGIN hack for escaped chars
+RewriteCond %{QUERY_STRING} \\x
+RewriteRule ^/(mirrorlist|metalink) - [F]
+# END hack
RewriteRule ^/publiclist(.*) <%= proxyurl %>/publiclist/$1 [P,L]
RewriteRule ^/mirrorlist(.*) <%= proxyurl %>/mirrorlist$1 [P,L]
RewriteRule ^/metalink(.*) <%= proxyurl %>/metalink$1 [P,L]
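The RewriteCond above matches any query string containing a literal
backslash-x sequence and forbids the request. A minimal sketch of the same
check in Python (the helper name is illustrative, not part of
MirrorManager):

```python
import re

# Matches a literal backslash followed by 'x', as in the malformed
# requests containing sequences like \xe2.
_ESCAPED = re.compile(r"\\x")

def has_escaped_chars(query_string):
    """Return True if the query string contains a literal \\x sequence."""
    return _ESCAPED.search(query_string) is not None
```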
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
Change request - remove email2trac from hosted
by Jesse Keating
Currently the only consumers of the email2trac setup are pungi and rel-eng.
At one time I had somehow disabled email2trac for rel-eng due to the
spam, although I can't remember how I disabled it. Unfortunately
something changed in the past week or two that re-enabled it, and a bunch
more spam came through. I'd like to just remove email2trac from our
hosted environment altogether.
--
Jesse Keating
Fedora -- Freedom² is a feature!
identi.ca: http://identi.ca/jkeating
[PATCH] mirrorlist_client.wsgi use select() waiting for server
by Matt Domsch
From 7cd05b296ab426c386e99c3ff6f7143fbf6ed052 Mon Sep 17 00:00:00 2001
From: Matt Domsch <Matt_Domsch(a)dell.com>
Date: Wed, 12 May 2010 08:59:16 -0500
Subject: [PATCH] mirrorlist_client: use select() waiting on the response from mirrorlist_server
The client was spinning, waiting for read() to complete, while the
server was doing its thinking. Instead, use select() to sleep
until the server has data to return. This should reduce CPU time
spent in the client considerably.
---
mirrorlist-server/mirrorlist_client.wsgi | 15 ++++++++++-----
1 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/mirrorlist-server/mirrorlist_client.wsgi b/mirrorlist-server/mirrorlist_client.wsgi
index cc4416c..15b3a15 100755
--- a/mirrorlist-server/mirrorlist_client.wsgi
+++ b/mirrorlist-server/mirrorlist_client.wsgi
@@ -4,7 +4,7 @@
# by Matt Domsch <Matt_Domsch(a)dell.com>
# Licensed under the MIT/X11 license
-import socket
+import socket, select
import cPickle as pickle
from string import zfill, atoi, strip, replace
from paste.wsgiwrappers import *
@@ -32,24 +32,29 @@ def get_mirrorlist(d):
s.shutdown(socket.SHUT_WR)
del p
+ # wait for other end to start writing
+ expiry = datetime.utcnow() + timedelta(seconds=request_timeout)
+ rlist, wlist, xlist = select.select([s],[],[],request_timeout)
+ if len(rlist) == 0:
+ s.shutdown(socket.SHUT_RD)
+ raise socket.timeout
+
readlen = 0
resultsize = ''
while readlen < 10:
resultsize += s.recv(10 - readlen)
readlen = len(resultsize)
resultsize = atoi(resultsize)
-
- expiry = datetime.utcnow() + timedelta(seconds=request_timeout)
+
readlen = 0
p = ''
while readlen < resultsize and datetime.utcnow() < expiry:
p += s.recv(resultsize - readlen)
readlen = len(p)
-
- s.shutdown(socket.SHUT_RD)
results = pickle.loads(p)
del p
+ s.shutdown(socket.SHUT_RD)
return results
def real_client_ip(xforwardedfor):
--
1.7.0.1
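The core of the change — sleeping in select() until the peer has data,
instead of spinning on recv() — can be shown standalone with a socket pair
(an illustrative sketch, not the MirrorManager code):

```python
import select
import socket

# A connected pair stands in for the client/server Unix socket.
client, server = socket.socketpair()

# Nothing written yet: select() sleeps up to the timeout (no CPU burned
# in a recv() loop) and returns an empty ready list.
rlist, _, _ = select.select([client], [], [], 0.1)
assert rlist == []

# Once the server writes, select() returns with the socket ready to read.
server.sendall(b"0000000042")
rlist, _, _ = select.select([client], [], [], 5)
client.close()
server.close()
```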
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
Self-Introduction: ruigo
by Rui Gouveia
Hi,
my name is Rui Gouveia and I live in Porto/Portugal.
My Fedora Account System (FAS) username is ruigo, and my IRC nick is ruigo.
I'm one of the co-coordinators of the Fedora translation team for pt_PT
since Fedora 10.
Professionally, I have been a sysadmin since 2002, skills that I hope to
put to use for the benefit of the Fedora Project.
A couple of goals I have for the Fedora Project are to increase the user
base in this city and to recruit help for Fedora projects.
For now, I'll be glad just to sit back and learn how you guys work.
Please help me get started!
Thanks for your time.
--
Rui Gouveia
--
Free Software is not just software. It is also a
philosophy of life. Learn more about this subject at
http://www.gnu.org/philosophy/ and free yourself.
http://www.google.com/reader/shared/05174907382601741850
[PATCH] mirrorlist_client timeouts
by Matt Domsch
From 1aa19dfed950c209ad5a2ddf48e2b828b50c07ee Mon Sep 17 00:00:00 2001
From: Matt Domsch <Matt_Domsch(a)dell.com>
Date: Wed, 12 May 2010 13:52:51 -0500
Subject: [PATCH] mirrorlist_client: a better way to handle socket timeouts
Blocking sockets calling recv() may block forever if the server end
doesn't send anything for some reason. Don't let that happen. Python
has a socket.settimeout() capability. We'll use that to let any
individual operation (except the select()) take up to 5 seconds (they
should all be in the microsecond range, so this is very generous), and
let select() continue to wait for 60 seconds for the server to respond
at all.
If a timeout happens, an exception is raised, which is caught by the
caller, and an HTTP 503 is returned to the web client.
---
mirrorlist-server/mirrorlist_client.wsgi | 17 ++++++++---------
1 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/mirrorlist-server/mirrorlist_client.wsgi b/mirrorlist-server/mirrorlist_client.wsgi
index 15b3a15..3508f19 100755
--- a/mirrorlist-server/mirrorlist_client.wsgi
+++ b/mirrorlist-server/mirrorlist_client.wsgi
@@ -13,14 +13,14 @@ import cStringIO
from datetime import datetime, timedelta
socketfile = '/var/run/mirrormanager/mirrorlist_server.sock'
-request_timeout = 60 # seconds
+select_timeout = 60 # seconds
+timeout = 5 # seconds
def get_mirrorlist(d):
- try:
- s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
- s.connect(socketfile)
- except:
- raise
+ # any exceptions or timeouts raised here get handled by the caller
+ s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
+ s.settimeout(timeout)
+ s.connect(socketfile)
p = pickle.dumps(d)
del d
@@ -33,8 +33,7 @@ def get_mirrorlist(d):
del p
# wait for other end to start writing
- expiry = datetime.utcnow() + timedelta(seconds=request_timeout)
- rlist, wlist, xlist = select.select([s],[],[],request_timeout)
+ rlist, wlist, xlist = select.select([s],[],[],select_timeout)
if len(rlist) == 0:
s.shutdown(socket.SHUT_RD)
raise socket.timeout
@@ -48,7 +47,7 @@ def get_mirrorlist(d):
readlen = 0
p = ''
- while readlen < resultsize and datetime.utcnow() < expiry:
+ while readlen < resultsize:
p += s.recv(resultsize - readlen)
readlen = len(p)
results = pickle.loads(p)
--
1.7.0.1
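The settimeout() behavior the patch relies on can be demonstrated in
isolation (a sketch with a socket pair, not the actual client):

```python
import socket

client, server = socket.socketpair()
client.settimeout(0.1)  # each blocking operation may take at most 0.1s here

# The server never writes, so recv() raises socket.timeout instead of
# blocking forever — which the WSGI caller can turn into an HTTP 503.
try:
    client.recv(10)
    timed_out = False
except socket.timeout:
    timed_out = True

client.close()
server.close()
```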
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO
[PATCH] haproxy & mirrorlist processes
by Matt Domsch
The mirrorlists are falling over - haproxy keeps marking app servers
as down, and some requests are getting HTTP 503 Server Temporarily
Unavailable responses. This happens every 10 minutes, for 2-3
minutes, as several thousand EC2 instances request the mirrorlist
again.
For reference, we're seeing a spike of over 2000 simultaneous requests
across our 6 proxy and 4 app servers, occurring every 10 minutes,
dropping back down to under 20 simultaneous requests in between.
Trying out several things.
1) increase number of mirrorlist WSGI processes on each app server
from 45 to 100. This is the maximum number of simultaneous
mirrorlist requests that each server can serve. I've tried this
value on app01, and running this many still keeps the
mirrorlist_server back end (which fork()s on each connection)
humming right along. I think this is safe. Increasing much beyond
this though, the app servers will start to swap, which we must
avoid. We can watch the swapping, and if it starts, lower this
value somewhat. The value was 6 just a few days ago, which wasn't
working either.
This gives us 400 slots to work with on the app servers.
2) try limiting the number of connections from each proxy server to
each app server, to 25 per. Right now we're seeing a max of
between 60 and 135 simultaneous requests from each proxy server to
each app server. All those over 25 will get queued by haproxy and
then served as app server instances become available. I did this
on proxy03, and it really helped out the app servers and kept them
humming. There were still some longish response times (some >30
seconds).
We're still oversubscribing app server slots here, though oddly
not by as much as you'd think, as proxy03 is itself taking 40% of
the incoming requests for some reason.
3) bump the haproxy timeout up to 60 seconds. 5 seconds (the global
default) is way too low when we get the spikes. This was causing
haproxy to think app servers were down, and start sending load to
the other app servers, which would then overload, and then start
sending to the first backup server, ... Let's be nicer. If during
a spike it takes 60 seconds to get an answer, or be told HTTP 503,
so be it.
4) have haproxy use all the backup servers when all the app servers
are marked down. Right now it sends all the requests to a single
backup server, and if that's down, all to the next backup server,
etc. We know one server can't handle the load (even 4 aren't
really), so don't overload a single backup either.
5) the default mirrorlist_server listen backlog is only 5, meaning
that at most 5 WSGI clients get queued up if all the children are
busy. To handle spikes, bump that to 300 (though it's limited by
the kernel to 128 by default). This was the intent, but the code was buggy.
6) bug fix to mirrorlist_server to not ignore SIGCHLD. Amazing that this
ever worked in the first place. This should resolve the problem
where mirrorlist_server slows down and memory grows over time.
diff --git a/modules/haproxy/files/haproxy.cfg b/modules/haproxy/files/haproxy.cfg
index 6e538ed..5a6fda0 100644
--- a/modules/haproxy/files/haproxy.cfg
+++ b/modules/haproxy/files/haproxy.cfg
@@ -43,15 +43,17 @@ listen fp-wiki 0.0.0.0:10001
listen mirror-lists 0.0.0.0:10002
balance hdr(appserver)
- server app1 app1:80 check inter 5s rise 2 fall 3
- server app2 app2:80 check inter 5s rise 2 fall 3
- server app3 app3:80 check inter 5s rise 2 fall 3
- server app4 app4:80 check inter 5s rise 2 fall 3
- server app5 app5:80 backup check inter 10s rise 2 fall 3
- server app6 app6:80 backup check inter 10s rise 2 fall 3
- server app7 app7:80 check inter 5s rise 2 fall 3
- server bapp1 bapp1:80 backup check inter 5s rise 2 fall 3
+ timeout connect 60s
+ server app1 app1:80 check inter 5s rise 2 fall 3 maxconn 25
+ server app2 app2:80 check inter 5s rise 2 fall 3 maxconn 25
+ server app3 app3:80 check inter 5s rise 2 fall 3 maxconn 25
+ server app4 app4:80 check inter 5s rise 2 fall 3 maxconn 25
+ server app5 app5:80 backup check inter 10s rise 2 fall 3 maxconn 25
+ server app6 app6:80 backup check inter 10s rise 2 fall 3 maxconn 25
+ server app7 app7:80 check inter 5s rise 2 fall 3 maxconn 25
+ server bapp1 bapp1:80 backup check inter 5s rise 2 fall 3 maxconn 25
option httpchk GET /mirrorlist
+ option allbackups
listen pkgdb 0.0.0.0:10003
balance hdr(appserver)
diff --git a/modules/mirrormanager/files/mirrorlist-server.conf b/modules/mirrormanager/files/mirrorlist-server.conf
index fd7cf98..482f7af 100644
--- a/modules/mirrormanager/files/mirrorlist-server.conf
+++ b/modules/mirrormanager/files/mirrorlist-server.conf
@@ -7,7 +7,7 @@ Alias /publiclist /var/lib/mirrormanager/mirrorlists/publiclist/
ExpiresDefault "modification plus 1 hour"
</Directory>
-WSGIDaemonProcess mirrorlist user=apache processes=45 threads=1 display-name=mirrorlist maximum-requests=1000
+WSGIDaemonProcess mirrorlist user=apache processes=100 threads=1 display-name=mirrorlist maximum-requests=1000
WSGIScriptAlias /metalink /usr/share/mirrormanager/mirrorlist-server/mirrorlist_client.wsgi
WSGIScriptAlias /mirrorlist /usr/share/mirrormanager/mirrorlist-server/mirrorlist_client.wsgi
From 45d401446bfecba768fdf4f26409bf291172f7bc Mon Sep 17 00:00:00 2001
From: Matt Domsch <Matt_Domsch(a)dell.com>
Date: Mon, 10 May 2010 15:23:57 -0500
Subject: [PATCH 1/2] mirrorlist_server: set request_queue_size earlier
While the docs say that request_queue_size can be a per-instance
value, in reality it's used during ForkingUnixStreamServer __init__,
meaning it needs to override the default class attribute instead.
Moving this up means that connections aren't blocking after about 5
are already running (default), and mirrorlist_client can now connect
in ~200us like one would expect, rather than seconds or tens of
seconds like we were seeing when lots (say, 40+) clients were
connecting simultaneously.
---
mirrorlist-server/mirrorlist_server.py | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/mirrorlist-server/mirrorlist_server.py b/mirrorlist-server/mirrorlist_server.py
index 8825a1a..2ade357 100755
--- a/mirrorlist-server/mirrorlist_server.py
+++ b/mirrorlist-server/mirrorlist_server.py
@@ -725,6 +725,7 @@ def sighup_handler(signum, frame):
signal.signal(signal.SIGHUP, sighup_handler)
class ForkingUnixStreamServer(ForkingMixIn, UnixStreamServer):
+ request_queue_size = 300
def finish_request(self, request, client_address):
signal.signal(signal.SIGHUP, signal.SIG_IGN)
BaseServer.finish_request(self, request, client_address)
@@ -815,7 +816,6 @@ def main():
signal.signal(signal.SIGHUP, sighup_handler)
signal.signal(signal.SIGCHLD, signal.SIG_IGN)
ss = ForkingUnixStreamServer(socketfile, MirrorlistHandler)
- ss.request_queue_size = 300
ss.serve_forever()
try:
--
1.7.0.1
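The ordering issue this patch fixes — an attribute consumed inside
__init__ can only be overridden at the class level, not on the instance
afterwards — can be sketched generically (toy classes, not the actual
socketserver code):

```python
class BaseServer:
    request_queue_size = 5          # class default, read during __init__

    def __init__(self):
        # The listen() backlog is fixed here, before any
        # post-construction instance assignment can run.
        self.backlog = self.request_queue_size

class TunedServer(BaseServer):
    request_queue_size = 300        # class attribute: visible in __init__

# Overriding on the instance is too late: __init__ already used the default.
late = BaseServer()
late.request_queue_size = 300
assert late.backlog == 5

# Overriding on the class takes effect where it matters.
assert TunedServer().backlog == 300
```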
From d82f20b10c755e5ce40d67ca7ea4a6dba9e37d34 Mon Sep 17 00:00:00 2001
From: Matt Domsch <Matt_Domsch(a)dell.com>
Date: Mon, 10 May 2010 23:56:09 -0500
Subject: [PATCH 2/2] mirrorlist_server: don't ignore SIGCHLD
Amazing that this ever worked in the first place. Ignoring SIGCHLD
causes the parent's active_children list to grow without bound. This
is also probably the cause of our long-term memory size growth. The
parent really needs to catch SIGCHLD in order to do its reaping.
---
mirrorlist-server/mirrorlist_server.py | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)
diff --git a/mirrorlist-server/mirrorlist_server.py b/mirrorlist-server/mirrorlist_server.py
index 2ade357..0de7132 100755
--- a/mirrorlist-server/mirrorlist_server.py
+++ b/mirrorlist-server/mirrorlist_server.py
@@ -814,7 +814,6 @@ def main():
open_geoip_databases()
read_caches()
signal.signal(signal.SIGHUP, sighup_handler)
- signal.signal(signal.SIGCHLD, signal.SIG_IGN)
ss = ForkingUnixStreamServer(socketfile, MirrorlistHandler)
ss.serve_forever()
--
1.7.0.1
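The interaction the patch depends on — with SIGCHLD set to SIG_IGN the
kernel auto-reaps exited children, so the parent's own waitpid()-based
bookkeeping (what ForkingMixIn uses to trim active_children) fails — can
be shown with a small fork demo (a POSIX-only sketch):

```python
import errno
import os
import signal

# With SIGCHLD ignored, exited children are reaped by the kernel
# automatically, so the parent's explicit waitpid() finds nothing to
# collect and fails with ECHILD — and active_children never shrinks.
signal.signal(signal.SIGCHLD, signal.SIG_IGN)
pid = os.fork()
if pid == 0:
    os._exit(0)                     # child exits immediately
try:
    os.waitpid(pid, 0)              # blocks until child exits, then fails
    reaped_by_parent = True
except OSError as e:
    assert e.errno == errno.ECHILD
    reaped_by_parent = False
```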
--
Matt Domsch
Technology Strategist
Dell | Office of the CTO