Post Move Stuff

Mike McGrath mmcgrath at redhat.com
Wed Dec 16 04:08:01 UTC 2009


So the move itself is over, but there is still a lot of work to be done.
At the moment some of our normally redundant services (koji, vpn) aren't
redundant.  Also, I'm pretty sure puppet is still failing on some hosts,
but we can fix that at our own pace.

Also just a note, smooge and I are likely going to be in recovery mode the
next couple of days.  So if something seems broken please open a ticket.

So what is left?

 - We started renaming everything and need to finish that (involves
   renames, re-keying things, etc.)
 - New network map.  We went from essentially having a single network to
   having 3 networks:
   - A build network
   - A storage network
   - A public network
 - Training on one of our new server types, bxen*.  These hosts are
   dedicated to build and releng activities.  This was done for a couple
   of reasons, most of which are organizational.  It will also allow us to
   more easily predict growth needs for the buildsystem in the future.
 - Figure out what to do about proxy servers in PHX2.  We can go a load
   balancer route or a heartbeat route.  I'm not totally convinced we
   need two proxy servers in PHX2 like we had in PHX1, but because of the
   way network routing still works we'll have to figure out something
   HA.
 - QA - The new QA boxes are ready to be configured; I'll be working with
   jlaska on this.  It's the first set of boxes that is hosted by
   Infrastructure but not really run by Infrastructure.  This is similar
   to how the releng boxes work, but the QA team is less close to
   Infrastructure than release engineering is.  This will involve
   training and some new policies.
 - Host certification - this is something I've been working on but have
   not enacted yet.  It's mostly a solid look over everything based on a
   recent CSI doc.
	http://infrastructure.fedoraproject.org/csi/host-lifecycle-policy/en-US/html-single/#HostLifecycle-Host-Recertification
   In some organizations the certification process will help bring
   about accountability.  For us it's more about knowing what's going on.
   It's not like someone will be in trouble if they accidentally certify
   a box wrong, but in our case a second pair of eyes will help.  Even in
   the first trial run I did with smooge he discovered something I missed.

There are also a lot of little things to do, especially verifying
things like IPTables and monitoring.

	-Mike



