qa machine management

Tim Flink tflink at redhat.com
Mon Apr 2 19:39:58 UTC 2012


On Mon, 2 Apr 2012 08:22:39 -0600
Kevin Fenzi <kevin at scrye.com> wrote:

> On Sat, 31 Mar 2012 18:45:49 -0600
> Stephen John Smoogen <smooge at gmail.com> wrote:
> 
> > On 31 March 2012 15:29, seth vidal <skvidal at fedoraproject.org>
> > wrote:
> > > On Sat, 31 Mar 2012 14:25:35 -0600
> > > Stephen John Smoogen <smooge at gmail.com> wrote:
> > >
> > 
> > > One concern I have with bcfg2 is lack of momentum, since for all
> > > intents and purposes it is just puppet but in python.
> > >
> > 
> > Well I am more worried about xml versus playbooks. In any case I
> > think I will go with ansible .. will see how much I can learn while
> > on Percocet (hey it's a QA environment right before release.. how bad
> > could it be :)?)
> > 
> > > One of the reasons I've been looking so hard at ansible is simple
> > > - it doesn't require a client-side. It's all push-based. From a
> > > logging and quietness-standpoint it should be significantly better
> > > especially for our environment where if a host cannot reach
> > > lockbox01 we know we cannot do anything else.
> 
> well, if QA folks are willing to give that a try, sounds reasonable to
> me. ;) 
> 
> I'd suggest we leave the autoqa machines alone until after release,
> but instead look at the other not very used ones in the list to try
> things on. 

Either that or limit work to staging (autoqa-stg01 and associated
clients). We're not really doing much development work right now since
F17 testing is in full swing, so it doesn't matter much if staging goes
down or has issues.

> Perhaps Tim can chime in here and explain the kinds of things they
> are doing now that they would like to not have to do once things are
> automated...

In my mind, there are two types of things that would be nice to
automate:
 - Server configuration
 - Client configuration

The server configuration is relatively static for now, so automating it
would mostly be for disaster recovery and configuration backup.
If/when we start doing more functional self-testing and re-deployment of
AutoQA, this would be very helpful, but we aren't doing that at the
moment.

The client configuration is what we're more interested in, both now and
going forward (disposable single-use clients will happen at some
point). It comes down to configuring the AutoQA yum repo, installing
some packages and manipulating a few files. I don't have a link for the
documentation offhand but can find or create it.
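
To make the repo-configuration step a bit more concrete, here is a
minimal sketch of generating a yum .repo file from Python. The repo id,
name and URL in the usage note are placeholders and not the real AutoQA
repo details, and render_repo_file is a hypothetical helper. Installing
the packages and editing the other files is left out because those
details are not listed here.

```python
import configparser
import io


def render_repo_file(repo_id, name, baseurl, gpgcheck=False):
    """Return the text of a yum .repo file with one repository section.

    All arguments are supplied by the caller; nothing here knows the
    real AutoQA repo location.
    """
    # Disable interpolation so that '%' in a URL is written literally.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names exactly as given instead of lowercasing them.
    parser.optionxform = str
    parser[repo_id] = {
        "name": name,
        "baseurl": baseurl,
        "enabled": "1",
        "gpgcheck": "1" if gpgcheck else "0",
    }
    buf = io.StringIO()
    # Write 'key=value' without spaces, the usual style for .repo files.
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()
```

Something like render_repo_file("autoqa", "AutoQA (placeholder)",
"http://repo.example.invalid/autoqa/$releasever/$basearch/") returns
text that could be written under /etc/yum.repos.d/ on each client,
whichever tool ends up pushing it out.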

Another thing that we're looking to do is automate the maintenance and
monitoring of the clients. I'm not sure how applicable this is to the
current discussion and I think this will mostly involve some scripting
on our end. We occasionally have problems with the clients running out
of disk space and failing tests. We'd like to have an automated method
for cleaning up and updating all of the test clients and possibly some
monitoring so that we are notified of low disk space before the tests
start failing.
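
As a rough sketch of the low-disk-space check, assuming a script that
runs on each client and a 10% free-space threshold (both are
assumptions for illustration, not something we have in place):

```python
import shutil


def free_fraction(path="/"):
    """Return the fraction of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Some pseudo-filesystems report a total of zero; treat as full.
        return 0.0
    return usage.free / usage.total


def is_low_on_disk(path="/", min_free_fraction=0.10):
    """Return True when free space is below the given threshold.

    The 10% default is an arbitrary example value.
    """
    return free_fraction(path) < min_free_fraction
```

A cron job or monitoring hook could call is_low_on_disk("/") and send a
notification when it returns True, ideally well before the tests start
failing.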

The only snag is that the cleanup would have to run while no test runs
are active. If done at the wrong time, the maintenance could cause
random and difficult-to-diagnose test failures. We don't currently do
this, but my thought is to schedule regular downtime during which
updates and maintenance could be done.
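
The "only when idle" rule could be sketched roughly as below. Both
count_active_runs and the task callables are hypothetical hooks; how to
actually ask AutoQA for the number of active test runs is not covered
here.

```python
def run_maintenance_if_idle(count_active_runs, tasks):
    """Run each maintenance task only if no test runs are active.

    count_active_runs: callable returning the number of active test
        runs (hypothetical; its implementation is deployment specific).
    tasks: iterable of zero-argument callables, e.g. cleanup or update
        steps.

    Returns True if the tasks ran, False if they were skipped.
    """
    if count_active_runs() > 0:
        # Tests are in progress; touching the client now risks
        # spurious failures, so skip this round.
        return False
    for task in tasks:
        task()
    return True
```

Note that this check alone does not stop a new test run from starting
while the tasks are in progress, which is one reason a scheduled
downtime window is still attractive.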

Let me know if there is anything you'd like to see expanded or
clarified. F17 beta testing is pretty much consuming all of my
available sanity and time (I should have waited to ask nirik about all
this) but I will do my best to keep up with the discussion.

Tim