builders of the future!!!!!

Seth Vidal skvidal at fedoraproject.org
Tue Jul 24 18:34:40 UTC 2012




On Wed, 21 Mar 2012, Kevin Fenzi wrote:

>
>>> I'd agree collectd off probably. Or at least a separate one if we
>>> needed to monitor them.
>>
>> I'm not sure what benefit we get from collectd on transient builders,
>> though.
>>
>> On our long-running hosts I understand but not on the builders.
>
> Yeah, the only case I can see is so we could see how loaded they are...
> and we might have better ways to tell that.
>
>>> Yeah, we could hopefully have another network that's larger than /24
>>> for the arm builders.
>>
>> I can imagine various network changes should easily allow us to
>> allocate larger than a /24 to the internal build network.
>
> Yeah.
>
>>> I'm sure some of this will be a process of 'oh no, what we have now
>>> doesn't scale, let's fix it'. Of course some of it we can get ready
>>> for up front too.
>>
>> yay for planning! :)
>>
>>
>>> Overall I like the idea of the automated builder re-install and
>>> think it will get us more ready for things like a large arm
>>> cluster.
>>
>> Then I will get crackin' on making it work.
> Sounds good.

I wanted to come back around to this discussion to close it out, as we 
are most of the way to completion here:

In the last few weeks I've set up a system that deploys a new builder, 
provisions it and gets it ready in a single command.

It's in the builder git repository. This repo is on lockbox but it is only 
accessible to sysadmin-main and sysadmin-releng.

I've posted a site-specific sanitized version of the script I'm using 
here:
http://fedorapeople.org/cgit/skvidal/public_git/scripts.git/tree/ansible/start-prov-boot.py

and I'll be happy to post the playbooks I'm using to provision these 
hosts.

The repo is restricted b/c it contains some certs/ssl keys that we aren't 
going to give away to everyone :)

The process for reinstalling a host is incredibly trivial: we built all 
the hosts for the latest mass rebuild using that process. It takes a 
single command and you walk away (other than enabling the builder in 
koji).

The next step is to put this process into a cron job so we, ideally, can 
reinstall a certain percentage of our builders at any/all times.
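
As a rough sketch of what the selection step of such a cron job could look 
like (this is not the script in the repo; the hostnames, the 20% figure 
and the function name are all made up for illustration, and the actual 
reinstall call is left out):

```python
import math
import random


def pick_builders_to_reinstall(builders, fraction, rng=None):
    """Return a random subset covering roughly `fraction` of `builders`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    # Round up so any non-zero fraction selects at least one host.
    count = min(len(builders), math.ceil(len(builders) * fraction))
    return sorted(rng.sample(list(builders), count))


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    hosts = ["builder%02d.example.org" % i for i in range(1, 11)]
    for host in pick_builders_to_reinstall(hosts, 0.2):
        # A real cron job would run the provisioning command here;
        # this sketch only prints the selection.
        print("would reinstall", host)
```

Picking at random each run means every builder gets reinstalled 
eventually without keeping any state between runs; a round-robin list 
would work just as well if predictable ordering matters.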

We're using ansible for all of the command/control and it has been 
remarkably stable for our use case. It does require ssh keys on the hosts 
but we have that set via kickstarts now for the builders.

After some discussion we took the step of removing FAS and all Fedora 
accounts from the builders. We couldn't come up with a compelling reason 
to keep these throw-away hosts coupled to FAS: the only folks connecting 
to them were sysadmin-main/releng, so it was a waste of time to set up 
and keep the FAS db on the hosts current. Furthermore, it was an 
additional risk that a rogue package could try to snatch up our FAS db 
and crack the passwords.

If anyone has any questions about how this works or would like any piece 
of the infrastructure for doing it (other than the certs/keys :)) please 
email this list and ask.

-sv


