P2P Packaging/Koji Cloud

seth vidal skvidal at fedoraproject.org
Wed Dec 7 15:36:26 UTC 2011

On Wed, 7 Dec 2011 14:46:18 +0100
Denis Arnaud <denis.arnaud_fedora at m4x.org> wrote:

> Hello,
> RedHat-hosted Koji servers offer an invaluable service by allowing
> all of us, package maintainers, to build all of "our" Fedora
> packages. I guess that infrastructure is not free of cost for
> RedHat, and the quality of service is great (for instance, the
> wait in the queues, before Koji actually builds the packages
> submitted via the command-line client, is not so long).
> As Fedora is pretty advanced in the cloud/virtualisation arena, we
> could imagine a "Koji Cloud", hosted on VMs offered by volunteers.
> For instance, I could contribute a few VMs in Europe (hosted on
> http://www.ovh.co.uk/). Our Cloud SIG
> (https://fedoraproject.org/wiki/Cloud_SIG) and/or Virt ML
> ( https://admin.fedoraproject.org/mailman/listinfo/virt and
> https://fedoraproject.org/wiki/Getting_started_with_virtualization)/RedHat
> ET (http://et.redhat.com/) colleagues could help design and
> implement the following infrastructure:
>  * VM template/images, ready to be started on the volunteer's servers
> everywhere in the world, 24x7.
>     - SSH public keys of Koji administrators would be part of the
> images, so that they can have easy access to them, just in case.
>     - Those VMs would update themselves automatically.
>     - The containers could be standardised as well. For instance,
> ProxMox/OpenVZ or Fedora/CentOS with libvirt.
>  * A directory (LDAP, or something less centralised, like the address
> book of Skype, for instance), keeping track of all those VMs:
>     - with the corresponding last known status;
>     - with the VM configurations (Fedora/CentOS release, CPU, memory,
> disk usage, etc);
>     - with some rating corresponding to their quality of service
> (build duration, reliability of the VM, MTBF, etc).
>  * A dispatcher system:
>     - which would route the Koji build requests to available VMs;
>     - collect the outcome of the builds (logs, RPM packages,
> statistics, QoS, etc) and store them in the current ("centralised")
> Koji infrastructure.
> As I am not a specialist of all those technologies, I may have
> forgotten a lot of things, but you get the idea.
> Doesn't it sound great? Does it sound realisable? Am I crazy to dream
> of such an infrastructure?

I've looked into spawning virt instances to do the building, and it is
pretty doable. The problems with them being offered by volunteers are:

1. How do we trust that the initial installation hasn't been poisoned,
unless we ship all the bits over ourselves?
2. How do we trust that the in-flight build isn't tampered with?
3. How do the people providing the VMs insure against
tainted/dangerous builds doing $bad_things on their systems?

This is why I concluded that the idea of donated/volunteered VMs was not
going to work. In addition, the bandwidth requirements are
non-trivial for many builds.

However, building in environments where we have a contractual (read:
financial) relationship, and where the remote end has protected itself
against attacks from the VMs, will work better. I'm speaking of
cloud hosting providers like Amazon EC2 and Rackspace Cloud Servers.

I've worked on some code to spawn off an instance, submit jobs and
packages, build them (as a chain-build, so you don't have to keep
respawning instances), then collect all the results back to your local
machine. It works. It requires setting up trusted images at those
cloud providers, but that's not very hard to do or to keep current. Right
now I'm porting the code to use a different cloud-communication API
than the one I was using before.
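
To make that workflow concrete, here is a minimal sketch of the
spawn / chain-build / collect loop. It is not the code mentioned above
and does not use any real cloud API: BuildInstance, chain_build and
FakeInstance are hypothetical names invented for illustration, and the
fake instance only pretends to build so the example runs without a
cloud account.

```python
from typing import Callable, Protocol


class BuildInstance(Protocol):
    """Hypothetical handle to one cloud VM booted from a trusted image."""

    def build(self, srpm: str) -> list[str]:
        """Build one source package; return the names of the RPMs produced."""

    def fetch(self, rpm: str) -> bytes:
        """Copy one built RPM back to the local machine."""

    def terminate(self) -> None:
        """Shut the instance down."""


def chain_build(spawn: Callable[[], BuildInstance], srpms: list[str]) -> dict[str, bytes]:
    """Build srpms in order on a single instance and collect every result."""
    instance = spawn()  # one instance for the whole chain, no respawning
    results: dict[str, bytes] = {}
    try:
        for srpm in srpms:  # order matters: later builds may need earlier ones
            for rpm in instance.build(srpm):
                results[rpm] = instance.fetch(rpm)
    finally:
        instance.terminate()  # always release the instance, even on failure
    return results


class FakeInstance:
    """In-memory stand-in (an assumption for this sketch, not a real backend)."""

    def __init__(self) -> None:
        self.built: list[str] = []
        self.alive = True

    def build(self, srpm: str) -> list[str]:
        self.built.append(srpm)
        return [srpm.replace(".src.rpm", ".x86_64.rpm")]

    def fetch(self, rpm: str) -> bytes:
        return f"contents of {rpm}".encode()

    def terminate(self) -> None:
        self.alive = False


if __name__ == "__main__":
    fake = FakeInstance()
    rpms = chain_build(lambda: fake, ["libfoo-1.0-1.src.rpm", "bar-2.0-1.src.rpm"])
    print(sorted(rpms))
```

Keeping the provider behind a small interface like this is one way to
swap the cloud-communication layer without touching the build loop.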

The problems with bandwidth consumption, and to some extent with
trust, still persist, but the trust problem is mitigated because the
relationship with the provider is more standardized and less haphazard.

I have a couple of systems inside the Red Hat colo that I had planned
on reinstalling with F16 and setting up OpenStack on, to play with the
same idea but on a local cloud instance.

Is all this in line with the problems you've thought about?

