Building Images for Taskotron Disposable Clients

Kamil Paral kparal at redhat.com
Fri Nov 13 11:29:08 UTC 2015


> As we get closer to putting disposable clients into production, we need
> a way to have updated images for those clients. I don't think this is
> news to anyone since the topic has come up several times before but now
> there's a bit more urgency :)
> 
> In my mind, we have the following requirements:
>  - Produces qcow2 images that work with testcloud
>  - can be run in an automated way
>  - allows adding/changing/customizing packages contained in image
>  - allows arbitrary repos to be specified
> 
> and the following "nice to have" things:
>  - can build branched and rawhide images
>  - builds images from scratch using only things provided by releng
>  - written in python
>  - builds more than qcow2 for some future-proofing

qemu-img can convert between many image formats, so native support in that tool is not that important, I think.
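
To illustrate the conversion step, here is a minimal sketch that wraps `qemu-img convert`. The function names and file names are made up for this example; only the `-f` (input format) and `-O` (output format) options come from qemu-img itself.

```python
import subprocess


def qemu_img_convert_cmd(src, dst, src_format="qcow2", dst_format="raw"):
    """Build the argv for converting a disk image with qemu-img."""
    return ["qemu-img", "convert", "-f", src_format, "-O", dst_format, src, dst]


def convert_image(src, dst, **formats):
    """Run the conversion; raises CalledProcessError if qemu-img fails."""
    subprocess.run(qemu_img_convert_cmd(src, dst, **formats), check=True)
```

So whichever tool we pick only has to produce one format reliably; e.g. `convert_image("f23.qcow2", "f23.raw")` would cover the rest.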

>  - can run well in a VM
> 
> Is there anything that I missed?

The image should be compatible with guestfish, so that we can e.g. copy in some files without rebuilding the image from scratch. That might be useful for e.g. additional ssh keys (we have cloud-init for that at the moment, but if we had trouble with it or needed something it doesn't support, this would be an alternative). I'm not fully sure what the requirements are, but I think guestfish can work with almost anything, including LVM, so unless the tool creates some crazy partition layout, it should work with everything.
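
A rough sketch of what the "copy in some files" case could look like, using `virt-copy-in` from the libguestfs tools (a thin wrapper around guestfish's copy-in). The helper name and the example paths are hypothetical, and the image must not be in use by a running VM while this runs.

```python
import subprocess


def copy_in_cmd(image, local_paths, guest_dir):
    """Build the argv for copying local files into a directory inside an image."""
    if not local_paths:
        raise ValueError("at least one local path is required")
    # virt-copy-in -a disk.img file [file ...] /destination-directory
    return ["virt-copy-in", "-a", image, *local_paths, guest_dir]


def copy_in(image, local_paths, guest_dir):
    """Run the copy; raises CalledProcessError if virt-copy-in fails."""
    subprocess.run(copy_in_cmd(image, local_paths, guest_dir), check=True)
```

For the ssh key case this might be `copy_in("client.qcow2", ["authorized_keys"], "/root/.ssh")`, assuming that directory already exists in the image.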


> 
> As far as I know, we're looking at two options right now:
> taskotron-vmbuilder and imagefactory. I've put together a list of
> the pros and cons that I know of for both tools. Thoughts on which
> direction to take would be appreciated.
> 
> Tim
> 
> 
> taskotron-vmbuilder [1] is a PoC system kparal built around
> virt-builder [2]. Images are specified in a yaml file and instead of
> building those images from scratch "It takes cleanly prepared, digitally
> signed OS templates and customizes them".
> 
> [1] https://bitbucket.org/fedoraqa/taskotron-vmbuilder
> [2] http://libguestfs.org/virt-builder.1.html
> 
> pros:
>  - already does almost everything we need

To be fair, there have been some issues regarding SELinux, and I'm not sure they are sorted out yet. The SELinux contexts of files inside the image were not set properly, and one more reboot with autorelabel was needed. It might be fixed now, or not; I haven't tried for a long time. With anaconda, we're not likely to hit this kind of issue (we'll hit different ones).

>  - fits all requirements
>  - builds quickly
>  - well supported
> 
> cons:
>  - requires blobs which are out of our control
>    * yes, I know who does the work behind virt-builder. My concern
>      isn't with him, it's the general concept that I don't like. This
>      also gets into the fact that we would have pretty much no control
>      over timing of release for the base images.

All the tools required to create that "blob" - or image template (as they call it) - are open source and in Fedora, from what I see, so we can host our own. The virt-builder man page says:
"For serious virt-builder use, you may want to create your own repository of templates."

This is how to create the template:
http://libguestfs.org/virt-builder.1.html#create-the-templates
For new stable Fedora releases, we can do a single install manually and run virt-sysprep on the image to get it ready. For Rawhide, and maybe even for Branched, we might want to prepare a fresh template more often, i.e. automate that. This is how the libguestfs project does it:
https://github.com/libguestfs/libguestfs/blob/master/builder/website/fedora.sh
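
As a rough sketch of the post-install half of that process (this is not what fedora.sh does line by line, just the two steps I'd expect we need): run `virt-sysprep` on the freshly installed disk, then compress it. The function name is made up, and the choice of `xz --best --keep` is an assumption about how we would want to store templates.

```python
import subprocess


def template_prep_cmds(image):
    """Commands to turn a freshly installed disk image into a reusable template."""
    return [
        # Reset machine-specific state (host keys, logs, MAC addresses, ...).
        ["virt-sysprep", "-a", image],
        # Compress for hosting; --keep leaves the uncompressed image in place.
        ["xz", "--best", "--keep", image],
    ]


def prepare_template(image):
    """Run each step in order, stopping at the first failure."""
    for cmd in template_prep_cmds(image):
        subprocess.run(cmd, check=True)
```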

So you might see this as a combination of the imagefactory and virt-builder-style processes. The image is installed clean using anaconda once in a while (but very rarely), and most of the time just the prepared template is adjusted (updated with new packages), because that's much faster.
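
The "adjust the prepared template" step could look roughly like the sketch below. The helper and its defaults are hypothetical; the virt-builder options used (`--format`, `--output`, `--update`, `--install`, `--selinux-relabel`) are real, but I haven't verified that `--selinux-relabel` fully takes care of the labeling problem mentioned earlier.

```python
def virt_builder_cmd(os_version, output, packages=(), update=True,
                     selinux_relabel=True, fmt="qcow2"):
    """Build the argv for customizing a template with virt-builder."""
    cmd = ["virt-builder", os_version, "--format", fmt, "--output", output]
    if update:
        cmd.append("--update")  # update all packages in the template
    if packages:
        cmd += ["--install", ",".join(packages)]
    if selinux_relabel:
        # Kept last so it runs after the package changes.
        cmd.append("--selinux-relabel")
    return cmd
```

For example, `virt_builder_cmd("fedora-23", "client.qcow2", packages=["git", "python-pip"])` gives a command line that can be passed to `subprocess.run(..., check=True)`.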

I'm not saying this is better or worse than the alternatives, I just don't think this "blob" argument is quite right - we'd probably create and host our own templates, not rely on the upstream ones.


>  - limited support for rawhide and branched releases

There's limited (or no) support for it in the upstream repo, that's correct.

But if we host our own repo, then according to the documentation and source code, it seems that as long as anaconda can install it, it should be possible to create an image for it. That sounds like the same situation as with imagefactory. (Of course, with the additional requirement that the virt-* tools have to work in Rawhide/Branched.)
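
Hosting our own repo mostly means publishing the compressed templates plus an index file that virt-builder can be pointed at with `--source`. Here is a sketch of generating one index stanza, based on my reading of the index format in the virt-builder man page. The function is hypothetical, signing of the index is left out entirely, and the virtual disk size has to be supplied by the caller.

```python
import hashlib
import os


def index_entry(os_version, template_path, virtual_size,
                name=None, arch="x86_64", fmt="raw"):
    """Render one stanza of a virt-builder index file for a compressed template."""
    sha = hashlib.sha512()
    with open(template_path, "rb") as f:
        # Hash in 1 MiB chunks so large templates don't have to fit in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    lines = [
        f"[{os_version}]",
        f"name={name or os_version}",
        f"arch={arch}",
        f"file={os.path.basename(template_path)}",
        f"checksum[sha512]={sha.hexdigest()}",
        f"format={fmt}",
        f"size={virtual_size}",  # size of the uncompressed virtual disk
        f"compressed_size={os.path.getsize(template_path)}",
    ]
    return "\n".join(lines) + "\n"
```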


>  - limited support for non-server spins

I'm not really sure what you mean. We can install any package set we want, so the only difference would be in the filesystem layout? The upstream templates seem to have only @core installed; in our own images we could adjust even that.

>  - output images are large in size

This is interesting. Theoretically, I see no reason why official Cloud images should be smaller than the same package set installed using virt-builder. I guess they are simply more stripped down, and the filesystem is much smaller? It could use some investigation. It's also a question what imagefactory-created images will look like (once we use our custom kickstarts).
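
For that investigation, a first step could be comparing the virtual size with the space actually used on disk for each image. A minimal sketch, assuming `qemu-img info --output=json` and its `virtual-size` / `actual-size` keys; the function names are made up.

```python
import json
import subprocess


def image_sizes(info_json):
    """Return (virtual_size, actual_size) in bytes from qemu-img's JSON output."""
    info = json.loads(info_json)
    # actual-size may be missing for some backends, hence .get().
    return info["virtual-size"], info.get("actual-size")


def qemu_img_info_json(path):
    """Fetch image metadata as a JSON string."""
    result = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

Running `image_sizes(qemu_img_info_json(path))` on a Cloud image and on a virt-builder image with the same package set should show whether the difference is in the filesystem size or in the content.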

By the way, since we seem to agree that we'll need several package set templates for each release (minimal, server, workstation), we're going to distribute pretty big disk images anyway. (Which concerns me a bit in itself).


>  - virt-builder is not written in python

Yeah, there are some parts of it in OCaml. Scary. I wouldn't want to patch that :)

> 
> 
> imagefactory [3] is a system for building os images and potentially
> shipping those images to various cloud systems. Images are specified
> with a kickstart file and an xml template descriptor. Imagefactory
> builds images from scratch, essentially using the kickstart to run an
> install inside a VM and processing that install into the desired image
> type.
> 
> [3] http://imgfac.org/
> 
> pros:
>  - used by releng to create Fedora cloud images

Collaborating on the tool with another team would be a big win from my POV.

>  - builds images from packages: no blobs that we don't have control over
>  - already has a mostly-complete RESTful api that can list images and
>    trigger new builds
>  - can support almost all spins; anything that can be represented in a
>    kickstart
>  - written in python
> 
> cons:
>  - not as fast as virt-builder

I haven't tried imagefactory, but we all know how long an anaconda installation takes. I don't think it's a problem for our production environment; we don't care about a 10-minute difference. But if we consider building the same image on a task developer's machine, the speed becomes more important.

>  - somewhat more complex than virt-builder

That's true, but if we start preparing our own virt-builder templates, I think that quickly reaches parity in complexity.

>  - when something goes wrong, debugging can be difficult due to how the
>    tool works

Do they have something like real-time monitoring of anaconda logs, do you know? Because otherwise I guess it's quite hard to learn what went wrong.

>  - we may be somewhat on our own to fix issues if releng is not hitting
>    similar problems
>  - may not run well in a VM (would need nested virt)

This is the same as with virt-builder, which also needs virt support. Originally I thought it didn't, but it does. It can still be used without hw virt support (unlike anaconda, where that would just be impossible performance-wise), but it's much, much slower and I don't think we would want to go that route (building an image in 30 minutes instead of 3).
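
Whichever tool we choose, the build host could check up front whether hardware virtualization is usable, so that we fail fast instead of silently falling back to slow emulation. A minimal sketch; the helper name is hypothetical and it only looks at access to the KVM device node.

```python
import os


def kvm_usable(device="/dev/kvm"):
    """Return True if the KVM device exists and is readable and writable by us."""
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)
```

Inside a VM this only returns True if nested virt is enabled on the host and the device is exposed to the guest.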

