Dracut, dmsquash, and overlays

Peter Robinson pbrobinson at gmail.com
Tue Jul 29 08:25:00 UTC 2014


>> You're not quite right about how the overlay works.
>>
>> The default in-memory overlay is just 512MB. And the device-mapper docs
>> note that "if it fills up the snapshot will become useless and be
>> disabled, returning errors."[1].
>>
>> You should also note that the overlay is a block-level snapshot - so any
>> changes to existing files or filesystem metadata will cause data to be
>> written to the overlay. Furthermore, the default chunk size is 4 KB, so
>> any change smaller than 4 KB still takes 4 KB of space.
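
To put rough numbers on the chunk-size point above, here is a minimal sketch. It assumes the 512MB overlay means 512 MiB and the chunk size is 4 KiB, and it ignores the snapshot's own metadata overhead. The function name and the sample write patterns are made up for illustration.

```python
CHUNK_SIZE = 4 * 1024            # snapshot chunk size in bytes (4 KiB, assumed)
OVERLAY_SIZE = 512 * 1024 ** 2   # in-memory overlay size in bytes (512 MiB, assumed)


def overlay_bytes_used(writes):
    """Estimate overlay space consumed by a series of block-level writes.

    `writes` is an iterable of (offset, length) byte ranges on the device.
    Every distinct chunk touched costs one full chunk in the overlay,
    no matter how few bytes inside that chunk actually changed.
    """
    touched = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // CHUNK_SIZE
        last = (offset + length - 1) // CHUNK_SIZE
        touched.update(range(first, last + 1))
    return len(touched) * CHUNK_SIZE


# Number of chunks the overlay can hold before it overflows.
total_chunks = OVERLAY_SIZE // CHUNK_SIZE
print(total_chunks)  # 131072

# A single 1-byte change still costs a whole chunk.
print(overlay_bytes_used([(0, 1)]))  # 4096

# 1000 scattered 100-byte writes, 1 MiB apart: 100000 bytes of data,
# but 1000 full chunks of overlay.
scattered = [(i * 1024 ** 2, 100) for i in range(1000)]
print(overlay_bytes_used(scattered))  # 4096000
```

Under these assumptions, about 131072 small scattered changes are enough to exhaust the overlay, even if each one is only a few bytes.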
>
> After hitting a wall several times today, I began to see what you're talking about here. ;)
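
One way to see that wall coming is to watch how full the snapshot is. The sketch below parses a single line of `dmsetup status` output. It assumes the snapshot target reports `<used>/<total> <metadata>` in sectors, or a single word such as `Invalid` once the snapshot has been disabled, which is what the device-mapper snapshot docs appear to describe; check it against your kernel. The function name and the sample line are illustrative and not taken from dracut.

```python
def parse_snapshot_status(line):
    """Parse one `dmsetup status` line for a snapshot target.

    Assumed formats (sectors):
        [name:] <start> <length> snapshot <used>/<total> <metadata>
        [name:] <start> <length> snapshot Invalid
    """
    fields = line.split()
    # `dmsetup status` without a device argument prefixes each line with "name:".
    if fields and fields[0].endswith(":"):
        fields = fields[1:]
    if len(fields) < 4 or fields[2] != "snapshot":
        raise ValueError("not a snapshot status line: %r" % line)

    usage = fields[3]
    if "/" not in usage:
        # The snapshot is no longer usable (for example after it filled up).
        return {"valid": False, "state": usage,
                "used": None, "total": None, "fraction": None}

    used, total = (int(n) for n in usage.split("/", 1))
    return {"valid": True, "state": "ok",
            "used": used, "total": total, "fraction": used / total}


# Sample line in the assumed format; the numbers are illustrative only.
sample = "0 2097152 snapshot 397896/2097152 1560"
info = parse_snapshot_status(sample)
print("overlay %.1f%% full" % (100 * info["fraction"]))
```

On a booted live system the input would come from something like `dmsetup status live-rw`, where the device name `live-rw` is an assumption about how the overlay device is named.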
>
>> I assume you're using filesystem images 'cuz you don't have a reliable
>> network connection (otherwise you'd probably be using NFS or iSCSI or
>> something)?
>
> The network connection is reliable, but I'm working with thousands of nodes that need a largely stateless system with only a few persistent items.
>
>> Since your systems have lots of RAM, why not just use a regular ext4
>> filesystem image as your root filesystem? Then you don't need to worry
>> about blowing up the overlay at all.
>
> Are you suggesting an ext4 r/w filesystem stored in RAM?  I haven't seen how to do that in dracut with the existing scripts.
>
>> If you need compression to save RAM: why not use a squashfs image
>> directly, and mount/bind a tmpfs to the places you'll be writing data?
>
> I'd be interested in that for sure but the dmsquash module in dracut seems to require a real ext/btrfs/xfs filesystem for device mapper.  I couldn't find a way to boot a plain squashfs with a filesystem in it.
>
>> Is there a particular reason you need to use dmsquash-live, or is this
>> just a case of the hammer making all your problems look like nails?
>
> My goal is to live boot our servers since the majority of our systems would be stateless.  Being able to reboot into a known good, tested state would be advantageous.  I've worked with Debian's Live Systems project[1] and their strategy is to mount a squashfs read only but then use aufs to provide a writeable filesystem overlay.  It's handy since you can fill up the overlay without causing the snapshot to overflow.  However, AUFS isn't in the upstream kernel and that makes things a bit challenging.

It sounds like oVirt Node does a lot of what you need and might be a
good starting point. It's basically a minimal KVM hypervisor plus the
associated userspace. It can be booted as a live image, booted over
PXE, or installed to disk.

http://www.ovirt.org/Category:Node
http://www.ovirt.org/Node_Building
http://www.ovirt.org/Node_PXE

Peter

