Technical Spec, better upgrade/rollback control
Colin Walters
walters at verbum.org
Mon Feb 24 12:29:29 UTC 2014
On Mon, Feb 24, 2014 at 1:11 AM, Chris Murphy <lists at colorremedies.com>
wrote:
>
> Yes. Snapper on openSUSE is doing this already on Btrfs. I'm not sure
> how it's dealt with on LVM thinp since /boot has to be outside LVM
> thinp because while GRUB groks conventional LVM, it doesn't get thinp
> yet. GRUB does understand /boot on Btrfs, but Fedora's grubby has a
> problem with it [1]. I've also been making /var/log a separate
> subvolume making it immune to rootfs snapshots and rollbacks.
>
Note for OSTree, /var/lib/rpm -> /usr/share/rpm (it's also immutable).
Same for /var/lib/yum.
> Is there good chance of optimizing OSTree to use LVMthin and Btrfs
> snapshots instead of hardlinks, while still being in charge of the
> proper semantic enforcement?
>
Note that OSTree already uses BTRFS_IOC_CLONE today, when on btrfs, for
implementing the separate copies of /etc. (Actually this happens via
the generic g_file_copy() since
https://git.gnome.org/browse/glib/commit/?id=5eba9784979e0b723c05a45cf767046607e4e759
)
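
For readers who want to try a clone-based copy from the command line, here is a minimal sketch. It assumes GNU cp, whose --reflink=auto option shares data blocks where the filesystem supports it and falls back to a normal copy elsewhere. The function name clone_tree and the paths are hypothetical, and this is not what OSTree itself runs (it goes through g_file_copy() as noted above).

```shell
#!/bin/sh
# clone_tree is a hypothetical helper, not part of OSTree.
# It copies a directory tree, sharing data blocks on filesystems that
# support it (e.g. btrfs) and doing an ordinary copy everywhere else.
# Assumes GNU cp; the destination must not exist yet.
clone_tree() {
    src=$1
    dst=$2
    cp -a --reflink=auto "$src" "$dst"
}

# Example run against a scratch directory rather than the real /etc.
scratch=$(mktemp -d)
mkdir "$scratch/etc"
echo "setting=1" > "$scratch/etc/example.conf"
clone_tree "$scratch/etc" "$scratch/etc.new"
rm -rf "$scratch"
```

Unlike a hardlink, each cloned file gets its own inode, so later edits to one copy of /etc do not show up in the other.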
Beyond that though - because for OSTree, /usr is immutable, there isn't
really a big advantage of thinp or btrfs snapshots. Just try this
right now on your laptop:
# Once for cold cache performance
time cp -al /usr /usr.copy
# And once for hot cache
time cp -al /usr /usr.copy2
For me (and this is a real-world RHEL7 system with a 5.1G /usr):
[root@localhost /]# time cp -al usr usr.copy
real 0m5.199s
user 0m0.220s
sys 0m2.849s
[root@localhost /]# time cp -al usr usr.copy2
real 0m2.245s
user 0m0.166s
sys 0m2.049s
That's really fast enough for the use cases I envision, for now.
Obviously FS/block snapshots have other advantages beyond being instant
- for example, they don't incur lots of scattered writes to bump the
refcounts of inodes. But many systems already have that happening
periodically to a lesser degree with the default of relatime anyways.
Where FS/block snapshots become *necessary* is if you have
*uncontrolled writes* to /usr. For example, with OSTree's hardlink
model, I cannot allow arbitrary rpm %post code to run. Each one has to
be carefully audited to break hardlinks via "write new copy, rename"
instead of doing edits in place.
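
To make the "write new copy, rename" pattern concrete, here is a rough sketch of what an audited script could do in place of an in-place edit. The helper name append_breaking_hardlink and the file names are hypothetical, and it assumes a mktemp that accepts a template path. An in-place edit such as `echo edited >> file` would change the content seen through every hardlink, including the one in the tree you wanted to roll back to.

```shell
#!/bin/sh
# append_breaking_hardlink is a hypothetical helper for illustration.
# It appends a line to a file without modifying the inode that other
# hardlinked trees still point at.
append_breaking_hardlink() {
    target=$1
    line=$2
    # Temp file in the same directory so the final rename stays on one filesystem
    tmp=$(mktemp "$target.XXXXXX") || return 1
    # Copy the content into the new inode, preserving mode and timestamps
    cp -p "$target" "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s\n' "$line" >> "$tmp"
    # The rename swaps the directory entry over to the new inode;
    # the old inode stays untouched for anyone else linking to it
    mv -f "$tmp" "$target"
}

scratch=$(mktemp -d)
echo "original" > "$scratch/deployed"
ln "$scratch/deployed" "$scratch/rollback"   # stands in for the other tree
append_breaking_hardlink "$scratch/deployed" "edited"
cat "$scratch/rollback"   # still prints only "original"
rm -rf "$scratch"
```

After the call, the two names no longer share an inode, so the rollback tree keeps its original content.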
This auditing is necessary to support local software installation. We
don't need to do it, though, for the "pure replication" model where *no*
RPM %post runs on client systems - it all happens on the build server.
This replication model is where OSTree is strongest right now, and where
the traditional package model is weakest, so I have been mainly
emphasizing it.
That said, doing this careful auditing of RPM %post and in general
laying the foundations for a package-like system on top of OSTree is
very much in the long term plans.
> Yes I also don't consider one kind of "rollback" since there can be
> different contexts. A user rolling back their /home doesn't mean
> rolling back any other user's, or the system. Conversely rolling back
> the system doesn't mean rolling back user /home or logs or some other
> things.
>
Definitely. OSTree doesn't touch /home (note this is now /var/home) -
and so it makes a lot of sense to still have something that's more like
a backup system. Particularly a backup system that knows to take a
backup before OSTree upgrades.
That's where using BTRFS or thinp in *combination* with OSTree is
really nice - that total freedom to do whatever you want at the block
layer means you can choose to have /home (/var/home) on a separate
partition and do thinp snapshots of it. Or use BTRFS's per-subvolume
RAID to say you want RAID0 for /, and RAID1 for /home.
To answer your question in another way then - I'll definitely be fast
to take advantage of any new APIs added by the storage layer to
*transparently* make things better for OSTree. But I don't want to
mandate any particular partition layout or FS/block level layout,
because I think it takes away too much administrator flexibility.