Hi everyone,
The ideas proposed so far for solving the problems of upgrading a running
system are more or less ad hoc fixes for individual issues, or even
reinventions of the wheel. IMO none of those ideas will solve anything; they
will only increase the total level of entropy. After that it will sooner or
later be necessary to add even more ad hoc workarounds, and so on ..
I've already mentioned that some solutions are close to reinventing the
wheel. Why? Because these problems were solved a long time ago. To be more
precise, *more than a decade ago*.
I've been working with rpm (RPM Package Manager) for more than two decades
(try executing "rpm -qa --changelog | grep kloczek" and you can find one of
my earliest activities still present in any RH based distribution. For 3
years I maintained PLD, which at its peak was an rpm based distribution with
more than 5k src.rpm packages).
Initially rpm was a huge step forward because it formalized many
install/upgrade, uninstall, verification, building and testing problems under
Especially many things related to building packages were solved very
well. So well that even today only some small improvements need to be made
from time to time.
From the beginning of rpm (from the time when it was 100% implemented in
perl), compared to SysV packages (used on Solaris and BSD*s) and deb (more
or less just an improved scaffold on top of the original SysV packaging tool
ideas), up to now the problem of consistent upgrades has never been solved
completely. Why? Because the main assumption, doing the upgrade on the
working system image/resources, is broken by design. As long as the upgrade
process deletes files that are still used by running processes, or that
those processes will reopen later, there is always a relatively high chance
that those processes will not be able to use their resources normally, or
will try to use resources from the wrong version.
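
The two failure scenarios above can be reproduced in miniature with plain
files. This is only a sketch: the file names "plugin" and "data" are made up
for the illustration, and the "upgrade" is just mv and rm in a temporary
directory, not a real package transaction.

```shell
#!/bin/sh
# Miniature "upgrade on a live system" in a scratch directory.
workdir=$(mktemp -d)
echo "plugin v1" > "$workdir/plugin"
echo "data v1"   > "$workdir/data"

# A running process holds the v1 plugin open on descriptor 3.
exec 3< "$workdir/plugin"

# The "upgrade": plugin is replaced by a new file under the old name,
# and data is removed because the new version no longer ships it.
echo "plugin v2" > "$workdir/plugin.new"
mv "$workdir/plugin.new" "$workdir/plugin"
rm "$workdir/data"

# The already-open descriptor still sees the old content ...
read -r held <&3
# ... while reopening by path returns the new version (version mix) ...
reopened=$(cat "$workdir/plugin")
# ... and a file the process may still expect is simply gone.
if [ -e "$workdir/data" ]; then data_state=present; else data_state=missing; fi

echo "held descriptor: $held"
echo "reopened path:   $reopened"
echo "data file:       $data_state"

exec 3<&-
rm -rf "$workdir"
```

The same process ends up with v1 content in one descriptor, v2 content on
reopen, and a missing file, all without the package manager doing anything
wrong.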
Whatever can be done in the package manager to avoid those icebergs is not
enough and will never solve those two fundamental uncertain scenarios.
So why the upgrade dilemmas cannot be solved with the existing rpm is
probably more or less obvious by now.
So it seems we are in yet another iteration of clashes with rpm limitations.
The only question is how (and by whom) those problems have already been
solved.
The answer is very simple: those problems were solved almost *a decade ago
on Solaris* with the introduction of two crucial technologies: ZFS
(Zettabyte File System) and IPS (Image Packaging System). These two pieces
of system resource maintenance interact very closely and cannot be used
separately (yes .. atm only).
So how have *ALL* upgrade problems been solved at the level of ideas?
Very simply: by the assumption that a system upgrade will never (ever) be
done on the working system's resources.
Someone may scratch their head asking "how is it possible to do an upgrade
if system resources are not touched?". The answer is that it is not possible
to implement this idea just by adding some functionality to the package
management (PM) software. An operation like upgrade needs to be supported by
the OS and, to be more precise, by the *FS layer*.
So how has the problem of consistent upgrades been solved on Solaris using ZFS
ZFS has the ability to create a snapshot of a volume (RO resource) and to
create a clone (RW resource) on top of that snapshot.
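
As a minimal sketch, the snapshot-then-clone pair maps onto the two zfs
subcommands "zfs snapshot" and "zfs clone". The dataset names below are
assumptions made up for the example, and the helper only prints the commands
unless DRY_RUN=0 is set on a machine that really has ZFS.

```shell
#!/bin/sh
# Dry run by default: print the command instead of executing it.
# Set DRY_RUN=0 on a system with ZFS (and enough privileges) to run for real.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# snapshot_and_clone DATASET SNAPNAME CLONE
# Creates a read-only snapshot DATASET@SNAPNAME, then a writable clone of it.
snapshot_and_clone() {
    run zfs snapshot "$1@$2" &&
    run zfs clone "$1@$2" "$3"
}

# Hypothetical dataset names, for illustration only.
snapshot_and_clone rpool/ROOT/current pre-upgrade rpool/ROOT/upgraded
```

The snapshot stays untouched as the "before" state, and all writes of the
upgrade land in the clone.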
The whole upgrade process consists of a few steps:
- find the volumes which need to be snapshotted and cloned
- create the clones
- mount the clones as a separate tree and perform the upgrade
This part is crucial. If anything goes wrong during the upgrade, the still
working system is not affected. It is possible to observe the state of the
broken upgrade and produce very precise diagnostic data, allowing the
upgrade process to be fixed at the package level. In other words, the
*impact of the upgrade on the still working system is NULL/ZERO!!!*
- when the upgrade process is finished, the grub boot loader configuration
is updated to add the new root from which the updated system image needs to
be booted.
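
The steps above can be sketched with the Solaris tools roughly as follows.
This is a hedged outline, not the exact sequence pkg itself performs: the BE
name and mount point are invented for the example, and the script only
prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# upgrade_in_new_be NEW_BE MOUNTPOINT
upgrade_in_new_be() {
    new_be=$1
    mnt=$2
    # Snapshot and clone the current BE under a new name.
    run beadm create "$new_be" || return 1
    # Mount the clone as a separate tree.
    run beadm mount "$new_be" "$mnt" || return 1
    # Upgrade packages inside the clone only (-R selects the image root).
    # Any non-zero exit status is treated as "not upgraded" in this sketch.
    if run pkg -R "$mnt" update; then
        run beadm unmount "$new_be" &&
        # Make the new BE the one used on the next plain reboot.
        run beadm activate "$new_be"
    else
        echo "upgrade did not succeed; running system untouched, inspect $mnt" >&2
        return 1
    fi
}

# Hypothetical names, for illustration only.
upgrade_in_new_be upgraded-be /mnt/upgraded-be
```

On failure the clone is left mounted on purpose, so the broken state can be
examined while the running system carries on unchanged.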
As I wrote, two technologies (together) are crucial here to solve 100% of
upgrade issues: ZFS and IPS. The 3rd, minor part is the boot loader.
Originally grub was used on Solaris 10, and grub2 on Solaris 11 only
simplified the whole
So what is missing on Linux to implement those ideas? To be honest ..
not too much, which is good :)
Only a few small nuts and bolts are missing :)
On Linux, btrfs is available at the moment and provides RW snapshots (the
equivalent of ZFS clones). All that needs to be added to this layer is a
btrfs volume attribute indicating that the volume needs to be cloned during
upgrade, for the more complicated scenarios.
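
A rough btrfs counterpart, as a sketch only: "btrfs subvolume snapshot"
without -r creates a writable snapshot in one step. The proposed volume
attribute does not exist, so here the list of subvolumes to clone is simply
passed by hand; the paths and the ".be-2" naming scheme are assumptions for
the example, and nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# clone_for_upgrade BE_NAME SUBVOLUME...
# Makes one writable snapshot per listed subvolume, named SUBVOLUME.BE_NAME.
# The explicit list stands in for the (not yet existing) "clone me" attribute.
clone_for_upgrade() {
    be_name=$1
    shift
    for subvol in "$@"; do
        run btrfs subvolume snapshot "$subvol" "$subvol.$be_name" || return 1
    done
}

# Hypothetical layout: the root subvolume plus a database subvolume.
clone_for_upgrade be-2 /mnt/pool/root /mnt/pool/pgsql
```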
Why? Because automatic discovery may not be enough in cases like a major
database upgrade, where part of the upgrade may be a format change that
needs to be applied to, for example, the database files used by some
application. If the boot loader can hold two boot entries, one booting the
original state from before the upgrade and one booting everything that was
done by the upgrade, then the post PM upgrade operations applied on top of
the upgraded software are cloned as well, in case of any trouble at this
stage. The whole rollback/downgrade procedure then consists only of a reboot
and choosing another BE (Boot Environment)
All BE management on Solaris is done with one command, beadm. This command
is also used for cloning existing OS resources manually. The BE idea is
connected to two other small bits: the running BE and the active BE. The
running BE is the BE which is in use now, and the active BE is the BE which
will be used automatically if the reboot command is run without specifying
the BE from which the system should boot after shutdown.
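
The difference between the running BE and the active BE can be shown with a
toy model. This is not beadm, only two shell variables standing in for the
two pointers; the BE names and the function names be_activate and be_reboot
are invented for the illustration.

```shell
#!/bin/sh
# Toy model only: two pointers, no real boot environments involved.
running_be="be-1"   # BE the system is running from right now
active_be="be-1"    # BE a plain reboot will boot from

# Mark a BE as the default for the next reboot.
be_activate() { active_be=$1; }

# Reboot; an optional argument models picking a BE by hand in the boot menu.
be_reboot() { running_be=${1:-$active_be}; }

be_activate be-2    # upgrade was finished inside the clone be-2
echo "before reboot:  running=$running_be active=$active_be"

be_reboot           # plain reboot follows the active BE
echo "after reboot:   running=$running_be active=$active_be"

be_reboot be-1      # rollback: choose the old BE in the boot menu
echo "after rollback: running=$running_be active=$active_be"
```

Until the reboot, the running BE stays be-1 even though be-2 is already
active, and the rollback at the end needs nothing except selecting the old
entry.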
Another small bit which needs to be sorted out is related to the install
procedures implemented in anaconda and the post installation procedures in
the kernel package. What is missing here? anaconda does not currently allow
/boot on btrfs. It forces the use of ext3/4.
A few weeks ago the disk in my laptop started failing, so I attached a new
disk in place of the CD drive. Initially I started replicating the whole
partition layout as applied by the anaconda installer, with one partition
for swap, a second one for an ext3/4 /boot, and / on btrfs.
When I was done and had started the "btrfs send | btrfs receive" commands, I
found out that the ext modules were loaded in kernel space and that they
were used only by /boot. So I stopped everything and changed the layout to
have only a swap partition (without LVM) and a btrfs root pool.
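
For reference, a minimal sketch of such a copy. "btrfs send" only accepts
read-only snapshots, so a snapshot with -r is taken first. The paths are
assumptions for the example (not the ones actually used on that laptop), and
the commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# migrate_subvol SOURCE SNAPSHOT DEST
# Takes a read-only snapshot of SOURCE and streams it into the btrfs
# filesystem mounted at DEST. Dry run by default.
migrate_subvol() {
    src=$1
    snap=$2
    dest=$3
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "btrfs subvolume snapshot -r $src $snap"
        echo "btrfs send $snap | btrfs receive $dest"
    else
        btrfs subvolume snapshot -r "$src" "$snap" &&
        btrfs send "$snap" | btrfs receive "$dest"
    fi
}

# Hypothetical paths: root subvolume, snapshot name, new disk mount point.
migrate_subvol / /migrate-ro /mnt/newdisk
```

The received subvolume is read-only as well, so a writable snapshot of it
has to be taken on the new disk before it can serve as a root filesystem.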
After copying all resources and generating a proper boot loader on the new
disk, everything still works, so there is no technical reason now to keep
/boot separate!!!
The only obstacle is that the post installation procedure implemented in the
kernel package does not like btrfs on /boot and does not update the grub
boot entries, so after one or two kernel upgrades from rawhide I found that
my grub menu was getting shorter and shorter :)
All that needs to be done to fix this issue is to execute "grub2-mkconfig -o
I'm pretty sure that the above will not break booting from other FSes :)
Getting to the end of this long email ..
All that needs to be done to solve all upgrade issues on top of Fedora, in
a minimalistic scenario, is:
- add BE management to dnf
- switch to btrfs as the default FS
- adapt the kernel post installation procedure
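
What "BE management in dnf" could look like at its simplest, as a sketch
under assumptions: the root filesystem is a btrfs subvolume, the new BE is a
writable snapshot of it, and dnf is pointed at the snapshot with its existing
--installroot option. The /.be path is invented for the example, the boot
entry step is left out, and the commands are only printed unless DRY_RUN=0
is set.

```shell
#!/bin/sh
# Dry run by default: print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# upgrade_in_snapshot BE_ROOT
# Clones / into BE_ROOT and upgrades packages inside the clone only.
upgrade_in_snapshot() {
    be_root=$1
    # Writable snapshot of the running root: the new boot environment.
    run btrfs subvolume snapshot / "$be_root" || return 1
    # Upgrade inside the clone; the running root is never modified.
    run dnf --installroot="$be_root" -y upgrade || return 1
    # A boot entry pointing at BE_ROOT still has to be added (not shown).
}

# Hypothetical location for boot environments.
upgrade_in_snapshot /.be/next
```

If the dnf step fails, the snapshot can be inspected or deleted, and the
running system has not changed.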
On top of the above, a few other small bits can be added to make the whole
of BE management consistent.
Anyone who chooses an FS other than btrfs will need to accept that they will
be more or less dealing with the limitations of the classic, more than 30
years old ideas of using non-snapshotable/cloneable volumes and the
limitations of the old SysV packaging ideas.
rpm needs to die sooner or later as well, and probably the best would be to adapt
IPS is fully OSS (https://java.net/projects/ips/sources/pkg-gate/show/
many people have so far been thinking about porting it to Linux. However,
the lack of a stable enough btrfs was the main obstacle. As btrfs is now
quite stable, IMO it is time to start thinking about moving away from rpm
as well. However no
As I said, and I think I've proved above, IPS is not essential now
maybe another time I'll try to write a longer comment on why rpm is already dead
Tomasz Kłoczko | LinkedIn: http://lnkd.in/FXPWxH