Hi,
We have a bug open here which is a difficult question, in my opinion: http://dev.laptop.org/ticket/9349
For the upcoming XO-1.5 software build (which will hopefully make it onto XO-1 in the near future), do we want to ship olpc-update or use standard Fedora technologies?
My inconclusive thoughts and questions:
olpc-update has some features which aren't available elsewhere, such as the ability to switch between 2 OS builds installed on the same disk. However, I've never seen or heard of this being used in the field.
olpc-update is a "deployment quality" system in that it covers everything. Beyond being an updater, it has olpc-update-query, which implements logic to figure out when to ask for updates and where to ask for them. It's accompanied by the theft deterrence protocol and 3 different server-side implementations of that. It's been used in OLPC deployments.
olpc-update creates a huge mass of hardlinks on disk, one for every file on the main OS. It then rsyncs in the updated files on a copy-on-write basis. Therefore you end up with a hell of a lot of hardlinks, and 2 complete copies of every file that has changed. (this means it is not great for situations when a lot of files have changed, e.g. changing between major Fedora releases)
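To make the disk-usage point concrete, here is a minimal Python sketch of the hardlink-then-replace idea. It is not olpc-update's actual code: the function names are hypothetical, and it ignores symlinks, ownership and permissions. It only shows why unchanged files cost a directory entry while every changed file ends up stored twice.

```python
import os


def clone_with_hardlinks(src_root, dst_root):
    """Recreate src_root's directory tree under dst_root, hardlinking each file.

    Hypothetical helper: no file data is copied, only directory entries.
    """
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            os.link(os.path.join(dirpath, name), os.path.join(target_dir, name))


def replace_file(path, data):
    """Copy-on-write style update of one file in the new tree.

    The new content goes to a fresh inode which is then renamed over the
    hardlink, so the old tree still sees the old content. From this point
    the file exists twice on disk.
    """
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
    os.replace(tmp, path)
```

With a major Fedora release change almost every file goes through replace_file(), which is where the "2 complete copies" cost comes from.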
olpc-update presents challenges for deployments that customise their images. Namely, to produce a build that it is possible to olpc-update to involves quite a bit of hassle (inserting security keys into the firmware of each XO, setting up an OATS server, etc). We did manage to figure this out in Paraguay (thanks to the internet connection, since it wouldn't be safe to put the OATS server in the schools) and I improved some tools accordingly.
We've switched to dracut for our initramfs, meaning that the initramfs-level stuff for the olpc-update system needs to be reimplemented. Basically we have to wrangle with the strange /versions filesystem layout and "frob" the /versions/current symlink if the user is switching between versions. I have reimplemented this logic in a dracut module but it is completely untested.
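For readers unfamiliar with the symlink "frob", the following Python sketch shows one common way to repoint a symlink atomically. It is an illustration only, not the dracut module mentioned above: the function name, the "current.new" temporary name and the example build directory names are assumptions, and the real /versions layout has more to it than a single link.

```python
import os


def switch_current(versions_dir, new_version):
    """Atomically repoint <versions_dir>/current at new_version.

    Hypothetical helper: new_version is a name relative to versions_dir.
    """
    current = os.path.join(versions_dir, "current")
    tmp = current + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)  # leftover from an earlier, interrupted attempt
    os.symlink(new_version, tmp)
    # rename(2) replaces the old link in one step, so a reader sees either
    # the old target or the new one, never a missing link.
    os.replace(tmp, current)
```

For example, switch_current("/versions", "build-b") would leave /versions/current pointing at build-b, where "build-b" is a made-up directory name.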
The option for using "Standard Fedora technologies" as far as I am aware is to basically use "yum update" in a fashion somehow suited for deployments.
Before F11, this would have meant a lot of downloading as each RPM update was downloaded in full even if only a little bit changed. As of F11, Fedora is now joining the "only download the bits that change" movement with yum-presto and deltarpms: http://lwn.net/Articles/329484/
The deltarpm description on that lwn article leaves a little to be desired -- it wouldn't work with big packages (is the quoted 200mb the compressed or uncompressed size?), the reconstruction from deltas is a CPU intensive task and we have a slow CPU, and it's all done in memory and we don't have a huge amount of memory especially on XO-1. olpc-update has an overhead of about 5 minutes of checksumming files.
Even though olpc-update isn't great for doing big distro updates (because of storing 2 copies of changed files, in this case almost all of them), it worked in those situations. I've never attempted an RPM-based update from e.g. Fedora 10 to Fedora 11. How well does that work out for regular Fedora users?
"yum update" always seems to use a surprising amount of bandwidth, redownloading entire package databases just to see if anything new is available. olpc-update was much more efficient in this respect, sending only a 128-byte hash of the filesystem contents file to the OATS server. For rpms, is there a more efficient alternative for updates-checking in situations where there is only going to be e.g. 1 update per month?
"yum update" has historically not worked very well on the XO. It hits OOM, it fills up yum's cache and then aborts, it gets confused between i586 and i686, etc.
We would have to tweak yum's behaviour quite heavily.. I don't think we want an rpm cache, or for it to keep the .rpm files at all.
It introduces new questions of security, signing, etc. Deployments will want to sign their rpm updates that they push out, so we now need a mechanism for getting that specific RPM signing public key onto all the laptops in a deployment so that they can trust the updates server. We had this nailed down for olpc-update: deployments can insert "local" public keys into the manufacturing through keyjector firmwares for existing laptops, and Quanta can now manufacture laptops with these keys already in place for future orders. olpc-update and the bitfrost code used these keys to verify updates.
For the XO-1 possibility it raises the question of how existing laptops could be migrated to this new system, without losing their user data.
Using "yum update" would be a large benefit of sharing technologies with other communities and not having to maintain our own systems and tools. However olpc-update would be easier in the short term because all the components have been implemented and deployed, and there aren't really any open questions.
Any other thoughts/opinions one way or the other?
Daniel
On Tue, Jun 23, 2009 at 8:01 PM, Daniel Drake <dsd@laptop.org> wrote:
Any other thoughts/opinions one way or the other?
My (also inconclusive) thoughts on this. I profoundly dislike using a bespoke update method, and I discussed this topic at length with Scott.
The main point, where olpc-update wins without a shadow of a doubt, is failure-friendliness. Assuming a fail-safe filesystem, olpc-update can be interrupted (hard) at any point and might take N runs to complete. It will only switch to the 'new' image, in an atomic operation, once the update has completed.
RPM is unfortunately a disaster in this space. If an rpm transaction fails, there are no sane ways to recover it. On XO-1 a full OS update transaction with rpm will take a very long time, so it's highly likely that it will be interrupted. RPM and yum hackers treat this as an intractable problem.
There may be ways to make it a bit less bad (dpkg seems to handle recovery quite a bit better, but surely will have unrecoverable scenarios too) but it is a long hard road, and the shadow of a significant % of failed upgrades remains.
This is, IMO, the actual showstopper. If we had a fs that could handle snapshotting, then we could snapshot the system before starting the RPM transaction, and have recovery hooks in the boot partition or in initrd. There's been discussion on fedora-devel about hooking up yum with btrfs transactions
A few other notes -- for completeness:
- Upstream has not historically supported yum-based upgrades, only anaconda upgrades. Yum OS upgrades are a relatively new (and risky) thing. We could have an 'anaconda boot' partition to handle upgrades without external media.
- Disk space requirements for olpc-update vs yum/rpm are probably similar, and I suspect that during the upgrade rpm transaction the footprint is larger for the rpm methods. We could teach olpc-update to jettison the old image as soon as the new one boots successfully...
- olpc-update completely sidesteps %pre and %post scripts -- this is potentially a problem.
- olpc-update offers downgrades, but our software at higher layers doesn't support this. The Journal format upgrade broke downgrades in the last OS upgrade. Sugar 0.84 has another Journal format change, probably with a similar upgrade-only path.
- deltarpm-style tools are of little benefit for a whole OS upgrade, as there is little or no overlap between releases. Where there is overlap, olpc-update's hardlinking strategy is also a win of similar proportions. Deltarpm-style tools usually force you to keep around the original rpm as well.
- olpc-update-query needs to be changed to retry more aggressively once it has received an "upgrade now" message from the OATS. It's in my TODO list...
hth,
martin
The main point, where olpc-update wins without a shadow of a doubt, is failure-friendliness. Assuming a fail-safe filesystem, olpc-update can be interrupted (hard) at any point and might take N runs to complete. It will only switch to the 'new' image, in an atomic operation, once the update has completed.
RPM is unfortunately a disaster in this space. If an rpm transaction fails, there are no sane ways to recover it. On XO-1 a full OS update transaction with rpm will take a very long time, so it's highly likely that it will be interrupted. RPM and yum hackers treat this as an intractable problem.
Have you tried the « yum-complete-transaction » tool ? It comes from the yum-utils package and might interest you :)
This is, IMO, the actual showstopper. If we had a fs that could handle snapshotting, then we could snapshot the system before starting the RPM transaction, and have recovery hooks in the boot partition or in initrd. There's been discussion on fedora-devel about hooking up yum with btrfs transactions
I am probably way out of my depth here, but what about LVM snapshots?
- olpc-update offers downgrades, but our software at higher layers doesn't support this. The Journal format upgrade broke downgrades in the last OS upgrade. Sugar 0.84 has another Journal format change, probably with a similar upgrade-only path.
Yum/RPM also have downgrades. See the « yum-allow-downgrade » plugin and the « --oldpackage » RPM option.
I don't know if those cover all the use cases you might have for downgrades though.
----------
Mathieu Bridon (bochecha)
On Tue, Jun 23, 2009 at 10:19 PM, Mathieu Bridon (bochecha) <bochecha@fedoraproject.org> wrote:
Have you tried the « yum-complete-transaction » tool ? It comes from the yum-utils package and might interest you :)
Discussed it a bit in fedora-devel. I didn't get the impression that it's a very deep fix, but I might be wrong.
How comprehensive is it?
Yum/RPM also have downgrades. See the « yum-allow-downgrade » plugin and the « --oldpackage » RPM option.
It's not exactly the same. olpc-update will let you have 2 full OSs in the same partition and pick one at boot time. You could ping-pong between F9 and F11 at will, as long as they work with the same kernel.
cheers,
m
On Tue, Jun 23, 2009 at 10:07 PM, Martin Langhoff <martin.langhoff@gmail.com> wrote:
A few other notes -- for completeness:
- olpc-update depends on a tricky bit of code at boot time that keeps 2 "OS install" trees in different subdirs, and picks which one to run. The OS install is kept pristine with hardlinks, and various mountpoints are layered on top, making it a jury-rigged COW setup. This has various interesting aspects. One of them is that we could use rpm's --root option to run a similar "shadow" OS install for resilient upgrades. %pre and %post scripts could be tricky to handle.
cheers,
m
RPM is unfortunately a disaster in this space. If an rpm transaction fails, there are no sane ways to recover it. On XO-1 a full OS update transaction with rpm will take a very long time, so it's highly likely that it will be interrupted. RPM and yum hackers treat this as an intractable problem.
yum has a feature now that allows you to complete 'transactions'. I discovered the hard way that this actually works quite well :-)
This is, IMO, the actual showstopper. If we had a fs that could handle snapshotting, then we could snapshot the system before starting the RPM transaction, and have recovery hooks in the boot partition or in initrd. There's been discussion on fedora-devel about hooking up yum with btrfs transactions
We do this at work on our SANs for major upgrades: "snap restore backup-x", reboot :-)
- Upstream has not historically supported yum-based upgrades, only anaconda upgrades. Yum OS upgrades are a relatively new (and risky) thing. We could have an 'anaconda boot' partition to handle upgrades without external media.
Sounds like preupgrade. But in specific deployments where you know exactly the current software versions and the hardware it would be fairly easy to test and verify a 'yum upgrade'.
- Disk space requirements for olpc-update vs yum/rpm are probably similar, and I suspect that during the upgrade rpm transaction the footprint is larger for the rpm methods. We could teach olpc-update to jettison the old image as soon as the new one boots successfully...
- olpc-update completely sidesteps %pre and %post scripts -- this is potentially a problem.
- olpc-update offers downgrades, but our software at higher layers doesn't support this. The Journal format upgrade broke downgrades in the last OS upgrade. Sugar 0.84 has another Journal format change, probably with a similar upgrade-only path.
- deltarpm-style tools are of little benefit for a whole OS upgrade, as there is little or no overlap between releases. Where there is overlap, olpc-update's hardlinking strategy is also a win of similar proportions. Deltarpm-style tools usually force you to keep around the original rpm as well.
I think it uses the actual installed files so there's no need to keep the actual rpms around.
Peter
On 2009-06-23 20:01, Daniel Drake wrote:
Even though olpc-update isn't great for doing big distro updates (because of storing 2 copies of changed files, in this case almost all of them), it worked in those situations. I've never attempted an RPM-based update from e.g. Fedora 10 to Fedora 11.
Preupgrade+Anaconda would be better.
The big question with yum updates and Anaconda upgrades is how to do it safely. The system must remain bootable at all times.
It would be neat (perhaps required?) if Anaconda upgrades would do a transaction-type upgrade, where you have the option to roll back if something goes wrong in the middle. People keep asking for that for regular yum updates, but that's not really possible since you can't roll back a generic running system. (How do you roll back the OS without also rolling back /var/lib/mysql and /home at the same time? No thanks.) With Anaconda, you know nothing else is running alongside the upgrade so if you have a snapshot mechanism for the filesystem or block device then maybe it would be possible...
"yum update" has historically not worked very well on the XO. It hits OOM, it fills up yum's cache and then aborts, it gets confused between i586 and i686, etc.
This option would certainly require lots of testing to make sure it works well.
It introduces new questions of security, signing, etc. Deployments will want to sign their rpm updates that they push out, so we now need a
Maybe olpc-update can be extended to manage /etc/pki/rpm-gpg/ and rpm --import?
For the XO-1 possibility it raises the question of how existing laptops could be migrated to this new system, without losing their user data.
Use olpc-update to push a new image that contains preupgrade?
/abo
On Tue, Jun 23, 2009 at 8:01 PM, Daniel Drake <dsd@laptop.org> wrote:
For the upcoming XO-1.5 software build (which will hopefully make it onto XO-1 in the near future), do we want to ship olpc-update or use standard Fedora technologies?
Due to the fedora lists' draconian 'reply-to' settings, the thread has now moved to fedora-devel.
Just mentioning it so that people (who may not be in both lists) can follow it.
(And cross your fingers, maybe one day fedora lists get their reply-to fixed ;-) )
cheers,
m
For the upcoming XO-1.5 software build (which will hopefully make it onto XO-1 in the near future), do we want to ship olpc-update or use standard Fedora technologies?
Due to the fedora lists' draconian 'reply-to' settings, the thread has now moved to fedora-devel.
Just mentioning it so that people (who may not be in both lists) can follow it.
(And cross your fingers, maybe one day fedora lists get their reply-to fixed ;-) )
Well, actually, that's *my* fault...
I first replied only to Daniel, leaving the lists out.
Then I wanted to correct my mistake and forwarded my answer to the lists. I saw "devel" and so I sent it to fedora-devel instead of devel@laptop.org :-/
So there might be issues in the "reply-to" settings of the Fedora lists, but the problem actually came from "reply-to" issues in my GMail :)
Sorry about that (even though this actually led to some interesting talk)
----------
Mathieu Bridon (bochecha)
We have a bug open here which is a difficult question, in my opinion: http://dev.laptop.org/ticket/9349
For the upcoming XO-1.5 software build (which will hopefully make it onto XO-1 in the near future), do we want to ship olpc-update or use standard Fedora technologies?
My inconclusive thoughts and questions:
olpc-update has some features which aren't available elsewhere, such as the ability to switch between 2 OS builds installed on the same disk. However, I've never seen or heard of this being used in the field.
olpc-update is a "deployment quality" system in that it covers everything. Beyond being an updater, it has olpc-update-query, which implements logic to figure out when to ask for updates and where to ask for them. It's accompanied by the theft deterrence protocol and 3 different server-side implementations of that. It's been used in OLPC deployments.
olpc-update creates a huge mass of hardlinks on disk, one for every file on the main OS. It then rsyncs in the updated files on a copy-on-write basis. Therefore you end up with a hell of a lot of hardlinks, and 2 complete copies of every file that has changed. (this means it is not great for situations when a lot of files have changed, e.g. changing between major Fedora releases)
olpc-update presents challenges for deployments that customise their images. Namely, to produce a build that it is possible to olpc-update to involves quite a bit of hassle (inserting security keys into the firmware of each XO, setting up an OATS server, etc). We did manage to figure this out in Paraguay (thanks to the internet connection, since it wouldn't be safe to put the OATS server in the schools) and I improved some tools accordingly.
We've switched to dracut for our initramfs, meaning that the initramfs-level stuff for the olpc-update system needs to be reimplemented. Basically we have to wrangle with the strange /versions filesystem layout and "frob" the /versions/current symlink if the user is switching between versions. I have reimplemented this logic in a dracut module but it is completely untested.
The option for using "Standard Fedora technologies" as far as I am aware is to basically use "yum update" in a fashion somehow suited for deployments.
Before F11, this would have meant a lot of downloading as each RPM update was downloaded in full even if only a little bit changed. As of F11, Fedora is now joining the "only download the bits that change" movement with yum-presto and deltarpms: http://lwn.net/Articles/329484/
The deltarpm description on that lwn article leaves a little to be desired -- it wouldn't work with big packages (is the quoted 200mb the compressed or uncompressed size?), the reconstruction from deltas is a CPU intensive task and we have a slow CPU, and it's all done in memory and we don't have a huge amount of memory especially on XO-1. olpc-update has an overhead of about 5 minutes of checksumming files.
Well, it works fine for OpenOffice, and that's probably about the biggest package I have on my F-11 system. The file size limit might be a restriction to keep the memory usage in check. The thing with deltarpm is that ultimately it's a decision between bandwidth usage and memory usage. If all the updates are coming from a local school server and there are only a few updates a month, it's probably better to go with the complete rpms. If bandwidth is an issue then drpms might well be the way to go. Fortunately it's easy to enforce from the server side, so you could include the deltarpm support on the client side and, if it's not wanted, just not make the drpms available from the server.
Even though olpc-update isn't great for doing big distro updates (because of storing 2 copies of changed files, in this case almost all of them), it worked in those situations. I've never attempted an RPM-based update from e.g. Fedora 10 to Fedora 11. How well does that work out for regular Fedora users?
I've never had a problem with it. I've used it to remotely update my Australian server from F-7 -> F-11 (not all at once), and never had an issue with it. I've done it without issue on my local London firewall as well (a Fit-PC which runs the same style of Geode as the XO). The main issue is that Fedora only tests upgrades between supported versions. So they "support" yum upgrades to F-11 from F-9 and F-10.
"yum update" always seems to use a surprising amount of bandwidth, redownloading entire package databases just to see if anything new is available. olpc-update was much more efficient in this respect, sending only a 128-byte hash of the filesystem contents file to the OATS server. For rpms, is there a more efficient alternative for updates-checking in situations where there is only going to be e.g. 1 update per month?
This should have improved greatly from F-9 to F-11. As for a more efficient alternative, not that I'm aware of, but if there's only one update a month, why does it matter?
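For comparison, the kind of check described in the quoted paragraph (sending only a small hash of the filesystem contents file) can be sketched in a few lines of Python. This is a guess at the general shape, not the OATS protocol: the function names are hypothetical, and SHA-512 is only a stand-in, since the thread does not say which hash olpc-update uses. Its hex digest happens to be 128 characters long.

```python
import hashlib


def contents_digest(manifest_bytes):
    """Return a fixed-size digest of a filesystem contents manifest.

    SHA-512 is an assumption made for illustration only.
    """
    return hashlib.sha512(manifest_bytes).hexdigest()


def update_needed(client_digest, latest_manifest_bytes):
    """Server-side decision: does the client's digest differ from the latest build?"""
    return client_digest != contents_digest(latest_manifest_bytes)
```

The point is that the client uploads a constant-size string per check, regardless of how many packages or files the build contains, whereas a repository metadata download grows with the size of the repository.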
"yum update" has historically not worked very well on the XO. It hits OOM, it fills up yum's cache and then aborts, it gets confused between i586 and i686, etc.
That should be a thing of the past with F-11. Also, it's a simple change to an rpm config file to change the Geode from being i586 to i686 as far as rpm is concerned. With F-11 being i586 this shouldn't really be an issue either, and the problem will be gone entirely with F-12. rpm and yum memory usage has improved a lot in recent releases, so hopefully the overall situation should be improved.
http://laiskiainen.org/blog/?p=19
We would have to tweak yum's behaviour quite heavily.. I don't think we want an rpm cache, or for it to keep the .rpm files at all.
It only keeps the rpms until it's installed them. Most of that should be easily tunable through an olpc-release package similar to, or depending on, the generic-release or fedora-release package.
It introduces new questions of security, signing, etc. Deployments will want to sign their rpm updates that they push out, so we now need a mechanism for getting that specific RPM signing public key onto all the laptops in a deployment so that they can trust the updates server. We had this nailed down for olpc-update: deployments can insert "local" public keys into the manufacturing through keyjector firmwares for existing laptops, and Quanta can now manufacture laptops with these keys already in place for future orders. olpc-update and the bitfrost code used these keys to verify updates.
The way to get the signing key onto them would be through or in the equivalent of the fedora-release file.
For the XO-1 possibility it raises the question of how existing laptops could be migrated to this new system, without losing their user data.
I think it would be possible, but obviously it's something that needs testing, and it would depend partially on the deployment as well.
Using "yum update" would be a large benefit of sharing technologies with other communities and not having to maintain our own systems and tools. However olpc-update would be easier in the short term because all the components have been implemented and deployed, and there aren't really any open questions.
Agreed.
Any other thoughts/opinions one way or the other?
None either way. Obviously, using yum has the advantage of being upstream and having mainstream testing and development. The advantage of olpc-update is that it has specific features, some of which are used and some of which aren't, and that it was designed with OLPC deployments in mind. Maybe some of that functionality can be implemented through the yum plugin system.
Peter