= Proposed System Wide Change: RpmOstree - Server side composes and atomic upgrades =
https://fedoraproject.org/wiki/Changes/RpmOstree
Change owner(s): Colin Walters walters@verbum.org
The rpm-ostree [1] tool provides a new way to deploy and manage RPM-based operating systems. Instead of performing a package-by-package install and upgrade on each client machine, the tooling supports "composing" sets of packages on a server side, and then clients can perform atomic upgrades as a tree.
The system by default preserves the previously booted deployment, providing an "A/B partition" type feel, allowing quick system rollbacks for the entire OS content (kernel and userspace).
This is a dependency of the Changes/Atomic_Cloud_Image. [2]
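The "preserve the previously booted deployment" behaviour described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the `Deployment` and `Host` classes, the checksums, and the version strings are hypothetical illustrations, not rpm-ostree's actual data structures or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """One complete OS tree (kernel and userspace). Hypothetical model."""
    checksum: str
    version: str


@dataclass
class Host:
    """Toy client that keeps the booted tree plus one preserved tree."""
    booted: Deployment
    previous: Optional[Deployment] = None

    def upgrade(self, new_tree: Deployment) -> None:
        # The new tree replaces the booted one as a whole; the tree that
        # was booted before is preserved rather than overwritten.
        self.previous = self.booted
        self.booted = new_tree

    def rollback(self) -> None:
        # Swap back to the preserved deployment, if there is one.
        if self.previous is None:
            raise RuntimeError("no previous deployment to roll back to")
        self.booted, self.previous = self.previous, self.booted


host = Host(booted=Deployment("aaa", "21.1"))
host.upgrade(Deployment("bbb", "21.2"))
host.rollback()  # the whole tree, kernel included, goes back to "aaa"
```

The point of the model is that an upgrade never mutates the running tree in place, which is what gives the "A/B partition" feel without needing two fixed partitions.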
== Detailed Description ==
rpm-ostree is far from the first effort in the field of "image-like" update systems in Fedora. The StatelessLinux [3] project was first prototyped in Fedora Core 6 timeframe. Today, particularly in the cloud, many deployments perform OS upgrades by terminating an instance, and booting a new OS image and having it discover previous state stored in an external volume or network store.
Another model is to perform an atomic upgrade by delivering the OS content via an ISO or USB stick, and simply swapping it out, then rebooting. The oVirt Node [4] is an example of this model.
The most challenging case, though, is stateful systems that require online/incremental Internet/Intranet connected upgrades. This is the default model for traditional Fedora package managers such as yum. A common approach for this is to have an "A/B" partition model, and to use rsync or a custom tool to perform upgrades offline into the non-active partition.
rpm-ostree is attempting to address this last case, but in a more flexible and dynamic fashion. It has some of the flexibility of package systems, with the atomic upgrade and rollback of image-based systems. Furthermore, rpm-ostree intends to bind together the world of packages with an image-like update system. For example, an "rpm-ostree upgrade" command can show the system administrator the package-level diff.
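To illustrate what a package-level diff between two composed trees amounts to, here is a minimal sketch. It assumes each tree can be summarized as a mapping from package name to version string; the function name `package_diff` and the sample package versions are hypothetical and are not taken from rpm-ostree itself.

```python
from typing import Dict, List, Tuple


def package_diff(
    old: Dict[str, str], new: Dict[str, str]
) -> Dict[str, List]:
    """Compare two {package name: version} mappings.

    Returns the names added in `new`, the names removed from `old`, and
    (name, old_version, new_version) tuples for packages whose version changed.
    """
    added = sorted(name for name in new if name not in old)
    removed = sorted(name for name in old if name not in new)
    changed: List[Tuple[str, str, str]] = sorted(
        (name, old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical package sets for the booted tree and the pending tree.
booted = {"kernel": "3.17.4", "bash": "4.3.30", "rootfiles": "8.1"}
pending = {"kernel": "3.18.1", "bash": "4.3.30", "docker": "1.4.1"}
diff = package_diff(booted, pending)
```

Although the upgrade itself is applied as one tree, a diff like this lets the administrator still reason about it in terms of packages.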
In the future, the intention is for rpm-ostree to further gain package-system like features. See package layering prototype [5]. An active git branch uses libhif [6].
== Scope ==
* Proposal owners: work on http://projectatomic.io upstream
* Other developers:
** Anaconda: Help maintain rpmostreepayload.py
** Anaconda/Architecture porters: Backends for the OSTree bootloader code, similar to grubby
** RPM content:
*** Use systemd-tmpfiles instead of placing content in /var (TODO: better docs for this)
*** Change "rootfiles" and "bash" to not require files in /root by default (TODO: bugzilla entry)
* Release engineering: Create trees from package set, mirroring support
* Policies and guidelines: TODO: Guideline for /var
[1] https://github.com/projectatomic/rpm-ostree
[2] https://fedoraproject.org/wiki/Changes/Atomic_Cloud_Image
[3] https://fedoraproject.org/wiki/StatelessLinux
[4] http://www.ovirt.org/Node_Building
[5] https://lists.projectatomic.io/projectatomic-archives/atomic-devel/2014-Octo...
[6] https://github.com/projectatomic/rpm-ostree/pull/81
_______________________________________________
devel-announce mailing list
devel-announce@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel-announce
On Tue, Jan 13, 2015 at 7:32 AM, Jaroslav Reznik jreznik@redhat.com wrote:
= Proposed System Wide Change: RpmOstree - Server side composes and atomic upgrades =
https://fedoraproject.org/wiki/Changes/RpmOstree
Change owner(s): Colin Walters walters@verbum.org
The rpm-ostree [1] tool provides a new way to deploy and manage RPM-based operating systems. Instead of performing a package-by-package install and upgrade on each client machine, the tooling supports "composing" sets of packages on a server side, and then clients can perform atomic upgrades as a tree.
The system by default preserves the previously booted deployment, providing an "A/B partition" type feel, allowing quick system rollbacks for the entire OS content (kernel and userspace).
This is a dependency of the Changes/Atomic_Cloud_Image. [2]
Erm.. if this change is a dependency for the above, did the above wind up not making F21? If so, should it be moved to F22?
== Detailed Description ==
rpm-ostree is far from the first effort in the field of "image-like" update systems in Fedora. The StatelessLinux [3] project was first prototyped in Fedora Core 6 timeframe. Today, particularly in the cloud, many deployments perform OS upgrades by terminating an instance, and booting a new OS image and having it discover previous state stored in an external volume or network store.
Another model is to perform an atomic upgrade by delivering the OS content via an ISO or USB stick, and simply swapping it out, then rebooting. The oVirt Node [4] is an example of this model.
The most challenging case, though, is stateful systems that require online/incremental Internet/Intranet connected upgrades. This is the default model for traditional Fedora package managers such as yum. A common approach for this is to have an "A/B" partition model, and to use rsync or a custom tool to perform upgrades offline into the non-active partition.
rpm-ostree is attempting to address this last case, but in a more flexible and dynamic fashion. It has some of the flexibility of package systems, with the atomic upgrade and rollback of image-based systems. Furthermore, rpm-ostree intends to bind together the world of packages with an image-like update system. For example, an "rpm-ostree upgrade" command can show the system administrator the package-level diff.
In the future, the intention is for rpm-ostree to further gain package-system like features. See package layering prototype [5]. An active git branch uses libhif [6].
== Scope ==
Proposal owners: work on http://projectatomic.io upstream
Other developers:
** Anaconda: Help maintain rpmostreepayload.py
** Anaconda/Architecture porters: Backends for the OSTree bootloader code, similar to grubby
** RPM content:
*** Use systemd-tmpfiles instead of placing content in /var (TODO: better docs for this)
*** Change "rootfiles" and "bash" to not require files in /root by default (TODO: bugzilla entry)
Release engineering: Create trees from package set, mirroring support
Policies and guidelines: TODO: Guideline for /var
Jaroslav, there is a lot more information on the actual wiki page. Like the fact that this is only for particular opt-in new installs and that yum/dnf/RPM can only operate in read-only mode on such installs. Could you resend this with the entirety of the text? It might lead to fewer questions.
josh
Jaroslav, there is a lot more information on the actual wiki page. Like the fact that this is only for particular opt-in new installs and that yum/dnf/RPM can only operate in read-only mode on such installs. Could you resend this with the entirety of the text? It might lead to fewer questions.
This is being sent to devel-announce, so should not overwhelm people who are not interested. That’s why it includes the basic description (to let you decide whether you are interested) and the Scope section (to let you check whether this will, through the “Other developers” bullet point, place demands on you). It is somewhat important that everybody reads these parts; wouldn’t including the full page drown these parts out?
Mirek
On Tue, Jan 13, 2015 at 3:27 PM, Miloslav Trmač mitr@redhat.com wrote:
Jaroslav, there is a lot more information on the actual wiki page. Like the fact that this is only for particular opt-in new installs and that yum/dnf/RPM can only operate in read-only mode on such installs. Could you resend this with the entirety of the text? It might lead to fewer questions.
This is being sent to devel-announce, so should not overwhelm people who are not interested. That’s why it includes the basic description (to let you decide whether you are interested) and the Scope section (to let you check whether this will, through the “Other developers” bullet point, place demands on you). It is somewhat important that everybody reads these parts; wouldn’t including the full page drown these parts out?
No. People can stop reading wherever they'd like. Omitting relevant information from the actual Change page makes it rather difficult to _discuss_ the Change on the devel list, which is the main reason they are sent. Doing the discussion on the wiki is terrible. I would much rather be able to quote the sections via email.
josh
----- Original Message -----
On Tue, Jan 13, 2015 at 3:27 PM, Miloslav Trmač mitr@redhat.com wrote:
Jaroslav, there is a lot more information on the actual wiki page. Like the fact that this is only for particular opt-in new installs and that yum/dnf/RPM can only operate in read-only mode on such installs. Could you resend this with the entirety of the text? It might lead to fewer questions.
This is being sent to devel-announce, so should not overwhelm people who are not interested. That’s why it includes the basic description (to let you decide whether you are interested) and the Scope section (to let you check whether this will, through the “Other developers” bullet point, place demands on you). It is somewhat important that everybody reads these parts; wouldn’t including the full page drown these parts out?
No. People can stop reading wherever they'd like. Omitting relevant information from the actual Change page makes it rather difficult to _discuss_ the Change on the devel list, which is the main reason they are sent. Doing the discussion on the wiki is terrible. I would much rather be able to quote the sections via email.
Actually, it was Mitr who first asked me to add more parts to the email announcement, but then we talked about it, and I think the current way is a good compromise: it gives an overview of the change without overloading people with too many details. It also really depends on how the change is filled in; usually the detailed description and scope are the parts really needed to make a decision or get an overview of the change in detail. With more focus on contingency plans, I can add the contingency plan section. With these three, you have covered 95% of the content of standard changes. Let's try it.
You can easily copy an excerpt from the wiki into an email and start that discussion; the wiki is not needed for discussion (and is really a very bad place for it).
Btw. this is my opinion; if more people would like to see the whole change page announced, I'll do it. Other ideas on how the announcement is structured are welcome too...
Jaroslav
josh
== Scope ==
- Other developers:
*** Use systemd-tmpfiles instead of placing content in /var (TODO: better docs for this)
Is this a strict dependency or a nice-to-have item? That is, are we talking about having to change all such packages in Fedora (or some specific subset) within the next ~month?
Mirek
On Tue, Jan 13, 2015, at 04:06 PM, Miloslav Trmač wrote:
== Scope ==
- Other developers:
*** Use systemd-tmpfiles instead of placing content in /var (TODO: better docs for this)
Is this a strict dependency or a nice-to-have item? That is, are we talking about having to change all such packages in Fedora (or some specific subset) within the next ~month?
If the package is just installing an empty directory hierarchy, it's a nice to have - rpm-ostree auto-synthesizes tmpfiles.d snippets.
If it's installing a regular file, then it won't work - the package (daemon) needs to create it on start.
The goal of this is:
- Support "factory reset"
- Make it clear that (rpm-)ostree itself never touches /var, not on install or upgrade. It's the sole province of 1) The daemon writing the content 2) The administrator's chosen backup system.

(We use systemd-tmpfiles because it's a convenient way to ensure correct SELinux labeling for initial directories)
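As a rough illustration of what synthesizing tmpfiles.d snippets for empty directories could look like, the sketch below turns a list of packaged directories under /var into `d` lines in the tmpfiles.d format (type, path, mode, user, group, age). This is an assumption-laden sketch, not rpm-ostree's actual implementation: the function name `tmpfiles_lines` and the example paths are hypothetical.

```python
from typing import Iterable, List, Tuple

# (path, mode, user, group) for a directory shipped by a package.
DirEntry = Tuple[str, int, str, str]


def tmpfiles_lines(directories: Iterable[DirEntry]) -> List[str]:
    """Emit tmpfiles.d 'd' lines for packaged directories under /var.

    Directories outside /var are skipped, since they stay part of the
    OS tree rather than being created at boot.
    """
    lines = []
    for path, mode, user, group in directories:
        if not path.startswith("/var/"):
            continue
        # 'd' creates the directory if it does not exist; '-' means no
        # age-based cleanup is configured for it.
        lines.append(f"d {path} {mode:04o} {user} {group} -")
    return lines


# Hypothetical directories owned by some package.
snippet = tmpfiles_lines([
    ("/var/lib/example", 0o755, "root", "root"),
    ("/var/cache/example", 0o750, "example", "example"),
    ("/usr/share/example", 0o755, "root", "root"),
])
```

Only directory creation can be expressed this way; a regular file with meaningful content still has to be written by the daemon itself, which is the distinction drawn in the message above.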
On Tue, Jan 13, 2015, at 04:41 PM, Colin Walters wrote:
If it's installing a regular file, then it won't work - the package (daemon) needs to create it on start.
I filed a bug about this:
https://bugzilla.redhat.com/show_bug.cgi?id=1182785
Though I wonder if this should be a Change in itself, or a packaging guideline update?
To be clear, this transition doesn't have to happen all at once, only for the packages that one would want to consume via rpm-ostree. (Which ideally at some point is all, but I'm focusing on the core personally)
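For packages that currently ship a regular file in /var, the alternative mentioned earlier is for the daemon to create that file when it starts. A minimal sketch of that pattern follows; the helper name `ensure_state_file` and the path in the usage comment are hypothetical, and a real daemon would choose its own permissions and initial content.

```python
import os


def ensure_state_file(path: str, initial: bytes = b"") -> bool:
    """Create `path` with `initial` content unless it already exists.

    Returns True if the file was created, False if it was already there.
    Existing content is never overwritten, so state written on earlier
    runs survives upgrades and restarts.
    """
    os.makedirs(os.path.dirname(path), exist_ok=True)
    try:
        # O_EXCL makes creation fail if the file exists, avoiding a race
        # between checking for the file and creating it.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as handle:
        handle.write(initial)
    return True


# At daemon startup (hypothetical path):
# ensure_state_file("/var/lib/example/state.db", b"")
```

With this pattern the package itself ships nothing in /var, so the file's lifetime is owned entirely by the daemon and the administrator's backup system.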
On Tue, Jan 13, 2015, at 04:41 PM, Colin Walters wrote:
If it's installing a regular file, then it won't work - the package (daemon) needs to create it on start.
I filed a bug about this:
https://bugzilla.redhat.com/show_bug.cgi?id=1182785
Though I wonder if this should be a Change in itself, or a packaging guideline update?
I think it does need to go through the FPC. I don’t see a benefit in splitting the Change into two if there is a strict dependency; would it make sense to have the /var change without the rest of the RpmOstree features, or vice versa?
To be clear, this transition doesn't have to happen all at once, only for the packages that one would want to consume via rpm-ostree. (Which ideally at some point is all, but I'm focusing on the core personally)
(FWIW the schedule for going through the FPC and hitting the completion/testable deadline on Feb 24 feels a little tight, though probably still doable. Do you know how many packages would be affected?)
Mirek
On 13. 1. 2015 at 16:41:46, Colin Walters wrote:
On Tue, Jan 13, 2015, at 04:06 PM, Miloslav Trmač wrote:
== Scope ==
- Other developers:
*** Use systemd-tmpfiles instead of placing content in /var (TODO: better docs for this)
Is this a strict dependency or a nice-to-have item? That is, are we talking about having to change all such packages in Fedora (or some specific subset) within the next ~month?
If the package is just installing an empty directory hierarchy, it's a nice to have - rpm-ostree auto-synthesizes tmpfiles.d snippets.
If it's installing a regular file, then it won't work - the package (daemon) needs to create it on start.
The goal of this is:
- Support "factory reset"
- Make it clear that (rpm-)ostree itself never touches /var, not on install or upgrade. It's the sole province of 1) The daemon writing the content 2) The administrator's chosen backup system. (We use systemd-tmpfiles because it's a convenient way to ensure correct SELinux labeling for initial directories)
I have a hard time figuring out what exactly is the purpose of including the factory reset feature in your proposal. No offense, but unless I'm missing something, it seems to me that you are trying to solve some of ostree's problems in the rest of the distribution rather than in ostree itself.
I think this part of the proposed change has implications as severe as those the infamous UsrMove had. And from what I can remember, some of us spent another two releases fixing that up. In this particular case, I foresee problems with all databases (they store data in /var) and web servers (/var/www). For me personally the most immediate blocker is the rpm stack which stores its data in /var on multiple different levels. Even if we consider something as unimportant as metadata cache, re-downloading it because of transient /var is not something our users will be happy about. Also linking stuff to /usr is not an option, as /usr is often read-only and
All in all, I'm rather against this part of the proposal. In my opinion, ostree should take /var as it is now instead of re-designing it. If there is a strong demand for the factory reset feature, it should be proposed, decided and implemented separately.
Thanks Jan
On Mon, Jan 19, 2015, at 07:02 AM, Jan Zelený wrote:
I have a hard time figuring out what exactly is the purpose of including the factory reset feature in your proposal. No offense, but unless I'm missing something, it seems to me that you are trying to solve some of ostree's problems in the rest of the distribution rather than in ostree itself.
I wouldn't say ostree problems exactly. I certainly could change ostree to write to /var. But I think the benefits of it not doing so are worth the change in terms of getting a much cleaner separation between what's owned by the OS and what's user data.
I think this part of the proposed change has implications as severe as those the infamous UsrMove had. And from what I can remember, some of us spent another two releases fixing that up.
Yep, I too made changes for UsrMove.
In this particular case, I foresee problems with all databases (they store data in /var) and web servers (/var/www). For me personally the most immediate blocker is the rpm stack which stores its data in /var on multiple different levels.
Storing data in /var is fine!
Even if we consider something as unimportant as metadata cache, re-downloading it because of transient /var is not something our users will be happy about.
Hmm, there may be confusion on this, which is understandable because documentation is very thin. This isn't about making /var transient by default. In the default OSTree model it's fully persistent. It *can* be optionally transient, or reset explicitly.
All in all, I'm rather against this part of the proposal. In my opinion, ostree should take /var as it is now instead of re-designing it.
Does the above help to address your concerns?
If there is a strong demand for the factory reset feature, it should be proposed, decided and implemented separately.
Fair enough, again to be clear this is only partly about factory reset; it's also about making upgrades more reliable by having it be clear who owns data and when it's modified, which is why the ostree model uses it.
On 19. 1. 2015 at 11:30:22, Colin Walters wrote:
On Mon, Jan 19, 2015, at 07:02 AM, Jan Zelený wrote:
I have a hard time figuring out what exactly is the purpose of including the factory reset feature in your proposal. No offense, but unless I'm missing something, it seems to me that you are trying to solve some of ostree's problems in the rest of the distribution rather than in ostree itself.
I wouldn't say ostree problems exactly. I certainly could change ostree to write to /var. But I think the benefits of it not doing so are worth the change in terms of getting a much cleaner separation between what's owned by the OS and what's user data.
I think this part of the proposed change has implications as severe as those the infamous UsrMove had. And from what I can remember, some of us spent another two releases fixing that up.
Yep, I too made changes for UsrMove.
In this particular case, I foresee problems with all databases (they store data in /var) and web servers (/var/www). For me personally the most immediate blocker is the rpm stack which stores its data in /var on multiple different levels.
Storing data in /var is fine!
Even if we consider something as unimportant as metadata cache, re-downloading it because of transient /var is not something our users will be happy about.
Hmm, there may be confusion on this, which is understandable because documentation is very thin. This isn't about making /var transient by default. In the default OSTree model it's fully persistent. It *can* be optionally transient, or reset explicitly.
You are probably right, I might have misunderstood what you actually propose. Does it mean that you actually don't require this part to be implemented at all and you can go with what's in /var without any distribution-wide changes? In other words, do you propose this change to be gradually implemented where it makes sense?
All in all, I'm rather against this part of the proposal. In my opinion, ostree should take /var as it is now instead of re-designing it.
Does the above help to address your concerns?
Yes, it does. Thank you
If there is a strong demand for the factory reset feature, it should be proposed, decided and implemented separately.
Fair enough, again to be clear this is only partly about factory reset; it's also about making upgrades more reliable by having it be clear who owns data and when it's modified, which is why the ostree model uses it.
Thank you for the additional explanation. Now I think that the problem is not in what you want but in possibly ambiguous specification. What I'm afraid of is that some people will use this opportunity to push through fully transient /var.
Jan
On Tue, Jan 20, 2015, at 06:27 AM, Jan Zelený wrote:
You are probably right, I might have misunderstood what you actually propose. Does it mean that you actually don't require this part to be implemented at all and you can go with what's in /var without any distribution-wide changes?
Fedora 21 Atomic does work with a current F21 packageset. For example, Docker stores state in /var/lib/docker, and no changes were required to the docker RPM or rpm-ostree for this.
(Although, as an aside, Fedora 21 Atomic also stores docker images in an LVM volume, which means they would need special factory reset handling if the feature was implemented.)
In other words, do you propose this change to be gradually implemented where it makes sense?
Right, on demand. If there's a Fedora RPM that doesn't work with this scheme and is desirable to use by a 3rd party or by the Fedora Atomic subproject, we'd engage with the RPM maintainer to figure out a solution.
Thank you for the additional explanation. Now I think that the problem is not in what you want but in possibly ambiguous specification. What I'm afraid of is that some people will use this opportunity to push through fully transient /var.
There's a commit to OSTree which does make fully transient /var happen out of the box if the underlying / is read-only: https://git.gnome.org/browse/ostree/commit/?id=ff6883ca0655ac8844cd783caf6a7...
Which is pretty nice for some use cases, but definitely not the default =)
At a high level, I do agree this part of the change needs analysis, and I'm glad you brought it up. In the end I think the risk here is going to vary a lot per package.
For example, I need to look closely at the alternatives system.
But I certainly don't want to break the traditional install model - among other things, it's going to be a while before rpm-ostree would be a usable way to run my desktop, and also packages need to be backportable to older branches, etc.
On 20. 1. 2015 at 08:40:30, Colin Walters wrote:
On Tue, Jan 20, 2015, at 06:27 AM, Jan Zelený wrote:
You are probably right, I might have misunderstood what you actually propose. Does it mean that you actually don't require this part to be implemented at all and you can go with what's in /var without any distribution-wide changes?
Fedora 21 Atomic does work with a current F21 packageset. For example, Docker stores state in /var/lib/docker, and no changes were required to the docker RPM or rpm-ostree for this.
(Although, as an aside, Fedora 21 Atomic also stores docker images in an LVM volume, which means they would need special factory reset handling if the feature was implemented.)
In other words, do you propose this change to be gradually implemented where it makes sense?
Right, on demand. If there's a Fedora RPM that doesn't work with this scheme and is desirable to use by a 3rd party or by the Fedora Atomic subproject, we'd engage with the RPM maintainer to figure out a solution.
Thank you for the additional explanation. Now I think that the problem is not in what you want but in possibly ambiguous specification. What I'm afraid of is that some people will use this opportunity to push through fully transient /var.
There's a commit to OSTree which does make fully transient /var happen out of the box if the underlying / is read-only: https://git.gnome.org/browse/ostree/commit/?id=ff6883ca0655ac8844cd783caf6a7d8815515ba3
Which is pretty nice for some use cases, but definitely not the default =)
At a high level, I do agree this part of the change needs analysis, and I'm glad you brought it up. In the end I think the risk here is going to vary a lot per package.
For example, I need to look closely at the alternatives system.
But I certainly don't want to break the traditional install model - among other things, it's going to be a while before rpm-ostree would be a usable way to run my desktop, and also packages need to be backportable to older branches, etc.
Thanks again for the detailed explanation, I really appreciate it :-)
Jan