We've talked about this for a while, but let's make it formal. The plan is to transition from Cloud as a Fedora Edition to Something Container Clustery (see https://fedoraproject.org/wiki/Objectives/ProjectFAO).
But, we still need cloud as a _deploy target_. The FAO-container-thing will continue to have cloud image deploy targets (as well as bare metal). I think it makes sense to _also_ have Fedora Server as a cloud deploy target.
This could possibly be both a Fedora Server Minimal Cloud Image and Fedora Server Batteries Included Image — but that'd be up to Server WG, I think.
Overall, I'm proposing:
1. Dissolve Cloud WG (See below; don't panic)
2. Form new Atomic WG or FAO WG (name to be bikeshedded) (a lot of overlap in membership with current Cloud WG, of course!)
3. _Keep_ Cloud SIG as a gathering point around cloud technology and covering shared underlying technology (fedimg, koji cloud image production, autocloud). Think of this as analogous in some ways to something like the ARM SIG.
4. Change https://getfedora.org/cloud/ to https://getfedora.org/atomic/ or https://getfedora.org/fao/
5. Create new http://cloud.fedoraproject.org/ in the same style as https://arm.fedoraproject.org/
6. New Atomic/FAO WG produces Whatever New Deliverable (starting with Two Week Atomic)
7. Cloud Base Image becomes base (uh, see what I did there?) for new Fedora Server cloud image (or images).
8. Vagrant image _probably_ the same — or maybe becomes its own thing?
9. ???
10. Profit!
On 25 August 2016 at 13:34, Matthew Miller mattdm@fedoraproject.org wrote:
> We've talked about this for a while, but let's make it formal. The plan is to transition from Cloud as a Fedora Edition to Something Container Clustery (see https://fedoraproject.org/wiki/Objectives/ProjectFAO).
> But, we still need cloud as a _deploy target_. The FAO-container-thing will continue to have cloud image deploy targets (as well as bare metal). I think it makes sense to _also_ have Fedora Server as a cloud deploy target.
Could we make sure that whatever targets we have are actually getting tested? The fact that autocloud has been saying it was broken for months while the Cloud SIG wasn't looking or fixing it says that before we get to step 2, we need to ask: 'is anyone — more than two people — really interested?' It should be OK to say 'no, we aren't' without people diving into the fire trying to rescue something they wouldn't have helped with unless it was on fire.
> This could possibly be both a Fedora Server Minimal Cloud Image and Fedora Server Batteries Included Image — but that'd be up to Server WG, I think.
> Overall, I'm proposing:
> 1. Dissolve Cloud WG (See below; don't panic)
> 2. Form new Atomic WG or FAO WG (name to be bikeshedded) (a lot of overlap in membership with current Cloud WG, of course!)
> 3. _Keep_ Cloud SIG as a gathering point around cloud technology and covering shared underlying technology (fedimg, koji cloud image production, autocloud). Think of this as analogous in some ways to something like the ARM SIG.
> 4. Change https://getfedora.org/cloud/ to https://getfedora.org/atomic/ or https://getfedora.org/fao/
> 5. Create new http://cloud.fedoraproject.org/ in the same style as https://arm.fedoraproject.org/
> 6. New Atomic/FAO WG produces Whatever New Deliverable (starting with Two Week Atomic)
> 7. Cloud Base Image becomes base (uh, see what I did there?) for new Fedora Server cloud image (or images).
> 8. Vagrant image _probably_ the same — or maybe becomes its own thing?
> 9. ???
> 10. Profit!
> --
> Matthew Miller mattdm@fedoraproject.org
> Fedora Project Leader
> _______________________________________________
> server mailing list server@lists.fedoraproject.org
> https://lists.fedoraproject.org/admin/lists/server@lists.fedoraproject.org
On Thu, Aug 25, 2016 at 3:38 PM, Stephen John Smoogen smooge@gmail.com wrote:
> On 25 August 2016 at 13:34, Matthew Miller mattdm@fedoraproject.org wrote:
> > We've talked about this for a while, but let's make it formal. The plan is to transition from Cloud as a Fedora Edition to Something Container Clustery (see https://fedoraproject.org/wiki/Objectives/ProjectFAO).
> > But, we still need cloud as a _deploy target_. The FAO-container-thing will continue to have cloud image deploy targets (as well as bare metal). I think it makes sense to _also_ have Fedora Server as a cloud deploy target.
> Could we make sure that whatever targets we have are actually getting tested? The fact that autocloud has been saying it was broken for months while the Cloud SIG wasn't looking or fixing it says that before we get to step 2, we need to ask: 'is anyone — more than two people — really interested?' It should be OK to say 'no, we aren't' without people diving into the fire trying to rescue something they wouldn't have helped with unless it was on fire.
There are a lot of images being produced and I have no idea if they're really needed. That a release-blocking image (cloud base qcow2) nearly caused F25 Alpha to slip because it was busted at least suggests it probably shouldn't be release-blocking anymore. FWIW, cloud base qcow2 now gets grub2 in lieu of extlinux as the workaround for the breakage.
On Thu, Aug 25, 2016 at 04:43:50PM -0600, Chris Murphy wrote:
> There are a lot of images being produced and I have no idea if they're really needed. That a release-blocking image (cloud base qcow2) nearly caused F25 Alpha to slip because it was busted at least suggests it probably shouldn't be release-blocking anymore. FWIW, cloud base qcow2 now gets grub2 in lieu of extlinux as the workaround for the breakage.
Puts us back at 231M for the qcow2, instead of 195M for F24. Ah well; at least it boots.
Rather than having the Cloud Base Image — or its Server-based successor — be blocking, I'd like to see it as an updated, automatically-tested two-week image. Ideally, we'd have a solid one on release day, but if we don't for some reason, it'd be less of a crisis.
We also, obviously, have a process breakdown around what to do with failure reports from autocloud.
On 26 August 2016 at 14:27, Matthew Miller mattdm@fedoraproject.org wrote:
> On Thu, Aug 25, 2016 at 04:43:50PM -0600, Chris Murphy wrote:
> > There are a lot of images being produced and I have no idea if they're really needed. That a release-blocking image (cloud base qcow2) nearly caused F25 Alpha to slip because it was busted at least suggests it probably shouldn't be release-blocking anymore. FWIW, cloud base qcow2 now gets grub2 in lieu of extlinux as the workaround for the breakage.
> Puts us back at 231M for the qcow2, instead of 195M for F24. Ah well; at least it boots.
Well if it helps any, the alternative is a 0 MB image :). I think we can all agree that is as good as it gets until we get complex or negative storage.
> Rather than having the Cloud Base Image — or its Server-based successor — be blocking, I'd like to see it as an updated, automatically-tested two-week image. Ideally, we'd have a solid one on release day, but if we don't for some reason, it'd be less of a crisis.
> We also, obviously, have a process breakdown around what to do with failure reports from autocloud.
On Fri, 2016-08-26 at 14:27 -0400, Matthew Miller wrote:
> On Thu, Aug 25, 2016 at 04:43:50PM -0600, Chris Murphy wrote:
> > There are a lot of images being produced and I have no idea if they're really needed. That a release-blocking image (cloud base qcow2) nearly caused F25 Alpha to slip because it was busted at least suggests it probably shouldn't be release-blocking anymore. FWIW, cloud base qcow2 now gets grub2 in lieu of extlinux as the workaround for the breakage.
> Puts us back at 231M for the qcow2, instead of 195M for F24. Ah well; at least it boots.
> Rather than having the Cloud Base Image — or its Server-based successor — be blocking, I'd like to see it as an updated, automatically-tested two-week image. Ideally, we'd have a solid one on release day, but if we don't for some reason, it'd be less of a crisis.
> We also, obviously, have a process breakdown around what to do with failure reports from autocloud.
Right. We *have* the automated testing, but automated testing is no use if no-one looks at the results and fixes the bugs. This is not really a QA responsibility (even though I seem to be the one who always winds up doing it for Server and Workstation; I do not have time to do it for Cloud). Of course, in an 'ideal' world we'd have a more CI-ish setup where changes that cause the tests to start failing get rejected, and people are working on that - but the fact that we don't have it already is not an excuse to ignore the test systems we already have in place.
I will note that I filed https://bugzilla.redhat.com/show_bug.cgi?id=1331864 - an unavoidable crash when installing from the Atomic installer image - in *April*, and no-one appears to care that the Atomic installer image has been broken since then. This bug still shows up like clockwork in every F25 and Rawhide compose tested in openQA. It makes me wonder why I put in the effort to implement the openQA testing, if no-one cares when it finds a bug.
On Fri, Aug 26, 2016, at 04:45 PM, Adam Williamson wrote:
> I will note that I filed https://bugzilla.redhat.com/show_bug.cgi?id=1331864 - an unavoidable crash when installing from the Atomic installer image - in *April*, and no-one appears to care that the Atomic installer image has been broken since then.
In this case, it's a combination of a routing issue and bandwidth; the routing issue here is that the people who watch the `anaconda` Bugzilla entries don't currently intersect much with Atomic Host. In general, please add me to CC for any critical bugs you find.
Or alternatively, raise any blockers on the cloud@ or atomic-devel@ lists.
On Fri, 2016-08-26 at 17:03 -0400, Colin Walters wrote:
> On Fri, Aug 26, 2016, at 04:45 PM, Adam Williamson wrote:
> > I will note that I filed https://bugzilla.redhat.com/show_bug.cgi?id=1331864 - an unavoidable crash when installing from the Atomic installer image - in *April*, and no-one appears to care that the Atomic installer image has been broken since then.
> In this case, it's a combination of a routing issue and bandwidth; the routing issue here is that the people who watch the `anaconda` Bugzilla entries don't currently intersect much with Atomic Host. In general, please add me to CC for any critical bugs you find.
> Or alternatively, raise any blockers on the cloud@ or atomic-devel@ lists.
I CC'ed Adam Miller, figuring he'd CC anyone else who was interested, but OK, I'll add you in future. I think I have mentioned it a couple times before in IRC and stuff, but never mind.
On Fri, Aug 26, 2016 at 05:03:21PM -0400, Colin Walters wrote:
> In this case, it's a combination of a routing issue and bandwidth; the routing issue here is that the people who watch the `anaconda` Bugzilla entries don't currently intersect much with Atomic Host. In general, please add me to CC for any critical bugs you find. Or alternatively, raise any blockers on the cloud@ or atomic-devel@ lists.
As I understand it, AutoCloud creates these issues: https://pagure.io/atomic-images/issues
We should probably subscribe one or both of those lists to those tickets.
On 26 August 2016 at 16:45, Adam Williamson adamwill@fedoraproject.org wrote:
> On Fri, 2016-08-26 at 14:27 -0400, Matthew Miller wrote:
> > On Thu, Aug 25, 2016 at 04:43:50PM -0600, Chris Murphy wrote:
> > > There are a lot of images being produced and I have no idea if they're really needed. That a release-blocking image (cloud base qcow2) nearly caused F25 Alpha to slip because it was busted at least suggests it probably shouldn't be release-blocking anymore. FWIW, cloud base qcow2 now gets grub2 in lieu of extlinux as the workaround for the breakage.
> > Puts us back at 231M for the qcow2, instead of 195M for F24. Ah well; at least it boots.
> > Rather than having the Cloud Base Image — or its Server-based successor — be blocking, I'd like to see it as an updated, automatically-tested two-week image. Ideally, we'd have a solid one on release day, but if we don't for some reason, it'd be less of a crisis.
> > We also, obviously, have a process breakdown around what to do with failure reports from autocloud.
> Right. We *have* the automated testing, but automated testing is no use if no-one looks at the results and fixes the bugs. This is not really a QA responsibility (even though I seem to be the one who always winds up doing it for Server and Workstation; I do not have time to do it for Cloud). Of course, in an 'ideal' world we'd have a more CI-ish setup where changes that cause the tests to start failing get rejected, and people are working on that - but the fact that we don't have it already is not an excuse to ignore the test systems we already have in place.
> I will note that I filed https://bugzilla.redhat.com/show_bug.cgi?id=1331864 - an unavoidable crash when installing from the Atomic installer image - in *April*, and no-one appears to care that the Atomic installer image has been broken since then. This bug still shows up like clockwork in every F25 and Rawhide compose tested in openQA. It makes me wonder why I put in the effort to implement the openQA testing, if no-one cares when it finds a bug.
I feel your pain on this, but think it is also a good thing. Maybe no-one cares, period, about this target — but we didn't have data on that until you put in a tool that could measure how much people actually care. Look at these tools as part of the cost people need to pay for having various targets of the distribution. People say they want stuff as long as it is free to them, even if they never use it... but when a cost is actually associated with the thing, they are a lot pickier about what they want to spend on.
On 08/25/2016 01:34 PM, Matthew Miller wrote:
> We've talked about this for a while, but let's make it formal. The plan is to transition from Cloud as a Fedora Edition to Something Container Clustery (see https://fedoraproject.org/wiki/Objectives/ProjectFAO).
> But, we still need cloud as a _deploy target_. The FAO-container-thing will continue to have cloud image deploy targets (as well as bare metal). I think it makes sense to _also_ have Fedora Server as a cloud deploy target.
> This could possibly be both a Fedora Server Minimal Cloud Image and Fedora Server Batteries Included Image — but that'd be up to Server WG, I think.
I've been socializing this a bit lately (at Flock, on #fedora-devel, etc.) and I think it makes a lot of sense to unify the cloud image into Fedora Server. I'm not sure we necessarily want to try to produce two different images here, though. I think we probably want a single cloud image that is just Fedora Server (as it would be installed by Anaconda) plus whatever "cloudy bits" are needed to get it up and running.
The Server and Cloud Editions previously didn't differ in too many ways except for the inclusion of rolekit and Cockpit in the Server. With our plans to remove rolekit in favor of a config-management system like Ansible, I think that really only leaves Cockpit (and its dependencies) as a possible point of contention. Cockpit's modularity does mean that we can minimize its footprint if we opt to skip things like the NetworkManager and storaged support from the Cloud image.
We can debate this later, but for now I'd be in favor of just keeping all of Cockpit there and available, if only because it helps us strongly encourage the use of the modern APIs that it uses under the hood (such as NetworkManager and storaged).
> Overall, I'm proposing:
> 1. Dissolve Cloud WG (See below; don't panic)
> 2. Form new Atomic WG or FAO WG (name to be bikeshedded) (a lot of overlap in membership with current Cloud WG, of course!)
> 3. _Keep_ Cloud SIG as a gathering point around cloud technology and covering shared underlying technology (fedimg, koji cloud image production, autocloud). Think of this as analogous in some ways to something like the ARM SIG.
> 4. Change https://getfedora.org/cloud/ to https://getfedora.org/atomic/ or https://getfedora.org/fao/
> 5. Create new http://cloud.fedoraproject.org/ in the same style as https://arm.fedoraproject.org/
> 6. New Atomic/FAO WG produces Whatever New Deliverable (starting with Two Week Atomic)
> 7. Cloud Base Image becomes base (uh, see what I did there?) for new Fedora Server cloud image (or images).
> 8. Vagrant image _probably_ the same — or maybe becomes its own thing?
> 9. ???
> 10. Profit!
On 26 August 2016 at 09:51, Stephen Gallagher sgallagh@redhat.com wrote:
> On 08/25/2016 01:34 PM, Matthew Miller wrote:
> > We've talked about this for a while, but let's make it formal. The plan is to transition from Cloud as a Fedora Edition to Something Container Clustery (see https://fedoraproject.org/wiki/Objectives/ProjectFAO).
> > But, we still need cloud as a _deploy target_. The FAO-container-thing will continue to have cloud image deploy targets (as well as bare metal). I think it makes sense to _also_ have Fedora Server as a cloud deploy target.
> > This could possibly be both a Fedora Server Minimal Cloud Image and Fedora Server Batteries Included Image — but that'd be up to Server WG, I think.
> I've been socializing this a bit lately (at Flock, on #fedora-devel, etc.) and I think it makes a lot of sense to unify the cloud image into Fedora Server. I'm not sure we necessarily want to try to produce two different images here, though. I think we probably want a single cloud image that is just Fedora Server (as it would be installed by Anaconda) plus whatever "cloudy bits" are needed to get it up and running.
> The Server and Cloud Editions previously didn't differ in too many ways except for the inclusion of rolekit and Cockpit in the Server. With our plans to remove rolekit in favor of a config-management system like Ansible, I think that really only leaves Cockpit (and its dependencies) as a possible point of contention. Cockpit's modularity does mean that we can minimize its footprint if we opt to skip things like the NetworkManager and storaged support from the Cloud image.
> We can debate this later, but for now I'd be in favor of just keeping all of Cockpit there and available, if only because it helps us strongly encourage the use of the modern APIs that it uses under the hood (such as NetworkManager and storaged).
I expect there will need to be two images... mainly because people always ask for a 'minimal' image with nothing but ssh, a shell, and dnf in it. (Actually, they say they want even less than that... but rarely show up to do the work of getting that working.)
On Fri, 26 Aug 2016 10:36:25 -0400 Stephen John Smoogen smooge@gmail.com wrote:
> On 26 August 2016 at 09:51, Stephen Gallagher sgallagh@redhat.com wrote:
> > On 08/25/2016 01:34 PM, Matthew Miller wrote:
> > > We've talked about this for a while, but let's make it formal. The plan is to transition from Cloud as a Fedora Edition to Something Container Clustery (see https://fedoraproject.org/wiki/Objectives/ProjectFAO).
> > > But, we still need cloud as a _deploy target_. The FAO-container-thing will continue to have cloud image deploy targets (as well as bare metal). I think it makes sense to _also_ have Fedora Server as a cloud deploy target.
> > > This could possibly be both a Fedora Server Minimal Cloud Image and Fedora Server Batteries Included Image — but that'd be up to Server WG, I think.
> > I've been socializing this a bit lately (at Flock, on #fedora-devel, etc.) and I think it makes a lot of sense to unify the cloud image into Fedora Server. I'm not sure we necessarily want to try to produce two different images here, though. I think we probably want a single cloud image that is just Fedora Server (as it would be installed by Anaconda) plus whatever "cloudy bits" are needed to get it up and running.
> > The Server and Cloud Editions previously didn't differ in too many ways except for the inclusion of rolekit and Cockpit in the Server. With our plans to remove rolekit in favor of a config-management system like Ansible, I think that really only leaves Cockpit (and its dependencies) as a possible point of contention. Cockpit's modularity does mean that we can minimize its footprint if we opt to skip things like the NetworkManager and storaged support from the Cloud image.
> > We can debate this later, but for now I'd be in favor of just keeping all of Cockpit there and available, if only because it helps us strongly encourage the use of the modern APIs that it uses under the hood (such as NetworkManager and storaged).
> I expect there will need to be two images... mainly because people always ask for a 'minimal' image with nothing but ssh, a shell, and dnf in it. (Actually, they say they want even less than that... but rarely show up to do the work of getting that working.)
With a minimal image we could also make it easy to just install Cockpit from the repos, but if it doesn't really add that much space, I don't see why we couldn't just add it in.
Of course, while we are at it: if we use Cockpit's NetworkManager setup, we would need to switch to that from the current 'network' service, and if we are going to be using Ansible to do all kinds of things, we may want to include python2.
kevin
On Fri, Aug 26, 2016 at 8:36 AM, Stephen John Smoogen smooge@gmail.com wrote:
> On 26 August 2016 at 09:51, Stephen Gallagher sgallagh@redhat.com wrote:
> > On 08/25/2016 01:34 PM, Matthew Miller wrote:
> > > We've talked about this for a while, but let's make it formal. The plan is to transition from Cloud as a Fedora Edition to Something Container Clustery (see https://fedoraproject.org/wiki/Objectives/ProjectFAO).
> > > But, we still need cloud as a _deploy target_. The FAO-container-thing will continue to have cloud image deploy targets (as well as bare metal). I think it makes sense to _also_ have Fedora Server as a cloud deploy target.
> > > This could possibly be both a Fedora Server Minimal Cloud Image and Fedora Server Batteries Included Image — but that'd be up to Server WG, I think.
> > I've been socializing this a bit lately (at Flock, on #fedora-devel, etc.) and I think it makes a lot of sense to unify the cloud image into Fedora Server. I'm not sure we necessarily want to try to produce two different images here, though. I think we probably want a single cloud image that is just Fedora Server (as it would be installed by Anaconda) plus whatever "cloudy bits" are needed to get it up and running.
> > The Server and Cloud Editions previously didn't differ in too many ways except for the inclusion of rolekit and Cockpit in the Server. With our plans to remove rolekit in favor of a config-management system like Ansible, I think that really only leaves Cockpit (and its dependencies) as a possible point of contention. Cockpit's modularity does mean that we can minimize its footprint if we opt to skip things like the NetworkManager and storaged support from the Cloud image.
> > We can debate this later, but for now I'd be in favor of just keeping all of Cockpit there and available, if only because it helps us strongly encourage the use of the modern APIs that it uses under the hood (such as NetworkManager and storaged).
> I expect there will need to be two images... mainly because people always ask for a 'minimal' image with nothing but ssh, a shell, and dnf in it. (Actually, they say they want even less than that... but rarely show up to do the work of getting that working.)
Just for a size perspective:
Fedora-Cloud-Base-25_Alpha-2.x86_64.qcow2  232M
Fedora-Cloud-Base-24-1.2.x86_64.qcow2      195M
That uses cloud-init, which Server WG may want to drop; the Cloud WG was looking at using something else eventually. Since Cockpit from Fedora 24 has a way to install SSH public keys, it could be that this image requires public-key authentication by default (or maybe all Server images can do this now?) and uses Cockpit to install the keys.
I guess the nice thing about cloud-init is that cloud instances can be provisioned pretty easily by having an ISO "sidecar" file attached to the VM's cdrom device, without having to log in to that VM to set things up. I don't know if that's something Cockpit can mimic — a way to produce an export/import file?
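For reference, the ISO "sidecar" flow Chris describes is cloud-init's NoCloud datasource. A minimal sketch of building such a seed image might look like this (the SSH key, hostname, and instance-id below are placeholder values, not anything from this thread):

```shell
# Sketch: build a cloud-init NoCloud "sidecar" seed ISO.
# The NoCloud datasource looks for a volume labeled "cidata"
# containing user-data and meta-data files.
mkdir -p seed

cat > seed/user-data <<'EOF'
#cloud-config
ssh_authorized_keys:
  - ssh-ed25519 AAAA...placeholder... user@example.com
EOF

cat > seed/meta-data <<'EOF'
instance-id: demo-instance-001
local-hostname: demo-vm
EOF

# genisoimage may not be installed everywhere; guard the call.
if command -v genisoimage >/dev/null 2>&1; then
    genisoimage -output seed.iso -volid cidata -joliet -rock \
        seed/user-data seed/meta-data
fi
# Attach seed.iso to the VM's cdrom device; cloud-init reads it on first boot.
```

Whether Cockpit could mimic this would presumably hinge on producing something similarly attachable before first boot.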
On Fri, Aug 26, 2016, at 04:35 PM, Chris Murphy wrote:
> I guess the nice thing about cloud-init is that cloud instances can be provisioned pretty easily by having an ISO "sidecar" file attached to the VM's cdrom device, without having to login to that VM to set that up.
That's not the primary use case of cloud-init, though. The primary use is cloud-init-compatible providers like Amazon Web Services, OpenStack, and (somewhat) oVirt/RHEV.
For Project Atomic we were also looking at cloud-init for pxe-to-live; old blog post, not really productized: http://www.projectatomic.io/blog/2015/05/building-and-running-live-atomic/ (TL;DR, add a cloud-init URL to the kernel parameters on the PXE server)
Currently for Atomic Host, we carry cloud-init even on baremetal deployments, but it's disabled for traditional kickstart installs.
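As a rough illustration of the kernel-argument approach Colin mentions, a PXE boot entry pointing cloud-init's NoCloud datasource at an HTTP-hosted seed might look something like the following; the server address, file names, and paths here are hypothetical examples, not taken from the blog post:

```
# Hypothetical pxelinux.cfg entry — addresses and paths are examples only
LABEL atomic-live
  KERNEL vmlinuz
  APPEND initrd=initrd.img root=live:http://192.168.1.10/atomic-rootfs.img ds=nocloud-net;s=http://192.168.1.10/cloud-init/
```

The `ds=nocloud-net;s=<URL>` argument tells cloud-init to fetch user-data and meta-data over HTTP instead of from an attached volume.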
On Fri, Aug 26, 2016 at 09:51:11AM -0400, Stephen Gallagher wrote:
> We can debate this later, but for now I'd be in favor of just keeping all of Cockpit there and available, if only because it helps us strongly encourage the use of the modern APIs that it uses under the hood (such as NetworkManager and storaged).
We still have Cockpit-as-container as a planned releng deliverable (although I don't think it's going to make F25). In that case, there's not much advantage in having it baked in, is there?