Hello all.
With release 0.3.0 about ready to ship, it seems like a good time to start talking about features we'd like to see for 0.4.0. I'd like to continue the three-month release cycle we've been on, so that puts our next release around mid-October.
Below are some obvious buckets; please feel free to suggest additional features, large or small.
Finally, note that I'm not claiming the list below is achievable in the timeframe we're talking about (although I would hope it's not far from what is achievable). I'm thinking more in terms of what would make our 0.4.0 release seem like a coherent whole, and make the largest number of upstream users interested and happy.
I'll start with Conductor features:
* Authorization. We have a fair amount of authorization checking in place, but no way to actually set who can do what. Given that a central Conductor feature is the ability to control access to cloud resources, this seems like an important feature. Things we'll need in order to put this in place:
* UX around setting permissions
* UX around displaying appropriate "You can't do that" messages where required, or showing/hiding controls as appropriate
* Good tests
* Not much model code -- I think it's all mostly in place. Correct me if I'm wrong.
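To make the discussion concrete, here's a minimal sketch of the kind of role-based check involved. All names below are illustrative, not Conductor's actual model code:

```python
# Illustrative only: a toy role-to-permission mapping, not Conductor's schema.
ROLE_PERMISSIONS = {
    "pool_admin": {"pool:create", "pool:delete", "instance:launch"},
    "self_service": {"instance:launch"},
}

def permitted(user_roles, action):
    """True if any of the user's roles grants the requested action."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)

# The UI would use the same check to show/hide controls or render
# a "You can't do that" message:
show_delete = permitted(["self_service"], "pool:delete")  # False for self-service
```

The same predicate serves both UX needs above: hiding a control and rejecting a request.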
* Identity and encryption. Authorization doesn't do a lot of good if anyone can bumble along and impersonate anyone else, so it would be pretty nice to have at least a workaday identity and encryption setup. Conversations with potential users have suggested the following minimum features, feel free to suggest your own:
* Conductor will authenticate against an LDAP server. Since most LDAP servers in the real world are Windows Active Directory, we should probably include AD in the set of servers we test against.
* Fall back to local user data store, maybe? You can imagine needing a local admin user that isn't in LDAP, for example
* Be able to proxy identity when talking to other things that need to know it. Checking identity when saving things to/retrieving things from Image Warehouse is the main requirement for this. I think Warehouse is getting a GSSAPI library soon, which should help. We will also probably need this for Katello, when we get to talking to it. FWIW, Katello is currently using two-legged OAuth for this, so I would think that would be the primary candidate for us too.
* A way to encrypt the traffic between Conductor, Deltacloud API, Warehouse, and Katello. The obvious solution for this is SSL certs that are created and signed by the installer, with some way to update/revoke them.
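On the two-legged OAuth point: in practice it amounts to signing every request with a shared consumer key/secret and no user token. A stdlib-only sketch of the signing step (a real integration would use an OAuth library; the parameter encoding here is simplified relative to the spec):

```python
import base64
import hashlib
import hmac
import secrets
import time
from urllib.parse import quote, urlencode

def oauth_sign(method, url, params, consumer_key, consumer_secret):
    """Build two-legged OAuth 1.0 protocol parameters with an HMAC-SHA1 signature."""
    oauth = {
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": secrets.token_hex(8),
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(int(time.time())),
        "oauth_version": "1.0",
    }
    # Signature base string: method & encoded-URL & encoded-sorted-params
    all_params = sorted({**params, **oauth}.items())
    base = "&".join([
        method.upper(),
        quote(url, safe=""),
        quote(urlencode(all_params), safe=""),
    ])
    # Two-legged: the token-secret half of the signing key is empty.
    key = quote(consumer_secret, safe="") + "&"
    digest = hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    oauth["oauth_signature"] = base64.b64encode(digest).decode()
    return oauth
```

The returned dict would go into an `Authorization: OAuth ...` header on each request to Katello or Warehouse.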
* Admin UX work
* We need to give the pool, pool family, and provider management screens the same loving treatment we have given the instance management screens.
* We need to make sure self-service really is sane. A big part of self service is image visibility -- i.e. who can launch what where (VMWare's "Catalog" concept answers this requirement for them). A good self-service solution is going to take thinking through some use cases and some serious UX work as well.
* I'd really like to see a front door to the Conductor app. I'm afraid to call it a "dashboard" because then it will never get built :). I'd love suggestions for what should appear on such a thing.
* Other UX work
* I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
* Status reporting
* We should reliably display the status of a running instance and its uptime
* We should start thinking about how we will handle the richer data about instance health that we will get once Matahari is in place
* Users should be able to view an audit trail of events for an instance or a set of instances
* Users should be able to export those events
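For the export piece, CSV is probably the lowest common denominator. A sketch, with made-up field names:

```python
import csv
import io

# Illustrative event records, shaped like the instance lifecycle data we
# already track; the field names here are assumptions, not the real schema.
events = [
    {"time": "2011-07-19T18:11:00Z", "instance": "web-1", "event": "launched"},
    {"time": "2011-07-19T18:14:30Z", "instance": "web-1", "event": "running"},
]

def export_events(events):
    """Render an instance audit trail as CSV, e.g. for download from the UI."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["time", "instance", "event"])
    writer.writeheader()
    writer.writerows(events)
    return buf.getvalue()
```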
* API
* We've been saying for a very long time that we need a real API for managing Conductor and for doing instance stuff in Conductor. If we admit that we have to manage instances that are not part of deployments, then we can also just say that the Deltacloud API we expose only works for instances. I think this is good enough for the next release.
Infrastructure-around-Conductor features:
* Identity and encryption. In addition to the bits that go in Conductor proper, there's going to be a lot of work in the installer and in other projects nearby.
* Better self-monitoring. I'd like to see a quick shell command that will give a meaningful report of the status of all the app components.
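As a sketch of what such a self-monitoring command might check -- the component names and ports below are made up:

```python
import socket

# Each app component (Conductor, Deltacloud API, ...) gets probed on the
# port it listens on. These host/port values are illustrative only.
COMPONENTS = {
    "conductor": ("localhost", 443),
    "deltacloud": ("localhost", 3002),
}

def component_up(host, port, timeout=1.0):
    """Return True if something is accepting TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def status_report(components=COMPONENTS):
    """Map each component name to a simple up/down boolean."""
    return {name: component_up(h, p) for name, (h, p) in components.items()}
```

A TCP probe is only a first approximation, of course; a real report would also want per-component health endpoints.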
* Way better logging and error reporting.
* All components should be using syslog if at all possible
* Logs should be timestamped
We should not be logging credentials or things that are potentially embarrassing
* Components can be distributed across multiple machines
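On the logging points above: timestamps and credential scrubbing are cheap to get. A sketch of the redaction idea (an in-memory stream stands in for the SysLogHandler a real setup would use; the pattern is illustrative):

```python
import io
import logging
import re

class ScrubCredentials(logging.Filter):
    """Redact anything that looks like a password/secret before it is logged."""
    PATTERN = re.compile(r"(password|secret|token)=\S+", re.IGNORECASE)

    def filter(self, record):
        record.msg = self.PATTERN.sub(r"\1=[REDACTED]", str(record.msg))
        return True

stream = io.StringIO()  # stand-in for logging.handlers.SysLogHandler
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(asctime)s %(name)s: %(message)s"))
logger = logging.getLogger("conductor.sketch")
logger.addHandler(handler)
logger.addFilter(ScrubCredentials())
logger.setLevel(logging.INFO)

logger.info("connecting to provider with password=hunter2")
# the captured line now ends with "password=[REDACTED]"
```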
* RHEV-M 3.0 really works as a cloud provider.
"Orchestrator" features (even though these aren't yet separate components, I've bracketed off stuff that concerns post-boot and multi-instance operations as conceptually different topics to work on)
* Assemblies
* Users can define assemblies that cause the post-boot config apparatus to install software and set config parameters on instances when they check in after booting
* Deployables and deployments
* Users can define deployables that contain multiple assemblies.
* Users can specify parameters that should be collected from a user when the user launches the deployable.
* Users can direct that parameters collected from a user be interpolated in arbitrary spots in the deployable descriptor.
* There is a UI for collecting parameters from the launching user
* There is a mechanism for passing all the assembly and deployable config information through to the post-boot agent. (I think this could use user-data, *or* a config server.)
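The parameter-interpolation step could be as simple as template substitution over the deployable descriptor. A sketch -- the XML shape here is invented for illustration, not the real deployable schema:

```python
from string import Template

# Hypothetical deployable descriptor with $-placeholders where user-supplied
# parameters should land; the actual descriptor format may differ.
DESCRIPTOR = Template("""\
<deployable name="webapp">
  <assembly name="frontend">
    <param name="db_host">$db_host</param>
    <param name="admin_email">$admin_email</param>
  </assembly>
</deployable>
""")

def render_descriptor(user_params):
    """Substitute values collected from the launching user into the descriptor."""
    return DESCRIPTOR.substitute(user_params)
```

`Template.substitute` raises `KeyError` on a missing parameter, which maps naturally onto "the UI must collect every declared parameter before launch".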
* Authorization
* Should there be some way of restricting the assemblies/deployables that a user can launch on particular hardware?
On 07/19/2011 11:11 AM, Hugh Brock wrote:
<snip>
Status reporting
We should reliably display the status of a running instance and its uptime
We should start thinking about how we will handle the richer data about instance health that we will get once Matahari is in place
Users should be able to view an audit trail of events for an instance or a set of instances
Users should be able to export those events
This would be a good place for pacemaker-cloud integration. Currently we generate QMF events when state changes occur.
Here is an example of a 3-assembly deployable where assembly 2 is terminated (and then restarted by pacemaker-cloud):
system start:
Event: {'reason': 'All good', 'assembly': 'assy1-F14', 'state': 'running', 'deployable': 'dep1-F14'}
Event: {'reason': 'All good', 'assembly': 'assy2-F14', 'state': 'running', 'deployable': 'dep1-F14'}
Event: {'reason': 'All good', 'assembly': 'assy3-F14', 'state': 'running', 'deployable': 'dep1-F14'}
At this point all assemblies are started in the deployable dep1-F14.
Then we terminate an assembly via virtual machine manager GUI:
Event: {'reason': 'Not reachable', 'assembly': 'assy2-F14', 'state': 'failed', 'deployable': 'dep1-F14'}
Then it is restarted by pacemaker-cloud (and active):
Event: {'reason': 'All good', 'assembly': 'assy2-F14', 'state': 'running', 'deployable': 'dep1-F14'}
These events are all generated by Matahari-based monitoring of all the assemblies and deployables.
On Tue, Jul 19, 2011 at 11:51:19AM -0700, Steven Dake wrote:
On 07/19/2011 11:11 AM, Hugh Brock wrote:
<snip>
Yes -- so we need to get this stuff into the event log that we display to the user, at a minimum.
When do you expect you'll be ready to stand this thing up and have it talk to Conductor? Do you have any sense of what sort of API we'll need to provide you to make the integration work?
On 07/24/2011 10:58 AM, Hugh Brock wrote:
On Tue, Jul 19, 2011 at 11:51:19AM -0700, Steven Dake wrote:
On 07/19/2011 11:11 AM, Hugh Brock wrote:
<snip>
Yes -- so we need to get this stuff into the event log that we display to the user, at a minimum.
When do you expect you'll be ready to stand this thing up and have it talk to Conductor? Do you have any sense of what sort of API we'll need to provide you to make the integration work?
We will be wrapped up with most stand-alone development this week. For integration, we need to discuss and then agree on the API(s) you mentioned.
When the integration will be complete is hard to predict; there are many unknowns that still need to be defined. I'll kick off the API discussion next week when we wrap up our Fedora 16 release. That should give us a better understanding of the integration tasks required and their associated schedules.
Regards -steve
On 07/19/11 - 02:11:26PM, Hugh Brock wrote:
Other UX work
- I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
I know this is going to make me unpopular, but I think we need to re-instate the UI for building images. As a short-term solution, removing it and going CLI-only removed a roadblock for us, but the immediate reaction of (potential) users when presented with it is revulsion.
I know that there is talk about making Katello do it, but I still don't love that solution for 2 reasons:
1) It is a separate place to do things. That is, you'd have to (potentially) log into Katello to do the building, then switch back to Aeolus. I guess I could imagine ways to integrate it into the Conductor UI so it is seamless, but I don't think I can imagine getting those in a short time-frame.
2) It is another external dependency. If one of the goals of the project is to keep required external dependencies down, then Conductor should have its own UI for building images.
Let the flames begin :).
On Tue, Jul 19, 2011 at 04:17:12PM -0400, Chris Lalancette wrote:
On 07/19/11 - 02:11:26PM, Hugh Brock wrote:
Other UX work
- I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
Incidentally, I was going to reply to this very block to suggest that this might also be of use in the sort of use case I've been thinking about lately -- someone who wants to run Conductor to manage a handful of instances, maybe across a couple providers, but who really doesn't want all the overhead that comes with a full-blown setup.
So there's that, too.
I know this is going to make me unpopular, but I think we need to re-instate the UI for building images. As a short-term solution, removing it and going CLI-only removed a roadblock for us, but the immediate reaction of (potential) users when presented with it is revulsion.
Having witnessed this same reaction, I have to agree with Chris here.
Personally, I'd be fine if we started by providing a limited UI, and explained that the command line tools offered more flexibility. But "hand-edit this XML file, run a command-line tool on it to build the image, tail -f the log file and wait for it to finish, and then copy-and-paste that UUID into another command" really didn't go over well at all with the people I showed this to. It works for us, but not the people we're building this for.
-- Matt
On Tue, Jul 19, 2011 at 04:42:14PM -0400, Matt Wagner wrote:
On Tue, Jul 19, 2011 at 04:17:12PM -0400, Chris Lalancette wrote:
I know this is going to make me unpopular, but I think we need to re-instate the UI for building images. As a short-term solution, removing it and going CLI-only removed a roadblock for us, but the immediate reaction of (potential) users when presented with it is revulsion.
Having witnessed this same reaction, I have to agree with Chris here.
Personally, I'd be fine if we started by providing a limited UI, and explained that the command line tools offered more flexibility. But "hand-edit this XML file, run a command-line tool on it to build the image, tail -f the log file and wait for it to finish, and then copy-and-paste that UUID into another command" really didn't go over well at all with the people I showed this to. It works for us, but not the people we're building this for.
Yes, you and Chris were exposed to the full "WTF" of the folks in the training, so I'm not surprised you have that thought.
I would like to challenge one assumption though, which is the whole idea that choosing a base OS and then choosing some packages to be installed on it, *outside the context of a native installer*, is even a useful activity at all in the real world.
So for example, you connect to EC2. EC2 has a million canned AMIs you can choose to launch. I am willing to bet that hardly anyone builds their own AMI for EC2; if they really want to do that, EC2 has a great set of tools for it. Would it not be more useful, just from the standpoint of Conductor all by itself, to say "We'll make it really easy for you to browse the images that are already available for you to start in EC2, and then start one"? Those images include current RHEL, Fedora, and Windows images, FWIW.
Similarly, take your local VMWare or RHEV-M installation. Both have a notion of "Templates" -- images that have already been pre-baked and are ready to go. And a lot of real-world installations are going to have a lot of those templates already. Right now it isn't really easy to browse those, but what if it was? Now you have a pretty convenient way to launch things on EC2 and RHEV-M and VMWare from a single pane of glass, and you never had to build *anything* -- from a command line, a UI, or anywhere.
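For what it's worth, the browsing piece is cheap to prototype: the work is in the UI, not the parsing. A sketch against an invented, Deltacloud-ish images listing (the real /api/images schema has more fields):

```python
import xml.etree.ElementTree as ET

# Sample response shaped roughly like a Deltacloud images listing;
# the XML here is simplified for illustration.
SAMPLE = """\
<images>
  <image id="ami-1234"><name>RHEL 6</name><state>AVAILABLE</state></image>
  <image id="ami-5678"><name>Fedora 15</name><state>AVAILABLE</state></image>
</images>"""

def available_images(xml_text):
    """Return (id, name) pairs for images a user could browse and launch."""
    root = ET.fromstring(xml_text)
    return [(img.get("id"), img.findtext("name")) for img in root.findall("image")]
```

The same parse would work whether the backing provider is EC2 AMIs or RHEV-M/VMWare templates, since Deltacloud abstracts both as images.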
Now of course what we've given up here is the notion of image equivalency -- that you can build something that will be close-enough-to-identical in its EC2 incarnation and its RHEV-M incarnation that people will accept it as such. Having said that, is the value proposition of RHEL not precisely that equivalency -- i.e. that you can launch RHEL on EC2 and it will behave precisely the way RHEL on VMWare or RHEL on RHEV-M behaves? If you the customer are willing to accept that some limited number of JEOSes -- which we could ship and preinstall with Conductor, by the way -- are enough images to suffice, and that you would then use some part of Orchestrator to install packages post-boot or run your own script to install things post-boot... well, then you don't need a template-building UI, because you never have to build a template.
I firmly believe that the reason not having a template building UI is a problem, is because we keep telling people they have to build templates in order to do things. I further think if we want Conductor to be successful, we should fix it so people don't have to build templates unless they really, really want to.
Apologize for the length, but this is a subject I have been thinking about a lot lately.
Please feel free to tell me I'm full of it...
--H
On 19/07/11 22:19, Hugh Brock wrote:
Now of course what we've given up here is the notion of image equivalency -- that you can build something that will be close-enough-to-identical in its EC2 incarnation and its RHEV-M incarnation that people will accept it as such. Having said that, is the value proposition of RHEL not precisely that equivalency -- i.e. that you can launch RHEL on EC2 and it will behave precisely the way RHEL on VMWare or RHEL on RHEV-M behaves? If you the customer are willing to accept that some limited number of JEOSes -- which we could ship and preinstall with Conductor, by the way -- are enough images to suffice, and that you would then use some part of Orchestrator to install packages post-boot or run your own script to install things post-boot... well, then you don't need a template-building UI, because you never have to build a template.
I firmly believe that the reason not having a template building UI is a problem, is because we keep telling people they have to build templates in order to do things. I further think if we want Conductor to be successful, we should fix it so people don't have to build templates unless they really, really want to.
I think this is absolutely right. A base OS image can be used for a huge array of purposes, which will allow users to get value out of Aeolus by launching useful deployments, rather than getting bogged down in building and managing multiple iterations of images.
I think that the integration with Katello is key here, though not because Katello is potentially a better home for the UI around template definition.
The aspect of Katello that can really help us is its ability to deploy packages and configuration to arrays of machines in a consistent way, and to keep them all up to date and in sync with each other (the principal purpose of RHN/spacewalk today).
By exploiting those capabilities, we can support a small set of base OS images, exactly as described by Hugh above, and use post-boot configuration to put an instance to a specific use.
We're already building the infrastructure where we'll have a config server which will pass run-time parameters to a newly launched instance. Combine that with the implementation of services through puppet/shell/whatever scripts which run on a newly booted instance and configure it for a task, and we have a flexible means of defining the purpose of an instance at boot time, by getting the instance to contact Katello, install updates & additional packages, update configuration files and start services.
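In other words, the post-boot flow is: fetch parameters from the config server, render them into service config, start services. A sketch of the rendering step only -- the endpoint and parameter names here are hypothetical:

```python
# Post-boot sketch: a newly booted instance would fetch its parameters from
# the config server (an HTTP GET against a hypothetical endpoint), then
# render and apply them locally via a puppet/shell script, as described above.

def render_httpd_conf(params):
    """Turn config-server parameters into a snippet of service config."""
    return "\n".join([
        f"ServerName {params['hostname']}",
        f"Listen {params['port']}",
    ])

def apply_post_boot_config(params, write=print):
    # In a real instance this would write the file and start the service;
    # here it just hands the rendered config to a writer callable.
    write(render_httpd_conf(params))
```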
There's a dependency on a reliable, scalable mechanism which can consistently deploy software and configuration across arrays of machines. Happily, that's exactly what RHN/spacewalk already does, and there's every reason to believe that Katello will do it just as well.
By moving the emphasis slightly away from image building, and onto post-boot configuration, we can avoid some other issues that have been chewed over quite a bit recently:
Images becoming stale: As soon as updated versions of packages are available, we need to flag those bootable images which are no longer up to date, and potentially rebuild them for each provider. By installing updates into an instance on boot, we can ensure that *running instances* are up to date and in sync with each other, even if the images from which they were booted are not (the cost of stale images becomes the time to install updates on each instance).
Promoting images from Development/QE to Prod: We need a way to allow an image to be propagated from one environment/pool to another once it has passed testing. By using a single base OS image, and encapsulating its specific purpose in post-boot configuration, rather than baking it into the image itself, we can meet this requirement by promoting the post-boot configuration rules, which is a significantly easier practical task than copying large images around, or reproducing images in another environment.
The alternative approach, of optimising images for a specific purpose, building those specific images for each provider/realm where they might be required, and then rebuilding them when they go stale, looks like a whole load of image building which isn't actually required, and would also require users to store and manage an expanding array of images.
Two things to bear in mind are that, for public cloud providers, storing those images costs real money, and, given that users will use a limited range of OS, a lot of the images will be near duplicates of each other, with only a thin layer of configuration, or a couple of packages, differentiating them.
On 07/20/11 - 11:19:49AM, Angus Thomas wrote:
I think this is absolutely right. A base OS image can be used for a huge array of purposes, which will allow users to get value out of Aeolus by launching useful deployments, rather than getting bogged down in building and managing multiple iterations of images.
<snipping other useful stuff>
I agree with this aspect of images for clouds, but only as one possible deployment model. My worry is that if we say this "thin provisioning" model is the one true model, then you are requiring people to accept that model in its entirety.
On the flip side, I think there are people who have servers running in their datacenter today who want to try out, or slowly transition to, the cloud. For those people, the more traditional model, where all of their software is pre-installed and only a few minor pieces of configuration are done at boot, is easier to swallow and more like what they are doing today. Additionally, this will be much faster to boot; have you ever watched how slow yum is to install a large number of packages?
That's why I advocate an image-building UI. Of course, if someone has substantial data to show that customers are *not* interested in my alternate model, then I'm happy to change my position :).
On Tue, 2011-07-19 at 17:19 -0400, Hugh Brock wrote:
I firmly believe that the reason not having a template building UI is a problem, is because we keep telling people they have to build templates in order to do things. I further think if we want Conductor to be successful, we should fix it so people don't have to build templates unless they really, really want to.
I think you're absolutely right, and a corollary is that folks should "really, really want to" build images from templates whether or not they are using conductor.
I think the image building tools have a compelling vision that is being lost by being so tied up with conductor - this isn't just a tool you need to use before you can do anything useful with conductor.
This is about freeing people from being locked into specific clouds by images, in the same way that deltacloud frees people from being locked in by the APIs.
If you use these tools to build your images, you have a much greater ability to switch cloud providers. That's an awesome message!
Cheers, Mark.
On 07/19/11 - 05:19:25PM, Hugh Brock wrote:
I firmly believe that the reason not having a template building UI is a problem, is because we keep telling people they have to build templates in order to do things. I further think if we want Conductor to be successful, we should fix it so people don't have to build templates unless they really, really want to.
As may be evident by some of my other replies, I am of the opinion that people will want to build their own images to install their own software.
However, even if we step back and take the tactic that people will always use pre-canned images, at the very least you need an "import image" GUI screen. If we start there, then we can see whether it is worthwhile to go further and have a full-up "choose your packages and install your OS" screen. Does that make sense?
On Wed, Jul 20, 2011 at 02:00:55PM -0400, Chris Lalancette wrote:
As may be evident by some of my other replies, I am of the opinion that people will want to build their own images to install their own software.
However, even if we step back and take the tactic that people will always use pre-canned images, at the very least you need an "import image" GUI screen. If we start there, then we can see whether it is worthwhile to go further and have a full-up "choose your packages and install your OS" screen. Does that make sense?
I think an "Import image" GUI screen or a "Show me the images that are available" GUI screen is a must-have. I'd like to get it scoped for this release. My thought is we should add it to the pool screen -- a pool should have some images that it can see by virtue of being connected to a cloud provider.
I was talking to a couple of the Katello guys today (www.katello.org) and they are going to look at adding a feature to generate a Factory template definition in an upcoming sprint. They don't think it should be that difficult if it's limited to specifying packages. I know their UI isn't explicitly connected to ours, but it seems to me like we should be using their template definition UI if we're going to use one.
Anyone up for having a look at Katello and helping those guys out with generating a Factory template? Or better yet, calling Factory directly to build an image and put it in Warehouse?
--H
On 07/20/2011 02:19 PM, Hugh Brock wrote:
I think an "Import image" GUI screen or a "Show me the images that are available" GUI screen is a must-have. I'd like to get it scoped for this release. My thought is we should add it to the pool screen -- a pool should have some images that it can see by virtue of being connected to a cloud provider.
Yeah -- we can even use the same code that aeolus-image uses for the push -- just wrap the API with the UI we need.
However, I'm not sure the pool place is the right area here -- we don't explicitly tie images to pools, nor do we explicitly tie providers to pools.
Although I guess we could still hook the action here as a convenience -- 'import image' would then allow a user to import images from any of the provider accounts that are attached to the pool family that this pool belongs to, assuming the user logged in has permission to view images on those providers (self-service users may or may not have that privilege -- we'd need to decide).
Scott
I was talking to a couple of the Katello guys today (www.katello.org) and they are going to look at adding a feature to generate a Factory template definition to an upcoming sprint. They don't think it should be that difficult if it's limited to specifying packages. I know their UI isn't explicitly connected to ours, but it seems to me like we should be using their template definition UI if we're going to use one.
Anyone up for having a look at Katello and helping those guys out in generating a Factory template? Or, better yet, calling Factory directly to build an image and put it in Warehouse?
--H
On 07/19/2011 05:19 PM, Hugh Brock wrote:
On Tue, Jul 19, 2011 at 04:42:14PM -0400, Matt Wagner wrote:
On Tue, Jul 19, 2011 at 04:17:12PM -0400, Chris Lalancette wrote:
On 07/19/11 - 02:11:26PM, Hugh Brock wrote:
Other UX work
- I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
Incidentally, I was going to reply to this very block to suggest that this might also be of use in the sort of use case I've been thinking about lately -- someone who wants to run Conductor to manage a handful of instances, maybe across a couple providers, but who really doesn't want all the overhead that comes with a full-blown setup.
So there's that, too.
I know this is going to make me unpopular, but I think we need to re-instate the UI for building images. As a short-term solution, removing it and going CLI-only removed a roadblock for us, but the immediate reaction of (potential) users when presented with it is revulsion.
Having witnessed this same reaction, I have to agree with Chris here.
Personally, I'd be fine if we started by providing a limited UI, and explained that the command-line tools offered more flexibility. But "hand-edit this XML file, run a command-line tool on it to build the image, tail -f the log file and wait for it to finish, and then copy-and-paste that UUID into another command" really didn't go over well at all with the people I showed this to. It works for us, but not for the people we're building this for.
Yes, you and Chris were exposed to the full "WTF" of the folks in the training, so I'm not surprised you have that thought.
I would like to challenge one assumption though, which is the whole idea that choosing a base OS and then choosing some packages to be installed on it, *outside the context of a native installer*, is even a useful activity at all in the real world.
So for example, you connect to EC2. EC2 has a million canned AMIs you can choose to launch. I am willing to bet that hardly anyone builds their own AMI for EC2; if they really want to do that, EC2 has a great set of tools for it. Would it not be more useful, just from the standpoint of Conductor all by itself, to say "We'll make it really easy for you to browse the images that are already available for you to start in EC2, and then start one"? These images include current RHEL and Fedora and Windows images, FWIW.
Similarly, take your local VMWare or RHEV-M installation. Both have a notion of "Templates" -- images that have already been pre-baked and are ready to go. And a lot of real-world installations are going to have a lot of those templates already. Right now it isn't really easy to browse those, but what if it was? Now you have a pretty convenient way to launch things on EC2 and RHEV-M and VMWare from a single pane of glass, and you never had to build *anything* -- from a command line, a UI, or anywhere.
Now of course what we've given up here is the notion of image equivalency -- that you can build something that will be close-enough-to-identical in its EC2 incarnation and its RHEV-M incarnation that people will accept it as such. Having said that, is the value proposition of RHEL not precisely that equivalency -- i.e. that you can launch RHEL on EC2 and it will behave precisely the way RHEL on VMWare or RHEL on RHEV-M behaves? If you, the customer, are willing to accept that some limited number of JEOSes -- which we could ship and preinstall with Conductor, by the way -- are enough images to suffice, and that you would then use some part of Orchestrator to install packages post-boot or run your own script post-boot... well, then you don't need a template-building UI, because you never have to build a template.
I firmly believe that the reason not having a template building UI is a problem, is because we keep telling people they have to build templates in order to do things. I further think if we want Conductor to be successful, we should fix it so people don't have to build templates unless they really, really want to.
Apologies for the length, but this is a subject I have been thinking about a lot lately.
Please feel free to tell me I'm full of it...
--H
We will get forced into a UI at some point, question is can we survive v1 without it?
If we can't, do we have time to get far enough that it does not suck so badly that no one uses it?
Carl.
On 07/19/2011 01:17 PM, Chris Lalancette wrote:
On 07/19/11 - 02:11:26PM, Hugh Brock wrote:
Other UX work
- I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
I know this is going to make me unpopular, but I think we need to re-instate the UI for building images. As a short-term solution, removing it and going CLI-only removed a roadblock for us, but the immediate reaction of (potential) users when presented with it is revulsion.
One problem the CLI solves is that it separates the function of creating the deployables from the activity of displaying the user interface. A CLI is really just an API with a command line presentation.
In my 10+ years of software development, I have seen GUI development fail time and time again because the details of the actions were tightly integrated with the graphical user interface. The failure to use abstraction results in too much complexity in one place. The net result is complexity fatigue and an inability to ship a functional GUI. Once the functionality was abstracted from the presentation, the software was shippable.
I advocate keeping the CLI and making a GUI that uses the CLI internally. The CLI then acts as an interface between the act of building images and the act of presentation to the user. With this approach, engineers are not overwhelmed with complexity and unable to make functional software. The disadvantage to this approach is the extra time required to negotiate a proper interface between the GUI developers and CLI developers. This is time well spent if the result is functional software.
Regards -steve
I know that there is talk about making Katello do it, but I still don't love that solution for 2 reasons:
1) It is a separate place to do things. That is, you'd have to (potentially) log into Katello to do the building, then switch back to Aeolus. I guess I could imagine ways to integrate it into the Conductor UI so it is seamless, but I don't think I can imagine getting those ways done in a short time-frame.
2) It is another external dependency. If one of the goals of the project is to keep required external dependencies down, then Conductor should have its own UI for building images.
Let the flames begin :).
On 07/20/11 - 09:46:22AM, Steven Dake wrote:
On 07/19/2011 01:17 PM, Chris Lalancette wrote:
On 07/19/11 - 02:11:26PM, Hugh Brock wrote:
Other UX work
- I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
I know this is going to make me unpopular, but I think we need to re-instate the UI for building images. As a short-term solution, removing it and going CLI-only removed a roadblock for us, but the immediate reaction of (potential) users when presented with it is revulsion.
One problem the CLI solves is that it separates the function of creating the deployables from the activity of displaying the user interface. A CLI is really just an API with a command line presentation.
In my 10+ years of software development, I have seen GUI development fail time and time again because the details of the actions were tightly integrated with the graphical user interface. The failure to use abstraction results in too much complexity in one place. The net result is complexity fatigue and an inability to ship a functional GUI. Once the functionality was abstracted from the presentation, the software was shippable.
I advocate keeping the CLI and making a GUI that uses the CLI internally. The CLI then acts as an interface between the act of building images and the act of presentation to the user. With this approach, engineers are not overwhelmed with complexity and unable to make functional software. The disadvantage to this approach is the extra time required to negotiate a proper interface between the GUI developers and CLI developers. This is time well spent if the result is functional software.
I mostly agree with this. Instead of saying the GUI should use the CLI internally, though, I would say "the GUI and the CLI should use the same interface". That interface I leave up in the air a bit, but I would presume it is a library of some sort.
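Chris's "same interface" point could be sketched roughly like this: a plain Ruby library that both front ends call into, so neither one owns the build logic. All names here are hypothetical, not the real aeolus-image API:

```ruby
# Hypothetical sketch: one library used by both the CLI and the GUI.
# Module, method, and struct names are illustrative only.
module ImageBuilder
  Result = Struct.new(:uuid, :status)

  # Core operation, independent of any presentation layer. A real
  # version would call out to imagefactory; stubbed for illustration.
  def self.build(template_xml, target)
    Result.new("1234-abcd", "building for #{target}")
  end
end

# CLI presentation: a thin wrapper over the library.
def cli_build(argv)
  result = ImageBuilder.build(File.read(argv[0]), argv[1])
  puts "Image #{result.uuid}: #{result.status}"
  result
end

# A GUI controller would call ImageBuilder.build the same way,
# rendering result.uuid/result.status into a page instead of stdout.
```

The point of the sketch is that the negotiation Steve mentions happens once, at the `ImageBuilder` boundary, rather than separately in each front end.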
On 07/20/2011 10:44 AM, Chris Lalancette wrote:
On 07/20/11 - 09:46:22AM, Steven Dake wrote:
On 07/19/2011 01:17 PM, Chris Lalancette wrote:
On 07/19/11 - 02:11:26PM, Hugh Brock wrote:
Other UX work
- I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
I know this is going to make me unpopular, but I think we need to re-instate the UI for building images. As a short-term solution, removing it and going CLI-only removed a roadblock for us, but the immediate reaction of (potential) users when presented with it is revulsion.
One problem the CLI solves is that it separates the function of creating the deployables from the activity of displaying the user interface. A CLI is really just an API with a command line presentation.
In my 10+ years of software development, I have seen GUI development fail time and time again because the details of the actions were tightly integrated with the graphical user interface. The failure to use abstraction results in too much complexity in one place. The net result is complexity fatigue and an inability to ship a functional GUI. Once the functionality was abstracted from the presentation, the software was shippable.
I advocate keeping the CLI and making a GUI that uses the CLI internally. The CLI then acts as an interface between the act of building images and the act of presentation to the user. With this approach, engineers are not overwhelmed with complexity and unable to make functional software. The disadvantage to this approach is the extra time required to negotiate a proper interface between the GUI developers and CLI developers. This is time well spent if the result is functional software.
I mostly agree with this. Instead of saying the GUI should use the CLI internally, though, I would say "the GUI and the CLI should use the same interface". That interface I leave up in the air a bit, but I would presume it is a library of some sort.
Good proposal. I suggest the cli/gui developers proceed with this methodology.
Regards -steve
On 07/19/2011 02:11 PM, Hugh Brock wrote:
- We need to make sure self-service really is sane. A big part of self service is image visibility -- i.e. who can launch what where (VMWare's "Catalog" concept answers this requirement for them). A good self-service solution is going to take thinking through some use cases and some serious UX work as well.
Speaking of self-service, currently self-service user creation has been removed from the UI. It appears to be an inadvertent regression that came with the redesigned login page, but now that it's gone, it's been suggested that we didn't want it anyway -- that we _want_ explicit admin approval before a new account is enabled.
So we need to decide what to do both for 0.3.0 and for 0.4.0+
Starting with the latter: if we're integrating with LDAP, etc., presumably we wouldn't want self-created self-service user accounts that _don't_ go through LDAP. That said, in the case of an external user store, we may not want to automatically enable _every_ LDAP user for Conductor, so in this case, self-service creation might amount to a user saying "Yes, I'm on LDAP -- I just logged in -- now enable me for Conductor." Then again, if we're enabling any LDAP user to do things, why not just do that whole thing transparently on LDAP login -- and in the case where the admins have decided that they have to explicitly enable external LDAP users for Conductor, they'd get some sort of 'access denied -- contact your administrator for access' message on login.
So back to 0.3.0 -- what here? Do we keep self-service hidden, or do we re-enable it?
Scott
On 07/19/2011 11:30 PM, Scott Seago wrote:
On 07/19/2011 02:11 PM, Hugh Brock wrote:
* We need to make sure self-service really is sane. A big part of self service is image visibility -- i.e. who can launch what where (VMWare's "Catalog" concept answers this requirement for them). A good self-service solution is going to take thinking through some use cases and some serious UX work as well.
Speaking of self-service, currently self-service user creation has been removed from the UI. It appears to be an inadvertent regression that came with the redesigned login page, but now that it's gone, it's been suggested that we didn't want it anyway -- that we _want_ explicit admin approval before a new account is enabled.
Only nit: I made sure that removing self-service from the login page was approved by Angus before ACKing the patch, so I think it was more intentional than an inadvertent regression.
Jan
On 20/07/11 08:54, Jan Provaznik wrote:
On 07/19/2011 11:30 PM, Scott Seago wrote:
On 07/19/2011 02:11 PM, Hugh Brock wrote:
* We need to make sure self-service really is sane. A big part of self service is image visibility -- i.e. who can launch what where (VMWare's "Catalog" concept answers this requirement for them). A good self-service solution is going to take thinking through some use cases and some serious UX work as well.
Speaking of self-service, currently self-service user creation has been removed from the UI. It appears to be an inadvertent regression that came with the redesigned login page, but now that it's gone, it's been suggested that we didn't want it anyway -- that we _want_ explicit admin approval before a new account is enabled.
Only nit: I made sure that removing self-service from the login page was approved by Angus before ACKing the patch, so I think it was more intentional than an inadvertent regression.
Jan
_______________________________________________
aeolus-devel mailing list
aeolus-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/aeolus-devel
Both accounts are correct. Self-service account creation on the login page was inadvertently removed during the overhaul of the login page, and I then confirmed to Jan that the patch could/should be ack'ed despite the apparent regression.
All that serves to open up a conversation about what we actually want to design into the app.
I'm not at all convinced that there's really a credible use case where an organisation which deploys Conductor will want to enable *anyone* who can navigate to the login page to start creating deployments, consuming resources, and spending money.
In the corporate world, it is far more likely that organisations will require that users are correctly identified and associated with their cost centre/business unit etc. first, so that their usage can be correctly accounted for and charged back etc..
Away from corporate users, any organisation or group of users is going to want some level of control, since the person who is paying by the hour for running deployments is, almost by definition, not the self-service user who creates an account without reference to anyone else.
Even with self-service account creation enabled, we don't necessarily achieve the goal of reducing administrative overhead and letting users just use the app, since administrators will still be required to ensure that the roles associated with a specific user, and the pools to which they have access, are correct.
One advantage of allowing users to create their own accounts, though, is that it allows administrators to crowdsource what is otherwise a tedious, repetitive task. Even when users' credentials are contained in LDAP or AD, there's still per-user grouping, permissions, etc.
I wonder if we'd be better off providing admins with a CSV file format through which they could specify user parameters in a spreadsheet for bulk import.
Angus
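Angus's CSV idea could be as small as mapping rows to user attributes. A sketch, with made-up column names (`login`, `email`, `role`) and no real User model behind it:

```ruby
require 'csv'

# Hypothetical bulk-import sketch. The column names and the default
# role are illustrative, not Conductor's real schema; in the real app
# each returned hash would become a User record.
def import_users(csv_text)
  created = []
  CSV.parse(csv_text, headers: true) do |row|
    created << { login: row['login'],
                 email: row['email'],
                 role:  row['role'] || 'self-service' }
  end
  created
end
```

An admin could then export a spreadsheet to CSV and feed it to a rake task or admin screen wrapping `import_users`.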
On Tue, Jul 19, 2011 at 02:11:26PM -0400, Hugh Brock wrote:
Below are some obvious buckets, please feel free to suggest additional features large or small.
I'd really love it if "Documentation" were on this list. It's not really a task as much as an ongoing process that we could improve on.
I'm very specifically *not* requesting that Justin be asked to write documentation that's meticulously proofread, peer-reviewed, and QE-tested on multiple platforms for the main website. (Although that'd be a nice eventual goal.)
Instead, I really like the idea of throwing wiki pages together as we go. They'll surely start off mediocre, but that's infinitely better (in my mind, at least) than documentation that's yet to be written. As we go, we can fill in holes, clean up errors, and generally enhance it.
Whether the wiki is the permanent home or we eventually move it to the main website doesn't really matter to me. I just want it to exist.
As an example of how this can work fairly well, when we switched to using command-line tools for building and pushing, it took me the longest time to figure it out, and I kept having to ask for help. Eventually, I threw together a wiki page with what I had figured out, and asked someone to confirm that it didn't contain egregious errors. Inevitably, others had the same questions I did, and we were able to refer them to that page. And then, once it was being treated as documentation, others took an interest in making sure it was kept up-to-date. (In hindsight, I didn't even create the article on the right wiki[1], but this kind of goes to the point I'm trying to make -- getting something live today is better than waiting until you're sure you know which wiki you're supposed to use.)
It occurs to me that there's still an immense amount of knowledge about how everything works that you can really only get by asking people on IRC, and this is really unfortunate. I think if we tried to work towards throwing information on the wiki as the need arises, as opposed to waiting until the right time to write good documentation, we could end up getting a lot more information out there.
Is this something others agree should be a focus right now?
[1] The right wiki, for the record, is https://www.aeolusproject.org/redmine/projects/aeolus/wiki/ (which happens to be down right now...)
-- Matt
On Tue, Jul 19, 2011 at 05:44:28PM -0400, Matt Wagner wrote:
On Tue, Jul 19, 2011 at 02:11:26PM -0400, Hugh Brock wrote:
Below are some obvious buckets, please feel free to suggest additional features large or small.
I'd really love it if "Documentation" were on this list. It's not really a task as much as an ongoing process that we could improve on.
I'm very specifically *not* requesting that Justin be asked to write documentation that's meticulously proofread, peer-reviewed, and QE-tested on multiple platforms for the main website. (Although that'd be a nice eventual goal.)
Instead, I really like the idea of throwing wiki pages together as we go. They'll surely start off mediocre, but that's infinitely better (in my mind, at least) than documentation that's yet to be written. As we go, we can fill in holes, clean up errors, and generally enhance it.
Whether the wiki is the permanent home or we eventually move it to the main website doesn't really matter to me. I just want it to exist.
As an example of how this can work fairly well, when we switched to using command-line tools for building and pushing, it took me the longest time to figure it out, and I kept having to ask for help. Eventually, I threw together a wiki page with what I had figured out, and asked someone to confirm that it didn't contain egregious errors. Inevitably, others had the same questions I did, and we were able to refer them to that page. And then, once it was being treated as documentation, others took an interest in making sure it was kept up-to-date. (In hindsight, I didn't even create the article on the right wiki[1], but this kind of goes to the point I'm trying to make -- getting something live today is better than waiting until you're sure you know which wiki you're supposed to use.)
It occurs to me that there's still an immense amount of knowledge about how everything works that you can really only get by asking people on IRC, and this is really unfortunate. I think if we tried to work towards throwing information on the wiki as the need arises, as opposed to waiting until the right time to write good documentation, we could end up getting a lot more information out there.
Is this something others agree should be a focus right now?
[1] The right wiki, for the record, is https://www.aeolusproject.org/redmine/projects/aeolus/wiki/ (which happens to be down right now...)
-- Matt
ACK s|e
On 07/19/11 - 05:44:28PM, Matt Wagner wrote:
On Tue, Jul 19, 2011 at 02:11:26PM -0400, Hugh Brock wrote:
Below are some obvious buckets, please feel free to suggest additional features large or small.
I'd really love it if "Documentation" were on this list. It's not really a task as much as an ongoing process that we could improve on.
I'm very specifically *not* requesting that Justin be asked to write documentation that's meticulously proofread, peer-reviewed, and QE-tested on multiple platforms for the main website. (Although that'd be a nice eventual goal.)
Instead, I really like the idea of throwing wiki pages together as we go. They'll surely start off mediocre, but that's infinitely better (in my mind, at least) than documentation that's yet to be written. As we go, we can fill in holes, clean up errors, and generally enhance it.
Whether the wiki is the permanent home or we eventually move it to the main website doesn't really matter to me. I just want it to exist.
Absolutely. We should be doing this all of the time. Stuff that ends up being immensely useful can eventually get graduated to the main website, or man pages, or whatever, but this sort of thing is essential to making this useful to outside people.
On 07/19/2011 05:44 PM, Matt Wagner wrote:
On Tue, Jul 19, 2011 at 02:11:26PM -0400, Hugh Brock wrote:
Below are some obvious buckets, please feel free to suggest additional features large or small.
I'd really love it if "Documentation" were on this list. It's not really a task as much as an ongoing process that we could improve on.
I'm very specifically *not* requesting that Justin be asked to write documentation that's meticulously proofread, peer-reviewed, and QE-tested on multiple platforms for the main website. (Although that'd be a nice eventual goal.)
Instead, I really like the idea of throwing wiki pages together as we go. They'll surely start off mediocre, but that's infinitely better (in my mind, at least) than documentation that's yet to be written. As we go, we can fill in holes, clean up errors, and generally enhance it.
Whether the wiki is the permanent home or we eventually move it to the main website doesn't really matter to me. I just want it to exist.
As an example of how this can work fairly well, when we switched to using command-line tools for building and pushing, it took me the longest time to figure it out, and I kept having to ask for help. Eventually, I threw together a wiki page with what I had figured out, and asked someone to confirm that it didn't contain egregious errors. Inevitably, others had the same questions I did, and we were able to refer them to that page. And then, once it was being treated as documentation, others took an interest in making sure it was kept up-to-date. (In hindsight, I didn't even create the article on the right wiki[1], but this kind of goes to the point I'm trying to make -- getting something live today is better than waiting until you're sure you know which wiki you're supposed to use.)
It occurs to me that there's still an immense amount of knowledge about how everything works that you can really only get by asking people on IRC, and this is really unfortunate. I think if we tried to work towards throwing information on the wiki as the need arises, as opposed to waiting until the right time to write good documentation, we could end up getting a lot more information out there.
Is this something others agree should be a focus right now?
+1
There are a bunch of different ways that we can attack this, and my feeling is that the best answer may be a mixture of different approaches:
* how-to guides like the cli build/deploy guide you mentioned above
* interesting/useful/usable example deployable definitions
* anything that helps troubleshooting
[1] The right wiki, for the record, is https://www.aeolusproject.org/redmine/projects/aeolus/wiki/ (which happens to be down right now...)
-- Matt
On Tue, 2011-07-19 at 14:11 -0400, Hugh Brock wrote:
Hello all.
With release 0.3.0 about ready to ship it seems like a good time to start talking about features we'd like to see for 0.4.0. I'd like to continue the three-month release cycle we've been on, so that puts our next one around mid-October.
You know, looking at this list, I really wonder -- why not release earlier and more often?
- we don't have a massive number of sub-projects; we should be able to turn around a release very quickly, and the more often we do it, the smoother the process will be
- we're not intentionally breaking anything on a regular basis, so there shouldn't be any reason not to release more often
- shorter release cycles means the goals for each cycle won't be as far reaching and hand-wavey, instead it would be a much more specific set of tasks and features
Why not aim for e.g. every 3 weeks? Or perhaps every 2 weeks?
Below are some obvious buckets, please feel free to suggest additional features large or small.
Finally, note I'm not making any claim that the list below is achievable in the timeframe we're talking about (although I would hope it's not that far from what is achievable). I'm more thinking in terms of what would make our 0.4.0 release seem like a coherent whole, and make the largest number of upstream users interested and happy.
I'll start with Conductor features:
Authorization. We have a fair amount of authorization checking in place, but no way to actually set who can do what. Given that a central Conductor feature is the ability to control access to cloud resources, this seems like an important feature. Things we'll need to put this in place:
UX around setting permissions
UX around displaying appropriate "You can't do that" messages where required, or showing/hiding controls as appropriate
Good tests
Not much model code -- I think it's all mostly in place. Correct me if I'm wrong.
I'd characterize this all as "paying closer attention to the self-service UI".
Perhaps simply a 'create_self_service_user' rake task to go along with our 'create_admin_user' task would help an awful lot?
i.e. set up a self-service user by default for developers and encourage everyone to test the UI using both users.
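That rake task might look something like this -- a sketch only, with the namespace, task name, and user attributes all illustrative rather than Conductor's real ones:

```ruby
require 'rake'
extend Rake::DSL

# Hypothetical default attributes for a development self-service user;
# the real task would create an actual User record and grant the role.
def default_self_service_attrs(login = 'selfservice')
  { login: login, role: 'self-service', enabled: true }
end

# Sketch of a create_self_service_user task to mirror the existing
# create_admin_user task. Namespace and task names are illustrative.
namespace :dc do
  desc 'Create a default self-service user for development'
  task :create_self_service_user do
    attrs = default_self_service_attrs(ENV['LOGIN'] || 'selfservice')
    puts "Created #{attrs[:login]} with role #{attrs[:role]}"
  end
end
```

With something like this in place, `rake dc:create_self_service_user` would give every developer the second test account Marios describes.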
Identity and encryption. Authorization doesn't do a lot of good if anyone can bumble along and impersonate anyone else, so it would be pretty nice to have at least a workaday identity and encryption setup. Conversations with potential users have suggested the following minimum features, feel free to suggest your own:
Conductor will authenticate against an LDAP server. Since most LDAP servers in the real world are Windows Active Directory, we should probably include AD in the set of servers we test against.
Fall back to local user data store, maybe? You can imagine needing a local admin user that isn't in LDAP, for example
Be able to proxy identity when talking to other things that need to know it. Checking identity when saving things to/retrieving things from Image Warehouse is the main requirement for this. I think it's getting a GSSAPI library soon which should help. We will also probably need this for Katello, when we get to talking to it. FWIW Katello is currently using two-legged OAuth for this, so I would think this would be the primary candidate for us too.
A way to encrypt the traffic between Conductor, Deltacloud API, Warehouse, and Katello. The obvious solution for this is ssl certs that are created and signed by the installer, with some way to update/revoke them.
Well, there's a few things going on here:
1) Integrating with existing identity providers would be nice - the common example is LDAP. If you're using Aeolus within a corporate environment which has LDAP or AD, this would be desirable. But OpenID and OAuth etc. would be nice too.
(There's lots of questions around this - e.g. policy for self-service users, whether the admin user can be in the federated identity store etc.)
2) Authentication, authorization and permissions in iwhd
3) Authentication and authorization in imagefactory - e.g. you can't have an owner for an image in iwhd, unless imagefactory knows what user is building the image
4) Allowing deltacloud, iwhd and imagefactory to be deployed on different machines; it's only at this point you need to encrypt the communication to each
5) Considerations about what other projects like Katello need if they are going to build on (parts of?) Aeolus
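On point 1, the LDAP-first, local-fallback ordering could be sketched as below. The two authenticator lambdas are stand-ins: a real deployment would back the first with an actual LDAP bind (e.g. via a library such as net-ldap) and the second with the local user table:

```ruby
# Hypothetical sketch of LDAP-first authentication with a local
# fallback -- e.g. for an admin account that isn't in the directory.
# The ldap/local callables are illustrative stand-ins for real checks.
def authenticate(login, password, ldap:, local:)
  return :ldap  if ldap.call(login, password)   # try the directory first
  return :local if local.call(login, password)  # then the local store
  nil                                           # reject
end

# Stand-in authenticators for illustration only:
ldap_check  = ->(u, p) { u == 'jdoe'  && p == 'secret' }
local_check = ->(u, p) { u == 'admin' && p == 'letmein' }
```

The return value distinguishes where the identity came from, which matters later if Conductor needs to proxy that identity on to Warehouse or Katello.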
Admin UX work
We need to give the pool, pool family, and provider management screens the same loving treatment we have given the instance management screens.
We need to make sure self-service really is sane. A big part of self service is image visibility -- i.e. who can launch what where (VMWare's "Catalog" concept answers this requirement for them). A good self-service solution is going to take thinking through some use cases and some serious UX work as well.
I'd really like to see a front door to the Conductor app. I'm afraid to call it a "dashboard" because then it will never get built :). I'd love suggestions for what should appear on such a thing.
Other UX work
- I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
Absolutely.
The notion of managing single instances would be required for conductor to expose the deltacloud API too.
Status reporting
- We should reliably display the status of a running instance and its uptime
- We should start thinking about how we will handle the richer data about instance health that we will get once Matahari is in place
What kind of monitoring data are we talking about, specifically? Why are we assuming Matahari is the solution here?
- Users should be able to view an audit trail of events for an instance or a set of instances
- Users should be able to export those events
Are we simply talking about start/stop events?
API
- We've been saying for a very long time that we need a real API for managing Conductor and for doing instance stuff in Conductor. If we admit that we have to manage instances that are not part of deployments, then we can also just say that the Deltacloud API we expose only works for instances. I think this is good enough for the next release.
Right, a deltacloud API implementation in conductor for instances and images should be the first goal.
It's tempting to think that adding the admin API is a simpler task, but I think my summary showed that it's not as straightforward as it seems:
https://fedorahosted.org/pipermail/aeolus-devel/2011-July/002883.html
Infrastructure-around-Conductor features:
- Identity and encryption. In addition to the bits that go in Conductor proper, there's going to be a lot of work in the installer and in other projects nearby.
- Better self-monitoring. I'd like to see a quick shell command that will give a meaningful report of the status of all the app components.
- Way better logging and error reporting.
- All components should be using syslog if at all possible
Why syslog?
- Logs should be timestamped
- We should not be logging credentials or things that are potentially embarrassing
- Components can be distributed across multiple machines
RHEV-M 3.0 really works as a cloud provider.
"Orchestrator" features (even though these aren't yet separate components, I've bracketed off stuff that concerns post-boot and multi-instance operations as conceptually different topics to work on)
Assemblies
- Users can define assemblies that cause the post-boot config apparatus to install software and set config parameters on instances when they check in after booting
Deployables and deployments
- Users can define deployables that contain multiple assemblies.
- Users can specify parameters that should be collected from a user when the user launches the deployable.
- Users can direct that parameters collected from a user be interpolated in arbitrary spots in the deployable descriptor.
- There is a UI for collecting parameters from the launching user
- There is a mechanism for passing all the assembly and deployable config information through to the post-boot agent. (I think this could use user-data, *or* a config server.)
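If the user-data route were taken, one possible shape (an assumption, not a settled design) is to serialize the collected deployable/assembly parameters, base64-encode them, and let the post-boot agent reverse the process after fetching them from the metadata service:

```ruby
require 'json'
require 'base64'

# Illustrative launch parameters; the real descriptor format is XML and
# richer than this.
launch_params = {
  'deployable' => 'wordpress',
  'assembly'   => 'webserver',
  'params'     => { 'db_host' => '192.0.2.10', 'admin_email' => 'ops@example.com' }
}

# Launcher side: encode into the instance's user-data.
user_data = Base64.strict_encode64(JSON.generate(launch_params))

# Instance side: the post-boot agent decodes what the launcher encoded.
decoded = JSON.parse(Base64.strict_decode64(user_data))
puts decoded['params']['db_host']   # => 192.0.2.10
```

The size cap on user-data (16 KB on EC2) is one argument for the config-server alternative mentioned above.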
Authorization
- Should there be some way of restricting the assemblies/deployables that a user can launch on particular hardware?
Okay, some other things that occur to me:
- Move to Rails 3
- Reinstate searching in the UI, possibly using scoped_search
- At least one real-life example of templates and deployables in use
Cheers, Mark.
On Wed, Jul 20, 2011 at 12:20:43PM +0100, Mark McLoughlin wrote:
On Tue, 2011-07-19 at 14:11 -0400, Hugh Brock wrote:
Hello all.
With release 0.3.0 about ready to ship it seems like a good time to start talking about features we'd like to see for 0.4.0. I'd like to continue the three-month release cycle we've been on, so that puts our next one around mid-October.
You know, looking at this list, I really wonder - why not release earlier and more often?
- we don't have a massive amount of sub-projects; we should be able to turn around a release very quickly and the more often we do it, the smoother the process will be
- we're not intentionally breaking anything on a regular basis, so there shouldn't be any reason not to release more often
- shorter release cycles mean the goals for each cycle won't be as far-reaching and hand-wavy; instead it would be a much more specific set of tasks and features
Why not aim for e.g. every 3 weeks? Or perhaps every 2 weeks?
I have no problem with releasing more often, as long as we do that without incurring the overhead of a full QE cycle. If people will be happy with a minor/major type of release setup where we do frequent releases but only do a really solid release once every 3 months or so, I see no problem with that.
Below are some obvious buckets, please feel free to suggest additional features large or small.
Finally, note I'm not making any claim that the list below is achievable in the timeframe we're talking about (although I would hope it's not that far from what is achievable). I'm more thinking in terms of what would make our 0.4.0 release seem like a coherent whole, and make the largest number of upstream users interested and happy.
I'll start with Conductor features:
Authorization. We have a fair amount of authorization checking in place, but no way to actually set who can do what. Given that a central Conductor feature is the ability to control access to cloud resources, this seems like an important feature. Things we'll need to put this in place:
- UX around setting permissions
- UX around displaying appropriate "You can't do that" messages where required, or showing/hiding controls as appropriate
- Good tests
- Not much model code -- I think it's all mostly in place. Correct me if I'm wrong.
I'd characterize this all as "paying closer attention to the self-service UI".
Perhaps simply a 'create_self_service_user' rake task to go along with our 'create_admin_user' task would help an awful lot?
i.e. set up a self-service user by default for developers and encourage everyone to test the UI using both users.
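A sketch of Mark's suggestion, with the task and role names invented to parallel the existing create_admin_user task, and the real User model replaced by a plain hash so this runs standalone:

```ruby
require 'rake'

# Hypothetical rake task seeding a default self-service user so developers
# routinely exercise the UI from both permission levels.
extend Rake::DSL

USERS = {}   # stand-in for the real User model

namespace :dc do
  desc 'Seed a default self-service user for development'
  task :create_self_service_user do
    USERS['selfservice'] = { :password => 'changeme', :roles => [:self_service] }
  end
end

Rake::Task['dc:create_self_service_user'].invoke
puts USERS.keys.inspect   # => ["selfservice"]
```

The real task would of course go through the model layer and the same role-assignment code the admin task uses, so that the seeded user exercises the genuine permission checks.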
Identity and encryption. Authorization doesn't do a lot of good if anyone can bumble along and impersonate anyone else, so it would be pretty nice to have at least a workaday identity and encryption setup. Conversations with potential users have suggested the following minimum features, feel free to suggest your own:
- Conductor will authenticate against an LDAP server. Since most LDAP servers in the real world are Windows Active Directory, we should probably include AD in the set of servers we test against.
- Fall back to local user data store, maybe? You can imagine needing a local admin user that isn't in LDAP, for example
- Be able to proxy identity when talking to other things that need to know it. Checking identity when saving things to/retrieving things from Image Warehouse is the main requirement for this. I think it's getting a GSSAPI library soon which should help. We will also probably need this for Katello, when we get to talking to it. FWIW Katello is currently using two-legged OAuth for this, so I would think this would be the primary candidate for us too.
- A way to encrypt the traffic between Conductor, Deltacloud API, Warehouse, and Katello. The obvious solution for this is SSL certs that are created and signed by the installer, with some way to update/revoke them.
Well, there are a few things going on here:
Integrating with existing identity providers would be nice - the common example is LDAP. If you're using Aeolus within a corporate environment which has LDAP or AD, this would be desirable. But OpenID and OAuth etc. would be nice too.
(There's lots of questions around this - e.g. policy for self-service users, whether the admin user can be in the federated identity store etc.)
Authentication, authorization and permissions in iwhd
Authentication and authorization in imagefactory - e.g. you can't have an owner for an image in iwhd, unless imagefactory knows what user is building the image
Allowing deltacloud, iwhd and imagefactory to be deployed on different machines; it's only at this point you need to encrypt the communication to each
Considerations about what other projects like Katello need if they are going to build on (parts of?) Aeolus
OK, I'm willing to admit that's a better list than the one I made...
Admin UX work
- We need to give the pool, pool family, and provider management screens the same loving treatment we have given the instance management screens.
- We need to make sure self-service really is sane. A big part of self service is image visibility -- i.e. who can launch what where (VMware's "Catalog" concept answers this requirement for them). A good self-service solution is going to take thinking through some use cases and some serious UX work as well.
- I'd really like to see a front door to the Conductor app. I'm afraid to call it a "dashboard" because then it will never get built :). I'd love suggestions for what should appear on such a thing.
Other UX work
- I think we should be able to launch single images from Conductor without requiring a deployable XML. To make that easier for users, it would be nice if there was some UI for displaying images that are available to launch.
Absolutely.
The notion of managing single instances would be required for conductor to expose the deltacloud API too.
Status reporting
- We should reliably display the status of a running instance and its uptime
- We should start thinking about how we will handle the richer data about instance health that we will get once Matahari is in place
What kind of monitoring data are we talking about, specifically? Why are we assuming Matahari is the solution here?
Well, I think we're talking about the kind of instance health data you would get from virt-top and (possibly) virt-dmesg. Unfortunately in a cloud environment you can't get to the host and use those tools, so we have to settle for an in-instance agent, which we have been saying is going to be Matahari.
Having said all that, I don't think Conductor cares where the data comes from -- I think we really just need to start thinking about how we display it.
- Users should be able to view an audit trail of events for an instance or a set of instances
- Users should be able to export those events
Are we simply talking about start/stop events?
To begin with, yes -- but with better monitoring we'll have more of them.
API
- We've been saying for a very long time that we need a real API for managing Conductor and for doing instance stuff in Conductor. If we admit that we have to manage instances that are not part of deployments, then we can also just say that the Deltacloud API we expose only works for instances. I think this is good enough for the next release.
Right, a deltacloud API implementation in conductor for instances and images should be the first goal.
It's tempting to think that adding the admin API is a simpler task, but I think my summary showed that it's not as straightforward as it seems:
https://fedorahosted.org/pipermail/aeolus-devel/2011-July/002883.html
Yes.
Infrastructure-around-Conductor features:
- Identity and encryption. In addition to the bits that go in Conductor proper, there's going to be a lot of work in the installer and in other projects nearby.
- Better self-monitoring. I'd like to see a quick shell command that will give a meaningful report of the status of all the app components.
- Way better logging and error reporting.
- All components should be using syslog if at all possible
Why syslog?
My thinking here was that we should, to the extent we can, be using logging facilities we don't have to manage ourselves. I doubt syslog is appropriate for the Rails app, but I would think it would be appropriate for IWHD for example. I'm ultimately more interested in getting the logs managed, rotated properly, and put in a well-known location for support though.
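Whatever the transport ends up being (syslog or a managed file), the points about timestamps and credential scrubbing can live in one formatter. A sketch using stdlib Logger, with a StringIO standing in for the real destination; the password-matching regex is only an illustration of the scrubbing idea:

```ruby
require 'logger'
require 'stringio'
require 'time'

buf = StringIO.new                    # stand-in for a file or syslog sink
log = Logger.new(buf)

# One place to enforce "timestamped" and "no credentials" for every message.
log.formatter = proc do |severity, time, _progname, msg|
  scrubbed = msg.to_s.gsub(/password=\S+/, 'password=[FILTERED]')
  "#{time.utc.iso8601} #{severity} #{scrubbed}\n"
end

log.info 'login user=admin password=s3cret'
print buf.string
```

Rails can be pointed at any Logger-compatible object, so a shared formatter like this would cover the Conductor side even if the C components log via syslog directly.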
- Logs should be timestamped
- We should not be logging credentials or things that are potentially embarrassing
- Components can be distributed across multiple machines
RHEV-M 3.0 really works as a cloud provider.
"Orchestrator" features (even though these aren't yet separate components, I've bracketed off stuff that concerns post-boot and multi-instance operations as conceptually different topics to work on)
Assemblies
- Users can define assemblies that cause the post-boot config apparatus to install software and set config parameters on instances when they check in after booting
Deployables and deployments
- Users can define deployables that contain multiple assemblies.
- Users can specify parameters that should be collected from a user when the user launches the deployable.
- Users can direct that parameters collected from a user be interpolated in arbitrary spots in the deployable descriptor.
- There is a UI for collecting parameters from the launching user
- There is a mechanism for passing all the assembly and deployable config information through to the post-boot agent. (I think this could use user-data, *or* a config server.)
Authorization
- Should there be some way of restricting the assemblies/deployables that a user can launch on particular hardware?
Okay, some other things that occur to me:
- Move to Rails 3
- Reinstate searching in the UI, possibly using scoped_search
- At least one real-life example of templates and deployables in use
All good choices. I hope we'll have rails 3 sorted by the beginning of this iteration.
Thanks, I will incorporate your thoughts in the next revision of the doc.
--H
On 07/20/2011 08:51 AM, Hugh Brock wrote:
On Wed, Jul 20, 2011 at 12:20:43PM +0100, Mark McLoughlin wrote:
On Tue, 2011-07-19 at 14:11 -0400, Hugh Brock wrote:
Hello all.
With release 0.3.0 about ready to ship it seems like a good time to start talking about features we'd like to see for 0.4.0. I'd like to continue the three-month release cycle we've been on, so that puts our next one around mid-October.
You know, looking at this list, I really wonder - why not release earlier and more often?
- we don't have a massive amount of sub-projects; we should be able to turn around a release very quickly and the more often we do it, the smoother the process will be
- we're not intentionally breaking anything on a regular basis, so there shouldn't be any reason not to release more often
- shorter release cycles mean the goals for each cycle won't be as far-reaching and hand-wavy; instead it would be a much more specific set of tasks and features
Why not aim for e.g. every 3 weeks? Or perhaps every 2 weeks?
I have no problem with releasing more often, as long as we do that without incurring the overhead of a full QE cycle. If people will be happy with a minor/major type of release setup where we do frequent releases but only do a really solid release once every 3 months or so, I see no problem with that.
+1
That's one of the things that Chris was getting at with establishing testing vs 0.x.0 type repos so that the community has a fairly obvious way to differentiate between firmed up releases vs 'hot off the presses' type stuff.
One thing to consider is whether the publishing of rpms should fall within whatever period is decided on. I'd vote for 3 weeks simply because I know a few other projects are aligning on 3 weeks and I'd hate to cause more churn there. I'd also vote that we include the building/publishing of the rpms in the release cycle itself simply so that each chunk of time is self-contained.
<lots of good info snipped>
Mike
On 07/20/2011 08:51 AM, Hugh Brock wrote:
I have no problem with releasing more often, as long as we do that without incurring the overhead of a full QE cycle. If people will be happy with a minor/major type of release setup where we do frequent releases but only do a really solid release once every 3 months or so, I see no problem with that.
I've done this on other projects and it works very well: it gives QE incremental features and builds to write tests against, so there is less of a test automation/creation backlog at the final stable release.
+1 from my side.
Carl.
Why not aim for e.g. every 3 weeks? Or perhaps every 2 weeks?
I have no problem with releasing more often, as long as we do that without incurring the overhead of a full QE cycle. If people will be happy with a minor/major type of release setup where we do frequent releases but only do a really solid release once every 3 months or so, I see no problem with that.
Again, as a complete newcomer, I'd like to see shorter release cycles.
On 07/20/11 - 12:20:43PM, Mark McLoughlin wrote:
On Tue, 2011-07-19 at 14:11 -0400, Hugh Brock wrote:
Hello all.
With release 0.3.0 about ready to ship it seems like a good time to start talking about features we'd like to see for 0.4.0. I'd like to continue the three-month release cycle we've been on, so that puts our next one around mid-October.
You know, looking at this list, I really wonder - why not release earlier and more often?
- we don't have a massive amount of sub-projects; we should be able to turn around a release very quickly and the more often we do it, the smoother the process will be
- we're not intentionally breaking anything on a regular basis, so there shouldn't be any reason not to release more often
- shorter release cycles mean the goals for each cycle won't be as far-reaching and hand-wavy; instead it would be a much more specific set of tasks and features
Why not aim for e.g. every 3 weeks? Or perhaps every 2 weeks?
I've been advocating this for a while. The problem up until this point has been that doing a release is a large undertaking. There are so many dependencies, and enough sub-projects, that coordinating them at that rapid a pace is difficult.
The dependency situation is slowly sorting itself out. We have enough of the stuff in Fedora, and we are reducing dependencies at a good enough rate, that this is no longer the huge issue it once was. The sub-components are also becoming more stable (at least in their interfaces), so I think it would be a good idea to re-visit doing much more frequent releases.
On 07/19/2011 02:11 PM, Hugh Brock wrote:
Infrastructure-around-Conductor features:
Can we do something around import from Katello, i.e. some end-to-end case from a content source service?
To be specific, get the import of a Katello template into an image-factory definition and built as part of a deployable.
Carl.
On Wed, 2011-07-20 at 16:06 -0400, Carl Trieloff wrote:
On 07/19/2011 02:11 PM, Hugh Brock wrote:
Infrastructure-around-Conductor features:
Can we do something around import from Katello, i.e. some end-to-end case from a content source service?
To be specific, get the import of a Katello template into an image-factory definition and built as part of a deployable.
I see Aeolus as exposing an Image Factory API which Katello uses.
On the Aeolus side, I'm sure we'd be delighted to get feedback from the Katello project on the Image Factory API and template definition format.
Cheers, Mark.
On 07/21/2011 02:40 AM, Mark McLoughlin wrote:
On Wed, 2011-07-20 at 16:06 -0400, Carl Trieloff wrote:
On 07/19/2011 02:11 PM, Hugh Brock wrote:
Infrastructure-around-Conductor features:
Can we do something around import from Katello, i.e. some end-to-end case from a content source service?
To be specific, get the import of a Katello template into an image-factory definition and built as part of a deployable.
I see Aeolus as exposing an Image Factory API which Katello uses.
On the Aeolus side, I'm sure we'd be delighted to get feedback from the Katello project on the Image Factory API and template definition format.
Why would we just not call Image Factory directly?
-- bk
On Thu, 2011-07-21 at 10:19 -0400, Bryan Kearney wrote:
On 07/21/2011 02:40 AM, Mark McLoughlin wrote:
On Wed, 2011-07-20 at 16:06 -0400, Carl Trieloff wrote:
On 07/19/2011 02:11 PM, Hugh Brock wrote:
Infrastructure-around-Conductor features:
Can we do something around import from Katello, i.e. some end-to-end case from a content source service?
To be specific, get the import of a Katello template into an image-factory definition and built as part of a deployable.
I see Aeolus as exposing an Image Factory API which Katello uses.
On the Aeolus side, I'm sure we'd be delighted to get feedback from the Katello project on the Image Factory API and template definition format.
Why would we just not call Image Factory directly?
That's what I meant ... terminology confusion, perhaps
Aeolus == umbrella of projects
Image Factory and Conductor == two of those projects
Image Factory API == Image Factory's QMF interface
Cheers, Mark.
On 07/21/2011 10:22 AM, Mark McLoughlin wrote:
On Thu, 2011-07-21 at 10:19 -0400, Bryan Kearney wrote:
On 07/21/2011 02:40 AM, Mark McLoughlin wrote:
On Wed, 2011-07-20 at 16:06 -0400, Carl Trieloff wrote:
On 07/19/2011 02:11 PM, Hugh Brock wrote:
Infrastructure-around-Conductor features:
Can we do something around import from Katello, i.e. some end-to-end case from a content source service?
To be specific, get the import of a Katello template into an image-factory definition and built as part of a deployable.
I see Aeolus as exposing an Image Factory API which Katello uses.
On the Aeolus side, I'm sure we'd be delighted to get feedback from the Katello project on the Image Factory API and template definition format.
Why would we just not call Image Factory directly?
That's what I meant ... terminology confusion, perhaps
Aeolus == umbrella of projects
Image Factory and Conductor == two of those projects
Image Factory API == Image Factory's QMF interface
kk.. sorry.
-- bk
On Tue, Jul 19, 2011 at 02:11:26PM -0400, Hugh Brock wrote:
Infrastructure-around-Conductor features:
<snip>
- Better self-monitoring. I'd like to see a quick shell command that will give a meaningful report of the status of all the app components.
There's a crude script I checked in a while ago, util/check_services, that does exactly this. It iterates over the relevant services in /etc/init.d/ and calls 'status' on them.
Really, I wrote it just because I got sick of trying to guess the failed service, not because I wanted a generally useful tool that was elegantly written, so there's surely room for improvement. (Plus, it's not in the RPM spec, so it never gets installed for end-users.) But it may be helpful to some nonetheless.
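For anyone who wants the same idea without shell, a rough Ruby equivalent of what util/check_services does; the service list here is an assumption, not the real one from the script:

```ruby
# Iterate over the app's init services and report whether each is running.
SERVICES = %w[aeolus-conductor conductor-dbomatic deltacloud-core iwhd imagefactory]

def service_status(name)
  # `service <name> status` exits 0 when the init script reports running;
  # any other exit status (or a missing service) is treated as stopped.
  system("service #{name} status >/dev/null 2>&1") ? 'running' : 'STOPPED'
end

SERVICES.each { |s| puts format('%-22s %s', s, service_status(s)) }
```

Packaging something like this in the RPM, as suggested above, would make it the obvious first step in any support triage.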
-- Matt