Re: Release 0.4.0 planning

20 Jul 2011

On Wed, Jul 20, 2011 at 12:20:43PM +0100, Mark McLoughlin wrote:
...
On Tue, 2011-07-19 at 14:11 -0400, Hugh Brock wrote:
...
Hello all.
With release 0.3.0 about ready to ship it seems like a good time to
start talking about features we'd like to see for 0.4.0. I'd like to
continue the three-month release cycle we've been on, so that puts our
next one around mid-October.
You know, looking at this list, I really wonder - why not release
earlier and oftener?

we don't have a massive amount of sub-projects; we should be able 
to turn around a release very quickly and the more often we do it, 
the smoother the process will be

we're not intentionally breaking anything on a regular basis, so 
there shouldn't be any reason not to release more often

shorter release cycles mean the goals for each cycle won't be as
far-reaching and hand-wavy; instead each cycle would have a much more
specific set of tasks and features

Why not aim for e.g. every 3 weeks? Or perhaps every 2 weeks?
I have no problem with releasing more often, as long as we do that
without incurring the overhead of a full QE cycle. If people will be
happy with a minor/major type of release setup where we do frequent
releases but only do a really solid release once every 3 months or so,
I see no problem with that.
...
...
Below are some obvious buckets, please feel free to suggest additional
features large or small.
Finally, note I'm not making any claim that the list below is
achievable in the timeframe we're talking about (although I would hope
it's not that far from what is achievable). I'm more thinking in terms
of what would make our 0.4.0 release seem like a coherent whole, and
make the largest number of upstream users interested and happy.
I'll start with Conductor features:

Authorization. We have a fair amount of authorization checking in
place, but no way to actually set who can do what. Given that a
central Conductor feature is the ability to control access to cloud
resources, this seems like an important feature. Things we'll need
to put this in place:

UX around setting permissions

UX around displaying appropriate "You can't do that" messages
where required, or showing/hiding controls as appropriate

Good tests

Not much model code -- I think it's all mostly in place. Correct
me if I'm wrong.

I'd characterize this all as "paying closer attention to the
self-service UI".
Perhaps simply a 'create_self_service_user' rake task to go along with
our 'create_admin_user' task would help an awful lot?
i.e. set up a self-service user by default for developers and encourage
everyone to test the UI using both users.
...

Identity and encryption. Authorization doesn't do a lot of good if
anyone can bumble along and impersonate anyone else, so it would be
pretty nice to have at least a workaday identity and encryption
setup. Conversations with potential users have suggested the
following minimum features, feel free to suggest your own:

Conductor will authenticate against an LDAP server. Since most
LDAP servers in the real world are Windows Active Directory, we
should probably include AD in the set of servers we test against.

Fall back to local user data store, maybe? You can imagine needing
a local admin user that isn't in LDAP, for example

Be able to proxy identity when talking to other things that need
to know it. Checking identity when saving things to/retrieving
things from Image Warehouse is the main requirement for this. I
think it's getting a GSSAPI library soon which should help. We
will also probably need this for Katello, when we get to talking
to it. FWIW Katello is currently using two-legged OAuth for this,
so I would think this would be the primary candidate for us too.

A way to encrypt the traffic between Conductor, Deltacloud API,
Warehouse, and Katello. The obvious solution for this is SSL certs
that are created and signed by the installer, with some way to
update/revoke them.

Well, there are a few things going on here:

Integrating with existing identity providers would be nice - the 
common example is LDAP. If you're using Aeolus within a corporate 
environment which has LDAP or AD, this would be desirable. But 
OpenID and OAuth etc. would be nice too.
(There are lots of questions around this - e.g. policy for 
self-service users, whether the admin user can be in the federated 
identity store etc.)

Authentication, authorization and permissions in iwhd

Authentication and authorization in imagefactory - e.g. you can't 
have an owner for an image in iwhd, unless imagefactory knows what 
user is building the image

Allowing deltacloud, iwhd and imagefactory to be deployed on
different machines; it's only at this point you need to encrypt
the communication to each

Considerations about what other projects like Katello need if they 
are going to build on (parts of?) Aeolus

OK, I'm willing to admit that's a better list than the one I made...
...
...

Admin UX work

We need to give the pool, pool family, and provider management
screens the same loving treatment we have given the instance
management screens.

We need to make sure self-service really is sane. A big part of
self service is image visibility -- i.e. who can launch what where
(VMware's "Catalog" concept answers this requirement for them). A
good self-service solution is going to take thinking through some
use cases and some serious UX work as well.

I'd really like to see a front door to the Conductor app. I'm
afraid to call it a "dashboard" because then it will never get
built :). I'd love suggestions for what should appear on such a
thing.

Other UX work

I think we should be able to launch single images from Conductor
without requiring a deployable XML. To make that easier for users,
it would be nice if there was some UI for displaying images that
are available to launch.

Absolutely.
The notion of managing single instances would be required for Conductor
to expose the Deltacloud API too.
...

Status reporting

We should reliably display the status of a running instance and
its uptime

We should start thinking about how we will handle the richer data
about instance health that we will get once Matahari is in place

What kind of monitoring data are we talking about, specifically? Why are
we assuming Matahari is the solution here?
Well, I think we're talking about the kind of instance health data you
would get from virt-top and (possibly) virt-dmesg. Unfortunately in a
cloud environment you can't get to the host and use those tools, so we
have to settle for an in-instance agent, which we have been saying is
going to be Matahari.
Having said all that, I don't think Conductor cares where the data
comes from -- I think we really just need to start thinking about how
we display it.
...
...

Users should be able to view an audit trail of events for an
instance or a set of instances

Users should be able to export those events

Are we simply talking about start/stop events?
To begin with, yes -- but with better monitoring we'll have more of them.
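To make the audit-trail idea concrete, here is a minimal Python sketch of exporting start/stop events for one instance or a set of instances as CSV. Everything in it is an assumption for illustration: the InstanceEvent record, its field names, the export_events helper, and the choice of CSV are hypothetical and not part of Conductor.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class InstanceEvent:
    """Hypothetical audit record; the field names are assumptions."""
    instance_id: str
    kind: str        # e.g. "start" or "stop"
    at: datetime


def export_events(events, instance_ids=None):
    """Return events as CSV text, oldest first.

    If instance_ids is given, only events for those instances are exported.
    """
    wanted = set(instance_ids) if instance_ids is not None else None
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["instance_id", "event", "timestamp"])
    for ev in sorted(events, key=lambda e: e.at):
        if wanted is None or ev.instance_id in wanted:
            writer.writerow([ev.instance_id, ev.kind, ev.at.isoformat()])
    return buf.getvalue()
```

Keeping the event kind as a free-form string means richer monitoring events could be added later without changing the export format.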
...
...

API

We've been saying for a very long time that we need a real API for
managing Conductor and for doing instance stuff in Conductor. If
we admit that we have to manage instances that are not part of
deployments, then we can also just say that the Deltacloud API we
expose only works for instances. I think this is good enough for
the next release.

Right, a Deltacloud API implementation in Conductor for instances and
images should be the first goal.
It's tempting to think that adding the admin API is a simpler task, but
I think my summary showed that it's not as straightforward as it seems:
https://fedorahosted.org/pipermail/aeolus-devel/2011-July/002883.html
Yes.
...
...
Infrastructure-around-Conductor features:

Identity and encryption. In addition to the bits that go in
Conductor proper, there's going to be a lot of work in the installer
and in other projects nearby.

Better self-monitoring. I'd like to see a quick shell command that
will give a meaningful report of the status of all the app
components.

Way better logging and error reporting.

All components should be using syslog if at all possible

Why syslog?
My thinking here was that we should, to the extent we can, be using
logging facilities we don't have to manage ourselves. I doubt syslog
is appropriate for the Rails app, but I would think it would be
appropriate for IWHD for example. I'm ultimately more interested in
getting the logs managed, rotated properly, and put in a well known
location for support though.
...
...

Logs should be timestamped

We should not be logging credentials or things that are
potentially embarrassing
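As a rough illustration of the logging points above (timestamps on every line, no credentials in the output), here is a minimal Python sketch using the standard logging module. The RedactingFilter class, the build_logger helper, and the list of key names treated as secret are assumptions made for this example; none of the Aeolus components is known to work this way.

```python
import io
import logging
import re

# Assumption: secrets show up as key=value pairs with one of these key names.
_SECRET = re.compile(r"(?i)\b(password|secret|token)=\S+")


class RedactingFilter(logging.Filter):
    """Mask the value of secret-looking key=value pairs before emission."""

    def filter(self, record):
        # Format the message first so secrets passed as arguments are caught too.
        message = record.getMessage()
        record.msg = _SECRET.sub(lambda m: m.group(1) + "=[REDACTED]", message)
        record.args = ()
        return True


def build_logger(name, stream):
    """Return a logger that writes timestamped, redacted lines to stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    handler.addFilter(RedactingFilter())
    logger.addHandler(handler)
    return logger
```

For components where syslog is appropriate, the StreamHandler could be swapped for logging.handlers.SysLogHandler from the standard library, which leaves rotation and log location to the system.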

Components can be distributed across multiple machines

RHEV-M 3.0 really works as a cloud provider.

"Orchestrator" features (even though these aren't yet separate
components, I've bracketed off stuff that concerns post-boot and
multi-instance operations as conceptually different topics to work on)

Assemblies

Users can define assemblies that cause the post-boot config
apparatus to install software and set config parameters on
instances when they check in after booting

Deployables and deployments

Users can define deployables that contain multiple assemblies.

Users can specify parameters that should be collected from a user
when the user launches the deployable.

Users can direct that parameters collected from a user be
interpolated in arbitrary spots in the deployable descriptor.

There is a UI for collecting parameters from the launching user

There is a mechanism for passing all the assembly and deployable
config information through to the post-boot agent. (I think this
could use user-data, *or* a config server.)
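The parameter interpolation described above could look roughly like the following Python sketch. The descriptor layout, the ${name} placeholder syntax, and the render_descriptor helper are all hypothetical; the real deployable descriptor format is not specified in this thread.

```python
from string import Template
from xml.sax.saxutils import escape

# Hypothetical descriptor: element names and ${...} placeholders are assumptions.
DESCRIPTOR = """<deployable name="${app_name}">
  <assembly name="web" port="${http_port}"/>
</deployable>"""


def render_descriptor(descriptor, params):
    """Fill launch-time parameters into the descriptor.

    Values are XML-escaped so user input cannot break the markup, and a
    missing parameter raises ValueError instead of leaving a placeholder.
    """
    safe = {k: escape(str(v), {'"': "&quot;"}) for k, v in params.items()}
    try:
        return Template(descriptor).substitute(safe)
    except KeyError as missing:
        raise ValueError(f"missing launch parameter: {missing.args[0]}") from None
```

Failing loudly on a missing parameter is a deliberate choice here: the UI that collects parameters from the launching user can then report exactly which value is still needed.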

Authorization

Should there be some way of restricting the assemblies/deployables
that a user can launch on particular hardware?

Okay, some other things that occur to me:

Move to Rails 3

Re-instate searching in the UI, possibly using scoped_search

At least one single real life example of templates and deployables 
in use

All good choices. I hope we'll have Rails 3 sorted by the beginning of
this iteration.
Thanks, I will incorporate your thoughts in the next revision of the
doc.
--H
-- 
== Hugh Brock, hbrock@redhat.com                                   ==
== Engineering Manager, Cloud BU                                   ==
== Aeolus Project: Manage virtual infrastructure across clouds.    ==
== http://aeolusproject.org                                        ==

"I know that you believe you understand what you think I said, but I’m
not sure you realize that what you heard is not what I meant."
--Robert McCloskey
