On 06/29/2011 09:32 PM, Andrew Beekhof wrote:
On Wed, Jun 29, 2011 at 2:05 AM, Perry Myers pmyers@redhat.com wrote:
On 06/28/2011 10:26 AM, Steven Dake wrote:
On 06/28/2011 06:24 AM, Joseph VLcek wrote:
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:
On 06/24/2011 04:16 PM, Steven Dake wrote:
Currently most Linux distributions that use D-Bus store a UUID in /var/lib/dbus/machine-id. In the test tools for our pacemaker-cloud work, we must manipulate this file via oz image creation to match a value we know about.
Q1. Is this file freshly created on each image creation/cloning process?
If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).
Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?
In pacemaker-cloud we need to have a mapping from image->internal id so that we know which VM maps to which deployable HA configuration.
If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.
Regards -steve
After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see the thread with subject "How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id").
Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for the Audrey script to run at S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (Audrey has access to this information).
Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine-id would be read.
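The lookup order described above could be sketched in shell as follows. This is only an illustration under stated assumptions: read_machine_id is a hypothetical helper, not Matahari code, and the two environment variables exist only so the paths can be overridden.

```shell
#!/bin/sh
# Hypothetical sketch, not Matahari code.
# Paths default to the ones discussed in the thread; both can be overridden.
VM_ID_FILE="${VM_ID_FILE:-/etc/vm_machine_id}"
DBUS_ID_FILE="${DBUS_ID_FILE:-/var/lib/dbus/machine-id}"

# read_machine_id: print the ID from the first file that is readable and
# non-empty, preferring the instance ID over the D-Bus ID.
# Returns non-zero if neither file provides an ID.
read_machine_id() {
    for f in "$VM_ID_FILE" "$DBUS_ID_FILE"; do
        if [ -r "$f" ] && [ -s "$f" ]; then
            # Drop the trailing newline and any stray whitespace.
            tr -d '[:space:]' < "$f"
            echo
            return 0
        fi
    done
    return 1
}
```

Treating an empty file the same as a missing one is a design choice in this sketch, so that a half-initialised /etc/vm_machine_id never yields an empty ID.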
This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).
Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblets as its first action.
Comments welcome before I start writing code... -steve
I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.
It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.
Let me explain.
Audrey does not replace /etc/rc.local.
When Image Factory builds the image, it appends a line of code to the end of /etc/rc.local that starts Audrey.
e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey
I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.
e.g.:
[ -f /usr/bin/audrey ] && /usr/bin/audrey
[ -f <Matahari start> ] && <run Matahari start>
We may need to manage timing: /usr/bin/audrey should not return until it has stored the unique UUID in a file, and it should return an error status if it is unable to.
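A minimal sketch of that "write, then report success or failure" step might look like this. It assumes the file path from earlier in the thread; store_instance_id is a hypothetical name, and the real audrey tool may work quite differently.

```shell
#!/bin/sh
# Hypothetical sketch of the write step; not taken from audrey itself.
# store_instance_id UUID [FILE]
#   Writes UUID to FILE (default /etc/vm_machine_id) and returns 0 only
#   once the file is in place; returns non-zero on any failure.
store_instance_id() {
    uuid="$1"
    target="${2:-/etc/vm_machine_id}"
    # Refuse to store an empty ID.
    [ -n "$uuid" ] || return 1
    tmp="$target.tmp.$$"
    # Write to a temporary file first so a reader never sees a partial ID.
    printf '%s\n' "$uuid" > "$tmp" || return 1
    # A rename within one directory replaces the target in a single step.
    mv -f "$tmp" "$target" || { rm -f "$tmp"; return 1; }
}
```

With an exit status like this, the rc.local line that starts Matahari could be chained behind the audrey line with && so that Matahari only starts once the ID file exists.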
Thoughts?
This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather than using a proper init script). One minor issue is that matahari expects to be started via service xxx start, and each agent has a separate init script. There are 5 or 6 agents.
Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...
Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.
Well, why not let matahari start at its normal runlevel and respond to queries, etc., but if you call the get-id API, it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'
Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.
So remove the dep on service start and replace with intelligent application usage
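The "intelligent application usage" idea could be sketched as a caller that polls until the returned ID comes from the instance file. Everything here is an assumption for illustration: get_id stands in for the get-id call, its "source id" reply format is invented, and the file paths are only defaults.

```shell
#!/bin/sh
# Hypothetical sketch: neither function is a real Matahari API.
INSTANCE_ID_FILE="${INSTANCE_ID_FILE:-/etc/machine-id}"
DBUS_ID_FILE="${DBUS_ID_FILE:-/var/lib/dbus/machine-id}"

# get_id: print "instance ID" if the instance file is set, otherwise
# "dbus ID" as a fallback. The reply format is an assumption.
get_id() {
    if [ -s "$INSTANCE_ID_FILE" ]; then
        printf 'instance %s\n' "$(tr -d '[:space:]' < "$INSTANCE_ID_FILE")"
    elif [ -s "$DBUS_ID_FILE" ]; then
        printf 'dbus %s\n' "$(tr -d '[:space:]' < "$DBUS_ID_FILE")"
    else
        return 1
    fi
}

# wait_for_instance_id [TRIES] [DELAY_SECONDS]
#   Polls get_id until the reply is an instance ID, then prints the ID.
#   Returns non-zero if the instance ID never appears.
wait_for_instance_id() {
    tries="${1:-30}"
    delay="${2:-1}"
    while [ "$tries" -gt 0 ]; do
        reply=$(get_id) || reply=""
        case "$reply" in
            "instance "*)
                printf '%s\n' "${reply#instance }"
                return 0
                ;;
        esac
        tries=$((tries - 1))
        # Sleep only if another attempt will follow.
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}
```

The cost of this approach is visible in the sketch: every consumer of the ID has to carry its own retry loop and decide how long to wait.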
Thoughts?
TBH, I'm not a fan.
Partly because that UUID is a property and attribute of every agent and is initialised at startup - I don't like the idea of it being volatile. It also pre-supposes that you'd never want the dbus id if /etc/machine-id is present.
We talked in the past about providing access to both a hardware _and_ a software uuid.
A software id representing the unique vm instance is what we need. The key feature of the software id is that it needs to be dynamically changeable (i.e., Audrey needs to change it at start time to something unique it knows about, based upon a data transfer which we don't yet have a clear handle on how to make happen).
Which is /etc/machine-id supposed to be? If the former, then the solution is easy - add an extra API call.
systemd is moving the location of the "hardware id" from /var/lib/dbus/machine-id to /etc/machine-id.
In a cloud environment, if you upload a created image to something like EC2, each time you start a new instance with that image, the machine-id doesn't change. The problem is launching two vms - there is no way to correlate which one is the one we care about - essentially the machine-id would be the same for every instance of that image. If matahari gives out a machine id for two vms, surely it would confuse some piece of software (pacemaker-cloud for example).
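To make the collision concrete, here is a small sketch that takes a list of "VM name, machine id" pairs and reports any id shared by more than one VM. The function name and the input format are assumptions for illustration, not part of Matahari or pacemaker-cloud.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.
# find_duplicate_ids: read "NAME ID" lines on stdin and print one line
# per ID reported by more than one VM, in the form "ID: NAME NAME ...".
find_duplicate_ids() {
    awk '{ count[$2]++; names[$2] = names[$2] " " $1 }
         END { for (id in count) if (count[id] > 1) print id ":" names[id] }'
}
```

Fed the ids collected from two instances launched from the same image, this would print a single line naming both VMs, which is exactly the ambiguity described above.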
aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel