On 06/28/2011 10:46 AM, Joseph VLcek wrote:
On Tue, 2011-06-28 at 07:26 -0700, Steven Dake wrote:
On 06/28/2011 06:24 AM, Joseph VLcek wrote:
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:
On 06/24/2011 04:16 PM, Steven Dake wrote:
Currently most Linux distributions that use dbus store a UUID in /var/lib/dbus/machine-id. In the test tools for our pacemaker-cloud work, we must manipulate this file during oz image creation so that it matches a value we know about.
Q1. Is this file freshly created on each image creation/cloning process?
If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).
Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?
In pacemaker-cloud we need to have a mapping from image->internal id so that we know which VM maps to which deployable HA configuration.
If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.
Regards -steve
After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see the thread with subject "How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id").
Currently the Audrey script runs as a replacement for rc.local. Matahari runs at S99. The general idea is for the Audrey script to run at S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (Audrey has access to this information).
Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine-id would be read.
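For illustration, the lookup order described above could be sketched in shell roughly as follows. This is only a sketch under stated assumptions: read_machine_id is a hypothetical helper name, not existing Matahari code, and the two variables simply default to the paths discussed in this thread (the dbus file is named machine-id, with a hyphen, on the distributions I am aware of).

```shell
#!/bin/sh
# Sketch only: read_machine_id is a hypothetical helper, not Matahari code.
# VM_ID_FILE is the file Audrey would write; DBUS_ID_FILE is the dbus fallback.
VM_ID_FILE="${VM_ID_FILE:-/etc/vm_machine_id}"
DBUS_ID_FILE="${DBUS_ID_FILE:-/var/lib/dbus/machine-id}"

read_machine_id() {
    # Prefer the Audrey-written instance id, then fall back to the dbus id.
    for f in "$VM_ID_FILE" "$DBUS_ID_FILE"; do
        # -s: the file exists and is not empty
        if [ -s "$f" ]; then
            head -n 1 "$f"
            return 0
        fi
    done
    echo "no machine id found" >&2
    return 1
}
```

Calling read_machine_id prints whichever id is found first and returns non-zero when neither file is usable, so a caller can decide how to handle a host that has no id at all.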
This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).
Another option is for the current rc.local script that Audrey replaces to run the Matahari service-starting giblets as its first action.
Comments welcome before I start writing code... -steve
I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.
It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.
Let me explain.
Audrey does not replace /etc/rc.local.
When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.
e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey
I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.
e.g.:
[ -f /usr/bin/audrey ] && /usr/bin/audrey
[ -f <Matahari start> ] && <run Matahari start>
We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file, and have it return an error status if it is unable to do so.
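One possible way to express that sequencing, as a hedged sketch rather than what Image Factory actually appends: run Audrey, check both its exit status and that the id file is non-empty, and only then start Matahari. The function name start_audrey_then_matahari and the variable names are made up for illustration. MATAHARI_START_CMD is a placeholder because the real start command is left open above, and VM_ID_FILE assumes the /etc/vm_machine_id file proposed earlier in the thread.

```shell
#!/bin/sh
# Sketch only: names below are illustrative, not part of Audrey or Matahari.
AUDREY_CMD="${AUDREY_CMD:-/usr/bin/audrey}"
# Placeholder: the thread does not specify the Matahari start command.
MATAHARI_START_CMD="${MATAHARI_START_CMD:-true}"
VM_ID_FILE="${VM_ID_FILE:-/etc/vm_machine_id}"

start_audrey_then_matahari() {
    if [ -x "$AUDREY_CMD" ]; then
        # Audrey must exit successfully ...
        if ! "$AUDREY_CMD"; then
            echo "audrey failed; not starting Matahari" >&2
            return 1
        fi
        # ... and must have left a non-empty id file behind.
        if [ ! -s "$VM_ID_FILE" ]; then
            echo "audrey did not write $VM_ID_FILE" >&2
            return 1
        fi
    fi
    # Reached when Audrey succeeded, or when Audrey is not installed at all.
    # Intentionally unquoted so a command with arguments is split into words.
    $MATAHARI_START_CMD
}
```

In this sketch a missing Audrey binary does not block Matahari, which mirrors the independent [ -f ... ] && guards in the two rc.local lines; whether that is the desired behavior is an open design choice.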
Thoughts?
This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather than using a proper init script).
One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.
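Since each agent has its own init script, the rc.local approach would presumably need to loop over them. A minimal sketch, assuming the agents are started with the usual service NAME start form: start_agents is a hypothetical helper, the agent names are passed in by the caller because they are not listed in this thread, and SERVICE_CMD exists only so the command can be swapped out.

```shell
#!/bin/sh
# Sketch only: start_agents is a hypothetical helper; agent names are
# supplied by the caller because the thread does not list them.
SERVICE_CMD="${SERVICE_CMD:-service}"

start_agents() {
    failed=0
    for agent in "$@"; do
        # Equivalent to running: service <agent> start
        if ! "$SERVICE_CMD" "$agent" start; then
            echo "failed to start $agent" >&2
            failed=1
        fi
    done
    # Non-zero if any agent failed, but every agent is still attempted.
    return "$failed"
}
```

It would be called with the real init script names as arguments. Attempting every agent even after one fails is a choice made for this sketch; stopping at the first failure would be equally reasonable.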
Regards -steve
Joe
Yes. I agree the rc.local modification could be viewed as a bit hacky; however, I don't think unacceptably so. If others disagree we could explore Chris's suggestion of altering oz to insert init scripts.
On principle I say it's hacky and there is a better solution.
However, the pragmatic part of me says, "just make it work", and I think getting something working is more important than architectural purity.
So let's go with the solution we know works _now_ (rc.local starting audrey and matahari in sequence). Later when time permits and we have a fully functional system and the time to refine things, we can always fix it to be better.
Cheers,
Perry