/var/lib/dbus/machine_id, imagefactory, and matahari

List overview All Threads
Download

newer

older

Re: [PATCH Oz] Add some text to...

Announcing Pacemaker Cloud 0.4.1 -...

Steven Dake

25 Jun 2011 25 Jun '11

12:16 a.m.

Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

Show replies by date

Steven Dake

27 Jun 27 Jun

9:59 p.m.

On 06/24/2011 04:16 PM, Steven Dake wrote:

...

Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

...

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

apevec＠redhat.com

11:36 p.m.

New subject: [Matahari] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/27/2011 10:59 PM, Steven Dake wrote:

...

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to

Relying on start priority levels seems fragile, esp. with systemd which "provides aggressive parallelization"

...

the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

To express such dependency you could use LSB initscript header Required-Start: in matahari initscript (which one, matahari-host I guess?) but such dependency in Matahari doesn't seem right. In systemd unit file you could specify Before and After, but that's only for Fedora >= 15.

...

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

That actually might be the most reliable and portable way.

Alan

Perry Myers

28 Jun 28 Jun

2:14 p.m.

New subject: [Matahari] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/27/2011 06:36 PM, Alan Pevec wrote:

...

On 06/27/2011 10:59 PM, Steven Dake wrote:

...
Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to

Relying on start priority levels seems fragile, esp. with systemd which "provides aggressive parallelization"

...
the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

To express such dependency you could use LSB initscript header Required-Start: in matahari initscript (which one, matahari-host I guess?) but such dependency in Matahari doesn't seem right. In systemd unit file you could specify Before and After, but that's only for Fedora >= 15.

...
Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

That actually might be the most reliable and portable way.

Just one thing to keep in mind... Matahari has to be runnable in environments where Audrey doesn't exist (i.e. outside of the Cloud, maybe on bare metal hosts, etc)

So don't introduce a hard dependency on Audrey. It has to be 'if Audrey is present, matahari needs to run after Audrey. Otherwise, if Audrey is not present, matahari runs at runlevel foo'

Also... I hate bunching up services towards the end of the list (98, 99, etc). Is there any reason why these aren't slightly lower in the runlevel numbering so that we aren't almost falling off the end?

Perry

Steven Dake

3:21 p.m.

New subject: [Matahari] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 06:14 AM, Perry Myers wrote:

...

On 06/27/2011 06:36 PM, Alan Pevec wrote:

...
On 06/27/2011 10:59 PM, Steven Dake wrote:

...
Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to

Relying on start priority levels seems fragile, esp. with systemd which "provides aggressive parallelization"

...
the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

To express such dependency you could use LSB initscript header Required-Start: in matahari initscript (which one, matahari-host I guess?) but such dependency in Matahari doesn't seem right. In systemd unit file you could specify Before and After, but that's only for Fedora >= 15.

...
Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

That actually might be the most reliable and portable way.

Just one thing to keep in mind... Matahari has to be runnable in environments where Audrey doesn't exist (i.e. outside of the Cloud, maybe on bare metal hosts, etc)

So don't introduce a hard dependency on Audrey. It has to be 'if Audrey is present, matahari needs to run after Audrey. Otherwise, if Audrey is not present, matahari runs at runlevel foo'

I agree we don't want a hard dep on audrey. This is why reading of the vm machine id would happen first , if that fails, the dbus machine id would be read. As far as deps, any kind of runlevel dep on audrey is not gong to work.

...

Also... I hate bunching up services towards the end of the list (98, 99, etc). Is there any reason why these aren't slightly lower in the runlevel numbering so that we aren't almost falling off the end?

I haven't made this decision - this is what is in upstream atm.

...

Perry _______________________________________________ aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Perry Myers

5:02 p.m.

New subject: [Pcmk-cloud] [Matahari] /var/lib/dbus/machine_id, imagefactory, and matahari

...

...
Also... I hate bunching up services towards the end of the list (98, 99, etc). Is there any reason why these aren't slightly lower in the runlevel numbering so that we aren't almost falling off the end?

I haven't made this decision - this is what is in upstream atm.

I know. That question was aimed more at the matahari team, since they're on cross-post here :)

Joseph VLcek

2:24 p.m.

On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...

On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

Joe

Steven Dake

3:26 p.m.

On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...

On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Regards -steve

...

Joe

Joseph VLcek

3:46 p.m.

On Tue, 2011-06-28 at 07:26 -0700, Steven Dake wrote:

...

On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script).

...

One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Regards -steve

...
Joe

Yes. I agree the rc.local modification could be viewed as a bit hacky however I don't think unacceptably so. If others disagree we could explore Chris's suggestion of altering oz to insert init scripts.

Joe

Perry Myers

29 Jun 29 Jun

6:53 p.m.

New subject: [Matahari] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 10:46 AM, Joseph VLcek wrote:

...

On Tue, 2011-06-28 at 07:26 -0700, Steven Dake wrote:

...
On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script).

...
One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Regards -steve

...
Joe

Yes. I agree the rc.local modification could be viewed as a bit hacky however I don't think unacceptably so. If others disagree we could explore Chris's suggestion of altering oz to insert init scripts.

On principle I say it's hacky and there is a better solution.

However, the pragmatic part of me says, "just make it work", and I think getting something working is more important than architectural purity.

So let's go with the solution we know works _now_ (rc.local starting audrey and matahari in sequence). Later when time permits and we have a fully functional system and the time to refine things, we can always fix it to be better.

Cheers,

Perry

Perry Myers

28 Jun 28 Jun

5:05 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 10:26 AM, Steven Dake wrote:

...

On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...

Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.

Well, why not let matahari start at its normal runlevel, and respond to queries, etc, but if you call get-id API, then it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'

Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.

So remove the dep on service start and replace with intelligent application usage

Thoughts?

Perry

Steven Dake

5:25 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 09:05 AM, Perry Myers wrote:

...

On 06/28/2011 10:26 AM, Steven Dake wrote:

...
On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...

Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.

Well, why not let matahari start at its normal runlevel, and respond to queries, etc, but if you call get-id API, then it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'

Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.

So remove the dep on service start and replace with intelligent application usage

This model still needs a mechanism to turn on the matahari init scripts. This could be done with the proposed <commands> oz patch and is what we are using today in pacemaker-cloud test suites.

Regards -stee

...

Thoughts?

Perry _______________________________________________ aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Perry Myers

5:35 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 12:25 PM, Steven Dake wrote:

...

On 06/28/2011 09:05 AM, Perry Myers wrote:

...
On 06/28/2011 10:26 AM, Steven Dake wrote:

...
On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...

Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.

Well, why not let matahari start at its normal runlevel, and respond to queries, etc, but if you call get-id API, then it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'

Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.

So remove the dep on service start and replace with intelligent application usage

This model still needs a mechanism to turn on the matahari init scripts. This could be done with the proposed <commands> oz patch and is what we are using today in pacemaker-cloud test suites.

Agreed. I think 'enabling matahari init scripts' should be done at image creation time, and not as part of post-boot-config/Audrey.

Greg Blomquist

6:47 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 12:05 PM, Perry Myers wrote:

...

On 06/28/2011 10:26 AM, Steven Dake wrote:

...
On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...

Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.

Hmm...I'm not sure I buy this. I don't think there needs to be even this level of dependency. Why can't another mechanism be in charge of putting the /etc/machine-id in place?

It still looks like matahari and audrey both require a machine-id (and even pacemaker?). If they could rely on the same machine id for their individual purposes, then I think we're closer to a better answer.

So, how do we get that machine-id in place? With ec2, it's "user data". I honestly think with our own cloud offerings (rhev-m) we need a solution for this, too. With rhev-m 3, it's supposed to be solved with hooks (I don't claim to know how those work, just that they've been proposed as the solution for injecting data into a launching instance).

I really think that cloud engine should be responsible for creating this machine-id and injecting it into the launching instance. This gives us one place to generate this id and doesn't require mapping this id to other ids in order to translate a single instance between services that want to interact with the instance.

This puts all the logic back into deltacloud. It becomes deltacloud's responsibility to figure out how to interact with the various cloud providers, so that no matter in what provider an instance is running, we can always get to the machine-id for that instance.

Am I seeing this completely upside down? Or, off in the trees by myself on this?

...

Well, why not let matahari start at its normal runlevel, and respond to queries, etc, but if you call get-id API, then it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'

Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.

So remove the dep on service start and replace with intelligent application usage

Thoughts?

Perry _______________________________________________ aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Steven Dake

7:33 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 10:47 AM, Greg Blomquist wrote:

...

On 06/28/2011 12:05 PM, Perry Myers wrote:

...
On 06/28/2011 10:26 AM, Steven Dake wrote:

...
On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...

Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.

Hmm...I'm not sure I buy this. I don't think there needs to be even this level of dependency. Why can't another mechanism be in charge of putting the /etc/machine-id in place?

It still looks like matahari and audrey both require a machine-id (and even pacemaker?). If they could rely on the same machine id for their individual purposes, then I think we're closer to a better answer.

So, how do we get that machine-id in place? With ec2, it's "user data". I honestly think with our own cloud offerings (rhev-m) we need a solution for this, too. With rhev-m 3, it's supposed to be solved with hooks (I don't claim to know how those work, just that they've been proposed as the solution for injecting data into a launching instance).

I really think that cloud engine should be responsible for creating this machine-id and injecting it into the launching instance. This gives us one place to generate this id and doesn't require mapping this id to other ids in order to translate a single instance between services that want to interact with the instance.

This puts all the logic back into deltacloud. It becomes deltacloud's responsibility to figure out how to interact with the various cloud providers, so that no matter in what provider an instance is running, we can always get to the machine-id for that instance.

Am I seeing this completely upside down? Or, off in the trees by myself on this?

I had originally thought deltacloud might be involved here in some way to consistently load instance data into the instance (and then provide a consistent way of accessing it inside the vm that doesn't involve writing 10 different access mechanisms).

I don't know enough about deltacloud, but if the insertion functionality is currently a gap in any virtual machine manager (such as paravirt Xen) or RHEVH, the entire Aeolus solution as well as cloud-ha becomes non-functional on those VMMs.

The question for deltacloud evaluation of this functionality then boils down to: 1) is this functionality available on every deltacloud cloud provider implementation 2) is the functionality inside the vm consistent (does a user only call a "read_my_id" function or read a file inside the vm) - if library access what is the new dependency inside the vm?

Regards -steve

...

...
Well, why not let matahari start at its normal runlevel, and respond to queries, etc, but if you call get-id API, then it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'

Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.

So remove the dep on service start and replace with intelligent application usage

Thoughts?

Perry _______________________________________________ aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Greg Blomquist

8:12 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 02:33 PM, Steven Dake wrote:

...

On 06/28/2011 10:47 AM, Greg Blomquist wrote:

...
On 06/28/2011 12:05 PM, Perry Myers wrote:

...
On 06/28/2011 10:26 AM, Steven Dake wrote:

...
On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote: > Currently most linux distributions that use dbus store a UUID in > /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we > must manipulate this file via oz image creation to match a value we know > about. > > Q1. Is this file freshly created on each image creation/cloning process? > > If not, it should be, because Matahari uses this information to uniquely > identify a host. If it is copied exactly to each new image, that > creates a problem (all hosts appear the same to matahari). > > Q2. If/when it is created by image factory, will it be stored in a > database or other storage medium? > > In pacemaker-cloud we need to have a mapping from image->internal id id > so that we know which VM maps to which deployable HA configuration. > > If we wait on this point until after our 1.0 release, we could end up > with a bunch of images in the field that have either the same machine id > or are not mapped in any way that allows us to provide HA functionality. > > Regards > -steve >

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...

Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.

Hmm...I'm not sure I buy this. I don't think there needs to be even this level of dependency. Why can't another mechanism be in charge of putting the /etc/machine-id in place?

It still looks like matahari and audrey both require a machine-id (and even pacemaker?). If they could rely on the same machine id for their individual purposes, then I think we're closer to a better answer.

So, how do we get that machine-id in place? With ec2, it's "user data". I honestly think with our own cloud offerings (rhev-m) we need a solution for this, too. With rhev-m 3, it's supposed to be solved with hooks (I don't claim to know how those work, just that they've been proposed as the solution for injecting data into a launching instance).

I really think that cloud engine should be responsible for creating this machine-id and injecting it into the launching instance. This gives us one place to generate this id and doesn't require mapping this id to other ids in order to translate a single instance between services that want to interact with the instance.

This puts all the logic back into deltacloud. It becomes deltacloud's responsibility to figure out how to interact with the various cloud providers, so that no matter in what provider an instance is running, we can always get to the machine-id for that instance.

Am I seeing this completely upside down? Or, off in the trees by myself on this?

I talked to clalance for a moment on IRC to figure out what I wasn't seeing. The main piece was the bit between networking and dbus. Since dbus is likely to be up and configured by the time networking starts, there are limitations to how we can get to the "user data" in some cases. Specifically for ec2 (see comments below).

...

...
I had originally thought deltacloud might be involved here in some way to consistently load instance data into the instance (and then provide a consistent way of accessing it inside the vm that doesn't involve writing 10 different access mechanisms).

I don't know enough about deltacloud, but if the insertion functionality is currently a gap in any virtual machine manager (such as paravirt Xen) or RHEVH, the entire Aeolus solution as well as cloud-ha becomes non-functional on those VMMs.

The question for deltacloud evaluation of this functionality then boils down to:

is this functionality available on every deltacloud cloud provider

implementation

Not readily or consistently. For instance, rhev-m has "hook" functionality, but we haven't yet figured out what it means to use the hooks to provide this functionality for rhev-m.

VMWare has some form of data injection available, as long as you install their guest agent, enabling certain api functionality that deltacloud could talk to.

It's significantly different for each provider, that it takes research each time a new provider is taken on.

...

is the functionality inside the vm consistent (does a user only call

a "read_my_id" function or read a file inside the vm) - if library access what is the new dependency inside the vm?

No. In fact, for the case of EC2, this particularly breaks down. Because EC2 stores the "user data" in a place only accessible over the network, the VM needs to ask a known ec2 location for the user data via an HTTP GET.

Since dbus is probably already firmly set in stone by the time networking comes up, it's likely too late to rely on ec2's user data to store to this information and have it impact dbus after retrieval.

...

Regards -steve

...
...
Well, why not let matahari start at its normal runlevel, and respond to queries, etc, but if you call get-id API, then it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'

Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.

So remove the dep on service start and replace with intelligent application usage

Thoughts?

Perry _______________________________________________ aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Chris Lalancette

9:30 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/11 - 03:12:57PM, Greg Blomquist wrote:

...

I talked to clalance for a moment on IRC to figure out what I wasn't seeing. The main piece was the bit between networking and dbus. Since dbus is likely to be up and configured by the time networking starts, there are limitations to how we can get to the "user data" in some cases. Specifically for ec2 (see comments below).

...
...
I had originally thought deltacloud might be involved here in some way to consistently load instance data into the instance (and then provide a consistent way of accessing it inside the vm that doesn't involve writing 10 different access mechanisms).

I don't know enough about deltacloud, but if the insertion functionality is currently a gap in any virtual machine manager (such as paravirt Xen) or RHEVH, the entire Aeolus solution as well as cloud-ha becomes non-functional on those VMMs.

The question for deltacloud evaluation of this functionality then boils down to:

is this functionality available on every deltacloud cloud provider

implementation

Not readily or consistently. For instance, rhev-m has "hook" functionality, but we haven't yet figured out what it means to use the hooks to provide this functionality for rhev-m.

VMWare has some form of data injection available, as long as you install their guest agent, enabling certain api functionality that deltacloud could talk to.

It's significantly different for each provider, that it takes research each time a new provider is taken on.

...

is the functionality inside the vm consistent (does a user only call

a "read_my_id" function or read a file inside the vm) - if library access what is the new dependency inside the vm?

No. In fact, for the case of EC2, this particularly breaks down. Because EC2 stores the "user data" in a place only accessible over the network, the VM needs to ask a known ec2 location for the user data via an HTTP GET.

Since dbus is probably already firmly set in stone by the time networking comes up, it's likely too late to rely on ec2's user data to store to this information and have it impact dbus after retrieval.

Just to be clear here, I don't know that this is the case. I haven't personally looked into dbus, and how to configure the UUID on-the-fly. This definitely bears investigation, as I feel that getting dbus to properly expose this information is our best path forward.

-- Chris Lalancette

Steven Dake

11:12 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/2011 01:30 PM, Chris Lalancette wrote:

...

On 06/28/11 - 03:12:57PM, Greg Blomquist wrote:

...
I talked to clalance for a moment on IRC to figure out what I wasn't seeing. The main piece was the bit between networking and dbus. Since dbus is likely to be up and configured by the time networking starts, there are limitations to how we can get to the "user data" in some cases. Specifically for ec2 (see comments below).

...
...
I had originally thought deltacloud might be involved here in some way to consistently load instance data into the instance (and then provide a consistent way of accessing it inside the vm that doesn't involve writing 10 different access mechanisms).

I don't know enough about deltacloud, but if the insertion functionality is currently a gap in any virtual machine manager (such as paravirt Xen) or RHEVH, the entire Aeolus solution as well as cloud-ha becomes non-functional on those VMMs.

The question for deltacloud evaluation of this functionality then boils down to:

is this functionality available on every deltacloud cloud provider

implementation

Not readily or consistently. For instance, rhev-m has "hook" functionality, but we haven't yet figured out what it means to use the hooks to provide this functionality for rhev-m.

VMWare has some form of data injection available, as long as you install their guest agent, enabling certain api functionality that deltacloud could talk to.

It's significantly different for each provider, that it takes research each time a new provider is taken on.

...

is the functionality inside the vm consistent (does a user only call

a "read_my_id" function or read a file inside the vm) - if library access what is the new dependency inside the vm?

No. In fact, for the case of EC2, this particularly breaks down. Because EC2 stores the "user data" in a place only accessible over the network, the VM needs to ask a known ec2 location for the user data via an HTTP GET.

Since dbus is probably already firmly set in stone by the time networking comes up, it's likely too late to rely on ec2's user data to store to this information and have it impact dbus after retrieval.

Just to be clear here, I don't know that this is the case. I haven't personally looked into dbus, and how to configure the UUID on-the-fly. This definitely bears investigation, as I feel that getting dbus to properly expose this information is our best path forward.

looks like upstream is taking a crack at moving this into systemd rather then dbus........

http://lists.freedesktop.org/archives/dbus/2011-March/014187.html

dbus does not have a set system id API.

if we add a dbus set system id API and call it, say at the end of the startup sequence, some apps will be aware of the old id and others will be aware of the new id. I expect this would break things badly, but I am not a dbus expert so can't comment here.

Perhaps we should let dbus do its own thing and just create a new file. Then we have two files

/etc/machine_id (worthless in a cloud environment except for dbus operation) /etc/vm_machine_id (written by audrey or a dbus/systemd api) which can be set after the network is operational. Then the machine id logic can be put in one place (audrey) rather then a bunch of separate apps.

I still have a bit of confusion how current VMs get their unique id inside a vm in the current model to communicate with Audrey's config server.

more fuel on the fire: http://www.freedesktop.org/wiki/Software/systemd/hostnamed

Lets go back to basics, the requirements:

R1: vm needs to have some mechanism of retrieving a globally unique UUID across all VMs in a cloud enviornment R2. whatever launches the vm needs to know how to map a UUID to an image R3. To obtain the vm UUID inside a vm, the network must be operational (is this true in atleast some cloud environments? If so, then it is a requirement).

Our current limits: systemd/dbus are nearly the first things that start and we can't teach dbus about the vm UUID because of R3.

Regards -steve

Chris Lalancette

29 Jun 29 Jun

1:11 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/28/11 - 03:12:02PM, Steven Dake wrote:

...

...
...
...
I had originally thought deltacloud might be involved here in some way to consistently load instance data into the instance (and then provide a consistent way of accessing it inside the vm that doesn't involve writing 10 different access mechanisms).

I don't know enough about deltacloud, but if the insertion functionality is currently a gap in any virtual machine manager (such as paravirt Xen) or RHEVH, the entire Aeolus solution as well as cloud-ha becomes non-functional on those VMMs.

The question for deltacloud evaluation of this functionality then boils down to:

is this functionality available on every deltacloud cloud provider

implementation

Not readily or consistently. For instance, rhev-m has "hook" functionality, but we haven't yet figured out what it means to use the hooks to provide this functionality for rhev-m.

VMWare has some form of data injection available, as long as you install their guest agent, enabling certain api functionality that deltacloud could talk to.

It's significantly different for each provider, that it takes research each time a new provider is taken on.

...

is the functionality inside the vm consistent (does a user only call

a "read_my_id" function or read a file inside the vm) - if library access what is the new dependency inside the vm?

No. In fact, for the case of EC2, this particularly breaks down. Because EC2 stores the "user data" in a place only accessible over the network, the VM needs to ask a known ec2 location for the user data via an HTTP GET.

Since dbus is probably already firmly set in stone by the time networking comes up, it's likely too late to rely on ec2's user data to store to this information and have it impact dbus after retrieval.

Just to be clear here, I don't know that this is the case. I haven't personally looked into dbus, and how to configure the UUID on-the-fly. This definitely bears investigation, as I feel that getting dbus to properly expose this information is our best path forward.

looks like upstream is taking a crack at moving this into systemd rather then dbus........

http://lists.freedesktop.org/archives/dbus/2011-March/014187.html

dbus does not have a set system id API.

if we add a dbus set system id API and call it, say at the end of the startup sequence, some apps will be aware of the old id and others will be aware of the new id. I expect this would break things badly, but I am not a dbus expert so can't comment here.

Perhaps we should let dbus do its own thing and just create a new file. Then we have two files

/etc/machine_id (worthless in a cloud environment except for dbus operation) /etc/vm_machine_id (written by audrey or a dbus/systemd api) which can be set after the network is operational. Then the machine id logic can be put in one place (audrey) rather then a bunch of separate apps.

I still have a bit of confusion how current VMs get their unique id inside a vm in the current model to communicate with Audrey's config server.

more fuel on the fire: http://www.freedesktop.org/wiki/Software/systemd/hostnamed

Lets go back to basics, the requirements:

R1: vm needs to have some mechanism of retrieving a globally unique UUID across all VMs in a cloud enviornment R2. whatever launches the vm needs to know how to map a UUID to an image R3. To obtain the vm UUID inside a vm, the network must be operational (is this true in atleast some cloud environments? If so, then it is a requirement).

Yes, this is all about right. Requirement 3 is the difficult one.

But maybe we are looking at this the wrong way. If we start from the premise that dbus/systemd can generate it's own UUID at boot time (suitably configured), then maybe we should think about using *that* as the UUID for the machine. What this would then require is a mapping between the UUID the cloud machinery has (i.e. the one that was generated prior to launch) and the UUID that dbus generated. It's a little ugly, but it prevents us from having problems when we restart dbus/systemd on an already running system.

Again, I'm just throwing out ideas here. I still think we need to look closer at dbus/systemd and see if we can possibly send some patches to those to support changing the UUID in a cleaner manner. Then we can determine what the best path forward is.

-- Chris Lalancette

Greg Blomquist

1:37 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/29/2011 08:11 AM, Chris Lalancette wrote:

...

On 06/28/11 - 03:12:02PM, Steven Dake wrote:

...
...
...
...
I had originally thought deltacloud might be involved here in some way to consistently load instance data into the instance (and then provide a consistent way of accessing it inside the vm that doesn't involve writing 10 different access mechanisms).

I don't know enough about deltacloud, but if the insertion functionality is currently a gap in any virtual machine manager (such as paravirt Xen) or RHEVH, the entire Aeolus solution as well as cloud-ha becomes non-functional on those VMMs.

The question for deltacloud evaluation of this functionality then boils down to:

is this functionality available on every deltacloud cloud provider

implementation

Not readily or consistently. For instance, rhev-m has "hook" functionality, but we haven't yet figured out what it means to use the hooks to provide this functionality for rhev-m.

VMWare has some form of data injection available, as long as you install their guest agent, enabling certain api functionality that deltacloud could talk to.

It's significantly different for each provider, that it takes research each time a new provider is taken on.

...

is the functionality inside the vm consistent (does a user only call

a "read_my_id" function or read a file inside the vm) - if library access what is the new dependency inside the vm?

No. In fact, for the case of EC2, this particularly breaks down. Because EC2 stores the "user data" in a place only accessible over the network, the VM needs to ask a known ec2 location for the user data via an HTTP GET.

Since dbus is probably already firmly set in stone by the time networking comes up, it's likely too late to rely on ec2's user data to store to this information and have it impact dbus after retrieval.

Just to be clear here, I don't know that this is the case. I haven't personally looked into dbus, and how to configure the UUID on-the-fly. This definitely bears investigation, as I feel that getting dbus to properly expose this information is our best path forward.

looks like upstream is taking a crack at moving this into systemd rather then dbus........

http://lists.freedesktop.org/archives/dbus/2011-March/014187.html

dbus does not have a set system id API.

if we add a dbus set system id API and call it, say at the end of the startup sequence, some apps will be aware of the old id and others will be aware of the new id. I expect this would break things badly, but I am not a dbus expert so can't comment here.

Perhaps we should let dbus do its own thing and just create a new file. Then we have two files

/etc/machine_id (worthless in a cloud environment except for dbus operation) /etc/vm_machine_id (written by audrey or a dbus/systemd api) which can be set after the network is operational. Then the machine id logic can be put in one place (audrey) rather then a bunch of separate apps.

I still have a bit of confusion how current VMs get their unique id inside a vm in the current model to communicate with Audrey's config server.

more fuel on the fire: http://www.freedesktop.org/wiki/Software/systemd/hostnamed

Lets go back to basics, the requirements:

R1: vm needs to have some mechanism of retrieving a globally unique UUID across all VMs in a cloud enviornment R2. whatever launches the vm needs to know how to map a UUID to an image R3. To obtain the vm UUID inside a vm, the network must be operational (is this true in atleast some cloud environments? If so, then it is a requirement).

Yes, this is all about right. Requirement 3 is the difficult one.

But maybe we are looking at this the wrong way. If we start from the premise that dbus/systemd can generate it's own UUID at boot time (suitably configured), then maybe we should think about using *that* as the UUID for the machine. What this would then require is a mapping between the UUID the cloud machinery has (i.e. the one that was generated prior to launch) and the UUID that dbus generated. It's a little ugly, but it prevents us from having problems when we restart dbus/systemd on an already running system.

Again, I'm just throwing out ideas here. I still think we need to look closer at dbus/systemd and see if we can possibly send some patches to those to support changing the UUID in a cleaner manner. Then we can determine what the best path forward is.

This makes me wonder if copy-on-write can be used here to seed the uuid we want. Knowing nothing about how to leverage copy-on-write and nothing about dbus puts a pretty big cloud over this for me. But, thought I'd throw it out there...

There are a couple guys in the office here that work with dbus, I'll see what they have to say.

Steven Dake

6:39 p.m.

New subject: [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/29/2011 05:11 AM, Chris Lalancette wrote:

...

On 06/28/11 - 03:12:02PM, Steven Dake wrote:

...
...
...
...
I had originally thought deltacloud might be involved here in some way to consistently load instance data into the instance (and then provide a consistent way of accessing it inside the vm that doesn't involve writing 10 different access mechanisms).

I don't know enough about deltacloud, but if the insertion functionality is currently a gap in any virtual machine manager (such as paravirt Xen) or RHEVH, the entire Aeolus solution as well as cloud-ha becomes non-functional on those VMMs.

The question for deltacloud evaluation of this functionality then boils down to:

is this functionality available on every deltacloud cloud provider

implementation

Not readily or consistently. For instance, rhev-m has "hook" functionality, but we haven't yet figured out what it means to use the hooks to provide this functionality for rhev-m.

VMWare has some form of data injection available, as long as you install their guest agent, enabling certain api functionality that deltacloud could talk to.

It's significantly different for each provider, that it takes research each time a new provider is taken on.

...

is the functionality inside the vm consistent (does a user only call

a "read_my_id" function or read a file inside the vm) - if library access what is the new dependency inside the vm?

No. In fact, for the case of EC2, this particularly breaks down. Because EC2 stores the "user data" in a place only accessible over the network, the VM needs to ask a known ec2 location for the user data via an HTTP GET.

Since dbus is probably already firmly set in stone by the time networking comes up, it's likely too late to rely on ec2's user data to store to this information and have it impact dbus after retrieval.

Just to be clear here, I don't know that this is the case. I haven't personally looked into dbus, and how to configure the UUID on-the-fly. This definitely bears investigation, as I feel that getting dbus to properly expose this information is our best path forward.

looks like upstream is taking a crack at moving this into systemd rather then dbus........

http://lists.freedesktop.org/archives/dbus/2011-March/014187.html

dbus does not have a set system id API.

if we add a dbus set system id API and call it, say at the end of the startup sequence, some apps will be aware of the old id and others will be aware of the new id. I expect this would break things badly, but I am not a dbus expert so can't comment here.

Perhaps we should let dbus do its own thing and just create a new file. Then we have two files

/etc/machine_id (worthless in a cloud environment except for dbus operation) /etc/vm_machine_id (written by audrey or a dbus/systemd api) which can be set after the network is operational. Then the machine id logic can be put in one place (audrey) rather then a bunch of separate apps.

I still have a bit of confusion how current VMs get their unique id inside a vm in the current model to communicate with Audrey's config server.

more fuel on the fire: http://www.freedesktop.org/wiki/Software/systemd/hostnamed

Lets go back to basics, the requirements:

R1: vm needs to have some mechanism of retrieving a globally unique UUID across all VMs in a cloud enviornment R2. whatever launches the vm needs to know how to map a UUID to an image R3. To obtain the vm UUID inside a vm, the network must be operational (is this true in atleast some cloud environments? If so, then it is a requirement).

Yes, this is all about right. Requirement 3 is the difficult one.

But maybe we are looking at this the wrong way. If we start from the premise that dbus/systemd can generate it's own UUID at boot time (suitably configured), then maybe we should think about using *that* as the UUID for the machine. What this would then require is a mapping between the UUID

This would be a great option if anyone has any ideas how to generate that mapping. The issue I get stuck on when thinking through this model is how does the dbus id get out of the VM to make the mapping. The problem is easy if only one vm starts at a time, but when multiple vms start, I don't see a reasonable way to correlate the info.

This creates a different set of requirements: 1) dbus/systed must generate a fresh machine id on each boot 2) a translation service must exist to convert dbus ids to cloud ids.

1 seems challenging but solvable if we plan to run on older distros (since these distros will have older versions of dbus/systemd). The id currently isn't generated freshly on each boot.

One hacky workaround to 1 is the addition of an init script that deletes the machine id file if it exists as the first thing that runs in the system for older distros, and newer distros systemd could be configured in some way to generate a fresh uuid via changes to their code base.

2 is more challenging - I don't have many ideas here. Audrey could potentially determine the runtime vm UUID and send it to the config server (which then could act as a translation service). This model turns the audrey config service into a T component which from a design standpoint is difficult to manage..

...

the cloud machinery has (i.e. the one that was generated prior to launch) and the UUID that dbus generated. It's a little ugly, but it prevents us from having problems when we restart dbus/systemd on an already running system.

Again, I'm just throwing out ideas here. I still think we need to look closer at dbus/systemd and see if we can possibly send some patches to those to support changing the UUID in a cleaner manner. Then we can determine what the best path forward is.

This leads to the same legacy problem - older distros wont have those patch sets and audrey as well as pacemaker-cloud wont be operational on those legacy distros.

Andrew Beekhof

30 Jun 30 Jun

5:32 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Wed, Jun 29, 2011 at 2:05 AM, Perry Myers pmyers@redhat.com wrote:

...

On 06/28/2011 10:26 AM, Steven Dake wrote:

...
On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...

Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.

Well, why not let matahari start at its normal runlevel, and respond to queries, etc, but if you call get-id API, then it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'

Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.

So remove the dep on service start and replace with intelligent application usage

Thoughts?

TBH, I'm not a fan.

Partly because that UUID is a property and attribute of every agent and are initialised at startup - I don't like the idea of those being volatile. It also pre-supposes that you'd never want the dbus id if /etc/machine-id is present.

We talked in the past about providing access to both a hardware _and_ a software uuid. Which is /etc/machine-id supposed to be? If the former, then the solution is easy - add an extra API call.

Perry Myers

9:56 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

...

TBH, I'm not a fan.

Partly because that UUID is a property and attribute of every agent and are initialised at startup - I don't like the idea of those being volatile.

ick, I forgot about that...

...

It also pre-supposes that you'd never want the dbus id if /etc/machine-id is present.

right, that is a bad assumption. My original ideas around exposing UID in matahari APIs was to have _several_ api calls depending on what you considered to be the proper UUID

So maybe we need to have calls like:

get_dbus_uuid (/var/lib/dbus/machine_id) get_machine_uuid (/etc/machine-id) get_smbios_uuid

And if we _really_ need a separate vm_machine_id get_vm_machine_uuid (/etc/vm_machine_id)

Yes, it's silly to have so many uuids, and I'd love to declare one of these to be canonical so we could deprecate the rest. But until that happens, I fear we're going to need to at least provide support for exposing all of them...

Then Aeolus and CloudHA can pick which ID they want to set and standardize on.

The question specific to matahari would be: which uuid should we use consistently for each agent's core uuid? Right now we use the dbus one right? If no one is going to muck with that and reset it, I'm happy to continue using it for our purposes.

...

We talked in the past about providing access to both a hardware _and_ a software uuid. Which is /etc/machine-id supposed to be? If the former, then the solution is easy - add an extra API call.

Andrew Beekhof

1 Jul 1 Jul

7:09 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Fri, Jul 1, 2011 at 6:56 AM, Perry Myers pmyers@redhat.com wrote:

...

...
TBH, I'm not a fan.

Partly because that UUID is a property and attribute of every agent and are initialised at startup - I don't like the idea of those being volatile.

ick, I forgot about that...

...
It also pre-supposes that you'd never want the dbus id if /etc/machine-id is present.

right, that is a bad assumption. My original ideas around exposing UID in matahari APIs was to have _several_ api calls depending on what you considered to be the proper UUID

So maybe we need to have calls like:

get_dbus_uuid (/var/lib/dbus/machine_id) get_machine_uuid (/etc/machine-id) get_smbios_uuid

And if we _really_ need a separate vm_machine_id get_vm_machine_uuid (/etc/vm_machine_id)

Slight variation on the theme... I think we should focus on the properties of the uuids rather than where they live and/or who generates them.

So I'm thinking we want 5 uuids and probably one accessor, where each uuid would have a different lifetime: - externally configured - generated once for the lifetime of the vm - generated every time the hardware changes - generated every time the machine boots - generated every time an agent starts

With the accessor being: get_uuid(lifetime), and lifetime ::= user|forever|hardware|boot|matahari

The externally configured one would/could be set as part of our "name other than config or postboot" agent process. People then pick whichever uuid has the properties they need.

I imagine matahari would provide the first two as attributes/properties also.

...

Yes, it's silly to have so many uuids, and I'd love to declare one of these to be canonical so we could deprecate the rest. But until that happens, I fear we're going to need to at least provide support for exposing all of them...

Then Aeolus and CloudHA can pick which ID they want to set and standardize on.

The question specific to matahari would be: which uuid should we use consistently for each agent's core uuid? Right now we use the dbus one right? If no one is going to muck with that and reset it, I'm happy to continue using it for our purposes.

...
We talked in the past about providing access to both a hardware _and_ a software uuid. Which is /etc/machine-id supposed to be? If the former, then the solution is easy - add an extra API call.

Perry Myers

2:10 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

...

...
...
It also pre-supposes that you'd never want the dbus id if /etc/machine-id is present.

right, that is a bad assumption. My original ideas around exposing UID in matahari APIs was to have _several_ api calls depending on what you considered to be the proper UUID

So maybe we need to have calls like:

get_dbus_uuid (/var/lib/dbus/machine_id) get_machine_uuid (/etc/machine-id) get_smbios_uuid

And if we _really_ need a separate vm_machine_id get_vm_machine_uuid (/etc/vm_machine_id)

Slight variation on the theme... I think we should focus on the properties of the uuids rather than where they live and/or who generates them.

So I'm thinking we want 5 uuids and probably one accessor, where each uuid would have a different lifetime:

externally configured

So as you say below, this is set by postboot, current thinking is that postboot puts this in /etc/vm-machine-id, right?

...

generated once for the lifetime of the vm

How is this different than the one above?

...

generated every time the hardware changes

This is == to smbios uuid right?

...

generated every time the machine boots

This is == DBus/systemd uuid right?

...

generated every time an agent starts

What does this map to, and what's the value in it? Does every Agent on a given guest get a different 'agent uuid'? If so, I think I see why you'd want this.

...

With the accessor being: get_uuid(lifetime), and lifetime ::= user|forever|hardware|boot|matahari

I'd call the last one 'agent' not 'matahari' if it's unique to each agent.

...

The externally configured one would/could be set as part of our "name other than config or postboot" agent process. People then pick whichever uuid has the properties they need.

I imagine matahari would provide the first two as attributes/properties also.

No objections to the general concepts above, just the nitpicky questions I asked need to be clarified :)

Perry

Andrew Beekhof

4 Jul 4 Jul

8:29 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Fri, Jul 1, 2011 at 11:10 PM, Perry Myers pmyers@redhat.com wrote:

...

...
...
...
It also pre-supposes that you'd never want the dbus id if /etc/machine-id is present.

right, that is a bad assumption. My original ideas around exposing UID in matahari APIs was to have _several_ api calls depending on what you considered to be the proper UUID

So maybe we need to have calls like:

get_dbus_uuid (/var/lib/dbus/machine_id) get_machine_uuid (/etc/machine-id) get_smbios_uuid

And if we _really_ need a separate vm_machine_id get_vm_machine_uuid (/etc/vm_machine_id)

Slight variation on the theme... I think we should focus on the properties of the uuids rather than where they live and/or who generates them.

So I'm thinking we want 5 uuids and probably one accessor, where each uuid would have a different lifetime:

externally configured

So as you say below, this is set by postboot, current thinking is that postboot puts this in /etc/vm-machine-id, right?

Potentially. Where it lives should not be important for entities accessing it via qmf. At most the image creation tool may need to know the location.

...

...

generated once for the lifetime of the vm

How is this different than the one above?

One is configurable, this one is not. In practice they may have the same lifetime (as may the hardware and even per-boot uuids), however while this one will be populated if missing, the previous one will not.

...

...

generated every time the hardware changes

This is == to smbios uuid right?

Potentially, assuming it has the required properties. I'm not planning waste time re-inventing the wheel.

...

...

generated every time the machine boots

This is == DBus/systemd uuid right?

I believe not. I believe the DBus uuid persists across reboots, but potentially not across upgrades. I would be happy to be mistaken though.

...

...

generated every time an agent starts

What does this map to, and what's the value in it? Does every Agent on a given guest get a different 'agent uuid'?

Yes. So it exists in case anyone needs to know if the agent it was talking to previously is the same as it it talking to now. I'd not expect many users of this one.

...

If so, I think I see why you'd want this.

...
With the accessor being: get_uuid(lifetime), and lifetime ::= user|forever|hardware|boot|matahari

I'd call the last one 'agent' not 'matahari' if it's unique to each agent.

Sure

...

...
The externally configured one would/could be set as part of our "name other than config or postboot" agent process. People then pick whichever uuid has the properties they need.

I imagine matahari would provide the first two as attributes/properties also.

No objections to the general concepts above, just the nitpicky questions I asked need to be clarified :)

Perry

Perry Myers

5 Jul 5 Jul

2:58 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

...

...
...

generated every time the machine boots

This is == DBus/systemd uuid right?

I believe not. I believe the DBus uuid persists across reboots, but potentially not across upgrades.

Is that by design? If not, then it seems like we could improve that particular uuid by submitting a bug to have dbus/systemd uuids persist across upgrades.

ack to your other comments on this thread, can you write up an API proposal then with the new functions and properties that we'll be exposing for review on list?

Perry

Andrew Beekhof

21 Jul 21 Jul

7:24 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Tue, Jul 5, 2011 at 11:58 PM, Perry Myers pmyers@redhat.com wrote:

...

...
...
...

generated every time the machine boots

This is == DBus/systemd uuid right?

I believe not. I believe the DBus uuid persists across reboots, but potentially not across upgrades.

Is that by design? If not, then it seems like we could improve that particular uuid by submitting a bug to have dbus/systemd uuids persist across upgrades.

ack to your other comments on this thread, can you write up an API proposal then with the new functions and properties that we'll be exposing for review on list?

Sorry for the delay... here are my proposed changes to the host api wrt. uuids.

<method name="set_uuid" desc="Set a UUID with the specified lifetime" > <arg name="lifetime" dir="I" type="sstr" /> <arg name="uuid" dir="I" type="sstr" /> </method>

Perry Myers

9:55 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/21/2011 02:24 AM, Andrew Beekhof wrote:

...

On Tue, Jul 5, 2011 at 11:58 PM, Perry Myerspmyers@redhat.com wrote:

...
...
...
...

generated every time the machine boots

This is == DBus/systemd uuid right?

I believe not. I believe the DBus uuid persists across reboots, but potentially not across upgrades.

Is that by design? If not, then it seems like we could improve that particular uuid by submitting a bug to have dbus/systemd uuids persist across upgrades.

ack to your other comments on this thread, can you write up an API proposal then with the new functions and properties that we'll be exposing for review on list?

Sorry for the delay... here are my proposed changes to the host api wrt. uuids.



<method name="get_uuid" desc="Obtain a UUID with the specified lifetime from the machine">

<arg name="lifetime" dir="I" type="sstr" /> </method>



<method name="set_uuid" desc="Set a UUID with the specified lifetime"> <arg name="lifetime" dir="I" type="sstr" /> <arg name="uuid" dir="I" type="sstr" /> </method>

Thanks for writing this up. My only question now is... what existing uuid implementations map here?

i.e. for the implementation of hardware UUID we plan on using X, for the implementation of immutable UUID we will use /etc/machine-id

Stuff along those lines would complete the matrix needed here.

Perry

Andrew Beekhof

11:19 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Fri, Jul 22, 2011 at 6:55 AM, Perry Myers pmyers@redhat.com wrote:

...

On 07/21/2011 02:24 AM, Andrew Beekhof wrote:

...
On Tue, Jul 5, 2011 at 11:58 PM, Perry Myerspmyers@redhat.com wrote:

...
...
...
...

generated every time the machine boots

This is == DBus/systemd uuid right?

I believe not. I believe the DBus uuid persists across reboots, but potentially not across upgrades.

Is that by design? If not, then it seems like we could improve that particular uuid by submitting a bug to have dbus/systemd uuids persist across upgrades.

ack to your other comments on this thread, can you write up an API proposal then with the new functions and properties that we'll be exposing for review on list?

Sorry for the delay... here are my proposed changes to the host api wrt. uuids.



<method name="get_uuid" desc="Obtain a UUID with the specified lifetime from the machine"> <arg name="lifetime" dir="I" type="sstr" />

</method>



<method name="set_uuid" desc="Set a UUID with the specified lifetime"> <arg name="lifetime" dir="I" type="sstr" /> <arg name="uuid" dir="I" type="sstr" />

</method>

Thanks for writing this up. My only question now is... what existing uuid implementations map here?

i.e. for the implementation of hardware UUID we plan on using X, for the implementation of immutable UUID we will use /etc/machine-id

Are there any conditions under which /etc/machine-id would get regenerated? If so that would rule it out.

These were my initial thoughts:

Immutable: /etc/machine-id (systemd) Hardware: ? Something on top of the smbios API Reboot: ? /var/lib/dbus/machine-id (dbus) Agent: In memory using the libuuid API User: /etc/custom-machine-id

I intentionally left them out originally so people would concentrate on the API itself ;-)

...

Stuff along those lines would complete the matrix needed here.

Perry

Mark McLoughlin

22 Jul 22 Jul

5:40 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

Hi Andrew,

On Fri, 2011-07-22 at 08:19 +1000, Andrew Beekhof wrote:

...

Are there any conditions under which /etc/machine-id would get regenerated? If so that would rule it out.

These were my initial thoughts:

Immutable: /etc/machine-id (systemd) Hardware: ? Something on top of the smbios API Reboot: ? /var/lib/dbus/machine-id (dbus) Agent: In memory using the libuuid API User: /etc/custom-machine-id

Okay, AIUI it (and I've take a quick look to confirm), both /etc/machine-id and /var/lib/dbus/machine-id are basically equivalent concepts

They are both a UUID that should be generated when the machine is installed or booted for the first time and not change thereafter.

Looking at the code, I think dbus and systemd make an effort to have these UUIDs be identical, but I'm not 100% sure. Hmm, I just checked an F-16 machine and the two UUIDs are different.

The reference you've probably seen in the dbus docs to the UUID changing on reboot only applies to stateless systems - the UUID is stored on a part of the filesystem which may not persist across reboots. That could equally apply to /etc/machine-id too.

In VM disk images, both of these UUIDs should be deleted so that they are generated when a new VM is booted from the image. Similar to what is done for host SSH keys.

On the RHEV-M side, there's a UUID in SMBIOS that has nothing to do with dbus or systemd. This UUID corresponds to the instance ID you'd see in the deltacloud API.

When you launch a deployable, you'll get the list of deltacloud instance IDs back. I'm guessing you want Matahari to reliably report this UUID back as a "system hardware UUID"?

In that case, the answer when running under RHEV-M is to read the UUID from SMBIOS.

Under EC2, you get the ID from:

http://169.254.169.254/latest/meta-data/instance-id

and this isn't a UUID at all. I'm not sure about other clouds.

Now, config server seems to have its notion of a machine UUID. This is passed to the instance at launch time and the audrey startup script has logic to find it. The idea is that this is supplied to config server by whatever is launching the instance. I'm not sure why this wouldn't also be the deltacloud ID.

Summary:

- For the HA stuff, I think you need Matahari to be able to reliably report the deltacloud instance ID a "system hardware ID" or similar

- That's unrelated to the systemd and dbus IDs, but these IDs should be unique to each machine too.

- You should be able to determine the deltacloud instance ID from inside each VM.

- This deltacloud instance ID should be what Audrey is using when contacting config server.

- Audrey and Matahari should have the same logic for figuring this ID out.

Cheers, Mark.

Greg Blomquist

6:05 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/22/2011 12:40 PM, Mark McLoughlin wrote:

...

Hi Andrew,

On Fri, 2011-07-22 at 08:19 +1000, Andrew Beekhof wrote:

...
Are there any conditions under which /etc/machine-id would get regenerated? If so that would rule it out.

These were my initial thoughts:

Immutable: /etc/machine-id (systemd) Hardware: ? Something on top of the smbios API Reboot: ? /var/lib/dbus/machine-id (dbus) Agent: In memory using the libuuid API User: /etc/custom-machine-id

Okay, AIUI it (and I've take a quick look to confirm), both /etc/machine-id and /var/lib/dbus/machine-id are basically equivalent concepts

They are both a UUID that should be generated when the machine is installed or booted for the first time and not change thereafter.

Looking at the code, I think dbus and systemd make an effort to have these UUIDs be identical, but I'm not 100% sure. Hmm, I just checked an F-16 machine and the two UUIDs are different.

The reference you've probably seen in the dbus docs to the UUID changing on reboot only applies to stateless systems - the UUID is stored on a part of the filesystem which may not persist across reboots. That could equally apply to /etc/machine-id too.

In VM disk images, both of these UUIDs should be deleted so that they are generated when a new VM is booted from the image. Similar to what is done for host SSH keys.

On the RHEV-M side, there's a UUID in SMBIOS that has nothing to do with dbus or systemd. This UUID corresponds to the instance ID you'd see in the deltacloud API.

When you launch a deployable, you'll get the list of deltacloud instance IDs back. I'm guessing you want Matahari to reliably report this UUID back as a "system hardware UUID"?

In that case, the answer when running under RHEV-M is to read the UUID from SMBIOS.

Under EC2, you get the ID from:

http://169.254.169.254/latest/meta-data/instance-id

and this isn't a UUID at all. I'm not sure about other clouds.

Now, config server seems to have its notion of a machine UUID. This is passed to the instance at launch time and the audrey startup script has logic to find it. The idea is that this is supplied to config server by whatever is launching the instance. I'm not sure why this wouldn't also be the deltacloud ID.

If the deltacloud ID is the UUID presented to deltacloud by the cloud provider for the launched instance, then the only reason not to use the deltacloud ID for the config server and audrey startup script is that it requires the guest to launch and report this ID before the configurations can be handed to the config server.

That's not terrible, but just makes the launch sequence more serial instead of parallelizing the launch and seeding the configs to the config server. If it's necessary to reduce the number of IDs floating around, we can do this.

...

Summary:

For the HA stuff, I think you need Matahari to be able to reliably report the deltacloud instance ID a "system hardware ID" or similar

That's unrelated to the systemd and dbus IDs, but these IDs should be unique to each machine too.

You should be able to determine the deltacloud instance ID from inside each VM.

This deltacloud instance ID should be what Audrey is using when contacting config server.

Audrey and Matahari should have the same logic for figuring this ID out.

Cheers, Mark.

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Mark McLoughlin

6:16 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

Hi Greg,

On Fri, 2011-07-22 at 13:05 -0400, Greg Blomquist wrote:

...

If the deltacloud ID is the UUID presented to deltacloud by the cloud provider for the launched instance, then the only reason not to use the deltacloud ID for the config server and audrey startup script is that it requires the guest to launch and report this ID before the configurations can be handed to the config server.

Yeah, it's not ideal.

...

That's not terrible, but just makes the launch sequence more serial instead of parallelizing the launch and seeding the configs to the config server. If it's necessary to reduce the number of IDs floating around, we can do this.

Well, AFAICT, the only ID that the cloud HA stuff will know about from conductor or deltacloud is the deltacloud instance ID. So, that's what we want the guest agent to use when reporting back.

If you choose to add another ID for config server purposes, that would mean the cloud HA stuff would need to find that out. And I think that's a step too far in terms of intertwining the architecture. The cloud HA stuff should work with standalone deltacloud, without conductor or config server IMHO.

Cheers, Mark.

Greg Blomquist

6:22 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/22/2011 01:16 PM, Mark McLoughlin wrote:

...

Hi Greg,

On Fri, 2011-07-22 at 13:05 -0400, Greg Blomquist wrote:

...
If the deltacloud ID is the UUID presented to deltacloud by the cloud provider for the launched instance, then the only reason not to use the deltacloud ID for the config server and audrey startup script is that it requires the guest to launch and report this ID before the configurations can be handed to the config server.

Yeah, it's not ideal.

...
That's not terrible, but just makes the launch sequence more serial instead of parallelizing the launch and seeding the configs to the config server. If it's necessary to reduce the number of IDs floating around, we can do this.

Well, AFAICT, the only ID that the cloud HA stuff will know about from conductor or deltacloud is the deltacloud instance ID. So, that's what we want the guest agent to use when reporting back.

Ah, good point.

...

If you choose to add another ID for config server purposes, that would mean the cloud HA stuff would need to find that out. And I think that's a step too far in terms of intertwining the architecture. The cloud HA stuff should work with standalone deltacloud, without conductor or config server IMHO.

Yep, I'm sold.

The first cut I'm working on will generate IDs for Audrey. But, we can circle back on this and revamp to use the deltacloud ID. The two changes this implies on the Audrey side are 1) the audrey startup script should use the UUID API to acquire the Immutable UUID (assuming this is what maps to Deltacloud ID) 2) the conductor launch sequence will be modified to wait to push configs to config server until the instance is launched and can report back the deltacloud ID

Sound about right?

...

Cheers, Mark.

Steven Dake

7:34 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/22/2011 10:22 AM, Greg Blomquist wrote:

...

On 07/22/2011 01:16 PM, Mark McLoughlin wrote:

...
Hi Greg,

On Fri, 2011-07-22 at 13:05 -0400, Greg Blomquist wrote:

...
If the deltacloud ID is the UUID presented to deltacloud by the cloud provider for the launched instance, then the only reason not to use the deltacloud ID for the config server and audrey startup script is that it requires the guest to launch and report this ID before the configurations can be handed to the config server.

Yeah, it's not ideal.

...
That's not terrible, but just makes the launch sequence more serial instead of parallelizing the launch and seeding the configs to the config server. If it's necessary to reduce the number of IDs floating around, we can do this.

Well, AFAICT, the only ID that the cloud HA stuff will know about from conductor or deltacloud is the deltacloud instance ID. So, that's what we want the guest agent to use when reporting back.

Ah, good point.

...
If you choose to add another ID for config server purposes, that would mean the cloud HA stuff would need to find that out. And I think that's a step too far in terms of intertwining the architecture. The cloud HA stuff should work with standalone deltacloud, without conductor or config server IMHO.

Yep, I'm sold.

The first cut I'm working on will generate IDs for Audrey. But, we can circle back on this and revamp to use the deltacloud ID. The two changes this implies on the Audrey side are

the audrey startup script should use the UUID API to acquire the

Immutable UUID (assuming this is what maps to Deltacloud ID) 2) the conductor launch sequence will be modified to wait to push configs to config server until the instance is launched and can report back the deltacloud ID

Sound about right?

This proposal sounds good but leaves out a bit more detail further down the bootstrap order.

We also need this ID loaded into matahari as well as matahari qpidd authentication data. This is where the discussion around the machine-id came from. If it is a different machine-external API coming out of matahari but available as soon as matahari is started, this WFM.

This could likely be done by audrey with following start order:

audrey client starts audrey sets matahari authentication information on the filesystem audrey sets UUID information on the filesystem audrey starts matahari

The issue with feeding the UUID into matahari via a local api call to matahari is that once matahari is started it registers with QMF, which leaves racey conditions where matahari may be active but without an instance id.

Regards -steve

...

...
Cheers, Mark.

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Greg Blomquist

8:30 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/22/2011 02:34 PM, Steven Dake wrote:

...

On 07/22/2011 10:22 AM, Greg Blomquist wrote:

...
On 07/22/2011 01:16 PM, Mark McLoughlin wrote:

...
Hi Greg,

On Fri, 2011-07-22 at 13:05 -0400, Greg Blomquist wrote:

...
If the deltacloud ID is the UUID presented to deltacloud by the cloud provider for the launched instance, then the only reason not to use the deltacloud ID for the config server and audrey startup script is that it requires the guest to launch and report this ID before the configurations can be handed to the config server.

Yeah, it's not ideal.

...
That's not terrible, but just makes the launch sequence more serial instead of parallelizing the launch and seeding the configs to the config server. If it's necessary to reduce the number of IDs floating around, we can do this.

Well, AFAICT, the only ID that the cloud HA stuff will know about from conductor or deltacloud is the deltacloud instance ID. So, that's what we want the guest agent to use when reporting back.

Ah, good point.

...
If you choose to add another ID for config server purposes, that would mean the cloud HA stuff would need to find that out. And I think that's a step too far in terms of intertwining the architecture. The cloud HA stuff should work with standalone deltacloud, without conductor or config server IMHO.

Yep, I'm sold.

The first cut I'm working on will generate IDs for Audrey. But, we can circle back on this and revamp to use the deltacloud ID. The two changes this implies on the Audrey side are

the audrey startup script should use the UUID API to acquire the

Immutable UUID (assuming this is what maps to Deltacloud ID) 2) the conductor launch sequence will be modified to wait to push configs to config server until the instance is launched and can report back the deltacloud ID

Sound about right?

This proposal sounds good but leaves out a bit more detail further down the bootstrap order.

We also need this ID loaded into matahari as well as matahari qpidd authentication data. This is where the discussion around the machine-id came from. If it is a different machine-external API coming out of matahari but available as soon as matahari is started, this WFM.

This could likely be done by audrey with following start order:

audrey client starts audrey sets matahari authentication information on the filesystem audrey sets UUID information on the filesystem audrey starts matahari

There's still a bootstrapping issue here. Well, maybe...possibly depending on who cares.

The problem is that the matahari authentication information would likely come from the instance configuration delivered by the config server. The audrey client would not get configuration data from the config server until the config server learns about the instance from conductor. Conductor wouldn't alert the config server about the instance until it learns the deltacloud ID from the deltacloud driver.

So, it looks like this (I think--assuming we're talking about a cloud engine environment):

1. conductor tells deltacloud to launch guest 2. guest boots in cloud provider 3. audrey client starts 4. audrey client contacts config server, but finds nothing there...yet 5. deltacloud driver tells conductor deltacloud ID for guest 6. conductor tells config server about instance configs 7. audrey client contacts config server, gets configs 8. audrey client sets matahari auth information on filesystem 9. audrey sets UUID information on the filesystem 10. audrey starts matahari (finally) 11. matahari can auth against broker

Seems like a long time before matahari starts up, but I'm not sure of all the rest of the dependencies in the chain that may depend on matahari.

There are several steps that are not necessarily serial, like 2-4 and 5. There's some amount of time between step 1 and step 5 that is determined by the interaction between deltacloud and the cloud provider and how that interaction results in a guest ID. In some cases (as David L. pointed out in another thread), the guest ID is set by the provider, other times it's expected to be provided. When it's provided (either by deltacloud or by conductor) it can immediately be turned around to the config server, and not hold up step 6 much at all. When it's generated by the cloud provider, step 6 is at the mercy of when the cloud provider sees fit to make this information available to deltacloud.

...

The issue with feeding the UUID into matahari via a local api call to matahari is that once matahari is started it registers with QMF, which leaves racey conditions where matahari may be active but without an instance id.

Regards -steve

...
...
Cheers, Mark.

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Hugh O. Brock

10:13 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Fri, Jul 22, 2011 at 03:30:11PM -0400, Greg Blomquist wrote:

...

On 07/22/2011 02:34 PM, Steven Dake wrote:

...
On 07/22/2011 10:22 AM, Greg Blomquist wrote:

...
On 07/22/2011 01:16 PM, Mark McLoughlin wrote:

...
Hi Greg,

On Fri, 2011-07-22 at 13:05 -0400, Greg Blomquist wrote:

...
If the deltacloud ID is the UUID presented to deltacloud by the cloud provider for the launched instance, then the only reason not to use the deltacloud ID for the config server and audrey startup script is that it requires the guest to launch and report this ID before the configurations can be handed to the config server.

Yeah, it's not ideal.

...
That's not terrible, but just makes the launch sequence more serial instead of parallelizing the launch and seeding the configs to the config server. If it's necessary to reduce the number of IDs floating around, we can do this.

Well, AFAICT, the only ID that the cloud HA stuff will know about from conductor or deltacloud is the deltacloud instance ID. So, that's what we want the guest agent to use when reporting back.

Ah, good point.

...
If you choose to add another ID for config server purposes, that would mean the cloud HA stuff would need to find that out. And I think that's a step too far in terms of intertwining the architecture. The cloud HA stuff should work with standalone deltacloud, without conductor or config server IMHO.

Yep, I'm sold.

The first cut I'm working on will generate IDs for Audrey. But, we can circle back on this and revamp to use the deltacloud ID. The two changes this implies on the Audrey side are

the audrey startup script should use the UUID API to acquire the

Immutable UUID (assuming this is what maps to Deltacloud ID) 2) the conductor launch sequence will be modified to wait to push configs to config server until the instance is launched and can report back the deltacloud ID

Sound about right?

This proposal sounds good but leaves out a bit more detail further down the bootstrap order.

We also need this ID loaded into matahari as well as matahari qpidd authentication data. This is where the discussion around the machine-id came from. If it is a different machine-external API coming out of matahari but available as soon as matahari is started, this WFM.

This could likely be done by audrey with following start order:

audrey client starts audrey sets matahari authentication information on the filesystem audrey sets UUID information on the filesystem audrey starts matahari

There's still a bootstrapping issue here. Well, maybe...possibly depending on who cares.

The problem is that the matahari authentication information would likely come from the instance configuration delivered by the config server. The audrey client would not get configuration data from the config server until the config server learns about the instance from conductor. Conductor wouldn't alert the config server about the instance until it learns the deltacloud ID from the deltacloud driver.

So, it looks like this (I think--assuming we're talking about a cloud engine environment):

conductor tells deltacloud to launch guest

guest boots in cloud provider

audrey client starts

audrey client contacts config server, but finds nothing there...yet

deltacloud driver tells conductor deltacloud ID for guest

conductor tells config server about instance configs

audrey client contacts config server, gets configs

audrey client sets matahari auth information on filesystem

audrey sets UUID information on the filesystem

audrey starts matahari (finally)

matahari can auth against broker

Hmm...

Given that we now have a working userdata mechanism for EC2 and VMWare, and a soon-to-be-working one for RHEV-M... could we not leverage that to compress steps 3-7? Meaning, if we dump the instance configs into userdata, the audrey client could read them from there, then go straight to contacting config server to get its matahari auth information?

Tell me if i'm out of my mind here.

--Hugh

...

Seems like a long time before matahari starts up, but I'm not sure of all the rest of the dependencies in the chain that may depend on matahari.

There are several steps that are not necessarily serial, like 2-4 and 5. There's some amount of time between step 1 and step 5 that is determined by the interaction between deltacloud and the cloud provider and how that interaction results in a guest ID. In some cases (as David L. pointed out in another thread), the guest ID is set by the provider, other times it's expected to be provided. When it's provided (either by deltacloud or by conductor) it can immediately be turned around to the config server, and not hold up step 6 much at all. When it's generated by the cloud provider, step 6 is at the mercy of when the cloud provider sees fit to make this information available to deltacloud.

...
The issue with feeding the UUID into matahari via a local api call to matahari is that once matahari is started it registers with QMF, which leaves racey conditions where matahari may be active but without an instance id.

Regards -steve

...
...
Cheers, Mark.

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

-- == Hugh Brock, hbrock@redhat.com == == Engineering Manager, Cloud BU == == Aeolus Project: Manage virtual infrastructure across clouds. == == http://aeolusproject.org == "I know that you believe you understand what you think I said, but I’m not sure you realize that what you heard is not what I meant." --Robert McCloskey

Greg Blomquist

10:35 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/22/2011 05:13 PM, Hugh Brock wrote:

...

On Fri, Jul 22, 2011 at 03:30:11PM -0400, Greg Blomquist wrote:

...
On 07/22/2011 02:34 PM, Steven Dake wrote:

...
On 07/22/2011 10:22 AM, Greg Blomquist wrote:

...
On 07/22/2011 01:16 PM, Mark McLoughlin wrote:

...
Hi Greg,

On Fri, 2011-07-22 at 13:05 -0400, Greg Blomquist wrote:

...
If the deltacloud ID is the UUID presented to deltacloud by the cloud provider for the launched instance, then the only reason not to use the deltacloud ID for the config server and audrey startup script is that it requires the guest to launch and report this ID before the configurations can be handed to the config server.

Yeah, it's not ideal.

...
That's not terrible, but just makes the launch sequence more serial instead of parallelizing the launch and seeding the configs to the config server. If it's necessary to reduce the number of IDs floating around, we can do this.

Well, AFAICT, the only ID that the cloud HA stuff will know about from conductor or deltacloud is the deltacloud instance ID. So, that's what we want the guest agent to use when reporting back.

Ah, good point.

...
If you choose to add another ID for config server purposes, that would mean the cloud HA stuff would need to find that out. And I think that's a step too far in terms of intertwining the architecture. The cloud HA stuff should work with standalone deltacloud, without conductor or config server IMHO.

Yep, I'm sold.

The first cut I'm working on will generate IDs for Audrey. But, we can circle back on this and revamp to use the deltacloud ID. The two changes this implies on the Audrey side are

the audrey startup script should use the UUID API to acquire the

Immutable UUID (assuming this is what maps to Deltacloud ID) 2) the conductor launch sequence will be modified to wait to push configs to config server until the instance is launched and can report back the deltacloud ID

Sound about right?

This proposal sounds good but leaves out a bit more detail further down the bootstrap order.

We also need this ID loaded into matahari as well as matahari qpidd authentication data. This is where the discussion around the machine-id came from. If it is a different machine-external API coming out of matahari but available as soon as matahari is started, this WFM.

This could likely be done by audrey with following start order:

audrey client starts audrey sets matahari authentication information on the filesystem audrey sets UUID information on the filesystem audrey starts matahari

There's still a bootstrapping issue here. Well, maybe...possibly depending on who cares.

The problem is that the matahari authentication information would likely come from the instance configuration delivered by the config server. The audrey client would not get configuration data from the config server until the config server learns about the instance from conductor. Conductor wouldn't alert the config server about the instance until it learns the deltacloud ID from the deltacloud driver.

So, it looks like this (I think--assuming we're talking about a cloud engine environment):

conductor tells deltacloud to launch guest

guest boots in cloud provider

audrey client starts

oops, 3a. audrey client reads user data to find out how to contact config server

...

...

audrey client contacts config server, but finds nothing there...yet

deltacloud driver tells conductor deltacloud ID for guest

conductor tells config server about instance configs

audrey client contacts config server, gets configs

audrey client sets matahari auth information on filesystem

audrey sets UUID information on the filesystem

audrey starts matahari (finally)

matahari can auth against broker

Hmm...

Given that we now have a working userdata mechanism for EC2 and VMWare, and a soon-to-be-working one for RHEV-M... could we not leverage that to compress steps 3-7? Meaning, if we dump the instance configs into userdata, the audrey client could read them from there, then go straight to contacting config server to get its matahari auth information?

Tell me if i'm out of my mind here.

Well, I'm no professional... :)

The trick with user data mechanisms (especially in the public cloud) is that it's limited in size per guest. Of course, this could be part of the selection criteria for deciding on a provider account before launching (does your user data exceed the allowed size of user data in the provider?). But, we'd have to have a way to reliably capture that size (data entry from the user seems a little too brittle here).

Maybe there's a way to split the difference (some config data is required to be passed via "user data" b/c it's needed early, while other data can be acquired after networking, etc. is up). We'd still have to have a way to reliably capture the size of the provider's user data. And, this also introduces the need for a way to determine what user data requires early binding, and what can go the default deferred binding route.

The underlying problem remains that, for any arbitrary configurable guest, we don't know how much user data will be required until we get ready to launch. All this means is that it's possible to blow all of the user data limits out of the water, leaving no way to launch the guest, b/c we place no limits on this data.

...

--Hugh

Andrew Beekhof

25 Jul 25 Jul

6:39 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Sat, Jul 23, 2011 at 5:30 AM, Greg Blomquist gblomqui@redhat.com wrote:

...

On 07/22/2011 02:34 PM, Steven Dake wrote:

...
On 07/22/2011 10:22 AM, Greg Blomquist wrote:

...
On 07/22/2011 01:16 PM, Mark McLoughlin wrote:

...
Hi Greg,

On Fri, 2011-07-22 at 13:05 -0400, Greg Blomquist wrote:

...
If the deltacloud ID is the UUID presented to deltacloud by the cloud provider for the launched instance, then the only reason not to use the deltacloud ID for the config server and audrey startup script is that it requires the guest to launch and report this ID before the configurations can be handed to the config server.

Yeah, it's not ideal.

...
That's not terrible, but just makes the launch sequence more serial instead of parallelizing the launch and seeding the configs to the config server. If it's necessary to reduce the number of IDs floating around, we can do this.

Well, AFAICT, the only ID that the cloud HA stuff will know about from conductor or deltacloud is the deltacloud instance ID. So, that's what we want the guest agent to use when reporting back.

Ah, good point.

...
If you choose to add another ID for config server purposes, that would mean the cloud HA stuff would need to find that out. And I think that's a step too far in terms of intertwining the architecture. The cloud HA stuff should work with standalone deltacloud, without conductor or config server IMHO.

Yep, I'm sold.

The first cut I'm working on will generate IDs for Audrey. But, we can circle back on this and revamp to use the deltacloud ID. The two changes this implies on the Audrey side are 1) the audrey startup script should use the UUID API to acquire the Immutable UUID (assuming this is what maps to Deltacloud ID) 2) the conductor launch sequence will be modified to wait to push configs to config server until the instance is launched and can report back the deltacloud ID

Sound about right?

This proposal sounds good but leaves out a bit more detail further down the bootstrap order.

We also need this ID loaded into matahari as well as matahari qpidd authentication data. This is where the discussion around the machine-id came from. If it is a different machine-external API coming out of matahari but available as soon as matahari is started, this WFM.

This could likely be done by audrey with following start order:

audrey client starts audrey sets matahari authentication information on the filesystem audrey sets UUID information on the filesystem audrey starts matahari

There's still a bootstrapping issue here. Well, maybe...possibly depending on who cares.

The problem is that the matahari authentication information would likely come from the instance configuration delivered by the config server. The audrey client would not get configuration data from the config server until the config server learns about the instance from conductor. Conductor wouldn't alert the config server about the instance until it learns the deltacloud ID from the deltacloud driver.

So, it looks like this (I think--assuming we're talking about a cloud engine environment):

1. conductor tells deltacloud to launch guest 2. guest boots in cloud provider 3. audrey client starts 4. audrey client contacts config server, but finds nothing there...yet 5. deltacloud driver tells conductor deltacloud ID for guest

What about: 5. deltacloud calls into the cloud provider to obtain the/a hardware-based UUID for the guest and tells conductor

...

6. conductor tells config server about instance configs 7. audrey client contacts config server, gets configs 8. audrey client sets matahari auth information on filesystem 9. audrey sets UUID information on the filesystem

No step 9 needed.

...

audrey starts matahari (finally)

matahari can auth against broker

12. matahari raises an "I'm here" event including the hardware based ID that conductor is also using.

...

Seems like a long time before matahari starts up, but I'm not sure of all the rest of the dependencies in the chain that may depend on matahari.

There are several steps that are not necessarily serial, like 2-4 and 5. There's some amount of time between step 1 and step 5 that is determined by the interaction between deltacloud and the cloud provider and how that interaction results in a guest ID. In some cases (as David L. pointed out in another thread), the guest ID is set by the provider, other times it's expected to be provided. When it's provided (either by deltacloud or by conductor) it can immediately be turned around to the config server, and not hold up step 6 much at all. When it's generated by the cloud provider, step 6 is at the mercy of when the cloud provider sees fit to make this information available to deltacloud.

...
The issue with feeding the UUID into matahari via a local api call to matahari is that once matahari is started it registers with QMF, which leaves racey conditions where matahari may be active but without an instance id.

Regards -steve

...
...
Cheers, Mark.

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

David Lutterkort

11:52 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Mon, 2011-07-25 at 15:39 +1000, Andrew Beekhof wrote:

...

On Sat, Jul 23, 2011 at 5:30 AM, Greg Blomquist gblomqui@redhat.com wrote:

...
So, it looks like this (I think--assuming we're talking about a cloud engine environment):

conductor tells deltacloud to launch guest

guest boots in cloud provider

audrey client starts

audrey client contacts config server, but finds nothing there...yet

deltacloud driver tells conductor deltacloud ID for guest

What about: 5. deltacloud calls into the cloud provider to obtain the/a hardware-based UUID for the guest and tells conductor

Just a minor nit: I don't know any cloud provider that exposes a 'hardware-based UUID' through their API. Best case, they create a unique ID for the instance out of whole cloth.

David

Andrew Beekhof

26 Jul 26 Jul

6:55 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Tue, Jul 26, 2011 at 8:52 AM, David Lutterkort lutter@redhat.com wrote:

...

On Mon, 2011-07-25 at 15:39 +1000, Andrew Beekhof wrote:

...
On Sat, Jul 23, 2011 at 5:30 AM, Greg Blomquist gblomqui@redhat.com wrote:

...
So, it looks like this (I think--assuming we're talking about a cloud engine environment):

1. conductor tells deltacloud to launch guest 2. guest boots in cloud provider 3. audrey client starts 4. audrey client contacts config server, but finds nothing there...yet 5. deltacloud driver tells conductor deltacloud ID for guest

What about: 5. deltacloud calls into the cloud provider to obtain the/a hardware-based UUID for the guest and tells conductor

Just a minor nit: I don't know any cloud provider that exposes a 'hardware-based UUID' through their API.

I used the term "hardware-based UUID" to cover things not part of a traditional image or non-guest installation. Something that is part of the guest's definition or the underlying hardware in the non-guest case.

What you referred to as the user_data parameter in your following email would do just fine.

...

Best case, they create a unique ID for the instance out of whole cloth.

David

Andrew Beekhof

25 Jul 25 Jul

6:20 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Sat, Jul 23, 2011 at 2:40 AM, Mark McLoughlin markmc@redhat.com wrote:

...

Hi Andrew,

On Fri, 2011-07-22 at 08:19 +1000, Andrew Beekhof wrote:

...
Are there any conditions under which /etc/machine-id would get regenerated? If so that would rule it out.

These were my initial thoughts:

Immutable: /etc/machine-id (systemd)

On reflection, I think Filesystem might be a better label for this scope.

...

...
Hardware: ? Something on top of the smbios API Reboot: ? /var/lib/dbus/machine-id (dbus) Agent: In memory using the libuuid API User: /etc/custom-machine-id

Okay, AIUI it (and I've take a quick look to confirm), both /etc/machine-id and /var/lib/dbus/machine-id are basically equivalent concepts

They are both a UUID that should be generated when the machine is installed or booted for the first time and not change thereafter.

Ok, so it looks like the dbus uuid is of no use for the reboot scope and needs to be substituted with something else. I suspected as much - which is why I put a '?' in front of it.

I suspect the simplest path forward is to put something generated by libuuid into /var/run/matahari-reboot-id

...

Looking at the code, I think dbus and systemd make an effort to have these UUIDs be identical, but I'm not 100% sure. Hmm, I just checked an F-16 machine and the two UUIDs are different.

The reference you've probably seen in the dbus docs to the UUID changing on reboot only applies to stateless systems - the UUID is stored on a part of the filesystem which may not persist across reboots. That could equally apply to /etc/machine-id too.

Based on your summary at the end I think we're arguing the same thing, but clearly in order to have a persistent UUID you need to have access to persistent storage. If we don't have that, we can't use that kind of UUID. End of story.

If people don't want to pre-configure something, then a hardware UUID (either read from the hardware or generated based on what is installed) is really the only option.

I'm quite happy to have Matahari's 'Hardware' UUID use smbios if available* and do something with MAC addresses (and/or some other hardware identifier) if smbios is empty/unavailable.

* From what I can tell, RHEV, VMware and EC2 all support smbios... anyone who doesn't?

...

In VM disk images, both of these UUIDs should be deleted so that they are generated when a new VM is booted from the image. Similar to what is done for host SSH keys.

On the RHEV-M side, there's a UUID in SMBIOS that has nothing to do with dbus or systemd. This UUID corresponds to the instance ID you'd see in the deltacloud API.

When you launch a deployable, you'll get the list of deltacloud instance IDs back. I'm guessing you want Matahari to reliably report this UUID back as a "system hardware UUID"?

I did list smbios as an option for the Hardware UUID but not because it has anything to do with deployables. Matahari is not a cloud-specific project.

...

In that case, the answer when running under RHEV-M is to read the UUID from SMBIOS.

Under EC2, you get the ID from:

http://169.254.169.254/latest/meta-data/instance-id

and this isn't a UUID at all. I'm not sure about other clouds.

As long as its unique, Matahari doesn't care which uuid format (if any) its in.

...

Now, config server seems to have its notion of a machine UUID. This is passed to the instance at launch time and the audrey startup script has logic to find it. The idea is that this is supplied to config server by whatever is launching the instance. I'm not sure why this wouldn't also be the deltacloud ID.

Summary:

- For the HA stuff, I think you need Matahari to be able to reliably report the deltacloud instance ID a "system hardware ID" or similar

You basically mean something persistent right? Matahari isn't going to have a notion of a deltacloud ID, but if deltacloud arranges for its ID to be available (either via something like smbios or but pre-populating a known file on disk), then Matahari will happily return it.

...

- That's unrelated to the systemd and dbus IDs, but these IDs should be unique to each machine too.

- You should be able to determine the deltacloud instance ID from inside each VM.

- This deltacloud instance ID should be what Audrey is using when contacting config server.

- Audrey and Matahari should have the same logic for figuring this ID out.

Cheers, Mark.

Perry Myers

9:33 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/25/2011 01:20 AM, Andrew Beekhof wrote:

...

On Sat, Jul 23, 2011 at 2:40 AM, Mark McLoughlin markmc@redhat.com wrote:

...
Hi Andrew,

On Fri, 2011-07-22 at 08:19 +1000, Andrew Beekhof wrote:

...
Are there any conditions under which /etc/machine-id would get regenerated? If so that would rule it out.

These were my initial thoughts:

Immutable: /etc/machine-id (systemd)

On reflection, I think Filesystem might be a better label for this scope.

...
...
Hardware: ? Something on top of the smbios API Reboot: ? /var/lib/dbus/machine-id (dbus) Agent: In memory using the libuuid API User: /etc/custom-machine-id

Okay, AIUI it (and I've take a quick look to confirm), both /etc/machine-id and /var/lib/dbus/machine-id are basically equivalent concepts

They are both a UUID that should be generated when the machine is installed or booted for the first time and not change thereafter.

Ok, so it looks like the dbus uuid is of no use for the reboot scope and needs to be substituted with something else. I suspected as much - which is why I put a '?' in front of it.

Taking a step back here, what is the point of 'reboot scope' uuid anyhow?

...

I suspect the simplest path forward is to put something generated by libuuid into /var/run/matahari-reboot-id

...
Looking at the code, I think dbus and systemd make an effort to have these UUIDs be identical, but I'm not 100% sure. Hmm, I just checked an F-16 machine and the two UUIDs are different.

It seems that on a fresh install of F15 or rawhide the id's will not be the same. But when I upgraded an F14 machine to F15 because the /var/lib/dbus/machine-id already existed prior to systemd being installed they ended up being the same

I wonder if this is just an oversight or if it's intentional on F15 fresh installs that they're different... yaay uuid proliferation <sigh>

...

...
The reference you've probably seen in the dbus docs to the UUID changing on reboot only applies to stateless systems - the UUID is stored on a part of the filesystem which may not persist across reboots. That could equally apply to /etc/machine-id too.

Based on your summary at the end I think we're arguing the same thing, but clearly in order to have a persistent UUID you need to have access to persistent storage. If we don't have that, we can't use that kind of UUID. End of story.

There's a bit of confusion here. Guests in the cloud that _are persistent_ have access to persistent storage. However, there are a lot of instances where the guests literally evaporate on reboot. In that case, the lifetime of a particular guest (and therefore it's UUID) can and should be limited to a single boot, no?

...

If people don't want to pre-configure something, then a hardware UUID (either read from the hardware or generated based on what is installed) is really the only option.

I'm quite happy to have Matahari's 'Hardware' UUID use smbios if available* and do something with MAC addresses (and/or some other hardware identifier) if smbios is empty/unavailable.

From what I can tell, RHEV, VMware and EC2 all support smbios...

anyone who doesn't?

Greg, do you know?

...

...
In VM disk images, both of these UUIDs should be deleted so that they are generated when a new VM is booted from the image. Similar to what is done for host SSH keys.

Agreed.

...

...
On the RHEV-M side, there's a UUID in SMBIOS that has nothing to do with dbus or systemd. This UUID corresponds to the instance ID you'd see in the deltacloud API.

When you launch a deployable, you'll get the list of deltacloud instance IDs back. I'm guessing you want Matahari to reliably report this UUID back as a "system hardware UUID"?

I did list smbios as an option for the Hardware UUID but not because it has anything to do with deployables. Matahari is not a cloud-specific project.

...
In that case, the answer when running under RHEV-M is to read the UUID from SMBIOS.

Under EC2, you get the ID from:

http://169.254.169.254/latest/meta-data/instance-id

and this isn't a UUID at all. I'm not sure about other clouds.

As long as its unique, Matahari doesn't care which uuid format (if any) its in.

...
Now, config server seems to have its notion of a machine UUID. This is passed to the instance at launch time and the audrey startup script has logic to find it. The idea is that this is supplied to config server by whatever is launching the instance. I'm not sure why this wouldn't also be the deltacloud ID.

Summary:

For the HA stuff, I think you need Matahari to be able to reliably report the deltacloud instance ID a "system hardware ID" or similar

You basically mean something persistent right? Matahari isn't going to have a notion of a deltacloud ID, but if deltacloud arranges for its ID to be available (either via something like smbios or but pre-populating a known file on disk), then Matahari will happily return it.

...

That's unrelated to the systemd and dbus IDs, but these IDs should be unique to each machine too.

You should be able to determine the deltacloud instance ID from inside each VM.

This deltacloud instance ID should be what Audrey is using when contacting config server.

Audrey and Matahari should have the same logic for figuring this ID out.

Sooo... Where did we end up here?

Filesystem: /etc/machine-id (systemd) falling back to /var/lib /dbus/machine-id for pre-systemd guests (persistent as long as the root filesystem is persistent) Hardware: provided by either bare metal hardware, or by the cloud platform (EC2, RHEV, VMware) via smbios Agent: Generated by libuuid when each agent starts and exists only for the duration of that agent (disappears on agent restart) User: Location of this is tbd, but could be smth like /etc/user-machine-id

Perry

Greg Blomquist

11:12 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/25/2011 04:33 PM, Perry Myers wrote:

...

On 07/25/2011 01:20 AM, Andrew Beekhof wrote:

...
On Sat, Jul 23, 2011 at 2:40 AM, Mark McLoughlin markmc@redhat.com wrote:

...

...
If people don't want to pre-configure something, then a hardware UUID (either read from the hardware or generated based on what is installed) is really the only option.

I'm quite happy to have Matahari's 'Hardware' UUID use smbios if available* and do something with MAC addresses (and/or some other hardware identifier) if smbios is empty/unavailable.

From what I can tell, RHEV, VMware and EC2 all support smbios...

anyone who doesn't?

Greg, do you know?

I haven't heard of any that don't support reading uuid from smbios. I have heard (I think from David L.) that gogrid doesn't set a guest ID for launching guests...in that case, deltacloud generates the ID and hands it back to conductor. Don't know if this ID makes it into the guest, or if it shows up in smbios.

--- Greg

David Lutterkort

11:56 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Mon, 2011-07-25 at 16:33 -0400, Perry Myers wrote:

...

On 07/25/2011 01:20 AM, Andrew Beekhof wrote:

...
If people don't want to pre-configure something, then a hardware UUID (either read from the hardware or generated based on what is installed) is really the only option.

I'm quite happy to have Matahari's 'Hardware' UUID use smbios if available* and do something with MAC addresses (and/or some other hardware identifier) if smbios is empty/unavailable.

From what I can tell, RHEV, VMware and EC2 all support smbios...

anyone who doesn't?

They all support (or, in the case of VMWare, will support shortly) injecting data with the user_data parameter to the create instance operation.

The data shows up in different places in the different clouds; for condor-cloud it's smbios, I believe that's also the case for RHEV 3; in vSphere the data will show up as a file on a virtual CD; on EC2 it appears on a webserver at 169.254.169.254

David

Chris Lalancette

26 Jul 26 Jul

8:06 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/25/11 - 04:33:48PM, Perry Myers wrote:

...

...
I'm quite happy to have Matahari's 'Hardware' UUID use smbios if available* and do something with MAC addresses (and/or some other hardware identifier) if smbios is empty/unavailable.

From what I can tell, RHEV, VMware and EC2 all support smbios...

anyone who doesn't?

Greg, do you know?

EC2, at least the linux instances, do not support SMBIOS as they are paravirtual Xen guests. VMware supports SMBIOS, but I'm not sure if we are given the tools to manipulate it; deltacloud uses an attached ISO image to "inject" userdata. RHEV supports SMBIOS, but the propsed solution to "injecting" a UUID from the RHEV guys is to use an attached floppy disk (since RHN also uses SMBIOS, and since SMBIOS is limited to 64-byte strings in the normal case, floppy was seen as larger, easier, and less likely to collide with RHN).

-- Chris Lalancette

Mark McLoughlin

8:22 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Tue, 2011-07-26 at 03:06 -0400, Chris Lalancette wrote:

...

RHEV supports SMBIOS, but the propsed solution to "injecting" a UUID from the RHEV guys is to use an attached floppy disk (since RHN also uses SMBIOS, and since SMBIOS is limited to 64-byte strings in the normal case, floppy was seen as larger, easier, and less likely to collide with RHN).

Not quite right - floppy or file injection VDSM hooks is what the RHEV team is proposing for user data.

The deltacloud instance ID (i.e. the VM UUID) is available in SMBIOS by default in RHEV guests.

Cheers, Mark.

Perry Myers

28 Jul 28 Jul

2:56 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/26/2011 03:22 AM, Mark McLoughlin wrote:

...

On Tue, 2011-07-26 at 03:06 -0400, Chris Lalancette wrote:

...
RHEV supports SMBIOS, but the propsed solution to "injecting" a UUID from the RHEV guys is to use an attached floppy disk (since RHN also uses SMBIOS, and since SMBIOS is limited to 64-byte strings in the normal case, floppy was seen as larger, easier, and less likely to collide with RHN).

Not quite right - floppy or file injection VDSM hooks is what the RHEV team is proposing for user data.

The deltacloud instance ID (i.e. the VM UUID) is available in SMBIOS by default in RHEV guests.

Ok... too much info spread across this thread. I tried to capture some of it here, please let me know if any of this is wildly inaccurate and I will correct:

https://github.com/matahari/matahari/wiki/UUIDs

Greg Blomquist

22 Jul 22 Jul

3:34 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/21/2011 04:55 PM, Perry Myers wrote:

...

On 07/21/2011 02:24 AM, Andrew Beekhof wrote:

...
On Tue, Jul 5, 2011 at 11:58 PM, Perry Myerspmyers@redhat.com wrote:

...
...
...
...

generated every time the machine boots

This is == DBus/systemd uuid right?

I believe not. I believe the DBus uuid persists across reboots, but potentially not across upgrades.

Is that by design? If not, then it seems like we could improve that particular uuid by submitting a bug to have dbus/systemd uuids persist across upgrades.

ack to your other comments on this thread, can you write up an API proposal then with the new functions and properties that we'll be exposing for review on list?

Sorry for the delay... here are my proposed changes to the host api wrt. uuids.



<method name="get_uuid" desc="Obtain a UUID with the specified lifetime from the machine">

<arg name="lifetime" dir="I" type="sstr" /> </method>



<method name="set_uuid" desc="Set a UUID with the specified lifetime"> <arg name="lifetime" dir="I" type="sstr" /> <arg name="uuid" dir="I" type="sstr" /> </method>

Thanks for writing this up. My only question now is... what existing uuid implementations map here?

i.e. for the implementation of hardware UUID we plan on using X, for the implementation of immutable UUID we will use /etc/machine-id

Stuff along those lines would complete the matrix needed here.

My understanding is that the UUID API being discussed here is largely for identifying guests and guest services. To that end, it seems like the objects in the warehouse (images and target images; and later, templates, assemblies, deployables, deployments, and services, more?) fall outside the scope of this discussion. Lemme know if I'm wrong there...

As for the Audrey Startup script (the only thing I immediately see as falling in scope here), it would likely rely on a Agent or User scoped UUID. I balk at saying it relies on the Immutable scoped UUID, because there needs to be a reliable way to set the UUID used by the Audrey Startup script in the guest before it launches, so the same value can be shared with the config server.

We could force the Audrey Startup script to rely on the Immutable UUID (if the idea is to reduce the number of UUIDs floating around). That will turn the launching strategy around in Conductor, insofar as Conductor would have to wait until the guest launched and could report back the Immutable UUID (somehow) via deltacloud before it could publish configuration data to the config server for the guest. In the current planned implementation, Conductor generates the Audrey UUID and tells both the Config Server and the guest the value at roughly the same time.

...

Perry _______________________________________________ aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

Andrew Beekhof

25 Jul 25 Jul

5:11 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Sat, Jul 23, 2011 at 12:34 AM, Greg Blomquist gblomqui@redhat.com wrote:

...

On 07/21/2011 04:55 PM, Perry Myers wrote:

...
On 07/21/2011 02:24 AM, Andrew Beekhof wrote:

...
On Tue, Jul 5, 2011 at 11:58 PM, Perry Myerspmyers@redhat.com wrote:

...
...
...
> - generated every time the machine boots

This is == DBus/systemd uuid right?

I believe not. I believe the DBus uuid persists across reboots, but potentially not across upgrades.

Is that by design? If not, then it seems like we could improve that particular uuid by submitting a bug to have dbus/systemd uuids persist across upgrades.

ack to your other comments on this thread, can you write up an API proposal then with the new functions and properties that we'll be exposing for review on list?

Sorry for the delay... here are my proposed changes to the host api wrt. uuids.



<method name="get_uuid" desc="Obtain a UUID with the specified lifetime from the machine"> <arg name="lifetime" dir="I" type="sstr" />

</method>



<method name="set_uuid" desc="Set a UUID with the specified lifetime"> <arg name="lifetime" dir="I" type="sstr" /> <arg name="uuid" dir="I" type="sstr" />

</method>

Thanks for writing this up. My only question now is... what existing uuid implementations map here?

i.e. for the implementation of hardware UUID we plan on using X, for the implementation of immutable UUID we will use /etc/machine-id

Stuff along those lines would complete the matrix needed here.

My understanding is that the UUID API being discussed here is largely for identifying guests and guest services.

No. This (like the rest of Matahari) is in no way specific to guests or clouds.

It will likely address those cloud cases as well, but that is not its primary reason for existence nor the sole design reference point.

...

To that end, it seems like the objects in the warehouse (images and target images; and later, templates, assemblies, deployables, deployments, and services, more?) fall outside the scope of this discussion. Lemme know if I'm wrong there...

As for the Audrey Startup script (the only thing I immediately see as falling in scope here), it would likely rely on a Agent or User scoped UUID. I balk at saying it relies on the Immutable scoped UUID, because there needs to be a reliable way to set the UUID used by the Audrey Startup script in the guest before it launches, so the same value can be shared with the config server.

We could force the Audrey Startup script to rely on the Immutable UUID (if the idea is to reduce the number of UUIDs floating around). That will turn the launching strategy around in Conductor, insofar as Conductor would have to wait until the guest launched and could report back the Immutable UUID (somehow) via deltacloud before it could publish configuration data to the config server for the guest. In the current planned implementation, Conductor generates the Audrey UUID and tells both the Config Server and the guest the value at roughly the same time.

...
Perry _______________________________________________ aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

David Lutterkort

22 Jul 22 Jul

6:16 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Thu, 2011-07-21 at 16:55 -0400, Perry Myers wrote:

...

Thanks for writing this up. My only question now is... what existing uuid implementations map here?

As for Deltacloud, we generally use whatever unique id the backend uses; that id is used in generating the URL to the resource (image, instance, etc.) and is reported as the id attribute for that instance.

We only require that the id is unique across resources of the same type for that backend. In some cases, those id's are chosen by the user when they create the resources (e.g., instance names in some backends), in others they are generated by the backend cloud.

There are a few cases where we generate unqiue id's if the user doesn't supply one; one of them is the instance ID for Gogrid.

David

Chris Lalancette

1 Jul 1 Jul

2:19 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 07/01/11 - 04:09:20PM, Andrew Beekhof wrote:

...

On Fri, Jul 1, 2011 at 6:56 AM, Perry Myers pmyers@redhat.com wrote:

...
...
TBH, I'm not a fan.

Partly because that UUID is a property and attribute of every agent and are initialised at startup - I don't like the idea of those being volatile.

ick, I forgot about that...

...
It also pre-supposes that you'd never want the dbus id if /etc/machine-id is present.

right, that is a bad assumption. My original ideas around exposing UID in matahari APIs was to have _several_ api calls depending on what you considered to be the proper UUID

So maybe we need to have calls like:

get_dbus_uuid (/var/lib/dbus/machine_id) get_machine_uuid (/etc/machine-id) get_smbios_uuid

And if we _really_ need a separate vm_machine_id get_vm_machine_uuid (/etc/vm_machine_id)

Slight variation on the theme... I think we should focus on the properties of the uuids rather than where they live and/or who generates them.

So I'm thinking we want 5 uuids and probably one accessor, where each uuid would have a different lifetime:

externally configured

generated once for the lifetime of the vm

generated every time the hardware changes

generated every time the machine boots

generated every time an agent starts

With the accessor being: get_uuid(lifetime), and lifetime ::= user|forever|hardware|boot|matahari

One of the things we were trying to do was to not add YAU (Yet Another UUID). That is, we were hoping to hook into one of the existing mechanisms (either dbus or systemd) so that we could re-use this infrastructure. That way existing software wouldn't have to learn about yet another place to fetch a UUID from.

If it turns out that this is impossible, then so be it. But I still think we should investigate the options for integrating with the existing mechanisms.

Or did I completely mis-understand your point?

-- Chris Lalancette

Perry Myers

2:25 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

...

One of the things we were trying to do was to not add YAU (Yet Another UUID). That is, we were hoping to hook into one of the existing mechanisms (either dbus or systemd) so that we could re-use this infrastructure. That way existing software wouldn't have to learn about yet another place to fetch a UUID from.

If it turns out that this is impossible, then so be it. But I still think we should investigate the options for integrating with the existing mechanisms.

Or did I completely mis-understand your point?

I don't think what you said conflicts with what Andrew said... Matahari has other use cases than just cloud, so we'll provide a set of UUIDs. Some of them may be useful for Cloud, others would be ignored. Make sense?

Andrew Beekhof

4 Jul 4 Jul

8:31 a.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On Fri, Jul 1, 2011 at 11:25 PM, Perry Myers pmyers@redhat.com wrote:

...

...
One of the things we were trying to do was to not add YAU (Yet Another UUID). That is, we were hoping to hook into one of the existing mechanisms (either dbus or systemd) so that we could re-use this infrastructure. That way existing software wouldn't have to learn about yet another place to fetch a UUID from.

If it turns out that this is impossible, then so be it. But I still think we should investigate the options for integrating with the existing mechanisms.

Or did I completely mis-understand your point?

I don't think what you said conflicts with what Andrew said... Matahari has other use cases than just cloud, so we'll provide a set of UUIDs. Some of them may be useful for Cloud, others would be ignored.

Right. And if there are existing UUIDs that have the described properties, we will of course use them instead of rolling our own.

...

Make sense?

Steven Dake

30 Jun 30 Jun

11:22 p.m.

New subject: [Matahari] [Pcmk-cloud] /var/lib/dbus/machine_id, imagefactory, and matahari

On 06/29/2011 09:32 PM, Andrew Beekhof wrote:

...

On Wed, Jun 29, 2011 at 2:05 AM, Perry Myers pmyers@redhat.com wrote:

...
On 06/28/2011 10:26 AM, Steven Dake wrote:

...
On 06/28/2011 06:24 AM, Joseph VLcek wrote:

...
On Mon, 2011-06-27 at 13:59 -0700, Steven Dake wrote:

...
On 06/24/2011 04:16 PM, Steven Dake wrote:

...
Currently most linux distributions that use dbus store a UUID in /var/lib/dbus/machine_id. In our pacemaker-cloud work test tools, we must manipulate this file via oz image creation to match a value we know about.

Q1. Is this file freshly created on each image creation/cloning process?

If not, it should be, because Matahari uses this information to uniquely identify a host. If it is copied exactly to each new image, that creates a problem (all hosts appear the same to matahari).

Q2. If/when it is created by image factory, will it be stored in a database or other storage medium?

In pacemaker-cloud we need to have a mapping from image->internal id id so that we know which VM maps to which deployable HA configuration.

If we wait on this point until after our 1.0 release, we could end up with a bunch of images in the field that have either the same machine id or are not mapped in any way that allows us to provide HA functionality.

Regards -steve

After more investigation, Chris and I came up with a workable plan for handling unique VM ids (see thread with subject How Audrey, Conductor, and Audrey's config server interact and their relationship to a unique vm instance id).

Currently the Audrey script runs as a replacement to rc.local. Matahari runs at S99. The general idea is for Audrey script to run as S98 (before Matahari) and write a management-wide unique instance UUID to the file /etc/vm_machine_id (audrey has access to this information).

Matahari could be changed to read /etc/vm_machine_id first. If that file doesn't exist, /var/lib/dbus/machine_id would be read.

This creates some difficulty in running the audrey script at a specific runlevel (it requires some changes to oz to insert init scripts).

Another option is for the current rc.local script that audrey replaces to run the Matahari service starting giblits as its first action.

Comments welcome before I start writing code... -steve

I think the solution of having Audrey store a launch time unique UUID in a file that Matahari can read will work.

It may not be necessary to alter oz to insert init scripts to ensure Audrey runs before Matahari.

Let me explain.

Audrey does not replace /etc/rc.local

When Image Factory builds the image it appends to the end of /etc/rc.local a line of code that will start Audrey.

e.g.: [ -f /usr/bin/audrey ] && /usr/bin/audrey

I propose having Image Factory append another line to /etc/rc.local below where it starts Audrey to start Matahari.

e.g.:

[ -f /usr/bin/audrey ] && /usr/bin/audrey

[ -f <Matahari start> ] && <run Matahari start>

We may need to manage timing to ensure /usr/bin/audrey does not return until it has stored the unique UUID in a file and have it return an error status if it is unable to.

Thoughts?

This sounds fine to me, although the rc.local modification always sounded a bit hacky (rather then using a proper init script). One minor issue is matahari expects to be started via service xxx start and each agent has a separate init script. There are 5 or 6 agents.

Yes, I think having audrey 'start matahari' is very hackish, since in normal systems matahari would start via regular init scripts. So this means for cloud we'd need to disable the normal init scripts and then relegate control to audrey. I don't like that approach...

Let's back up a bit... why does matahari start need to depend on audrey starting? The answer is that we need audrey to put in the /etc/machine-id file.

Well, why not let matahari start at its normal runlevel, and respond to queries, etc, but if you call get-id API, then it should return something that indicates '/etc/machine-id not set yet, giving you dbus-id instead'

Then it's just up to the person doing the querying to wait until the id returned is the /etc/machine-id.

So remove the dep on service start and replace with intelligent application usage

Thoughts?

TBH, I'm not a fan.

Partly because that UUID is a property and attribute of every agent and are initialised at startup - I don't like the idea of those being volatile. It also pre-supposes that you'd never want the dbus id if /etc/machine-id is present.

We talked in the past about providing access to both a hardware _and_ a software uuid.

A software id representing the unique vm instance is what we need. The key feature of the software id is that it needs to be dynamically chanageable (ie: Audrey needs to change it at start time to something unique it knows about based upon a data transfer which we don't yet have a clear handle on how to make happen).

...

Which is /etc/machine-id supposed to be? If the former, then the solution is easy - add an extra API call.

systemd is moving the location of the "hardware id" from /var/lib/dbus/machine-id to /etc/machine-id.

In a cloud environment, if you upload a created image to something like EC2, each time you start a new instance with that image, the machine-id doesn't change. The problem is launching two vms - there is no way to correlate which one is the one we care about - essentially the machine-id would be the same for every instance of that image. If matahari gives out a machine id for two vms, surely it would confuse some piece of software (pacemaker-cloud for example).

...

aeolus-devel mailing list aeolus-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/aeolus-devel

4720

Age (days ago)

4754

Last active (days ago)

aeolus-devel@lists.fedorahosted.org

54 comments

10 participants

tags (0)

participants (10)

Andrew Beekhof
apevec＠redhat.com
Chris Lalancette
David Lutterkort
Greg Blomquist
Hugh O. Brock
Joseph VLcek
Mark McLoughlin
Perry Myers
Steven Dake