Hi Steve,
I'd like to discuss more on how we can best integrate Beaker with RHEVM.
Goals:
1 - Dynamically create systems based on the requirements from the Beaker recipe.
    - Attempt to schedule on RHEVM first.
2 - Quick provisioning by using pre-built images.
3 - Provide images for operating systems that are difficult to automate (Windows).
Implementing all of the goals at once may be too much. I'd rather see us tackle 1 and then move on to 2 and then 3.
The big question is how do we do this efficiently? Do we want to be able to support multiple RHEVM servers?
Thinking out loud here: this is how we currently process a recipe.
- A new recipe comes in that requires an x86_64 system with 4 gigs of ram and 20 gigs of disk space on Distro X.
- We select all systems that could possibly match, even ones that are in labs that don't have Distro X (it may show up later).
- If the recipe is a multi-host recipe we then remove bad choices; all recipes need to run in one lab.
- We schedule the recipe when a join condition from the recipe matches 1 or more systems.
- Finally, when all recipes in a multi-host recipe set are scheduled, we move all of them to Running and kick off PXE installs.
Because we want to create the system dynamically, we can't rely on SQL to alert us that a match is available. I'm concerned with scalability. I don't know if you remember this, but the old RHTS attempted to match every queued recipe every time through the loop. It was horrendous! The more queued recipes you had, the longer one loop took. The SQL method works wonderfully because we join on the system being available, so we never see recipes that can't be serviced. So my concern is: how do we implement this with RHEVM without creating a bottleneck?
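To make the scalability point concrete, here is a toy sketch of the single-query approach, using sqlite3 and made-up table/column names (not Beaker's actual schema):

```python
import sqlite3

def matchable_recipes(conn):
    """One query joins queued recipes to free systems: recipes with no
    candidate system simply never appear in the result, so the scheduler
    never iterates over unserviceable recipes."""
    return conn.execute(
        """SELECT r.id, s.id
           FROM recipe r
           JOIN system s ON s.arch = r.arch
                        AND s.ram_mb >= r.ram_mb
                        AND s.status = 'free'
           WHERE r.status = 'queued'"""
    ).fetchall()

def demo():
    # Throwaway in-memory database just to show the shape of the result.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE recipe (id INTEGER, arch TEXT, ram_mb INTEGER, status TEXT);
        CREATE TABLE system (id INTEGER, arch TEXT, ram_mb INTEGER, status TEXT);
        INSERT INTO recipe VALUES (1, 'x86_64', 4096, 'queued'),
                                  (2, 'ppc64', 4096, 'queued');
        INSERT INTO system VALUES (10, 'x86_64', 8192, 'free');
    """)
    return conn.execute
```

Recipe 2 (ppc64, no matching system) never shows up in the output, which is exactly why the loop stays cheap no matter how deep the queue is.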
I'm also wondering if RHEVM will be able to tell us the reason it can't create a host. For example, when we ask for a 32 gig system and RHEVM replies that it can't create it, is that because it doesn't currently have enough RAM free (other guests are running), or because the RHEVH box only has 16 gigs of RAM and will never be able to satisfy the request? If no host can ever satisfy it, we should abort the recipe.
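The abort-vs-requeue decision could be sketched like this, assuming (hypothetically) that per-host total capacity is available to Beaker; RHEVM's API may not expose exactly these figures:

```python
def classify_failure(req_ram_mb, hosts):
    """Decide what a provisioning failure means, given per-host capacity
    data (hypothetical shape: a list of dicts with 'total_ram_mb').
    Returns 'never' if no host could EVER fit the guest, else 'retry'."""
    if not any(h["total_ram_mb"] >= req_ram_mb for h in hosts):
        return "never"   # e.g. asking for 32G when every host tops out at 16G: abort
    return "retry"       # capacity exists but is busy right now: requeue
```

With two 16G hosts, a 32G request classifies as "never" and can be aborted immediately, while an 8G request is worth retrying later.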
I think you had some ideas on this and I'm hoping we can work through them here. :-)
On 12/19/2011 09:16 PM, Bill Peck wrote:
Hi Steve,
I'd like to discuss more on how we can best integrate Beaker with RHEVM.
RHEV or oVirt? RHEV 3.0 is almost available; oVirt's first release is scheduled for the end of Jan 2012 (http://www.ovirt.org/wiki/Releases/First_Release). The advantage of oVirt over RHEV is that it'll have the Python CLI/SDK, which is cooler than RHEV's REST interface. But personally I'd go with the REST interface first. Conversion should be fairly easy later on.
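To make the REST option concrete, here is a sketch of building the XML body for a VM-creation request in Python. The element names are illustrative (from memory) and should be checked against the RHEV REST API documentation:

```python
from xml.etree import ElementTree as ET

def vm_create_body(name, template, cluster, memory_bytes):
    """Build an XML body for a hypothetical POST /api/vms call.
    Element names are a sketch, not a verified schema."""
    vm = ET.Element("vm")
    ET.SubElement(vm, "name").text = name
    ET.SubElement(ET.SubElement(vm, "template"), "name").text = template
    ET.SubElement(ET.SubElement(vm, "cluster"), "name").text = cluster
    ET.SubElement(vm, "memory").text = str(memory_bytes)
    return ET.tostring(vm, encoding="unicode")
```

The same body-building code would work unchanged whether the transport is raw HTTP against RHEV or a later oVirt SDK wrapper, which is why starting with REST and converting later seems low-risk.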
Goals:
1 - Dynamically create systems based on the requirements from the Beaker recipe.
    - Attempt to schedule on RHEVM first.
2 - Quick provisioning by using pre-built images.
3 - Provide images for operating systems that are difficult to automate (Windows).
Windows is not that difficult to provision automatically, from my experience. It's very configurable and scriptable. But you need the environment set up (Windows RIS services, local Windows Update server, etc.). I used to have that (all in VMs, btw!).
Implementing all of the goals at once may be too much. I'd rather see us tackle 1 and then move on to 2 and then 3.
The big question is how do we do this efficiently? Do we want to be able to support multiple RHEVM servers?
thinking out loud here, here is how we currently process a recipe.
- A new recipe comes in that requires an x86_64 system with 4 gigs of
ram and 20 gig of disk space on Distro X.
- We support 32bit as well, of course.
- Q: do you actually need 20GB of disk? If not, you can use thin provisioning, which will be quicker to provision. RAW would provide better performance, though.
- We select all systems that could possibly match, even ones that are
on labs that don't have Distro X (it may show up later)
- If the recipe is a multi-host recipe we then remove bad choices, all
recipes need to run in one lab.
- We schedule the recipe when a join condition from the recipe matches
1 or more systems.
- Finally when all recipes in a multi-host recipe set are scheduled we
move all of them to Running and kick off pxe installs
Because we want to create the system dynamically, we can't rely on SQL to alert us that a match is available. I'm concerned with scalability. I don't know if you remember this, but the old RHTS attempted to match every queued recipe every time through the loop. It was horrendous! The more queued recipes you had, the longer one loop took. The SQL method works wonderfully because we join on the system being available, so we never see recipes that can't be serviced. So my concern is: how do we implement this with RHEVM without creating a bottleneck?
You CAN rely on a match being 'partially' available - a template may already exist that matches your requirement. The 'partially' refers to the fact that you may not be able to execute a snapshot of that template due to lack of resources.
I'm also wondering if RHEVM will be able to tell us the reason it can't create a host. For example, when we ask for a 32 gig system and RHEVM replies that it can't create it, is that because it doesn't currently have enough RAM free (other guests are running), or because the RHEVH box only has 16 gigs of RAM and will never be able to satisfy the request? If no host can ever satisfy it, we should abort the recipe.
Yes, you should get a failure message. Some messages are better than others ;-/
I think you had some ideas on this and I'm hoping we can work through them here. :-)
Alternatively (and this is something we are looking to build into future versions of the product): have X number of systems ready for re-provisioning. For example, our internal continuous integration system has several VMs which are always ready for re-install if needed, used as RHEVM servers. There's a fixed number of them. For phase 1, I'd start with that and see how it goes. Then we can add dynamically provisioned systems.
Note that creating VMs should be merely a snapshot from an existing template and then 'prepping' the VM (sysprep in Windows, something similar in Linux). That should be quick - take ~5 minutes, tops. If you are installing from scratch it may take a bit more, similar to a physical system.
Y.
Beaker-devel mailing list Beaker-devel@lists.fedorahosted.org https://fedorahosted.org/mailman/listinfo/beaker-devel
Excerpts from Yaniv Kaul's message of Tue Dec 20 17:00:54 +1000 2011:
On 12/19/2011 09:16 PM, Bill Peck wrote:
3 - Provide images for operating systems that are difficult to automate (Windows)
Windows is not that difficult to provision automatically, from my experience. It's very configurable and scriptable. But you need the environment set up (Windows RIS services, local Windows Update server, etc.). I used to have that (all in VMs, btw!).
Our approach might depend on how much flexibility we want or require with each of these other operating systems. Automatic provisioning of Windows (without images) sounds easy enough, but we would need a sensible way to model the requirements for a job (for all variants of all the OSes we are going to support). If maintaining a handful of standard images isn't going to generate too much overhead, it would make this aspect simpler. (And there is the performance advantage, mentioned below.)
thinking out loud here, here is how we currently process a recipe.
- A new recipe comes in that requires an x86_64 system with 4 gigs of
ram and 20 gig of disk space on Distro X.
- We support 32bit as well, of course.
- Q: do you actually need 20GB of disks? If not really, you can use thin
provisioning, which will be quicker to provision. RAW would provide better performance, though.
If the recipe requires at least 20GB of disk, then yes, it should have a device with at least that amount of disk available because there's no telling what the job will do with it. Bill's example is for a job that might be submitted with something like:
<hostRequires>
  <memory op=">=" value="4000"/>
  <arch op="=" value="x86_64"/>
  <key_value key="DISKSPACE" op=">=" value="20000"/>
</hostRequires>
Currently, we just search for all systems that match those requirements. But there's nothing to say this recipe couldn't run in a VM; we would just need to first check whether RHEVM can provision such a guest, and if so, create a specific guest definition.
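Translating that hostRequires XML into a guest definition could start with something like this (a sketch; the attribute names match the snippet above, but the helper itself is hypothetical):

```python
from xml.etree import ElementTree as ET

def parse_host_requires(xml_text):
    """Extract the pieces of <hostRequires> a virt scheduler would need.
    Ignores the comparison ops for brevity; a real version would honour them."""
    root = ET.fromstring(xml_text)
    req = {}
    mem = root.find("memory")
    if mem is not None:
        req["memory_mb"] = int(mem.get("value"))
    arch = root.find("arch")
    if arch is not None:
        req["arch"] = arch.get("value")
    disk = root.find("key_value[@key='DISKSPACE']")
    if disk is not None:
        req["disk_mb"] = int(disk.get("value"))
    return req
```

The resulting dict is what you would feed into the RHEVM "can you provision this?" check.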
Because we want to create the system dynamically, we can't rely on SQL to alert us that a match is available. I'm concerned with scalability. I don't know if you remember this, but the old RHTS attempted to match every queued recipe every time through the loop. It was horrendous! The more queued recipes you had, the longer one loop took. The SQL method works wonderfully because we join on the system being available, so we never see recipes that can't be serviced. So my concern is: how do we implement this with RHEVM without creating a bottleneck?
You CAN rely on a match being 'partially' available - a template may already exist that matches your requirement. The 'partially' refers to the fact that you may not be able to execute a snapshot of that template due to lack of resources.
So basically RHEVM will tell us "yes, this guest could be created, just not right now"?
Knowing whether we can provision that VM immediately is important because otherwise we can move onto the search for a "real" system that matches the requirements, i.e. we wouldn't want a job held up because it *could* be run via RHEV while there are idle baremetal boxes sitting in the lab. But at the same time, we don't want to poll RHEVM continually for all jobs sitting in the queue.
Could RHEVM notify Beaker of changes in available resources (such as when a guest is destroyed or created)? Maybe Beaker itself could track RHEV's available resources and just poll for changes at the start of each loop over the queued recipes, rather than repeatedly asking RHEVM whether it could provision each one?
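The local-tracking idea above could look something like this (a sketch; real figures would come from periodic polling of RHEVM, and a real version would track vCPUs and storage too):

```python
class RhevCapacityCache:
    """Beaker-side cache of RHEV's free capacity. The scheduler only asks
    RHEVM about recipes that fit the cached figure, instead of issuing
    one RHEVM query per queued recipe per loop."""

    def __init__(self, free_ram_mb):
        self.free_ram_mb = free_ram_mb

    def might_fit(self, req_ram_mb):
        # Cheap local check; a positive answer still needs confirming with RHEVM.
        return req_ram_mb <= self.free_ram_mb

    def guest_created(self, ram_mb):
        self.free_ram_mb -= ram_mb

    def guest_destroyed(self, ram_mb):
        self.free_ram_mb += ram_mb
```

The cache can be slightly stale without harm: a false positive just means one wasted RHEVM call, and a false negative is corrected on the next poll.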
Alternatively, (and something we are looking for future versions to be built in to the product) - have X number of systems ready for re-provisioning. For example, our internal continuous integration system has several VMs which are always ready for re-install if needed, used as RHEVM servers. There's a fixed number of them. For phase 1, I'd start with that, see how it goes. Then we can add dynamically provisioned systems.
I'm not sure I follow -- do you mean as target/test hosts? That's more or less the approach we already take; we have quite a few "systems" available to be scheduled in Beaker that are virt guests with fixed memory and storage. So the whole goal here is dynamically provisioned systems in that sense.
Note that creating VMs should be merely a snapshot from an existing template and then 'prepping' the VM (sysprep in Windows, something similar in Linux). That should be quick - take ~5 minutes, tops. If you are installing from scratch it may take a bit more, similar to a physical system.
This speed-up is a good benefit, but I think first of all we really need to be able to install from scratch just like we currently do, by tweaking the kickstarts (and allowing users to submit custom kickstarts) so that we can run more recipes as-is.
Steven.
On 12/20/2011 10:22 AM, Steven Lawrance wrote:
Excerpts from Yaniv Kaul's message of Tue Dec 20 17:00:54 +1000 2011:
On 12/19/2011 09:16 PM, Bill Peck wrote:
3 - Provide images for operating systems that are difficult to automate (Windows)
Windows is not that difficult to provision automatically, from my experience. It's very configurable and scriptable. But you need the environment set up (Windows RIS services, local Windows Update server, etc.). I used to have that (all in VMs, btw!).
Our approach might depend on how much flexibility we want or require with each of these other operating systems. Automatic provisioning of Windows (without images) sounds easy enough, but we would need a sensible way to model the requirements for a job (for all variants of all the OSes we are going to support). If maintaining a handful of standard images isn't going to generate too much overhead, it would make this aspect simpler. (And there is the performance advantage, mentioned below.)
thinking out loud here, here is how we currently process a recipe.
- A new recipe comes in that requires an x86_64 system with 4 gigs of
ram and 20 gig of disk space on Distro X.
- We support 32bit as well, of course.
- Q: do you actually need 20GB of disks? If not really, you can use thin
provisioning, which will be quicker to provision. RAW would provide better performance, though.
If the recipe requires at least 20GB of disk, then yes, it should have a device with at least that amount of disk available because there's no telling what the job will do with it. Bill's example is for a job that might be submitted with something like:
<hostRequires>
  <memory op=">=" value="4000"/>
  <arch op="=" value="x86_64"/>
  <key_value key="DISKSPACE" op=">=" value="20000"/>
</hostRequires>
Currently, we just search for all systems that match those requirements. But there's nothing to say this recipe couldn't run in a VM; we would just need to first check whether RHEVM can provision such a guest, and if so, create a specific guest definition.
OK, so if DISKSPACE is mentioned, we should probably go with RAW pre-allocated storage, otherwise with a standard 100GB or so thin-provisioned disk, I guess.
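That rule of thumb is small enough to write down directly (a sketch; the format names and the 100GB default are illustrative, and the input is assumed to be a dict parsed from <hostRequires>):

```python
DEFAULT_THIN_MB = 100 * 1024  # "standard 100GB or so" thin disk (example value)

def disk_definition(host_requires):
    """Pick a disk layout: an explicit DISKSPACE requirement gets RAW
    preallocated storage of that size; otherwise a standard
    thin-provisioned disk, which is faster to create."""
    if "disk_mb" in host_requires:
        return {"format": "raw", "sparse": False, "size_mb": host_requires["disk_mb"]}
    return {"format": "cow", "sparse": True, "size_mb": DEFAULT_THIN_MB}
```

So a recipe with DISKSPACE >= 20000 gets a 20GB RAW disk, and everything else gets the quick thin default.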
Because we want to create the system dynamically, we can't rely on SQL to alert us that a match is available. I'm concerned with scalability. I don't know if you remember this, but the old RHTS attempted to match every queued recipe every time through the loop. It was horrendous! The more queued recipes you had, the longer one loop took. The SQL method works wonderfully because we join on the system being available, so we never see recipes that can't be serviced. So my concern is: how do we implement this with RHEVM without creating a bottleneck?
You CAN rely on a match being 'partially' available - a template may already exist that matches your requirement. The 'partially' refers to the fact that you may not be able to execute a snapshot of that template due to lack of resources.
So basically RHEVM will tell us "yes, this guest could be created, just not right now"?
Knowing whether we can provision that VM immediately is important because otherwise we can move onto the search for a "real" system that matches the requirements, i.e. we wouldn't want a job held up because it *could* be run via RHEV while there are idle baremetal boxes sitting in the lab. But at the same time, we don't want to poll RHEVM continually for all jobs sitting in the queue.
The only real way to know if a guest can run is to actually execute it.
Could RHEVM notify Beaker of changes in available resources (such as when a guest is destroyed or created)? Maybe Beaker itself could track RHEV's available resources and just poll for changes at the start of each loop over the queued recipes, rather than repeatedly asking RHEVM whether it could provision each one?
All events can be registered for and sent via an email notification. But I think we are optimizing for the worst-case scenario, where we've run out of resources. I expect us to have enough resources to handle peak loads. And btw, we are talking about tens to hundreds of concurrent instances, right?
Alternatively, (and something we are looking for future versions to be built in to the product) - have X number of systems ready for re-provisioning. For example, our internal continuous integration system has several VMs which are always ready for re-install if needed, used as RHEVM servers. There's a fixed number of them. For phase 1, I'd start with that, see how it goes. Then we can add dynamically provisioned systems.
I'm not sure I follow -- do you mean as target/test hosts? That's more or less the approach we already take; we have quite a few "systems" available to be scheduled in Beaker that are virt guests with fixed memory and storage. So the whole goal here is dynamically provisioned systems in that sense.
The 'dynamic' is in the sense that you just need to 'prep' them, not do the whole allocation:
1. The VM is already created.
2. Its disks are defined.
3. It has a network interface (you may add more, but this brings more complexity).

Now, you:
4. May change the number of vCPUs and memory.
5. Execute and prep it.
6. Assign it to a user.

So this is 'half dynamic'. A more dynamic approach would replace step 1 with:
1. Create a VM from a template.
(then steps 4-6 as above)

And a completely dynamic approach would be:
1. Create a VM from a blank template.
(steps 2-4 as above)
5. Provision an OS on that blank VM (installed via PXE/CDROM).
6. Assign it to a user.
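The 'half dynamic' flow is simple enough to model directly (a sketch with a hypothetical API, just to make the steps concrete):

```python
class PreppedPool:
    """Pool of pre-created VMs (steps 1-3 already done); the scheduler
    only resizes, preps, and assigns them (steps 4-6)."""

    def __init__(self, vm_names):
        self.idle = list(vm_names)
        self.assigned = {}

    def take(self, user, vcpus, ram_mb):
        if not self.idle:
            return None                       # pool exhausted: fall back to real hardware
        vm = self.idle.pop()                  # steps 1-3 were done ahead of time
        spec = {"vm": vm, "vcpus": vcpus, "ram_mb": ram_mb}  # step 4: resize
        self.assigned[user] = spec            # steps 5-6: prep and assign
        return spec

    def release(self, user):
        # Recipe finished: return the VM to the idle pool for re-prepping.
        self.idle.append(self.assigned.pop(user)["vm"])
```

The more dynamic variants would replace the `idle.pop()` with a template snapshot or a from-scratch install, but the take/release shape stays the same.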
Note that creating VMs should be merely a snapshot from an existing template and then 'prepping' the VM (sysprep in Windows, something similar in Linux). That should be quick - take ~5 minutes, tops. If you are installing from scratch it may take a bit more, similar to a physical system.
This speed-up is a good benefit, but I think first of all we really need to be able to install from scratch just like we currently do, by tweaking the kickstarts (and allowing users to submit custom kickstarts) so that we can run more recipes as-is.
I'm not sure I agree. Certainly for Windows, Sysprep is an established process and is well integrated into RHEVM. For Linux, see http://libguestfs.org/virt-sysprep.1.html
But we can start one way and convert to the other later on. Y.
Steven.
Excerpts from Yaniv Kaul's message of Tue Dec 20 20:58:55 +1000 2011:
On 12/20/2011 10:22 AM, Steven Lawrance wrote:
Knowing whether we can provision that VM immediately is important because otherwise we can move onto the search for a "real" system that matches the requirements, i.e. we wouldn't want a job held up because it *could* be run via RHEV while there are idle baremetal boxes sitting in the lab. But at the same time, we don't want to poll RHEVM continually for all jobs sitting in the queue.
The only real way to know if a guest can run is to actually execute it.
Could RHEVM notify Beaker of changes in available resources (such as when a guest is destroyed or created)? Maybe Beaker itself could track RHEV's available resources and just poll for changes at the start of each loop over the queued recipes, rather than repeatedly asking RHEVM whether it could provision each one?
But I think we are optimizing for the worst case scenario, where we ran out of resources. I expect us to have enough resources to handle peak loads. And btw, we are talking about tens to hundreds of concurrent instances, right?
I don't think it's safe to make that assumption (particularly not for all Beaker instances). In fact, we can probably assume the opposite...
So if we can't distinguish not being able to provision a particular recipe via RHEV "right now" as opposed to "never (with current resources)", then we at least need a cheap way to determine the former in order to avoid a bottleneck in the scheduler.
That is, when a job is submitted, we want to determine whether it _can_ be satisfied by RHEV without then restricting it to RHEV and preventing it from going to baremetal hardware in the lab.
Perhaps it is enough for Beaker to retry executing virtualisable recipes when it knows the situation has changed (e.g. another recipe completed thereby freeing resources, or if it can be informed of events like new virt hosts being added). But if tens or hundreds of virtualisable recipes are sitting in the queue, that could still take quite a long time.
All events can be registered to and sent via an email notification.
Is any other notification method available?
Steven.
Excerpts from Steven Lawrance's message of Fri Jan 06 08:48:35 +1000 2012:
So if we can't distinguish not being able to provision a particular recipe via RHEV "right now" as opposed to "never (with current resources)", then we at least need a cheap way to determine the former in order to avoid a bottleneck in the scheduler.
That is, when a job is submitted, we want to determine whether it _can_ be satisfied by RHEV without then restricting it to RHEV and preventing it from going to baremetal hardware in the lab.
Does Beaker even need to care about the difference between "right now" and "never"? If a recipe can't be provisioned by RHEV "right now", whether because RHEV's resources are temporarily exhausted or the recipe could never fit, shouldn't Beaker then try to provision it on real hardware immediately? Otherwise we could end up with recipes queued waiting for RHEV while real hardware sits idle.
Excerpts from Dan Callaghan's message of Fri Jan 06 09:41:41 +1000 2012:
Excerpts from Steven Lawrance's message of Fri Jan 06 08:48:35 +1000 2012:
So if we can't distinguish not being able to provision a particular recipe via RHEV "right now" as opposed to "never (with current resources)", then we at least need a cheap way to determine the former in order to avoid a bottleneck in the scheduler.
That is, when a job is submitted, we want to determine whether it _can_ be satisfied by RHEV without then restricting it to RHEV and preventing it from going to baremetal hardware in the lab.
Does Beaker even need to care about the difference between "right now" and "never"? If a recipe can't be provisioned by RHEV "right now", whether because RHEV's resources are temporarily exhausted or the recipe could never fit, shouldn't Beaker then try to provision it on real hardware immediately?
Yes it should, but what if the real hardware resources are also exhausted...
Otherwise we could end up with recipes queued waiting for RHEV while real hardware sits idle.
Right, so we can't decide at submission time whether the recipe _is_ going to be provisioned via RHEV, only whether it _can_.
Bill's concern is the time spent polling RHEV to check whether it has the resources, on top of the existing SQL query for "real" systems, once for each queued recipe, on each iteration of the loop (currently every 20 seconds).
Steven.
Excerpts from Steven Lawrance's message of Mon Jan 09 10:36:48 +1000 2012:
Bill's concern is the time spent polling RHEV to check whether it has the resources, on top of the existing SQL query for "real" systems, once for each queued recipe, on each iteration of the loop (currently every 20 seconds).
That SQL query is already fairly expensive, but it doesn't really hurt. Is the call to RHEV to say "can you provision this recipe?" actually slower than that SQL query? (Can RHEV handle the load from such frequent requests?)
Maybe we are worrying over nothing. Maybe adding lots of code in Beaker to track RHEV's available resources isn't really necessary?
On 01/09/2012 02:53 AM, Dan Callaghan wrote:
Excerpts from Steven Lawrance's message of Mon Jan 09 10:36:48 +1000 2012:
Bill's concern is the time spent polling RHEV to check whether it has the resources, on top of the existing SQL query for "real" systems, once for each queued recipe, on each iteration of the loop (currently every 20 seconds).
That SQL query is already fairly expensive, but it doesn't really hurt. Is the call to RHEV to say "can you provision this recipe?" actually slower than that SQL query? (Can RHEV handle the load from such frequent requests?)
Maybe we are worrying over nothing. Maybe adding lots of code in Beaker to track RHEV's available resources isn't really necessary?
I honestly don't think it is necessary. That being said, the 'call' is expensive, in the sense that RHEVM would try to launch the VM, and if that fails, it would try other servers as well. So a complete 'failure' would take time. I still think we are optimizing for a corner case here, and I suggest we do a POC with an intentionally limited RHEV setup, to see what happens. Obviously, I expect the real deployment to have a generous amount of hardware. Y.
On 01/08/2012 07:53 PM, Dan Callaghan wrote:
Excerpts from Steven Lawrance's message of Mon Jan 09 10:36:48 +1000 2012:
Bill's concern is the time spent polling RHEV to check whether it has the resources, on top of the existing SQL query for "real" systems, once for each queued recipe, on each iteration of the loop (currently every 20 seconds).
That SQL query is already fairly expensive, but it doesn't really hurt. Is the call to RHEV to say "can you provision this recipe?" actually slower than that SQL query? (Can RHEV handle the load from such frequent requests?)
The difference is we're doing one SQL command that joins the queued recipes to available systems. So even if you have thousands of queued recipes it's still one SQL call, and you only get back a list that has work to be done.
We don't want to constantly process thousands of queued recipes, that's what the old rhts did.
Maybe we are worrying over nothing. Maybe adding lots of code in Beaker to track RHEV's available resources isn't really necessary?
On 01/06/2012 12:48 AM, Steven Lawrance wrote:
Excerpts from Yaniv Kaul's message of Tue Dec 20 20:58:55 +1000 2011:
On 12/20/2011 10:22 AM, Steven Lawrance wrote:
Knowing whether we can provision that VM immediately is important because otherwise we can move onto the search for a "real" system that matches the requirements, i.e. we wouldn't want a job held up because it *could* be run via RHEV while there are idle baremetal boxes sitting in the lab. But at the same time, we don't want to poll RHEVM continually for all jobs sitting in the queue.
The only real way to know if a guest can run is to actually execute it.
Could RHEVM notify Beaker of changes in available resources (such as when a guest is destroyed or created)? Maybe Beaker itself could track RHEV's available resources and just poll for changes at the start of each loop over the queued recipes, rather than repeatedly asking RHEVM whether it could provision each one?
But I think we are optimizing for the worst case scenario, where we ran out of resources. I expect us to have enough resources to handle peak loads. And btw, we are talking about tens to hundreds of concurrent instances, right?
I don't think it's safe to make that assumption (particularly not for all Beaker instances). In fact, we can probably assume the opposite...
I highly doubt that, but we can approach the problem from a different angle: say we have a RHEV instance which can run 100 VMs, with a total of 100GB RAM and 100 vCPUs. Just don't let Beaker provision more than X percent of that total.
(BTW, next version should have a quota feature - http://www.ovirt.org/wiki/Features/DetailedQuota ).
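Until a built-in quota exists, that cap could be enforced on Beaker's side with something like this (a sketch with illustrative numbers; a real quota would also cover vCPUs and storage):

```python
def within_beaker_quota(running_vms, requested_ram_mb, total_ram_mb, max_fraction=0.5):
    """Never let Beaker's guests consume more than max_fraction of the
    RHEV instance's total RAM. running_vms is a list of dicts with
    'ram_mb' for each Beaker-owned guest."""
    used = sum(vm["ram_mb"] for vm in running_vms)
    return used + requested_ram_mb <= total_ram_mb * max_fraction
```

With a 100GB instance and a 50% cap, a 4GB request is allowed while 20GB is already in use, but refused once Beaker's guests approach the 50GB line.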
So if we can't distinguish not being able to provision a particular recipe via RHEV "right now" as opposed to "never (with current resources)", then we at least need a cheap way to determine the former in order to avoid a bottleneck in the scheduler.
'Never' is easy, in the sense that you cannot run a VM with more vCPUs than available pCPUs on the hosts, nor one with more memory than any host has. I'm not aware of other 'never' scenarios.
That is, when a job is submitted, we want to determine whether it _can_ be satisfied by RHEV without then restricting it to RHEV and preventing it from going to baremetal hardware in the lab.
Perhaps it is enough for Beaker to retry executing virtualisable recipes when it knows the situation has changed (e.g. another recipe completed thereby freeing resources, or if it can be informed of events like new virt hosts being added). But if tens or hundreds of virtualisable recipes are sitting in the queue, that could still take quite a long time.
All events can be registered to and sent via an email notification.
Is any other notification method available?
Not right now. Anything specific? Y.
Steven.