If you are running a number of qemu-kvm's with similar guests, you can gain a lot of memory by using ksm properly.
An unattended host running a variable number of qemu-kvm's needs to tune ksm automatically, since when memory is tight, it's better to spend more cpu on merging pages. In more relaxed cases, it's just a waste of time.
The attached service tries to do just that.
It monitors how much memory is used by qemu-kvm processes, and starts ksm when a threshold is passed. Ksm usually manages to free up some memory.
As long as memory used by qemu is above the defined threshold, ksm tries harder and harder to share memory pages (up to a limit). This may happen if a guest starts working and consumes new memory. If there's enough free memory, ksm cools down.
The ksmd service has the usual start/status/stop verbs, and an additional one: signal. Use that verb just after starting a new qemu-kvm process, or just after such a process dies, to let ksm adjust immediately.
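To make the "tries harder, up to a limit, then cools down" behaviour concrete, here is a minimal sketch of one tuning step. It is not the attached service: the function name next_npages and all four constants are hypothetical, chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical tuning step. Every name and number below is illustrative,
# not taken from the attached service.
NPAGES_MIN=64      # lowest scan rate
NPAGES_MAX=1250    # the "up to a limit" cap
NPAGES_BOOST=300   # added while qemu memory is above the threshold
NPAGES_DECAY=50    # removed while memory is relaxed

# next_npages CURRENT USED_KB THRESHOLD_KB
# Prints the scan rate to use for the next interval.
next_npages () {
    local cur=$1 used=$2 thres=$3 next
    if [ "$used" -gt "$thres" ]; then
        next=$((cur + NPAGES_BOOST))    # memory is tight: try harder
    else
        next=$((cur - NPAGES_DECAY))    # enough free memory: cool down
    fi
    # Clamp the result to the configured range.
    if [ "$next" -lt "$NPAGES_MIN" ]; then next=$NPAGES_MIN; fi
    if [ "$next" -gt "$NPAGES_MAX" ]; then next=$NPAGES_MAX; fi
    echo "$next"
}
```

A caller would presumably measure the memory used by qemu-kvm processes, call next_npages, and write the result to /sys/kernel/mm/ksm/pages_to_scan. That wiring is left out so the step can be tried without root.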
Comments and suggestions are welcome.
Thanks,
Dan.
On Thu, Sep 03, 2009 at 02:49:23PM +0300, Dan Kenigsberg wrote:
> If you are running a number of qemu-kvm's with similar guests, you can gain a lot of memory by using ksm properly.
> An unattended host running a variable number of qemu-kvm's needs to tune ksm automatically, since when memory is tight, it's better to spend more cpu on merging pages. In more relaxed cases, it's just a waste of time.
> The attached service tries to do just that.
> It monitors how much memory is used by qemu-kvm processes, and starts ksm when a threshold is passed. Ksm usually manages to free up some memory.
> As long as memory used by qemu is above the defined threshold, ksm tries harder and harder to share memory pages (up to a limit). This may happen if a guest starts working and consumes new memory. If there's enough free memory, ksm cools down.
> The ksmd service has the usual start/status/stop verbs, and an additional one: signal. Use that verb just after starting a new qemu-kvm process, or just after such a process dies, to let ksm adjust immediately.
> Comments and suggestions are welcome.
Looks like a nice idea.
I'd be inclined to split this file up a little to separate the init script bits from the tuning logic, and to allow ksm / tuning to be managed more independently, e.g.
- /etc/init.d/ksmd - to start/stop ksmd
- /etc/init.d/ksmtuned - to start/stop the automatic tuning process
- /usr/sbin/ksmtuned - the logic from the loop() function & things it calls
That makes it easier for other distros to share the important bits - e.g. the actual tuning logic - even if they use a different initscript system.
Regards, Daniel
On Thu, Sep 03, 2009 at 01:05:47PM +0100, Daniel P. Berrange wrote:
> Looks like a nice idea.
> I'd be inclined to split this file up a little to separate the init script bits from the tuning logic, and to allow ksm / tuning to be managed more independently, e.g.
> - /etc/init.d/ksmd - to start/stop ksmd
> - /etc/init.d/ksmtuned - to start/stop the automatic tuning process
> - /usr/sbin/ksmtuned - the logic from the loop() function & things it calls
> That makes it easier for other distros to share the important bits - e.g. the actual tuning logic - even if they use a different initscript system.
I can see the merit in splitting to /usr/sbin/ksmtuned (though who cares about other distros !?).
But why should we want two services? I would like ksmtuned to stop ksm completely when it is not needed (and start it up if needed), so what would ksmd be good for?
Dan.
On Fri, Sep 04, 2009 at 06:19:29PM +0300, Dan Kenigsberg wrote:
> I can see the merit in splitting to /usr/sbin/ksmtuned (though who cares about other distros !?).
> But why should we want two services? I would like ksmtuned to stop ksm completely when it is not needed (and start it up if needed), so what would ksmd be good for?
Management applications / developers might like to implement/use a different way of tuning KSM. Thus we should be able to start the core KSM service, without also starting the KSM tuning service.
Daniel
changes since v1:
- broken into two services, one starting ksm up (ksmd), and the other tuning it (ksmtuned).
- ksmtune logic separated from service code for cleanliness and simpler availability to other distros
- ksm/max_kernel_pages default is absurdly low (allows for 8M of shared mem). ksmd now sets it to half of available RAM.
- a handful of typos corrected
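For the max_kernel_pages item, the arithmetic can be sketched as below. This is only an illustration: the helper names are hypothetical, and reading "available RAM" as MemTotal from /proc/meminfo is my assumption, not necessarily what ksmd does. With 4 KiB pages, 8M of shared memory corresponds to roughly 2000 pages.

```shell
#!/bin/bash
# Hypothetical helpers for the max_kernel_pages arithmetic.

# pages_to_mb PAGES PAGE_SIZE_BYTES
# Prints how many MiB the given number of pages covers.
pages_to_mb () {
    echo $(( $1 * $2 / 1048576 ))
}

# half_ram_pages MEMTOTAL_KB PAGE_SIZE_BYTES
# Prints the number of pages that covers half of MEMTOTAL_KB.
half_ram_pages () {
    local total_kb=$1 page_bytes=$2
    echo $(( total_kb * 1024 / page_bytes / 2 ))
}

# Possible wiring on a real host (needs root, so it is not executed here;
# MemTotal as "available RAM" is an assumption):
#   total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   half_ram_pages "$total_kb" "$(getconf PAGESIZE)" \
#       > /sys/kernel/mm/ksm/max_kernel_pages
```

On a 4 GiB host with 4 KiB pages this allows 524288 shared pages instead of a couple of thousand.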
Can these files be poured into qemu-system rpm? Or should I file for a new package?
Comments and suggestions are still welcome.
Dan.
Hi Dan,
On Tue, 2009-09-15 at 17:22 +0300, Dan Kenigsberg wrote:
> changes since v1:
> - broken into two services, one starting ksm up (ksmd), and the other tuning it (ksmtuned).
> - ksmtune logic separated from service code for cleanliness and simpler availability to other distros
> - ksm/max_kernel_pages default is absurdly low (allows for 8M of shared mem). ksmd now sets it to half of available RAM.
> - a handful of typos corrected
> Can these files be poured into qemu-system rpm? Or should I file for a new package?
> Comments and suggestions are still welcome.
My initial reaction was that I'd prefer these to be in a separate ksm RPM, but since ksmtuned is tied to qemu at the moment, I guess it makes sense to include it in qemu for now. We can split it out later, if needs be.
I've added the ksm init script to qemu-common in rawhide. I put it in qemu-common since it doesn't appear to be specific to qemu-kvm or qemu-system.
I was going to add ksmtuned, but when I tested it, it didn't daemonize and 'service ksmtuned start' just hung. So, that needs to be fixed first.
I've pushed a git repo with the scripts and a bunch of minor changes I made:
http://gitorious.org/ksm-control-scripts/ksm-control-scripts
Cheers, Mark.
On Wed, Sep 16, 2009 at 11:02:37AM +0100, Mark McLoughlin wrote:
> My initial reaction was that I'd prefer these to be in a separate ksm RPM, but since ksmtuned is tied to qemu at the moment, I guess it makes sense to include it in qemu for now. We can split it out later, if needs be.
> I've added the ksm init script to qemu-common in rawhide. I put it in qemu-common since it doesn't appear to be specific to qemu-kvm or qemu-system.
> I was going to add ksmtuned, but when I tested it, it didn't daemonize and 'service ksmtuned start' just hung. So, that needs to be fixed first.
> I've pushed a git repo with the scripts and a bunch of minor changes I made:
> http://gitorious.org/ksm-control-scripts/ksm-control-scripts
oh, thanks!
Please consider the attached patch for daemonizing ksmtuned.
Dan.
On Wed, 2009-09-16 at 15:40 +0300, Dan Kenigsberg wrote:
> oh, thanks!
> Please consider the attached patch for daemonizing ksmtuned.
...
> diff --git a/ksmtuned b/ksmtuned
> index 5bdc4a3..f97fa6d 100644
> --- a/ksmtuned
> +++ b/ksmtuned
> @@ -113,4 +113,8 @@ loop () {
>      done
>  }
>  
> -loop
> +PIDFILE=${PIDFILE-/var/run/ksmtune.pid}
> +if touch "$PIDFILE"; then
> +    loop &
> +    echo $! > "$PIDFILE"
> +fi
> 1.6.2.5
Nice and simple and it seems to work well, I like it :-)
Pushing this to rawhide now
Thanks, Mark.
On Wed, Sep 16, 2009 at 05:46:27PM +0100, Mark McLoughlin wrote:
> Pushing this to rawhide now
Thanks. Though only now did I notice that you dropped my non-standard "signal" verb.
http://gitorious.org/ksm-control-scripts/ksm-control-scripts/commit/84e59d1e...
This is how I want management to tell ksmtune that something has changed (a new qemu process is up, or one just died). I want ksm to kick in as soon as this happens, not wait another minute.
If you really hate this, we can add a SIGUSR handler to ksmtune for the same aim.
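A signal handler of that kind could look roughly like the sketch below. It assumes SIGUSR1 as the signal and a loop that sleeps between tuning passes. The names on_usr1, nap and RETUNE_REQUESTS are made up for illustration and are not part of ksmtune.

```shell
#!/bin/bash
# Sketch: let SIGUSR1 cut the tuning loop's sleep short.
# All names here are hypothetical.
RETUNE_REQUESTS=0

on_usr1 () {
    RETUNE_REQUESTS=$((RETUNE_REQUESTS + 1))
}
trap on_usr1 USR1

# nap SECONDS
# Sleeps in the background and waits for it, so that a trapped signal
# makes "wait" return early instead of blocking for the whole interval.
nap () {
    local pid
    sleep "$1" &
    pid=$!
    wait "$pid" || true            # interrupted by a trapped signal, or done
    kill "$pid" 2>/dev/null        # drop the leftover sleep, if any
    wait "$pid" 2>/dev/null || true
    return 0
}

# A loop built on this would look like:
#   while true; do tune_once; nap 60; done
# where tune_once stands for whatever one tuning pass does.
```

The point of running sleep in the background is that the shell only runs a trap handler between commands, so a foreground "sleep 60" would delay the reaction by up to a minute, which is exactly what the verb is meant to avoid.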
Regards,
Dan.
On Wed, Sep 16, 2009 at 09:19:01PM +0300, Dan Kenigsberg wrote:
> Thanks. Though only now did I notice that you dropped my non-standard "signal" verb.
> http://gitorious.org/ksm-control-scripts/ksm-control-scripts/commit/84e59d1e...
> This is how I want management to tell ksmtune that something has changed (a new qemu process is up, or one just died). I want ksm to kick in as soon as this happens, not wait another minute.
> If you really hate this, we can add a SIGUSR handler to ksmtune for the same aim.
I think your 'signal' verb makes sense - I'd just call it 'reload' instead
Regards, Daniel
On Wed, Sep 16, 2009 at 07:21:14PM +0100, Daniel P. Berrange wrote:
> I think your 'signal' verb makes sense - I'd just call it 'reload' instead
Mark, would you prefer the attached implementation? (I know I do.)
DPB, is the name 'retune' good enough? (I just do not see what is being 'loaded'.)
Dan.
Hi Dan,
Sorry for taking so long to get around to this
On Wed, 2009-09-23 at 10:52 +0300, Dan Kenigsberg wrote:
> Mark, would you prefer the attached implementation? (I know I do.)
> DPB, is the name 'retune' good enough? (I just do not see what is being 'loaded'.)
Yeah, that makes sense - e.g. 'retune' doesn't reload ksmtuned.conf
I've pushed this now, it'll be in qemu-0.11.0-5.fc12
Thanks, Mark.
On Wed, 2009-09-23 at 10:52 +0300, Dan Kenigsberg wrote:
> diff --git a/ksmtuned.init b/ksmtuned.init
> index 205531a..46332f8 100644
> --- a/ksmtuned.init
> +++ b/ksmtuned.init
> @@ -75,8 +75,11 @@ case "$1" in
>    condrestart)
>      condrestart
>      ;;
> +  retune)
> +    kill -SIGUSR1 `cat ${pidfile}`
> +    RETVAL=$?
>    *)
> -    echo $"Usage: $prog {start|stop|restart|condrestart|status|help}"
> +    echo $"Usage: $prog {start|stop|restart|condrestart|status|retune|help}"
>      RETVAL=3
>  esac
Unusually, I'm actually testing the stuff I'm building today and I hit:
/etc/init.d/ksmtuned: line 81: syntax error near unexpected token `)'
/etc/init.d/ksmtuned: line 81: ` *)'
Missing a ';;' in the handling of retune; fixed now
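For reference, a trimmed-down sketch of the verb handling with the terminator in place. This is not the committed init script: the dispatch function and its arguments are hypothetical wrappers so the fragment can be run on its own, and it uses the portable -USR1 spelling rather than -SIGUSR1.

```shell
#!/bin/bash
# dispatch VERB PIDFILE
# Hypothetical, reduced version of the init script's case statement.
dispatch () {
    local verb=$1 pidfile=$2
    case "$verb" in
      retune)
        # Ask the running tuner to re-evaluate right away.
        kill -USR1 "$(cat "$pidfile")"
        RETVAL=$?
        ;;    # without this terminator the shell rejects the next pattern
      *)
        echo "Usage: ksmtuned {start|stop|restart|condrestart|status|retune|help}"
        RETVAL=3
        ;;
    esac
    return $RETVAL
}
```

Every branch except the last one before esac needs its own ';;'. Leaving it out is what produced the "unexpected token `)'" error above, because the shell was still parsing the retune commands when it reached the next pattern.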
Cheers, Mark.
On Tue, Sep 15, 2009 at 05:22:45PM +0300, Dan Kenigsberg wrote:
> changes since v1:
> - ksm/max_kernel_pages default is absurdly low (allows for 8M of shared mem). ksmd now sets it to half of available RAM.
Thanks for including this. Unfortunately upstream is going with the absurdly low number, and we would rather not patch the kernel just to change the default if there is a sane way to handle this in userspace. I am working on the KSM test cases for virt test day tomorrow right now, and I will include the init script (and ksmtuned if it is ready in time).
Justin