[irqbalance/f17] Bugfixes:

Petr Holasek pholasek at fedoraproject.org
Tue Aug 28 11:41:56 UTC 2012


commit 2594fc85fc55be2eaade71cd4cda8b72fa408860
Author: Petr Holasek <pholasek at redhat.com>
Date:   Thu Aug 23 15:27:51 2012 +0200

    Bugfixes:
    
    - Make irqbalance scan for new irqs when it detects new irqs (bz832815)
    - Fixes SIGFPE crash for some banning configurations (bz849792)
    - Fixes affinity_hint values processing (bz832815)
    - Adds banirq and banscript options (bz837049)
    - imake isn't needed for building any more (bz844359)
    - Fixes clogging of syslog (bz837646)
    - Added IRQBALANCE_ARGS variable for passing arguments via systemd (bz837048)
    - Fixes --hintpolicy=subset behavior (bz844381)

 0001-Add-sample-irqbalance-environment-file.patch  |   74 +++++++
 0002-introduce-banirq-option.patch                 |  172 +++++++++++++++
 ...ANCE_BANNED_CPUS-is-set-proc-stat-is-not-.patch |   44 ++++
 ...ance-scan-for-new-irqs-when-it-detects-ne.patch |   91 ++++++++
 0005-Add-banscript-option.patch                    |  218 ++++++++++++++++++++
 ...cpu-powersave-code-disabled-when-power_th.patch |   41 ++++
 ...ity-hint-also-if-the-current-policy-is-su.patch |  103 +++++++++
 ...ed-check-for-avoidance-of-division-by-zer.patch |   31 +++
 irqbalance.spec                                    |   33 +++-
 9 files changed, 804 insertions(+), 3 deletions(-)
---
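
Note on patch 7/8, added for illustration and not part of the commit: the
mask selection that patch moves to the top of activate_mapping() can be
sketched in isolation. The sketch assumes a plain uint64_t in place of
irqbalance's cpumask_t, and the names hint_policy, POLICY_*, and
select_applied_mask() are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for irqbalance's hint policies (illustrative names). */
enum hint_policy { POLICY_EXACT, POLICY_SUBSET, POLICY_IGNORE };

/*
 * Pick the CPU mask that would be written to smp_affinity.
 * hint     - the irq's affinity hint (0 means "no hint")
 * obj_mask - mask of the assigned topology object
 * has_obj  - non-zero if the irq has an assigned object
 * Returns 1 and stores the mask in *applied when a mask could be chosen,
 * 0 when neither the exact-hint case nor an assigned object applies.
 */
static int select_applied_mask(enum hint_policy policy, uint64_t hint,
                               uint64_t obj_mask, int has_obj,
                               uint64_t *applied)
{
	assert(applied != NULL);

	/* exact: a non-empty hint is used as-is */
	if (policy == POLICY_EXACT && hint != 0) {
		*applied = hint;
		return 1;
	}

	if (has_obj) {
		*applied = obj_mask;
		/* subset: restrict the object's mask with the hint */
		if (policy == POLICY_SUBSET && hint != 0)
			*applied &= hint;
		return 1;
	}

	return 0;
}
```

With a single cache domain covering CPUs 0-3 (mask 0xF) and a hint of 0x3,
the subset policy now yields 0x3 rather than the whole domain, which is the
behavior the patch description asks for.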
diff --git a/0001-Add-sample-irqbalance-environment-file.patch b/0001-Add-sample-irqbalance-environment-file.patch
new file mode 100644
index 0000000..ec6f25e
--- /dev/null
+++ b/0001-Add-sample-irqbalance-environment-file.patch
@@ -0,0 +1,74 @@
+From 626dded557de1e7b90cb847df9e900d40be5af1a Mon Sep 17 00:00:00 2001
+From: Neil Horman <nhorman at tuxdriver.com>
+Date: Wed, 14 Dec 2011 07:09:07 -0500
+Subject: [PATCH 1/8] Add sample irqbalance environment file
+
+It was pointed out that the example systemd unit file pointed to a corresponding
+environment file that had no sample.  Fix that up, and modify the unit file to
+pass available option via environment variables rather than command line options
+since that looks a little cleaner.
+
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+
+add irqbalance args variable to env file
+
+Allow users to pass general arguments to irqbalance through systemd
+
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+---
+ misc/irqbalance.env     | 26 ++++++++++++++++++++++++++
+ misc/irqbalance.service |  5 ++---
+ 2 files changed, 28 insertions(+), 3 deletions(-)
+ create mode 100644 misc/irqbalance.env
+
+diff --git a/misc/irqbalance.env b/misc/irqbalance.env
+new file mode 100644
+index 0000000..bd87e3d
+--- /dev/null
++++ b/misc/irqbalance.env
+@@ -0,0 +1,26 @@
++# irqbalance is a daemon process that distributes interrupts across
++# CPUS on SMP systems.  The default is to rebalance once every 10
++# seconds.  This is the environment file that is specified to systemd via the 
++# EnvironmentFile key in the service unit file (or via whatever method the init
++# system you're using has. 
++#
++# ONESHOT=yes
++#    after starting, wait for a minute, then look at the interrupt
++#    load and balance it once; after balancing exit and do not change
++#    it again.
++#IRQBALANCE_ONESHOT=
++
++#
++# IRQBALANCE_BANNED_CPUS
++#    64 bit bitmask which allows you to indicate which cpu's should
++#    be skipped when reblancing irqs.  Cpu numbers which have their 
++#    corresponding bits set to one in this mask will not have any
++#    irq's assigned to them on rebalance
++#
++#IRQBALANCE_BANNED_CPUS=
++
++#
++# IRQBALANCE_ARGS
++#    append any args here to the irqbalance daemon as documented in the man page
++#
++#IRQBALANCE_ARGS=
+diff --git a/misc/irqbalance.service b/misc/irqbalance.service
+index f349616..aae2b03 100644
+--- a/misc/irqbalance.service
++++ b/misc/irqbalance.service
+@@ -3,9 +3,8 @@ Description=irqbalance daemon
+ After=syslog.target
+ 
+ [Service]
+-EnvironmentFile=/etc/sysconfig/irqbalance
+-Type=forking
+-ExecStart=/usr/sbin/irqbalance $ONESHOT
++EnvironmentFile=/path/to/irqbalance.env
++ExecStart=/usr/sbin/irqbalance $IRQBALANCE_ARGS
+ 
+ [Install]
+ WantedBy=multi-user.target
+-- 
+1.7.11.4
+
diff --git a/0002-introduce-banirq-option.patch b/0002-introduce-banirq-option.patch
new file mode 100644
index 0000000..137de84
--- /dev/null
+++ b/0002-introduce-banirq-option.patch
@@ -0,0 +1,172 @@
+From 4da232bbf763e535ec2512087aa9ac8a96fba3d9 Mon Sep 17 00:00:00 2001
+From: Neil Horman <nhorman at tuxdriver.com>
+Date: Fri, 17 Feb 2012 14:27:11 -0500
+Subject: [PATCH 2/8] introduce banirq option
+
+Fixing bug http://code.google.com/p/irqbalance/issues/detail?id=25
+
+It was pointed out that during the rewrite of irqbalance I inadvertently removed
+the support for the IRQBALANCE_BANNED_IRQS environment variable. While going to
+return it to the build, it occurred to me that, given the availability of msi[x]
+irqs, a single system can literally have thousands of interrupt sources, making
+the environment variable a non-scalable solution.  Instead I'm adding a new
+option, banirq, which takes its place.  It lets you build a list of irqs that
+you want irqbalance to leave alone.
+
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+---
+ classify.c   | 32 ++++++++++++++++++++++++++++++++
+ irqbalance.1 | 11 +++++++----
+ irqbalance.c | 15 ++++++++++++---
+ irqbalance.h |  1 +
+ 4 files changed, 52 insertions(+), 7 deletions(-)
+
+diff --git a/classify.c b/classify.c
+index 124dab0..d59da7f 100644
+--- a/classify.c
++++ b/classify.c
+@@ -52,6 +52,7 @@ static short class_codes[MAX_CLASS] = {
+ };
+ 
+ static GList *interrupts_db;
++static GList *banned_irqs;
+ 
+ #define SYSDEV_DIR "/sys/bus/pci/devices"
+ 
+@@ -63,6 +64,30 @@ static gint compare_ints(gconstpointer a, gconstpointer b)
+ 	return ai->irq - bi->irq;
+ }
+ 
++void add_banned_irq(int irq)
++{
++	struct irq_info find, *new;
++	GList *entry;
++
++	find.irq = irq;
++	entry = g_list_find_custom(banned_irqs, &find, compare_ints);
++	if (entry)
++		return;
++
++	new = calloc(sizeof(struct irq_info), 1);
++	if (!new) {
++		if (debug_mode)
++			printf("No memory to ban irq %d\n", irq);
++		return;
++	}
++
++	new->irq = irq;
++
++	banned_irqs = g_list_append(banned_irqs, new);
++	return;
++}
++
++			
+ /*
+  * Inserts an irq_info struct into the intterupts_db list
+  * devpath points to the device directory in sysfs for the 
+@@ -90,6 +115,13 @@ static struct irq_info *add_one_irq_to_db(const char *devpath, int irq)
+ 		return NULL;
+ 	}
+ 
++	entry = g_list_find_custom(banned_irqs, &find, compare_ints);
++	if (entry) {
++		if (debug_mode)
++			printf("SKIPPING BANNED IRQ %d\n", irq);
++		return NULL;
++	}
++
+ 	new = calloc(sizeof(struct irq_info), 1);
+ 	if (!new)
+ 		return NULL;
+diff --git a/irqbalance.1 b/irqbalance.1
+index 55fc15f..978c7c1 100644
+--- a/irqbalance.1
++++ b/irqbalance.1
+@@ -62,6 +62,13 @@ average cpu softirq workload, and no cpus are more than 1 standard deviation
+ above (and have more than 1 irq assigned to them), attempt to place 1 cpu in
+ powersave mode.  In powersave mode, a cpu will not have any irqs balanced to it,
+ in an effort to prevent that cpu from waking up without need.
++
++.TP
++.B --banirq=<irqnum>
++Add the specified irq list to the set of banned irqs. irqbalance will not affect
++the affinity of any irqs on the banned list, allowing them to be specified
++manually.  This option is addative and can be specified multiple times
++
+ .SH "ENVIRONMENT VARIABLES"
+ .TP
+ .B IRQBALANCE_ONESHOT
+@@ -75,10 +82,6 @@ Same as --debug
+ .B IRQBALANCE_BANNED_CPUS
+ Provides a mask of cpus which irqbalance should ignore and never assign interrupts to
+ 
+-.TP
+-.B IRQBALANCE_BANNED_INTERRUPTS
+-A list of space delimited IRQ numbers that irqbalance should not touch
+-
+ .SH "Homepage"
+ http://code.google.com/p/irqbalance
+ 
+diff --git a/irqbalance.c b/irqbalance.c
+index 99c5db7..c613e2b 100644
+--- a/irqbalance.c
++++ b/irqbalance.c
+@@ -72,7 +72,7 @@ struct option lopts[] = {
+ static void usage(void)
+ {
+ 	printf("irqbalance [--oneshot | -o] [--debug | -d] [--hintpolicy= | -h [exact|subset|ignore]]\n");
+-	printf("	[--powerthresh= | -p <off> | <n>]\n");
++	printf("	[--powerthresh= | -p <off> | <n>] [--banirq= | -i <n>]\n");
+ }
+ 
+ static void parse_command_line(int argc, char **argv)
+@@ -81,7 +81,7 @@ static void parse_command_line(int argc, char **argv)
+ 	int longind;
+ 
+ 	while ((opt = getopt_long(argc, argv,
+-		"odh:p:",
++		"odh:p:b:",
+ 		lopts, &longind)) != -1) {
+ 
+ 		switch(opt) {
+@@ -103,6 +103,14 @@ static void parse_command_line(int argc, char **argv)
+ 					exit(1);
+ 				}
+ 				break;
++			case 'i':
++				val = strtoull(optarg, NULL, 10);
++				if (val == ULONG_MAX) {
++					usage();
++					exit(1);
++				}
++				add_banned_irq((int)val);
++				break;
+ 			case 'p':
+ 				if (!strncmp(optarg, "off", strlen(optarg)))
+ 					power_thresh = ULONG_MAX;
+@@ -179,8 +187,9 @@ int main(int argc, char** argv)
+ #ifdef HAVE_GETOPT_LONG
+ 	parse_command_line(argc, argv);
+ #else
+-	if (argc>1 && strstr(argv[1],"--debug"))
++	if (argc>1 && strstr(argv[1],"--debug")) {
+ 		debug_mode=1;
++	}
+ 	if (argc>1 && strstr(argv[1],"--oneshot"))
+ 		one_shot_mode=1;
+ #endif
+diff --git a/irqbalance.h b/irqbalance.h
+index 4e85325..956aa8c 100644
+--- a/irqbalance.h
++++ b/irqbalance.h
+@@ -103,6 +103,7 @@ extern int get_cpu_count(void);
+  */
+ extern void rebuild_irq_db(void);
+ extern void free_irq_db(void);
++extern void add_banned_irq(int irq);
+ extern void for_each_irq(GList *list, void (*cb)(struct irq_info *info,  void *data), void *data);
+ extern struct irq_info *get_irq_info(int irq);
+ extern void migrate_irq(GList **from, GList **to, struct irq_info *info);
+-- 
+1.7.11.4
+
diff --git a/0003-When-IRQBALANCE_BANNED_CPUS-is-set-proc-stat-is-not-.patch b/0003-When-IRQBALANCE_BANNED_CPUS-is-set-proc-stat-is-not-.patch
new file mode 100644
index 0000000..3eac789
--- /dev/null
+++ b/0003-When-IRQBALANCE_BANNED_CPUS-is-set-proc-stat-is-not-.patch
@@ -0,0 +1,44 @@
+From 718561bc79c095909f0c9d3fb2f0c1c163478b1e Mon Sep 17 00:00:00 2001
+From: Petr Holasek <pholasek at redhat.com>
+Date: Mon, 20 Feb 2012 16:59:05 +0100
+Subject: [PATCH 3/8] When IRQBALANCE_BANNED_CPUS is set, /proc/stat is not
+ parsed properly.
+
+proc stats counts all the cpus in /proc/stat, but compares that number to the
+value in get_cpu_count(), which returns the number of cpus actively being
+balanced.  Since that value doesn't include banned cpus, it's incorrect.  Since
+we don't want to measure the load on banned cpus anyway, just skip those lines
+so cpucount doesn't increment and the count remains equal.
+
+Signed-off-by: Petr Holasek <pholasek at redhat.com>
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+---
+ procinterrupts.c | 5 +++++
+ 1 file changed, 5 insertions(+)
+
+diff --git a/procinterrupts.c b/procinterrupts.c
+index 4d3b07b..c032caf 100644
+--- a/procinterrupts.c
++++ b/procinterrupts.c
+@@ -32,6 +32,8 @@
+ 
+ #define LINESIZE 4096
+ 
++extern cpumask_t banned_cpus;
++
+ static int proc_int_has_msi = 0;
+ static int msi_found_in_sysfs = 0;
+ 
+@@ -217,6 +219,9 @@ void parse_proc_stat(void)
+ 
+ 		cpunr = strtoul(&line[3], NULL, 10);
+ 
++		if (cpu_isset(cpunr, banned_cpus))
++			continue;
++
+ 		rc = sscanf(line, "%*s %*d %*d %*d %*d %*d %d %d", &irq_load, &softirq_load);
+ 		if (rc < 2)
+ 			break;	
+-- 
+1.7.11.4
+
diff --git a/0004-Make-irqbalance-scan-for-new-irqs-when-it-detects-ne.patch b/0004-Make-irqbalance-scan-for-new-irqs-when-it-detects-ne.patch
new file mode 100644
index 0000000..045892e
--- /dev/null
+++ b/0004-Make-irqbalance-scan-for-new-irqs-when-it-detects-ne.patch
@@ -0,0 +1,91 @@
+From 0edc531b0a2ebb41eb5cf49168e2897640cba0ec Mon Sep 17 00:00:00 2001
+From: Neil Horman <nhorman at tuxdriver.com>
+Date: Mon, 2 Jul 2012 13:27:14 -0400
+Subject: [PATCH 4/8] Make irqbalance scan for new irqs when it detects new
+ irqs
+
+Like cpu hotplug, irqbalance needs to rebuild its topo map and irq db when it
+detects new irqs in the system.  This patch adds that ability
+
+Resolves: http://code.google.com/p/irqbalance/issues/detail?id=32
+
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+---
+ irqbalance.c     |  6 +++---
+ irqbalance.h     |  2 +-
+ procinterrupts.c | 14 ++++++++++++--
+ 3 files changed, 16 insertions(+), 6 deletions(-)
+
+diff --git a/irqbalance.c b/irqbalance.c
+index c613e2b..5d40321 100644
+--- a/irqbalance.c
++++ b/irqbalance.c
+@@ -40,7 +40,7 @@ volatile int keep_going = 1;
+ int one_shot_mode;
+ int debug_mode;
+ int numa_avail;
+-int need_cpu_rescan;
++int need_rescan;
+ extern cpumask_t banned_cpus;
+ enum hp_e hint_policy = HINT_POLICY_SUBSET;
+ unsigned long power_thresh = ULONG_MAX;
+@@ -256,8 +256,8 @@ int main(int argc, char** argv)
+ 		parse_proc_stat();
+ 
+ 		/* cope with cpu hotplug -- detected during /proc/interrupts parsing */
+-		if (need_cpu_rescan) {
+-			need_cpu_rescan = 0;
++		if (need_rescan) {
++			need_rescan = 0;
+ 			/* if there's a hotplug event we better turn off power mode for a bit until things settle */
+ 			power_mode = 0;
+ 			if (debug_mode)
+diff --git a/irqbalance.h b/irqbalance.h
+index 956aa8c..043bfe6 100644
+--- a/irqbalance.h
++++ b/irqbalance.h
+@@ -64,7 +64,7 @@ enum hp_e {
+ extern int debug_mode;
+ extern int one_shot_mode;
+ extern int power_mode;
+-extern int need_cpu_rescan;
++extern int need_rescan;
+ extern enum hp_e hint_policy;
+ extern unsigned long long cycle_count;
+ extern unsigned long power_thresh;
+diff --git a/procinterrupts.c b/procinterrupts.c
+index c032caf..4559b16 100644
+--- a/procinterrupts.c
++++ b/procinterrupts.c
+@@ -82,8 +82,18 @@ void parse_proc_interrupts(void)
+ 		c++;
+ 		number = strtoul(line, NULL, 10);
+ 		info = get_irq_info(number);
+-		if (!info)
++		if (!info) {
++			/*
++ 			 * If this is our 0th pass through this routine
++ 			 * this is an irq that wasn't reported in sysfs
++ 			 * and we should just add it.  If we've been running
++ 			 * a while then this irq just appeared and its time  
++ 			 * to rescan our irqs
++ 			 */
++			if (cycle_count)
++				need_rescan = 1;
+ 			info = add_misc_irq(number);
++		}
+ 
+ 		count = 0;
+ 		cpunr = 0;
+@@ -99,7 +109,7 @@ void parse_proc_interrupts(void)
+ 			cpunr++;
+ 		}
+ 		if (cpunr != core_count) 
+-			need_cpu_rescan = 1;
++			need_rescan = 1;
+ 
+ 		info->last_irq_count = info->irq_count;		
+ 		info->irq_count = count;
+-- 
+1.7.11.4
+
diff --git a/0005-Add-banscript-option.patch b/0005-Add-banscript-option.patch
new file mode 100644
index 0000000..bf02d1e
--- /dev/null
+++ b/0005-Add-banscript-option.patch
@@ -0,0 +1,218 @@
+From b18eb8f6b28cc9b0816be0fb8fe3468c9f64f345 Mon Sep 17 00:00:00 2001
+From: Neil Horman <nhorman at tuxdriver.com>
+Date: Thu, 5 Jul 2012 14:54:35 -0400
+Subject: [PATCH 5/8] Add banscript option
+
+It's been requested in several different ways that irqbalance have a more robust
+mechanism for setting balancing policy at run time.  While I don't feel it's
+appropriate to have irqbalance be able to implement arbitrary balance policy
+(having a flexible mechanism to define which irqs should be placed where can
+become exceedingly complex), I do think we need some mechanism that easily
+allows users to dynamically exclude irqs from the irqbalance policy at run time.
+The banscript option does exactly this.  It allows the user to point irqbalance
+toward an executable file that is run once for each irq discovered, passing the
+sysfs path of the device and an irq vector as arguments.  A zero exit code tells
+irqbalance to manage the irq as it normally would, while a non-zero exit tells
+irqbalance to ignore the interrupt entirely.  This provides administrators a code
+point with which to exclude irqs dynamically based on any programmatic
+information available, and to manage those irqs independently, either via
+another irqbalance-like program, or via static affinity setting.
+
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+
+Resolves: http://code.google.com/p/irqbalance/issues/detail?id=33
+---
+ classify.c   | 46 ++++++++++++++++++++++++++++++++++++++++++++++
+ irqbalance.1 | 11 +++++++++++
+ irqbalance.c | 25 +++++++++++++++++++++----
+ irqbalance.h |  1 +
+ 4 files changed, 79 insertions(+), 4 deletions(-)
+
+diff --git a/classify.c b/classify.c
+index d59da7f..750d946 100644
+--- a/classify.c
++++ b/classify.c
+@@ -207,6 +207,43 @@ out:
+ 	return new;
+ }
+ 
++static int check_for_irq_ban(char *path, int irq)
++{
++	char *cmd;
++	int rc;
++
++	if (!banscript)
++		return 0;
++
++	cmd = alloca(strlen(path)+strlen(banscript)+32);
++	if (!cmd)
++		return 0;
++	
++	sprintf(cmd, "%s %s %d",banscript, path, irq);
++	rc = system(cmd);
++
++	/*
++ 	 * The system command itself failed
++ 	 */
++	if (rc == -1) {
++		if (debug_mode)
++			printf("%s failed, please check the --banscript option\n", cmd);
++		else
++			syslog(LOG_INFO, "%s failed, please check the --banscript option\n", cmd);
++		return 0;
++	}
++
++	if (WEXITSTATUS(rc)) {
++		if (debug_mode)
++			printf("irq %d is baned by %s\n", irq, banscript);
++		else
++			syslog(LOG_INFO, "irq %d is baned by %s\n", irq, banscript);
++		return 1;
++	}
++	return 0;
++
++}
++
+ /*
+  * Figures out which interrupt(s) relate to the device we're looking at in dirname
+  */
+@@ -231,6 +268,10 @@ static void build_one_dev_entry(const char *dirname)
+ 			irqnum = strtol(entry->d_name, NULL, 10);
+ 			if (irqnum) {
+ 				sprintf(path, "%s/%s", SYSDEV_DIR, dirname);
++				if (check_for_irq_ban(path, irqnum)) {
++					add_banned_irq(irqnum);
++					continue;
++				}
+ 				new = add_one_irq_to_db(path, irqnum);
+ 				if (!new)
+ 					continue;
+@@ -253,6 +294,11 @@ static void build_one_dev_entry(const char *dirname)
+ 	 */
+ 	if (irqnum) {
+ 		sprintf(path, "%s/%s", SYSDEV_DIR, dirname);
++		if (check_for_irq_ban(path, irqnum)) {
++			add_banned_irq(irqnum);
++			goto done;
++		}
++
+ 		new = add_one_irq_to_db(path, irqnum);
+ 		if (!new)
+ 			goto done;
+diff --git a/irqbalance.1 b/irqbalance.1
+index 978c7c1..63b0e26 100644
+--- a/irqbalance.1
++++ b/irqbalance.1
+@@ -69,6 +69,17 @@ Add the specified irq list to the set of banned irqs. irqbalance will not affect
+ the affinity of any irqs on the banned list, allowing them to be specified
+ manually.  This option is addative and can be specified multiple times
+ 
++.TP
++.B --banscript=<script>
++Execute the specified script for each irq that is discovered, passing the sysfs
++path to the associated device as the first argument, and the irq vector as the
++second.  An exit value of 0 tells irqbalance that this interrupt should balanced
++and managed as a normal irq, while a non-zero exit code indicates this irq
++should be ignored by irqbalance completely (see --banirq above).  Use of this
++script provides users the ability to dynamically select which irqs get exluded
++from balancing, and provides an opportunity for manual affinity setting in one
++single code point.
++
+ .SH "ENVIRONMENT VARIABLES"
+ .TP
+ .B IRQBALANCE_ONESHOT
+diff --git a/irqbalance.c b/irqbalance.c
+index 5d40321..0184f0f 100644
+--- a/irqbalance.c
++++ b/irqbalance.c
+@@ -1,5 +1,6 @@
+ /* 
+  * Copyright (C) 2006, Intel Corporation
++ * Copyright (C) 2012, Neil Horman <nhorman at tuxdriver.com> 
+  * 
+  * This file is part of irqbalance
+  *
+@@ -45,6 +46,7 @@ extern cpumask_t banned_cpus;
+ enum hp_e hint_policy = HINT_POLICY_SUBSET;
+ unsigned long power_thresh = ULONG_MAX;
+ unsigned long long cycle_count = 0;
++char *banscript = NULL;
+ 
+ void sleep_approx(int seconds)
+ {
+@@ -66,6 +68,8 @@ struct option lopts[] = {
+ 	{"debug", 0, NULL, 'd'},
+ 	{"hintpolicy", 1, NULL, 'h'},
+ 	{"powerthresh", 1, NULL, 'p'},
++	{"banirq", 1 , NULL, 'i'},
++	{"banscript", 1, NULL, 'b'},
+ 	{0, 0, 0, 0}
+ };
+ 
+@@ -79,9 +83,10 @@ static void parse_command_line(int argc, char **argv)
+ {
+ 	int opt;
+ 	int longind;
++	unsigned long val;
+ 
+ 	while ((opt = getopt_long(argc, argv,
+-		"odh:p:b:",
++		"odh:i:p:b:",
+ 		lopts, &longind)) != -1) {
+ 
+ 		switch(opt) {
+@@ -193,6 +198,12 @@ int main(int argc, char** argv)
+ 	if (argc>1 && strstr(argv[1],"--oneshot"))
+ 		one_shot_mode=1;
+ #endif
++
++	/*
++ 	 * Open the syslog connection
++ 	 */
++	openlog(argv[0], 0, LOG_DAEMON);
++
+ 	if (getenv("IRQBALANCE_BANNED_CPUS"))  {
+ 		cpumask_parse_user(getenv("IRQBALANCE_BANNED_CPUS"), strlen(getenv("IRQBALANCE_BANNED_CPUS")), banned_cpus);
+ 	}
+@@ -221,8 +232,16 @@ int main(int argc, char** argv)
+ 
+ 
+ 	/* On single core UP systems irqbalance obviously has no work to do */
+-	if (core_count<2) 
++	if (core_count<2) {
++		char *msg = "Balancing is ineffective on systems with a "
++			    "single cache domain.  Shutting down\n";
++
++		if (debug_mode)
++			printf("%s", msg);
++		else
++			syslog(LOG_INFO, "%s", msg);
+ 		exit(EXIT_SUCCESS);
++	}
+ 	/* On dual core/hyperthreading shared cache systems just do a one shot setup */
+ 	if (cache_domain_count==1)
+ 		one_shot_mode = 1;
+@@ -231,8 +250,6 @@ int main(int argc, char** argv)
+ 		if (daemon(0,0))
+ 			exit(EXIT_FAILURE);
+ 
+-	openlog(argv[0], 0, LOG_DAEMON);
+-
+ #ifdef HAVE_LIBCAP_NG
+ 	// Drop capabilities
+ 	capng_clear(CAPNG_SELECT_BOTH);
+diff --git a/irqbalance.h b/irqbalance.h
+index 043bfe6..425e0dd 100644
+--- a/irqbalance.h
++++ b/irqbalance.h
+@@ -68,6 +68,7 @@ extern int need_rescan;
+ extern enum hp_e hint_policy;
+ extern unsigned long long cycle_count;
+ extern unsigned long power_thresh;
++extern char *banscript;
+ 
+ /*
+  * Numa node access routines
+-- 
+1.7.11.4
+
diff --git a/0006-irqbalance-cpu-powersave-code-disabled-when-power_th.patch b/0006-irqbalance-cpu-powersave-code-disabled-when-power_th.patch
new file mode 100644
index 0000000..8e00a63
--- /dev/null
+++ b/0006-irqbalance-cpu-powersave-code-disabled-when-power_th.patch
@@ -0,0 +1,41 @@
+From ab5ee2928b75f12a2340afe6778a106886509b4c Mon Sep 17 00:00:00 2001
+From: Petr Holasek <pholasek at redhat.com>
+Date: Thu, 12 Jul 2012 14:54:16 +0200
+Subject: [PATCH 6/8] irqbalance: cpu powersave code disabled when
+ power_thresh is not set
+
+When the user doesn't set the power_thresh argument, no cpu can enter powersave
+mode. This patch should stop syslog from being clogged with a pointless message
+about re-enabling all cpus for irq balancing.
+
+Signed-off-by: Petr Holasek <pholasek at redhat.com>
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+---
+ irqlist.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+diff --git a/irqlist.c b/irqlist.c
+index c29ee84..e03aa7b 100644
+--- a/irqlist.c
++++ b/irqlist.c
+@@ -112,7 +112,7 @@ static void migrate_overloaded_irqs(struct topo_obj *obj, void *data)
+ 	if (obj->load <= info->avg_load) {
+ 		if ((obj->load + info->std_deviation) <= info->avg_load) {
+ 			info->num_under++;
+-			if (!info->powersave)
++			if (power_thresh != ULONG_MAX && !info->powersave)
+ 				if (!obj->powersave_mode)
+ 					info->powersave = obj;
+ 		} else
+@@ -172,7 +172,7 @@ void update_migration_status(void)
+ {
+ 	struct load_balance_info info;
+ 	find_overloaded_objs(cpus, info);
+-	if (cycle_count > 5) {
++	if (power_thresh != ULONG_MAX && cycle_count > 5) {
+ 		if (!info.num_over && (info.num_under >= power_thresh) && info.powersave) {
+ 			syslog(LOG_INFO, "cpu %d entering powersave mode\n", info.powersave->number);
+ 			info.powersave->powersave_mode = 1;
+-- 
+1.7.11.4
+
diff --git a/0007-apply-affinity-hint-also-if-the-current-policy-is-su.patch b/0007-apply-affinity-hint-also-if-the-current-policy-is-su.patch
new file mode 100644
index 0000000..66618c3
--- /dev/null
+++ b/0007-apply-affinity-hint-also-if-the-current-policy-is-su.patch
@@ -0,0 +1,103 @@
+From 7475c3e26d14bb210eb3524396adef77021e696f Mon Sep 17 00:00:00 2001
+From: Paolo Bonzini <pbonzini at redhat.com>
+Date: Tue, 7 Aug 2012 02:54:34 -0400
+Subject: [PATCH 7/8] apply affinity hint also if the current policy is subset
+
+--hintpolicy=subset chooses an object that has a non-empty intersection
+with the affinity hint, but it never restricts the object's CPU mask
+with the hint itself.  As a result, there is no guarantee that the
+object's CPU mask is a subset of the hint.
+
+This is visible for interrupts whose balancing policy is not BALANCE_CORE.
+For example, if there is only one cache domain and the interrupt's policy
+is BALANCE_CACHE, the chosen object will correspond to "all CPUs" and
+the affinity hint will be effectively ignored.
+
+Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+---
+ activate.c | 45 ++++++++++++++++++++++++++++++++++++++-------
+ 1 file changed, 38 insertions(+), 7 deletions(-)
+
+diff --git a/activate.c b/activate.c
+index 292c44a..97d84a8 100644
+--- a/activate.c
++++ b/activate.c
+@@ -1,5 +1,6 @@
+ /* 
+  * Copyright (C) 2006, Intel Corporation
++ * Copyright (C) 2012, Neil Horman <nhorman at tuxdriver.com> 
+  * 
+  * This file is part of irqbalance
+  *
+@@ -31,17 +32,53 @@
+ 
+ #include "irqbalance.h"
+ 
++static int check_affinity(struct irq_info *info, cpumask_t applied_mask)
++{
++	cpumask_t current_mask;
++	char buf[PATH_MAX];
++	char *line = NULL;
++	size_t size = 0;
++	FILE *file;
++
++	sprintf(buf, "/proc/irq/%i/smp_affinity", info->irq);
++	file = fopen(buf, "r");
++	if (!file)
++		return 1;
++	if (getline(&line, &size, file)==0) {
++		free(line);
++		fclose(file);
++		return 1;
++	}
++	cpumask_parse_user(line, strlen(line), current_mask);
++	fclose(file);
++	free(line);
++
++	return cpus_equal(applied_mask, current_mask);
++}
+ 
+ static void activate_mapping(struct irq_info *info, void *data __attribute__((unused)))
+ {
+ 	char buf[PATH_MAX];
+ 	FILE *file;
+ 	cpumask_t applied_mask;
++	int valid_mask = 0;
++
++	if ((hint_policy == HINT_POLICY_EXACT) &&
++	    (!cpus_empty(info->affinity_hint))) {
++		applied_mask = info->affinity_hint;
++		valid_mask = 1;
++	} else if (info->assigned_obj) {
++		applied_mask = info->assigned_obj->mask;
++		valid_mask = 1;
++		if ((hint_policy == HINT_POLICY_SUBSET) &&
++		    (!cpus_empty(info->affinity_hint)))
++			cpus_and(applied_mask, applied_mask, info->affinity_hint);
++	}
+ 
+ 	/*
+  	 * only activate mappings for irqs that have moved
+  	 */
+-	if (!info->moved)
++	if (!info->moved && (!valid_mask || check_affinity(info, applied_mask)))
+ 		return;
+ 
+ 	if (!info->assigned_obj)
+@@ -53,12 +90,6 @@ static void activate_mapping(struct irq_info *info, void *data __attribute__((un
+ 	if (!file)
+ 		return;
+ 
+-	if ((hint_policy == HINT_POLICY_EXACT) &&
+-	    (!cpus_empty(info->affinity_hint)))
+-		applied_mask = info->affinity_hint;
+-	else
+-		applied_mask = info->assigned_obj->mask;
+-
+ 	cpumask_scnprintf(buf, PATH_MAX, applied_mask);
+ 	fprintf(file, "%s", buf);
+ 	fclose(file);
+-- 
+1.7.11.4
+
diff --git a/0008-irqlist-added-check-for-avoidance-of-division-by-zer.patch b/0008-irqlist-added-check-for-avoidance-of-division-by-zer.patch
new file mode 100644
index 0000000..aab34ea
--- /dev/null
+++ b/0008-irqlist-added-check-for-avoidance-of-division-by-zer.patch
@@ -0,0 +1,31 @@
+From 8285d9a1cac9cf74130ae71df0ddb4ed14122544 Mon Sep 17 00:00:00 2001
+From: Petr Holasek <pholasek at redhat.com>
+Date: Tue, 21 Aug 2012 14:45:57 +0200
+Subject: [PATCH 8/8] irqlist: added check for avoidance of division by zero
+
+When counting load_sources, it's occasionally possible to have one of our object
+lists be empty (if you exclude all the cpus from balancing, for instance).  In
+these cases load_sources can be zero, and that will cause a SIGFPE.  Avoid that
+by making sure that load_sources is always at least 1.
+
+Signed-off-by: Petr Holasek <pholasek at redhat.com>
+Signed-off-by: Neil Horman <nhorman at tuxdriver.com>
+---
+ irqlist.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/irqlist.c b/irqlist.c
+index e03aa7b..c0e0d2b 100644
+--- a/irqlist.c
++++ b/irqlist.c
+@@ -160,6 +160,7 @@ static void clear_powersave_mode(struct topo_obj *obj, void *data __attribute__(
+ 	int ___load_sources;\
+ 	memset(&(info), 0, sizeof(struct load_balance_info));\
+ 	for_each_object((name), gather_load_stats, &(info));\
++	(info).load_sources = ((info).load_sources == 0) ? 1 : ((info).load_sources);\
+ 	(info).avg_load = (info).total_load / (info).load_sources;\
+ 	for_each_object((name), compute_deviations, &(info));\
+ 	___load_sources = ((info).load_sources == 1) ? 1 : ((info).load_sources - 1);\
+-- 
+1.7.11.4
+
diff --git a/irqbalance.spec b/irqbalance.spec
index 9e86a95..57667f8 100644
--- a/irqbalance.spec
+++ b/irqbalance.spec
@@ -1,6 +1,6 @@
 Name:           irqbalance
 Version:        1.0.3
-Release:        4%{?dist}
+Release:        5%{?dist}
 Epoch:          2
 Summary:        IRQ balancing daemon
 
@@ -11,7 +11,7 @@ Source0:        http://irqbalance.googlecode.com/files/irqbalance-%{version}.tar
 Source1:        irqbalance.sysconfig
 
 BuildRequires:  autoconf automake libtool libcap-ng
-BuildRequires:  glib2-devel pkgconfig imake libcap-ng-devel
+BuildRequires:  glib2-devel pkgconfig libcap-ng-devel
 %ifnarch %{arm}
 BuildRequires:	numactl-devel numactl-libs
 Requires: numactl-libs
@@ -23,12 +23,29 @@ Requires(preun):systemd-units
 
 ExclusiveArch: %{ix86} x86_64 ia64 ppc ppc64 %{arm}
 
+Patch1: 0001-Add-sample-irqbalance-environment-file.patch
+Patch2: 0002-introduce-banirq-option.patch
+Patch3: 0003-When-IRQBALANCE_BANNED_CPUS-is-set-proc-stat-is-not-.patch
+Patch4: 0004-Make-irqbalance-scan-for-new-irqs-when-it-detects-ne.patch
+Patch5: 0005-Add-banscript-option.patch
+Patch6: 0006-irqbalance-cpu-powersave-code-disabled-when-power_th.patch
+Patch7: 0007-apply-affinity-hint-also-if-the-current-policy-is-su.patch
+Patch8: 0008-irqlist-added-check-for-avoidance-of-division-by-zer.patch
+
 %description
 irqbalance is a daemon that evenly distributes IRQ load across
 multiple CPUs for enhanced performance.
 
 %prep
 %setup -q
+%patch1 -p1
+%patch2 -p1
+%patch3 -p1
+%patch4 -p1
+%patch5 -p1
+%patch6 -p1
+%patch7 -p1
+%patch8 -p1
 
 %build
 %{configure}
@@ -77,7 +94,17 @@ fi
 /sbin/chkconfig --del irqbalance >/dev/null 2>&1 || :
 
 %changelog
-* Mon Apr 16 2012 Petr Holasek <pholasek at redhat.com> - 2:1.0.3-4
+* Mon Aug 27 2012 Petr Holasek <pholasek at redhat.com> - 2:1.0.3-5
+- Make irqbalance scan for new irqs when it detects new irqs (bz832815)
+- Fixes SIGFPE crash for some banning configuration (bz849792)
+- Fixes affinity_hint values processing (bz832815)
+- Adds banirq and bansript options (bz837049)
+- imake isn't needed for building any more (bz844359)
+- Fixes clogging of syslog (bz837646)
+- Added IRQBALANCE_ARGS variable for passing arguments via systemd(bz837048)
+- Fixes --hint-policy=subset behavior (bz844381)
+
+* Sun Apr 15 2012 Petr Holasek <pholasek at redhat.com> - 2:1.0.3-4
 - Updated libnuma dependencies
 
 * Sun Feb  5 2012 Peter Robinson <pbrobinson at fedoraproject.org> - 2:1.0.3-3
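
Note on patch 5/8 (--banscript), added for illustration and not part of the
commit: per the patch, the ban program is run via system() with the sysfs
device path and the irq number as its two arguments, and a non-zero exit
status bans the irq. A minimal sketch of the decision such a program might
make follows. The policy (ban every irq of one PCI device), the PINNED_DEVICE
address, and the ban_decision() name are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical policy: keep irqbalance away from every irq of this device. */
#define PINNED_DEVICE "0000:03:00.0"

/*
 * Decide the exit code for one (device path, irq) pair.
 * devpath - sysfs path as passed by irqbalance,
 *           e.g. /sys/bus/pci/devices/<address>
 * irq_arg - the irq number as a decimal string
 * Returns 0 to let irqbalance manage the irq, 1 to have it banned.
 */
static int ban_decision(const char *devpath, const char *irq_arg)
{
	const char *leaf;
	char *end;
	long irq;

	assert(devpath != NULL && irq_arg != NULL);

	/* A malformed irq argument is not banned; leave it to irqbalance. */
	irq = strtol(irq_arg, &end, 10);
	if (end == irq_arg || *end != '\0' || irq < 0)
		return 0;

	/* The last path component is the device's directory name. */
	leaf = strrchr(devpath, '/');
	leaf = leaf ? leaf + 1 : devpath;

	return strcmp(leaf, PINNED_DEVICE) == 0 ? 1 : 0;
}
```

A wrapper main() would check that argc is 3 and return
ban_decision(argv[1], argv[2]); the resulting binary's path is what would be
given to --banscript=<script>.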

