src/lockspace.c
by David Teigland
src/lockspace.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
New commits:
commit 47cdae637b7e2ed78ce8d5c1867582859d71dd44
Author: David Teigland <teigland(a)redhat.com>
Date: Mon Aug 11 09:27:04 2014 -0500
sanlock: log error for invalid set_event args
Signed-off-by: David Teigland <teigland(a)redhat.com>
diff --git a/src/lockspace.c b/src/lockspace.c
index 4c31c09..352287e 100644
--- a/src/lockspace.c
+++ b/src/lockspace.c
@@ -1374,11 +1374,11 @@ int lockspace_set_event(struct sanlk_lockspace *ls, struct sanlk_host_event *he,
 	uint64_t now;
 	int i, rv = 0;

-	if (!ls->name[0])
-		return -EINVAL;
-
-	if (!he->host_id || he->host_id > DEFAULT_MAX_HOSTS)
+	if (!ls->name[0] || !he->host_id || he->host_id > DEFAULT_MAX_HOSTS) {
+		log_error("set_event invalid args host_id %llu name %s",
+			  (unsigned long long)he->host_id, ls->name);
 		return -EINVAL;
+	}

 	pthread_mutex_lock(&spaces_mutex);
 	sp = _search_space(ls->name, NULL, 0, &spaces, NULL, NULL, NULL);
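
The change above folds two silent -EINVAL returns into a single check that
logs before returning. A minimal standalone sketch of that pattern follows.
It is not the sanlock code: the struct names, SKETCH_MAX_HOSTS,
SKETCH_NAME_LEN and the use of stderr in place of log_error() are all
placeholders assumed for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real types and limits come from sanlock's headers. */
#define SKETCH_MAX_HOSTS 2000
#define SKETCH_NAME_LEN  48

struct sketch_lockspace {
	char name[SKETCH_NAME_LEN];
};

struct sketch_host_event {
	uint64_t host_id;
};

/*
 * Reject an empty lockspace name or an out-of-range host_id with one
 * combined test, and report the offending values before returning.
 * Returns 0 when the arguments look valid, -EINVAL otherwise.
 */
static int sketch_check_set_event_args(const struct sketch_lockspace *ls,
				       const struct sketch_host_event *he)
{
	if (!ls->name[0] || !he->host_id || he->host_id > SKETCH_MAX_HOSTS) {
		/* stderr stands in for the daemon's log_error() here */
		fprintf(stderr, "set_event invalid args host_id %llu name %s\n",
			(unsigned long long)he->host_id, ls->name);
		return -EINVAL;
	}
	return 0;
}
```

Logging both values in one message means a caller who gets -EINVAL can tell
from the log which argument was rejected.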
9 years, 1 month
init.d/sanlk-resetd.service init.d/sanlock.service init.d/wdmd.service
by David Teigland
init.d/sanlk-resetd.service | 3 ++-
init.d/sanlock.service | 1 -
init.d/wdmd.service | 1 -
3 files changed, 2 insertions(+), 3 deletions(-)
New commits:
commit 54ad08f5e40b3a2ed8ad64cbc40befff89107ea4
Author: David Teigland <teigland(a)redhat.com>
Date: Wed Aug 6 14:18:09 2014 -0500
init: misc changes
- use requires for sanlk-resetd because it's probably better
- remove syslog from sanlk-resetd because it seems systemd
doesn't like that any more
- remove controlgroup from sanlock and wdmd services files
because systemd doesn't seem to like that any more
Signed-off-by: David Teigland <teigland(a)redhat.com>
diff --git a/init.d/sanlk-resetd.service b/init.d/sanlk-resetd.service
index 44a88a6..3046a5b 100644
--- a/init.d/sanlk-resetd.service
+++ b/init.d/sanlk-resetd.service
@@ -1,6 +1,7 @@
 [Unit]
 Description=daemon for host reset
-After=syslog.target wdmd.service sanlock.service
+After=wdmd.service sanlock.service
+Requires=wdmd.service sanlock.service

 [Service]
 Type=forking
diff --git a/init.d/sanlock.service b/init.d/sanlock.service
index 64d9ced..f9ed52b 100644
--- a/init.d/sanlock.service
+++ b/init.d/sanlock.service
@@ -5,7 +5,6 @@ Wants=wdmd.service
[Service]
Type=forking
-ControlGroup=cpu:/
ExecStart=/lib/systemd/systemd-sanlock start
ExecStop=/lib/systemd/systemd-sanlock stop
diff --git a/init.d/wdmd.service b/init.d/wdmd.service
index efe46bf..7e6d973 100644
--- a/init.d/wdmd.service
+++ b/init.d/wdmd.service
@@ -4,7 +4,6 @@ After=syslog.target
[Service]
Type=forking
-ControlGroup=cpu:/
ExecStart=/lib/systemd/systemd-wdmd start
ExecStop=/lib/systemd/systemd-wdmd stop
9 years, 1 month
init.d/sanlk-resetd.service
by David Teigland
init.d/sanlk-resetd.service | 11 +++++++++++
1 file changed, 11 insertions(+)
New commits:
commit 415f7ec9473fc9254f6f88314ef6ff2ce7e0f49d
Author: David Teigland <teigland(a)redhat.com>
Date: Wed Aug 6 11:49:15 2014 -0500
init: add service file for sanlk-resetd
Signed-off-by: David Teigland <teigland(a)redhat.com>
diff --git a/init.d/sanlk-resetd.service b/init.d/sanlk-resetd.service
new file mode 100644
index 0000000..44a88a6
--- /dev/null
+++ b/init.d/sanlk-resetd.service
@@ -0,0 +1,11 @@
+[Unit]
+Description=daemon for host reset
+After=syslog.target wdmd.service sanlock.service
+
+[Service]
+Type=forking
+ExecStart=/usr/sbin/sanlk-resetd
+
+[Install]
+WantedBy=multi-user.target
+
9 years, 1 month
reset/Makefile
by David Teigland
reset/Makefile | 1 -
1 file changed, 1 deletion(-)
New commits:
commit a9d7f0e0b9ff85ccbff9d159ffeb07f10761acb7
Author: David Teigland <teigland(a)redhat.com>
Date: Tue Aug 5 14:15:58 2014 -0500
reset: remove DEBUG
Signed-off-by: David Teigland <teigland(a)redhat.com>
diff --git a/reset/Makefile b/reset/Makefile
index 5b9fe52..457da5e 100644
--- a/reset/Makefile
+++ b/reset/Makefile
@@ -4,7 +4,6 @@ TARGET2 = sanlk-reset
SOURCE1 = sanlk_resetd.c
SOURCE2 = sanlk_reset.c
-DEBUG = 1
OPTIMIZE_FLAG = -O2 -Wp,-D_FORTIFY_SOURCE=2
ifeq ($(DEBUG), 1)
OPTIMIZE_FLAG = -O0
9 years, 1 month
Question about Sanlock -> Wdmd communication
by Russell Jones
Hi David/All,

I have a CentOS 6 KVM host that uses Sanlock and Wdmd on a shared NFS
mount that contains both the VM disks and the Lockspace. I experienced an
NFS outage that, as expected, resulted in Sanlock entering recovery and
terminating all of the VM PIDs. I would have expected Sanlock to disarm
the watchdog at that point; however, that did not happen, and the
watchdog eventually reset the host.

I am having trouble making sense of the logs to determine whether the
watchdog was not disarmed because the Lockspace could not be renewed and
Sanlock is not going to terminate itself, or because a VM PID was hanging
around that would not exit. Below is a snippet of the log messages. Some
clarity about the ultimate cause of the watchdog firing would be much
appreciated!

Thank you!

<NFS server died>
Aug 5 02:41:14 vmhost sanlock[3111]: 1964804 __LIBVIR aio timeout 0x7f43300008c0:0x7f43300008d0:0x7f434296a000 sec 10 to_count 6
Aug 5 02:41:14 vmhost sanlock[3111]: 1964804 s1 delta_renew read rv -202 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:41:14 vmhost sanlock[3111]: 1964804 s1 renewal error -202 delta_length 10 last_success 1964773
Aug 5 02:41:25 vmhost sanlock[3111]: 1964815 __LIBVIR aio timeout 0x7f4330000910:0x7f4330000920:0x7f4342767000 sec 10 to_count 7
Aug 5 02:41:25 vmhost sanlock[3111]: 1964815 s1 delta_renew read rv -202 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:41:25 vmhost sanlock[3111]: 1964815 s1 renewal error -202 delta_length 11 last_success 1964773
Aug 5 02:41:36 vmhost sanlock[3111]: 1964826 __LIBVIR aio timeout 0x7f4330000960:0x7f4330000970:0x7f4342665000 sec 10 to_count 8
Aug 5 02:41:36 vmhost sanlock[3111]: 1964826 s1 delta_renew read rv -202 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:41:36 vmhost sanlock[3111]: 1964826 s1 renewal error -202 delta_length 11 last_success 1964773
Aug 5 02:41:43 vmhost sanlock[3111]: 1964833 s1 check_our_lease warning 60 last_success 1964773
Aug 5 02:41:44 vmhost sanlock[3111]: 1964834 s1 check_our_lease warning 61 last_success 1964773
Aug 5 02:41:45 vmhost sanlock[3111]: 1964835 s1 check_our_lease warning 62 last_success 1964773
Aug 5 02:41:46 vmhost sanlock[3111]: 1964836 s1 check_our_lease warning 63 last_success 1964773
Aug 5 02:41:47 vmhost sanlock[3111]: 1964837 __LIBVIR aio timeout 0x7f43300009b0:0x7f43300009c0:0x7f4342563000 sec 10 to_count 9
Aug 5 02:41:47 vmhost sanlock[3111]: 1964837 s1 delta_renew read rv -202 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:41:47 vmhost sanlock[3111]: 1964837 s1 renewal error -202 delta_length 11 last_success 1964773
Aug 5 02:41:47 vmhost sanlock[3111]: 1964837 s1 check_our_lease warning 64 last_success 1964773
Aug 5 02:41:48 vmhost sanlock[3111]: 1964838 s1 check_our_lease warning 65 last_success 1964773
Aug 5 02:41:49 vmhost sanlock[3111]: 1964839 s1 check_our_lease warning 66 last_success 1964773
Aug 5 02:41:50 vmhost sanlock[3111]: 1964840 s1 check_our_lease warning 67 last_success 1964773
Aug 5 02:41:51 vmhost sanlock[3111]: 1964841 s1 check_our_lease warning 68 last_success 1964773
Aug 5 02:41:52 vmhost sanlock[3111]: 1964842 s1 check_our_lease warning 69 last_success 1964773
Aug 5 02:41:53 vmhost sanlock[3111]: 1964843 s1 check_our_lease warning 70 last_success 1964773
Aug 5 02:41:54 vmhost sanlock[3111]: 1964844 s1 check_our_lease warning 71 last_success 1964773
Aug 5 02:41:55 vmhost sanlock[3111]: 1964845 s1 check_our_lease warning 72 last_success 1964773
Aug 5 02:41:56 vmhost sanlock[3111]: 1964846 s1 check_our_lease warning 73 last_success 1964773
Aug 5 02:41:57 vmhost sanlock[3111]: 1964847 s1 check_our_lease warning 74 last_success 1964773
Aug 5 02:41:58 vmhost sanlock[3111]: 1964848 s1 delta_renew read rv -2 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:41:58 vmhost sanlock[3111]: 1964848 s1 renewal error -2 delta_length 11 last_success 1964773
Aug 5 02:41:58 vmhost sanlock[3111]: 1964848 s1 check_our_lease warning 75 last_success 1964773
Aug 5 02:41:59 vmhost sanlock[3111]: 1964849 s1 check_our_lease warning 76 last_success 1964773
Aug 5 02:42:00 vmhost sanlock[3111]: 1964850 s1 check_our_lease warning 77 last_success 1964773
Aug 5 02:42:01 vmhost sanlock[3111]: 1964851 s1 check_our_lease warning 78 last_success 1964773
Aug 5 02:42:02 vmhost sanlock[3111]: 1964852 s1 check_our_lease warning 79 last_success 1964773
Aug 5 02:42:03 vmhost sanlock[3111]: 1964853 s1 check_our_lease failed 80
<VMs are being terminated here>
Aug 5 02:42:04 vmhost kernel: br0: port 5(vnet4) entering disabled state
Aug 5 02:42:04 vmhost kernel: device vnet4 left promiscuous mode
Aug 5 02:42:04 vmhost kernel: br0: port 5(vnet4) entering disabled state
Aug 5 02:42:04 vmhost kernel: br0: port 3(vnet1) entering disabled state
Aug 5 02:42:04 vmhost kernel: device vnet1 left promiscuous mode
Aug 5 02:42:04 vmhost kernel: br0: port 3(vnet1) entering disabled state
Aug 5 02:42:04 vmhost kernel: br0: port 4(vnet3) entering disabled state
Aug 5 02:42:04 vmhost kernel: device vnet3 left promiscuous mode
Aug 5 02:42:04 vmhost kernel: br0: port 4(vnet3) entering disabled state
Aug 5 02:42:04 vmhost kernel: br1: port 2(vnet2) entering disabled state
Aug 5 02:42:04 vmhost kernel: device vnet2 left promiscuous mode
Aug 5 02:42:04 vmhost kernel: br1: port 2(vnet2) entering disabled state
Aug 5 02:42:06 vmhost ntpd[2340]: Deleting interface #11 vnet1, fe80::fc52:ff:fe5d:4afd#123, interface stats: received=0, sent=0, dropped=0, active_time=1964539 secs
Aug 5 02:42:06 vmhost ntpd[2340]: Deleting interface #14 vnet4, fe80::fc54:ff:fe60:244e#123, interface stats: received=0, sent=0, dropped=0, active_time=1964539 secs
Aug 5 02:42:06 vmhost ntpd[2340]: Deleting interface #15 vnet3, fe80::fc52:ff:fe38:2713#123, interface stats: received=0, sent=0, dropped=0, active_time=1964539 secs
Aug 5 02:42:06 vmhost ntpd[2340]: Deleting interface #16 vnet2, fe80::fc54:ff:fee8:c0fb#123, interface stats: received=0, sent=0, dropped=0, active_time=1964539 secs
Aug 5 02:42:08 vmhost wdmd[3001]: test failed pid 3111 renewal 1964773 expire 1964853
Aug 5 02:42:08 vmhost sanlock[3111]: 1964858 s1 delta_renew read rv -2 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:42:08 vmhost sanlock[3111]: 1964858 s1 renewal error -2 delta_length 10 last_success 1964773
Aug 5 02:42:18 vmhost wdmd[3001]: test failed pid 3111 renewal 1964773 expire 1964853
Aug 5 02:42:19 vmhost sanlock[3111]: 1964869 s1 delta_renew read rv -2 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:42:19 vmhost sanlock[3111]: 1964869 s1 renewal error -2 delta_length 10 last_success 1964773
Aug 5 02:42:28 vmhost wdmd[3001]: test failed pid 3111 renewal 1964773 expire 1964853
Aug 5 02:42:29 vmhost sanlock[3111]: 1964879 s1 delta_renew read rv -2 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:42:29 vmhost sanlock[3111]: 1964879 s1 renewal error -2 delta_length 10 last_success 1964773
Aug 5 02:42:38 vmhost wdmd[3001]: test failed pid 3111 renewal 1964773 expire 1964853
Aug 5 02:42:40 vmhost sanlock[3111]: 1964890 s1 delta_renew read rv -2 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:42:40 vmhost sanlock[3111]: 1964890 s1 renewal error -2 delta_length 10 last_success 1964773
Aug 5 02:42:48 vmhost wdmd[3001]: test failed pid 3111 renewal 1964773 expire 1964853
Aug 5 02:42:50 vmhost sanlock[3111]: 1964900 s1 delta_renew read rv -2 offset 0 /vmstore02/sanlock/__LIBVIRT__DISKS__
Aug 5 02:42:50 vmhost sanlock[3111]: 1964900 s1 renewal error -2 delta_length 10 last_success 1964773
<host is reset>
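
The numbers in the log above line up arithmetically: the first
check_our_lease warning appears 60 seconds after last_success (1964833),
the failure at 80 seconds (1964853), and wdmd reports that same value as
"expire". The sketch below only reproduces that arithmetic. The 60 and 80
second thresholds are read off this log, not taken from sanlock's source,
and the function and enum names are made up for illustration; the real
values presumably depend on the configured io timeout.

```c
#include <stdint.h>

/* Thresholds as observed in the log above; assumed, not taken from sanlock. */
#define SKETCH_WARN_SECONDS 60
#define SKETCH_FAIL_SECONDS 80

enum sketch_lease_state {
	SKETCH_LEASE_OK,
	SKETCH_LEASE_WARNING,
	SKETCH_LEASE_FAILED
};

/* Time at which the lease is considered expired ("expire" in the wdmd lines). */
static uint64_t sketch_lease_expire(uint64_t last_success)
{
	return last_success + SKETCH_FAIL_SECONDS;
}

/* Classify the lease by how long ago the last successful renewal was. */
static enum sketch_lease_state sketch_check_lease(uint64_t now,
						  uint64_t last_success)
{
	uint64_t age;

	/* A renewal stamped in the future is treated as fresh. */
	if (now < last_success)
		return SKETCH_LEASE_OK;

	age = now - last_success;
	if (age >= SKETCH_FAIL_SECONDS)
		return SKETCH_LEASE_FAILED;
	if (age >= SKETCH_WARN_SECONDS)
		return SKETCH_LEASE_WARNING;
	return SKETCH_LEASE_OK;
}
```

With last_success 1964773, sketch_lease_expire() gives 1964853, matching
the "expire" field in every wdmd "test failed" line, all of which are
logged after that time has passed.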
9 years, 1 month