fence-agents: RHEL6 - fence_vmware_soap: Add delay option
by Marek Grác
Gitweb: http://git.fedorahosted.org/git/?p=fence-agents.git;a=commitdiff;h=da403a...
Commit: da403ab459474ba70303b2b46c9459c2872d6d68
Parent: 7b86da6001e65c9dc309bf4956f1bced9a894c5f
Author: Marek 'marx' Grac <mgrac(a)redhat.com>
AuthorDate: Fri Jan 24 13:39:36 2014 +0100
Committer: Marek 'marx' Grac <mgrac(a)redhat.com>
CommitterDate: Fri Jan 24 13:39:36 2014 +0100
fence_vmware_soap: Add delay option
Remove the duplicate "delay" option from the metadata. In general, this should be solved in the fencing library, but we do not want to
change that much in this branch.
Resolves: rhbz#1051159
---
fence/agents/vmware_soap/fence_vmware_soap.py | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fence/agents/vmware_soap/fence_vmware_soap.py b/fence/agents/vmware_soap/fence_vmware_soap.py
index 8f1ff6a..e6e62ac 100644
--- a/fence/agents/vmware_soap/fence_vmware_soap.py
+++ b/fence/agents/vmware_soap/fence_vmware_soap.py
@@ -172,7 +172,7 @@ def remove_tmp_dir(tmp_dir):
def main():
device_opt = [ "help", "version", "agent", "quiet", "verbose", "debug",
"action", "ipaddr", "login", "passwd", "passwd_script",
- "ssl", "port", "uuid", "separator", "ipport", "delay",
+ "ssl", "port", "uuid", "separator", "ipport",
"power_timeout", "shell_timeout", "login_timeout", "power_wait" ]
atexit.register(atexit_handler)
fence-agents: RHEL6 - fence_vmware_soap: Add delay option
by Marek Grác
Gitweb: http://git.fedorahosted.org/git/?p=fence-agents.git;a=commitdiff;h=7b86da...
Commit: 7b86da6001e65c9dc309bf4956f1bced9a894c5f
Parent: 1c392c0c6fa2bfd89a349d7d618f686d07e6fa0b
Author: Marek 'marx' Grac <mgrac(a)redhat.com>
AuthorDate: Thu Jan 23 19:48:37 2014 +0100
Committer: Marek 'marx' Grac <mgrac(a)redhat.com>
CommitterDate: Thu Jan 23 19:48:37 2014 +0100
fence_vmware_soap: Add delay option
Resolves: rhbz#1051159
---
fence/agents/vmware_soap/fence_vmware_soap.py | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fence/agents/vmware_soap/fence_vmware_soap.py b/fence/agents/vmware_soap/fence_vmware_soap.py
index d73e323..8f1ff6a 100644
--- a/fence/agents/vmware_soap/fence_vmware_soap.py
+++ b/fence/agents/vmware_soap/fence_vmware_soap.py
@@ -1,6 +1,6 @@
#!/usr/bin/python
-import sys, re, pexpect, exceptions
+import sys, re, pexpect, exceptions, time
import shutil, tempfile
sys.path.append("@FENCEAGENTSLIBDIR@")
fence-agents: RHEL6 - fence_vmware_soap: Add delay option
by Marek Grác
Gitweb: http://git.fedorahosted.org/git/?p=fence-agents.git;a=commitdiff;h=1c392c...
Commit: 1c392c0c6fa2bfd89a349d7d618f686d07e6fa0b
Parent: 9d23663ae93316eedb5c925f719a21ea74e9f59f
Author: Marek 'marx' Grac <mgrac(a)redhat.com>
AuthorDate: Thu Jan 23 19:14:45 2014 +0100
Committer: Marek 'marx' Grac <mgrac(a)redhat.com>
CommitterDate: Thu Jan 23 19:14:45 2014 +0100
fence_vmware_soap: Add delay option
Resolves: rhbz#1051159
---
fence/agents/vmware_soap/fence_vmware_soap.py | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/fence/agents/vmware_soap/fence_vmware_soap.py b/fence/agents/vmware_soap/fence_vmware_soap.py
index 6abf626..d73e323 100644
--- a/fence/agents/vmware_soap/fence_vmware_soap.py
+++ b/fence/agents/vmware_soap/fence_vmware_soap.py
@@ -16,6 +16,9 @@ BUILD_DATE="April, 2011"
#END_VERSION_GENERATION
def soap_login(options):
+ if options.has_key("-f"):
+ time.sleep(int(options["-f"]))
+
if options.has_key("-z"):
url = "https://"
else:
@@ -169,7 +172,7 @@ def remove_tmp_dir(tmp_dir):
def main():
device_opt = [ "help", "version", "agent", "quiet", "verbose", "debug",
"action", "ipaddr", "login", "passwd", "passwd_script",
- "ssl", "port", "uuid", "separator", "ipport",
+ "ssl", "port", "uuid", "separator", "ipport", "delay",
"power_timeout", "shell_timeout", "login_timeout", "power_wait" ]
atexit.register(atexit_handler)
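
Note: the delay handling added to soap_login() above can be sketched in isolation as follows. This is a minimal illustration, not the agent's code: the helper name `apply_delay` and the injectable `sleep` parameter are hypothetical, and only the "-f" option key and the `int()` conversion are taken from the diff. The patch itself uses the Python 2 idiom `options.has_key("-f")`; the sketch uses `in`, which works on both Python 2 and 3.

```python
import time


def apply_delay(options, sleep=time.sleep):
    """Wait for the number of seconds stored under "-f", if that key is set.

    `options` mimics the agent's option dictionary. `sleep` is injectable
    only so the sketch can be exercised without really waiting
    (hypothetical parameter, not part of the agent).
    Returns the number of seconds passed to `sleep`.
    """
    if "-f" in options:  # the patch uses options.has_key("-f")
        seconds = int(options["-f"])
        sleep(seconds)
        return seconds
    return 0


# Record the requested delays instead of sleeping.
waited = []
apply_delay({"-f": "3"}, sleep=waited.append)  # delay requested
apply_delay({}, sleep=waited.append)           # no delay option given
print(waited)
```

Because the wait happens at the start of the login step, it takes effect before the agent contacts the device at all.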
fence-agents: master - fence_kdump: Add vendor-url to metadata
by Marek Grác
Gitweb: http://git.fedorahosted.org/git/?p=fence-agents.git;a=commitdiff;h=849d0d...
Commit: 849d0dba262c2111446fb5a03040b22146c35726
Parent: cc04df682a343c6627c250cffc0f4d60383a7baa
Author: Marek 'marx' Grac <mgrac(a)redhat.com>
AuthorDate: Thu Jan 23 18:29:35 2014 +0100
Committer: Marek 'marx' Grac <mgrac(a)redhat.com>
CommitterDate: Thu Jan 23 18:29:35 2014 +0100
fence_kdump: Add vendor-url to metadata
Resolves: rhbz#1022529
---
fence/agents/kdump/fence_kdump.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/fence/agents/kdump/fence_kdump.c b/fence/agents/kdump/fence_kdump.c
index fa1f6a4..cae9842 100644
--- a/fence/agents/kdump/fence_kdump.c
+++ b/fence/agents/kdump/fence_kdump.c
@@ -178,6 +178,7 @@ do_action_metadata (const char *self)
fprintf (stdout, "<longdesc>");
fprintf (stdout, "The fence_kdump agent is intended to be used with with kdump service.");
fprintf (stdout, "</longdesc>\n");
+ fprintf (stdout, "<vendor-url>http://www.kernel.org/pub/linux/utils/kernel/kexec/</vendor-url>\n");
fprintf (stdout, "<parameters>\n");
fence-agents: master - fence_scsi: Change path to corosync from /sbin to /usr/sbin
by Marek Grác
Gitweb: http://git.fedorahosted.org/git/?p=fence-agents.git;a=commitdiff;h=cc04df...
Commit: cc04df682a343c6627c250cffc0f4d60383a7baa
Parent: 116512c174f4acef0faee4459158c45ddf6922d2
Author: Marek 'marx' Grac <mgrac(a)redhat.com>
AuthorDate: Thu Jan 23 17:32:25 2014 +0100
Committer: Marek 'marx' Grac <mgrac(a)redhat.com>
CommitterDate: Thu Jan 23 17:32:25 2014 +0100
fence_scsi: Change path to corosync from /sbin to /usr/sbin
/sbin is just a symlink to /usr/sbin, so this change does not affect functionality.
---
fence/agents/scsi/fence_scsi.pl | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fence/agents/scsi/fence_scsi.pl b/fence/agents/scsi/fence_scsi.pl
index 3ad0f09..6808ff5 100644
--- a/fence/agents/scsi/fence_scsi.pl
+++ b/fence/agents/scsi/fence_scsi.pl
@@ -429,7 +429,7 @@ sub get_node_id ($)
my $self = (caller(0))[3];
my $node = $_[0];
- my $cmd = "/sbin/corosync-cmapctl nodelist.";
+ my $cmd = "/usr/sbin/corosync-cmapctl nodelist.";
my @out = qx { $cmd 2> /dev/null };
my $err = ($?>>8);
@@ -454,7 +454,7 @@ sub get_cluster_id ()
my $self = (caller(0))[3];
my $cluster_id;
- my $cmd = "/sbin/corosync-cmapctl totem.cluster_name";
+ my $cmd = "/usr/sbin/corosync-cmapctl totem.cluster_name";
my $out = qx { $cmd 2> /dev/null };
my $err = ($?>>8);
fence-agents: master - fence_scsi: Replace automatic key generation to work with corosync clusters instead of cman
by Marek Grác
Gitweb: http://git.fedorahosted.org/git/?p=fence-agents.git;a=commitdiff;h=116512...
Commit: 116512c174f4acef0faee4459158c45ddf6922d2
Parent: 8b127ebff6a38b0c6dd9c2a1ad738e2d7637e0fa
Author: Marek 'marx' Grac <mgrac(a)redhat.com>
AuthorDate: Wed Jan 22 15:35:20 2014 +0100
Committer: Marek 'marx' Grac <mgrac(a)redhat.com>
CommitterDate: Wed Jan 22 15:35:20 2014 +0100
fence_scsi: Replace automatic key generation to work with corosync clusters instead of cman
Resolves: rhbz#994466
---
fence/agents/scsi/fence_scsi.pl | 38 ++++++++++++++++++++++----------------
1 files changed, 22 insertions(+), 16 deletions(-)
diff --git a/fence/agents/scsi/fence_scsi.pl b/fence/agents/scsi/fence_scsi.pl
index c959417..3ad0f09 100644
--- a/fence/agents/scsi/fence_scsi.pl
+++ b/fence/agents/scsi/fence_scsi.pl
@@ -5,6 +5,7 @@ use File::Basename;
use File::Path;
use Getopt::Std;
use POSIX;
+use B;
#BEGIN_VERSION_GENERATION
$RELEASE_VERSION="";
@@ -426,10 +427,10 @@ sub get_key ($)
sub get_node_id ($)
{
my $self = (caller(0))[3];
- my $node_id;
+ my $node = $_[0];
- my $cmd = "cman_tool nodes -n $_[0] -F id";
- my $out = qx { $cmd 2> /dev/null };
+ my $cmd = "/sbin/corosync-cmapctl nodelist.";
+ my @out = qx { $cmd 2> /dev/null };
my $err = ($?>>8);
if ($err != 0) {
@@ -438,11 +439,14 @@ sub get_node_id ($)
# die "[error]: $self\n" if ($?>>8);
- chomp ($out);
-
- $node_id = $out;
-
- return ($node_id);
+ foreach my $line (@out) {
+ chomp($line);
+ if ($line =~ /.(\d+?).ring._addr \(str\) = ${node}$/) {
+ return $1;
+ }
+ }
+
+ log_error("$self (unable to parse output of corosync-cmapctl or node does not exist)");
}
sub get_cluster_id ()
@@ -450,8 +454,8 @@ sub get_cluster_id ()
my $self = (caller(0))[3];
my $cluster_id;
- my $cmd = "cman_tool status";
- my @out = qx { $cmd 2> /dev/null };
+ my $cmd = "/sbin/corosync-cmapctl totem.cluster_name";
+ my $out = qx { $cmd 2> /dev/null };
my $err = ($?>>8);
if ($err != 0) {
@@ -460,12 +464,14 @@ sub get_cluster_id ()
# die "[error]: $self\n" if ($?>>8);
- foreach (@out) {
- chomp;
- my ($param, $value) = split (/\s*:\s*/, $_);
- if ($param =~ /^cluster\s+id/i) {
- $cluster_id = $value;
- }
+ chomp($out);
+
+ if ($out =~ /=\s(.*?)$/) {
+ my $cluster_name = $1;
+ # tranform string to a number
+ $cluster_id = (hex B::hash($cluster_name)) % 65536;
+ } else {
+ log_error("$self (unable to parse output of corosync-cmapctl)");
}
return ($cluster_id);
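
Note: the two parsing steps introduced by this patch can be sketched in Python as follows. This is an illustration under stated assumptions, not a port of the agent. The sample corosync-cmapctl output is made up for the example, the function names are hypothetical, and `zlib.crc32` is only a stand-in for Perl's `B::hash`, so the resulting cluster IDs will NOT match those produced by fence_scsi. What the sketch keeps from the diff is the shape of the two regular expressions and the reduction of the hash to a 16-bit value with `% 65536`.

```python
import re
import zlib

# Illustrative output of "corosync-cmapctl nodelist." (assumed sample data).
SAMPLE_NODELIST = """\
nodelist.node.0.nodeid (u32) = 1
nodelist.node.0.ring0_addr (str) = node1
nodelist.node.1.nodeid (u32) = 2
nodelist.node.1.ring0_addr (str) = node2
"""

# Illustrative output of "corosync-cmapctl totem.cluster_name" (assumed).
SAMPLE_CLUSTER_NAME = "totem.cluster_name (str) = mycluster\n"


def find_node_index(cmapctl_output, node):
    """Return the number captured from the ringX_addr line for `node`.

    Mirrors the regex in get_node_id(); the captured value is the index
    that appears in the nodelist key. Returns None if nothing matches.
    """
    pattern = re.compile(
        r"\.(\d+?)\.ring._addr \(str\) = " + re.escape(node) + r"$"
    )
    for line in cmapctl_output.splitlines():
        match = pattern.search(line)
        if match:
            return match.group(1)
    return None


def cluster_id_from_output(cmapctl_output):
    """Turn the cluster name into a number in the range 0..65535.

    zlib.crc32 is a stand-in for Perl's B::hash (assumption).
    """
    match = re.search(r"=\s(.*?)$", cmapctl_output.strip())
    if match is None:
        return None
    cluster_name = match.group(1)
    return zlib.crc32(cluster_name.encode("utf-8")) % 65536


print(find_node_index(SAMPLE_NODELIST, "node2"))
print(cluster_id_from_output(SAMPLE_CLUSTER_NAME))
```

Deriving the ID from the cluster name keeps it stable across nodes without cman, as long as every node uses the same hash function and the same name.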
fence-agents: master - fencing: Fabric fence agents should have default action "off"
by Marek Grác
Gitweb: http://git.fedorahosted.org/git/?p=fence-agents.git;a=commitdiff;h=8b127e...
Commit: 8b127ebff6a38b0c6dd9c2a1ad738e2d7637e0fa
Parent: 530e97f05e43bdd5bef9d24c75d4cc3057a491e8
Author: Marek 'marx' Grac <mgrac(a)redhat.com>
AuthorDate: Wed Jan 22 13:51:50 2014 +0100
Committer: Marek 'marx' Grac <mgrac(a)redhat.com>
CommitterDate: Wed Jan 22 13:51:50 2014 +0100
fencing: Fabric fence agents should have default action "off"
Previously, running a fence agent without -o XYZ performed a reboot. Fabric fence agents do not support the reboot action,
so the fence agent failed. This update fixes not only this issue but also the text shown by --help and in the manual pages.
Resolves: rhbz#1021392
---
fence/agents/lib/fencing.py.py | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/fence/agents/lib/fencing.py.py b/fence/agents/lib/fencing.py.py
index 9cc7407..889bb04 100644
--- a/fence/agents/lib/fencing.py.py
+++ b/fence/agents/lib/fencing.py.py
@@ -618,6 +618,10 @@ def check_input(device_opt, opt):
else:
all_opt["login"]["required"] = "0"
+ if device_opt.count("fabric_fencing"):
+ all_opt["action"]["default"] = "off"
+ all_opt["action"]["help"] = "-o, --action=[action] Action: status, off (default) or on"
+
## Set default values
#####
for opt in device_opt:
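
Note: the effect of the new "fabric_fencing" check can be sketched on a reduced option table. This is a simplified illustration, not the fencing library: `ALL_OPT` below contains only the "action" entry, its initial "reboot" default and help text are assumptions made for the example, and `apply_fabric_default` is a hypothetical helper name. The condition and the two assignments follow the diff.

```python
import copy

# Reduced stand-in for the library's option table (assumed initial values).
ALL_OPT = {
    "action": {
        "default": "reboot",
        "help": "-o, --action=[action] Action: status, reboot (default), off or on",
    }
}


def apply_fabric_default(device_opt, all_opt):
    """Return a copy of `all_opt` with the default action adjusted.

    Agents that list "fabric_fencing" among their device options get
    "off" as the default action, as in the patch above.
    """
    result = copy.deepcopy(all_opt)
    if device_opt.count("fabric_fencing"):
        result["action"]["default"] = "off"
        result["action"]["help"] = (
            "-o, --action=[action] Action: status, off (default) or on"
        )
    return result


fabric = apply_fabric_default(["ipaddr", "fabric_fencing"], ALL_OPT)
power = apply_fabric_default(["ipaddr", "port"], ALL_OPT)
print(fabric["action"]["default"], power["action"]["default"])
```

Setting the default before the "Set default values" loop means the adjusted value is picked up by the same code path that handles every other option.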
cluster: RHEL6 - dlm_controld: adjust fence time comparison
by David Teigland
Gitweb: http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=2d06dd478c2...
Commit: 2d06dd478c27bf864ba1a5ac0cbb1ba6c3ed947f
Parent: cca7cf733d03a58d94eb4ab3bee7dcc2e39b7ea1
Author: David Teigland <teigland(a)redhat.com>
AuthorDate: Fri Jan 10 16:01:35 2014 -0600
Committer: David Teigland <teigland(a)redhat.com>
CommitterDate: Fri Jan 10 16:01:35 2014 -0600
dlm_controld: adjust fence time comparison
An unusual combination of events can cause the fence
time comparison to not work properly, leaving
dlm_controld recovery stuck.

If fencing in fenced completes very quickly, and the
cpg callback into dlm_controld is delayed, the effect
is that the fence_time returned from fenced is later
than the fail_time recorded in the cpg callback.
dlm_controld requires that the fencing time is after
the fail time.

This is solved by saving the add_time when fail_time
is recorded as need_fence_after. The fencing check
is then changed to also succeed if fence_time is
later than need_fence_after. A simple comparison
with add_time does not work as shown in commit
4039bf4817a96b6aab20de948389f43b89ce4a8e.
bz 843160
Signed-off-by: David Teigland <teigland(a)redhat.com>
---
group/dlm_controld/cpg.c | 17 ++++++++++++++---
1 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/group/dlm_controld/cpg.c b/group/dlm_controld/cpg.c
index 6a4023b..795efc4 100644
--- a/group/dlm_controld/cpg.c
+++ b/group/dlm_controld/cpg.c
@@ -47,6 +47,7 @@ struct node {
uint64_t add_time;
uint64_t fail_time;
uint64_t fence_time; /* for debug */
+ uint64_t need_fence_after;
uint64_t cluster_add_time;
uint64_t cluster_remove_time;
uint32_t fence_queries; /* for debug */
@@ -502,6 +503,7 @@ static void node_history_fail(struct lockspace *ls, int nodeid,
node->fence_time = 0;
node->fence_queries = 0;
node->fail_time = time(NULL);
+ node->need_fence_after = node->add_time;
}
/* fenced will take care of making sure the quorum value
@@ -546,12 +548,20 @@ static int check_fencing_done(struct lockspace *ls)
we've seen fenced_time within the same second as
fail_time: with external fencing, e.g. fence_node */
- if (last_fenced_time >= node->fail_time) {
+ /* the comparison with need_fence_after is to deal with
+ the odd case where fencing completes very quickly in
+ fenced and there is a delay of the delivery of the cpg
+ callback (and setting fail_time) in dlm_controld,
+ placing the fail_time after the fence_time. */
+
+ if ((last_fenced_time >= node->fail_time) ||
+ (last_fenced_time > node->need_fence_after)) {
log_group(ls, "check_fencing %d done "
- "add %llu fail %llu last %llu",
+ "add %llu fail %llu need %llu last %llu",
node->nodeid,
(unsigned long long)node->add_time,
(unsigned long long)node->fail_time,
+ (unsigned long long)node->need_fence_after,
(unsigned long long)last_fenced_time);
node->check_fencing = 0;
node->add_time = 0;
@@ -560,10 +570,11 @@ static int check_fencing_done(struct lockspace *ls)
if (!node->fence_queries ||
node->fence_time != last_fenced_time) {
log_group(ls, "check_fencing %d wait "
- "add %llu fail %llu last %llu",
+ "add %llu fail %llu need %llu last %llu",
node->nodeid,
(unsigned long long)node->add_time,
(unsigned long long)node->fail_time,
+ (unsigned long long)node->need_fence_after,
(unsigned long long)last_fenced_time);
node->fence_queries++;
node->fence_time = last_fenced_time;
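
Note: the changed condition in check_fencing_done() can be tried out with plain integers. The sketch below is an illustration only: the function names and the timestamps are hypothetical, and the ordering of events (fencing finished at 105, cpg callback delivered at 106) is the scenario described in the code comment added by the patch. The two boolean expressions are taken from the old and new versions of the `if` statement.

```python
def fencing_done_old(last_fenced_time, fail_time):
    """Condition before the patch."""
    return last_fenced_time >= fail_time


def fencing_done_new(last_fenced_time, fail_time, need_fence_after):
    """Condition after the patch: also accept a fence time that is
    later than need_fence_after (the add_time saved at failure)."""
    return (last_fenced_time >= fail_time
            or last_fenced_time > need_fence_after)


# Hypothetical timeline, in seconds.
add_time = 100                # node was added
last_fenced_time = 105        # fenced finished fencing very quickly
fail_time = 106               # delayed cpg callback records the failure
need_fence_after = add_time   # saved in node_history_fail()

print(fencing_done_old(last_fenced_time, fail_time))
print(fencing_done_new(last_fenced_time, fail_time, need_fence_after))
```

With these values the old check never succeeds, which corresponds to the stuck recovery described in the commit message, while the new check accepts the fencing result.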
cluster: RHEL6 - gfs_controld: fix first recovery case 2
by David Teigland
Gitweb: http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=cca7cf733d0...
Commit: cca7cf733d03a58d94eb4ab3bee7dcc2e39b7ea1
Parent: 72619738e20e2627a7e5fc3268b003d33ce699b2
Author: David Teigland <teigland(a)redhat.com>
AuthorDate: Tue Jul 9 11:54:06 2013 -0500
Committer: David Teigland <teigland(a)redhat.com>
CommitterDate: Fri Jan 10 14:29:37 2014 -0600
gfs_controld: fix first recovery case 2
- node A is doing first recovery
- node B joins the mount group and is waiting for A to finish
- node A sets some journals X and Y as needing recovery based
  on start message from A (it's not clear how/why A has journals
  X,Y marked as needing recovery if it's doing first recovery.)
- node A completes first recovery and sends first recovery done
  message
- node B still has X,Y journals as needing recovery, which
  prevents the mount group recovery from completing

node B should clear the needs recovery state on any journals
when it receives first recovery done.
bz 982305
Signed-off-by: David Teigland <teigland(a)redhat.com>
---
group/gfs_controld/cpg-new.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/group/gfs_controld/cpg-new.c b/group/gfs_controld/cpg-new.c
index 8943f62..845d183 100644
--- a/group/gfs_controld/cpg-new.c
+++ b/group/gfs_controld/cpg-new.c
@@ -1544,12 +1544,20 @@ static void receive_recovery_result(struct mountgroup *mg,
static void receive_first_recovery_done(struct mountgroup *mg,
struct gfs_header *hd, int len)
{
+ struct journal *j;
int master = mg->first_recovery_master;
log_group(mg, "receive_first_recovery_done from %d master %d "
"mount_client_notified %d",
hd->nodeid, master, mg->mount_client_notified);
+ list_for_each_entry(j, &mg->journals, list) {
+ if (!j->needs_recovery)
+ continue;
+ j->needs_recovery = 0;
+ log_debug("receive_first_recovery_done clear %d needs_recovery", j->jid);
+ }
+
if (list_empty(&mg->changes)) {
/* everything is idle, no changes in progress */
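
Note: the loop added to receive_first_recovery_done() can be sketched outside the daemon as follows. This is a simplified model, not gfs_controld code: the `Journal` class and the `clear_needs_recovery` helper are hypothetical, and the journal IDs are made up. The skip-if-clean and clear-the-flag logic follows the `list_for_each_entry` loop in the diff.

```python
from dataclasses import dataclass


@dataclass
class Journal:
    """Minimal stand-in for the daemon's journal struct (assumption)."""
    jid: int
    needs_recovery: bool = False


def clear_needs_recovery(journals):
    """Clear the needs_recovery flag on every journal that has it set.

    Returns the jids that were cleared, in list order, standing in for
    the log_debug() calls in the patch.
    """
    cleared = []
    for journal in journals:
        if not journal.needs_recovery:
            continue
        journal.needs_recovery = False
        cleared.append(journal.jid)
    return cleared


# Node B's view when "first recovery done" arrives: journals 1 and 2
# (X and Y in the commit message) are still marked.
journals = [Journal(0), Journal(1, True), Journal(2, True)]
print(clear_needs_recovery(journals))
print(any(j.needs_recovery for j in journals))
```

Once no journal is marked, nothing in this model is left to block the mount group recovery, which is the outcome the commit message asks for.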
cluster: RHEL6 - gfs_controld: fix first recovery case
by David Teigland
Gitweb: http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=72619738e20...
Commit: 72619738e20e2627a7e5fc3268b003d33ce699b2
Parent: e3f8a987f0108b0f5c1c76e8750c35f23fca2191
Author: David Teigland <teigland(a)redhat.com>
AuthorDate: Tue Jul 9 11:35:25 2013 -0500
Committer: David Teigland <teigland(a)redhat.com>
CommitterDate: Fri Jan 10 14:28:38 2014 -0600
gfs_controld: fix first recovery case
- node A is doing first recovery
- node B joins the mount group and is waiting for A to finish
- node A sets some journals X and Y as needing recovery based
  on start message from A (it's not clear how/why A has journals
  X,Y marked as needing recovery if it's doing first recovery.)
- node A fails
- node B marks A's journal as needing recovery
- node B takes over doing first recovery
- node B successfully finishes first recovery
- node B still has X,Y,A journals as needing recovery, which
  prevents the mount group recovery from completing

First mount recovery allows the first mounter to recover all
journals without any other nodes present. This is meant to
guarantee that all journals are clean when first mount recovery
is done. So, after B completes first mount recovery it should
assume all journals are clean and it should clear any needs
recovery indication on journals.
bz 982305
Signed-off-by: David Teigland <teigland(a)redhat.com>
---
group/gfs_controld/cpg-new.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/group/gfs_controld/cpg-new.c b/group/gfs_controld/cpg-new.c
index 537624d..8943f62 100644
--- a/group/gfs_controld/cpg-new.c
+++ b/group/gfs_controld/cpg-new.c
@@ -2304,8 +2304,8 @@ void process_recovery_uevent(char *name, int jid, int recover_status,
to check below that we've seen uevents for all jids
during first recovery before sending first_recovery_done. */
- log_group(mg, "recovery_uevent jid %d first recovery done %d",
- jid, mg->first_done_uevent);
+ log_group(mg, "recovery_uevent jid %d status %d first recovery done %d",
+ jid, recover_status, mg->first_done_uevent);
/* ignore extraneous uevent from others_may_mount */
if (mg->first_done_uevent)
@@ -2323,6 +2323,14 @@ void process_recovery_uevent(char *name, int jid, int recover_status,
if (first_done) {
log_group(mg, "recovery_uevent first_done");
mg->first_done_uevent = 1;
+
+ list_for_each_entry(j, &mg->journals, list) {
+ if (!j->needs_recovery)
+ continue;
+ j->needs_recovery = 0;
+ log_debug("recovery_uevent first_done clear %d needs_recovery", j->jid);
+ }
+
send_first_recovery_done(mg);
}
}