Gitweb:        http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commitdiff;h=bdf...
Commit:        bdf7691f84eac400323c84a490f0147ec851afdf
Parent:        c6fca5e49ad1c8a6e1bce1874d0688aab3ccac26
Author:        Jonathan Brassow <jbrassow@redhat.com>
AuthorDate:    Thu Apr 26 12:39:47 2012 -0500
Committer:     Jonathan Brassow <jbrassow@redhat.com>
CommitterDate: Thu Apr 26 12:39:47 2012 -0500
Fix bug in cmirror that caused incorrect status info to print on some nodes.
Here's the upstream commit message:
commit 172a9457bf8dcc1e5c3a607be2e8d1ac80ac619b
Author: Jonathan Earl Brassow <jbrassow@redhat.com>
Date:   Thu Apr 26 17:30:49 2012 +0000
Fix bug in cmirror that caused incorrect status info to print on some nodes.
Looking at the code in cmirrord/local.c, we can see the different request types being handled in different ways. Some information never changes, so it does not need to go around the cluster and can be short-circuited. For example, once the cluster mirror is in-sync, it is pointless to continue sending that query around the cluster. We can save network bandwidth and reply directly back to the kernel.

When it comes to status information, there are two types: 'TABLE' and 'INFO'. The 'TABLE' information never changes and belongs to the group of requests that can be safely short-circuited. The 'INFO' information can change, and will change if a device fails, so it must not be short-circuited. Yet that is exactly what was happening: the 'INFO' request was being short-circuited, so the failure condition was never reported to anyone other than the "server" that experienced it directly.
---
 cmirror/src/local.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/cmirror/src/local.c b/cmirror/src/local.c
index 3e6d74a..29184a0 100644
--- a/cmirror/src/local.c
+++ b/cmirror/src/local.c
@@ -209,7 +209,6 @@ static int do_local_work(void *data)
 	case DM_CLOG_DTR:
 	case DM_CLOG_IN_SYNC:
 	case DM_CLOG_GET_SYNC_COUNT:
-	case DM_CLOG_STATUS_INFO:
 	case DM_CLOG_STATUS_TABLE:
 	case DM_CLOG_PRESUSPEND:
 		/* We do not specify ourselves as server here */
@@ -245,6 +244,7 @@ static int do_local_work(void *data)
 	case DM_CLOG_MARK_REGION:
 	case DM_CLOG_GET_RESYNC_WORK:
 	case DM_CLOG_SET_REGION_SYNC:
+	case DM_CLOG_STATUS_INFO:
 	case DM_CLOG_IS_REMOTE_RECOVERING:
 	case DM_CLOG_POSTSUSPEND:
 		r = cluster_send(tfr);
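
For readers who want the effect of the diff above in isolation, here is a minimal, self-contained sketch of the routing decision after the fix. It is not the cmirror code: the `req_type` and `route` enums, `route_request()` and `check_routing()` are hypothetical names invented for illustration, the enum values are placeholders rather than the real DM_CLOG_* constants, and it assumes, following the commit message, that the first case group is answered locally while the second goes through cluster_send().

```c
#include <assert.h>

/* Hypothetical request types, named after the cases in the diff.
 * The numeric values are placeholders, not the real DM_CLOG_* values. */
enum req_type {
	REQ_DTR,
	REQ_IN_SYNC,
	REQ_GET_SYNC_COUNT,
	REQ_STATUS_TABLE,
	REQ_PRESUSPEND,
	REQ_MARK_REGION,
	REQ_GET_RESYNC_WORK,
	REQ_SET_REGION_SYNC,
	REQ_STATUS_INFO,
	REQ_IS_REMOTE_RECOVERING,
	REQ_POSTSUSPEND
};

/* Hypothetical result type: where a request gets answered. */
enum route {
	ROUTE_LOCAL,	/* short-circuited: answered without cluster traffic */
	ROUTE_CLUSTER	/* sent around the cluster */
};

/* Routing after the fix: STATUS_INFO sits in the cluster group because
 * its answer can change (for example, when a device fails). */
static enum route route_request(enum req_type type)
{
	switch (type) {
	case REQ_DTR:
	case REQ_IN_SYNC:
	case REQ_GET_SYNC_COUNT:
	case REQ_STATUS_TABLE:
	case REQ_PRESUSPEND:
		return ROUTE_LOCAL;
	case REQ_MARK_REGION:
	case REQ_GET_RESYNC_WORK:
	case REQ_SET_REGION_SYNC:
	case REQ_STATUS_INFO:
	case REQ_IS_REMOTE_RECOVERING:
	case REQ_POSTSUSPEND:
		return ROUTE_CLUSTER;
	}
	/* Unknown types take the conservative path in this sketch. */
	return ROUTE_CLUSTER;
}

/* Self-check for the two status request types discussed in the message. */
static void check_routing(void)
{
	assert(route_request(REQ_STATUS_TABLE) == ROUTE_LOCAL);
	assert(route_request(REQ_STATUS_INFO) == ROUTE_CLUSTER);
}
```

Calling check_routing() from a test harness confirms that the two status request types end up in different groups. Before the fix, the equivalent of REQ_STATUS_INFO would have returned ROUTE_LOCAL.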