[Fedora-directory-users] Re: MMR: excessive clock skew

Gary Windham windhamg at email.arizona.edu
Mon May 26 22:45:21 UTC 2008


Sorry for not replying to the original thread, but I just joined this  
list.

On Tue, 13 May 2008, Rich Megginson wrote:

 > Has anyone seen these errors with 1.1?  We fixed a few 64-bit
 > issues in 1.1.

I am running two 32-bit FDS 1.1 (fedora-ds-1.1.0-3.fc6) servers on
RHEL 5.1 in an MMR configuration.  These servers, which sit behind a
load balancer, act as the University's central authentication service.
We are using the password policy plugin with the
"passwordisglobalpolicy" setting enabled, so there is a substantial
amount of write activity due to replication of password-policy-related
attributes (e.g., passwordRetryCount, retryCountResetTime, etc.).
Time on both systems is synchronized via NTP, and the clocks are in
sync.

We have the same situation as Reinhard Nappert reported on 5/13/2008:  
MMR will work fine for a while (usually a few weeks; the longest  
period we've gone is a month, the shortest time a few hours).   
Eventually replication will fail with the following sequence of  
messages in the errors log:

[24/May/2008:05:18:54 -0700] - csngen_adjust_time: adjustment limit exceeded; value - 86401, limit - 86400
[24/May/2008:05:18:54 -0700] NSMMReplicationPlugin - conn=1800 op=60262 replica="<suffix>": Unable to acquire replica: error: excessive clock skew
[24/May/2008:05:20:05 -0700] - csngen_adjust_time: adjustment limit exceeded; value - 86401, limit - 86400
[24/May/2008:05:20:05 -0700] NSMMReplicationPlugin - agmt="cn=kif2zapp" (zapp:389): Incremental protocol: fatal error - too much time skew between replicas!
[24/May/2008:05:20:05 -0700] NSMMReplicationPlugin - agmt="cn=kif2zapp" (zapp:389): Incremental update failed and requires administrator action

The "csngen_adjust_time" error message always reports the same value  
when this occurs (86401).
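
For anyone who wants to watch for this condition, here is a minimal
Python sketch that pulls the reported value and limit out of the
"csngen_adjust_time" lines.  The regular expression is an assumption
based only on the sample lines quoted above, and find_skew_errors is a
hypothetical helper name, not part of FDS.  Note that the limit of
86400 seconds is exactly one day, so the reported value is one second
past it.

```python
import re

# Assumed pattern, derived only from the log lines quoted in this message.
ADJUST_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+-\s+csngen_adjust_time: adjustment limit\s+"
    r"exceeded; value - (?P<value>\d+), limit - (?P<limit>\d+)"
)


def find_skew_errors(lines):
    """Return (timestamp, value, limit) for each csngen_adjust_time line."""
    hits = []
    for line in lines:
        m = ADJUST_RE.match(line)
        if m:
            hits.append(
                (m.group("ts"), int(m.group("value")), int(m.group("limit")))
            )
    return hits


# Two of the lines from the errors log above, each as a single line.
sample = [
    "[24/May/2008:05:18:54 -0700] - csngen_adjust_time: adjustment limit "
    "exceeded; value - 86401, limit - 86400",
    '[24/May/2008:05:18:54 -0700] NSMMReplicationPlugin - conn=1800 '
    'op=60262 replica="<suffix>": Unable to acquire replica: error: '
    "excessive clock skew",
]

for ts, value, limit in find_skew_errors(sample):
    # The overshoot is how far the value lies beyond the limit, in seconds.
    print(ts, "value", value, "limit", limit, "overshoot", value - limit)
```

In practice the function could be fed the lines of the errors log file
itself; the path to that file depends on the installation.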

We have also employed the workaround described by Chris St. Pierre in
https://bugzilla.redhat.com/show_bug.cgi?id=233642#c3.  This resolves
the problem for a short while, but it always
reappears.  BTW, I was in contact with Chris recently about his  
experiences with MMR and he said that, in addition to moving to FDS  
1.1, he moved a lot of "frequently updated" data out of FDS and into  
MySQL, and that his problem disappeared afterward; obviously this  
isn't a solution for us as we are utilizing FDS as an authentication  
engine.

We are desperately trying to find a solution to this issue that will  
allow us to continue using MMR...we could resort to a traditional  
passive/active + shared storage HA design, but we want to keep that as  
a last resort.  If there is any additional information I should  
provide, please let me know.

--
Gary Windham
Senior Enterprise Systems Architect
The University of Arizona, UITS
+1 520 626 5981



