[389-users] replication from 1.2.8.3 to 1.2.10.4

Rich Megginson rmeggins at redhat.com
Fri Jul 13 14:05:40 UTC 2012


On 07/13/2012 08:02 AM, Robert Viduya wrote:
>>> I've enabled the core dump stuff, but now I can't seem to get it to crash.  But I'm still getting the changelog messages in the error logs whenever I restart.  In addition, the hub server keeps running out of disk space.  I tracked it down to the access log filling up with MOD messages from replication.  It looks like changes are coming down from our 1.2.8 servers and being applied over and over again.  As an example, one of our entries was modified three times today, and on all our other machines I see the following in the access log file:
>>>
>>> # egrep 78b8cc871a3cda9f352580e797b270bc access
>>> [12/Jul/2012:11:00:59 -0400] conn=383671 op=3145 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:11:01:24 -0400] conn=383671 op=3153 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:11:01:38 -0400] conn=383671 op=3157 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>>
>>> But on the problematic hub server, I see:
>>>
>>> # egrep 78b8cc871a3cda9f352580e797b270bc access
>>> [12/Jul/2012:15:17:29 -0400] conn=2 op=58 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:17:29 -0400] conn=2 op=60 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:17:29 -0400] conn=2 op=61 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:42 -0400] conn=6 op=169 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:42 -0400] conn=6 op=171 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:42 -0400] conn=6 op=172 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:45 -0400] conn=3 op=170 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:45 -0400] conn=3 op=172 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:45 -0400] conn=3 op=173 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:51 -0400] conn=2 op=2234 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:51 -0400] conn=2 op=2236 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:51 -0400] conn=2 op=2237 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:55 -0400] conn=6 op=2233 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:55 -0400] conn=6 op=2235 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:55 -0400] conn=6 op=2236 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> [12/Jul/2012:15:24:57 -0400] conn=3 op=2234 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
>>> ...
>>>
>>> I truncated the output for brevity, but there's over 250 MODs to that one object.  It's as if the server isn't able to do the replication bookkeeping and is accepting changes over and over again.  Eventually the disk fills up.
>> Do you see error messages from the supplier suggesting that it is attempting to send the operation but failing and retrying?
> No, there's nothing in the error logs on the supplier side.
>
>> Do all of these operations have the same CSN?  The csn will be logged with the RESULT line for the operation.  Also, what is the err=? for the MOD operations?  err=0?  Some other code?
> Here's some sample out, again, limited for brevity.  Most of the RESULT lines don't have a CSN, just the first few.  All the err= codes are 0.  I've grepped out just the DN sample from my previous mail, again for brevity.  There's a lot more DNs being reported:
>
> [12/Jul/2012:15:17:29 -0400] conn=2 op=58 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:17:29 -0400] conn=2 op=58 RESULT err=0 tag=103 nentries=0 etime=0 csn=4fff2000000000330000
> [12/Jul/2012:15:17:29 -0400] conn=2 op=60 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:17:29 -0400] conn=2 op=60 RESULT err=0 tag=103 nentries=0 etime=0 csn=4fff200f000000330000
> [12/Jul/2012:15:17:29 -0400] conn=2 op=61 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:17:29 -0400] conn=2 op=61 RESULT err=0 tag=103 nentries=0 etime=0 csn=4fff2027000000330000
> [12/Jul/2012:15:24:42 -0400] conn=6 op=169 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:42 -0400] conn=6 op=169 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:42 -0400] conn=6 op=171 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:42 -0400] conn=6 op=171 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:42 -0400] conn=6 op=172 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:42 -0400] conn=6 op=172 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:45 -0400] conn=3 op=170 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:45 -0400] conn=3 op=170 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:40:34 -0400] conn=3 op=170 MOD dn="gtdirguid=64898416edc9887656a2f933ae48a113,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:40:34 -0400] conn=3 op=170 RESULT err=0 tag=103 nentries=0 etime=0 csn=4fff25b5000300330000
> [12/Jul/2012:15:24:45 -0400] conn=3 op=172 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:45 -0400] conn=3 op=172 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:40:34 -0400] conn=3 op=172 MOD dn="gtdirguid=e824607afc4eb02a105b633bcbf9e7c1,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:40:34 -0400] conn=3 op=172 RESULT err=0 tag=103 nentries=0 etime=0 csn=4fff25b6000100330000
> [12/Jul/2012:16:03:44 -0400] conn=3 op=172 EXT oid="2.16.840.1.113730.3.5.5" name="Netscape Replication End Session"
> [12/Jul/2012:16:03:44 -0400] conn=3 op=172 RESULT err=0 tag=120 nentries=0 etime=0
> [12/Jul/2012:15:24:45 -0400] conn=3 op=173 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:45 -0400] conn=3 op=173 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:40:34 -0400] conn=3 op=173 MOD dn="gtdirguid=427dd677597bb6143e227143e771b811,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:40:34 -0400] conn=3 op=173 RESULT err=0 tag=103 nentries=0 etime=0 csn=4fff25b6000200330000
> [12/Jul/2012:16:03:47 -0400] conn=3 op=173 EXT oid="2.16.840.1.113730.3.5.12" name="replication-multimaster-extop"
> [12/Jul/2012:16:03:47 -0400] conn=3 op=173 RESULT err=0 tag=120 nentries=0 etime=0
> [12/Jul/2012:15:24:51 -0400] conn=2 op=2234 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:51 -0400] conn=2 op=2234 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:51 -0400] conn=2 op=2236 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:51 -0400] conn=2 op=2236 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:51 -0400] conn=2 op=2237 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:51 -0400] conn=2 op=2237 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:55 -0400] conn=6 op=2233 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:55 -0400] conn=6 op=2233 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:55 -0400] conn=6 op=2235 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:55 -0400] conn=6 op=2235 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:55 -0400] conn=6 op=2236 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:55 -0400] conn=6 op=2236 RESULT err=0 tag=103 nentries=0 etime=0
> [12/Jul/2012:15:24:57 -0400] conn=3 op=2234 MOD dn="gtdirguid=78b8cc871a3cda9f352580e797b270bc,ou=accounts,ou=gtaccounts,ou=departments,dc=gted,dc=gatech,dc=edu"
> [12/Jul/2012:15:24:57 -0400] conn=3 op=2234 RESULT err=0 tag=103 nentries=0 etime=0
>
> The upgrade to 1.2.10.12 seems to have fixed the issue however, I'm not seeing these repeated entries anymore nor am I seeing changelog error messages when I restart the server.  I know you're all working on 1.2.11, but are there any major problems with 1.2.10.12 that's keeping it from being pushed to stable?

The only thing 1.2.10.12 needs is testers to give it positive karma 
("Works For Me") in 
https://admin.fedoraproject.org/updates/FEDORA-EPEL-2012-6265/389-ds-base-1.2.10.12-1.el5 
or whatever your platform is.

If you don't have a FAS account or don't want to do this, do I have your 
permission to provide your name and email to the update as a user for 
which the update is working?

> 1.2.10.4 definitely isn't working for us.
> --
> 389 users mailing list
> 389-users at lists.fedoraproject.org
> https://admin.fedoraproject.org/mailman/listinfo/389-users




More information about the 389-users mailing list