[389-users] Replication reinit skipping entries

Trey Dockendorf treydock at gmail.com
Mon Aug 10 22:31:36 UTC 2015


I ran verify-db.pl on both servers and got the same output [1].  Both
servers were running while I did this, so I'm unsure whether that can cause
false negatives; the message printed only mentions false positives.

I've had nsslapd-errorlog-level set to 8192 since I saw the warnings in my
first email, and those warnings haven't shown up since.  I have not done
a reinit since then either, as my scripts that now look for
inconsistencies in the directory show no problems.  Our directory is rather
static with an occasional addition of a new user.  The users missing from
ldap02 that prompted the reinit were not ones I had moved like the accounts
mentioned in my original logs.  Those users were created by an automated
system and are always left in ou=People.
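
For anyone wanting a similar consistency check, here is a minimal sketch
(not the actual scripts mentioned above) that compares the DNs found in two
LDIF exports, one per server.  It assumes the exports already exist as text;
the function names read_dns, normalize_dn and missing_on_replica are made up
for illustration, and the DN normalization is deliberately naive (lowercase
plus trimming spaces around commas), so treat it as a starting point only.

```python
import base64


def normalize_dn(dn):
    # Naive normalization (an assumption, not full LDAP DN matching):
    # lowercase everything and drop spaces around the commas.
    return ",".join(part.strip() for part in dn.lower().split(","))


def read_dns(ldif_text):
    """Return the set of normalized DNs found in a block of LDIF text."""
    # Unfold LDIF continuation lines: a line starting with a single
    # space continues the previous line.
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    dns = set()
    for line in lines:
        lowered = line.lower()
        if lowered.startswith("dn::"):
            # Base64-encoded DN
            value = base64.b64decode(line[4:].strip()).decode("utf-8")
        elif lowered.startswith("dn:"):
            value = line[3:].strip()
        else:
            continue
        dns.add(normalize_dn(value))
    return dns


def missing_on_replica(master_ldif, replica_ldif):
    """Return sorted DNs present in the master export but not the replica."""
    return sorted(read_dns(master_ldif) - read_dns(replica_ldif))
```

Calling missing_on_replica() with the text of an ldap01 export and an ldap02
export would list the entries that exist only on ldap01; swapping the
arguments gives the opposite direction.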

If newer versions available to EL6 appear to have solved this for others
then I'll plan on upgrading these systems in hopes of removing this issue
with MMR.

Thanks,
- Trey

[1]:
# /usr/lib64/dirsrv/slapd-ldap01/verify-db.pl
*****************************************************************
verify-db: This tool should only be run if recovery start fails
and the server is down.  If you run this tool while the server is
running, you may get false reports of corrupted files or other
false errors.
*****************************************************************
Verify log files in /var/lib/dirsrv/slapd-ldap01/db ... Good
Verify db files ... Good

# /usr/lib64/dirsrv/slapd-ldap02/verify-db.pl
*****************************************************************
verify-db: This tool should only be run if recovery start fails
and the server is down.  If you run this tool while the server is
running, you may get false reports of corrupted files or other
false errors.
*****************************************************************
Verify log files in /var/lib/dirsrv/slapd-ldap02/db ... Good
Verify db files ... Good
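
Besides verify-db, the other signal in this thread is the "Skipping entry
... which has no parent" warning in the errors log.  Below is a minimal
sketch that pulls the skipped DNs out of log text.  It is an assumption on my
part that the only format it needs to handle is the one quoted further down
in this thread; the name skipped_entries is hypothetical.  Because mail
clients wrap long log lines, the sketch first rejoins any line that does not
start with "[" onto the previous record, using a single space.

```python
import re

# Matches the bulk-import warning quoted in this thread and captures the DN.
SKIP_RE = re.compile(
    r'WARNING: Skipping entry\s+"([^"]+)"\s+which has no\s+parent'
)


def skipped_entries(log_text):
    """Return the DNs reported as skipped, in the order they appear."""
    records = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") or not records:
            # A new timestamped record begins here.
            records.append(line)
        else:
            # Wrapped continuation of the previous record.
            records[-1] += " " + line

    found = []
    for record in records:
        match = SKIP_RE.search(record)
        if match:
            found.append(match.group(1))
    return found
```

An empty result after a reinit would suggest that no entries were skipped
during that import, at least as far as this particular warning goes.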

On Mon, Aug 10, 2015 at 1:56 PM, Mark Reynolds <mareynol at redhat.com> wrote:

>
>
> On 08/10/2015 02:51 PM, German Parente wrote:
>
>> hi Trey,
>>
>> Not sure which bug it is. Perhaps someone else here can give details?
>> It could have come from the moment that entryrdn index has been created
>> but this was a very old version.
>>
>> For instance:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=729369
>>
>> Sincerely, I cannot say when the entryrdn index got corrupted.
>>
> You can try running verify-db.pl to see if it reports any problems. If it
> does say there are issues, you could try exporting (db2ldif -r) and
> importing (ldif2db) on the master to reindex the entire database, and then
> try reiniting the other replica.
>
> Mark
>
>> But what I could say is that our customers in recent versions are not
>> hitting this issue any more.
>>
>> Thanks and regards,
>>
>> German
>>
>>
>>
>>
>> ----- Original Message -----
>>
>>> From: "Trey Dockendorf" <treydock at gmail.com>
>>> To: "General discussion list for the 389 Directory server project." <
>>> 389-users at lists.fedoraproject.org>
>>> Sent: Monday, August 10, 2015 6:55:41 PM
>>> Subject: Re: [389-users] Replication reinit skipping entries
>>>
>>>
>>>
>>> German,
>>>
>>> Thanks for the response. Do you recall which version fixed this issue,
>>> or have a reference to a bug ticket? The latest EL6 RPM changelog
>>> doesn't show anything obviously related to this issue. I'm on
>>> 1.2.11.15-32.el6_5 and it appears the latest available is
>>> 1.2.11.15-60.el6. The 1.2.2 package is from EPEL; I'm not sure why it
>>> was installed, but it appears to only install a LICENSE file.
>>> Thanks
>>> - Trey
>>>
>>> Hi again Trey,
>>>
>>> Sorry, I haven't seen your logs. But the errors are identical to what I
>>> am describing.
>>>
>>> Version 389-ds-1.2.2-1.el6.noarch is rather old and I would advise, as a
>>> first action, updating to the current version of 389-ds-base.
>>>
>>> Regards,
>>>
>>> German.
>>>
>>> ----- Original Message -----
>>>
>>>> From: "German Parente" < gparente at redhat.com >
>>>> To: "General discussion list for the 389 Directory server project." <
>>>> 389-users at lists.fedoraproject.org >
>>>> Sent: Friday, August 7, 2015 8:22:27 PM
>>>> Subject: Re: [389-users] Replication reinit skipping entries
>>>>
>>>> Hi Trey,
>>>>
>>>> I have seen this issue twice in customer cases. There was a bug some
>>>> time ago which meant that, during on-line re-init, an entry was not sent
>>>> from the supplier side (because of corruption in entryrdn) and then, on
>>>> the consumer side, all the children of this entry were skipped.
>>>>
>>>> This is fixed in recent versions of 389-ds-base. All our customers
>>>> having this issue have worked around it by:
>>>>
>>>> - updating to the current version so that the issue will not happen any more.
>>>> - fixing the db by: export -r + off-line re-import in all the replicas.
>>>>
>>>> Are the errors you mention of this sort?
>>>>
>>>> [28/May/2015:10:38:12 -0300] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
>>>> [28/May/2015:10:38:16 -0300] - import xxxx: WARNING: Skipping entry "uid=13364081204,dc=somedc" which has no parent, ending at line 0 of file "(bulk import)"
>>>> [28/May/2015:10:38:16 -0300] - import xxxx: WARNING: bad entry: ID 7127
>>>> [28/May/2015:10:38:16 -0300] - import xxxx: WARNING: Skipping entry "uid=05722535249,dc=somedc" which has no parent, ending at line 0 of file "(bulk import)"
>>>> [28/May/2015:10:38:17 -0300] - import xxxx: WARNING: bad entry: ID 7242
>>>>
>>>> Regards,
>>>>
>>>> German.
>>>>
>>>>
>>>>
>>>> ----- Original Message -----
>>>>
>>>>> From: "Trey Dockendorf" < treydock at gmail.com >
>>>>> To: "General discussion list for the 389 Directory server project."
>>>>> < 389-users at lists.fedoraproject.org >
>>>>> Sent: Friday, August 7, 2015 7:51:05 PM
>>>>> Subject: [389-users] Replication reinit skipping entries
>>>>>
>>>>> I recently discovered my two 389DS servers in master-master replication
>>>>> had some inconsistencies. Initially the only differences were 3 users
>>>>> added to ldap01 that did not exist in ldap02. I re-initialized ldap02
>>>>> from ldap01 and now am seeing that 3 groups defined are being skipped [1].
>>>>>
>>>>> I read in another thread that someone else saw this when they moved an
>>>>> LDAP record from one location to another in the directory. I believe
>>>>> that may be what happened here as I know the SLURM user and group both
>>>>> used to exist in a different OU. I moved them to the "Service" OUs some
>>>>> months ago. What's odd is that this move did not cause the user records
>>>>> to be skipped, just the group records. The thread I saw regarding
>>>>> something similar appears to have been fixed in the 1.2.10 series. Is
>>>>> this some different bug?
>>>>>
>>>>> As a workaround and test of a fix I deleted the 'backupuser' LDAP group
>>>>> from ldap01 and added it back via an LDIF. I then reinitialized ldap02
>>>>> from ldap01 and that group now exists on ldap02, but I still get a
>>>>> warning [2]. The nsuniqueid in the warning is not the nsuniqueid of the
>>>>> newly created backupuser entry. Is there anything to be concerned about
>>>>> with this warning?
>>>>>
>>>>> These are the 389-ds packages installed on both ldap01 and ldap02:
>>>>>
>>>>> 389-admin-1.1.35-1.el6.x86_64
>>>>> 389-admin-console-1.1.8-1.el6.noarch
>>>>> 389-admin-console-doc-1.1.8-1.el6.noarch
>>>>> 389-adminutil-1.1.19-1.el6.x86_64
>>>>> 389-adminutil-devel-1.1.19-1.el6.x86_64
>>>>> 389-console-1.1.7-1.el6.noarch
>>>>> 389-ds-1.2.2-1.el6.noarch
>>>>> 389-ds-base-1.2.11.15-32.el6_5.x86_64
>>>>> 389-ds-base-devel-1.2.11.15-32.el6_5.x86_64
>>>>> 389-ds-base-libs-1.2.11.15-32.el6_5.x86_64
>>>>> 389-ds-console-1.2.6-1.el6.noarch
>>>>> 389-ds-console-doc-1.2.6-1.el6.noarch
>>>>> 389-dsgw-1.1.11-1.el6.x86_64
>>>>>
>>>>> Let me know what other information may be useful and if this is
>>>>> something I need to submit as a bug report.
>>>>>
>>>>> Thanks,
>>>>> - Trey
>>>>>
>>>>> [1]:
>>>>>
>>>>> [07/Aug/2015:12:35:20 -0500] NSMMReplicationPlugin - conn=353332 op=3 Relinquishing consumer connection extension
>>>>> [07/Aug/2015:12:35:20 -0500] - import userRoot: WARNING: Skipping entry "cn=slurm,ou=Service Groups,dc=brazos,dc=tamu,dc=edu" which has no parent, ending at line 0 of file "(bulk import)"
>>>>> [07/Aug/2015:12:35:20 -0500] - import userRoot: WARNING: Skipping entry "cn=rsv,ou=Service Groups,dc=brazos,dc=tamu,dc=edu" which has no parent, ending at line 0 of file "(bulk import)"
>>>>> [07/Aug/2015:12:35:20 -0500] - import userRoot: WARNING: bad entry: ID 20
>>>>> [07/Aug/2015:12:35:20 -0500] - import userRoot: WARNING: bad entry: ID 22
>>>>> [07/Aug/2015:12:35:21 -0500] - import userRoot: WARNING: Skipping entry "cn=backupuser,ou=Service Groups,dc=brazos,dc=tamu,dc=edu" which has no parent, ending at line 0 of file "(bulk import)"
>>>>> [07/Aug/2015:12:35:21 -0500] - import userRoot: WARNING: bad entry: ID 4102
>>>>> [07/Aug/2015:12:35:24 -0500] NSMMReplicationPlugin - conn=353332 op=4242 Acquired consumer connection extension
>>>>> [07/Aug/2015:12:35:24 -0500] - import userRoot: Workers finished; cleaning up...
>>>>> [07/Aug/2015:12:35:24 -0500] - import userRoot: Workers cleaned up.
>>>>> [07/Aug/2015:12:35:24 -0500] - import userRoot: Indexing complete. Post-processing...
>>>>> [07/Aug/2015:12:35:24 -0500] - import userRoot: Generating numSubordinates complete.
>>>>> [07/Aug/2015:12:35:24 -0500] - import userRoot: Flushing caches...
>>>>> [07/Aug/2015:12:35:24 -0500] - import userRoot: Closing files...
>>>>> [07/Aug/2015:12:35:24 -0500] - import userRoot: Import complete. Processed 4238 entries (3 were skipped) in 4 seconds. (1059.50 entries/sec)
>>>>>
>>>>> [2]:
>>>>> [07/Aug/2015:12:38:48 -0500] NSMMReplicationPlugin - conn=353340 op=3 Relinquishing consumer connection extension
>>>>> [07/Aug/2015:12:38:49 -0500] - import userRoot: WARNING: Skipping entry "cn=slurm,ou=Service Groups,dc=brazos,dc=tamu,dc=edu" which has no parent, ending at line 0 of file "(bulk import)"
>>>>> [07/Aug/2015:12:38:49 -0500] - import userRoot: WARNING: Skipping entry "cn=rsv,ou=Service Groups,dc=brazos,dc=tamu,dc=edu" which has no parent, ending at line 0 of file "(bulk import)"
>>>>> [07/Aug/2015:12:38:49 -0500] - import userRoot: WARNING: bad entry: ID 20
>>>>> [07/Aug/2015:12:38:49 -0500] - import userRoot: WARNING: bad entry: ID 22
>>>>> [07/Aug/2015:12:38:50 -0500] - import userRoot: WARNING: Skipping entry "nsuniqueid=15ed1e81-b6a411e3-9084dfca-5696e563,cn=backupuser,ou=Service Groups,dc=brazos,dc=tamu,dc=edu" which has no parent, ending at line 0 of file "(bulk import)"
>>>>> [07/Aug/2015:12:38:50 -0500] - import userRoot: WARNING: bad entry: ID 4102
>>>>> [07/Aug/2015:12:38:52 -0500] NSMMReplicationPlugin - conn=353340 op=4243 Acquired consumer connection extension
>>>>> [07/Aug/2015:12:38:52 -0500] - import userRoot: Workers finished; cleaning up...
>>>>> [07/Aug/2015:12:38:52 -0500] - import userRoot: Workers cleaned up.
>>>>> [07/Aug/2015:12:38:52 -0500] - import userRoot: Indexing complete. Post-processing...
>>>>> [07/Aug/2015:12:38:52 -0500] - import userRoot: Generating numSubordinates complete.
>>>>> [07/Aug/2015:12:38:52 -0500] - import userRoot: Flushing caches...
>>>>> [07/Aug/2015:12:38:52 -0500] - import userRoot: Closing files...
>>>>> [07/Aug/2015:12:38:53 -0500] - import userRoot: Import complete. Processed 4239 entries (3 were skipped) in 5 seconds. (847.80 entries/sec)
>>>>>
>>>>> --
>>>>> 389 users mailing list
>>>>> 389-users at lists.fedoraproject.org
>>>>> https://admin.fedoraproject.org/mailman/listinfo/389-users
>>>>>