Thanks Mark,

This example is a user password change made with kinit. The password was changed on freeipa02 but was not then replicated to the other servers. The same thing happens with other records, but I don't have examples of those at the moment.

As far as I'm aware, there is no fractional replication set up.

Freeipa01:

# dynamic-kepler, users, accounts, ipa.example.com
dn: uid=dynamic-kepler,cn=users,cn=accounts,dc=ipa,dc=example,dc=com
uid: dynamic-kepler
krbLastPwdChange: 20170608170011Z
krbPasswordExpiration: 20170608170011Z

Freeipa02:

# dynamic-kepler, users, accounts, ipa.example.com
dn: uid=dynamic-kepler,cn=users,cn=accounts,dc=ipa,dc=example,dc=com
uid: dynamic-kepler
krbLastPwdChange: 20170608170021Z
krbPasswordExpiration: 20170906170021Z

Freeipa03:

# dynamic-kepler, users, accounts, ipa.example.com
dn: uid=dynamic-kepler,cn=users,cn=accounts,dc=ipa,dc=example,dc=com
uid: dynamic-kepler
krbLastPwdChange: 20170608170011Z
krbPasswordExpiration: 20170608170011Z
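
To make the difference between the three entries easier to see, here is a minimal Python sketch that parses the krbLastPwdChange and krbPasswordExpiration values shown above and prints how long each password is valid for. The helper names (parse_generalized_time, password_lifetimes) are made up for illustration, and the parser assumes the UTC "Z" form of GeneralizedTime only.

```python
from datetime import datetime, timezone

def parse_generalized_time(value: str) -> datetime:
    """Parse an LDAP GeneralizedTime such as '20170608170021Z' (UTC 'Z' form only)."""
    return datetime.strptime(value, "%Y%m%d%H%M%SZ").replace(tzinfo=timezone.utc)

# (krbLastPwdChange, krbPasswordExpiration) copied from the three entries above.
entries = {
    "freeipa01": ("20170608170011Z", "20170608170011Z"),
    "freeipa02": ("20170608170021Z", "20170906170021Z"),
    "freeipa03": ("20170608170011Z", "20170608170011Z"),
}

def password_lifetimes(data):
    """Return {server: (last_change, expiration - last_change)}."""
    result = {}
    for server, (changed, expires) in data.items():
        last_change = parse_generalized_time(changed)
        lifetime = parse_generalized_time(expires) - last_change
        result[server] = (last_change, lifetime)
    return result

for server, (last_change, lifetime) in password_lifetimes(entries).items():
    print(f"{server}: changed {last_change.isoformat()}, valid for {lifetime}")
```

On freeipa02 the change is 10 seconds newer and the expiration is 90 days after it, while on freeipa01 and freeipa03 the expiration equals the change time, so those two still hold the earlier state of the entry.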

Errors on Freeipa02:

[08/Jun/2017:01:46:50.635529447 +0000] replica_generate_next_csn: opcsn=5938ac8b000500030000 <= basecsn=5938ac8b000500040000, adjusted opcsn=5938ac8b000600030000
[08/Jun/2017:12:16:46.497249649 +0000] replica_generate_next_csn: opcsn=5939402f000500030000 <= basecsn=5939402f000800040000, adjusted opcsn=5939402f000900030000
[08/Jun/2017:23:38:48.197750001 +0000] replica_generate_next_csn: opcsn=5939e009000100030000 <= basecsn=5939e009000f00040000, adjusted opcsn=5939e009001000030000
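
For reading these messages, it can help to split each CSN into its parts. The sketch below assumes the usual 389 Directory Server CSN layout of 20 hex digits: 8 for the timestamp (seconds since the epoch), 4 for the sequence number, 4 for the replica ID and 4 for the sub-sequence number. The Csn type and decode_csn function are hypothetical names for this example, not part of any FreeIPA or 389-ds tool.

```python
from datetime import datetime, timezone
from typing import NamedTuple

class Csn(NamedTuple):
    timestamp: datetime
    seq: int
    replica_id: int
    subseq: int

def decode_csn(csn: str) -> Csn:
    """Split a 20-hex-digit CSN into timestamp, sequence, replica ID and sub-sequence.

    Assumes the layout tttttttt ssss rrrr uuuu (all hexadecimal).
    """
    if len(csn) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(csn)}: {csn!r}")
    return Csn(
        timestamp=datetime.fromtimestamp(int(csn[0:8], 16), tz=timezone.utc),
        seq=int(csn[8:12], 16),
        replica_id=int(csn[12:16], 16),
        subseq=int(csn[16:20], 16),
    )

# CSNs taken from the first error line above.
for label, raw in [
    ("opcsn", "5938ac8b000500030000"),
    ("basecsn", "5938ac8b000500040000"),
    ("adjusted opcsn", "5938ac8b000600030000"),
]:
    print(f"{label}: {decode_csn(raw)}")
```

Under that assumption, the first line shows replica ID 3 generating a CSN with the same timestamp and sequence number as one already seen from replica ID 4, after which the sequence number is bumped from 5 to 6.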

The other nodes show no errors related to this.

Access logs:

Freeipa01:

[08/Jun/2017:01:46:50.635529447 +0000] replica_generate_next_csn: opcsn=5938ac8b000500030000 <= basecsn=5938ac8b000500040000, adjusted opcsn=5938ac8b000600030000
[08/Jun/2017:12:16:46.497249649 +0000] replica_generate_next_csn: opcsn=5939402f000500030000 <= basecsn=5939402f000800040000, adjusted opcsn=5939402f000900030000
[08/Jun/2017:23:38:48.197750001 +0000] replica_generate_next_csn: opcsn=5939e009000100030000 <= basecsn=5939e009000f00040000, adjusted opcsn=5939e009001000030000

Freeipa02:

Shows no logs "to" the other 2 nodes.

Freeipa03:

[08/Jun/2017:17:10:06.343697044 +0000] conn=9237 fd=70 slot=70 connection from 192.168.0.12 to 192.168.0.13
[08/Jun/2017:19:54:05.025713675 +0000] conn=9665 fd=70 slot=70 connection from 192.168.0.12 to 192.168.0.13
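
To check an access log for incoming connections from a particular server, something like the following Python sketch can count the "connection from" lines per source address. The regular expression is only based on the two lines above, and connections_by_source is a hypothetical helper name. Which hostname owns 192.168.0.12 is not stated in this thread, so treat the mapping of address to node as an assumption.

```python
import re
from collections import Counter

# Matches lines like:
# [08/Jun/2017:17:10:06.343697044 +0000] conn=9237 fd=70 slot=70 connection from A to B
CONN_RE = re.compile(
    r"^\[(?P<when>[^\]]+)\] conn=(?P<conn>\d+) .*"
    r"connection from (?P<src>\S+) to (?P<dst>\S+)"
)

def connections_by_source(lines):
    """Count 'connection from' entries per source address."""
    counts = Counter()
    for line in lines:
        match = CONN_RE.match(line)
        if match:
            counts[match.group("src")] += 1
    return counts

# The two freeipa03 lines quoted above; in practice, iterate over the access log file.
sample = [
    "[08/Jun/2017:17:10:06.343697044 +0000] conn=9237 fd=70 slot=70 connection from 192.168.0.12 to 192.168.0.13",
    "[08/Jun/2017:19:54:05.025713675 +0000] conn=9665 fd=70 slot=70 connection from 192.168.0.12 to 192.168.0.13",
]

for source, count in connections_by_source(sample).items():
    print(f"{source}: {count} connection(s)")
```

To run it against a real log, pass an open file object instead of the sample list, for example the access log under /var/log/dirsrv/slapd-INSTANCE/.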

Freeipa02 replication logging:

[09/Jun/2017:11:24:58.827281135 +0000] NSMMReplicationPlugin - csnplCommitALL: processing data csn 593964af000900030000

Repeats 800 - 900 times per second, each time with a different CSN.

Full logs attached.


On 08/06/17 15:45, Mark Reynolds wrote:


On 06/07/2017 10:58 AM, Nick Campion via FreeIPA-users wrote:

Hi all,

 

We have a 3 master setup that is failing to replicate changes from a particular node to the other IPA instances. The replication status says it's all fine, however the record hasn't been changed on the other servers. We've seen this on user password changes, adding hosts and services. The only thing we've found that seems to fix this temporarily is to re-initialize from the master with the changed record. A force-sync doesn't pick up the changed record.

What change are you making, and what attribute are you updating?  Could it be that it's being excluded by fractional replication?  Or is it all changes?

Are there any errors in the logs on the nodes (good and bad)?  See /var/log/dirsrv/slapd-INSTANCE/errors

Do you see replication sessions starting between the bad node and the good ones?  Are they talking?  Check the access log (/var/log/dirsrv/slapd-INSTANCE/access) on a good node and look for "connection from <BAD NODE IP address>"

Next would be to enable replication logging on the bad node and reproduce the problem (then disable repl logging right away), then send us the logs to look at.  See  https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/html/administration_guide/managing_replication-troubleshooting_replication_related_problems

Regards,
Mark

Not sure what logs would be helpful to diagnose what is happening in this setup. 

# ipa-replica-manage -v list `hostname`
freeipa03.mgmt.example.com: replica
last init status: None
last init ended: 1970-01-01 00:00:00+00:00
last update status: Error (0) Replica acquired successfully: Incremental update succeeded
last update ended: 2017-06-07 14:43:53+00:00
freeipa02.mgmt.example.com: replica
last init status: None
last init ended: 1970-01-01 00:00:00+00:00
last update status: Error (0) Replica acquired successfully: Incremental update succeeded
last update ended: 2017-06-07 14:43:53+00:00

# ldapsearch -W -x -D "cn=directory manager" -b "cn=users,cn=accounts,dc=ipa,dc=example,dc=com" "nsds5ReplConflict=*" \* nsds5ReplConflict
Enter LDAP Password:
# extended LDIF
#
# LDAPv3
# base <cn=users,cn=accounts,dc=ipa,dc=example,dc=com> with scope subtree
# filter: nsds5ReplConflict=*
# requesting: * nsds5ReplConflict
#

# search result
search: 2
result: 0 Success

# numResponses: 1

Any help with what else can be checked, or which logs would be useful, would be appreciated.

Thanks

Nick


