Author: rmeggins
Update of /cvs/dirsec/ldapserver/ldap/servers/plugins/replication
In directory
cvs-int.fedora.redhat.com:/tmp/cvs-serv24462/ldapserver/ldap/servers/plugins/replication
Modified Files:
repl5_plugins.c
Log Message:
Resolves: bug 283041
Bug Description: MMR: Directory updates on same object
Reviewed by: nhosoi (Thanks!)
Fix Description: The problem does appear to be concurrency. I think the original
intention of
the urp fixup code was that it should only be run inside the database lock, so
that the database could be restored to a consistent state before the next
operation was processed. However, this requires the database code to know when
the database is already locked, so that if e.g. a modrdn operation needs to
call an internal delete, the database should not be locked again. The flag
OP_FLAG_REPL_FIXUP is used to denote both that the operation is such an
internal operation, and that the database should not be locked again.
There are a couple of cases where these operations can be called from outside
of the database lock:
urp_fixup_rename_entry is called from multimaster_postop_modrdn and
multimaster_postop_delete, both of which are front end post op plugins, not
called from within the database lock. Same with urp_fixup_delete_entry and
urp_fixup_modify_entry. In other cases, such as urp_fixup_add_entry, and other
places where urp_fixup_rename_entry and urp_fixup_modify_entry are called, they
are called from a bepostop plugin function, which is called after the original
database operation has been processed, within the database lock. So the
solution appears to be to move the urp_* functions to the bepostop plugin
functions. One of these functions does an internal search -
urp_get_min_naming_conflict_entry - but it does not appear that search locks
the database, so there was nothing to be done to make it "reentrant".
Without this patch, I can crash the server in a matter of minutes (x86_64
rhel5) using the latest Fedora DS 1.1 code. With the patch, the server runs
for several hours (maybe longer, I had to stop the test).
Also, to really exercise the urp code, I added a rename operation between the
add and delete e.g.
add("ou=test");
rename("ou=test", "ou=test2");
delete("ou=test2");
The server still runs for several hours with no problems.
Platforms tested: RHEL5 x86_64
Flag Day: no
Doc impact: no
Index: repl5_plugins.c
===================================================================
RCS file: /cvs/dirsec/ldapserver/ldap/servers/plugins/replication/repl5_plugins.c,v
retrieving revision 1.7
retrieving revision 1.8
diff -u -r1.7 -r1.8
--- repl5_plugins.c 10 Nov 2006 23:45:17 -0000 1.7
+++ repl5_plugins.c 12 Sep 2007 00:59:53 -0000 1.8
@@ -789,12 +789,26 @@
int
multimaster_bepostop_modrdn (Slapi_PBlock *pb)
{
+ Slapi_Operation *op;
+
+ slapi_pblock_get(pb, SLAPI_OPERATION, &op);
+ if ( ! operation_is_flag_set (op, OP_FLAG_REPL_FIXUP) )
+ {
+ urp_post_modrdn_operation (pb);
+ }
return 0;
}
int
multimaster_bepostop_delete (Slapi_PBlock *pb)
{
+ Slapi_Operation *op;
+
+ slapi_pblock_get(pb, SLAPI_OPERATION, &op);
+ if ( ! operation_is_flag_set (op, OP_FLAG_REPL_FIXUP) )
+ {
+ urp_post_delete_operation (pb);
+ }
return 0;
}
@@ -814,16 +828,7 @@
int
multimaster_postop_delete (Slapi_PBlock *pb)
{
- int rc;
- Slapi_Operation *op;
-
- slapi_pblock_get(pb, SLAPI_OPERATION, &op);
- if ( ! operation_is_flag_set (op, OP_FLAG_REPL_FIXUP) )
- {
- urp_post_delete_operation (pb);
- }
- rc = process_postop(pb);
- return rc;
+ return process_postop(pb);
}
int
@@ -835,16 +840,7 @@
int
multimaster_postop_modrdn (Slapi_PBlock *pb)
{
- int rc;
- Slapi_Operation *op;
-
- slapi_pblock_get(pb, SLAPI_OPERATION, &op);
- if ( ! operation_is_flag_set (op, OP_FLAG_REPL_FIXUP) )
- {
- urp_post_modrdn_operation (pb);
- }
- rc = process_postop(pb);
- return rc;
+ return process_postop(pb);
}
Show replies by date