On Wed, Jun 19, 2013 at 10:51:32AM +0200, Lukas Slebodnik wrote:
On (18/06/13 13:51), Pavel Březina wrote:
https://fedorahosted.org/sssd/ticket/1947
A fast explanation how _srv_ expansion works. _srv_ is inserted into server list as so called meta server. Let us consider following configuration:
*Setup* ipa_server = _srv_, ipa.pb server list will contain: meta -> ipa.pb
*Expansion* meta -> ipa.pb:389 -> ipa.pb ; remove meta ipa.pb:389 -> ipa.pb ; meta
*Collapse* remove ipa.pb:389 ; insert meta meta -> ipa.pb
The main problem is that expanded SRV servers are marked as NEUTRAL during online check, but they don't collapse back into a meta server.
This will trigger another SRV expansion, leaving the old server in the list and trying to add the servers again. This is present in both master and 1.9 (and probably older versions), although the result is slightly different.
In master, we don't insert a server into server list if it is already present. Because state->meta is orphaned from the previous SRV expansion, state->meta->next is NULL and SSSD crashes later.
I can confirm, tah patches fix ticket 1947 (crash).
In 1.9, we simply insert duplicate servers. Those servers are inserted after orphaned state->meta, state->meta is orphaned again, leaving those servers globally unreachable. However, it seemingly does not affect the fail over. You just run into d25e7c65.
Here are four patches for master, and two patches for 1.9.
I have just a question about code convention. I thought, that name of function parameter starting with "_" means output variable, but in function collapse_srv_lookup it is used as a in/out variable.
If some function is called with NULL in place of output parameter, it means that you do not want store output to this parameter. But function collapse_srv_lookup could not be called with NULL
static struct fo_server * -collapse_srv_lookup(struct fo_server *server) +collapse_srv_lookup(struct fo_server **_server) {
- struct fo_server *tmp, *meta;
- server = *_server;
^^^^^^^ This is reason, why function could not be called with NULL.Do we have any code conventions for in/out variable?
I don't think we do. Does the logic of collapse_srv_lookup mandate that the input and output are always the same? If not, can we simply add an input parameter that can't be NULL?