On Fri, 2010-04-23 at 11:57 +0200, Jakub Hrozek wrote:
[PATCH 1/2] Remove freed server_common entities from list We didn't hit this before as we never removed common entities. When using service requests, we remove the resolved fo_servers when we hit a timeout, so the server_common can be also removed.
Nack.
common = rc_alloc(mem_ctx, struct server_common); if (common == NULL) return NULL; + talloc_set_destructor((TALLOC_CTX *) common, destroy_server_common);
Please set the destructor later in the function, rc_alloc() does not return zeroed memory and common can be freed if talloc_strdup() fails. BTW, this might be a good opportunity to remove all the silly common->something = NULL; statements and instead use a memset().
Another issue: Notice that in the destructor, if there really are some requests pending and we really free the server_common, this might cause the resolve_service_request_destructor() to be called later and try to remove the request from common->request_list. Attached is a small untested patch that might help, please squash it in with your patch. Additionally, change the destructor to not traverse the list, but instead print a warning if it is not empty (to notify us that we have a bug in the code).
Lastly, a small nitpick: Could you please name the server_common destructor "server_common_destructor"? It would be obvious on a first glance that it is a destructor and shouldn't be called directly. This also seems to be the convention for most of the rest of the code.
[PATCH 2/2] Support SRV servers in failover Adds a new failover API call fo_add_srv_server that allows the caller to specify a server that is later resolved into a list of specific servers using SRV requests.
Also adds a new failover option that specifies how often should the servers resolved from SRV query considered valid until we need a refresh.
The "real" servers to connect to are returned to the user as usual, using the fo_resolve_service_{send,recv} calls.
Nack.
In collapse_srv_lookup(): + while (server->prev && server->prev->srv_data == meta->srv_data) { + tmp = server->prev; + DLIST_REMOVE(server->service->server_list, tmp); + talloc_zfree(tmp); + } + while (server->next && server->next->srv_data == meta->srv_data) { + tmp = server->prev; ^^^^ + DLIST_REMOVE(server->service->server_list, tmp); + talloc_zfree(tmp); + }
Looks like a copy&paste error.
Nitpic in fo_add_srv_server(): + server->port = 0; Not really needed since we use talloc_zero().
In the same function, you allocate srv_data, domain, proto and srv from the service context, but I'd expect you to use server for the former and srv_data for the later three. Was this a mistake or am I missing something?
In function fo_resolve_srv_done(): + for (reply = reply_list; reply; reply = reply->next) { + DLIST_FOR_EACH(server, state->service->server_list) { + if (server->port == reply->port) { + ret = EEXIST; + break; + } + } + if (ret == EEXIST) continue;
You should reset ret to EOK before the DLIST_FOR_EACH() macro, otherwise if you mark one as EEXIST, the others will be marked that way as well.
I see some space/tabs mixing in dlinklist.h, plus: + for (tmp = (list2); tmp->next; tmp = tmp->next) This will crash if list2 == NULL.
Another small nitpick: Would you mind not using the fo_* prefix for static functions? I wanted to keep it only for the public functions that will be used by other translation units.
Ah, yet another nitpick: + * The 'srv_retry_timeout' member specifies how log a SRV lookup ^^^ - typo
And lastly, since you added a substantial amount of code, please add you name to the list of authors at the top of the file and add this year :)
Thanks Martin