-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
New patches are attached, comments are inline.
On 04/26/2010 05:54 PM, Martin Nagy wrote:
On Fri, 2010-04-23 at 11:57 +0200, Jakub Hrozek wrote:
[PATCH 1/2] Remove freed server_common entities from list We didn't hit this before as we never removed common entities. When using service requests, we remove the resolved fo_servers when we hit a timeout, so the server_common can be also removed.
Nack.
common = rc_alloc(mem_ctx, struct server_common); if (common == NULL) return NULL;
- talloc_set_destructor((TALLOC_CTX *) common,
destroy_server_common);
Please set the destructor later in the function, rc_alloc() does not return zeroed memory and common can be freed if talloc_strdup() fails.
done
BTW, this might be a good opportunity to remove all the silly common->something = NULL; statements and instead use a memset().
That sounds too dangerous on a rc_alloc-ed structure, it might also zero the reference counts, no? I've left these assignments explicitly
Another issue: Notice that in the destructor, if there really are some requests pending and we really free the server_common, this might cause the resolve_service_request_destructor() to be called later and try to remove the request from common->request_list. Attached is a small untested patch that might help, please squash it in with your patch. Additionally, change the destructor to not traverse the list, but instead print a warning if it is not empty (to notify us that we have a bug in the code).
I agree with the solution. This is very difficult to test (a SRV records collapsing at the same time another request to a server with the same name is being resolved), but I have included your patch in the revised version and ran some smoketests.
Lastly, a small nitpick: Could you please name the server_common destructor "server_common_destructor"? It would be obvious on a first glance that it is a destructor and shouldn't be called directly. This also seems to be the convention for most of the rest of the code.
OK, I was trying to keep up with the ${action}_server_common scheme but I agree that destructor is more descriptive
[PATCH 2/2] Support SRV servers in failover Adds a new failover API call fo_add_srv_server that allows the caller to specify a server that is later resolved into a list of specific servers using SRV requests.
Also adds a new failover option that specifies how often should the servers resolved from SRV query considered valid until we need a refresh.
The "real" servers to connect to are returned to the user as usual, using the fo_resolve_service_{send,recv} calls.
Nack.
In collapse_srv_lookup():
while (server->prev && server->prev->srv_data ==meta->srv_data) {
tmp = server->prev;DLIST_REMOVE(server->service->server_list, tmp);talloc_zfree(tmp);}while (server->next && server->next->srv_data ==meta->srv_data) {
tmp = server->prev; ^^^^DLIST_REMOVE(server->service->server_list, tmp);talloc_zfree(tmp);}Looks like a copy&paste error.
Yep, this was a bug.
Nitpic in fo_add_srv_server():
- server->port = 0;
Not really needed since we use talloc_zero().
done
In the same function, you allocate srv_data, domain, proto and srv from the service context, but I'd expect you to use server for the former and srv_data for the later three. Was this a mistake or am I missing something?
ok, allocating them on top of srv_data makes more sense.
In function fo_resolve_srv_done():
- for (reply = reply_list; reply; reply = reply->next) {
DLIST_FOR_EACH(server, state->service->server_list) {if (server->port == reply->port) {ret = EEXIST;break;}}if (ret == EEXIST) continue;You should reset ret to EOK before the DLIST_FOR_EACH() macro, otherwise if you mark one as EEXIST, the others will be marked that way as well.
nice catch, thanks
I see some space/tabs mixing in dlinklist.h, plus:
for (tmp = (list2); tmp->next; tmp = tmp->next)This will crash if list2 == NULL.
fixed
Another small nitpick: Would you mind not using the fo_* prefix for static functions? I wanted to keep it only for the public functions that will be used by other translation units.
done
Ah, yet another nitpick:
- The 'srv_retry_timeout' member specifies how log a SRV lookup ^^^ - typo
done
And lastly, since you added a substantial amount of code, please add you name to the list of authors at the top of the file and add this year :)
done
Thanks Martin
Thank you for the review!