Patch 1 - tools_util.h now provides signal_sssd function. It replaces send_sighup function from sss_debuglevel.c. It can be used to send any signal to monitor (used in patch 2).
Patch 2 - https://fedorahosted.org/sssd/ticket/1311
Patches are in attachment.
Michal
On 09/12/2012 02:39 PM, Michal Židek wrote:
Patch 1 - tools_util.h now provides signal_sssd function. It replaces send_sighup function from sss_debuglevel.c. It can be used to send any signal to monitor (used in patch 2).
Patch 2 - https://fedorahosted.org/sssd/ticket/1311
Patches are in attachment.
Michal
sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
I've found two over-80-chars lines. Sending both patches again.
Michal
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Wed 12 Sep 2012 09:12:08 AM EDT, Michal Židek wrote:
On 09/12/2012 02:39 PM, Michal Židek wrote:
Patch 1 - tools_util.h now provides signal_sssd function. It replaces send_sighup function from sss_debuglevel.c. It can be used to send any signal to monitor (used in patch 2).
Ack
Please rename nss_logrotate() to nss_hup(). It's not just rotating logs, so I'd rather it be less confusing.
Also: + struct nss_ctx *nctx = (struct nss_tx*)rctx->pvt_ctx;
you're casting to a typoed type.
I'd prefer to leave the review of the sss_mmap_cache_invalidate() routine to Simo, as he's more familiar with the code.
On 09/13/2012 07:26 PM, Stephen Gallagher wrote:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Wed 12 Sep 2012 09:12:08 AM EDT, Michal Židek wrote:
On 09/12/2012 02:39 PM, Michal Židek wrote:
Patch 1 - tools_util.h now provides signal_sssd function. It replaces send_sighup function from sss_debuglevel.c. It can be used to send any signal to monitor (used in patch 2).
Ack
Please rename nss_logrotate() to nss_hup(). It's not just rotating logs, so I'd rather it be less confusing.
Done.
Also:
- struct nss_ctx *nctx = (struct nss_tx*)rctx->pvt_ctx;
you're casting to a typoed type.
Good catch, however, I wonder how is it possible, that the code even worked. Some kind of black compilation magic.
I'd prefer to leave the review of the sss_mmap_cache_invalidate() routine to Simo, as he's more familiar with the code.
Yes. I mostly reused Simos code, so he is probably the right person to review it.
Sending both patches again.
Thanks Michal
On Wed, 2012-09-12 at 15:12 +0200, Michal Židek wrote:
On 09/12/2012 02:39 PM, Michal Židek wrote:
Patch 1 - tools_util.h now provides signal_sssd function. It replaces send_sighup function from sss_debuglevel.c. It can be used to send any signal to monitor (used in patch 2).
I have questions, nack until answered.
Comments inline:
From 17da45856fc0a723ae9d09431cb372c68a2ef07a Mon Sep 17 00:00:00 2001 From: Michal Zidek mzidek@redhat.com Date: Tue, 11 Sep 2012 18:44:52 +0200 Subject: [PATCH 2/2] sss_cache tool invalidates records in memory cache.
src/responder/nss/nsssrv.c | 37 +++++++++++- src/responder/nss/nsssrv_mmap_cache.c | 110 ++++++++++++++++++++++++++++++++++ src/responder/nss/nsssrv_mmap_cache.h | 2 + src/tools/sss_cache.c | 9 +++ 4 files changed, 157 insertions(+), 1 deletion(-)
diff --git a/src/responder/nss/nsssrv.c b/src/responder/nss/nsssrv.c index 64267e8..59f92c5 100644 --- a/src/responder/nss/nsssrv.c +++ b/src/responder/nss/nsssrv.c @@ -51,10 +51,13 @@ #define SHELL_REALLOC_INCREMENT 5 #define SHELL_REALLOC_MAX 50
+static int nss_logrotate(DBusMessage *message,
struct sbus_connection *conn);struct sbus_method monitor_nss_methods[] = { { MON_CLI_METHOD_PING, monitor_common_pong }, { MON_CLI_METHOD_RES_INIT, monitor_common_res_init },
- { MON_CLI_METHOD_ROTATE, responder_logrotate },
- { MON_CLI_METHOD_ROTATE, nss_logrotate }, { NULL, NULL }
};
@@ -66,6 +69,37 @@ struct sbus_interface monitor_nss_interface = { NULL };
+static int nss_logrotate(DBusMessage *message,
struct sbus_connection *conn)+{
- errno_t ret;
- struct resp_ctx *rctx =
talloc_get_type(sbus_conn_get_private_data(conn),
struct resp_ctx);- struct nss_ctx *nctx = (struct nss_tx*)rctx->pvt_ctx;
- ret = monitor_common_rotate_logs(rctx->cdb,
rctx->confdb_service_path);
- if (ret != EOK) return ret;
Main questions: the invalidation functions seem to be called unconditionally, what prevent cache invalidation durint a normal log rotation ? Shouldn't cache invalidation happen exclusively when sss_cache is asking to drop caches ?
- /* Invalidate memory caches if used */
- if (nctx->pwd_mc_ctx != NULL) {
you do not need to check here ^^ you already check inside sss_mmap_cache_validate if the input is null and return in that case.
ret = sss_mmap_cache_invalidate(nctx->pwd_mc_ctx);if (ret != EOK) {DEBUG(SSSDBG_CRIT_FAILURE,("passwd mmap cache invalidation failed\n"));}- }
- if (nctx->grp_mc_ctx != NULL) {
same as above
ret = sss_mmap_cache_invalidate(nctx->grp_mc_ctx);if (ret != EOK) {DEBUG(SSSDBG_CRIT_FAILURE,("group mmap cache invalidation failed\n"));}- }
- return monitor_common_pong(message, conn);
+}
static errno_t nss_get_etc_shells(TALLOC_CTX *mem_ctx, char ***_shells) { int i = 0; @@ -266,6 +300,7 @@ int nss_process_init(TALLOC_CTX *mem_ctx, struct sss_cmd_table *nss_cmds; struct be_conn *iter; struct nss_ctx *nctx;
- struct tevent_signal *tes; int memcache_timeout; int ret, max_retries; int hret;
diff --git a/src/responder/nss/nsssrv_mmap_cache.c b/src/responder/nss/nsssrv_mmap_cache.c index 07498a9..0a87754 100644 --- a/src/responder/nss/nsssrv_mmap_cache.c +++ b/src/responder/nss/nsssrv_mmap_cache.c @@ -58,6 +58,8 @@ struct sss_mc_ctx { uint32_t seed; /* pseudo-random seed to avoid collision attacks */ time_t valid_time_slot; /* maximum time the entry is valid in seconds */
- int n_elem; /* max number of entries stored in mmap
cache */
- void *mmap_base; /* base address of mmap */ size_t mmap_size; /* total size of mmap */
@@ -648,6 +650,9 @@ errno_t sss_mmap_cache_init(TALLOC_CTX *mem_ctx, const char *name, /* We can use MC_ALIGN64 for this */ n_elem = MC_ALIGN64(n_elem);
- /* store n_elem into context, so it can be used later for
invalidation */
- mc_ctx->n_elem = n_elem;
- /* hash table is double the size because it will store both
forward and * reverse keys (name/uid, name/gid, ..) */ mc_ctx->ht_size = MC_HT_SIZE(n_elem * 2); @@ -725,3 +730,108 @@ done: return ret; }
Nack:
I do not like the following function as it is largely a copy/paste of sss_mmap_cache_init. Why donp;t you simply pass a pointer to struct sss_mc_ctx *mc_ctx release it, and then simply call sss_mmap_cache_init() from scratch ?
Also n_elem is currently hardcoded, so saving it is not really useful, however I would also *really* not save it as on cache invalidation you will want to go and see if the configuration also changed and get the new number of elements. This is a perfect time to do that because you are killing the old file and creating a new one so changing size during this operation is actually a good idea (if needed).
Same for memcache_timeout parameters (which are already parametrized and can be fetched from CONFDB_NSS_CONF_ENTRY / CONFDB_MEMCACHE_TIMEOUT
+errno_t sss_mmap_cache_invalidate(struct sss_mc_ctx *mc_ctx) +{
- unsigned int rseed;
- int payload;
- errno_t ret;
- if (mc_ctx == NULL) {
DEBUG(SSSDBG_TRACE_FUNC,("Memory cache not used. Nothing to do here.\n"));return EOK;- }
- switch (mc_ctx->type) {
- case SSS_MC_PASSWD:
payload = SSS_AVG_PASSWD_PAYLOAD;break;- case SSS_MC_GROUP:
payload = SSS_AVG_GROUP_PAYLOAD;break;- default:
DEBUG(SSSDBG_CRIT_FAILURE, ("Memory cache corrupted.\n"));return EINVAL;- }
- /* close the old file descriptor */
- ret = close(mc_ctx->fd);
- if (ret != 0) {
DEBUG(SSSDBG_CRIT_FAILURE, ("Unable to close file descriptor:%d"
": %s", ret, strerror(ret)));return ret;- }
- mc_ctx->fd = -1;
- /* hash table is double the size because it will store both
forward and
* reverse keys (name/uid, name/gid, ..) */- mc_ctx->ht_size = MC_HT_SIZE(mc_ctx->n_elem * 2);
- mc_ctx->dt_size = MC_DT_SIZE(mc_ctx->n_elem, payload);
- mc_ctx->ft_size = MC_FT_SIZE(mc_ctx->n_elem);
- mc_ctx->mmap_size = MC_HEADER_SIZE +
MC_ALIGN64(mc_ctx->dt_size) +MC_ALIGN64(mc_ctx->ft_size) +MC_ALIGN64(mc_ctx->ht_size);- /* create a new file */
- ret = sss_mc_create_file(mc_ctx);
- if (ret) {
goto done;- }
- ret = ftruncate(mc_ctx->fd, mc_ctx->mmap_size);
- if (ret == -1) {
ret = errno;DEBUG(SSSDBG_CRIT_FAILURE, ("Failed to resize file %s: %d(%s)\n",
mc_ctx->file, ret,strerror(ret)));
goto done;- }
- /* map the file to memory */
- mc_ctx->mmap_base = mmap(NULL, mc_ctx->mmap_size,
PROT_READ | PROT_WRITE,MAP_SHARED, mc_ctx->fd, 0);- if (mc_ctx->mmap_base == MAP_FAILED) {
ret = errno;DEBUG(SSSDBG_CRIT_FAILURE, ("Failed to mmap file %s(%ld): %d(%s)\n",
mc_ctx->file, mc_ctx->mmap_size,ret, strerror(ret)));goto done;- }
- mc_ctx->data_table = MC_PTR_ADD(mc_ctx->mmap_base,
MC_HEADER_SIZE);
- mc_ctx->free_table = MC_PTR_ADD(mc_ctx->data_table,
MC_ALIGN64(mc_ctx->dt_size));- mc_ctx->hash_table = MC_PTR_ADD(mc_ctx->free_table,
MC_ALIGN64(mc_ctx->ft_size));- memset(mc_ctx->data_table, 0x00, mc_ctx->dt_size);
- memset(mc_ctx->free_table, 0x00, mc_ctx->ft_size);
- memset(mc_ctx->hash_table, 0xff, mc_ctx->ht_size);
- /* generate a pseudo-random seed.
* Needed to fend off dictionary based collision attacks */- rseed = time(NULL) * getpid();
- mc_ctx->seed = rand_r(&rseed);
- sss_mc_header_update(mc_ctx, SSS_MC_HEADER_ALIVE);
- ret = EOK;
+done:
- if (ret) {
if (mc_ctx->mmap_base && mc_ctx->mmap_base != MAP_FAILED) {munmap(mc_ctx->mmap_base, mc_ctx->mmap_size);}if (mc_ctx->fd != -1) {close(mc_ctx->fd);ret = unlink(mc_ctx->file);if (ret == -1) {DEBUG(SSSDBG_CRIT_FAILURE,("Failed to rm mmap file %s: %d(%s)\n",mc_ctx->file, ret, strerror(ret)));}}- }
- return ret;
+} diff --git a/src/responder/nss/nsssrv_mmap_cache.h b/src/responder/nss/nsssrv_mmap_cache.h index 81241b2..3a808e6 100644 --- a/src/responder/nss/nsssrv_mmap_cache.h +++ b/src/responder/nss/nsssrv_mmap_cache.h @@ -48,4 +48,6 @@ errno_t sss_mmap_cache_gr_store(struct sss_mc_ctx *mcc, gid_t gid, size_t memnum, char *membuf, size_t memsize);
+errno_t sss_mmap_cache_invalidate(struct sss_mc_ctx*);
#endif /* _NSSSRV_MMAP_CACHE_H_ */ diff --git a/src/tools/sss_cache.c b/src/tools/sss_cache.c index 950ff1c..63f31ad 100644 --- a/src/tools/sss_cache.c +++ b/src/tools/sss_cache.c @@ -141,6 +141,15 @@ int main(int argc, const char *argv[]) ERROR("No cache object matched the specified search\n"); ret = ENOENT; goto done;
} else {/*Local cache changed -> signal monitor to invalidatefastcache */
DEBUG(SSSDBG_TRACE_FUNC, ("Sending SIGHUP tomonitor.\n"));
ret = signal_sssd(SIGHUP);if (ret != EOK) {DEBUG(SSSDBG_CRIT_FAILURE,("Failed to send SIGHUP to monitor.\n"));goto done; }} }-- 1.7.11.2
On 09/13/2012 07:59 PM, Simo Sorce wrote:
Main questions: the invalidation functions seem to be called unconditionally, what prevent cache invalidation durint a normal log rotation ? Shouldn't cache invalidation happen exclusively when sss_cache is asking to drop caches ?
It would be nice to have these two operations separate, but I do not know how to do it if we are to use SIGHUP for this. The application does not know what is the origin of SIGHUP, so it has to do both operations. For now, I wrote FIXME comment to the code in the new patch.
Question: What do we plan to use to signal SSSD to reload configuration during runtime? Isn't it SIGHUP too? I am not sure now, but if it is SIGHUP, then we need to reinitialize memory caches every time SIGHUP is received by monitor (this is what the patch does right know). So maybe it is not that bad after all.
Nack:
I do not like the following function as it is largely a copy/paste of sss_mmap_cache_init. Why donp;t you simply pass a pointer to struct sss_mc_ctx *mc_ctx release it, and then simply call sss_mmap_cache_init() from scratch ?
Also n_elem is currently hardcoded, so saving it is not really useful, however I would also *really* not save it as on cache invalidation you will want to go and see if the configuration also changed and get the new number of elements. This is a perfect time to do that because you are killing the old file and creating a new one so changing size during this operation is actually a good idea (if needed).
Same for memcache_timeout parameters (which are already parametrized and can be fetched from CONFDB_NSS_CONF_ENTRY / CONFDB_MEMCACHE_TIMEOUT
I removed the old invalidate function and created new called sss_mmap_cache_reinit. This function is used for re-initialize already initialized memory cache. I think the name is more suitable, because it now allows to change entry timeout and cache size.
New patches are attached.
Thanks Michal
On Fri, 2012-09-14 at 17:16 +0200, Michal Židek wrote:
On 09/13/2012 07:59 PM, Simo Sorce wrote:
Main questions: the invalidation functions seem to be called unconditionally, what prevent cache invalidation durint a normal log rotation ? Shouldn't cache invalidation happen exclusively when sss_cache is asking to drop caches ?
It would be nice to have these two operations separate, but I do not know how to do it if we are to use SIGHUP for this. The application does not know what is the origin of SIGHUP, so it has to do both operations. For now, I wrote FIXME comment to the code in the new patch.
Before sending the SIGHUP touch a file in the same directory where the cache files are, and in sssd_nss stat and check if that file is present. If it is, remove it and clean the caches.
Question: What do we plan to use to signal SSSD to reload configuration during runtime? Isn't it SIGHUP too? I am not sure now, but if it is SIGHUP, then we need to reinitialize memory caches every time SIGHUP is received by monitor (this is what the patch does right know). So maybe it is not that bad after all.
You really *do not* want purge the caches just because the configuration was reloaded. That would be quite a waste.
Nack:
I do not like the following function as it is largely a copy/paste of sss_mmap_cache_init. Why donp;t you simply pass a pointer to struct sss_mc_ctx *mc_ctx release it, and then simply call sss_mmap_cache_init() from scratch ?
Also n_elem is currently hardcoded, so saving it is not really useful, however I would also *really* not save it as on cache invalidation you will want to go and see if the configuration also changed and get the new number of elements. This is a perfect time to do that because you are killing the old file and creating a new one so changing size during this operation is actually a good idea (if needed).
Same for memcache_timeout parameters (which are already parametrized and can be fetched from CONFDB_NSS_CONF_ENTRY / CONFDB_MEMCACHE_TIMEOUT
I removed the old invalidate function and created new called sss_mmap_cache_reinit. This function is used for re-initialize already initialized memory cache. I think the name is more suitable, because it now allows to change entry timeout and cache size.
New patches are attached.
Looks ok, but I do not like that you keep the n_elem variable.
Please create a macro until we get a tunable option.
Something like: #define SSS_MC_CACHE_ELEMENTS 50000
And just use that.
Simo.
On 09/14/2012 07:55 PM, Simo Sorce wrote:
On Fri, 2012-09-14 at 17:16 +0200, Michal Židek wrote:
On 09/13/2012 07:59 PM, Simo Sorce wrote:
Main questions: the invalidation functions seem to be called unconditionally, what prevent cache invalidation durint a normal log rotation ? Shouldn't cache invalidation happen exclusively when sss_cache is asking to drop caches ?
It would be nice to have these two operations separate, but I do not know how to do it if we are to use SIGHUP for this. The application does not know what is the origin of SIGHUP, so it has to do both operations. For now, I wrote FIXME comment to the code in the new patch.
Before sending the SIGHUP touch a file in the same directory where the cache files are, and in sssd_nss stat and check if that file is present. If it is, remove it and clean the caches.
Question: What do we plan to use to signal SSSD to reload configuration during runtime? Isn't it SIGHUP too? I am not sure now, but if it is SIGHUP, then we need to reinitialize memory caches every time SIGHUP is received by monitor (this is what the patch does right know). So maybe it is not that bad after all.
You really *do not* want purge the caches just because the configuration was reloaded. That would be quite a waste.
Nack:
I do not like the following function as it is largely a copy/paste of sss_mmap_cache_init. Why donp;t you simply pass a pointer to struct sss_mc_ctx *mc_ctx release it, and then simply call sss_mmap_cache_init() from scratch ?
Also n_elem is currently hardcoded, so saving it is not really useful, however I would also *really* not save it as on cache invalidation you will want to go and see if the configuration also changed and get the new number of elements. This is a perfect time to do that because you are killing the old file and creating a new one so changing size during this operation is actually a good idea (if needed).
Same for memcache_timeout parameters (which are already parametrized and can be fetched from CONFDB_NSS_CONF_ENTRY / CONFDB_MEMCACHE_TIMEOUT
I removed the old invalidate function and created new called sss_mmap_cache_reinit. This function is used for re-initialize already initialized memory cache. I think the name is more suitable, because it now allows to change entry timeout and cache size.
New patches are attached.
Looks ok, but I do not like that you keep the n_elem variable.
Please create a macro until we get a tunable option.
Something like: #define SSS_MC_CACHE_ELEMENTS 50000
And just use that.
Simo.
Ok. I changed it all a bit. The mechanism is now as follows: 1. sss_cache creates a file, that serves as flag with the meaning "It was me, who sended SIGHUP". 2. monitor receives sighup and checks if the flag (file) exist - exists: signal (dbus signal) nss to clear memory cache - not exists: rotate logs and signal other services to rotate logs too
To achieve this the te_server_hup is no longer used by monitor, all what monitor uses to handle SIGHUP is monitor_hup. Other services still remain using te_server_hup.
With the file flag, there is no need for nss_hup function. NSS will remain using the old te_server_hup function and new nss_clear_memcache function is introduced to handle the caches.
The flag (file) is removed in the sss_mmap_cache_init function. The reason for not placing it to the monitor_hup handler was to avoid this scenario: 1. Someone by mistake calls sss_cache tool when sssd is not running 2. The flag (file) is created 3. after that, sssd is turned on 4. someone sends SIGHUP to monitor to rotate logs 5. monitor sees the flag (file) 6. sssd clears memory cache and does not rotate anything 7. someone thinks how good he cleared the caches (?!NOoo!?)
When unlink is moved to init function, this will never happen (reinit calls init, so it should be OK).
New patches are in attachment.
Thanks Michal
On 09/17/2012 03:11 PM, Michal Židek wrote:
On 09/14/2012 07:55 PM, Simo Sorce wrote:
On Fri, 2012-09-14 at 17:16 +0200, Michal Židek wrote:
On 09/13/2012 07:59 PM, Simo Sorce wrote:
Main questions: the invalidation functions seem to be called unconditionally, what prevent cache invalidation durint a normal log rotation ? Shouldn't cache invalidation happen exclusively when sss_cache is asking to drop caches ?
It would be nice to have these two operations separate, but I do not know how to do it if we are to use SIGHUP for this. The application does not know what is the origin of SIGHUP, so it has to do both operations. For now, I wrote FIXME comment to the code in the new patch.
Before sending the SIGHUP touch a file in the same directory where the cache files are, and in sssd_nss stat and check if that file is present. If it is, remove it and clean the caches.
Question: What do we plan to use to signal SSSD to reload configuration during runtime? Isn't it SIGHUP too? I am not sure now, but if it is SIGHUP, then we need to reinitialize memory caches every time SIGHUP is received by monitor (this is what the patch does right know). So maybe it is not that bad after all.
You really *do not* want purge the caches just because the configuration was reloaded. That would be quite a waste.
Nack:
I do not like the following function as it is largely a copy/paste of sss_mmap_cache_init. Why donp;t you simply pass a pointer to struct sss_mc_ctx *mc_ctx release it, and then simply call sss_mmap_cache_init() from scratch ?
Also n_elem is currently hardcoded, so saving it is not really useful, however I would also *really* not save it as on cache invalidation you will want to go and see if the configuration also changed and get the new number of elements. This is a perfect time to do that because you are killing the old file and creating a new one so changing size during this operation is actually a good idea (if needed).
Same for memcache_timeout parameters (which are already parametrized and can be fetched from CONFDB_NSS_CONF_ENTRY / CONFDB_MEMCACHE_TIMEOUT
I removed the old invalidate function and created new called sss_mmap_cache_reinit. This function is used for re-initialize already initialized memory cache. I think the name is more suitable, because it now allows to change entry timeout and cache size.
New patches are attached.
Looks ok, but I do not like that you keep the n_elem variable.
Please create a macro until we get a tunable option.
Something like: #define SSS_MC_CACHE_ELEMENTS 50000
And just use that.
Simo.
Ok. I changed it all a bit. The mechanism is now as follows:
- sss_cache creates a file, that serves as flag with the meaning
"It was me, who sended SIGHUP". 2. monitor receives sighup and checks if the flag (file) exist - exists: signal (dbus signal) nss to clear memory cache - not exists: rotate logs and signal other services to rotate logs too
To achieve this the te_server_hup is no longer used by monitor, all what monitor uses to handle SIGHUP is monitor_hup. Other services still remain using te_server_hup.
With the file flag, there is no need for nss_hup function. NSS will remain using the old te_server_hup function and new nss_clear_memcache function is introduced to handle the caches.
The flag (file) is removed in the sss_mmap_cache_init function. The reason for not placing it to the monitor_hup handler was to avoid this scenario:
- Someone by mistake calls sss_cache tool when sssd is not running
- The flag (file) is created
- after that, sssd is turned on
- someone sends SIGHUP to monitor to rotate logs
- monitor sees the flag (file)
- sssd clears memory cache and does not rotate anything
- someone thinks how good he cleared the caches (?!NOoo!?)
It should have been: 7. someone thinks how good he rotated logs
When unlink is moved to init function, this will never happen (reinit calls init, so it should be OK).
New patches are in attachment.
Thanks Michal
sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
----- Original Message -----
Ok. I changed it all a bit. The mechanism is now as follows:
- sss_cache creates a file, that serves as flag with the meaning
"It was me, who sended SIGHUP". 2. monitor receives sighup and checks if the flag (file) exist - exists: signal (dbus signal) nss to clear memory cache - not exists: rotate logs and signal other services to rotate logs too
To achieve this the te_server_hup is no longer used by monitor, all what monitor uses to handle SIGHUP is monitor_hup. Other services still remain using te_server_hup.
With the file flag, there is no need for nss_hup function. NSS will remain using the old te_server_hup function and new nss_clear_memcache function is introduced to handle the caches.
The flag (file) is removed in the sss_mmap_cache_init function. The reason for not placing it to the monitor_hup handler was to avoid this scenario:
- Someone by mistake calls sss_cache tool when sssd is not running
- The flag (file) is created
- after that, sssd is turned on
- someone sends SIGHUP to monitor to rotate logs
- monitor sees the flag (file)
- sssd clears memory cache and does not rotate anything
- someone thinks how good he cleared the caches (?!NOoo!?)
When unlink is moved to init function, this will never happen (reinit calls init, so it should be OK).
New patches are in attachment.
Hi Michal, you bring up a good point about the race with startup, but I am not totally convinced about the approach you used to address it.
The race you point out could be easily resolved by having the init code unlink() the reset file w/o acting on it. That would address the problem of cleaning up stale file.
Your solution though has another race, if someone runs sss_cache and at the same time the system wants to rotate logs you may get the monitor to ignore the legit SIGHUP and only clean the cache but not rotate the logs.
This is why I planned to have the check in sssd_nss and always do log rotation (monitor always sends the SIGHUP around).
The other 'problem' with the current code is that you are 'leaking' information specific to the sssd_nss backend into the monitor code. This is not really a major issue, but I prefer to avoid these 'shortcuts' and keep the monitor dumber wrt frontend or provider specific features.
Other nitpicks: in the current code if you get a fatal error do not do anything if the stat fails with an error different from ENOENT. I think the SIGHUP should always be sent (as I said above) and the failure should only be logged, not be fatal (of course this is not a problem if we move back checks into sssd_nss).
Simo.
On 09/17/2012 06:02 PM, Simo Sorce wrote:
Hi Michal, you bring up a good point about the race with startup, but I am not totally convinced about the approach you used to address it.
The race you point out could be easily resolved by having the init code unlink() the reset file w/o acting on it. That would address the problem of cleaning up stale file.
unlink() is added to the nss_process_init in the new patch, to avoid unnecessary cache cleaning when not needed.
Your solution though has another race, if someone runs sss_cache and at the same time the system wants to rotate logs you may get the monitor to ignore the legit SIGHUP and only clean the cache but not rotate the logs.
This is why I planned to have the check in sssd_nss and always do log rotation (monitor always sends the SIGHUP around).
You are right. I changed it to always do log rotation.
Btw. monitor does not send SIGHUP to other process. Monitor receives SIGHUP and tells other services what to do (via D-Bus). Right now, 'rotate logs' and 'clear memory cache'(NSS only) messages are sent from monitor_hup function. I've changed the comment in the code to make it more obvious.
The other 'problem' with the current code is that you are 'leaking' information specific to the sssd_nss backend into the monitor code. This is not really a major issue, but I prefer to avoid these 'shortcuts' and keep the monitor dumber wrt frontend or provider specific features.
Everything I considered unrelated to monitor is removed in the new patch, but it still needs to know the NSS process name (nsssrv.h), to be able to send it the request for cache clean up. It would not need it if we had a single NSS specific function to handle the logrotate request, which would do both logrotation and memcache clean up (I had it this way before and I did not like it myself). I thing if monitor knows that on SIGHUP (sent to monitor) all processes rotate logs, it should also know that NSS needs to clear caches.
Other nitpicks: in the current code if you get a fatal error do not do anything if the stat fails with an error different from ENOENT. I think the SIGHUP should always be sent (as I said above) and the failure should only be logged, not be fatal (of course this is not a problem if we move back checks into sssd_nss).
I made the first possible unlink() error (the one that can happen in nss_process_init) not fatal. I think sssd should at least try to continue even if the caches might be removed on next log rotation request.
New patches are in attachment.
Thanks Michal
On Tue, Sep 18, 2012 at 01:26:20PM +0200, Michal Židek wrote:
On 09/17/2012 06:02 PM, Simo Sorce wrote:
Hi Michal, you bring up a good point about the race with startup, but I am not totally convinced about the approach you used to address it.
The race you point out could be easily resolved by having the init code unlink() the reset file w/o acting on it. That would address the problem of cleaning up stale file.
unlink() is added to the nss_process_init in the new patch, to avoid unnecessary cache cleaning when not needed.
Your solution though has another race, if someone runs sss_cache and at the same time the system wants to rotate logs you may get the monitor to ignore the legit SIGHUP and only clean the cache but not rotate the logs.
This is why I planned to have the check in sssd_nss and always do log rotation (monitor always sends the SIGHUP around).
You are right. I changed it to always do log rotation.
Btw. monitor does not send SIGHUP to other process. Monitor receives SIGHUP and tells other services what to do (via D-Bus). Right now, 'rotate logs' and 'clear memory cache'(NSS only) messages are sent from monitor_hup function. I've changed the comment in the code to make it more obvious.
The other 'problem' with the current code is that you are 'leaking' information specific to the sssd_nss backend into the monitor code. This is not really a major issue, but I prefer to avoid these 'shortcuts' and keep the monitor dumber wrt frontend or provider specific features.
Everything I considered unrelated to monitor is removed in the new patch, but it still needs to know the NSS process name (nsssrv.h), to be able to send it the request for cache clean up. It would not need it if we had a single NSS specific function to handle the logrotate request, which would do both logrotation and memcache clean up (I had it this way before and I did not like it myself). I thing if monitor knows that on SIGHUP (sent to monitor) all processes rotate logs, it should also know that NSS needs to clear caches.
I think "leaking" the service name is OK or perhaps the lesser of all evils. I remember we even had a check in the past that would test if the list of configured services contains "nss" and "pam".
Other nitpicks: in the current code if you get a fatal error do not do anything if the stat fails with an error different from ENOENT. I think the SIGHUP should always be sent (as I said above) and the failure should only be logged, not be fatal (of course this is not a problem if we move back checks into sssd_nss).
I made the first possible unlink() error (the one that can happen in nss_process_init) not fatal. I think sssd should at least try to continue even if the caches might be removed on next log rotation request.
New patches are in attachment.
Thanks Michal
The code works as advertized.
I only have some minor nitpicks style-wise:
+static int nss_clear_memcache(DBusMessage *message, + struct sbus_connection *conn)
Please align the line after the linebreak.
+ struct nss_ctx *nctx = (struct nss_ctx*)rctx->pvt_ctx; (and elsewhere) Please put whitespace after the cast.
sss_mmap_cache_reinit: We have a convention to call our "local" talloc contexts tmp_ctx. We usually also allocate them on NULL so that a run with valgrind would immediatelly reveal them as leaks[*]
I'm not sure about this check: + ret = talloc_free(*mc_ctx); Is there actually a descructor? Wouldn't we want to continue and try to init the cache again instead of just failing?
Why the whitespace change in server_setup() ?
Otherwise I'm fine with the patches. Good work, Michal!
On Thu, Sep 20, 2012 at 07:12:40PM +0200, Jakub Hrozek wrote:
The code works as advertized.
I only have some minor nitpicks style-wise:
+static int nss_clear_memcache(DBusMessage *message,
struct sbus_connection *conn)Please align the line after the linebreak.
- struct nss_ctx *nctx = (struct nss_ctx*)rctx->pvt_ctx;
(and elsewhere) Please put whitespace after the cast.
sss_mmap_cache_reinit: We have a convention to call our "local" talloc contexts tmp_ctx. We usually also allocate them on NULL so that a run with valgrind would immediatelly reveal them as leaks[*]
Sorry I forgot to add the point I intended to add: [*] I'm actually thinking that for async tevent requests it might be safe to use "state" as the temp context. Unless they are long-running and especially recursive such as group processing to avoid growing the context.
On 09/20/2012 07:12 PM, Jakub Hrozek wrote:
On Tue, Sep 18, 2012 at 01:26:20PM +0200, Michal Židek wrote:
On 09/17/2012 06:02 PM, Simo Sorce wrote:
Hi Michal, you bring up a good point about the race with startup, but I am not totally convinced about the approach you used to address it.
The race you point out could be easily resolved by having the init code unlink() the reset file w/o acting on it. That would address the problem of cleaning up stale file.
unlink() is added to the nss_process_init in the new patch, to avoid unnecessary cache cleaning when not needed.
Your solution though has another race, if someone runs sss_cache and at the same time the system wants to rotate logs you may get the monitor to ignore the legit SIGHUP and only clean the cache but not rotate the logs.
This is why I planned to have the check in sssd_nss and always do log rotation (monitor always sends the SIGHUP around).
You are right. I changed it to always do log rotation.
Btw. monitor does not send SIGHUP to other process. Monitor receives SIGHUP and tells other services what to do (via D-Bus). Right now, 'rotate logs' and 'clear memory cache'(NSS only) messages are sent from monitor_hup function. I've changed the comment in the code to make it more obvious.
The other 'problem' with the current code is that you are 'leaking' information specific to the sssd_nss backend into the monitor code. This is not really a major issue, but I prefer to avoid these 'shortcuts' and keep the monitor dumber wrt frontend or provider specific features.
Everything I considered unrelated to monitor is removed in the new patch, but it still needs to know the NSS process name (nsssrv.h), to be able to send it the request for cache clean up. It would not need it if we had a single NSS specific function to handle the logrotate request, which would do both logrotation and memcache clean up (I had it this way before and I did not like it myself). I thing if monitor knows that on SIGHUP (sent to monitor) all processes rotate logs, it should also know that NSS needs to clear caches.
I think "leaking" the service name is OK or perhaps the lesser of all evils. I remember we even had a check in the past that would test if the list of configured services contains "nss" and "pam".
Other nitpicks: in the current code if you get a fatal error do not do anything if the stat fails with an error different from ENOENT. I think the SIGHUP should always be sent (as I said above) and the failure should only be logged, not be fatal (of course this is not a problem if we move back checks into sssd_nss).
I made the first possible unlink() error (the one that can happen in nss_process_init) not fatal. I think sssd should at least try to continue even if the caches might be removed on next log rotation request.
New patches are in attachment.
Thanks Michal
The code works as advertized.
I only have some minor nitpicks style-wise:
+static int nss_clear_memcache(DBusMessage *message,
struct sbus_connection *conn)Please align the line after the linebreak.
Fixed.
- struct nss_ctx *nctx = (struct nss_ctx*)rctx->pvt_ctx;
(and elsewhere) Please put whitespace after the cast.
Fixed.
sss_mmap_cache_reinit: We have a convention to call our "local" talloc contexts tmp_ctx. We usually also allocate them on NULL so that a run with valgrind would immediatelly reveal them as leaks[*]
Fixed. It is now allocated as new top level ctx.
I'm not sure about this check: + ret = talloc_free(*mc_ctx); Is there actually a descructor? Wouldn't we want to continue and try to init the cache again instead of just failing?
I removed the goto statement, so we get only debug message and the function continues normally.
Why the whitespace change in server_setup() ?
Fixed.
Otherwise I'm fine with the patches. Good work, Michal! _______________________________________________ sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
New patches attached.
Thanks Michal
On Mon, Sep 24, 2012 at 12:15:06PM +0200, Michal Židek wrote:
On 09/20/2012 07:12 PM, Jakub Hrozek wrote:
On Tue, Sep 18, 2012 at 01:26:20PM +0200, Michal Židek wrote:
On 09/17/2012 06:02 PM, Simo Sorce wrote:
Hi Michal, you bring up a good point about the race with startup, but I am not totally convinced about the approach you used to address it.
The race you point out could be easily resolved by having the init code unlink() the reset file w/o acting on it. That would address the problem of cleaning up stale file.
unlink() is added to the nss_process_init in the new patch, to avoid unnecessary cache cleaning when not needed.
Your solution though has another race, if someone runs sss_cache and at the same time the system wants to rotate logs you may get the monitor to ignore the legit SIGHUP and only clean the cache but not rotate the logs.
This is why I planned to have the check in sssd_nss and always do log rotation (monitor always sends the SIGHUP around).
You are right. I changed it to always do log rotation.
Btw. monitor does not send SIGHUP to other process. Monitor receives SIGHUP and tells other services what to do (via D-Bus). Right now, 'rotate logs' and 'clear memory cache'(NSS only) messages are sent from monitor_hup function. I've changed the comment in the code to make it more obvious.
The other 'problem' with the current code is that you are 'leaking' information specific to the sssd_nss backend into the monitor code. This is not really a major issue, but I prefer to avoid these 'shortcuts' and keep the monitor dumber wrt frontend or provider specific features.
Everything I considered unrelated to monitor is removed in the new patch, but it still needs to know the NSS process name (nsssrv.h), to be able to send it the request for cache clean up. It would not need it if we had a single NSS specific function to handle the logrotate request, which would do both logrotation and memcache clean up (I had it this way before and I did not like it myself). I thing if monitor knows that on SIGHUP (sent to monitor) all processes rotate logs, it should also know that NSS needs to clear caches.
I think "leaking" the service name is OK or perhaps the lesser of all evils. I remember we even had a check in the past that would test if the list of configured services contains "nss" and "pam".
Other nitpicks: in the current code if you get a fatal error do not do anything if the stat fails with an error different from ENOENT. I think the SIGHUP should always be sent (as I said above) and the failure should only be logged, not be fatal (of course this is not a problem if we move back checks into sssd_nss).
I made the first possible unlink() error (the one that can happen in nss_process_init) not fatal. I think sssd should at least try to continue even if the caches might be removed on next log rotation request.
New patches are in attachment.
Thanks Michal
The code works as advertized.
I only have some minor nitpicks style-wise:
+static int nss_clear_memcache(DBusMessage *message,
struct sbus_connection *conn)Please align the line after the linebreak.
Fixed.
- struct nss_ctx *nctx = (struct nss_ctx*)rctx->pvt_ctx;
(and elsewhere) Please put whitespace after the cast.
Fixed.
sss_mmap_cache_reinit: We have a convention to call our "local" talloc contexts tmp_ctx. We usually also allocate them on NULL so that a run with valgrind would immediatelly reveal them as leaks[*]
Fixed. It is now allocated as new top level ctx.
I'm not sure about this check: + ret = talloc_free(*mc_ctx); Is there actually a descructor? Wouldn't we want to continue and try to init the cache again instead of just failing?
I removed the goto statement, so we get only debug message and the function continues normally.
Why the whitespace change in server_setup() ?
Fixed.
Otherwise I'm fine with the patches. Good work, Michal!
Ack
On Mon, Sep 24, 2012 at 01:13:47PM +0200, Jakub Hrozek wrote:
On Mon, Sep 24, 2012 at 12:15:06PM +0200, Michal Židek wrote:
On 09/20/2012 07:12 PM, Jakub Hrozek wrote:
On Tue, Sep 18, 2012 at 01:26:20PM +0200, Michal Židek wrote:
On 09/17/2012 06:02 PM, Simo Sorce wrote:
Hi Michal, you bring up a good point about the race with startup, but I am not totally convinced about the approach you used to address it.
The race you point out could be easily resolved by having the init code unlink() the reset file w/o acting on it. That would address the problem of cleaning up stale file.
unlink() is added to the nss_process_init in the new patch, to avoid unnecessary cache cleaning when not needed.
Your solution though has another race, if someone runs sss_cache and at the same time the system wants to rotate logs you may get the monitor to ignore the legit SIGHUP and only clean the cache but not rotate the logs.
This is why I planned to have the check in sssd_nss and always do log rotation (monitor always sends the SIGHUP around).
You are right. I changed it to always do log rotation.
Btw. monitor does not send SIGHUP to other process. Monitor receives SIGHUP and tells other services what to do (via D-Bus). Right now, 'rotate logs' and 'clear memory cache'(NSS only) messages are sent from monitor_hup function. I've changed the comment in the code to make it more obvious.
The other 'problem' with the current code is that you are 'leaking' information specific to the sssd_nss backend into the monitor code. This is not really a major issue, but I prefer to avoid these 'shortcuts' and keep the monitor dumber wrt frontend or provider specific features.
Everything I considered unrelated to monitor is removed in the new patch, but it still needs to know the NSS process name (nsssrv.h), to be able to send it the request for cache clean up. It would not need it if we had a single NSS specific function to handle the logrotate request, which would do both logrotation and memcache clean up (I had it this way before and I did not like it myself). I thing if monitor knows that on SIGHUP (sent to monitor) all processes rotate logs, it should also know that NSS needs to clear caches.
I think "leaking" the service name is OK or perhaps the lesser of all evils. I remember we even had a check in the past that would test if the list of configured services contains "nss" and "pam".
Other nitpicks: in the current code if you get a fatal error do not do anything if the stat fails with an error different from ENOENT. I think the SIGHUP should always be sent (as I said above) and the failure should only be logged, not be fatal (of course this is not a problem if we move back checks into sssd_nss).
I made the first possible unlink() error (the one that can happen in nss_process_init) not fatal. I think sssd should at least try to continue even if the caches might be removed on next log rotation request.
New patches are in attachment.
Thanks Michal
The code works as advertized.
I only have some minor nitpicks style-wise:
+static int nss_clear_memcache(DBusMessage *message,
struct sbus_connection *conn)Please align the line after the linebreak.
Fixed.
- struct nss_ctx *nctx = (struct nss_ctx*)rctx->pvt_ctx;
(and elsewhere) Please put whitespace after the cast.
Fixed.
sss_mmap_cache_reinit: We have a convention to call our "local" talloc contexts tmp_ctx. We usually also allocate them on NULL so that a run with valgrind would immediatelly reveal them as leaks[*]
Fixed. It is now allocated as new top level ctx.
I'm not sure about this check: + ret = talloc_free(*mc_ctx); Is there actually a descructor? Wouldn't we want to continue and try to init the cache again instead of just failing?
I removed the goto statement, so we get only debug message and the function continues normally.
Why the whitespace change in server_setup() ?
Fixed.
Otherwise I'm fine with the patches. Good work, Michal!
Ack
Pushed to master.
sssd-devel@lists.fedorahosted.org