https://fedorahosted.org/sssd/ticket/1584
First, I thought the race condition between sssd_nss and sss_cache should be solved by some sort of file locking mechanism, but when started working on it, the places where we needed to check for the file being locked or free were too many and spread among monitor, nss and sss_cache tool processes and it was not clear how the access is controlled.
So I decided to do it this way: 1. sss_cache tries to send_sighup to monitor as usual 2. if signal_sssd returns that sssd is not running, proceed with memcache invalidation 3. As a part of memcache invalidation: - it first opens the mc file - then it checks if sssd is running (with pgrep) - if sssd is running it stops the the process of invalidation - if sssd is still not running it proceeds with the invalidation
See that if sssd starts after (or during) the pgrep check (so we will not catch it as running, but will assume it is off) it is not a problem, because we have file descriptor associated with file that was present before sssd was running (we open the file before pgrep call). sssd_nss alwas creates a new memory cache file on startup, so we will only mark the old one as recycled (and not the new one), that we do not care about (because it will be deleted by sssd_nss and marking that as recycled is not a problem, I think -- the worst thing that can happen is race between nss and cache tool while both are marking the OLD file as recycled, but I can not figure out situation where this could be dangerous, because there are no "temp" states that could be harmful. Both processes are changing the value of status, but both to the same value).
Another thing that I like about this is that we do not have to care about communication sssd vs sssd_nss vs sss_cache but only control sssd vs sss_cache, which is much easier to understand.
Can someone see any problems that I missed?
NOTE: The function sss_mc_set_recycled is copied from different module. I did not want to make this function non static and put it to a header file because it is not intended to be used directly.
I tested it and it works fine for me.
The patch is attached.
Thanks Michal
On Wed, 2012-10-24 at 15:49 +0200, Michal Židek wrote:
https://fedorahosted.org/sssd/ticket/1584
First, I thought the race condition between sssd_nss and sss_cache should be solved by some sort of file locking mechanism, but when started working on it, the places where we needed to check for the file being locked or free were too many and spread among monitor, nss and sss_cache tool processes and it was not clear how the access is controlled.
So I decided to do it this way:
- sss_cache tries to send_sighup to monitor as usual
- if signal_sssd returns that sssd is not running, proceed with
memcache invalidation 3. As a part of memcache invalidation: - it first opens the mc file - then it checks if sssd is running (with pgrep) - if sssd is running it stops the the process of invalidation - if sssd is still not running it proceeds with the invalidation
Nack.
Sorry but this is not acceptable to me. You are forking a full new process (pgrep) and that simply makes the race condition wider not narrower.
You are moving from a simple signal to a whole setting up a new process and pipes communication with that process which takes a lot of time (relatively speaking).
Using a lock on the file removes the race completely and is much simpler as it requires no communication at all.
You need to do this in only 2 places, in sss_cache and in sssd_nss right after the open call.
Please use fcntl() with a F_SETLK and F_WRLCK on the first byte of the cache file in sss_cache.
You do not want to wait (therefore F_SETLK instead of F_SETLCKW) in sss_cache, if you can't get the lock it means sssd_nss is running and you abort. You can choose to try the lock before actually sending signals to sssd_nss and fallback to the signal only if sssd_nss is running (lock failed).
In sssd_nss you also need to take a F_WRLCK on the first byte of the file, however in sssd_nss case you want to retry a few times just in case you races with sss_cache. I think retrying a couple of times waiting a few milliseconds between each retry would be fine.
Say 3 retries waiting 50ms between each.
Make sure to handle EACESS, EAGAIN, EINTR all the same way as they all mean the file is locked or the syscal wasn't able to get a lock for other reasons.
See that if sssd starts after (or during) the pgrep check (so we will not catch it as running, but will assume it is off) it is not a problem, because we have file descriptor associated with file that was present before sssd was running (we open the file before pgrep call). sssd_nss alwas creates a new memory cache file on startup, so we will only mark the old one as recycled (and not the new one), that we do not care about (because it will be deleted by sssd_nss and marking that as recycled is not a problem, I think -- the worst thing that can happen is race between nss and cache tool while both are marking the OLD file as recycled, but I can not figure out situation where this could be dangerous, because there are no "temp" states that could be harmful.
There are.
The problem is this file is single writer and sssd_nss assumes it is the only writer, so if you invalidate the file mistakenly when sssd_nss was active say pgrep grepped and found nothing but before it returned the result to your process sssd_nss started and opened the file. It is possible because sssd_nss can be scheduled to run right after pgrep has given up the information to send back to sss_cache to a syscall, and sssd_nss is schedule before sss_cache can read and act on that information.
This would cause sssd_nss to not notice the file was tampered with and keep running using the old file, which has been unlinked, so clients wouldn't find nor use it. Performances would degrade, as the cache is not being used until sssd is restarted.
Both processes are changing the value of status, but both to the same value).
Another thing that I like about this is that we do not have to care about communication sssd vs sssd_nss vs sss_cache but only control sssd vs sss_cache, which is much easier to understand.
This would be true if you used locking, the current scheme is brittle and racy.
Can someone see any problems that I missed?
See above.
NOTE: The function sss_mc_set_recycled is copied from different module. I did not want to make this function non static and put it to a header file because it is not intended to be used directly.
This is right, thanks for not doing it.
Simo.
On 10/24/2012 05:04 PM, Simo Sorce wrote:
In sssd_nss you also need to take a F_WRLCK on the first byte of the file, however in sssd_nss case you want to retry a few times just in case you races with sss_cache. I think retrying a couple of times waiting a few milliseconds between each retry would be fine.
Say 3 retries waiting 50ms between each.
Ok. I used this for both old (to be unlinked) and new (created) memcache files. But why is F_SETLKW not sufficient here (sssd_nss)? Are you afraid of possible deadlock (it could happen if more sssd's were running at the same time, like it happened recently because of a bug with pidfile creation)? Or is there some other reason?
NOTE: This patch applies on top of the recently posted patch in (see thread: [SSSD] [PATCH] sss_cache: Multiple domains not handled properly)
I tested it and it worked well for me. Any comments appreciated.
Thanks Michal
On Fri, 2012-10-26 at 18:33 +0200, Michal Židek wrote:
On 10/24/2012 05:04 PM, Simo Sorce wrote:
In sssd_nss you also need to take a F_WRLCK on the first byte of the file, however in sssd_nss case you want to retry a few times just in case you races with sss_cache. I think retrying a couple of times waiting a few milliseconds between each retry would be fine.
Say 3 retries waiting 50ms between each.
Ok. I used this for both old (to be unlinked) and new (created) memcache files. But why is F_SETLKW not sufficient here (sssd_nss)? Are you afraid of possible deadlock (it could happen if more sssd's were running at the same time, like it happened recently because of a bug with pidfile creation)? Or is there some other reason?
Yes and no. We can't wait forever because the monitor would kill us after a while if we do not respond to pings and then restart the sssd_nss process, which would probably get stuck again, but the admin wouldn't know why that is happening. It is better if the sssd_nss process fails fast with a clear critical error message (old level 0) in the logs.
NOTE: This patch applies on top of the recently posted patch in (see thread: [SSSD] [PATCH] sss_cache: Multiple domains not handled properly)
I tested it and it worked well for me. Any comments appreciated.
Will comment on the patch soon.
Simo.
On Fri, 2012-10-26 at 18:33 +0200, Michal Židek wrote:
--- a/src/responder/nss/nsssrv_mmap_cache.c +++ b/src/responder/nss/nsssrv_mmap_cache.c @@ -537,9 +537,46 @@ static errno_t sss_mc_create_file(struct sss_mc_ctx *mc_ctx) mode_t old_mask; int ofd; int ret;
- int retries_left;
- struct flock mc_lock;
- useconds_t t = 50000;
- mc_lock.l_type = F_WRLCK;
- mc_lock.l_whence = SEEK_SET;
- mc_lock.l_start = 0;
- mc_lock.l_start = 1;
Something wrong here ^ this should be l_len
- mc_lock.l_pid = getpid();
You do not need to set this, it's a read-out variable, and should be initialized to 0.
ofd = open(mc_ctx->file, O_RDWR); if (ofd != -1) {
for (retries_left = 2; retries_left >= 0; retries_left--) {
nitpick but this would be more readable as: for (retries_left = 3; retries_left > 0; retries_left--) { however I see it would require (retries_left - 1) in the debug message, so if you prefer the current form add a comment that says /* try 3 times */ before the for loop.
@@ -570,6 +607,34 @@ static errno_t sss_mc_create_file(struct sss_mc_ctx *mc_ctx) mc_ctx->file, ret, strerror(ret))); }
- for (retries_left = 2; retries_left >= 0; retries_left--) {
ret = fcntl(mc_ctx->fd, F_SETLK, &mc_lock);if (ret == EACCES || ret == EAGAIN || ret == EINTR) {/* File is locked by someone else */DEBUG(SSSDBG_TRACE_FUNC,("Failed to lock mc file %s. Retries left: %d\n",mc_ctx->file, retries_left));ret = usleep(t);if (ret == -1) {DEBUG(SSSDBG_MINOR_FAILURE,("usleep() failed -> ignoring\n"));}} else if (ret == 0) {/* File successfuly locked */break;} else {/* Error occurred */DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock mc file %s\n", mc_ctx->file));return ret;}- }
- if (retries_left < 0) {
DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock mc file %s\n", mc_ctx->file));return ret;- }
This block is the same as above, please do not repeat blocks of code, create a static helper function instead and use it in both places.
+errno_t sss_memcache_invalidate(const char *mc_filename) +{
- int mc_fd = -1;
- struct flock mc_lock;
- errno_t ret;
- bool locked = false;
- if (!mc_filename) {
return EINVAL;- }
- mc_fd = open(mc_filename, O_RDWR);
- if (mc_fd == -1) {
ret = errno;if (ret == ENOENT) {DEBUG(SSSDBG_TRACE_FUNC,("Memory cache file %s ""does not exist.\n", mc_filename));return EOK;} else {DEBUG(SSSDBG_CRIT_FAILURE, ("Unable to open file %s: %s\n",
mc_filename, strerror(ret)));return ret;}- }
- mc_lock.l_type = F_WRLCK;
- mc_lock.l_whence = SEEK_SET;
- mc_lock.l_start = 0;
- mc_lock.l_start = 1;
- mc_lock.l_pid = getpid();
Same comments as per sssd_nss code.
- ret = fcntl(mc_fd, F_SETLK, &mc_lock);
- if (ret == -1) {
ret = errno;if (ret == EACCES || ret == EAGAIN || ret == EINTR) {/* File already locked by sssd_nss */ret = EACCES;goto done;
EINTR must be handled differently here, I would retry at least once on EINTR as it doesn't mean the file is locked, only that the syscall was interrupted before the lock could be attempted. (The server case is different because it makes no difference for what reason you fail, there you have to retry anyway, here only for EINTR).
} else {DEBUG(SSSDBG_CRIT_FAILURE,("Failed to lock memory cache file %s: %s\n",mc_filename, strerror(ret)));goto done;}- }
- locked = true;
- /* Mark the mc file as recycled. */
- ret = sss_mc_set_recycled(mc_fd);
- if (ret != EOK) {
DEBUG(SSSDBG_CRIT_FAILURE, ("Failed to mark memory cache file%s "
"as recycled.\n", mc_filename));goto done;- }
- ret = EOK;
+done:
- if (locked) {
mc_lock.l_type = F_UNLCK;fcntl(mc_fd, F_SETLK, &mc_lock);- }
- if (mc_fd != -1) {
close(mc_fd);- }
Shouldn't you also unlink the file here ?
- return ret;
+}
In general the approach is correct and the patch is ok, please fix the bugs and will get in master right away.
Simo.
On 10/26/2012 07:53 PM, Simo Sorce wrote:
On Fri, 2012-10-26 at 18:33 +0200, Michal Židek wrote:
--- a/src/responder/nss/nsssrv_mmap_cache.c +++ b/src/responder/nss/nsssrv_mmap_cache.c @@ -537,9 +537,46 @@ static errno_t sss_mc_create_file(struct sss_mc_ctx *mc_ctx) mode_t old_mask; int ofd; int ret;
- int retries_left;
- struct flock mc_lock;
- useconds_t t = 50000;
- mc_lock.l_type = F_WRLCK;
- mc_lock.l_whence = SEEK_SET;
- mc_lock.l_start = 0;
- mc_lock.l_start = 1;
Something wrong here ^ this should be l_len
- mc_lock.l_pid = getpid();
You do not need to set this, it's a read-out variable, and should be initialized to 0.
ofd = open(mc_ctx->file, O_RDWR); if (ofd != -1) {
for (retries_left = 2; retries_left >= 0; retries_left--) {nitpick but this would be more readable as: for (retries_left = 3; retries_left > 0; retries_left--) { however I see it would require (retries_left - 1) in the debug message, so if you prefer the current form add a comment that says /* try 3 times */ before the for loop.
@@ -570,6 +607,34 @@ static errno_t sss_mc_create_file(struct sss_mc_ctx *mc_ctx) mc_ctx->file, ret, strerror(ret))); }
- for (retries_left = 2; retries_left >= 0; retries_left--) {
ret = fcntl(mc_ctx->fd, F_SETLK, &mc_lock);if (ret == EACCES || ret == EAGAIN || ret == EINTR) {/* File is locked by someone else */DEBUG(SSSDBG_TRACE_FUNC,("Failed to lock mc file %s. Retries left: %d\n",mc_ctx->file, retries_left));ret = usleep(t);if (ret == -1) {DEBUG(SSSDBG_MINOR_FAILURE,("usleep() failed -> ignoring\n"));}} else if (ret == 0) {/* File successfuly locked */break;} else {/* Error occurred */DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock mc file %s\n", mc_ctx->file));return ret;}- }
- if (retries_left < 0) {
DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock mc file %s\n", mc_ctx->file));return ret;- }
This block is the same as above, please do not repeat blocks of code, create a static helper function instead and use it in both places.
+errno_t sss_memcache_invalidate(const char *mc_filename) +{
- int mc_fd = -1;
- struct flock mc_lock;
- errno_t ret;
- bool locked = false;
- if (!mc_filename) {
return EINVAL;- }
- mc_fd = open(mc_filename, O_RDWR);
- if (mc_fd == -1) {
ret = errno;if (ret == ENOENT) {DEBUG(SSSDBG_TRACE_FUNC,("Memory cache file %s ""does not exist.\n", mc_filename));return EOK;} else {DEBUG(SSSDBG_CRIT_FAILURE, ("Unable to open file %s: %s\n",
mc_filename, strerror(ret)));return ret;}- }
- mc_lock.l_type = F_WRLCK;
- mc_lock.l_whence = SEEK_SET;
- mc_lock.l_start = 0;
- mc_lock.l_start = 1;
- mc_lock.l_pid = getpid();
Same comments as per sssd_nss code.
- ret = fcntl(mc_fd, F_SETLK, &mc_lock);
- if (ret == -1) {
ret = errno;if (ret == EACCES || ret == EAGAIN || ret == EINTR) {/* File already locked by sssd_nss */ret = EACCES;goto done;EINTR must be handled differently here, I would retry at least once on EINTR as it doesn't mean the file is locked, only that the syscall was interrupted before the lock could be attempted. (The server case is different because it makes no difference for what reason you fail, there you have to retry anyway, here only for EINTR).
} else {DEBUG(SSSDBG_CRIT_FAILURE,("Failed to lock memory cache file %s: %s\n",mc_filename, strerror(ret)));goto done;}- }
- locked = true;
- /* Mark the mc file as recycled. */
- ret = sss_mc_set_recycled(mc_fd);
- if (ret != EOK) {
DEBUG(SSSDBG_CRIT_FAILURE, ("Failed to mark memory cache file%s "
"as recycled.\n", mc_filename));goto done;- }
- ret = EOK;
+done:
- if (locked) {
mc_lock.l_type = F_UNLCK;fcntl(mc_fd, F_SETLK, &mc_lock);- }
- if (mc_fd != -1) {
close(mc_fd);- }
Shouldn't you also unlink the file here ?
That is not necessary, the file will be deleted by sssd_nss. But I added the unlink here in the new patch, so it is more obvious, that the file is no longer needed.
- return ret;
+}
In general the approach is correct and the patch is ok, please fix the bugs and will get in master right away.
Simo.
I also changed one debug message in the new patch. In sss_mc_create_file, if we fail to mark the old cache as recycled, we simply continue (failing here would not prevent the processes that use the old cache from using it). I think the message that informs about this failure should be level 0.
Thanks for the review.
New Patch is attached.
NOTE: Just want to remind that this patch applies on top of the [SSSD] [PATCH] sss_cache: Multiple domains not handled properly
Michal
On 10/29/2012 12:41 PM, Michal Židek wrote:
On 10/26/2012 07:53 PM, Simo Sorce wrote:
On Fri, 2012-10-26 at 18:33 +0200, Michal Židek wrote:
--- a/src/responder/nss/nsssrv_mmap_cache.c +++ b/src/responder/nss/nsssrv_mmap_cache.c @@ -537,9 +537,46 @@ static errno_t sss_mc_create_file(struct sss_mc_ctx *mc_ctx) mode_t old_mask; int ofd; int ret;
- int retries_left;
- struct flock mc_lock;
- useconds_t t = 50000;
- mc_lock.l_type = F_WRLCK;
- mc_lock.l_whence = SEEK_SET;
- mc_lock.l_start = 0;
- mc_lock.l_start = 1;
Something wrong here ^ this should be l_len
- mc_lock.l_pid = getpid();
You do not need to set this, it's a read-out variable, and should be initialized to 0.
ofd = open(mc_ctx->file, O_RDWR); if (ofd != -1) {
for (retries_left = 2; retries_left >= 0; retries_left--) {nitpick but this would be more readable as: for (retries_left = 3; retries_left > 0; retries_left--) { however I see it would require (retries_left - 1) in the debug message, so if you prefer the current form add a comment that says /* try 3 times */ before the for loop.
@@ -570,6 +607,34 @@ static errno_t sss_mc_create_file(struct sss_mc_ctx *mc_ctx) mc_ctx->file, ret, strerror(ret))); }
- for (retries_left = 2; retries_left >= 0; retries_left--) {
ret = fcntl(mc_ctx->fd, F_SETLK, &mc_lock);if (ret == EACCES || ret == EAGAIN || ret == EINTR) {/* File is locked by someone else */DEBUG(SSSDBG_TRACE_FUNC,("Failed to lock mc file %s. Retries left: %d\n",mc_ctx->file, retries_left));ret = usleep(t);if (ret == -1) {DEBUG(SSSDBG_MINOR_FAILURE,("usleep() failed -> ignoring\n"));}} else if (ret == 0) {/* File successfuly locked */break;} else {/* Error occurred */DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock mc file %s\n", mc_ctx->file));return ret;}- }
- if (retries_left < 0) {
DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock mc file %s\n", mc_ctx->file));return ret;- }
This block is the same as above, please do not repeat blocks of code, create a static helper function instead and use it in both places.
+errno_t sss_memcache_invalidate(const char *mc_filename) +{
- int mc_fd = -1;
- struct flock mc_lock;
- errno_t ret;
- bool locked = false;
- if (!mc_filename) {
return EINVAL;- }
- mc_fd = open(mc_filename, O_RDWR);
- if (mc_fd == -1) {
ret = errno;if (ret == ENOENT) {DEBUG(SSSDBG_TRACE_FUNC,("Memory cache file %s ""does not exist.\n", mc_filename));return EOK;} else {DEBUG(SSSDBG_CRIT_FAILURE, ("Unable to open file %s: %s\n",
mc_filename, strerror(ret)));return ret;}- }
- mc_lock.l_type = F_WRLCK;
- mc_lock.l_whence = SEEK_SET;
- mc_lock.l_start = 0;
- mc_lock.l_start = 1;
- mc_lock.l_pid = getpid();
Same comments as per sssd_nss code.
- ret = fcntl(mc_fd, F_SETLK, &mc_lock);
- if (ret == -1) {
ret = errno;if (ret == EACCES || ret == EAGAIN || ret == EINTR) {/* File already locked by sssd_nss */ret = EACCES;goto done;EINTR must be handled differently here, I would retry at least once on EINTR as it doesn't mean the file is locked, only that the syscall was interrupted before the lock could be attempted. (The server case is different because it makes no difference for what reason you fail, there you have to retry anyway, here only for EINTR).
} else {DEBUG(SSSDBG_CRIT_FAILURE,("Failed to lock memory cache file %s: %s\n",mc_filename, strerror(ret)));goto done;}- }
- locked = true;
- /* Mark the mc file as recycled. */
- ret = sss_mc_set_recycled(mc_fd);
- if (ret != EOK) {
DEBUG(SSSDBG_CRIT_FAILURE, ("Failed to mark memory cache file%s "
"as recycled.\n", mc_filename));goto done;- }
- ret = EOK;
+done:
- if (locked) {
mc_lock.l_type = F_UNLCK;fcntl(mc_fd, F_SETLK, &mc_lock);- }
- if (mc_fd != -1) {
close(mc_fd);- }
Shouldn't you also unlink the file here ?
That is not necessary, the file will be deleted by sssd_nss. But I added the unlink here in the new patch, so it is more obvious, that the file is no longer needed.
- return ret;
+}
In general the approach is correct and the patch is ok, please fix the bugs and will get in master right away.
Simo.
I also changed one debug message in the new patch. In sss_mc_create_file, if we fail to mark the old cache as recycled, we simply continue (failing here would not prevent the processes that use the old cache from using it). I think the message that informs about this failure should be level 0.
Thanks for the review.
New Patch is attached.
NOTE: Just want to remind that this patch applies on top of the [SSSD] [PATCH] sss_cache: Multiple domains not handled properly
Michal
There was bad indentation on the place where I changed the debug level of the message.
New patch attached.
Thanks Michal
On Mon, 2012-10-29 at 12:53 +0100, Michal Židek wrote:
On 10/29/2012 12:41 PM, Michal Židek wrote:
On 10/26/2012 07:53 PM, Simo Sorce wrote:
On Fri, 2012-10-26 at 18:33 +0200, Michal Židek wrote:
--- a/src/responder/nss/nsssrv_mmap_cache.c +++ b/src/responder/nss/nsssrv_mmap_cache.c @@ -537,9 +537,46 @@ static errno_t sss_mc_create_file(struct sss_mc_ctx *mc_ctx) mode_t old_mask; int ofd; int ret;
- int retries_left;
- struct flock mc_lock;
- useconds_t t = 50000;
- mc_lock.l_type = F_WRLCK;
- mc_lock.l_whence = SEEK_SET;
- mc_lock.l_start = 0;
- mc_lock.l_start = 1;
Something wrong here ^ this should be l_len
- mc_lock.l_pid = getpid();
You do not need to set this, it's a read-out variable, and should be initialized to 0.
ofd = open(mc_ctx->file, O_RDWR); if (ofd != -1) {
for (retries_left = 2; retries_left >= 0; retries_left--) {nitpick but this would be more readable as: for (retries_left = 3; retries_left > 0; retries_left--) { however I see it would require (retries_left - 1) in the debug message, so if you prefer the current form add a comment that says /* try 3 times */ before the for loop.
@@ -570,6 +607,34 @@ static errno_t sss_mc_create_file(struct sss_mc_ctx *mc_ctx) mc_ctx->file, ret, strerror(ret))); }
- for (retries_left = 2; retries_left >= 0; retries_left--) {
ret = fcntl(mc_ctx->fd, F_SETLK, &mc_lock);if (ret == EACCES || ret == EAGAIN || ret == EINTR) {/* File is locked by someone else */DEBUG(SSSDBG_TRACE_FUNC,("Failed to lock mc file %s. Retries left: %d\n",mc_ctx->file, retries_left));ret = usleep(t);if (ret == -1) {DEBUG(SSSDBG_MINOR_FAILURE,("usleep() failed -> ignoring\n"));}} else if (ret == 0) {/* File successfuly locked */break;} else {/* Error occurred */DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock mc file %s\n", mc_ctx->file));return ret;}- }
- if (retries_left < 0) {
DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock mc file %s\n", mc_ctx->file));return ret;- }
This block is the same as above, please do not repeat blocks of code, create a static helper function instead and use it in both places.
+errno_t sss_memcache_invalidate(const char *mc_filename) +{
- int mc_fd = -1;
- struct flock mc_lock;
- errno_t ret;
- bool locked = false;
- if (!mc_filename) {
return EINVAL;- }
- mc_fd = open(mc_filename, O_RDWR);
- if (mc_fd == -1) {
ret = errno;if (ret == ENOENT) {DEBUG(SSSDBG_TRACE_FUNC,("Memory cache file %s ""does not exist.\n", mc_filename));return EOK;} else {DEBUG(SSSDBG_CRIT_FAILURE, ("Unable to open file %s: %s\n",
mc_filename, strerror(ret)));return ret;}- }
- mc_lock.l_type = F_WRLCK;
- mc_lock.l_whence = SEEK_SET;
- mc_lock.l_start = 0;
- mc_lock.l_start = 1;
- mc_lock.l_pid = getpid();
Same comments as per sssd_nss code.
- ret = fcntl(mc_fd, F_SETLK, &mc_lock);
- if (ret == -1) {
ret = errno;if (ret == EACCES || ret == EAGAIN || ret == EINTR) {/* File already locked by sssd_nss */ret = EACCES;goto done;EINTR must be handled differently here, I would retry at least once on EINTR as it doesn't mean the file is locked, only that the syscall was interrupted before the lock could be attempted. (The server case is different because it makes no difference for what reason you fail, there you have to retry anyway, here only for EINTR).
} else {DEBUG(SSSDBG_CRIT_FAILURE,("Failed to lock memory cache file %s: %s\n",mc_filename, strerror(ret)));goto done;}- }
- locked = true;
- /* Mark the mc file as recycled. */
- ret = sss_mc_set_recycled(mc_fd);
- if (ret != EOK) {
DEBUG(SSSDBG_CRIT_FAILURE, ("Failed to mark memory cache file%s "
"as recycled.\n", mc_filename));goto done;- }
- ret = EOK;
+done:
- if (locked) {
mc_lock.l_type = F_UNLCK;fcntl(mc_fd, F_SETLK, &mc_lock);- }
- if (mc_fd != -1) {
close(mc_fd);- }
Shouldn't you also unlink the file here ?
That is not necessary, the file will be deleted by sssd_nss. But I added the unlink here in the new patch, so it is more obvious, that the file is no longer needed.
- return ret;
+}
In general the approach is correct and the patch is ok, please fix the bugs and will get in master right away.
Simo.
I also changed one debug message in the new patch. In sss_mc_create_file, if we fail to mark the old cache as recycled, we simply continue (failing here would not prevent the processes that use the old cache from using it). I think the message that informs about this failure should be level 0.
Thanks for the review.
New Patch is attached.
NOTE: Just want to remind that this patch applies on top of the [SSSD] [PATCH] sss_cache: Multiple domains not handled properly
Michal
There was bad indentation on the place where I changed the debug level of the message.
New patch attached.
Codewise looks ok, but I still see duplication of the code used to lock the file.
I was wondering, would it make sense to split this patch in 2 and put the lock_mc_file functions as a more generic function in the common utils as a separate patch first ?
I would see it added in util/util_lock.c so it is available to both tools and responders (avoids the duplication you still have).
The prototype should be something like:
static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int retries, int wait);
Where wait is expressed in milliseconds.
Also you do not really need to unlock the file if you are going to close() it. Posix semantics require the OS to drop all locks on a file if you close it, so you can safely drop the logic around unlocking in tools_util.c (however add a comment before the close like: /* closing the file will drop the lock */ ).
Simo.
On 10/29/2012 03:53 PM, Simo Sorce wrote:
Codewise looks ok, but I still see duplication of the code used to lock the file.
I was wondering, would it make sense to split this patch in 2 and put the lock_mc_file functions as a more generic function in the common utils as a separate patch first ?
I would see it added in util/util_lock.c so it is available to both tools and responders (avoids the duplication you still have).
The prototype should be something like:
static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int retries, int wait);
Where wait is expressed in milliseconds.
I was thinking about this too, but there is one thing I do not like about it. Maybe I am wrong, but IMO functions related to memory cache should be as private to sssd_nss as possible. That is why the function for locking the files is static (sssd_nss part). Function for memory cache is invalidation is an exception to this rule, because it is used by sss_cache, but it should also keep the code that manipulates memory cache private (and it uses it only on one place, so static function was not needed that much).
I do not know about any other place, where we would like to lock the mc files except of sssd_nss and the sss_memcache_invalidate used by sss_cache and it should stay like this. I consider the usage of mc files in sss_cache as necessary pollution, but providing non-static API to spread this pollution does not sound good to me.
But if you really want the util_lock.c, I can send it as a separate patch on top of this one (it is only a small refactor). But still... I am not sure if this is good idea.
Also you do not really need to unlock the file if you are going to close() it. Posix semantics require the OS to drop all locks on a file if you close it, so you can safely drop the logic around unlocking in tools_util.c (however add a comment before the close like: /* closing the file will drop the lock */ ).
I thought closing the file explicitly is more readable. But yes, I'll remove it, adding the comment as you suggested is better.
Simo.
New patch is attached.
Thanks Michal
On Tue, 2012-10-30 at 10:39 +0100, Michal Židek wrote:
On 10/29/2012 03:53 PM, Simo Sorce wrote:
Codewise looks ok, but I still see duplication of the code used to lock the file.
I was wondering, would it make sense to split this patch in 2 and put the lock_mc_file functions as a more generic function in the common utils as a separate patch first ?
I would see it added in util/util_lock.c so it is available to both tools and responders (avoids the duplication you still have).
The prototype should be something like:
static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int retries, int wait);
Where wait is expressed in milliseconds.
I was thinking about this too, but there is one thing I do not like about it. Maybe I am wrong, but IMO functions related to memory cache should be as private to sssd_nss as possible. That is why the function for locking the files is static (sssd_nss part). Function for memory cache is invalidation is an exception to this rule, because it is used by sss_cache, but it should also keep the code that manipulates memory cache private (and it uses it only on one place, so static function was not needed that much).
This is why I made it a generic function not strictly related to the cache in the prototype. I prefer to keep a single function to be used by all components, because copy&paste is a source of pain in the long term.
I do not know about any other place, where we would like to lock the mc files except of sssd_nss and the sss_memcache_invalidate used by sss_cache and it should stay like this. I consider the usage of mc files in sss_cache as necessary pollution, but providing non-static API to spread this pollution does not sound good to me.
The function prototype I suggested is not mc specific. Look at it, it can be used with any file to lock any part, we can also extend it to take a lock type if you want it even more generic.
But if you really want the util_lock.c, I can send it as a separate patch on top of this one (it is only a small refactor). But still... I am not sure if this is good idea.
I think it should be a patch that comes before the current one and just provides the facility, then the current patch will build on top of it.
Also you do not really need to unlock the file if you are going to close() it. Posix semantics require the OS to drop all locks on a file if you close it, so you can safely drop the logic around unlocking in tools_util.c (however add a comment before the close like: /* closing the file will drop the lock */ ).
I thought closing the file explicitly is more readable. But yes, I'll remove it, adding the comment as you suggested is better.
Yup, thanks, this part looks good, I think we only need to split the patch and move the locking function into util, then we are golden.
Simo.
On 10/30/2012 02:25 PM, Simo Sorce wrote:
On Tue, 2012-10-30 at 10:39 +0100, Michal Židek wrote:
On 10/29/2012 03:53 PM, Simo Sorce wrote:
Codewise looks ok, but I still see duplication of the code used to lock the file.
I was wondering, would it make sense to split this patch in 2 and put the lock_mc_file functions as a more generic function in the common utils as a separate patch first ?
I would see it added in util/util_lock.c so it is available to both tools and responders (avoids the duplication you still have).
The prototype should be something like:
static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int retries, int wait);
Where wait is expressed in milliseconds.
I was thinking about this too, but there is one thing I do not like about it. Maybe I am wrong, but IMO functions related to memory cache should be as private to sssd_nss as possible. That is why the function for locking the files is static (sssd_nss part). Function for memory cache is invalidation is an exception to this rule, because it is used by sss_cache, but it should also keep the code that manipulates memory cache private (and it uses it only on one place, so static function was not needed that much).
This is why I made it a generic function not strictly related to the cache in the prototype. I prefer to keep a single function to be used by all components, because copy&paste is a source of pain in the long term.
I do not know about any other place, where we would like to lock the mc files except of sssd_nss and the sss_memcache_invalidate used by sss_cache and it should stay like this. I consider the usage of mc files in sss_cache as necessary pollution, but providing non-static API to spread this pollution does not sound good to me.
The function prototype I suggested is not mc specific. Look at it, it can be used with any file to lock any part, we can also extend it to take a lock type if you want it even more generic.
But if you really want the util_lock.c, I can send it as a separate patch on top of this one (it is only a small refactor). But still... I am not sure if this is good idea.
I think it should be a patch that comes before the current one and just provides the facility, then the current patch will build on top of it.
Ok. I did not add the parameter to select the type of lock (not all types would fit to this function). But we can expand the util_lock.c later with additional functions if more lock types are needed.
Also you do not really need to unlock the file if you are going to close() it. Posix semantics require the OS to drop all locks on a file if you close it, so you can safely drop the logic around unlocking in tools_util.c (however add a comment before the close like: /* closing the file will drop the lock */ ).
I thought closing the file explicitly is more readable. But yes, I'll remove it, adding the comment as you suggested is better.
Yup, thanks, this part looks good, I think we only need to split the patch and move the locking function into util, then we are golden.
Simo.
New patches attached.
Thanks Michal
On Wed, 2012-10-31 at 12:43 +0100, Michal Židek wrote:
On 10/30/2012 02:25 PM, Simo Sorce wrote:
On Tue, 2012-10-30 at 10:39 +0100, Michal Židek wrote:
On 10/29/2012 03:53 PM, Simo Sorce wrote:
Codewise looks ok, but I still see duplication of the code used to lock the file.
I was wondering, would it make sense to split this patch in 2 and put the lock_mc_file functions as a more generic function in the common utils as a separate patch first ?
I would see it added in util/util_lock.c so it is available to both tools and responders (avoids the duplication you still have).
The prototype should be something like:
static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int retries, int wait);
Where wait is expressed in milliseconds.
I was thinking about this too, but there is one thing I do not like about it. Maybe I am wrong, but IMO functions related to memory cache should be as private to sssd_nss as possible. That is why the function for locking the files is static (sssd_nss part). Function for memory cache is invalidation is an exception to this rule, because it is used by sss_cache, but it should also keep the code that manipulates memory cache private (and it uses it only on one place, so static function was not needed that much).
This is why I made it a generic function not strictly related to the cache in the prototype. I prefer to keep a single function to be used by all components, because copy&paste is a source of pain in the long term.
I do not know about any other place, where we would like to lock the mc files except of sssd_nss and the sss_memcache_invalidate used by sss_cache and it should stay like this. I consider the usage of mc files in sss_cache as necessary pollution, but providing non-static API to spread this pollution does not sound good to me.
The function prototype I suggested is not mc specific. Look at it, it can be used with any file to lock any part, we can also extend it to take a lock type if you want it even more generic.
But if you really want the util_lock.c, I can send it as a separate patch on top of this one (it is only a small refactor). But still... I am not sure if this is good idea.
I think it should be a patch that comes before the current one and just provides the facility, then the current patch will build on top of it.
Ok. I did not add the parameter to select the type of lock (not all types would fit to this function). But we can expand the util_lock.c later with additional functions if more lock types are needed.
Also you do not really need to unlock the file if you are going to close() it. Posix semantics require the OS to drop all locks on a file if you close it, so you can safely drop the logic around unlocking in tools_util.c (however add a comment before the close like: /* closing the file will drop the lock */ ).
I thought closing the file explicitly is more readable. But yes, I'll remove it, adding the comment as you suggested is better.
Yup, thanks, this part looks good, I think we only need to split the patch and move the locking function into util, then we are golden.
Simo.
New patches attached.
Michal, the structure now is fine, only one minor nitpick is that it would be probably better to remove references to mmap cache from sss_br_lock_file debugging messages as it is not mmap cache specific anymore.
Anyway I was trying to test your patch on my system but on current master, with your 3 patches on top I cannot get sss_cache to do anything as far as I can see.
It always returns: "No cache object matched the specified search"
Looking at the code it seem to me that this always happen (skipped is always true if you pass only one switch ?) unless all maps are full and you pass -UGNSA, but I haven't used this tool much so I may just be missing something. What's the right way to test this patches ?
Simo.
On 11/01/2012 06:54 AM, Simo Sorce wrote:
On Wed, 2012-10-31 at 12:43 +0100, Michal Židek wrote:
On 10/30/2012 02:25 PM, Simo Sorce wrote:
On Tue, 2012-10-30 at 10:39 +0100, Michal Židek wrote:
On 10/29/2012 03:53 PM, Simo Sorce wrote:
Codewise looks ok, but I still see duplication of the code used to lock the file.
I was wondering, would it make sense to split this patch in 2 and put the lock_mc_file functions as a more generic function in the common utils as a separate patch first ?
I would see it added in util/util_lock.c so it is available to both tools and responders (avoids the duplication you still have).
The prototype should be something like:
static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int retries, int wait);
Where wait is expressed in milliseconds.
I was thinking about this too, but there is one thing I do not like about it. Maybe I am wrong, but IMO functions related to memory cache should be as private to sssd_nss as possible. That is why the function for locking the files is static (sssd_nss part). Function for memory cache is invalidation is an exception to this rule, because it is used by sss_cache, but it should also keep the code that manipulates memory cache private (and it uses it only on one place, so static function was not needed that much).
This is why I made it a generic function not strictly related to the cache in the prototype. I prefer to keep a single function to be used by all components, because copy&paste is a source of pain in the long term.
I do not know about any other place, where we would like to lock the mc files except of sssd_nss and the sss_memcache_invalidate used by sss_cache and it should stay like this. I consider the usage of mc files in sss_cache as necessary pollution, but providing non-static API to spread this pollution does not sound good to me.
The function prototype I suggested is not mc specific. Look at it, it can be used with any file to lock any part, we can also extend it to take a lock type if you want it even more generic.
But if you really want the util_lock.c, I can send it as a separate patch on top of this one (it is only a small refactor). But still... I am not sure if this is good idea.
I think it should be a patch that comes before the current one and just provides the facility, then the current patch will build on top of it.
Ok. I did not add the parameter to select the type of lock (not all types would fit to this function). But we can expand the util_lock.c later with additional functions if more lock types are needed.
Also you do not really need to unlock the file if you are going to close() it. Posix semantics require the OS to drop all locks on a file if you close it, so you can safely drop the logic around unlocking in tools_util.c (however add a comment before the close like: /* closing the file will drop the lock */ ).
I thought closing the file explicitly is more readable. But yes, I'll remove it, adding the comment as you suggested is better.
Yup, thanks, this part looks good, I think we only need to split the patch and move the locking function into util, then we are golden.
Simo.
New patches attached.
Michal, the structure now is fine, only one minor nitpick is that it would be probably better to remove references to mmap cache from sss_br_lock_file debugging messages as it is not mmap cache specific anymore.
Sure, this was leftover from previous versions.
Anyway I was trying to test your patch on my system but on current master, with your 3 patches on top I cannot get sss_cache to do anything as far as I can see.
It always returns: "No cache object matched the specified search"
Looking at the code it seem to me that this always happen (skipped is always true if you pass only one switch ?) unless all maps are full and you pass -UGNSA, but I haven't used this tool much so I may just be missing something. What's the right way to test this patches ?
This is strange. It works fine for me. I test it for example this way:
1. start sssd 2. fill the cache with getent passwd user1 3. turn off sssd 4. getent passwd user1 again (result is returned from memory cache) 5. sss_cache -u user1 (or sss_cache -U) should remove the cache 6. getent passwd user1 should give no results
I also tested the sss_cache while sssd was running and it gives me expected results (sended SIGHUP to monitor and next getent did not use the mem cache).
I tested it on current master, but maybe there really is a bug. If it is still not working for you, could you send how you test it?
Thanks Michal
On Thu, 2012-11-01 at 11:19 +0100, Michal Židek wrote:
On 11/01/2012 06:54 AM, Simo Sorce wrote:
On Wed, 2012-10-31 at 12:43 +0100, Michal Židek wrote:
On 10/30/2012 02:25 PM, Simo Sorce wrote:
On Tue, 2012-10-30 at 10:39 +0100, Michal Židek wrote:
On 10/29/2012 03:53 PM, Simo Sorce wrote:
Codewise looks ok, but I still see duplication of the code used to lock the file.
I was wondering, would it make sense to split this patch in 2 and put the lock_mc_file functions as a more generic function in the common utils as a separate patch first ?
I would see it added in util/util_lock.c so it is available to both tools and responders (avoids the duplication you still have).
The prototype should be something like:
static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int retries, int wait);
Where wait is expressed in milliseconds.
I was thinking about this too, but there is one thing I do not like about it. Maybe I am wrong, but IMO functions related to memory cache should be as private to sssd_nss as possible. That is why the function for locking the files is static (sssd_nss part). Function for memory cache is invalidation is an exception to this rule, because it is used by sss_cache, but it should also keep the code that manipulates memory cache private (and it uses it only on one place, so static function was not needed that much).
This is why I made it a generic function not strictly related to the cache in the prototype. I prefer to keep a single function to be used by all components, because copy&paste is a source of pain in the long term.
I do not know about any other place, where we would like to lock the mc files except of sssd_nss and the sss_memcache_invalidate used by sss_cache and it should stay like this. I consider the usage of mc files in sss_cache as necessary pollution, but providing non-static API to spread this pollution does not sound good to me.
The function prototype I suggested is not mc specific. Look at it, it can be used with any file to lock any part, we can also extend it to take a lock type if you want it even more generic.
But if you really want the util_lock.c, I can send it as a separate patch on top of this one (it is only a small refactor). But still... I am not sure if this is good idea.
I think it should be a patch that comes before the current one and just provides the facility, then the current patch will build on top of it.
Ok. I did not add the parameter to select the type of lock (not all types would fit to this function). But we can expand the util_lock.c later with additional functions if more lock types are needed.
Also you do not really need to unlock the file if you are going to close() it. Posix semantics require the OS to drop all locks on a file if you close it, so you can safely drop the logic around unlocking in tools_util.c (however add a comment before the close like: /* closing the file will drop the lock */ ).
I thought closing the file explicitly is more readable. But yes, I'll remove it, adding the comment as you suggested is better.
Yup, thanks, this part looks good, I think we only need to split the patch and move the locking function into util, then we are golden.
Simo.
New patches attached.
Michal, the structure now is fine, only one minor nitpick is that it would be probably better to remove references to mmap cache from sss_br_lock_file debugging messages as it is not mmap cache specific anymore.
Sure, this was leftover from previous versions.
Anyway I was trying to test your patch on my system but on current master, with your 3 patches on top I cannot get sss_cache to do anything as far as I can see.
It always returns: "No cache object matched the specified search"
Looking at the code it seem to me that this always happen (skipped is always true if you pass only one switch ?) unless all maps are full and you pass -UGNSA, but I haven't used this tool much so I may just be missing something. What's the right way to test this patches ?
This is strange. It works fine for me. I test it for example this way:
- start sssd
- fill the cache with getent passwd user1
- turn off sssd
- getent passwd user1 again (result is returned from memory cache)
- sss_cache -u user1 (or sss_cache -U) should remove the cache
- getent passwd user1 should give no results
I also tested the sss_cache while sssd was running and it gives me expected results (sended SIGHUP to monitor and next getent did not use the mem cache).
I tested it on current master, but maybe there really is a bug. If it is still not working for you, could you send how you test it?
I tired the same above commands, however I didn't make sure to repopulate the cache after the first try, maybe I was indeed operating on an empty one, I'll retry with your 2 new patches (btw it would be nice if you could send them in separate emails in future, so patchwork picks both them up :).
Simo.
On Thu, 2012-11-01 at 09:14 -0400, Simo Sorce wrote:
On Thu, 2012-11-01 at 11:19 +0100, Michal Židek wrote:
On 11/01/2012 06:54 AM, Simo Sorce wrote:
On Wed, 2012-10-31 at 12:43 +0100, Michal Židek wrote:
On 10/30/2012 02:25 PM, Simo Sorce wrote:
On Tue, 2012-10-30 at 10:39 +0100, Michal Židek wrote:
On 10/29/2012 03:53 PM, Simo Sorce wrote: > > Codewise looks ok, but I still see duplication of the code used to lock > the file. > > I was wondering, would it make sense to split this patch in 2 and put > the lock_mc_file functions as a more generic function in the common > utils as a separate patch first ? > > I would see it added in util/util_lock.c so it is available to both > tools and responders (avoids the duplication you still have). > > The prototype should be something like: > > static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int > retries, int wait); > > Where wait is expressed in milliseconds. >
I was thinking about this too, but there is one thing I do not like about it. Maybe I am wrong, but IMO functions related to memory cache should be as private to sssd_nss as possible. That is why the function for locking the files is static (sssd_nss part). Function for memory cache is invalidation is an exception to this rule, because it is used by sss_cache, but it should also keep the code that manipulates memory cache private (and it uses it only on one place, so static function was not needed that much).
This is why I made it a generic function not strictly related to the cache in the prototype. I prefer to keep a single function to be used by all components, because copy&paste is a source of pain in the long term.
I do not know about any other place, where we would like to lock the mc files except of sssd_nss and the sss_memcache_invalidate used by sss_cache and it should stay like this. I consider the usage of mc files in sss_cache as necessary pollution, but providing non-static API to spread this pollution does not sound good to me.
The function prototype I suggested is not mc specific. Look at it, it can be used with any file to lock any part, we can also extend it to take a lock type if you want it even more generic.
But if you really want the util_lock.c, I can send it as a separate patch on top of this one (it is only a small refactor). But still... I am not sure if this is good idea.
I think it should be a patch that comes before the current one and just provides the facility, then the current patch will build on top of it.
Ok. I did not add the parameter to select the type of lock (not all types would fit to this function). But we can expand the util_lock.c later with additional functions if more lock types are needed.
> Also you do not really need to unlock the file if you are going to > close() it. > Posix semantics require the OS to drop all locks on a file if you close > it, so you can safely drop the logic around unlocking in tools_util.c > (however add a comment before the close like: /* closing the file will > drop the lock */ ).
I thought closing the file explicitly is more readable. But yes, I'll remove it, adding the comment as you suggested is better.
Yup, thanks, this part looks good, I think we only need to split the patch and move the locking function into util, then we are golden.
Simo.
New patches attached.
Michal, the structure now is fine, only one minor nitpick is that it would be probably better to remove references to mmap cache from sss_br_lock_file debugging messages as it is not mmap cache specific anymore.
Sure, this was leftover from previous versions.
Anyway I was trying to test your patch on my system but on current master, with your 3 patches on top I cannot get sss_cache to do anything as far as I can see.
It always returns: "No cache object matched the specified search"
Looking at the code it seem to me that this always happen (skipped is always true if you pass only one switch ?) unless all maps are full and you pass -UGNSA, but I haven't used this tool much so I may just be missing something. What's the right way to test this patches ?
This is strange. It works fine for me. I test it for example this way:
- start sssd
- fill the cache with getent passwd user1
- turn off sssd
- getent passwd user1 again (result is returned from memory cache)
- sss_cache -u user1 (or sss_cache -U) should remove the cache
- getent passwd user1 should give no results
I also tested the sss_cache while sssd was running and it gives me expected results (sended SIGHUP to monitor and next getent did not use the mem cache).
I tested it on current master, but maybe there really is a bug. If it is still not working for you, could you send how you test it?
I tired the same above commands, however I didn't make sure to repopulate the cache after the first try, maybe I was indeed operating on an empty one, I'll retry with your 2 new patches (btw it would be nice if you could send them in separate emails in future, so patchwork picks both them up :).
Ok I have found out what I was doing wrong. I was trying with a user in a trusted domain. These users are apparently not yet stored in the memory ccache (need to open a bug on that I guess).
Once I started trying with an simple ipa users it worked fine.
ACK!
Simo.
On Thu, 2012-11-01 at 09:45 -0400, Simo Sorce wrote:
Ok I have found out what I was doing wrong. I was trying with a user in a trusted domain. These users are apparently not yet stored in the memory ccache (need to open a bug on that I guess).
Ah and I just found what is the issue here. I was trying with the FLATNAME\username form here, but the ccache still does not have an alias for that and only supports storing user@dns-domain form. That's why my getent queries with sssd stoped were failing for the trusted user.
Simo.
On 11/01/2012 02:45 PM, Simo Sorce wrote:
On Thu, 2012-11-01 at 09:14 -0400, Simo Sorce wrote:
On Thu, 2012-11-01 at 11:19 +0100, Michal Židek wrote:
On 11/01/2012 06:54 AM, Simo Sorce wrote:
On Wed, 2012-10-31 at 12:43 +0100, Michal Židek wrote:
On 10/30/2012 02:25 PM, Simo Sorce wrote:
On Tue, 2012-10-30 at 10:39 +0100, Michal Židek wrote: > On 10/29/2012 03:53 PM, Simo Sorce wrote: >> >> Codewise looks ok, but I still see duplication of the code used to lock >> the file. >> >> I was wondering, would it make sense to split this patch in 2 and put >> the lock_mc_file functions as a more generic function in the common >> utils as a separate patch first ? >> >> I would see it added in util/util_lock.c so it is available to both >> tools and responders (avoids the duplication you still have). >> >> The prototype should be something like: >> >> static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int >> retries, int wait); >> >> Where wait is expressed in milliseconds. >> > > I was thinking about this too, but there is one thing I do not like > about it. Maybe I am wrong, but IMO functions related to memory cache > should be as private to sssd_nss as possible. That is why the function > for locking the files is static (sssd_nss part). Function for memory > cache is invalidation is an exception to this rule, because it is used > by sss_cache, but it should also keep the code that manipulates memory > cache private (and it uses it only on one place, so static function was > not needed that much).
This is why I made it a generic function not strictly related to the cache in the prototype. I prefer to keep a single function to be used by all components, because copy&paste is a source of pain in the long term.
> I do not know about any other place, where we would like to lock the mc > files except of sssd_nss and the sss_memcache_invalidate used by > sss_cache and it should stay like this. I consider the usage of mc files > in sss_cache as necessary pollution, but providing non-static API to > spread this pollution does not sound good to me.
The function prototype I suggested is not mc specific. Look at it, it can be used with any file to lock any part, we can also extend it to take a lock type if you want it even more generic.
> But if you really want the util_lock.c, I can send it as a separate > patch on top of this one (it is only a small refactor). But still... I > am not sure if this is good idea.
I think it should be a patch that comes before the current one and just provides the facility, then the current patch will build on top of it.
Ok. I did not add the parameter to select the type of lock (not all types would fit to this function). But we can expand the util_lock.c later with additional functions if more lock types are needed.
>> Also you do not really need to unlock the file if you are going to >> close() it. >> Posix semantics require the OS to drop all locks on a file if you close >> it, so you can safely drop the logic around unlocking in tools_util.c >> (however add a comment before the close like: /* closing the file will >> drop the lock */ ). > > I thought closing the file explicitly is more readable. But yes, I'll > remove it, adding the comment as you suggested is better.
Yup, thanks, this part looks good, I think we only need to split the patch and move the locking function into util, then we are golden.
Simo.
New patches attached.
Michal, the structure now is fine, only one minor nitpick is that it would be probably better to remove references to mmap cache from sss_br_lock_file debugging messages as it is not mmap cache specific anymore.
Sure, this was leftover from previous versions.
Anyway I was trying to test your patch on my system but on current master, with your 3 patches on top I cannot get sss_cache to do anything as far as I can see.
It always returns: "No cache object matched the specified search"
Looking at the code it seem to me that this always happen (skipped is always true if you pass only one switch ?) unless all maps are full and you pass -UGNSA, but I haven't used this tool much so I may just be missing something. What's the right way to test this patches ?
This is strange. It works fine for me. I test it for example this way:
- start sssd
- fill the cache with getent passwd user1
- turn off sssd
- getent passwd user1 again (result is returned from memory cache)
- sss_cache -u user1 (or sss_cache -U) should remove the cache
- getent passwd user1 should give no results
I also tested the sss_cache while sssd was running and it gives me expected results (sended SIGHUP to monitor and next getent did not use the mem cache).
I tested it on current master, but maybe there really is a bug. If it is still not working for you, could you send how you test it?
I tired the same above commands, however I didn't make sure to repopulate the cache after the first try, maybe I was indeed operating on an empty one, I'll retry with your 2 new patches (btw it would be nice if you could send them in separate emails in future, so patchwork picks both them up :).
Ok I have found out what I was doing wrong. I was trying with a user in a trusted domain. These users are apparently not yet stored in the memory ccache (need to open a bug on that I guess).
Once I started trying with an simple ipa users it worked fine.
ACK!
Simo.
Just re-sending these patches. They are the same as previous just rebased so that they apply on top of the modified 'sss_cache: Multiple domains not handled properly' patch.
Thanks Michal
On Mon, Nov 05, 2012 at 01:26:04PM +0100, Michal Židek wrote:
On 11/01/2012 02:45 PM, Simo Sorce wrote:
On Thu, 2012-11-01 at 09:14 -0400, Simo Sorce wrote:
On Thu, 2012-11-01 at 11:19 +0100, Michal Židek wrote:
On 11/01/2012 06:54 AM, Simo Sorce wrote:
On Wed, 2012-10-31 at 12:43 +0100, Michal Židek wrote:
On 10/30/2012 02:25 PM, Simo Sorce wrote: >On Tue, 2012-10-30 at 10:39 +0100, Michal Židek wrote: >>On 10/29/2012 03:53 PM, Simo Sorce wrote: >>> >>>Codewise looks ok, but I still see duplication of the code used to lock >>>the file. >>> >>>I was wondering, would it make sense to split this patch in 2 and put >>>the lock_mc_file functions as a more generic function in the common >>>utils as a separate patch first ? >>> >>>I would see it added in util/util_lock.c so it is available to both >>>tools and responders (avoids the duplication you still have). >>> >>>The prototype should be something like: >>> >>>static errno_t sss_br_lock_file(int fd, size_t start, size_t len, int >>>retries, int wait); >>> >>>Where wait is expressed in milliseconds. >>> >> >>I was thinking about this too, but there is one thing I do not like >>about it. Maybe I am wrong, but IMO functions related to memory cache >>should be as private to sssd_nss as possible. That is why the function >>for locking the files is static (sssd_nss part). Function for memory >>cache is invalidation is an exception to this rule, because it is used >>by sss_cache, but it should also keep the code that manipulates memory >>cache private (and it uses it only on one place, so static function was >>not needed that much). > >This is why I made it a generic function not strictly related to the >cache in the prototype. >I prefer to keep a single function to be used by all components, because >copy&paste is a source of pain in the long term. > >>I do not know about any other place, where we would like to lock the mc >>files except of sssd_nss and the sss_memcache_invalidate used by >>sss_cache and it should stay like this. I consider the usage of mc files >>in sss_cache as necessary pollution, but providing non-static API to >>spread this pollution does not sound good to me. > >The function prototype I suggested is not mc specific. Look at it, it >can be used with any file to lock any part, we can also extend it to >take a lock type if you want it even more generic. > >>But if you really want the util_lock.c, I can send it as a separate >>patch on top of this one (it is only a small refactor). But still... I >>am not sure if this is good idea. > >I think it should be a patch that comes before the current one and just >provides the facility, then the current patch will build on top of it. >
Ok. I did not add the parameter to select the type of lock (not all types would fit to this function). But we can expand the util_lock.c later with additional functions if more lock types are needed.
>>>Also you do not really need to unlock the file if you are going to >>>close() it. >>>Posix semantics require the OS to drop all locks on a file if you close >>>it, so you can safely drop the logic around unlocking in tools_util.c >>>(however add a comment before the close like: /* closing the file will >>>drop the lock */ ). >> >>I thought closing the file explicitly is more readable. But yes, I'll >>remove it, adding the comment as you suggested is better. > >Yup, thanks, this part looks good, I think we only need to split the >patch and move the locking function into util, then we are golden. > >Simo. >
New patches attached.
Michal, the structure now is fine, only one minor nitpick is that it would be probably better to remove references to mmap cache from sss_br_lock_file debugging messages as it is not mmap cache specific anymore.
Sure, this was leftover from previous versions.
Anyway I was trying to test your patch on my system but on current master, with your 3 patches on top I cannot get sss_cache to do anything as far as I can see.
It always returns: "No cache object matched the specified search"
Looking at the code it seem to me that this always happen (skipped is always true if you pass only one switch ?) unless all maps are full and you pass -UGNSA, but I haven't used this tool much so I may just be missing something. What's the right way to test this patches ?
This is strange. It works fine for me. I test it for example this way:
- start sssd
- fill the cache with getent passwd user1
- turn off sssd
- getent passwd user1 again (result is returned from memory cache)
- sss_cache -u user1 (or sss_cache -U) should remove the cache
- getent passwd user1 should give no results
I also tested the sss_cache while sssd was running and it gives me expected results (sended SIGHUP to monitor and next getent did not use the mem cache).
I tested it on current master, but maybe there really is a bug. If it is still not working for you, could you send how you test it?
I tired the same above commands, however I didn't make sure to repopulate the cache after the first try, maybe I was indeed operating on an empty one, I'll retry with your 2 new patches (btw it would be nice if you could send them in separate emails in future, so patchwork picks both them up :).
Ok I have found out what I was doing wrong. I was trying with a user in a trusted domain. These users are apparently not yet stored in the memory ccache (need to open a bug on that I guess).
Once I started trying with an simple ipa users it worked fine.
ACK!
Simo.
Just re-sending these patches. They are the same as previous just rebased so that they apply on top of the modified 'sss_cache: Multiple domains not handled properly' patch.
Thanks Michal
Pushed to master and sssd-1-9
Hi,
+errno_t sss_br_lock_file(int fd, size_t start, size_t len,
int retries, useconds_t wait)+{
- int ret;
- struct flock lock;
- int retries_left;
- lock.l_type = F_WRLCK;
- lock.l_whence = SEEK_SET;
- lock.l_start = start;
- lock.l_len = len;
- lock.l_pid = 0;
- for (retries_left = retries; retries_left > 0; retries_left--) {
ret = fcntl(fd, F_SETLK, &lock);if (ret == -1) {ret = errno;if (ret == EACCES || ret == EAGAIN || ret == EINTR) {DEBUG(SSSDBG_TRACE_FUNC,("Failed to lock file. Retries left: %d\n",retries_left - 1));if ((ret == EACCES || ret == EAGAIN) && (retries_left <= 1)) {/* File is locked by someone else. Return EACCESS* if this is the last try. */return EACCES;}if (retries_left - 1 > 0) {ret = usleep(wait);if (ret == -1) {DEBUG(SSSDBG_MINOR_FAILURE,("usleep() failed -> ignoring\n"));}}} else {/* Error occurred */DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock file.\n"));return ret;}} else if (ret == 0) {/* File successfuly locked */break;}- }
- if (retries_left == 0) {
DEBUG(SSSDBG_CRIT_FAILURE, ("Unable to lock file.\n"));return ret;
if retries is 0, ret is uninitialized. Can you fix this? All credits goes to gcc, I've just seen the compiler warning.
bye, Sumit
- }
- return EOK;
+}
On 11/06/2012 09:45 PM, Sumit Bose wrote:
Hi,
+errno_t sss_br_lock_file(int fd, size_t start, size_t len,
int retries, useconds_t wait)+{
- int ret;
- struct flock lock;
- int retries_left;
- lock.l_type = F_WRLCK;
- lock.l_whence = SEEK_SET;
- lock.l_start = start;
- lock.l_len = len;
- lock.l_pid = 0;
- for (retries_left = retries; retries_left > 0; retries_left--) {
ret = fcntl(fd, F_SETLK, &lock);if (ret == -1) {ret = errno;if (ret == EACCES || ret == EAGAIN || ret == EINTR) {DEBUG(SSSDBG_TRACE_FUNC,("Failed to lock file. Retries left: %d\n",retries_left - 1));if ((ret == EACCES || ret == EAGAIN) && (retries_left <= 1)) {/* File is locked by someone else. Return EACCESS* if this is the last try. */return EACCES;}if (retries_left - 1 > 0) {ret = usleep(wait);if (ret == -1) {DEBUG(SSSDBG_MINOR_FAILURE,("usleep() failed -> ignoring\n"));}}} else {/* Error occurred */DEBUG(SSSDBG_CRIT_FAILURE,("Unable to lock file.\n"));return ret;}} else if (ret == 0) {/* File successfuly locked */break;}- }
- if (retries_left == 0) {
DEBUG(SSSDBG_CRIT_FAILURE, ("Unable to lock file.\n"));return ret;if retries is 0, ret is uninitialized. Can you fix this? All credits goes to gcc, I've just seen the compiler warning.
bye, Sumit
- }
- return EOK;
+}
sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
Good catch. Having a value of 0 or <= 0 makes no sense here. I'll send a patch that returns EINVAL in such situation in a new thread.
Thanks Michal
sssd-devel@lists.fedorahosted.org