URL: https://github.com/SSSD/sssd/pull/947 Title: #947: tests: fix race conditions in integration tests
pbrezina commented: """
I. `test_files_provider.py`
`poll_canary` serves no purpose, because we currently block while refreshing the cache. We can either remove it or report a bug against sssd saying that it behaves incorrectly. @jhrozek When the files provider is updating the cache in the inotify callback (i.e. the domain is inconsistent), should the nss responder A) return not found and let the client look in the files database, or B) block until the cache is updated? Currently B is true, but the tests say A should be done.
Actually... Are you sure "B is true"? I just tested with a patch of `providers/files/files_ops.c:sf_passwd_cb()`: `+ while (1) { usleep(100000); }` right before `sf_enum_files()` and `id` works fine (according to your statement this should block).
I'm sure. See:
* https://github.com/SSSD/sssd/blob/master/src/responder/common/responder_dp.c...
* https://github.com/SSSD/sssd/blob/master/src/providers/files/files_id.c#L132
* https://github.com/SSSD/sssd/blob/master/src/providers/files/files_ops.c#L84...
I checked with `strace`:
openat(AT_FDCWD, "/lib64/libnss_sss.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/var/lib/sss/mc/passwd", O_RDONLY|O_CLOEXEC) = 3
connect(4, {sa_family=AF_UNIX, sun_path="/var/lib/sss/pipes/nss"}, 110) = 0
openat(AT_FDCWD, "/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 5
Because you used `usleep` in the wrong place. D-Bus communication is asynchronous, so once you put a sleep somewhere, the process can neither receive nor send any message. In this case, even though `dp_sbus_domain_inconsistent` was called, the message was not sent because you immediately went to sleep. Therefore, the nss responder answers with what it has in the cache. You will see this in the logs.
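To illustrate the point about asynchronous sending, here is a minimal Python sketch. It is a toy model built on `asyncio`, not SSSD's sbus code: the `outbox` list, the `run_demo` helper and the `"domain_inconsistent"` label are hypothetical names chosen only to mirror the situation described above. A callback queues a message and then blocks; the message is delivered only after the callback returns control to the event loop.

```python
import asyncio
import time


def run_demo(block_seconds=0.05):
    """Return (messages sent while blocked, messages sent after returning)."""
    outbox = []      # messages the loop has actually "sent" (toy stand-in)
    observed = {}

    async def main():
        loop = asyncio.get_running_loop()

        def callback():
            # Queue the message; the loop delivers it on a later iteration.
            loop.call_soon(outbox.append, "domain_inconsistent")
            # Blocking sleep: the loop cannot send or receive meanwhile.
            time.sleep(block_seconds)
            observed["while_blocked"] = list(outbox)

        loop.call_soon(callback)
        # Give the loop time to run the callback and then the queued send.
        await asyncio.sleep(block_seconds * 4)
        observed["after_return"] = list(outbox)

    asyncio.run(main())
    return observed["while_blocked"], observed["after_return"]


if __name__ == "__main__":
    print(run_demo())
```

With an endless `usleep` loop the callback never returns, so in this model the queued message would never leave the process at all.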
I'm not sure why it continues with nss_files; I am not that familiar with nss internals.
To reproduce it correctly, use this diff instead:
```diff
--- a/src/providers/files/files_ops.c
+++ b/src/providers/files/files_ops.c
@@ -836,9 +836,9 @@ static int sf_passwd_cb(const char *filename, uint32_t flags, void *pvt)
 
     ret = EOK;
 done:
-    id_ctx->updating_passwd = false;
+    // id_ctx->updating_passwd = false;
     sf_cb_done(id_ctx);
-    files_account_info_finished(id_ctx, BE_REQ_USER, ret);
+    // files_account_info_finished(id_ctx, BE_REQ_USER, ret);
     return ret;
 }
```
```
[root /dev/shm/sssd]# id pbrezina@files
uid=1000(pbrezina) gid=1000(pbrezina) groups=1000(pbrezina),10(wheel)
[root /dev/shm/sssd]# touch /etc/passwd
[root /dev/shm/sssd]# sss_cache -E
[root /dev/shm/sssd]# id pbrezina@files
... after a long timeout
id: ‘pbrezina@files’: no such user

[root /dev/shm/sssd]# id pbrezina
uid=1000(pbrezina) gid=1000(pbrezina) groups=1000(pbrezina),10(wheel)
[root /dev/shm/sssd]# touch /etc/passwd
[root /dev/shm/sssd]# sss_cache -E
[root /dev/shm/sssd]# id pbrezina
... after a long timeout
uid=1000(pbrezina) gid=1000(pbrezina) groups=1000(pbrezina),10(wheel)
```
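One possible reading of the two outcomes above, sketched as a toy model in Python. It assumes an nsswitch order of `sss files` for `passwd` and that the blocked lookup ends up as an "unavailable" status on the client side; neither is confirmed in this thread. The only part taken as given is glibc's default nsswitch behaviour of returning on success and continuing to the next source otherwise. All names (`Status`, `sss_module_timed_out`, `files_module`, `lookup`, `FILES_DB`) are hypothetical.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    NOTFOUND = "notfound"
    UNAVAIL = "unavail"


# Stand-in for /etc/passwd: only the short name exists there.
FILES_DB = {"pbrezina": "uid=1000(pbrezina)"}


def sss_module_timed_out(name):
    # Assumption: the responder never answers, so the client gives up.
    return Status.UNAVAIL, None


def files_module(name):
    entry = FILES_DB.get(name)
    if entry is None:
        return Status.NOTFOUND, None
    return Status.SUCCESS, entry


def lookup(name, modules):
    """Walk the source chain: return on SUCCESS, otherwise try the next one."""
    for module in modules:
        status, entry = module(name)
        if status is Status.SUCCESS:
            return entry
    return None


if __name__ == "__main__":
    chain = [sss_module_timed_out, files_module]
    print(lookup("pbrezina", chain))
    print(lookup("pbrezina@files", chain))
```

Under these assumptions the short name is answered by the files source after the timeout, while the fully qualified name has no match anywhere, which is consistent with the session above.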
"""
See the full comment at https://github.com/SSSD/sssd/pull/947#issuecomment-562093538