Hi When I'm debugging or adding new objects and attributes to the directory, I need to be able to turn off the sssd cache. Otherwise, I do not see any of the changes I have made.
How can I force sssd to read from the directory and not from the cache?
Cheers, Steve
On Mon, Apr 15, 2013 at 09:10:20PM +0200, steve wrote:
Hi When I'm debugging or adding new objects and attributes to the directory, I need to be able to turn off the sssd cache. Otherwise, I do not see any of the changes I have made.
How can I force sssd to read from the directory and not from the cache?
Cheers, Steve
Use the sss_cache utility. For example, running sss_cache -UG invalidates all users and groups. See man sss_cache for more examples.
On 04/15/2013 09:10 PM, steve wrote:
Hi When I'm debugging or adding new objects and attributes to the directory, I need to be able to turn off the sssd cache. Otherwise, I do not see any of the changes I have made.
How can I force sssd to read from the directory and not from the cache?
Cheers, Steve
Hi again OK, I found it. sss_cache
Unfortunately it gives an error even if a correct switch and domain are given:
sudo sss_cache -d default
Usage: sss_cache [-?UGNSA] [-?|--help] [--usage] [-u|--user=STRING] [-U|--users] [-g|--group=STRING] [-G|--groups] [-n|--netgroup=STRING] [-N|--netgroups] [-s|--service=STRING] [-S|--services] [-a|--autofs-map=STRING] [-A|--autofs-maps] [-d|--domain=STRING]
Please select at least one object to invalidate
(Tue Apr 16 09:37:15:820975 2013) [sssd] [main] (0x0020): Error initializing context for the application
The other switches, e.g. sss_cache -u steve2, work OK.
sssd 1.9.4
Cheers, Steve
On Tue, 16 Apr 2013, steve wrote:
Hi again OK, I found it. sss_cache
Unfortunately it gives an error even if a correct switch and domain are given:
sudo sss_cache -d default
Usage: sss_cache [-?UGNSA] [-?|--help] [--usage] [-u|--user=STRING] [-U|--users] [-g|--group=STRING] [-G|--groups] [-n|--netgroup=STRING] [-N|--netgroups] [-s|--service=STRING] [-S|--services] [-a|--autofs-map=STRING] [-A|--autofs-maps] [-d|--domain=STRING]
Please select at least one object to invalidate
(Tue Apr 16 09:37:15:820975 2013) [sssd] [main] (0x0020): Error initializing context for the application
The other switches, e.g. sss_cache -u steve2, work OK.
sssd 1.9.4
Surely that should be:
sss_cache -d default -UG
or just
sss_cache -UG
But to be honest, I'd favour the more brutal technique while debugging. sss_cache invalidates the cache, but if sssd can't contact the LDAP servers it'll still serve from cache I thought. I may be wrong on that point though.
I've always gone for the completely unambiguous:
service sssd stop
rm -f /var/lib/sss/{db,mc}/* /var/log/sssd/*
service sssd start
That way, I'm clear that it knew nothing, and that the logs I'm looking at are 100% from the current config.
jh
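The point of the exchange above is that -d only narrows the scope, and sss_cache still wants at least one object selector (the usage output says "Please select at least one object to invalidate"). A minimal sketch of a wrapper that enforces this before calling the tool follows. The names invalidate_cache, run and DRY_RUN are hypothetical helpers introduced for illustration; only the sss_cache switches shown in the usage text are taken from the thread.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: invalidate_cache [-d DOMAIN] SELECTOR...
# e.g.   invalidate_cache -d default -U -G
invalidate_cache() {
    domain=
    if [ "${1:-}" = "-d" ]; then
        if [ "$#" -lt 2 ]; then
            echo "invalidate_cache: -d needs a domain name" >&2
            return 2
        fi
        domain=$2
        shift 2
    fi
    # A domain on its own is not enough: refuse early, as sss_cache would.
    if [ "$#" -eq 0 ]; then
        echo "invalidate_cache: select at least one object (-U, -G, -u NAME, ...)" >&2
        return 2
    fi
    if [ -n "$domain" ]; then
        run sss_cache -d "$domain" "$@"
    else
        run sss_cache "$@"
    fi
}
```

With DRY_RUN=1 set, invalidate_cache -d default -U -G only prints the command it would run, which makes it easy to check a script before touching a live cache.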
On 16/04/13 10:02, John Hodrien wrote:
Surely that should be:
sss_cache -d default -UG
or just
sss_cache -UG
But to be honest, I'd favour the more brutal technique while debugging. sss_cache invalidates the cache, but if sssd can't contact the LDAP servers it'll still serve from cache I thought. I may be wrong on that point though.
I've always gone for the completely unambiguous:
service sssd stop
rm -f /var/lib/sss/{db,mc}/* /var/log/sssd/*
service sssd start
That way, I'm clear that it knew nothing, and that the logs I'm looking at are 100% from the current config.
jh
Hi Thanks for the syntax. It works perfectly now. Good advice about the brutal technique too.
I'm actually trying to debug a bash script which runs getent passwd <user>. On Ubuntu it seems that getent is run in a different process as it returns nothing. The same script on openSUSE returns as expected. I know it's OT but any ideas how to get output from getent in an Ubuntu bash script? Cheers, Steve
On 04/16/2013 04:44 AM, steve wrote:
Hi Thanks for the syntax. It works perfectly now. Good advice about the brutal technique too.
I'm actually trying to debug a bash script which runs getent passwd <user>. On Ubuntu it seems that getent is run in a different process as it returns nothing. The same script on openSUSE returns as expected. I know it's OT but any ideas how to get output from getent in an Ubuntu bash script?
is it 'getent passwd <user>' that returns nothing or 'getent passwd' that returns nothing?
Cheers, Steve
sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
On 04/16/2013 04:35 PM, Dmitri Pal wrote:
is it 'getent passwd <user>' that returns nothing or 'getent passwd' that returns nothing?
Hi Inside the script neither command returns anything.
On Tue, Apr 16, 2013 at 10:44:08AM +0200, steve wrote:
Hi Thanks for the syntax. It works perfectly now. Good advice about the brutal technique too.
I'm actually trying to debug a bash script which runs getent passwd <user>. On Ubuntu it seems that getent is run in a different process as it returns nothing. The same script on openSUSE returns as expected. I know it's OT but any ideas how to get output from getent in an Ubuntu bash script? Cheers, Steve
This really shouldn't matter. Does getent on Ubuntu work fine outside the bash script?
On 04/16/2013 05:17 PM, Jakub Hrozek wrote:
This really shouldn't matter. Does getent on Ubuntu work fine outside the bash script?
Yes, it's fine outside the script. It's just that the script sets permissions on some files to the new user. We have to use his numeric gidNumber because his username is not available to use as such (as getent shows)
On Tue, Apr 16, 2013 at 05:21:35PM +0200, steve wrote:
Yes, it's fine outside the script. It's just that the script sets permissions on some files to the new user. We have to use his numeric gidNumber because his username is not available to use as such (as getent shows)
The only file getent should care about is nsswitch.conf
Does the script run as that particular user?
Also, I think Ubuntu runs a different shell, maybe that would make a difference in a script.
The other things to try would be stracing the getent to see if the getent actually reaches the SSSD and checking the sssd_nss logs to reference the lookup.
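The two checks suggested above (trace getent, then look for the lookup in the sssd_nss log) can be sketched as a small helper. This is an illustration under assumptions, not a procedure given in the thread: debug_lookup, run and DRY_RUN are hypothetical names, the log path /var/log/sssd/sssd_nss.log is assumed from the /var/log/sssd/ directory mentioned earlier, and the log is only likely to be useful if debugging has been turned up for the NSS responder.

```shell
# Run a command, or just print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: debug_lookup USER [LOGFILE]
debug_lookup() {
    user=$1
    log=${2:-/var/log/sssd/sssd_nss.log}   # assumed log location
    # Does getent try to connect to an SSSD socket at all?
    run strace -f -e trace=connect getent passwd "$user"
    # Does the NSS responder log mention the lookup?
    run grep -i -- "$user" "$log"
}
```

Running it once from the interactive shell and once from inside the failing script would show whether the two environments take the same path to SSSD.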
On 04/16/2013 05:39 PM, Jakub Hrozek wrote:
The only file getent should care about is nsswitch.conf
Does the script run as that particular user?
Also, I think Ubuntu runs a different shell, maybe that would make a difference in a script.
The other things to try would be stracing the getent to see if the getent actually reaches the SSSD and checking the sssd_nss logs to reference the lookup.
Hi everyone It seems to take time for sssd to pick up new objects. I added a sleep 10 before doing anything with the new user and could then use his username or, put another way, getent passwd username worked after the sleep.
10 seconds is fine as we don't add users that often but it would be nice if nss could pick it up a bit quicker.
This is only my second day with sssd and I would like to thank all the devs for a great project. Steve
On Tue 16 Apr 2013 12:34:53 PM EDT, steve wrote:
Hi everyone It seems to take time for sssd to pick up new objects. I added a sleep 10 before doing anything with the new user and could then use his username or, put another way, getent passwd username worked after the sleep.
10 seconds is fine as we don't add users that often but it would be nice if nss could pick it up a bit quicker.
This is only my second day with sssd and I would like to thank all the devs for a great project.
If you request the user before the LDAP server has propagated it to all the replicas, you may be hitting SSSD's negative cache (which is 15s). The negative cache exists to avoid repeated lookups against the LDAP server if, for example, you enter the wrong username into a script that makes multiple requests.
How are you creating the user, and how did you determine that 10 seconds was an appropriate sleep time? Depending on the server you use, the ldapmodify might return immediately while actually taking a few hundred milliseconds to finish flushing the transaction on the server-side. So if you are creating the user and processing it in the same script, you may be creating a race condition for yourself.
On 04/16/2013 08:53 PM, Stephen Gallagher wrote:
If you request the user before the LDAP server has propagated it to all the replicas, you may be hitting SSSD's negative cache (which is 15s). The negative cache exists to avoid repeated lookups against the LDAP server if, for example, you enter the wrong username into a script that makes multiple requests.
How are you creating the user, and how did you determine that 10 seconds was an appropriate sleep time?
Hi The actual code at that point in the script is:
echo "dn: cn=$1,cn=Users,$basedn
changetype: modify
add: objectClass
objectClass: posixAccount
-
add: uidNumber
uidNumber: $uid
-
add: gidNumber
gidNumber: $gid
-
add:unixHomeDirectory
unixHomeDirectory: $unixhome
-
add: loginShell
loginShell: /bin/bash" > /tmp/$1
ldbmodify --url=$db $auth /tmp/$1

mkdir $unixhome
sleep 10 ##This is the line where the wait is necessary
chown -R "$1":"Domain Users" $unixhome
We add the rfc2307 attributes to the user $1. It takes 10 seconds for the user to become visible. We started at 2 seconds and tested increments of 2 until we saw the user emerge from the ldap at 10 seconds. With nslcd we do not have the wait. I think this must be something to do with the sssd cache.
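A fixed sleep 10 guesses at how long the new user takes to appear. One alternative is to poll the lookup until it succeeds or a deadline passes. The following is a minimal sketch; wait_for is a hypothetical helper, not something from the script above. Note that if an earlier lookup for the same name already failed, SSSD may keep answering "not found" from its negative cache (15 seconds by default, as mentioned in this thread), so the timeout should be longer than that.

```shell
# Retry a command until it succeeds or TIMEOUT seconds have been waited.
# usage: wait_for TIMEOUT INTERVAL COMMAND [ARG...]
# TIMEOUT and INTERVAL are whole seconds.
wait_for() {
    timeout=$1
    interval=$2
    shift 2
    waited=0
    until "$@" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1            # gave up: the command never succeeded
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

In the script above, the sleep line could then become something like wait_for 30 2 getent passwd "$1", followed by the chown only if that returns success.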
Depending on the server you use, the ldapmodify might return immediately while actually taking a few hundred milliseconds to finish flushing the transaction on the server-side. So if you are creating the user and processing it in the same script, you may be creating a race condition for yourself.
I'm new to all this stuff so I'm not sure what a race condition is but yes we are creating a user and processing him immediately afterwards. Maybe we can't do that with sssd.
The other problem we have is when we change e.g. the unixHomeDirectory for a user on the server. How do we tell the clients that something has changed? We can't go around all clients, login as root and issue a sss_cache -u <user>! Any ideas?
Cheers and thanks for your patience. Steve
On 04/17/2013 02:13 AM, steve wrote:
We add the rfc2307 attributes to the user $1. It takes 10 seconds for the user to become visible. We started at 2 seconds and tested increments of 2 until we saw the user emerge from the ldap at 10 seconds. With nslcd we do not have the wait. I think this must be something to do with the sssd cache.
Jakub, Stephen, does this hit SSSD negative cache? Is it fast cache? If this is a special system is there a way to reduce negative cache timeout? Is it:
entry_negative_timeout (integer)
    Specifies for how many seconds nss_sss should cache negative cache hits (that is, queries for invalid database entries, like nonexistent ones) before asking the back end again.

    Default: 15
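If entry_negative_timeout is the right knob, it is set in the [nss] section of sssd.conf. The sketch below just prints such a fragment for a chosen number of seconds; the function name is hypothetical and the value 5 in the usage note is only an illustration, not a recommendation from the thread.

```shell
# Print an [nss] fragment for sssd.conf with the given negative-cache lifetime.
# With no argument it prints the default of 15 quoted from the man page above.
nss_negative_timeout_fragment() {
    seconds=${1:-15}
    case $seconds in
        ''|*[!0-9]*)
            echo "expected a whole number of seconds" >&2
            return 2
            ;;
    esac
    printf '[nss]\nentry_negative_timeout = %s\n' "$seconds"
}
```

For example, nss_negative_timeout_fragment 5 prints a two-line fragment that could be merged into the existing [nss] section. SSSD would presumably need a restart to pick up the change.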
Depending on the server you use, the ldapmodify might return immediately while actually taking a few hundred milliseconds to finish flushing the transaction on the server-side. So if you are creating the user and processing it in the same script, you may be creating a race condition for yourself.
I'm new to all this stuff so I'm not sure what a race condition is but yes we are creating a user and processing him immediately afterwards. Maybe we can't do that with sssd.
Yes this is a race.
The other problem we have is when we change e.g. the unixHomeDirectory for a user on the server. How do we tell the clients that something has changed? We can't go around all clients, login as root and issue a sss_cache -u <user>! Any ideas?
It will be corrected next time user logs in I assume. So is this a problem? Is there a reason why the change should happen earlier than that?
Cheers and thanks for your patience. Steve
The other problem we have is when we change e.g. the unixHomeDirectory for a user on the server. How do we tell the clients that something has changed? We can't go around all clients, login as root and issue a sss_cache -u <user>! Any ideas?
It will be corrected next time user logs in I assume. So is this a problem? Is there a reason why the change should happen earlier than that?
Hi Unfortunately that only works for the first login after the change, e.g.:
1. Change uidNumber for steve2
2. steve2 logs in and id shows his new uid
3. log out
4. change the number back to what it was
5. steve2 logs in again but the uidNumber has not changed
Also, for groups: sss_cache -g group does not flush when primaryGroupID has been changed in AD. /var/lib/sss/db/* must be removed for this to happen.
I have workarounds for all this but they involve going to each client, logging in as root and either issuing the sss_cache commands or removing the db and starting again.
Cheers, Steve
On Thu, 18 Apr 2013, steve wrote:
Hi Unfortunately that only works for the first login after the change, e.g.:
1. Change uidNumber for steve2
2. steve2 logs in and id shows his new uid
3. log out
4. change the number back to what it was
5. steve2 logs in again but the uidNumber has not changed
Also, for groups: sss_cache -g group does not flush when primaryGroupID has been changed in AD. /var/lib/sss/db/* must be removed for this to happen.
Eh? Would you expect to need to clear user not group information to force primaryGroupID changes to get noticed?
I have workarounds for all this but they involve going to each client, logging in as root and either issuing the sss_cache commands or removing the db and starting again.
Can I just query one thing? Why on earth are you changing user attributes for users so frequently?
Forget the effect sssd has, you're completely hanging out to dry any running processes of these users every time you do this.
jh
On 04/18/2013 09:50 AM, John Hodrien wrote:
Eh? Would you expect to need to clear user not group information to force primaryGroupID changes to get noticed?
Having the user login has no effect. getent still shows him as memberOf (he appears alongside his now primary group and not, as should happen, alongside his secondary group).
I have workarounds for all this but they involve going to each client, logging in as root and either issuing the sss_cache commands or removing the db and starting again.
Can I just query one thing? Why on earth are you changing user attributes for users so frequently?
Yes. Thanks. We have to justify a choice between winbind, nslcd and sssd for a situation where 600 users can log in to any one of around 80 machines in a Samba4 domain. Adding/removing a user to a group is quite common. This is not recognised on the clients unless root intervenes: Impossible! Less common, but common enough in our environment, is moving a user's home directory.
We've eliminated winbind and are left with nslcd, which is time consuming to implement (but which passes all the tests), and sssd with its point and click configuration. We'd really like to go with sssd but we have to prove in a test lab that what we do will be covered. We simply have to maintain the domain centrally. We cannot visit 80 clients every time a change is made.
Forget the effect sssd has, you're completely hanging out to dry any running processes of these users every time you do this.
As I say, nslcd copes with this. I'm trying to get to the stage where we can configure sssd to do it too. sssd is like nslcd running with nscd: sssd = nslcd + nscd? Cheers, Steve
On Thu, 18 Apr 2013, steve wrote:
Having the user login has no effect. getent still shows him as memberOf (he appears alongside his now primary group and not, as should happen, alongside his secondary group).
Perhaps I was misunderstanding. I thought you were changing a user's primary group, and weren't seeing that updated. I'd expect you to have to wait for the cache to clear, or do:
sss_cache -u thatuser
Maybe I was misunderstanding what you're trying to do.
Can I just query one thing? Why on earth are you changing user attributes for users so frequently?
Yes. Thanks. We have to justify a choice between winbind, nslcd and sssd for a situation where 600 users can log in to any one of around 80 machines in a Samba4 domain. Adding/removing a user to a group is quite common. This is not recognised on the clients unless root intervenes: Impossible! Less common, but common enough in our environment, is moving a user's home directory.
It's not recognised on the clients until the cache expires, but I don't see how that can not be the case. This'd also be the case with Windows, where the user's PAC will be used to verify group membership, which often means forcing a user to log off and back on again to update group membership.
We've eliminated winbind and are left with nslcd, which is time consuming to implement (but which passes all the tests), and sssd with its point and click configuration. We'd really like to go with sssd but we have to prove in a test lab that what we do will be covered. We simply have to maintain the domain centrally. We cannot visit 80 clients every time a change is made.
Group membership changes propagate in our environment just fine within a reasonable period of time. What should we be talking by default, 5 minutes?
Forget the effect sssd has, you're completely hanging out to dry any running processes of these users every time you do this.
As I say, nslcd copes with this. I'm trying to get to the stage where we can configure sssd to do it too. sssd is like nslcd running with nscd: sssd = nslcd + nscd?
If you're just talking about changing group membership, then yes. But I thought you'd also talked of changing uids of existing users. Equally why would you be changing primary group membership of users on a frequent basis?
Either you have a cache, or you don't. If you just disabled the cache (as I believe has been suggested) does it behave as you think you want?
jh
On 04/18/2013 11:30 AM, John Hodrien wrote:
Group membership changes propagate in our environment just fine within a reasonable period of time. What should we be talking by default, 5 minutes?
Hi OK. I've just removed a user from a group and logged in as that user. After 30 minutes id, getent and tests on what he can access still show him to be a member. That's too long.
Could you do me a big favour and have a look at our client conf?
[sssd]
services = nss, pam
config_file_version = 2
domains = default

[nss]

[pam]

[domain/default]
ldap_schema = rfc2307bis
access_provider = simple
enumerate = FALSE
cache_credentials = true
id_provider = ldap
auth_provider = krb5
chpass_provider = krb5
krb5_realm = DOLORES.SITE
krb5_server = doloresdc.dolores.site
krb5_kpasswd = doloresdc.dolores.site

ldap_uri = ldap://doloresdc.dolores.site
ldap_search_base = dc=dolores,dc=site
#ldap_tls_cacertdir = /usr/local/samba/private/tls
#ldap_id_use_start_tls = true
ldap_user_object_class = user
ldap_user_name = samAccountName
ldap_user_uid_number = uidNumber
ldap_user_gid_number = gidNumber
ldap_user_home_directory = unixHomeDirectory
ldap_user_shell = loginShell
ldap_group_object_class = group
ldap_group_search_base = dc=dolores,dc=site
ldap_group_name = cn
ldap_group_member = member
ldap_user_search_filter =(&(objectCategory=User)(uidNumber=*))

ldap_sasl_mech = gssapi
ldap_sasl_authid = ALGORFA$
ldap_krb5_keytab = /etc/krb5.keytab
ldap_krb5_init_creds = true
Cheers
On Thu, Apr 18, 2013 at 01:13:17PM +0200, steve wrote:
On 04/18/2013 11:30 AM, John Hodrien wrote:
On Thu, 18 Apr 2013, steve wrote:
Having the user login has no effect. getent still shows him as memberOf (he appears alongside his now primary group and not, as should happen, alongside his secondary group).
Perhaps I was misunderstanding. I thought you were changing a user's primary group, and weren't seeing that updated. I'd expect you to have to wait for the cache to clear, or do:
sss_cache -u thatuser
Maybe I was misunderstanding what you're trying to do.
Can I just query one thing? Why on earth are you changing user attributes for users so frequently?
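The per-user invalidation suggested above (sss_cache -u) can be wrapped in a small helper. This is only a sketch: the function name refresh_user and the DRY_RUN switch are made up for illustration, and sss_cache itself normally has to be run as root.

```shell
# Sketch: drop one user's cached entry, then look the user up again.
# DRY_RUN is a hypothetical switch for this example only: when it is set,
# the commands are printed instead of executed.
refresh_user() {
    user="$1"
    run="${DRY_RUN:+echo}"
    $run sss_cache -u "$user"   # mark the cached entry for this user as expired
    $run id "$user"             # look the user up again after invalidation
}
```

Calling it as DRY_RUN=1 refresh_user thatuser prints the two commands without touching the cache, which is handy for checking what would be run on a client.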
Hi OK. I've just removed a user from a group and logged in as that user. After 30 minutes id, getent and tests on what he can access still show him to be a member. That's too long.
From man sssd.conf:
entry_cache_timeout (integer) How many seconds should nss_sss consider entries valid before asking the backend again
Default: 5400
So the default cache lifetime is 5400 seconds (90 minutes). You can set a shorter one if you need the entries to be updated more frequently.
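As a rough illustration of shortening the lifetime, the option goes into the domain section of sssd.conf. The value 300 below is an arbitrary example, not a recommendation, and sssd needs a restart to pick it up.

```ini
[domain/default]
# Illustrative value only: treat cached entries as valid for 5 minutes
# instead of the default 5400 seconds.
entry_cache_timeout = 300
```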
Hi. It has no effect. I set:
entry_cache_timeout = 10
and restarted sssd, then waited for a minute or so, but getent, id and the user's permissions were still those of a group member. This suggests that the cache is still being consulted. It sometimes works, but after a variable length of time. The current test (removing a user from a group) has been running for 20 minutes, but the user is still a member of the group. Stuck!
On 04/18/2013 05:30 AM, John Hodrien wrote:
On Thu, 18 Apr 2013, steve wrote:
Forget the effect sssd has, you're completely hanging out to dry any running processes of these users every time you do this.
As I say, nslcd copes with this. I'm trying to get to the stage where we can configure sssd to do it too. sssd is like nslcd running with nscd: sssd = nslcd + nscd?
If you're just talking about changing group membership, then yes. But I thought you'd also talked of changing uids of existing users. Equally why would you be changing primary group membership of users on a frequent basis?
Either you have a cache, or you don't. If you just disabled the cache (as I believe has been suggested) does it behave as you think you want?
Yes, I agree with John.
There is a bit of confusion here.
1) The UID and GID of the user are not frequently changed. It is bad practice to change them because all the files owned by the user also need to be chowned. We do not have a good solution for addressing that. We have a ticket to design something in the future, but so far, if you change the UID/GID of the user, you have to clean the caches on the systems the user has access to. This is the flip side of caching. So think twice before changing UID/GID.
2) Changing the home directory is fine and would propagate when the user next logs in. You can't view UID/GID and home directory as similar attributes. They have different meanings and implications.
3) Group membership is also updated when the user logs in or there is a lookup for user information. Several cache parameters define how frequently that would happen. Everything is currently optimized for the performance of the applications running on the client. If you want less latency, you would have to give up some of the performance gains that caching provides.
jh _______________________________________________ sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel
On 04/18/2013 10:29 PM, Dmitri Pal wrote:
Hi Thanks for the explanation. There seem to be two different caches for sssd:
1. the cache for the user authentication credentials
2. the cache for ldap attributes (like nscd)
Is that correct? What you are saying is that when a user logs in, the cache is not consulted: the user login ALWAYS reads directly from the directory. Yes? If so, we are not observing that behavior. Even after a fresh user login after adding or removing a group membership, it can take up to half an hour for the change to propagate so that id, getent or user permissions reflect it. Are you absolutely certain that the login reads from the directory every time? If this were the case, then our problem would be solved.
We have no intention of randomly changing uid/gid. These are purely and simply evaluation tests that we are obliged to perform so thanks for the warning.
Could we request that an option is present e.g. in sssd.conf for each login to query LDAP and _not_ the cache?
Cheers and thanks for your help. Steve
On 04/18/2013 05:07 PM, steve wrote:
There are several different caches at different levels. Let us split the authentication and identity lookups. With authentication there are two types of caching:
* your password hash can be stored in the sssd cache database so that you can log in offline (optional)
* your password in clear is stored in the kernel keyring when you log in while not connected to the network, so that when you get on the VPN a Kerberos ticket can be automatically acquired (optional).
With identity lookups you have several caches:
1) fast cache that was added in 6.4 and is the equivalent of NSCD. It works for users and groups, and its purpose is to provide caching for the cases when a process does multiple lookups per second for the same ID.
2) normal sssd cache. It has a complex algorithm for keeping it up to date. For more about it see man sssd.conf and the parameter entry_cache_nowait_percentage.
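To make the entry_cache_nowait_percentage reference concrete: as the sssd.conf man page describes it, a request arriving after that percentage of entry_cache_timeout has passed is still answered from the cache straight away, while sssd refreshes the entry in the background. The helper below is a sketch of that arithmetic only. The function name is made up, and the values in the usage note are the documented defaults rather than anything taken from this thread's configuration.

```shell
# Sketch: number of seconds after the last cache update from which a lookup
# is still answered from the cache but also triggers a background refresh.
# Both arguments are plain integers.
nowait_threshold() {
    timeout="$1"      # entry_cache_timeout, in seconds
    percentage="$2"   # entry_cache_nowait_percentage (0 disables the feature)
    echo $(( timeout * percentage / 100 ))
}
```

With the default timeout of 5400 seconds and the default percentage of 50, nowait_threshold 5400 50 gives 2700 seconds, that is, 45 minutes.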
When a user performs online authentication, SSSD will try to fetch the latest information from the server. At least this is how earlier versions of SSSD worked. If you do not see this in 1.9, that might be a bug. But before opening one, please make sure that your client actually has connectivity to the server at the moment you re-authenticate. We would want to see your sssd logs for the authentication attempt with a high debug_level value to detect what is wrong and why you do not see the group membership coming to the client in a reasonable time.
I suggest the following test:
- log into a box via sssd
- do id
- change user membership on the server
- lock screen/unlock screen (login/logout as a variant)
- do id
The group change should be propagated because the user has authenticated and we refetch user data on re-authentication. Otherwise, if you do not re-authenticate, the cache would be updated but at a much slower pace, as Jakub mentioned. If you do not see the changes in groups, then send us the sanitized sssd log for the default domain. Do not forget to set debug_level to 8 or 9 before running the test.
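The before/after comparison in that test can be made less error-prone by diffing two snapshots of the group list. A minimal sketch, assuming group names contain no spaces (names such as "Domain Users" would need different handling); the function name new_groups is made up for illustration.

```shell
# Sketch: print the groups present in the second snapshot but not the first.
# Each argument is a space-separated list of group names, e.g. from `id -Gn`.
new_groups() {
    before="$1"
    after="$2"
    for g in $after; do
        case " $before " in
            *" $g "*) ;;                  # already a member before the change
            *) printf '%s\n' "$g" ;;      # newly visible group
        esac
    done
}
```

On a client this could be used as: before=$(id -Gn someuser), change the membership on the server and re-authenticate, after=$(id -Gn someuser), then new_groups "$before" "$after". An empty result after re-authentication would suggest the change has not reached the client.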
HTH
On 04/18/2013 11:45 PM, Dmitri Pal wrote:
Hi Test performed using domain group staff (there is no local group called staff). User steve3 is not a member of staff.
method:
On the client:
1. steve3 logs in
2. id
3. logs out
On the DC:
1. sudo samba-tool group addmembers staff steve3
2. confirm: sudo samba-tool group listmembers staff
check: steve3 has a member attribute under the DN for staff
Back on the client:
1. steve3 logs in
2. id
3. getent group staff
4. logs out
5. sudo service sssd stop
6. tar made of the log files
result: On the second login, id shows that steve3 is not recognised as a staff group member.
conclusion: sssd has not read the user information from LDAP
The logs at debug_level = 9 are here: https://dl.dropboxusercontent.com/u/45150875/sssd.client.log.tar
Please note that entry_cache_timeout has no effect on these findings.

/etc/sssd/sssd.conf

[sssd]
debug_level = 9
services = nss, pam
config_file_version = 2
domains = default

[nss]
debug_level = 9

[pam]
debug_level = 9

[domain/default]
debug_level = 9
ldap_schema = rfc2307bis
access_provider = simple
enumerate = FALSE
#entry_cache_timeout = 10
cache_credentials = true
id_provider = ldap
auth_provider = krb5
chpass_provider = krb5
krb5_realm = DOLORES.SITE
krb5_server = doloresdc.dolores.site
krb5_kpasswd = doloresdc.dolores.site

ldap_uri = ldap://doloresdc.dolores.site/
ldap_search_base = dc=dolores,dc=site
#ldap_tls_cacertdir = /usr/local/samba/private/tls
#ldap_id_use_start_tls = true
ldap_user_object_class = user
ldap_user_name = samAccountName
ldap_user_uid_number = uidNumber
ldap_user_gid_number = gidNumber
ldap_user_home_directory = unixHomeDirectory
ldap_user_shell = loginShell
ldap_group_object_class = group
ldap_group_search_base = dc=dolores,dc=site
ldap_group_name = cn
ldap_group_member = member

ldap_sasl_mech = gssapi
ldap_sasl_authid = ALGORFA$
ldap_krb5_keytab = /etc/krb5.keytab
ldap_krb5_init_creds = true
Thanks for your time, Steve
On Fri, Apr 19, 2013 at 07:39:52AM +0200, steve wrote:
Hi Test performed using domain group staff (there is no local group called staff). User steve3 is not a member of staff.
Hi steve,
thank you for the patience.
method:
On the client:
1. steve3 logs in
2. id
3. logs out
On the DC:
1. sudo samba-tool group addmembers staff steve3
2. confirm: sudo samba-tool group listmembers staff
check: steve3 has a member attribute under the DN for staff
Back on the client:
1. steve3 logs in
2. id
3. getent group staff
4. logs out
5. sudo service sssd stop
6. tar made of the log files
This is an absolutely valid test and one that our QE performs regularly when qualifying a new release.
result: On the second login, id shows that steve3 is not recognised as a staff group member.
conclusion: sssd has not read the user information from LDAP
And the reason seems to be an SSSD crash after referrals from the Samba DC have been chased, so the new group information *has* been requested, but the request never ran to completion. I don't think we did much (if any) testing against a Samba 4 DC, so it's entirely possible the server is behaving in some strange way the client doesn't expect.
Can you try getting a core file using the tips Timo Aaltonen and I gave earlier in the thread?
As a temporary workaround, you can try setting ldap_referrals = False in sssd.conf to stop sssd from chasing referrals.
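In sssd.conf terms, the workaround would look roughly like this, assuming the option is placed in the same domain section as the other ldap_* settings:

```ini
[domain/default]
# Temporary workaround from this thread: stop sssd from chasing LDAP referrals.
ldap_referrals = False
```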
The logs at debug_level = 9 are here: https://dl.dropboxusercontent.com/u/45150875/sssd.client.log.tar
Thanks, the logs have been very helpful.
On Fri, Apr 19, 2013 at 11:08:23AM +0200, Jakub Hrozek wrote:
Can you try getting a core file using the tips Timo Aaltonen and I gave earlier in the thread?
btw if getting the core file turns out to be problematic, maybe a backtrace would be enough. You would install debugging symbols first, I think on Ubuntu it would be with something like "apt-get install sssd-dbg".
Then attach gdb to the sssd_be process, turn on logging (gdb would log to gdb.txt by default, I think) and resume the execution of sssd_be:
    # gdb program $(pidof sssd_be)
    (gdb) set logging on
    (gdb) continue
And run the test case. When the SSSD crashes, you could also save the core file:
    (gdb) generate-core-file
Thank you for your help debugging this problem!
UPDATE
It seems that sssd can't contact the LDAP server, in this case a stable Samba 4 installation. Samba 4 reports LDAP_PROTOCOL_ERROR when a member logs in. We can, however, run ldapsearch, ldbsearch, ldapmodify and ldbmodify just fine, using either a plain password or GSSAPI. We have also tried using the domain Administrator to authenticate rather than the machine account. Could this be an sssd bug?
Cheers, Steve
On Fri, Apr 19, 2013 at 12:45:36PM +0200, steve wrote:
I suggest the following test:
- log into a box via sssd
- do id
- change user membership on the server
- lock screen/unlock screen (login/logout as a variant)
- do id
Hi Test performed using domain group staff (there is no local group called staff). User steve3 is not a member of staff.
method: On the client:
- steve3 logs in
- id
- logs out
On the DC:
- sudo samba-tool group addmembers staff steve3
- confirm: sudo samba-tool group listmembers staff
UPDATE It seems that sssd can't contact the LDAP server, in this case a stable Samba 4 installation. Samba 4 reports LDAP_PROTOCOL_ERROR when a member logs in. We can however ldapsearch, ldbsearch, ldapmodify and ldbmodify just fine, using either a plain password or GSSAPI. We have also tried using the domain Administrator to authenticate rather than the machine account. Could this be a sssd bug?
If ldapsearch works, but sssd does not, then it might be a sssd bug. But can you check from the logs after which exact operation the LDAP_PROTOCOL_ERROR happens? Maybe we could then reproduce the bug locally or perhaps sssd is "just" using some kind of control it shouldn't.
On 04/19/2013 02:24 PM, Jakub Hrozek wrote:
If ldapsearch works, but sssd does not, then it might be a sssd bug. But can you check from the logs after which exact operation the LDAP_PROTOCOL_ERROR happens? Maybe we could then reproduce the bug locally or perhaps sssd is "just" using some kind of control it shouldn't.
Hi The LDAP_PROTOCOL_ERROR occurs once during user authentication and again upon logging out. It does not occur with getent or at any other time during the session. I'm almost certain that this has something to do with sssd; users authenticating against nslcd or winbind do not produce this response from the Samba4 ldap. I'll post over on samba-technical to see if they can help. Cheers, Steve
On Fri, Apr 19, 2013 at 04:48:15PM +0200, steve wrote:
Hi The LDAP_PROTOCOL_ERROR occurs once during user authentication and again upon logging out. It does not occur with getent or at any other time during the session. I'm almost certain that this has something to do with sssd; users authenticating against nslcd or winbind do not produce this response from the Samba4 ldap. I'll post over on samba-technical to see if they can help. Cheers, Steve
Yes, but can you post the relevant snippet from the logs? They should include the query that is failing.
On 04/19/2013 05:51 PM, Jakub Hrozek wrote:
Yes, but can you post the relevant snippet from the logs? They should include the query that is failing.
Hi I put the level 9 logs here: https://dl.dropboxusercontent.com/u/45150875/sssd.client.log.tar
I'm not a dev but I'll try. Here is where it fails getting the groups for the user (from sssd_default.log). I think that this is what produces the LDAP_PROTOCOL_ERROR.

(0x4000): Process user's groups
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_initgr_rfc2307bis_next_base] (0x0400): Searching for parent groups for user [CN=steve3,CN=Users,DC=dolores,DC=site] with base [dc=dolores,dc=site]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with [(&(member=CN=steve3,CN=Users,DC=dolores,DC=site)(objectclass=group)(cn=*))][dc=dolores,dc=site].
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [objectClass]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [cn]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [userPassword]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [gidNumber]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [nsUniqueId]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [modifyTimestamp]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [uSNChanged]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = 23
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_result] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_result] (0x2000): Trace: ldap_result found nothing!
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_result] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_ldap_connect_callback_add] (0x1000): New LDAP connection to [ldap://dolores.site/CN=Configuration,DC=dolores,DC=site] with fd [22].
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_rebind_proc] (0x1000): Successfully bind to [ldap://dolores.site/CN=Configuration,DC=dolores,DC=site].
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_REFERENCE]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_result] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_REFERENCE]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_result] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_REFERENCE]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_result] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328]
(Fri Apr 19 07:09:10 2013) [sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
On Fri, Apr 19, 2013 at 06:33:41PM +0200, steve wrote:
Hi I put the level 9 logs here: https://dl.dropboxusercontent.com/u/45150875/sssd.client.log.tar
I'm not a dev but I'll try: Here is where it fails getting the groups for the user (from sssd_default.log) I think that this is what produces the LDAP_PROTOCOL_ERROR.
Yes, but that's the same error as before. Let me explain the log snippet in more detail.

[sssd[be[default]]] [sdap_ldap_connect_callback_add] (0x1000): New LDAP connection to [ldap://dolores.site/CN=Configuration,DC=dolores,DC=site] with fd [22].
[sssd[be[default]]] [sdap_rebind_proc] (0x1000): Successfully bind to [ldap://dolores.site/CN=Configuration,DC=dolores,DC=site].
^^^^^^^^ An LDAP referral was followed here.

[sssd[be[default]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_REFERENCE]
[sssd[be[default]]] [sdap_process_result] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328]
[sssd[be[default]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_REFERENCE]
[sssd[be[default]]] [sdap_process_result] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328]
[sssd[be[default]]] [sdap_process_message] (0x4000): Message type: [LDAP_RES_SEARCH_REFERENCE]
[sssd[be[default]]] [sdap_process_result] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328]
[sssd[be[default]]] [sdap_process_result] (0x0040): ldap_result error: [Can't contact LDAP server]
[sssd[be[default]]] [sdap_handle_release] (0x2000): Trace: sh[0x893d088], connected[1], ops[0x895ef58], ldap[0x8933328], destructor_lock[0], release_memory[0]
[sssd[be[default]]] [remove_connection_callback] (0x4000): Successfully removed connection callback.
[sssd[be[default]]] [server_setup] (0x0400): CONFDB: /var/lib/sss/db/config.ldb
^^^^^ server_setup is the first function that a new sssd_be instance runs after it crashed and was respawned.

[sssd[be[default]]] [recreate_ares_channel] (0x0100): Initializing new c-ares channel
[sssd[be[default]]] [resolv_get_family_order] (0x1000): Lookup order: ipv4_first

So I'd like to ask you to try two things:
1) run the same case with ldap_referrals=False in the sssd.conf config file to stop SSSD from following referrals
2) if possible, gather the backtrace or the core file
Thank you!
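Before re-running the test it can help to confirm that the option really sits in the domain section that sssd reads. The following is a rough sketch only: sssd_conf_get is a hypothetical helper (not an sssd tool), and it assumes a plain key = value layout under [domain/default], the section name used in the configuration posted in this thread.

```shell
# sssd_conf_get FILE SECTION KEY
# Hypothetical helper: print the value of KEY inside [SECTION] of an
# INI-style file such as sssd.conf. Prints nothing if the key is absent.
# If the key appears more than once in the section, the last value wins.
sssd_conf_get() {
    awk -v section="[$2]" -v key="$3" '
        {
            line = $0
            gsub(/^[ \t]+|[ \t]+$/, "", line)      # trim surrounding whitespace
        }
        line ~ /^\[.*\]$/ { in_section = (line == section); next }
        in_section && line !~ /^[#;]/ {
            eq = index(line, "=")
            if (eq == 0) next
            k = substr(line, 1, eq - 1)
            v = substr(line, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) value = v
        }
        END { if (value != "") print value }
    ' "$1"
}

# Example (not run here; reading sssd.conf normally requires root):
#   sssd_conf_get /etc/sssd/sssd.conf domain/default ldap_referrals
```

After adding ldap_referrals = False under [domain/default] and restarting sssd, the example call should print False.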
On 04/19/2013 06:53 PM, Jakub Hrozek wrote:
So I'd like to ask you to try two things:
1) run the same case with ldap_referrals=False in the sssd.conf config file to stop SSSD from following referrals
2) if possible, gather the backtrace or the core file
Thank you!
Hi. Brilliant. With ldap_referrals=False it works instantaneously. I can remove steve3 from staff and it is reflected immediately after logging in by id and getent group staff. Similarly for re-adding him to the group. Upon logging in, he appears immediately. Do we lose anything by using ldap_referrals=False? Cheers, Steve
On Fri, Apr 19, 2013 at 07:09:09PM +0200, steve wrote:
Hi. Brilliant. With ldap_referrals=False it works instantaneously. I can remove steve3 from staff and it is reflected immediately after logging in by id and getent group staff. Similarly for re-adding him to the group. Upon logging in, he appears immediately. Do we lose anything by using ldap_referrals=False? Cheers, Steve
I'm glad it finally works for you!
About needing the referrals - I'm not entirely sure about S4, but if it behaves just like AD, then you don't lose any data unless you use partial replication.
I'm still quite concerned about the crash you saw. Any chance you could get us the backtrace? Even if sssd was not configured optimally, it should never ever crash. (Then again, the problem might have been in openldap as we use libldap's referral chasing). The backtrace would tell us for sure.
On 04/19/2013 07:26 PM, Jakub Hrozek wrote:
I'm glad it finally works for you!
About needing the referrals - I'm not entirely sure about S4, but if it behaves just like AD, then you don't lose any data unless you use partial replication.
Hi S4 is advertised as a 1 to 1 replacement for AD and I know the devs strive hard for that. Our production setup has 2 DCs which replicate in the normal way. I've never heard of partial replication. Is it safe to assume that as I don't know what partial replication is, then we don't use it? As far as I know, the DCs supply identical LDAP attributes no matter which is queried.
I'm still quite concerned about the crash you saw. Any chance you could get us the backtrace? Even if sssd was not configured optimally, it should never ever crash. (Then again, the problem might have been in openldap as we use libldap's referral chasing). The backtrace would tell us for sure.
Yes, of course, except I don't know how to do it. I once used gdb for the Samba guys when we had a crash. Is gdb what you mean to produce the backtrace? Could you give me a one liner as to what the syntax is to jog my memory? I'll dig out my notes and have a go.
Thanks for all your help and for making sssd such a great piece of software.
On Fri, Apr 19, 2013 at 08:12:39PM +0200, steve wrote:
Hi. Brilliant. With ldap_referrals=False it works instantaneously. I can remove steve3 from staff and it is reflected immediately after logging in by id and getent group staff. Similarly for re-adding him to the group. Upon logging in, he appears immediately. Do we lose anything by using ldap_referrals=False? Cheers, Steve
I'm glad it finally works for you!
About needing the referrals - I'm not entirely sure about S4, but if it behaves just like AD, then you don't lose any data unless you use partial replication.
Hi S4 is advertised as a 1 to 1 replacement for AD and I know the devs strive hard for that. Our production setup has 2 DCs which replicate in the normal way. I've never heard of partial replication. Is it safe to assume that as I don't know what partial replication is, then we don't use it? As far as I know, the DCs supply identical LDAP attributes no matter which is queried.
Then you're fine, yes.
I'm still quite concerned about the crash you saw. Any chance you could get us the backtrace? Even if sssd was not configured optimally, it should never ever crash. (Then again, the problem might have been in openldap as we use libldap's referral chasing). The backtrace would tell us for sure.
Yes, of course, except I don't know how to do it. I once used gdb for the Samba guys when we had a crash. Is gdb what you mean to produce the backtrace? Could you give me a one liner as to what the syntax is to jog my memory? I'll dig out my notes and have a go.
You'd need to attach a gdb to the sssd_be process, turn on logging (gdb would log to a file called gdb.txt by default I think) and resume the execution of sssd_be:
# gdb program $(pidof sssd_be)
# (gdb) set logging on
# (gdb) continue

Then run the test case that was causing the crash. When the SSSD crashes, you could also save the core file:

# (gdb) generate-core-file
Thank you for your help debugging this problem!
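The interactive steps above can also be put into a gdb command file so that nothing has to be typed at the moment of the crash. This is a sketch under assumptions, not a tested recipe for this bug: write_gdb_commands and the file path are hypothetical, "program" in the command above is read here as a placeholder for the sssd_be binary, and gdb -p PID is used instead as an equivalent way to attach. The pagination and backtrace lines are additions to the commands given above.

```shell
# write_gdb_commands FILE
# Hypothetical helper: write a gdb command file that turns on logging
# (gdb.txt in the current directory by default), resumes the attached
# process, and, once the process stops (e.g. on a crash signal),
# records a backtrace and saves a core file.
write_gdb_commands() {
    cat > "$1" <<'EOF'
set pagination off
set logging on
continue
backtrace full
generate-core-file
EOF
}

# Usage as root (not run here). If several sssd_be processes exist,
# pidof prints several PIDs and one must be chosen by hand.
#   write_gdb_commands /tmp/sssd_be.gdb
#   gdb -p "$(pidof sssd_be)" -x /tmp/sssd_be.gdb
```

Note that gdb also stops on signals other than a crash, so the backtrace in gdb.txt should be checked to see which signal was actually received.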
On 04/19/2013 06:33 PM, steve wrote:
Hi I put the level 9 logs here: https://dl.dropboxusercontent.com/u/45150875/sssd.client.log.tar
I'm not a dev but I'll try: Here is where it fails getting the groups for the user (from sssd_default.log) I think that this is what produces the LDAP_PROTOCOL_ERROR.
Hi. Not sure I've got the right ldapsearch syntax, but the filter from sssd gives an error:
sudo ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' '[(&(member=CN=steve3,CN=Users,DC=dolores,DC=site)(objectclass=group)(cn=*))][dc=dolores,dc=site]' -Y GSSAPI
SASL/GSSAPI authentication started
SASL username: Administrator@DOLORES.SITE
SASL SSF: 56
SASL data security layer installed.
# extended LDIF
#
# LDAPv3
# base <dc=dolores,dc=site> with scope subtree
# filter: [(&(member=CN=steve3,CN=Users,DC=dolores,DC=site)(objectclass=group)(cn=*))][dc=dolores,dc=site]
# requesting: ALL
#
ldap_search_ext: Bad search filter (-7)
On Fri, Apr 19, 2013 at 06:57:42PM +0200, steve wrote:
Hi. Not sure I've got the right ldapsearch syntax, but the filter from sssd gives an error:
sudo ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' '[(&(member=CN=steve3,CN=Users,DC=dolores,DC=site)(objectclass=group)(cn=*))][dc=dolores,dc=site]' -Y GSSAPI
Try this instead:
sudo ldapsearch -h doloresdc.dolores.site -b 'dc=dolores,dc=site' '(&(member=CN=steve3,CN=Users,DC=dolores,DC=site)(objectclass=group)(cn=*))' -Y GSSAPI
(our debug messages include the base and the filter in one message)
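Since the debug message carries both values in the form [FILTER][BASE], the two parts can be separated before they are passed to ldapsearch. The following is only a rough sketch using plain shell parameter expansion; the variable names are made up for illustration, and the message string is the one quoted earlier in this thread.

```shell
# Debug string as it appears in the sssd log: "[FILTER][BASE]".
msg='[(&(member=CN=steve3,CN=Users,DC=dolores,DC=site)(objectclass=group)(cn=*))][dc=dolores,dc=site]'

# Strip the outermost brackets.
inner=${msg#\[}
inner=${inner%\]}

# Everything before "][" is the filter, everything after it is the search base.
filter=${inner%%\]\[*}
base=${inner##*\]\[}

printf 'filter: %s\n' "$filter"
printf 'base:   %s\n' "$base"
```

The resulting values can then be used as the filter argument and the -b argument of ldapsearch, as in the corrected command above.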
On Tue, Apr 16, 2013 at 09:02:12AM +0100, John Hodrien wrote:
On Tue, 16 Apr 2013, steve wrote:
Hi again OK, I found it. sss_cache
Unfortunately it gives an error even if a correct switch and domain are given:
sudo sss_cache -d default
Usage: sss_cache [-?UGNSA] [-?|--help] [--usage] [-u|--user=STRING] [-U|--users] [-g|--group=STRING] [-G|--groups] [-n|--netgroup=STRING] [-N|--netgroups] [-s|--service=STRING] [-S|--services] [-a|--autofs-map=STRING] [-A|--autofs-maps] [-d|--domain=STRING]
Please select at least one object to invalidate
(Tue Apr 16 09:37:15:820975 2013) [sssd] [main] (0x0020): Error initializing context for the application
The other switches, e.g. sss_cache -u steve2 works OK.
sssd 1.9.4
Surely that should be:
sss_cache -d default -UG
or just
sss_cache -UG
But to be honest, I'd favour the more brutal technique while debugging. sss_cache invalidates the cache, but I thought that if sssd can't contact the LDAP servers it will still serve from cache. I may be wrong on that point though.
I've always gone for the completely unambiguous:
service sssd stop
rm -f /var/lib/sss/{db,mc}/* /var/log/sssd/*
service sssd start
That way, I'm clear that it knew nothing, and that the logs I'm looking at are 100% from the current config.
jh
Yes, this works too, but there's one thing to keep in mind -- removing the cache also removes your cached credentials, which might be rather unfortunate if you were trying to log in offline later.
Depends on your use-case I guess.
On 04/15/2013 09:10 PM, steve wrote:
Hi When I'm debugging or adding new objects and attributes to the directory, I need to be able to turn off the sssd cache. Otherwise, I do not see any of the changes I have made.
How can I force sssd to read from the directory and not from the cache?
Cheers, Steve
Hi, you can also set entry_cache_timeout = 0 in the [domain] section of sssd.conf. This will make the cache entries expire immediately.
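As an illustration only, the relevant fragment of sssd.conf might look like the following. It assumes the domain is named "default", as in the sss_cache commands and log lines earlier in the thread; the domain's other settings are omitted here and stay unchanged.

```ini
# /etc/sssd/sssd.conf (fragment; domain name "default" assumed from this thread)
[domain/default]
# Treat cached entries as expired immediately, so lookups go back to the directory.
entry_cache_timeout = 0
```

sssd needs to be restarted (for example with service sssd restart) before the change takes effect. This is a debugging setting, and the option is best removed again once the directory changes have been verified.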
sssd-devel@lists.fedorahosted.org