On to, 02 marras 2017, Simo Sorce wrote:
On Thu, 2017-11-02 at 13:14 +0100, Sumit Bose wrote:
On Fri, Oct 27, 2017 at 08:43:28AM -0400, Simo Sorce wrote:
On Thu, 2017-10-26 at 22:14 +0200, Sumit Bose wrote:
On Thu, Oct 26, 2017 at 02:43:29PM -0400, Simo Sorce wrote:
On Thu, 2017-10-26 at 12:16 +0200, Jakub Hrozek wrote:
On Wed, Oct 25, 2017 at 05:39:21PM +0200, Sumit Bose wrote: > Hi, > > please find below the design document for the enhanced NSS > API > which > makes e.g. the client side timeouts which where recently > refactored > available to callers. > > A more visual friendly version can be found at: > https://pagure.io/fork/sbose/SSSD/docs/blob/07514ce52845d47 > fe6b > b327 > f782a865bfa75628a/f/design_pages/enhanced_nss_api.rst > > bye, > Sumit
LGTM!
Looking at this I have some questions, if you are going to create a new library and just need to set a timeout it seem it would be a much better interface to use a context handle you allocate and pass into each call, and then have getters setters to set timeouts or any other flags that should influence the whole behavior. This will allow you also to control how many concurrent connections you want to have against sssd, as each new context will create a new socket connection and all.
In the original libnss_sss.so we could not do that because the glibc interface does not offer any way to hold a context, but there is no reason to continue along that line in a *new* API. And not using contexts in threaded applications is generally a bad idea, as you end up *requiring* te use of mutexes when that is really not always a need (separated threads can open separate connections and not share any data that require mutexes).
This sounds like a good 2.0 feature, are you interested in creating a more detailed design page for this?
Sure.
Currently my goal was to reuse as much or the current trusted code we already have.
On the responder side I also do not see why new calls are being created. You clearly want a client-wide behavior, introduce ONE new call that sets flags for the rest of the connection, and just reuse the usual commands otherwise.
The current flags, like invalidating a cached entry are per- request, only the single object address by the current request should be invalidate not all object which are requested on the same connection.
I would probably add a command to explicitly invalidate individual caches, this would avoid having special paths on every other call, resulting in cleaner code, at the cost of one more roundtrip though, so I guess it is a matter of figuring out what is the right balance here.
I do not understand what is the point of nss_truste_users why a force reload is a privileged operation ?
Since it can force expensive operations on the backends which will hit servers I think not everybody should be allowed to do this.
You can already force this by requesting unexisting users/groups, I am not convinced this necessarily needs to be a privileged operation as there are already ways to cause work for SSSD. I would rather drop it. If we really want to deal with potential abuse we should introduce rate-limiting per uid, and basically slow down to a halt abusing clients by giving weights and possibly quotas, doling out privileges is cumbersome anyway and does not really prevent a malicious client to cause hard ATM.
IMHO nss_trusted_users gets a NACK as a concept and should be dropped.
Of course I can remove it, but since removing it later is easier than adding it later I'd like to try to explain again why I think it would be useful.
You are right that it is already possible to send requests to the servers via SSSD. But as you said this are "only" searches for unexisting user and groups which should be handled by the indexes on the server quite efficiently and causes no disk-I/O on the client. Additionally we try to avoid accidental misuse of this with the negative cache.
You can also call out existing users, especially in large domains if you have groups big enough you can cause fast cache thrashing very easily, this is already a big deal for the client. Client disk load is uninteresting because a process can simply write/fsync locally to cause disk I/O issues, and traffic towards the server is also uninteresting because a client can simply directly contact the server directly to cause load.
So what we are left with is uniquely the fact a process might keep the sssd_be process busier and cause other processes on the system to get slower responses. But this is easy to deal with by simply throttling bad behaving users (and the admin kicking them out).
The flags might trigger operations like looking up all groups a user is a member of or looking up a group with all members. While only the first might cause a more expensive operation on the server both might cause a lot of disk-I/O on the client.
Rate limitations would help to mitigate misuse as well. But a typical SSSD client would have no use for the flags so why allow it to use them?
They may have a need to insure a specific user/group is refreshed, tools would be able to use this instead of having to be be root and deleting the whole sssd cache just to refresh a user you know has been changed on the server. We wouldn't want people to abuse this of course, but it's not a bad idea.
That's why I think nss_trusted_users is a good way to avoid accidental misuse in a similar way as the negative cache. But if you prefer I'll drop it.
I do not know that you can easily change it later, and I would rather use file permissions than explicit checks, that would mean exposing a "privileged" socket that only users that are part of a group of "trusted" service can access, and this is clearly a lot more work, which I am not advocating.
I am really torn on the need for this, it will make using this feature really cumbersome as you have to explicitly modify a configuration file and list specific users (groups ?), and this is to balance against a vague chance of a user causing a local DoS/slowdown but not a lot more.
To me it sounds like a very big hammer for a very small fly.
Right now the only user for this is dirsrv on IPA masters. If you throttle dirsrv plugins, you are denying SSSD clients from IPA clients from getting actual results. Throttling, thus, would be a wrong measure here.