People,
I've spent some time looking at the code and trying to understand what changes are needed to get this task done. I'll start by writing down how things work nowadays, what we want to achieve, which parts will need to be touched and which steps I'm going to take. Please keep in mind that I may have a wrong (or at least not so clear) understanding of the points I'm about to explain, so if I made some mistake feel free to jump in and correct me. Also, whatever we sum up from this email will be written down in our DesignFeatures page. Let's start ...
How things work nowadays:
----------------------------------------
Nowadays all the services are started and taken care of by the monitor. It basically means that the monitor checks which services are listed to be started, starts them and registers them in order to relay signals coming from our tools to them. I'm not sure whether the monitor relays signals from anything other than the tools, but that doesn't seem to be the case (please correct me if I'm wrong).
What we want to achieve:
-------------------------------------
While my personal desire is to start slowly killing the monitor, that's not going to be the case right now. We don't want to make any change in the code that wouldn't also contemplate platforms where systemd is not available. That being said, let's move to the important part ...
What we want to achieve here is to have all responders (at least for now, yes, just the responders) socket-activatable, as some of them don't actually have to be running all the time (that's the case, for example, of the ssh, sudo and ifp responders). We also have to take into consideration that there should _not_ be _any_ change in behaviour, which means that we still have to honor our commitment to sssd.conf: responders explicitly enabled there still have to start and be running from the moment sssd is running (as we do nowadays).
How we plan to achieve our goal:
-----------------------------------------------
For some parts I have a pretty clear idea of what to do, for others not so much. The basic idea is to take as much advantage of the systemd machinery as we can and "remove" as many duties as possible from the monitor. Let's go through this part by part ...
- Starting the service: the idea is to have a systemd unit for each of the responders. Whether these units will be automatically generated by us is a detail that isn't worth attention right now. Let's take a look at what the unit will look like:
[Unit]
Description=SSSD @responder@ service provider

[Install]
Also=sssd-@responder@.socket

[Service]
ExecStart=@libexecdir@/sssd/sssd_@responder@ --uid 0 --gid 0 --debug-to-files
Requires=sssd.service
PartOf=sssd.service
There are two options that deserve some explanation of their usage here:
-- "Requires=sssd.service": This option guarantees that sssd will be up when any of the responders is up. Considering that the providers part won't be changed, the providers will still be initialized synchronously by the monitor, which only then notifies init that its startup has finished, which also means that the providers' sbus sockets will be up.
-- "PartOf=sssd.service": This option guarantees that when sssd.service is restarted/stopped, all the responders' services will be restarted/stopped accordingly.
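For completeness, the sssd-@responder@.socket unit referenced by "Also=" could look roughly like the sketch below. This is only an illustration: the socket path is an assumption and would have to match whatever path the responder already listens on (e.g. /var/lib/sss/pipes/nss for the nss responder).
---------------------------------
[Unit]
Description=SSSD @responder@ responder socket

[Socket]
ListenStream=/var/lib/sss/pipes/@responder@

[Install]
WantedBy=sockets.target
---------------------------------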
- Relaying signals: it seems the best approach to replace the registration currently done by the monitor is to create a named bus for each of the responders so the tools can talk directly to them. By "named bus for each responder" I mean a pipe, named sbus-@responder@, that will be used to send the D-Bus messages through. It would require adapting the tools' code to check whether the sbus-@responder@ pipe exists and only then send the message, as we won't have a list of the running responders. This may increase the number of iterations needed to send a message, but I believe it wouldn't hurt us too much given the small number of responders we have. It will also allow the tools to set different debug levels for each responder.
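Just to illustrate the tool-side check, a rough sketch (not the final code; the pipe directory and the "sbus-@responder@" naming below are assumptions):
---------------------------------
/*
 * Sketch only: a tool iterates over the known responders, checks whether
 * the responder's named sbus pipe exists and only then sends it a message.
 * The pipe directory and naming scheme are assumptions for illustration.
 */
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define SBUS_PIPE_DIR "/var/lib/sss/pipes/private"

static bool responder_pipe_exists(const char *responder)
{
    char path[PATH_MAX];

    snprintf(path, sizeof(path), "%s/sbus-%s", SBUS_PIPE_DIR, responder);

    /* If the pipe is not there, the responder is not running and the
     * tool just skips it instead of failing. */
    return access(path, W_OK) == 0;
}

int main(void)
{
    const char *responders[] = { "nss", "pam", "sudo", "ssh", "ifp" };
    size_t i;

    for (i = 0; i < sizeof(responders) / sizeof(responders[0]); i++) {
        if (responder_pipe_exists(responders[i])) {
            printf("would send the sbus message to sbus-%s\n", responders[i]);
            /* ...connect to the pipe and send the D-Bus message here... */
        }
    }

    return 0;
}
---------------------------------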
It seems, at least to me, that with these steps we are covered with respect to what we have nowadays. Does someone think that we are missing something? What? Please try to explain it the same way you would explain it to a 5-year-old kid :-)
Coding plan:
------------------
My current plan is to start implementing this whole thing by creating the named bus for each responder and getting the communication between the responders and the tools working. The next step will be to actually have the responders started on demand by using socket activation. And the very last step will be to make sure we can have the responders always running when they're listed in sssd.conf.
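For the socket-activation step, the responder-side change could boil down to something like the sketch below (assuming libsystemd's sd_listen_fds(); create_own_socket() is a made-up placeholder for whatever the responders do today when started explicitly):
---------------------------------
/*
 * Sketch only: pick up a listening socket passed in by systemd socket
 * activation, or fall back to creating the socket ourselves when the
 * responder was started explicitly (or on a non-systemd platform).
 * create_own_socket() is a placeholder, not an existing function.
 */
#include <stdio.h>
#include <systemd/sd-daemon.h>

int get_listening_socket(int (*create_own_socket)(void))
{
    int n = sd_listen_fds(0);

    if (n > 1) {
        fprintf(stderr, "Unexpected number of activated sockets: %d\n", n);
        return -1;
    } else if (n == 1) {
        /* Socket-activated: systemd created and bound the socket for us. */
        return SD_LISTEN_FDS_START;
    }

    /* Started explicitly: behave exactly as we do nowadays. */
    return create_own_socket();
}
---------------------------------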
I'm not comfortable giving any estimate of when we will have it done, mainly because I'd first like to hear feedback about this from others on the team.
Looking forward to hearing back from you!
Best Regards,
--
Fabiano Fidêncio
On Wed, Nov 16, 2016 at 10:03:36AM +0100, Fabiano Fidêncio wrote:
People,
I've spent some time looking at the code and trying to understand what changes are needed to get this task done. I'll start by writing down how things work nowadays, what we want to achieve, which parts will need to be touched and which steps I'm going to take. Please keep in mind that I may have a wrong (or at least not so clear) understanding of the points I'm about to explain, so if I made some mistake feel free to jump in and correct me. Also, whatever we sum up from this email will be written down in our DesignFeatures page. Let's start ...
How things work nowadays:
Nowadays all the services are started and taken care of by the monitor. It basically means that the monitor checks which services are listed to be started, starts them and registers them in order to relay signals coming from our tools to them. I'm not sure whether the monitor relays signals from anything other than the tools, but that doesn't seem to be the case (please correct me if I'm wrong).
The monitor also relays:
- signals that the administrator sends manually (see man sssd(8) and search for USR1/USR2/HUP)
- notifications that /etc/resolv.conf has changed and the domains should reset their offline status
- notifications that networking on the machine has changed (received via the netlink socket) and the domains should reset their offline status
These mostly (except for clearing the memory cache of the NSS responder and rotating the logs) concern the providers, which we won't be changing now.
And actually I wouldn't see too much harm in moving the resolv.conf monitoring and netlink monitoring to the back ends themselves.. both 'just' mean an open fd, and the number of sssd_be processes running is usually small.
What we want to achieve:
While my personal desire is to start slowly killing the monitor, that's not going to be the case right now.
Well, we can incrementally downsize the monitor..
We don't want to make any change in the code that wouldn't also contemplate platforms where systemd is not available.
Yes, but perhaps we can move the non-systemd code to a separate module to avoid a lot of code in the monitor being #ifdef-ed out. (I don't know if this will actually be the case in the end)
That being said, let's move to the important part ...
What we want to achieve here is to have all responders (at least for now, yes, just the responders) socket-activatable, as some of them don't actually have to be running all the time (that's the case, for example, of the ssh, sudo and ifp responders). We also have to take into consideration that there should _not_ be _any_ change in behaviour, which means that we still have to honor our commitment to sssd.conf: responders explicitly enabled there still have to start and be running from the moment sssd is running (as we do nowadays).
Right, this is a possibility. I think we can use the "services" line to configure how the responders are started. If a service is explicitly listed in "services", it would also be explicitly started by the monitor, just as it is now.
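To make that concrete, with an illustrative snippet like the one below, nss and pam would be started explicitly by the monitor as today, while responders not listed (ssh, sudo, ifp, ...) would only be socket-activated when a client actually connects:
---------------------------------
[sssd]
services = nss, pam
---------------------------------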
I think we also want to add an idle timeout to the services; the services that are started explicitly would just have an infinite idle timeout..
Other systemd services that correspond to a responder might be disabled by default and the admin can just activate them to allow them to be started on demand (that's already the case with sssd-secrets, right?)
But this could also be just a middle step.. in the future, since the responder Requires sssd.service, I think we could just let the monitor socket-activate the responders as well, /after/ starting the back ends. Or is there another reason for starting some services explicitly, other than maintaining the status quo and being a bit more on the safe side?
How we plan to achieve our goal:
For some parts I have a pretty clear idea of what to do, for others not so much. The basic idea is to take as much advantage of the systemd machinery as we can and "remove" as many duties as possible from the monitor. Let's go through this part by part ...
- Starting the service: the idea is to have a systemd unit for each of the responders. Whether these units will be automatically generated by us is a detail that isn't worth attention right now. Let's take a look at what the unit will look like:
[Unit]
Description=SSSD @responder@ service provider

[Install]
Also=sssd-@responder@.socket

[Service]
ExecStart=@libexecdir@/sssd/sssd_@responder@ --uid 0 --gid 0 --debug-to-files
I wonder if the service could start as root and then read the 'user' from confdb to learn if it's supposed to drop root.
Requires=sssd.service
PartOf=sssd.service
There are two options that deserve some explanation of their usage here:
-- "Requires=sssd.service": This option guarantees that sssd will be up when any of the responders is up. Considering that the providers part won't be changed, the providers will still be initialized synchronously by the monitor, which only then notifies init that its startup has finished, which also means that the providers' sbus sockets will be up.
-- "PartOf=sssd.service": This option guarantees that when sssd.service is restarted/stopped, all the responders' services will be restarted/stopped accordingly.
- Relaying signals: it seems the best approach to replace the registration currently done by the monitor is to create a named bus for each of the responders so the tools can talk directly to them. By "named bus for each responder" I mean a pipe, named sbus-@responder@, that will be used to send the D-Bus messages through. It would require adapting the tools' code to check whether the sbus-@responder@ pipe exists and only then send the message, as we won't have a list of the running responders. This may increase the number of iterations needed to send a message, but I believe it wouldn't hurt us too much given the small number of responders we have. It will also allow the tools to set different debug levels for each responder.
For relaying the messages from the monitor to the services, we should implement sbus signals. Then a service could just subscribe to the signals after startup (explicit by the monitor or implicit via socket activation) and the monitor wouldn't attempt to contact individual services, but would just send a signal and the sbus signal-relaying code would deliver it to the right destinations.
The sbus signals would also be useful to let a back end send a signal towards any interested services (for example, the files provider might want to signal NSS and IFP that the memory cache or negative cache needs to be invalidated), so we want to implement them anyway..
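At the raw libdbus level (which sbus wraps), emitting such a signal would look roughly like the sketch below; the object path, interface and member names are made-up placeholders, not real SSSD interfaces, and the real code would of course go through the sbus layer:
---------------------------------
/*
 * Sketch only: emit a D-Bus signal on an existing connection. Whoever
 * subscribed to this signal (NSS, IFP, ...) gets it delivered by the
 * signal-relaying code; the sender does not need to know who listens.
 * All names below are placeholders.
 */
#include <stdbool.h>
#include <dbus/dbus.h>

bool send_invalidate_memcache_signal(DBusConnection *conn)
{
    DBusMessage *msg;
    dbus_bool_t sent;

    msg = dbus_message_new_signal("/org/freedesktop/sssd/example",
                                  "org.freedesktop.sssd.Example",
                                  "InvalidateMemcache");
    if (msg == NULL) {
        return false;
    }

    sent = dbus_connection_send(conn, msg, NULL);
    dbus_message_unref(msg);

    return sent != 0;
}
---------------------------------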
The other approach might be for each service to just expose a service management API on the system bus (for root only) and let the monitor start the enabled services by accessing their system bus interface, taking advantage of the D-Bus systemd integration. Something like:
---------------------------------
[D-BUS Service]
Name=org.freedesktop.sssd.nss1
Exec=/bin/false
User=root
SystemdService=dbus-org.freedesktop.sssd.nss1.service
---------------------------------
The advantage here would be that we wouldn't have to extend sbus with signals support. The obvious disadvantage is that we would depend on the system bus and systemd and would have to maintain a separate branch of non-systemd code..
It seems, at least to me, that with these steps we are covered with respect to what we have nowadays. Does someone think that we are missing something? What? Please try to explain it the same way you would explain it to a 5-year-old kid :-)
Coding plan:
My current plan is to start implementing this whole thing by creating the named bus for each responder and getting the communication between the responders and the tools working. The next step will be to actually have the responders started on demand by using socket activation. And the very last step will be to make sure we can have the responders always running when they're listed in sssd.conf.
I'm not comfortable giving any estimate of when we will have it done, mainly because I'd first like to hear feedback about this from others on the team.
Looking forward to hearing back from you!
Best Regards,
Fabiano Fidêncio
On Wed, Nov 16, 2016 at 11:20 AM, Jakub Hrozek jhrozek@redhat.com wrote:
On Wed, Nov 16, 2016 at 10:03:36AM +0100, Fabiano Fidêncio wrote:
People,
I've spent some time looking at the code and trying to understand what changes are needed to get this task done. I'll start by writing down how things work nowadays, what we want to achieve, which parts will need to be touched and which steps I'm going to take. Please keep in mind that I may have a wrong (or at least not so clear) understanding of the points I'm about to explain, so if I made some mistake feel free to jump in and correct me. Also, whatever we sum up from this email will be written down in our DesignFeatures page. Let's start ...
How things work nowadays:
Nowadays all the services are started and taken care of by the monitor. It basically means that the monitor checks which services are listed to be started, starts them and registers them in order to relay signals coming from our tools to them. I'm not sure whether the monitor relays signals from anything other than the tools, but that doesn't seem to be the case (please correct me if I'm wrong).
The monitor also relays:
- signals that the administrator sends manually (see man sssd(8) and search for USR1/USR2/HUP)
This actually affects the responders, right?
- notifications that /etc/resolv.conf has changed and the domains should reset their offline status
- notifications that networking on the machine has changed (received via the netlink socket) and the domains should reset their offline status
These don't as far as I understand.
These mostly (except for clearing the memory cache of the NSS responder and rotating the logs) concern the providers, which we won't be changing now.
And actually I wouldn't see too much harm in moving the resolv.conf monitoring and netlink monitoring to the back ends themselves.. both 'just' mean an open fd, and the number of sssd_be processes running is usually small.
Right, I like the suggestion, but I'd like to keep this out of scope for this task right now.
What we want to achieve:
While my personal desire is to start slowly killing the monitor, that's not going to be the case right now.
Well, we can incrementally downsize the monitor..
We don't want to make any change in the code that wouldn't also contemplate platforms where systemd is not available.
Yes, but perhaps we can move the non-systemd code to a separate module to avoid a lot of code in the monitor being #ifdef-ed out. (I don't know if this will actually be the case in the end)
That's a good idea. I'll go with #ifdef'ing the code for now and split this part into a separate module in the end, in case it makes sense.
That being said, let's move to the important part ...
What we want to achieve here is to have all responders (at least for now, yes, just the responders) socket-activatable, as some of them don't actually have to be running all the time (that's the case, for example, of the ssh, sudo and ifp responders). We also have to take into consideration that there should _not_ be _any_ change in behaviour, which means that we still have to honor our commitment to sssd.conf: responders explicitly enabled there still have to start and be running from the moment sssd is running (as we do nowadays).
Right, this is a possibility. I think we can use the "services" line to configure how the responders are started. If a service is explicitly listed in "services", it would also be explicitly started by the monitor, just as it is now.
Right.
I think we also want to add an idle timeout to the services; the services that are started explicitly would just have an infinite idle timeout..
Why? If the services are started explicitly we can just use the same logic used nowadays. So, I have the feeling that no changes would be needed at all in this case.
Other systemd services that correspond to a responder might be disabled by default and the admin can just activate them to allow them to be started on demand (that's already the case with sssd-secrets, right?)
Yes, you're right.
But this could also be just a middle step.. in the future, since the responder Requires sssd.service, I think we could just let the monitor socket-activate the responders as well, /after/ starting the back ends. Or is there another reason for starting some services explicitly, other than maintaining the status quo and being a bit more on the safe side?
The only reason for me would be to be a bit more on the safe side.
How we plan to achieve our goal:
For some parts I have a pretty clear idea of what to do, for others not so much. The basic idea is to take as much advantage of the systemd machinery as we can and "remove" as many duties as possible from the monitor. Let's go through this part by part ...
- Starting the service: the idea is to have a systemd unit for each of the responders. Whether these units will be automatically generated by us is a detail that isn't worth attention right now. Let's take a look at what the unit will look like:
[Unit]
Description=SSSD @responder@ service provider

[Install]
Also=sssd-@responder@.socket

[Service]
ExecStart=@libexecdir@/sssd/sssd_@responder@ --uid 0 --gid 0 --debug-to-files
I wonder if the service could start as root and then read the 'user' from confdb to learn if it's supposed to drop root.
I guess it's doable and I'll keep it in mind while actually implementing the changes.
Requires=sssd.service
PartOf=sssd.service
There are two options that deserve some explanation of their usage here:
-- "Requires=sssd.service": This option guarantees that sssd will be up when any of the responders is up. Considering that the providers part won't be changed, the providers will still be initialized synchronously by the monitor, which only then notifies init that its startup has finished, which also means that the providers' sbus sockets will be up.
-- "PartOf=sssd.service": This option guarantees that when sssd.service is restarted/stopped, all the responders' services will be restarted/stopped accordingly.
- Relaying signals: it seems the best approach to replace the registration currently done by the monitor is to create a named bus for each of the responders so the tools can talk directly to them. By "named bus for each responder" I mean a pipe, named sbus-@responder@, that will be used to send the D-Bus messages through. It would require adapting the tools' code to check whether the sbus-@responder@ pipe exists and only then send the message, as we won't have a list of the running responders. This may increase the number of iterations needed to send a message, but I believe it wouldn't hurt us too much given the small number of responders we have. It will also allow the tools to set different debug levels for each responder.
For relaying the messages from the monitor to the services, we should implement sbus signals. Then a service could just subscribe to the signals after startup (explicit by the monitor or implicit via socket activation) and the monitor wouldn't attempt to contact individual services, but would just send a signal and the sbus signal-relaying code would deliver it to the right destinations.
The sbus signals would also be useful to let a back end send a signal towards any interested services (for example, the files provider might want to signal NSS and IFP that the memory cache or negative cache needs to be invalidated), so we want to implement them anyway..
The other approach might be for each service to just expose a service management API on the system bus (for root only) and let the monitor start the enabled services by accessing their system bus interface, taking advantage of the D-Bus systemd integration. Something like:
[D-BUS Service]
Name=org.freedesktop.sssd.nss1
Exec=/bin/false
User=root
SystemdService=dbus-org.freedesktop.sssd.nss1.service
The advantage here would be that we wouldn't have to extend sbus with signals support. The obvious disadvantage is that we would depend on the system bus and systemd and would have to maintain a separate branch of non-systemd code..
Hmm. It really makes me think that as a first step we should try a simpler approach: just send, from the responders, an sbus message to the monitor and let the monitor register them in case they're socket-activated. It would require just a new D-Bus method and a small chunk of code in the responders' common code.
What do you think about this? My personal preference would be to go simpler right now and improve it later on (considering we want to have the socket-activated services for the responders done as soon as possible).
It seems, at least to me, that with these steps we are covered with respect to what we have nowadays. Does someone think that we are missing something? What? Please try to explain it the same way you would explain it to a 5-year-old kid :-)
Coding plan:
My current plan is to start implementing this whole thing by creating the named bus for each responder and getting the communication between the responders and the tools working. The next step will be to actually have the responders started on demand by using socket activation. And the very last step will be to make sure we can have the responders always running when they're listed in sssd.conf.
I'm not comfortable giving any estimate of when we will have it done, mainly because I'd first like to hear feedback about this from others on the team.
Looking forward to hearing back from you!
Best Regards,
Fabiano Fidêncio
Best Regards,
--
Fabiano Fidêncio
On Wed, Nov 16, 2016 at 11:46:03AM +0100, Fabiano Fidêncio wrote:
On Wed, Nov 16, 2016 at 11:20 AM, Jakub Hrozek jhrozek@redhat.com wrote:
On Wed, Nov 16, 2016 at 10:03:36AM +0100, Fabiano Fidêncio wrote:
People,
I've spent some time looking at the code and trying to understand what changes are needed to get this task done. I'll start by writing down how things work nowadays, what we want to achieve, which parts will need to be touched and which steps I'm going to take. Please keep in mind that I may have a wrong (or at least not so clear) understanding of the points I'm about to explain, so if I made some mistake feel free to jump in and correct me. Also, whatever we sum up from this email will be written down in our DesignFeatures page. Let's start ...
How things work nowadays:
Nowadays all the services are started and taken care of by the monitor. It basically means that the monitor checks which services are listed to be started, starts them and registers them in order to relay signals coming from our tools to them. I'm not sure whether the monitor relays signals from anything other than the tools, but that doesn't seem to be the case (please correct me if I'm wrong).
The monitor also relays:
- signals that the administrator sends manually (see man sssd(8) and search for USR1/USR2/HUP)
This actually affects the responders, right?
- notifications that /etc/resolv.conf has changed and the domains should reset their offline status
- notifications that networking on the machine has changed (received via the netlink socket) and the domains should reset their offline status
These don't as far as I understand.
These mostly (except for clearing the memory cache of the NSS responder and rotating the logs) concern the providers, which we won't be changing now.
And actually I wouldn't see too much harm in moving the resolv.conf monitoring and netlink monitoring to the back ends themselves.. both 'just' mean an open fd, and the number of sssd_be processes running is usually small.
Right, I like the suggestion, but I'd like to keep this out of scope for this task right now.
What we want to achieve:
While my personal desire is to start slowly killing the monitor, that's not going to be the case right now.
Well, we can incrementally downsize the monitor..
We don't want to make any change in the code that wouldn't also contemplate platforms where systemd is not available.
Yes, but perhaps we can move the non-systemd code to a separate module to avoid a lot of code in the monitor being #ifdef-ed out. (I don't know if this will actually be the case in the end)
That's a good idea. I'll go with #ifdef'ing the code for now and split this part into a separate module in the end, in case it makes sense.
That being said, let's move to the important part ...
What we want to achieve here is to have all responders (at least for now, yes, just the responders) socket-activatable, as some of them don't actually have to be running all the time (that's the case, for example, of the ssh, sudo and ifp responders). We also have to take into consideration that there should _not_ be _any_ change in behaviour, which means that we still have to honor our commitment to sssd.conf: responders explicitly enabled there still have to start and be running from the moment sssd is running (as we do nowadays).
Right, this is a possibility. I think we can use the "services" line to configure how the responders are started. If a service is explicitly listed in "services", it would also be explicitly started by the monitor, just as it is now.
Right.
I think we also want to add an idle timeout to the services; the services that are started explicitly would just have an infinite idle timeout..
Why? If the services are started explicitly we can just use the same logic used nowadays.
Yes.
So, I have the feeling that no changes would be needed at all in this case.
I thought we wanted to add an idle timeout to services that are only seldom needed like secrets or sudo to avoid running them all the time..
This could be a tevent timer that is cancelled and restarted when a new client connects, and its period could be set from the command line or from a config parameter.
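A rough sketch of what that timer could look like with tevent (the idle_ctx structure and the idea of simply exiting on idle are placeholders for whatever the responder common code ends up doing):
---------------------------------
/*
 * Sketch only: an idle timer that is re-armed whenever a new client
 * connects and shuts the responder down when it finally fires. The
 * structure below is a placeholder, not the real responder context.
 */
#include <stdlib.h>
#include <talloc.h>
#include <tevent.h>

struct idle_ctx {
    struct tevent_context *ev;
    struct tevent_timer *idle_timer;
    int idle_timeout;               /* seconds; 0 means run forever */
};

static void idle_handler(struct tevent_context *ev,
                         struct tevent_timer *te,
                         struct timeval current_time,
                         void *pvt)
{
    /* No client connected for idle_timeout seconds: exit and let
     * socket activation start us again when we are needed. */
    exit(0);
}

void reset_idle_timer(struct idle_ctx *ictx)
{
    if (ictx->idle_timeout == 0) {
        return;                     /* started explicitly: never idle out */
    }

    /* Freeing a tevent timer cancels it ... */
    talloc_free(ictx->idle_timer);

    /* ... and a new one is armed relative to now. */
    ictx->idle_timer = tevent_add_timer(ictx->ev, ictx,
                                        tevent_timeval_current_ofs(ictx->idle_timeout, 0),
                                        idle_handler, ictx);
}
---------------------------------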
Other systemd services that correspond to a responder might be disabled by default and the admin can just activate them to allow them to be started on demand (that's already the case with sssd-secrets, right?)
Yes, you're right.
But this could also be just a middle step.. in the future, since the responder Requires sssd.service, I think we could just let the monitor socket-activate the responders as well, /after/ starting the back ends. Or is there another reason for starting some services explicitly, other than maintaining the status quo and being a bit more on the safe side?
The only reason for me would be to be a bit more on the safe side.
OK, we can start with that and gradually move.
How we plan to achieve our goal:
For some parts I have a pretty clear idea of what to do, for others not so much. The basic idea is to take as much advantage of the systemd machinery as we can and "remove" as many duties as possible from the monitor. Let's go through this part by part ...
- Starting the service: the idea is to have a systemd unit for each of the responders. Whether these units will be automatically generated by us is a detail that isn't worth attention right now. Let's take a look at what the unit will look like:
[Unit]
Description=SSSD @responder@ service provider

[Install]
Also=sssd-@responder@.socket

[Service]
ExecStart=@libexecdir@/sssd/sssd_@responder@ --uid 0 --gid 0 --debug-to-files
I wonder if the service could start as root and then read the 'user' from confdb to learn if it's supposed to drop root.
I guess it's doable and I'll keep it in mind while actually implementing the changes.
Requires=sssd.service
PartOf=sssd.service
There are two options that deserve some explanation of their usage here:
-- "Requires=sssd.service": This option guarantees that sssd will be up when any of the responders is up. Considering that the providers part won't be changed, the providers will still be initialized synchronously by the monitor, which only then notifies init that its startup has finished, which also means that the providers' sbus sockets will be up.
-- "PartOf=sssd.service": This option guarantees that when sssd.service is restarted/stopped, all the responders' services will be restarted/stopped accordingly.
- Relaying signals: it seems the best approach to replace the registration currently done by the monitor is to create a named bus for each of the responders so the tools can talk directly to them. By "named bus for each responder" I mean a pipe, named sbus-@responder@, that will be used to send the D-Bus messages through. It would require adapting the tools' code to check whether the sbus-@responder@ pipe exists and only then send the message, as we won't have a list of the running responders. This may increase the number of iterations needed to send a message, but I believe it wouldn't hurt us too much given the small number of responders we have. It will also allow the tools to set different debug levels for each responder.
For relaying the messages from the monitor to the services, we should implement sbus signals. Then a service could just subscribe to the signals after startup (explicit by the monitor or implicit via socket activation) and the monitor wouldn't attempt to contact individual services, but would just send a signal and the sbus signal-relaying code would deliver it to the right destinations.
The sbus signals would also be useful to let a back end send a signal towards any interested services (for example, the files provider might want to signal NSS and IFP that the memory cache or negative cache needs to be invalidated), so we want to implement them anyway..
The other approach might be for each service to just expose a service management API on the system bus (for root only) and let the monitor start the enabled services by accessing their system bus interface, taking advantage of the D-Bus systemd integration. Something like:
[D-BUS Service]
Name=org.freedesktop.sssd.nss1
Exec=/bin/false
User=root
SystemdService=dbus-org.freedesktop.sssd.nss1.service
The advantage here would be that we wouldn't have to extend sbus with signals support. The obvious disadvantage is that we would depend on the system bus and systemd and would have to maintain a separate branch of non-systemd code..
Hmm. It really makes me think that as a first step we should try a simpler approach: just send, from the responders, an sbus message to the monitor and let the monitor register them in case they're socket-activated. It would require just a new D-Bus method and a small chunk of code in the responders' common code.
What do you think about this? My personal preference would be to go simpler right now and improve it later on (considering we want to have the socket-activated services for the responders done as soon as possible).
Yes, I guess we can start this way, also for backwards compatibility's sake.