On Fri, 2018-05-18 at 16:11 +0200, Sumit Bose wrote:
On Fri, May 18, 2018 at 02:33:32PM +0200, Pavel Březina wrote:
> Hi folks,
> I sent a mail about a new sbus implementation (I'll refer to it as sbus2) [1].
Sorry Pavel,
but I need to ask: why a new bus instead of something like varlink?
> Now, I'm integrating it into SSSD. The work is quite difficult since it
> touches all parts of SSSD and the changes are usually interconnected, but I'm
> slowly moving towards the goal [2].
>
> At this moment, I'm trying to take the "minimum changes" path so the code can
> be built and function with sbus2; however, to take full advantage of it,
> further improvements will be needed (they will not be very difficult).
>
> There is one big change that I would like to take though, that needs to be
> discussed. It is about how we currently handle sbus connections.
>
> In the current state, the monitor and each backend create a private sbus server.
> The current implementation of a private sbus server is not a message bus; it
> only serves as an address to create point-to-point nameless connections. Thus
> each client must maintain several connections:
> - each responder is connected to monitor and to all backends
> - each backend is connected to monitor
> - we have one private server for the monitor plus one per backend
> - each private server maintains about 10 active connections
>
> This has several disadvantages: there are many connections, we cannot
> broadcast signals, and if a process wants to talk to another process it needs
> to connect to that process's server and maintain the connection. Since
> responders do not currently provide a server, they cannot talk to each other.
This design has a key advantage: a single process going down does not
affect communication between all the other processes. How do you recover if the
"switch-board" goes down during message processing with sbus?
> sbus2 implements a proper private message bus, so it can work in the same way
> as a session or system bus. It is a server that maintains the connections,
> keeps track of their names, and routes messages from one connection to
> another.
>
> My idea is to have only one sbus server managed by monitor.
This conflicts with the idea of getting rid of the monitor process. I do
not know if this is still being pursued, but it was brought up many times
that we might want to use systemd as the "monitor" and let socket
activation deal with the rest.
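As an aside, a socket-activated responder would only need a pair of plain
systemd units along these lines; unit names, paths and the binary location
are illustrative here, not what SSSD currently ships:

    # sssd-nss.socket (illustrative)
    [Unit]
    Description=SSSD NSS responder socket

    [Socket]
    ListenStream=/var/lib/sss/pipes/nss
    SocketUser=sssd
    SocketGroup=sssd

    [Install]
    WantedBy=sockets.target

    # sssd-nss.service (illustrative), started on first connection
    [Unit]
    Description=SSSD NSS responder

    [Service]
    ExecStart=/usr/libexec/sssd/sssd_nss
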
> Other processes will connect to this server with a named connection (e.g.
> sssd.nss, sssd.backend.dom1, sssd.backend.dom2). We can then send a message to
> this message bus (only one connection) and set the destination to a name (e.g.
> sssd.nss to invalidate the memcache). We can also send signals to this bus and
> it will broadcast them to all connections that listen to these signals. So it
> is the proper way to do it. It will simplify things and allow us to send
> signals and have better IPC in general.
>
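To illustrate what a named connection on such a bus would buy us, here is a
rough libdbus sketch; the bus names, object paths, interfaces and members
below are made up for the example, they are not sbus2's actual API:

    /* Sketch: a responder claims a name on the private message bus,
     * sends a method call to another named peer, and emits a signal
     * that the bus broadcasts to all interested connections.
     * All names and paths are illustrative. */
    #include <dbus/dbus.h>

    static void example(DBusConnection *conn)
    {
        DBusMessage *call;
        DBusMessage *sig;

        /* Claim a well-known name so others can address us. */
        dbus_bus_request_name(conn, "sssd.nss", 0, NULL);

        /* Unicast: ask a specific peer to do something. */
        call = dbus_message_new_method_call("sssd.backend.dom1",
                                            "/sssd/backend",
                                            "sssd.DataProvider",
                                            "getAccountInfo");
        dbus_connection_send(conn, call, NULL);
        dbus_message_unref(call);

        /* Broadcast: no destination; the bus delivers it to every
         * connection with a matching signal subscription. */
        sig = dbus_message_new_signal("/sssd/nss",
                                      "sssd.nss.MemoryCache",
                                      "InvalidateAll");
        dbus_connection_send(conn, sig, NULL);
        dbus_message_unref(sig);
    }
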
> I know we want to eventually get rid of the monitor; the process would stay
> as an sbus server. It would become a single point of failure, but the
> process can be restarted automatically by systemd in case of a crash.
>
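The restart-on-crash part is essentially free with systemd; in the service
unit of whatever process ends up hosting the sbus server it would boil down
to something like this (values are illustrative):

    [Service]
    ExecStart=/usr/sbin/sssd -i
    Restart=on-failure
    RestartSec=1
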
> Also here is a bonus question - do any of you remember why we use a private
> server at all?
In the very original design there was a "switch-board" process which
received a request from one component and forwarded it to the right
target. I guess at that time we didn't know enough about DBus to
implement this properly. In the end we thought it was useless overhead
and removed it. I think we didn't think about signals to all components
or the backend sending requests to the frontends.
> Why don't we connect to the system message bus?
Mainly because we do not trust it to handle plaintext passwords and
other credentials with the needed care.
That, and because at some point there was a potential chicken-and-egg issue
at startup, and also because we didn't want to handle additional error
recovery if the system message bus was restarted.
Fundamentally the system message bus is useful only for services
offering a "public" service; otherwise it is just overhead and has
security implications.
> I do not see any benefit in having a private server.
There is no way to break into sssd via a bug in the system message bus.
This is one good reason, aside from the others above.
Fundamentally we needed a private structured messaging system we could
easily integrate with tevent. The only usable option back then was
dbus, and given we already had ideas about offering some public
interface over the message bus, we went that way so we could later reuse
the integration.
Today we'd probably go with something a lot more lightweight, like
varlink.
If I understood you correctly we not only have 'a' private server but 4
for a typical minimal setup (monitor, pam, nss, backend).
Given your arguments above I think using a private message bus would
have benefits. Two questions come to my mind. First, what happens to
ongoing requests if the monitor dies and is restarted? E.g. if the
backend is processing a user lookup request and the monitor is
restarted, can the backend just send the reply to the freshly started
instance and the nss responder will finally get it? Or is there some
state lost which would force the nss responder to resend the request?
How would the responder even know the other side died? Is there a way
for clients to know that services died and all requests in flight need
to be resent?
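On a standard D-Bus message bus the usual way to learn about a dead peer is
the bus's NameOwnerChanged signal (an empty new owner means the holder of
that name went away); whether the new private bus exposes the same
org.freedesktop.DBus signals is an assumption on my part. A rough libdbus
sketch:

    /* Sketch: watch for peers disappearing from the bus.
     * Assumes the private bus implements the standard
     * org.freedesktop.DBus.NameOwnerChanged signal. */
    #include <dbus/dbus.h>
    #include <stdio.h>

    static DBusHandlerResult filter(DBusConnection *conn, DBusMessage *msg,
                                    void *data)
    {
        const char *name, *old_owner, *new_owner;

        if (dbus_message_is_signal(msg, "org.freedesktop.DBus",
                                   "NameOwnerChanged") &&
            dbus_message_get_args(msg, NULL,
                                  DBUS_TYPE_STRING, &name,
                                  DBUS_TYPE_STRING, &old_owner,
                                  DBUS_TYPE_STRING, &new_owner,
                                  DBUS_TYPE_INVALID) &&
            *new_owner == '\0') {
            /* The peer holding 'name' died; resend in-flight requests. */
            fprintf(stderr, "peer %s disappeared\n", name);
        }
        return DBUS_HANDLER_RESULT_NOT_YET_HANDLED;
    }

    static void watch_peers(DBusConnection *conn)
    {
        dbus_bus_add_match(conn,
            "type='signal',interface='org.freedesktop.DBus',"
            "member='NameOwnerChanged'", NULL);
        dbus_connection_add_filter(conn, filter, NULL, NULL);
    }
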
The second is about the overhead. Do you have any numbers on how much
longer e.g. the nss responder has to wait for a backend-is-offline
reply? I would expect that we lose more time at other places;
nevertheless it would be good to have some basic understanding of the
overhead.
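For that, even a crude round-trip measurement would tell us whether the extra
hop matters. Something along these lines could do; the destination, interface
and method names are placeholders:

    /* Sketch: time a single request/reply round trip through the bus.
     * Destination, interface and method are placeholders. */
    #include <dbus/dbus.h>
    #include <stdio.h>
    #include <time.h>

    static void time_round_trip(DBusConnection *conn)
    {
        struct timespec t0, t1;
        DBusMessage *call, *reply;

        call = dbus_message_new_method_call("sssd.backend.dom1",
                                            "/sssd/backend",
                                            "sssd.DataProvider",
                                            "ping");

        clock_gettime(CLOCK_MONOTONIC, &t0);
        reply = dbus_connection_send_with_reply_and_block(conn, call, -1, NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        if (reply != NULL) {
            dbus_message_unref(reply);
        }
        dbus_message_unref(call);

        printf("round trip: %ld us\n",
               (long)((t1.tv_sec - t0.tv_sec) * 1000000L +
                      (t1.tv_nsec - t0.tv_nsec) / 1000L));
    }
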
Latency is what we should be worried about. One other reason to go with
direct connections is that you do not have to wait for 3 processes to
be awake and scheduled (client/monitor/server) but only 2
(client/server). On busy machines the latency can be (relatively) quite
high if an additional process needs to be scheduled just to pass along
a message.
Simo.
Thank you for your hard work on sbus2.
bye,
Sumit
>
> [1] https://github.com/pbrezina/sbus
> [2] https://github.com/pbrezina/sssd/tree/sbus
_______________________________________________
sssd-devel mailing list -- sssd-devel(a)lists.fedorahosted.org
To unsubscribe send an email to sssd-devel-leave(a)lists.fedorahosted.org
Fedora Code of Conduct:
https://getfedora.org/code-of-conduct.html
List Guidelines:
https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:
https://lists.fedoraproject.org/archives/list/sssd-devel@lists.fedorahost...
--
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc