[HEADS-UP] systemd for F14 - the next steps

Lennart Poettering mzerqung at 0pointer.de
Fri Jul 23 02:12:53 UTC 2010


On Thu, 22.07.10 15:12, Simo Sorce (ssorce at redhat.com) wrote:

> > If a service A uses functionality provided by a service B which in
> > turn uses functionality provided by A, then things will break
> > regardless of whether systemd is used or not.
> 
> This is not true.
> SSSD is an example of that.
> 
> The nss_sss and pam_sss clients know to immediately give up if the
> sockets are not there, because that means that sssd is not up yet.
> 
> Once sssd is up they will start working. This way, if a service comes
> up first in the system and does some stupid enumeration via getpwent(),
> it will not block with timeouts or whatnot. After all, if it is a
> system service and does an enumeration so early, it is certainly not
> interested in non-system users.

Note that we install the normal daemon sockets only after sysinit has run
and the mount points are established. That basically means that during
early bootup the socket wouldn't be around anyway. Only when the other
normal daemons start up would the sssd socket become accessible.

> If I were to use socket activation instead, that service would bring
> sssd up unnecessarily early, before the network is up. This in turn
> will cause sssd to go into offline mode and have a 30 sec delay before
> it tries again to go online, failing any authentication for users that
> are not already cached (with credentials cached).
> 
> Now this will probably be mitigated by the fact that we monitor network
> interfaces in newer versions and will try to get online earlier if an
> interface suddenly appears, but I hope you get the point.

Well, to be frank, I think sssd would be broken if it enters a mode where
it systematically times out requests like this. The right fix is, I
believe, what you already mentioned: watching the network configuration
via netlink.

Networks today are dynamic. They come and go all the time. Connectivity
is not reliable. It cannot be assumed that a network is guaranteed to be
available at boot at all. You must follow network changes in cases like
yours and react to them.
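To make that concrete, here's a rough sketch (just an illustration, not
sssd's actual code) of what subscribing to rtnetlink link and address
events could look like. A daemon doing this learns about interfaces
appearing or disappearing immediately, instead of sitting in a fixed 30s
offline backoff:

/* Rough illustration only, not sssd code: subscribe to rtnetlink link and
 * address events so a daemon notices the network coming and going instead
 * of relying on a fixed retry interval. Error handling trimmed. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

int main(void) {
        struct sockaddr_nl addr;
        char buf[8192];
        int fd, len;

        fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
        if (fd < 0)
                return 1;

        memset(&addr, 0, sizeof(addr));
        addr.nl_family = AF_NETLINK;
        addr.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR;  /* link + IPv4 address changes */
        if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0)
                return 1;

        for (;;) {
                struct nlmsghdr *nh;

                len = recv(fd, buf, sizeof(buf), 0);
                if (len <= 0)
                        break;

                for (nh = (struct nlmsghdr *) buf; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
                        if (nh->nlmsg_type == RTM_NEWLINK || nh->nlmsg_type == RTM_NEWADDR)
                                printf("network changed, try to go online now\n");
                        else if (nh->nlmsg_type == RTM_DELLINK || nh->nlmsg_type == RTM_DELADDR)
                                printf("network went away, switch to offline mode\n");
                }
        }

        close(fd);
        return 0;
}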

Or in other words: think of a web browser which, instead of telling the
user in a clean message box "Network not available, blah blub", simply
timed out after 30s hoping that a network might still show up. That
would be awful.

So yes, cyclic dependencies can happen with and without socket
activation; maybe some become more visible with it than without it, and
others might actually become less visible. But the fact remains that if
you have blocking calls in both directions between two services, then
you will enter a deadlock sooner or later. Maybe it won't happen during
"normal" operation, but it will still be triggerable if a malicious user
comes into play. The right fix is to make the calls either non-blocking
or simply to get rid of the cyclic dependency. And that fact doesn't
change whether socket activation is used or not.

> > Cyclic dependencies cause deadlocks. Introducing systemd has little
> > effect on that. It won't make the situation worse, and it won't make
> > it much better either.
> 
> I beg to differ, you are changing how dependencies are handled, and can
> *cause* cyclic dependencies because of how you make socket activation
> work.

Well, in your sssd example above the cyclic dependency exists with or
without systemd. You try to work around this fact by saying "well, I
simply declare that nobody could ever need my services before a certain
point P in time after boot, and I'll just refuse all service before
that." However, that is a really broken requirement to make. Why?
Because it basically means you imply an ordering dependency on your
socket creation that is expressible neither with LSB init headers nor
with systemd. The dependency you need is: "make sure to start services
A, B, C *BEFORE* me, and everything else *AFTER* me", where A, B, C is
probably something like syslog, and "everything else" might be stuff
like ssh, policykit, accounts-daemon, the gettys or gdm, which might
need your services because they offer services related to non-system
users. Neither systemd nor LSB can offer you this kind of ordering
dependency. There is no such thing as expressing "before everything but
A, B, C". You nonetheless currently require this, and that is simply
broken, because already with LSB your init script doesn't actually do
what you seem to believe it does.

And systemd or no systemd, socket-based activation or not, nothing about
this changes.

Also, consider sssd together with a simple hypothetical syslog daemon
which looks up the user id of everybody connecting to it. If sssd is
used this will deadlock: sssd logs to syslog, syslog uses NSS to resolve
your user id and hence calls into sssd, and sssd won't reply because it
is still waiting for the syslog write to go through. And there you have
it: your system is deadlocked. You currently get away with this, because
you say: "well, I assume that all current syslog implementations only
look up names via NSS during initialization, for dropping privileges,
and not anymore during runtime. And hence, if we refuse NSS lookups
during initialization, we can get out of the problem." But that is a
dirty trick, and sooner or later syslog implementations might show up
that actually use NSS during normal operation too (for example, I am
still waiting for a syslog daemon which uses SCM_CREDENTIALS to receive
trustworthy source information for log messages, and such a daemon might
very well look up the uids via NSS). Also, there might already be
daemons which fall into the trap you set up for them: for example, I am
pretty sure you will find syslog daemons which do NSS lookups during
configuration reload or when rotating log files. If that happens, and at
the very same time you try to log something, then you have the deadlock:
you try to log, and the logger waits for the NSS request to finish. And
all that during runtime, hours after sssd and syslog started up. And it
doesn't help that you refuse service to syslog during its
initialization.
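Just to illustrate how little it would take for a syslog implementation
to walk into that trap during normal operation, here is a sketch (purely
hypothetical, not code from any existing daemon; the socket path is a
stand-in, since binding /dev/log needs root) of a syslog-style receiver
that uses SCM_CREDENTIALS and then resolves the sender's uid. The
getpwuid() call is exactly the NSS lookup that would route back into
sssd:

/* Hypothetical sketch of a syslog-style daemon that asks the kernel for the
 * sender's credentials on every datagram and resolves the uid to a name.
 * The getpwuid() call goes through NSS, i.e. potentially through nss_sss,
 * which is where the cycle with sssd would close. */
#define _GNU_SOURCE
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <sys/un.h>

static void handle_datagram(int fd) {
        char data[2048];
        char cbuf[CMSG_SPACE(sizeof(struct ucred))];
        struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) - 1 };
        struct msghdr msg = {
                .msg_iov = &iov, .msg_iovlen = 1,
                .msg_control = cbuf, .msg_controllen = sizeof(cbuf),
        };
        struct cmsghdr *c;
        ssize_t n;

        n = recvmsg(fd, &msg, 0);
        if (n < 0)
                return;
        data[n] = 0;

        for (c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c))
                if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
                        struct ucred u;
                        struct passwd *pw;

                        memcpy(&u, CMSG_DATA(c), sizeof(u));

                        /* The dangerous part: a synchronous NSS lookup in the
                         * middle of message processing. If NSS is backed by a
                         * daemon that is itself blocked on logging to us, we
                         * deadlock right here. */
                        pw = getpwuid(u.uid);
                        printf("%s[%d]: %s\n", pw ? pw->pw_name : "unknown", (int) u.pid, data);
                }
}

int main(void) {
        struct sockaddr_un sa = { .sun_family = AF_UNIX, .sun_path = "/tmp/demo-log" };
        int one = 1, fd;

        fd = socket(AF_UNIX, SOCK_DGRAM, 0);
        if (fd < 0)
                return 1;
        unlink(sa.sun_path);
        if (bind(fd, (struct sockaddr *) &sa, sizeof(sa)) < 0)
                return 1;
        setsockopt(fd, SOL_SOCKET, SO_PASSCRED, &one, sizeof(one));  /* request SCM_CREDENTIALS */

        for (;;)
                handle_datagram(fd);
}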

Now, you might come up with the idea that the large socket buffer would
ensure you never block when logging. But that's not really of any help.
On Fedora any user can log to syslog. So a rogue user can connect a few
/usr/bin/logger instances to /dev/urandom and easily make sure that the
socket buffer runs full all the time, and your log write will hence have
to wait until the syslog implementation reads from the socket again. But
it won't, because it is waiting for you.
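If you want to see how cheap that is, here is a tiny self-contained
demonstration (just an illustration using a socketpair, nobody's actual
code): nothing ever reads from the receiving end, so the sender fills
the socket buffer and then simply hangs in send(), which is exactly the
position a daemon blocking on /dev/log would be in:

/* Tiny illustration of the full-buffer problem: the receiving end of the
 * AF_UNIX datagram pair is never read, so after a while the blocking send()
 * stops returning and the sender hangs forever, just like a daemon blocking
 * on a syslog socket whose reader is stuck. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

int main(void) {
        int sv[2];
        char junk[1024];
        long sent = 0;

        if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
                return 1;
        memset(junk, 'x', sizeof(junk));

        for (;;) {
                if (send(sv[0], junk, sizeof(junk), 0) < 0) {
                        perror("send");
                        return 1;
                }
                printf("datagram %ld queued\n", ++sent);
        }
}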

So anyway. What I am saying is that you cannot evade the dependency
cycles by making assumptions about when other daemons validly or
invalidly do NSS lookups. You need to get rid of the cycles. Otherwise,
you will sooner or later run into deadlocks. Maybe they won't be
triggered during normal operation. But as soon as an evil user comes
into play, you have lost. (He can actually make the entire machine stand
still this way: if syslog deadlocks in an NSS call that sssd cannot
process, then no syslog messages will be processed, and step by step all
daemons on the system will come to a standstill as soon as they write
something to syslog.)

To get rid of dependency cycles we have to declare which daemon may use
which other daemon. For example, in the case of mysql and syslog, we can
say that mysql is the client and syslog is the server, and be done with
it. If we look at sssd, things are a bit different though: I think we
must allow syslog to execute NSS queries. And that basically means that
sssd, as a backend of NSS, cannot be allowed to use syslog (unless you
write your own client implementation for /dev/log which works
asynchronously and buffers messages locally if they cannot be written,
as sketched below).
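A rough sketch of the kind of client I mean might look like this (the
names and the queue policy are invented for illustration, and it assumes
a plain AF_UNIX datagram /dev/log): send with MSG_DONTWAIT, and if the
socket buffer is full, park the message in a local queue and retry later
instead of blocking:

/* Rough sketch of a non-blocking /dev/log client as described above: never
 * block on syslog, queue messages in memory when the socket buffer is full,
 * and drop them if even the local queue is full. Dropping beats deadlocking. */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

#define QUEUE_MAX 256

static int log_fd = -1;
static char *queue[QUEUE_MAX];
static size_t queued = 0;

static int log_open(void) {
        struct sockaddr_un sa = { .sun_family = AF_UNIX, .sun_path = "/dev/log" };

        log_fd = socket(AF_UNIX, SOCK_DGRAM, 0);
        if (log_fd < 0)
                return -1;
        return connect(log_fd, (struct sockaddr *) &sa, sizeof(sa));
}

/* Try to push out everything we buffered; stop at the first EAGAIN. */
static void log_flush(void) {
        while (queued > 0) {
                if (send(log_fd, queue[0], strlen(queue[0]), MSG_DONTWAIT) < 0 &&
                    (errno == EAGAIN || errno == EWOULDBLOCK))
                        return;                 /* socket buffer still full, retry later */
                free(queue[0]);
                memmove(queue, queue + 1, --queued * sizeof(char *));
        }
}

/* Never blocks: either the message goes out right away, or it is buffered
 * locally (or dropped, if even the local queue is full). */
static void log_message(const char *msg) {
        log_flush();
        if (queued == 0 && send(log_fd, msg, strlen(msg), MSG_DONTWAIT) >= 0)
                return;
        if (queued < QUEUE_MAX)
                queue[queued++] = strdup(msg);
}

int main(void) {
        if (log_open() < 0)
                return 1;
        log_message("<29>demo: hello from a non-blocking logger");
        log_flush();
        return 0;
}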

> > Or in even other words: this is a theoretical problem, not a practical
> > one, and orthogonal to the problem set systemd tries to solve.
> 
> I wish it were true.
> It is your attitude of ignoring these kinds of problems and just
> marking them as impossible or unimportant that really scares me. I look back
> and I see some of the pulseaudio failures all over again .oO("It's your
> kernel driver that sucks, I don't care, pulseaudio works fine with
> mine, so I won't even attempt a workaround").

Oh come on. I am not ignoring this. I gave you long explanations of why
I think that dependency loops like this are broken, independently of
systemd. We have thought about this issue too, but the simple fact is
that your software is broken if it has a dependency cycle. Fix your
software properly; don't blame systemd for not allowing you to employ
"fixes" that actually just hide the problem a bit longer.

Oh, and thanks a ton for the PA comparison. If you claim I said
something like you suggested, then please find me a quote. Otherwise
please don't make implications about what I might be thinking, because,
well, PA might not be perfect, but it is certainly more the trigger than
the cause of the problems. If you had followed up on this you'd have
known that; otherwise you are just spreading FUD. But anyway, I think
this is the wrong place to say any more about PA.

> Ignoring scenarios just because they are complex to deal with does not
> make them go away. Software will not have the luxury to ignore them.

Well, it's you who is ignoring that your own software is prone to
deadlocks. I am just pointing that out.

Lennart

-- 
Lennart Poettering - Red Hat, Inc.

