Audit / Netlink slowness

Tyler Larson fedora-devel at tlarson.com
Thu Jun 16 14:44:18 UTC 2005


>>>Is this behavior to be expected?
>>
>>Not exactly. There will be a *some* delay as we've added a lot of new
>>functionality, but 800 ms total delay is excessive. I'll see if we can find the
>>culprit.
> 
> 
> Could you please also try an "strace -r" to make sure I'm
> not the only one seeing this?
> 

Here's one of the relevant parts of my strace -r (FC4, 1GHZ athalon).

What stands out to me is that is that the sendto() and recvfrom() operations 
aren't taking very long at all (remember, the relative time stamps are 
*between* system calls, not *elapsed* during system calls). And since 
recvfrom() is called with MSG_DONTWAIT, it's not going to block waiting for 
data. Instead, there's a 100ms sleep before each recvfrom() that really racks 
up the wait time. In total, I counted 9 places where su sleeps for 100ms 
before a recvfrom() -- so that's almost 1 sec killed just sleeping.

I haven't dug through the code yet (which package is it in, anyway?), but I 
can only imagine that the motivation for programming the thing like this (with 
nanosleep + recvfrom w/ MSG_DONTWAIT) is to limit the amount of time spent 
waiting for a reply. Obviously the reply takes *some* time to arrive, and 
calling recvfrom() without MSG_DONTWAIT wouldn't be a viable option--you might 
block forever. Hence, sleep long enough to give up a timeslice and hopefully 
let the reply be generated, then check to see if its there.

A more reasonable solution (assuming it's possible with NETLINK sockets--I 
don't know much about them) would be a select() with a maximum timeout of 
100ms. That gives you worst-case performance equal to what we see now, with 
the potential of much better.

Am I way off-base here, or does that sound like a good idea?

--------------

      0.000100 bind(3, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
      0.000543 fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
      0.000289 readlink("/proc/self/exe", "/bin/su", 4095) = 7
      0.000495 sendto(3, "\20\0\0\0\350\3\5\0\5\0\0\0\0\0\0\0", 16, 0, 
{sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 16
      0.000371 nanosleep({0, 100000000}, NULL) = 0
      0.101499 recvfrom(3, 
"0\0\0\0\350\3\0\0\5\0\0\0Ey\0\0\0\0\0\0\1\0\0\0\1\0\0\0"..., 1216, 
MSG_PEEK|MSG_DONTWAIT, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 48
      0.000428 nanosleep({0, 100000000}, NULL) = 0
      0.101554 recvfrom(3, 
"0\0\0\0\350\3\0\0\5\0\0\0Ey\0\0\0\0\0\0\1\0\0\0\1\0\0\0"..., 1216, 
MSG_DONTWAIT, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 48
      0.000438 sendto(3, "l\0\0\0\355\3\5\0\6\0\0\0\0\0\0\0PAM session 
open"..., 108, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 108
      0.000600 nanosleep({0, 100000000}, NULL) = 0
      0.101949 recvfrom(3, 
"$\0\0\0\2\0\0\0\5\0\0\0Ey\0\0\0\0\0\0\20\0\0\0\350\3\5"..., 1216, 
MSG_PEEK|MSG_DONTWAIT, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 36
      0.000444 recvfrom(3, 
"$\0\0\0\2\0\0\0\5\0\0\0Ey\0\0\0\0\0\0\20\0\0\0\350\3\5"..., 1216, 
MSG_DONTWAIT, {sa_family=AF_NETLINK, pid=0, groups=00000000}, [12]) = 36
      0.000442 close(3)                  = 0




More information about the devel mailing list