xinetd delays in in.rshd responses (cluster problem, long)

Tue Nov 8 09:26:19 UTC 2005

Tim Prendergast wrote:
> I have somewhat of a complex issue, hoping someone here may have some
> insight.
> 
> I have a beowulf cluster of systems for a scientific application we run.
> This cluster consists of 32 diskless slaves running FC4 w/ a monolithic
> kernel, and a master node running FC4 with a custom kernel. The master
> issues rsh commands to the slaves to grab a chunk of a job and process it,
> then return it.
> 
> My issue is this -- we have an old RH9 cluster that is similar in design,
> and the rsh commands (measured using `time rsh node2 uname -a`) take
> around 0.050s to complete. On the FC4 system, the same command takes
> around 0.650-1.35s. I've traced the delay to xinetd or in.rshd, but am at
> a loss going any further. I've run some straces and I can see the delay
> occur. I've pasted the straces below for reference.
> 
> Does anyone have any idea why this delay is happening? These systems are
> all wired up over gig-e (0.1-0.2ms pings round trip) and running dual
> 3.4GHz Xeons w/ 2MB cache and 1GB mem in each slave, 4GB mem in the
> master. There is a lot of processing power here, so I can't see a reason
> for the delay. The PAM, rhosts, hosts.equiv, etc are all identical among
> the nodes (and the clusters).
> 
[snip]
========================
> 
> Here you can clearly see the delay happen:
> <cut and paste section of interest from above>
> 16:18:55.297196 writev(3, [{"root\0", 5}, {"root\0", 5}, {"uname -a\0", 9}], 3) = 19 <0.000038>
> 16:18:55.297332 read(3, "\0", 1)        = 1 <0.632181>
> 16:18:55.929633 rt_sigprocmask(SIG_SETMASK, [], [URG], 8) = 0 <0.000039>
> 16:18:55.929763 setuid32(0)             = 0 <0.000039>
> <end cut and paste>
> 
> It looks like it takes .63s to write the data to the socket and get the
> response, which I find hard to fathom (especially since anything outside of
> xinetd's realm appears to be really fast over the network).
> 
> Thoughts?
> 
> -Tim
> 

Can you strace the server side, and also capture the packets on both interfaces
to see what times they are sent and received?
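
When comparing the client and server traces, it may help to pull out only the
slow calls rather than reading them by eye. The following is a minimal sketch,
assuming the traces were taken with `strace -tt -T` (the format shown in the
excerpt above). The function name `slow_calls` and the 0.1s threshold are
illustrative choices of mine, not anything from the original traces.

```python
import re

# Matches lines in `strace -tt -T` format, e.g.
#   16:18:55.297332 read(3, "\0", 1)        = 1 <0.632181>
# Groups: wall-clock timestamp, syscall name, seconds spent in the call.
LINE_RE = re.compile(r'^(\d{2}:\d{2}:\d{2}\.\d+)\s+(\w+)\(.*<(\d+\.\d+)>\s*$')


def slow_calls(lines, threshold=0.1):
    """Return (timestamp, syscall, seconds) for calls taking >= threshold seconds."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip signals, unfinished calls, and other non-matching lines
        elapsed = float(m.group(3))
        if elapsed >= threshold:
            hits.append((m.group(1), m.group(2), elapsed))
    return hits


# The four lines from the excerpt quoted above.
sample = [
    r'16:18:55.297196 writev(3, [{"root\0", 5}, {"root\0", 5}, {"uname -a\0", 9}], 3) = 19 <0.000038>',
    r'16:18:55.297332 read(3, "\0", 1)        = 1 <0.632181>',
    r'16:18:55.929633 rt_sigprocmask(SIG_SETMASK, [], [URG], 8) = 0 <0.000039>',
    r'16:18:55.929763 setuid32(0)             = 0 <0.000039>',
]

for ts, name, seconds in slow_calls(sample):
    print(f"{ts} {name} took {seconds:.6f}s")
```

Running the same filter over the server-side trace should show which call on
the slave lines up with the client's blocked read.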

The delay comes when the client sends the login information with the username.
It could be a delay in mapping a username to a userid. What authentication
mechanism is running on both systems? Is nscd running to cache the user
information on both systems, or does it require a network lookup from NIS,
LDAP, etc.?
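
One rough way to check the username lookup in isolation is to time it directly
on a slave. This is only a sketch: it assumes Python is available on the node,
and the helper name `time_user_lookup` is made up for illustration. It goes
through the same system name service (files, NIS, LDAP, nscd) that the C
library uses, but it does not reproduce what in.rshd itself does.

```python
import pwd
import time


def time_user_lookup(username="root", repeats=3):
    """Time repeated username-to-uid lookups via the system name service."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            pwd.getpwnam(username)
        except KeyError:
            pass  # unknown user; the time spent looking is still informative
        timings.append(time.perf_counter() - start)
    return timings


for i, seconds in enumerate(time_user_lookup(), 1):
    print(f"lookup {i}: {seconds * 1000:.3f} ms")
```

If the first lookup takes hundreds of milliseconds and later ones are fast,
that would point at an uncached network lookup; if all of them are fast, the
delay is probably elsewhere.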

-- 
Nigel Wade, System Administrator, Space Plasma Physics Group,
             University of Leicester, Leicester, LE1 7RH, UK
E-mail :    nmw at ion.le.ac.uk
Phone :     +44 (0)116 2523548, Fax : +44 (0)116 2523555