On Wed, Aug 08, 2007 at 09:26:08AM +0200, Andrey Ivanov wrote:
Hi,
JB> Hello all. Currently we have an FDS instance running on RHEL4 with a
JB> small number of entries (6,000). We also have a Linux compute cluster of
JB> 100 nodes which uses LDAP for user account data (via libnss_ldap).
JB> nss_ldap on the cluster is configured to use SSL, and everything is fine
JB> most of the time. However, occasionally, when a large job is started on
JB> the cluster, the number of connections increases from 100/minute to
JB> 1600/minute (about 27/sec).
JB> This causes the server to become generally unresponsive, and FDS
JB> especially so (as judged by the time required to retrieve the DSE via
JB> TLS). This is a right pain, as it causes our Samba PDC to time out and
JB> everything goes wrong very quickly.
JB> I can reproducibly impact FDS performance by running:
JB> $ getent passwd | cut -d: -f 1 | while read -r i; do id "$i"; done
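To put a number on the effect of the loop quoted above, it can be wrapped in a small timing helper. This is only a rough sketch: time_lookups is a made-up name for illustration, not an existing tool, and it measures whole seconds only.

```shell
# time_lookups (hypothetical helper): read user names on stdin, run `id`
# for each one, and report how many lookups were made and how long they took.
time_lookups() {
    start=$(date +%s)
    count=0
    while read -r user; do
        # Output and errors are discarded; only the lookup itself matters.
        id "$user" >/dev/null 2>&1
        count=$((count + 1))
    done
    end=$(date +%s)
    echo "$count lookups in $((end - start))s"
}
```

It would be used as: getent passwd | cut -d: -f 1 | time_lookups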
To substantially reduce the number of LDAP (or NIS) requests we use
nscd (the Name Service Caching Daemon). As a result, the number of
LDAP requests easily drops by an order of magnitude... Give it a try
and tune /etc/nscd.conf :)
I have considered nscd, but I've had bad experiences with it in the
past when we ran NIS - usually because an entry was changed and the new
entry was then not seen on all the clients at the same time. We could
avoid that problem by setting the nscd timeout low enough, but it's
another piece of client config that I'd rather avoid.
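
For reference, a low-timeout setup in /etc/nscd.conf would look roughly like the fragment below. It is a sketch only: the 60 and 20 second values are assumptions picked for illustration, not settings tested on this cluster.

```
# /etc/nscd.conf (fragment) - TTL values are illustrative assumptions
enable-cache            passwd  yes
positive-time-to-live   passwd  60
negative-time-to-live   passwd  20

enable-cache            group   yes
positive-time-to-live   group   60
negative-time-to-live   group   20
```

With values like these, a changed entry should stay stale on a client for roughly a minute at most, at the cost of more frequent lookups against the server than a longer timeout would give.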
Andrey Ivanov
tel +33-(0)1-69-33-99-24
fax +33-(0)1-69-33-99-55
Direction des Systemes d'Information
Ecole Polytechnique
91128 Palaiseau CEDEX
France
--
Jonathan Barber
High Performance Computing Analyst
Tel. +44 (0) 1382 386389