Hi, folks
We've been really pleased with our 389 servers, which have been successfully running as a multi-master pair in production for 7 weeks, following (elapsed) months of development.
Unfortunately, in the last few days their performance has radically degraded to the point where they are becoming unusable due to excessive memory consumption. At first, we suspected a recent update to the package - but are no longer convinced that's the problem.
I'd really appreciate any suggestions on how to troubleshoot this further.
SYMPTOMS
389 ignores its 4GB userRoot nsslapd-cachememsize and its overall memory usage expands to consume all of the server's 8GB RAM - and a large proportion of its 10GB swap. The service eventually fails, and hangs the server.
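One way to put numbers on this symptom is to sample the resident and swapped memory of the ns-slapd process and compare them with the configured cache size. The following is only a minimal sketch: the helper names `status_field_kib` and `exceeds_bytes` are made up for illustration, "4GB" is assumed to mean 4 GiB (4294967296 bytes), and the `VmSwap` field may not be reported by every kernel.

```shell
# Print one memory field (in KiB) from a /proc/<pid>/status style file,
# e.g. VmRSS or VmSwap. Prints nothing if the field is not present.
status_field_kib() {
    awk -v field="$2:" '$1 == field { print $2 }' "$1"
}

# Succeed if a usage figure in KiB is larger than a limit given in bytes.
exceeds_bytes() {
    [ "$1" -gt $(( $2 / 1024 )) ]
}

# Example use against a running server (assumes a single ns-slapd process):
#   rss=$(status_field_kib "/proc/$(pidof ns-slapd)/status" VmRSS)
#   exceeds_bytes "$rss" 4294967296 && echo "RSS is above the 4 GiB cache limit"
```

Note that nsslapd-cachememsize only bounds the entry cache, so some memory use above it is expected; the point of sampling is to see how far and how fast the process grows beyond it.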
DETAILS
As per the previous guidance for RHEL 6 installations, we had been using the packages from the COPRS repository (389-ds-base-1.2.11.32-1 aka 1.2.11.32 B2014.247.2316).
(As several of our applications require CoS/dynamic attributes, earlier versions of the RHEL packages were unusable for us until this fix was ported: https://fedorahosted.org/389/ticket/47762#comment:8 - which was only recently applied to the RHEL packages).
Due to an unfortunate misconfiguration, the production servers received automated updates - and were upgraded to RHEL 6.6, the latest COPRS package (1.2.11.32-1), and a new kernel :-/
The directory contains 236,340 entries, and id2entry.db4 is 3.6GB.
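As a rough back-of-envelope check on these two figures, the average on-disk size per entry can be worked out with shell arithmetic. This is only a sketch: `avg_entry_bytes` is a hypothetical helper, and "3.6GB" is assumed to mean 3.6 * 10^9 bytes, which the message does not specify.

```shell
# Average on-disk bytes per entry, using integer division.
# Arguments: database size in bytes, number of entries.
avg_entry_bytes() {
    echo $(( $1 / $2 ))
}

# Assumption: "3.6GB" is taken as 3600000000 bytes.
avg_entry_bytes 3600000000 236340
```

Under that assumption the result is roughly 15 KB per entry on disk, which gives a baseline when judging how many entries a 4GB entry cache might hold.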
STEPS TAKEN
We've tried to balance isolating the problem with maintaining a critical service for our users:
* An additional 8GB RAM (and additional CPU) was added to the VM to mitigate the immediate problem - but this was soon swallowed up too.
* Attempting to reproduce this on our equivalent development servers has not been successful, even when subjecting them to load.
* Downgrading to the RHEL packages (1.2.11.15-48 aka 1.2.11.15 B2014.300.2010) as per Noriko's very helpful procedure below (though I stopped the services prior to the downgrade). No obvious difference.
* Rebooting into the previous kernel (2.6.32-431.29.2). This slowed the problem, but only to an extent.
Not attempted yet:
* Turning on debugging on the production servers, as they're in a fragile and sluggish state.
* Building the latest 1.3.x build on RHEL 6. Will be attempting this soon...
* Installing RHEL 7.0 on a fresh VM, as we don't have experience with 7 yet.
* Downgrading to the previous COPRS package, as it is no longer publicly available(?)
Please let me know if more information would help...
Kind regards, Steve
From: 389-users-bounces@lists.fedoraproject.org On Behalf Of Noriko Hosoi
Sent: 25 October 2014 00:05
To: 389-announce@lists.fedoraproject.org; General discussion list for the 389 Directory server project.
Subject: [389-users] Please take an action: 389 Directory Server 1.2.11.X Discontinued for EL6
389 Directory Server 1.2.11.X Discontinued for EL6
The 389 Directory Server team announces that the binary release of 389-ds-base version 1.2.11 for EL6 via the temporary COPR repository will be discontinued. We encourage you to switch to the official version included in the Red Hat Enterprise Linux 6 distribution or its equivalent OS.
How to switch to the official version
Remove the yum repo file that points to the temporary COPR repository (e.g., nhosoi-389-ds-base-epel6-epel-6.repo) from /etc/yum.repos.d.
If the installed 389 Directory Server 1.2.11 has a build number greater than 15 (for instance, 1.2.11.32), downgrade it once with "yum downgrade" as follows.
yum downgrade 389-ds-base 389-ds-base-libs
Then, upgrade to make sure you have the latest version.
yum upgrade 389-ds-base
After the upgrade completes, run setup-ds-admin.pl -u to update your directory server/admin server/console information.
setup-ds-admin.pl -u
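The build-number condition in the downgrade step above can be checked mechanically before running yum. This is a minimal sketch under stated assumptions: `needs_downgrade` is a hypothetical helper, and it assumes the version string has the form 1.2.11.X optionally followed by a "-release" suffix.

```shell
# Succeed if a 1.2.11.X[-release] version string has a build number X
# greater than 15, i.e. the "yum downgrade" step applies.
needs_downgrade() {
    version=${1%%-*}        # strip the release suffix, e.g. "-1" or "-48"
    build=${version##*.}    # keep the last dot-separated field
    [ "$build" -gt 15 ]
}

# Example use on an installed system:
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}' 389-ds-base)
#   needs_downgrade "$installed" && echo "downgrade first, then upgrade"
```

After the switch, `rpm -q 389-ds-base 389-ds-base-libs` should report the distribution's build rather than the COPR one.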
See the Install Guide (http://www.port389.org/docs/389ds/legacy/install-guide.html) for more information about the initial installation, setup, and upgrade.
See Source (http://www.port389.org/docs/389ds/development/source.html) for information about source tarballs and SCM (git) access.
http://www.port389.org/docs/389ds/releases/end-1-2-11.html
___________________________________________________________ This email has been scanned by MessageLabs' Email Security System on behalf of the University of Brighton. For more information see http://www.brighton.ac.uk/is/spam/ ___________________________________________________________
On 11/17/2014 01:12 PM, Steve Holden wrote:
Please let me know if more information would help…
valgrind output would be extremely useful, but it will definitely impact performance while in use. If you need help with this we can take it offline.
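For readers who have not run the server under valgrind before, the sketch below prints a possible command line rather than executing it. Everything specific in it is an assumption, not something stated in this thread: the instance name `slapd-example` is hypothetical, the paths are the usual RHEL 6 package locations, and the valgrind options are one reasonable choice among many. The directory service would need to be stopped first, and the run will be much slower than normal.

```shell
# Hypothetical instance name; replace with the real slapd-<name> directory.
INSTANCE="slapd-example"

# Print (rather than run) a valgrind command line for ns-slapd.
# -D points at the instance config directory; -d 0 keeps the server
# in the foreground so valgrind stays attached to it.
valgrind_cmd() {
    echo "valgrind --leak-check=full --num-callers=40" \
         "--log-file=/var/tmp/ns-slapd.vg.%p" \
         "/usr/sbin/ns-slapd -D /etc/dirsrv/$1 -d 0"
}

valgrind_cmd "$INSTANCE"
```

Printing the command first makes it easy to review and adjust before running it on a fragile production host.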
So you still see an issue with 1.2.11.15-48 as you did with 1.2.11.32-1? But you didn't see the issue until the systems upgraded and you went to RHEL6.6/1.2.11.32-1, correct?
Thanks, Mark
-- 389 users mailing list 389-users@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/389-users