Memory loss after long uptime

Konstantin Svist fry.kun at gmail.com
Fri Apr 27 21:10:44 UTC 2012


Hi all,

I have a strange recurring problem on some of my servers, maybe someone 
can help me figure out what might be causing it.

After a MySQL machine has run under fairly high load for a while 
(about a month), the RAM usage reported by top stops adding up. Normally, 
the sum of the RES column accounts for everything currently resident in 
RAM (not swap) and corresponds pretty well to used - buffers - cached. 
This is always the case after a fresh boot, and it seems to be the case 
on most other servers. But that's not the case here:

59098540k used - 1633832k cached - 25824k buffers = 57438884k (54.8G) 
supposedly used by all processes
sum(RES column from top) ~= 35127296k (33.5G)

So where's the other 21G??
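
For anyone who wants to check the arithmetic, here is a minimal Python 
sketch of the same calculation. The function and variable names are only 
illustrative; the figures are copied from the top header in the 
attachment and from the RES sum quoted above, all in KiB (top's "k").

```python
# All figures are in KiB, as printed by top with the "k" suffix.
KIB_PER_GIB = 1024 * 1024

def unaccounted_kib(used, cached, buffers, res_sum):
    """Memory claimed by 'used - cached - buffers' but not by summed RES."""
    return used - cached - buffers - res_sum

total = 66110332      # "Mem: ... total" from the top header
used = 59098540       # "Mem: ... used"
cached = 1633832      # "cached" (shown on top's Swap line)
buffers = 25824       # "buffers"
res_sum = 35127296    # approximate sum of the RES column, as quoted above

process_side = used - cached - buffers
gap = unaccounted_kib(used, cached, buffers, res_sum)

print(f"used - cached - buffers = {process_side}k ({process_side / KIB_PER_GIB:.1f}G)")
print(f"unaccounted             = {gap}k ({gap / KIB_PER_GIB:.1f}G)")
print(f"share of total RAM      = {gap / total:.1%}")
```

The same function can be fed the numbers from any other machine to see 
whether its gap is near zero, negative (shared pages counted more than 
once in RES), or large and positive as it is here.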

I don't think I'm nitpicking here: more than 30% of total system RAM is 
unaccounted for. I've tried stopping every process that isn't part of 
the base OS, and restarting some services that seemed to use more RAM 
than they should (irqbalance), but the memory is not reclaimed.

In the few years that I've been seeing this problem, I've replaced all 
the hardware, upgraded Fedora from 8 to 14 (currently running 
2.6.35.14-95.fc14.x86_64), and upgraded MySQL and the other code, 
without any sign of improvement.

Interestingly, another machine with the same hardware and OS runs MySQL 
in slave mode to replicate the DB. That machine has an uptime of 134 days 
and does not exhibit the same symptoms. In fact, here is its memory 
footprint:

63166496k used - 19786032k cached - 2881484k buffers = 40498980k (38.6G)
sum(RES column from top) ~= 45.5G (which makes sense since a few 
RAM-hungry processes share memory)
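
To illustrate why summed RES can exceed the real total when processes 
share pages: each process's RES counts a shared page in full, while the 
"Pss" (proportional set size) lines in /proc/<pid>/smaps divide it among 
the processes mapping it. Below is a rough Python sketch that sums both 
fields from a smaps dump. The helper names and the sample text are made 
up for illustration, and it assumes the kernel exposes Pss in smaps.

```python
import re

def sum_field(smaps_text, field):
    """Sum every '<field>: N kB' line of a /proc/<pid>/smaps dump, in KiB."""
    pattern = r"^" + re.escape(field) + r":\s+(\d+) kB"
    return sum(int(n) for n in re.findall(pattern, smaps_text, re.MULTILINE))

def read_smaps(pid):
    """Read the smaps file for one PID (usually needs root for other users)."""
    with open(f"/proc/{pid}/smaps") as f:
        return f.read()

# Made-up sample: one mapping half-shared with another process, one private.
sample = """\
Size:               2048 kB
Rss:                1024 kB
Pss:                 512 kB
Size:                 64 kB
Rss:                  64 kB
Pss:                  64 kB
"""

rss = sum_field(sample, "Rss")
pss = sum_field(sample, "Pss")
print(f"Rss = {rss} kB, Pss = {pss} kB, counted twice via sharing = {rss - pss} kB")
```

On a live system, something like sum_field(read_smaps(pid), "Pss") added 
up over all PIDs should avoid the double counting that a plain RES sum 
has.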

Please help!

-------------- next part --------------
top - 13:19:00 up 41 days,  2:29, 25 users,  load average: 3.44, 4.32, 3.97
Tasks: 349 total,   2 running, 347 sleeping,   0 stopped,   0 zombie
Cpu0  : 76.5%us,  0.0%sy,  0.0%ni, 23.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  5.9%sy,  0.0%ni, 29.4%id, 64.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  5.9%us,  0.0%sy,  0.0%ni, 88.2%id,  5.9%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  5.6%us,  0.0%sy,  0.0%ni, 94.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  5.3%us,  5.3%sy, 15.8%ni, 73.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.0%us,  0.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  66110332k total, 59098540k used,  7011792k free,    25824k buffers
Swap:  8388604k total,  4009604k used,  4379000k free,  1633832k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3953 mysql     20   0 34.7g  30g 4000 S 80.0 48.5  57931:46 /usr/local/mysql/libexec/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/var --user=mysql --log-error=/usr/loc
 3132 wrkr      20   0 2931m 2.4g 1476 R  5.7  3.8   1219:32 /usr/local/bin/python ...
10921 wrkr      20   0 9184m 483m 3080 S  0.0  0.7  43:03.02 /usr/local/bin/python ...
26682 wrkr      30  10  261m 118m 1048 S 11.4  0.2  11:47.18 mytop
 2548 wrkr      20   0  520m  94m 1388 S  0.0  0.1 116:57.50 /usr/local/bin/python ...
 4046 wrkr      20   0  181m  59m  228 S  0.0  0.1  17:30.62 SCREEN -a -A
 3009 redis     20   0  132m  45m  380 S  0.0  0.1 974:29.29 /usr/local/redis/bin/redis-server ...conf
 4228 nobody    20   0  302m  32m  132 S  0.0  0.1   0:14.33 nginx: worker process
 4232 nobody    20   0  302m  32m   44 S  0.0  0.1   0:10.04 nginx: worker process
 4230 nobody    20   0  302m  32m    0 S  0.0  0.1   0:14.73 nginx: worker process
 4226 nobody    20   0  302m  32m    0 S  0.0  0.1   0:14.49 nginx: worker process
 4231 nobody    20   0  302m  32m   56 S  0.0  0.1   0:13.01 nginx: worker process
 4233 nobody    20   0  302m  32m   60 S  0.0  0.1   0:14.36 nginx: worker process
 4235 nobody    20   0  302m  32m   32 S  0.0  0.1   0:14.35 nginx: worker process
 4229 nobody    20   0  298m  32m    0 S  0.0  0.1   0:15.37 nginx: worker process
 4227 nobody    20   0  302m  32m   12 S  0.0  0.1   0:15.49 nginx: worker process
 4234 nobody    20   0  302m  32m    0 S  0.0  0.1   0:15.46 nginx: worker process
10300 root      20   0  298m  32m    0 S  0.0  0.1   0:00.28 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
 3185 wrkr      20   0  255m  17m  416 S  0.0  0.0  18:12.48 /usr/local/bin/python ...
 3067 wrkr      20   0  247m  12m    0 S  0.0  0.0  12:23.26 /usr/local/bin/python ...
 5186 gdm       20   0  443m 9644   28 S  0.0  0.0   1:26.96 /usr/libexec/gnome-settings-daemon --gconf-prefix=/apps/gdm/simple-greeter/settings-manager-plugins
 4031 root      20   0  133m 9564  684 S  0.0  0.0   2:32.52 /usr/bin/Xorg :0 -nr -verbose -auth /var/run/gdm/auth-for-gdm-205vfs/database -nolisten tcp vt1
 5280 gdm       20   0  547m 5728 2724 S  0.0  0.0   2:38.00 /usr/libexec/gdm-simple-greeter
30374 root      25   5  130m 5476 1716 S  0.0  0.0   0:00.21 /usr/local/mysql/bin/mysql --port=3306 ...
 3740 root      20   0 29888 3228    0 S  0.0  0.0   0:00.05 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/usr/local/mysql/var --pid-file=
 2019 root      20   0  200m 3028  672 S  0.0  0.0  20:43.13 /usr/sbin/snmpd -LS0-6d -Lf /dev/null -p /var/run/snmpd.pid
27868 root      20   0 22632 2816    0 S  0.0  0.0   0:00.03 tail -F /usr/local/mysql/var/log-error.err
23248 root      20   0 22632 2704    0 S  0.0  0.0   0:00.00 tail -F /usr/local/mysql/var/log-error.err
30435 root      25   5  127m 2560 1716 S  0.0  0.0   0:00.02 /usr/local/mysql/bin/mysql --port=3306 ...
 5116 gdm       20   0  139m 2320  244 S  0.0  0.0   0:14.86 /usr/libexec/gconfd-2
26225 root      20   0  129m 2164  980 S  0.0  0.0   0:00.18 zsh
 4357 root      20   0  130m 2140  680 S  0.0  0.0   0:01.55 /bin/zsh
 5266 gdm       20   0  390m 1912  392 S  0.0  0.0   0:21.57 metacity
15765 root      20   0  130m 1732  532 S  0.0  0.0   0:00.34 /bin/zsh
28228 root      20   0 94312 1592  524 S  0.0  0.0   0:01.02 sendmail: accepting connections
 4312 root      20   0  130m 1396   60 S  0.0  0.0   0:00.86 /bin/zsh
30437 root      20   0 15272 1332  828 R 11.4  0.0   0:00.10 top
30370 root      25   5  107m 1320 1124 S  0.0  0.0   0:00.00 /bin/bash /opt/...sh
30431 root      25   5  107m 1320 1124 S  0.0  0.0   0:00.00 /bin/bash /opt/...sh
11212 gdm       20   0  343m 1300  240 S  0.0  0.0   0:00.91 /usr/bin/gnome-screensaver --no-daemon
 4270 root      20   0  130m 1252    4 S  0.0  0.0   0:03.06 /bin/zsh
27698 smmsp     20   0 75960 1236  372 S  0.0  0.0   0:00.00 sendmail: Queue runner at 01:00:00 for /var/spool/clientmqueue
27407 root      20   0  114m 1172  864 S  0.0  0.0   0:14.99 /bin/zsh /opt/...sh
27408 root      20   0  114m 1144  864 S  0.0  0.0   0:07.28 /bin/zsh /opt/...sh

