Weird high load on server

nodata fedora at nodata.co.uk
Tue Jul 19 14:38:45 UTC 2005


> On 7/19/05, nodata <fedora at nodata.co.uk> wrote:
>> > Hi Guys,
>> >
>> > Hope you experts can help me out here.
>> >
>> > Basically I have a server running at a very high load (2.44), although
>> > nothing is noticeably high when using top. There aren't any processes
>> > running on the box except the standard Linux OS tools. This box is
>> > used for backup, and only becomes active during the night.
>> >
>> > It's a Compaq DL380 with a RAID 5 configuration.
>> >
>> > Can anyone suggest what I can do to find out why the load is high?
>> >
>> > Thanks for your help in advance.
>> >
>> > Dan
>> >
>>
>> I bet you have hanging NFS mounts.
>> If the box is constantly at a load of around 2.44, and isn't sluggish, I
>> wouldn't worry.
>>
>> Look at iostat, sar, etc. to find out why the load is like that.
>>
>
>
> Hi
>
> I've looked at these but can't see anything. The server doesn't mount
> or export any filesystems using NFS or any other protocol. If it helps
> here are the various outputs:
>
>  uptime
>  14:45:49  up 62 days, 43 min,  2 users,  load average: 1.46, 1.57, 1.59
>
> sar 5 10
> Linux 2.4.21-27.0.4.ELsmp (orion.gs.moneyextra.com)     19/07/05
>
> 14:46:02          CPU     %user     %nice   %system     %idle
> 14:46:07          all      0.00      0.00      0.00    100.00
> 14:46:12          all      0.00      0.00      0.10     99.90
> 14:46:17          all      0.00      0.00      0.10     99.90
> 14:46:22          all      0.00      0.00      0.00    100.00
> 14:46:27          all      0.00      0.00      0.00    100.00
> 14:46:32          all      0.00      0.00      0.10     99.90
> 14:46:37          all      0.00      0.00      0.00    100.00
> 14:46:42          all      0.10      0.00      0.31     99.59
> 14:46:47          all      0.00      0.00      0.00    100.00
> 14:46:52          all      0.00      0.00      0.00    100.00
> Average:          all      0.01      0.00      0.06     99.93
>
> vmstat -a
> procs                      memory      swap          io     system      cpu
>  r  b   swpd   free  inact active   si   so    bi    bo   in    cs us sy wa id
>  0  0      0  15404 189668 202836    0    0     3     1    0     2  3  4  1  3
>
> free -m
>              total       used       free     shared    buffers     cached
> Mem:           498        483         15          0        128        301
> -/+ buffers/cache:         53        445
> Swap:         1027          0       1027
>
> iostat
> Linux 2.4.21-27.0.4.ELsmp (orion.gs.moneyextra.com)     19/07/05
>
> avg-cpu:  %user   %nice    %sys   %idle
>            3.11    0.00    3.72   93.17
>
> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
> /dev/ida/c0d0    19.68       427.93       279.15 2147483647 1400883506
> /dev/ida/c0d0p1   0.00         0.22         0.00    1087144       8986
> /dev/ida/c0d0p2   0.65         3.72        10.24   18680778   51401528
> /dev/ida/c0d0p3   0.00         0.00         0.00        248          0
> /dev/ida/c0d0p4   0.00         0.00         0.00          0          0
> /dev/ida/c0d0p5   0.74         3.90         6.88   19570498   34517568
> /dev/ida/c0d0p6   0.00         0.00         0.00        168          0
> /dev/ida/c0d0p7   0.00         0.00         0.00        168          0
> /dev/ida/c0d0p8  18.29       427.93       262.03 2147483647 1314955424
>
> top
>  14:47:51  up 62 days, 45 min,  2 users,  load average: 1.73, 1.61, 1.59
> 61 processes: 60 sleeping, 1 running, 0 zombie, 0 stopped
> CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
>            total    0.4%    0.0%    0.0%   0.0%     0.0%    0.0%   99.5%
>            cpu00    0.9%    0.0%    0.0%   0.0%     0.0%    0.0%   99.0%
>            cpu01    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%  100.0%
> Mem:   510400k av,  495224k used,   15176k free,       0k shrd,  132000k buff
>                     203040k actv,  182824k in_d,    6852k in_c
> Swap: 1052592k av,       0k used, 1052592k free                  308668k cached
>
>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
> 13100 root      20   0  1092 1092   888 R     0.4  0.2   0:00   0 top
>     1 root      15   0   512  512   452 S     0.0  0.1   1:18   0 init
>     2 root      RT   0     0    0     0 SW    0.0  0.0   0:00   0 migration/0
>     3 root      RT   0     0    0     0 SW    0.0  0.0   0:00   1 migration/1
>     4 root      15   0     0    0     0 SW    0.0  0.0   0:00   1 keventd
>     5 root      34  19     0    0     0 SWN   0.0  0.0   0:00   0 ksoftirqd/0
>     6 root      34  19     0    0     0 SWN   0.0  0.0   0:00   1 ksoftirqd/1
>     9 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 bdflush
>     7 root      15   0     0    0     0 SW    0.0  0.0  70:21   0 kswapd
>     8 root      15   0     0    0     0 SW    0.0  0.0  23:07   1 kscand
>    10 root      15   0     0    0     0 SW    0.0  0.0   3:30   0 kupdated
>    11 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 mdrecoveryd
>    18 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 ahc_dv_0
>    19 root      25   0     0    0     0 SW    0.0  0.0   0:00   0 scsi_eh_0
>    23 root      15   0     0    0     0 SW    0.0  0.0   2:30   1 kjournald
>   192 root      15   0     0    0     0 SW    0.0  0.0   0:00   0 kjournald
>   193 root      15   0     0    0     0 SW    0.0  0.0  13:57   1 kjournald
>   194 root      15   0     0    0     0 SW    0.0  0.0   4:18   0 kjournald
>   568 root      15   0   576  576   492 S     0.0  0.1   0:57   0 syslogd
>   572 root      15   0   472  472   408 S     0.0  0.0   0:00   1 klogd
>   582 root      15   0   452  452   388 S     0.0  0.0   5:33   1 irqbalance
>   599 rpc       15   0   600  600   524 S     0.0  0.1   0:22   0 portmap
>   618 rpcuser   25   0   720  720   644 S     0.0  0.1   0:00   0 rpc.statd
>   629 root      15   0   400  400   344 S     0.0  0.0   0:18   0 mdadm
>   712 root      15   0  3160 3160  2024 S     0.0  0.6   3:22   1 snmpd
>   713 root      25   0  3160 3160  2024 S     0.0  0.6   0:00   0 snmpd
>   722 root      15   0  1576 1576  1324 S     0.0  0.3   4:58   1 sshd
>
> Anyone have any ideas? The box is literally sitting there, not doing
> anything that has been scheduled.
>
> This happens occasionally, then the load spontaneously goes down. Do
> you reckon it has something to do with the RAID 5?
>
> Thanks
> Dan
>

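Nothing in your top output is using any CPU, which fits: on Linux the load
average counts processes in uninterruptible sleep (state D) as well as
runnable ones, so one or two processes stuck waiting on disk I/O will hold
the load up while the CPU sits at 100% idle. Check for stuck processes:
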
ps auxw | grep " D "
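
That grep can miss processes whose STAT field carries extra flags (e.g.
"DW"), so if it comes up empty, try something like this instead (a rough
sketch, assuming a procps-style ps that accepts the :width specifier):

ps -eo pid,stat,wchan:20,comm | awk '$2 ~ /^D/'

The wchan column shows the kernel function each stuck process is sleeping
in, which usually points at the driver or filesystem it is waiting on.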

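One more thing: iostat with no interval argument only reports averages
since boot, so the figures you posted say nothing about what the array is
doing right now. Run iostat 5 (or sar -b 5 10) while the load is high to
see whether the ida array is actually busy when this happens.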


