I'm not getting nearly the read speed I expected from a newly defined software RAID 5 array across three disk partitions (on the 3 drives, of course!).
Would someone kindly point me straight?
After defining the RAID 5 I did `hdparm -t /dev/md0' and got the abysmal read speed of ~65MB/sec. The individual device speeds are ~55, ~71, and ~75 MB/sec.
Shouldn't this array be running (at the slowest) at about 55+71 = 126 MB/sec? I defined a RAID0 on the ~55 and ~71 partitions and got about 128 MB/sec.
Shouldn't adding a 3rd (faster!) drive into the array make the RAID 5 speed at least this fast?
Here are the details of my setup:
# fdisk -l /dev/sda
Disk /dev/sda: 160.0 GB, 160000000000 bytes
255 heads, 63 sectors/track, 19452 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         127     1020096   82  Linux swap / Solaris
/dev/sda2   *         128         143      128520   83  Linux
/dev/sda3             144       19452   155099542+  fd  Linux raid autodetect
# fdisk -l /dev/sdb
Disk /dev/sdb: 160.0 GB, 160000000000 bytes
255 heads, 63 sectors/track, 19452 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1         127     1020096   82  Linux swap / Solaris
/dev/sdb2             128         143      128520   83  Linux
/dev/sdb3             144       19452   155099542+  fd  Linux raid autodetect
# fdisk -l /dev/sdc
Disk /dev/sdc: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1   *           1         127     1020096   82  Linux swap / Solaris
/dev/sdc2             128       19436   155099542+  fd  Linux raid autodetect
/dev/sdc3           19437       60801   332264362+  8e  Linux LVM
The RAID 5 consists of sda3, sdb3, and sdc2. These partitions have these individual read speeds:
# hdparm -t /dev/sda3 /dev/sdb3 /dev/sdc2
/dev/sda3: Timing buffered disk reads: 168 MB in 3.03 seconds = 55.39 MB/sec
/dev/sdb3: Timing buffered disk reads: 216 MB in 3.03 seconds = 71.35 MB/sec
/dev/sdc2: Timing buffered disk reads: 228 MB in 3.02 seconds = 75.49 MB/sec
After defining RAID 5 with:
mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sda3 /dev/sdb3 /dev/sdc2
and waiting the 50 minutes for /proc/mdstat to show it was finished, I did `hdparm -t /dev/md0' and got ~65MB/sec.
Dean
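Before chasing numbers, it can help to confirm how the array was actually put together, since the chunk size and an unfinished resync both depress read figures. A minimal sanity-check sketch, using the same device names as above (it only reads state and changes nothing):

  # Chunk size, layout, and member state of the finished array
  mdadm --detail /dev/md0

  # Make sure the initial resync has really completed before benchmarking
  cat /proc/mdstat

  # Repeat the same read test on the array and on each member for comparison
  for dev in /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdc2; do
      hdparm -t "$dev"
  done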
This of course is very logical...
RAID 5 writes to all 3 disks at about the same time, plus it has to write the
parity/verification data, which also adds some overhead.
so the average speed = (55 + 71 + 75) / 3 ≈ 67...
So your speed measurement is correct...
Compare that with your RAID 0 config...
RAID 0 writes to all disks simultaneously (so if you write 100 MB, it is
3 x 33.3 MB, one piece on each disk).
If you add more disks, the array does not necessarily become faster, because
of the parity overhead that has to be calculated...
http://en.wikipedia.org/wiki/Standard_RAID_levels
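The "verification data" here is RAID 5's parity, which is just the XOR of the data chunks in each stripe; reconstructing a lost chunk is another XOR over the survivors. A toy illustration in the shell, with two made-up data bytes standing in for two data chunks:

  d1=0xA5; d2=0x3C                                 # two data chunks (made-up values)
  p=$(( d1 ^ d2 ))                                 # parity written to the third disk
  printf 'parity       = 0x%02X\n' "$p"
  printf 'recovered d1 = 0x%02X\n' $(( p ^ d2 ))   # XOR with the survivor gives d1 back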
On Tue, 2007-09-18 at 18:26 +0200, Test wrote:
> This of course is very logical...
> RAID 5 writes to all 3 disks at about the same time, plus it has to write
> the parity/verification data, which also adds some overhead.
> so the average speed = (55 + 71 + 75) / 3 ≈ 67...
> So your speed measurement is correct...
I wouldn't have expected such poor performance for *reads*, which is what the OP complained about specifically. Even the web link you provided states:
"The read performance of RAID 5 is almost as good as RAID 0 for the same number of disks. Except for the parity blocks, the distribution of data over the drives follows the same pattern as RAID 0."
So RAID5 should, presumably, be able to split the reads over multiple disks and achieve much better than disk-average performance when reading.
> Compare that with your RAID 0 config...
> RAID 0 writes to all disks simultaneously (so if you write 100 MB, it is
> 3 x 33.3 MB, one piece on each disk).
> If you add more disks, the array does not necessarily become faster, because
> of the parity overhead that has to be calculated...
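If RAID 5 reads really are striped like RAID 0, a long sequential read from the md device should comfortably beat any single member. One rough cross-check of the hdparm figures, sketched with the device names from this thread (dd's direct-I/O flag bypasses the page cache; the 1 GiB size is arbitrary, and both commands are read-only):

  dd if=/dev/md0  of=/dev/null bs=1M count=1024 iflag=direct
  dd if=/dev/sda3 of=/dev/null bs=1M count=1024 iflag=direct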
Alan M. Evans writes:
: So RAID5 should, presumably, be able to split the reads over multiple
: disks and achieve much better than disk-average performance when
: reading.
That was roughly my thinking (being a RAID N00BIE)
But even for writes, my thinking is (was?) as follows:
If I write 100 MB of data to the RAID 5, then the 100 MB gets split (roughly) into
a 50 MB piece for the 55MB/s disk, a 50 MB piece for the 71MB/s disk, and a 50 MB piece for the 75MB/s disk.
Two of these pieces are (striped) data, one is parity. The slowest drive determines the time it takes to complete the write:
T = max( 50MB/(55MB/s) , 50MB/(71MB/s) , 50MB/(75MB/s) ).
So the entire 100 MB of data is written (neglecting parity calcs) in T = 0.909 seconds, and the avg. data rate is 100 MB / 0.909 s ≈ 110 MB/s.
Where's my mistake?
What are others who run software RAID 5 seeing compared to the individual partition speeds?
Dean
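For what it's worth, the arithmetic in that model checks out; spelled out with the thread's own figures (parity computation and read-modify-write overhead ignored, as stated), the slowest member gates each stripe:

  awk 'BEGIN {
      t = 50/55                          # seconds for the slowest 50 MB piece
      if (50/71 > t) t = 50/71
      if (50/75 > t) t = 50/75
      printf "stripe time %.3f s  ->  %.0f MB/s for 100 MB of data\n", t, 100/t
  }'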
On Tuesday 18 September 2007, Dean S. Messing wrote:
What are others who run software RAID 5 seeing compared to the individual partition speeds?
I have one server here with two ATA drives on a single motherboard PATA channel
(master/slave setup), and two drives on an add-on ATA/133 PATA HBA (two masters,
no slaves). Here are the simple hdparm read test results:
+++++++++++++++
[root@itadmin ~]# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 hdg1[3] hde1[2] hdb1[1] hda1[0]
      480238656 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>
[root@itadmin ~]# for disk in hda hdb hde hdg md0
do hdparm -t /dev/${disk}
done
/dev/hda: Timing buffered disk reads: 170 MB in 3.02 seconds = 56.26 MB/sec
/dev/hdb: Timing buffered disk reads: 172 MB in 3.03 seconds = 56.70 MB/sec
/dev/hde: Timing buffered disk reads: 170 MB in 3.01 seconds = 56.55 MB/sec
/dev/hdg: Timing buffered disk reads: 170 MB in 3.01 seconds = 56.51 MB/sec
/dev/md0: Timing buffered disk reads: 372 MB in 3.01 seconds = 123.77 MB/sec
[root@itadmin ~]#
+++++++++++++++++
It would run faster if I put another PATA HBA in and put hdb on it; anytime you do a master/slave ATA setup you limit the throughput. A good rule of thumb is to make sure each PATA drive is master and alone on its channel; adding a slave drive to any of the PATA HBA ports will not increase (and will likely decrease) array throughput.
You might think that's not the case due to the way the numbers look above. Well,
I tried a little test with four hdparm -t's running concurrently (this is a dual
Xeon box, and handles the test nicely). Note how the two drives set master and
slave slow when accessed concurrently:
+++++++++++++
[root@itadmin ~]# hdparm -t /dev/hda & hdparm -t /dev/hdb & hdparm -t /dev/hde & hdparm -t /dev/hdg
[2] 17631
[3] 17632
[4] 17633
/dev/hda:
/dev/hde:
/dev/hdb:
/dev/hdg:
 Timing buffered disk reads: Timing buffered disk reads: Timing buffered disk reads: Timing buffered disk reads:
 106 MB in 3.01 seconds = 35.20 MB/sec
 106 MB in 3.02 seconds = 35.06 MB/sec
 170 MB in 3.02 seconds = 56.22 MB/sec
 170 MB in 3.03 seconds = 56.17 MB/sec
[1]   Done    hdparm -t /dev/hde
[2]   Done    hdparm -t /dev/hda
[3]-  Done    hdparm -t /dev/hdb
[4]+  Done    hdparm -t /dev/hde
[root@itadmin ~]#
++++++++++++++++++++
SATA on the other hand is different.
If your box has multiple PCI buses put each HBA (particularly if they are 32-bit PCI and the drives are ATA/133 or SATA) on a separate bus if possible. The box above has three PCI-X buses; the motherboard HBA (along with the motherboard integrated U320 SCSI) is on one and the ATA/133 HBA is on another (the GigE NIC is on the third).
Note that 32-bit PCI on a single bus will throttle your total throughput to about 133MB/s anyway (33MHz PCI clock times 4 bytes transferred per clock). If you have PCI-e slots, even x1, getting PCI-e HBA's will dramatically improve total throughput if your drives can handle it, as each PCI-e lane can do 250MB/s (2.5GHz PCI-e clock; 8B/10B encoded).
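Those bus figures follow directly from the clock rates; as a back-of-the-envelope check only:

  # 32-bit / 33 MHz PCI: ~33.33 MHz x 4 bytes per transfer, shared by the whole bus
  awk 'BEGIN { printf "PCI:   %.0f MB/s shared\n", 33.33e6 * 4 / 1e6 }'

  # PCI-e: 2.5 Gb/s per lane, 8b/10b encoded (8 payload bits per 10 on the wire)
  awk 'BEGIN { printf "PCI-e: %.0f MB/s per lane\n", 2.5e9 * 8/10 / 8 / 1e6 }'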
Bonnie++ gives me a different picture:
+++++++++++++++++
Size: 3072MB
Sequential Output:  Per Chr: 31.1MB/s   Block: 43.8MB/s   Rewrite: 25.3MB/s
Sequential Input:   Per Chr: 33.1MB/s   Block: 87.7MB/s
Random Seeks: 203.7 per second
++++++++++++++++++
Which is not too awful (not great compared to my FC SAN's results, but I can't
publish those results due to EULA restrictions).
For comparison, my laptop (single 7.2K RPM 100GB SATA, Intel Core 2 Duo 2GHz,
2GB RAM, Fedora 7):
Filesize: 4096M
Seq Writes: 30MB/s    Block Writes: 30MB/s    Rewrites: 17MB/s
Seq Reads:  42MB/s    Block Reads:  44MB/s    Random Seek: 118.4/s
+++++++++++++++
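For anyone wanting to reproduce that comparison, an invocation roughly like the one below produces those bonnie++ columns. The mount point is a placeholder for wherever the array's filesystem is mounted, and the -s size (in MB) should exceed the machine's RAM so the page cache doesn't flatter the results:

  # /mnt/raidtest is a placeholder; the directory must be writable by the -u user
  bonnie++ -d /mnt/raidtest -s 3072 -u nobody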
Lamar Owen wrote:
<snip>
: /dev/hda:  Timing buffered disk reads: 170 MB in 3.02 seconds = 56.26 MB/sec
: /dev/hdb:  Timing buffered disk reads: 172 MB in 3.03 seconds = 56.70 MB/sec
: /dev/hde:  Timing buffered disk reads: 170 MB in 3.01 seconds = 56.55 MB/sec
: /dev/hdg:  Timing buffered disk reads: 170 MB in 3.01 seconds = 56.51 MB/sec
: /dev/md0:  Timing buffered disk reads: 372 MB in 3.01 seconds = 123.77 MB/sec
Thanks very much Lamar.
I understand most of what you wrote in your very detailed email. Your numbers, above, are something like what I would expect.
Why do you suppose I get such a low number for my /dev/md0:
sda: ~56 MB/sec
sdb: ~71 MB/sec
sdc: ~75 MB/sec
md0: ~65 MB/sec
True, I only have three SATA devices (the box won't take a 4th). I don't think this is a case of bus saturation. The three SATA cables all go to the motherboard
(Dell Precision 490 workstation, not sure what Mobo it is, SATA chipset is Intel 631/632 SATA AHCI)
but when I run the two fast drives in RAID0 mode `hdparm -t /dev/md0' gives ~140MB/sec.
(I take it you have not adjusted the "read-ahead" parms of the disks or md0? I found that it dramatically increases hdparm numbers, but I suspect there is a downside---something about "no free lunches".)
Dean
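On the read-ahead question: one way to see and experiment with it, using the device names from this thread. The 4096-sector (2 MiB) value is only an example, not a recommendation; a large read-ahead mostly flatters long sequential reads (which is exactly what hdparm -t measures) and can hurt random I/O:

  # Current read-ahead, in 512-byte sectors, for the array and its members
  for dev in /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdc2; do
      printf '%s: ' "$dev"; blockdev --getra "$dev"
  done

  # Try a larger read-ahead on the array only, then repeat the read test;
  # the old value can be restored with the same command.
  blockdev --setra 4096 /dev/md0
  hdparm -t /dev/md0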