On Thu, Mar 5, 2015 at 9:48 AM, Alex Regan mysqlstudent@gmail.com wrote:
I currently have a 3TB backup system using five 1TB disks in RAID5. Restore times in case of disk failure are already exceedingly long,
Oh yeah, there are several things you need to check if you're using mdadm-created RAID.
md/stripe_cache_size in sysfs defaults to 256. It needs to be higher, probably 1024, but do some reading to get more specific advice. Low values cause slow performance in general, and slow rebuilds in particular.
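A rough sketch of checking and sizing this, not a recommendation for your exact hardware. The array name md0 and the helper stripe_cache_mib are placeholders I made up. The memory estimate uses the formula from the kernel md documentation (page size x number of disks x stripe_cache_size) and assumes 4 KiB pages.

```shell
#!/bin/sh
# Hypothetical helper: estimate stripe cache memory in MiB.
# Assumes memory = page_size * nr_disks * stripe_cache_size.
stripe_cache_mib() {
    entries=$1
    disks=$2
    page_size=${3:-4096}   # assumed 4 KiB pages unless told otherwise
    echo $(( entries * disks * page_size / 1024 / 1024 ))
}

# Cost of 1024 entries on a five-disk array.
echo "1024 entries on 5 disks: $(stripe_cache_mib 1024 5) MiB"

# Show the current value if the (placeholder) array exists.
f=/sys/block/md0/md/stripe_cache_size
if [ -r "$f" ]; then
    echo "current stripe_cache_size: $(cat "$f")"
    # To change it, as root:  echo 1024 > "$f"
    # The setting does not survive a reboot.
fi
```

The point of the estimate is that a bigger cache costs RAM, so it is worth knowing the number before raising the value much further.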
These may also be too low by default, max in particular:

/proc/sys/dev/raid/speed_limit_max
/proc/sys/dev/raid/speed_limit_min
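To get a feel for why these matter, here is a back-of-the-envelope sketch. Both limits are in KiB/s per device. The helper resync_hours is hypothetical, and the two speeds are what I believe the usual kernel defaults to be (1000 min, 200000 max), so check your own system. It just divides the size of one 1TB member by the speed.

```shell
#!/bin/sh
# Hypothetical helper: whole hours to read or write one member disk
# end to end at a constant speed.
#   $1 = member size in KiB, $2 = speed in KiB/s
resync_hours() {
    echo $(( $1 / $2 / 3600 ))
}

one_tb_kib=976562500   # 10^12 bytes expressed in KiB

echo "at 1000 KiB/s:   $(resync_hours "$one_tb_kib" 1000) hours"
echo "at 200000 KiB/s: $(resync_hours "$one_tb_kib" 200000) hours"

# Print the live values if this kernel has md loaded.
for f in /proc/sys/dev/raid/speed_limit_min /proc/sys/dev/raid/speed_limit_max; do
    if [ -r "$f" ]; then
        echo "$f = $(cat "$f")"
    fi
done
# To change one, as root (the value here is only an example):
#   echo 50000 > /proc/sys/dev/raid/speed_limit_min
```

The min value is the floor the rebuild is held to when the array is busy with other I/O, which is why a busy backup server can crawl through a rebuild.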
A pernicious problem that's totally non-obvious and comes up on the linux-raid@ list all the time: mismatched SCT ERC and SCSI command timer values. The former needs to be shorter than the latter. Both are per drive, not per array.

smartctl -l scterc <dev>
cat /sys/block/sdX/device/timeout
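One thing that trips people up when comparing the two: smartctl reports SCT ERC in tenths of a second, while the sysfs timeout is in seconds. A small sketch of the comparison follows. The function name is made up, and the 70 and 30 are just the values commonly seen (7.0 s ERC, 30 s kernel timer), not something read from your drives.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the drive's error recovery limit
# (SCT ERC, in deciseconds) is enabled and shorter than the kernel's
# SCSI command timer (in seconds).
#   $1 = ERC in deciseconds (0 means ERC is disabled), $2 = timer in seconds
erc_shorter_than_timer() {
    erc_ds=$1
    timer_s=$2
    [ "$erc_ds" -gt 0 ] && [ "$erc_ds" -lt $(( timer_s * 10 )) ]
}

# 7.0 s ERC against a 30 s command timer.
if erc_shorter_than_timer 70 30; then
    echo "ok: the drive gives up before the kernel does"
else
    echo "mismatch: fix the ERC or raise the timer"
fi

# Typical adjustments, per drive and as root (sdX is a placeholder):
#   smartctl -l scterc,70,70 /dev/sdX          # 7.0 s read/write ERC, if the drive supports it
#   echo 180 > /sys/block/sdX/device/timeout   # otherwise raise the kernel timer (180 is a commonly suggested value)
```

Neither setting is guaranteed to persist across a power cycle, so people usually put them in a boot script.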
And don't forget periodic scrubs:

echo check > /sys/block/mdX/md/sync_action
cat /sys/block/mdX/md/mismatch_cnt
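If you want something to drop into cron or run by hand, here is a minimal sketch that reports the scrub state for every md array it finds. The function scrub_status and its optional second argument (an alternate sysfs root, handy for dry runs) are my own invention. It only reads; starting the check is left as the echo above.

```shell
#!/bin/sh
# Hypothetical helper: print the current sync action and mismatch count
# for one md array.
#   $1 = array name (e.g. md0), $2 = optional sysfs root, default /sys
scrub_status() {
    md_dir="${2:-/sys}/block/$1/md"
    if [ ! -r "$md_dir/sync_action" ]; then
        echo "$1: not an md array" >&2
        return 1
    fi
    echo "$1: action=$(cat "$md_dir/sync_action") mismatch_cnt=$(cat "$md_dir/mismatch_cnt")"
}

# Report on every md array present on this machine.
for d in /sys/block/md*; do
    if [ -d "$d/md" ]; then
        scrub_status "${d##*/}"
    fi
done
```

Read mismatch_cnt after the check has finished (sync_action back to idle). A nonzero count on RAID5 is worth looking into.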