Hi,
I'm looking to build a 1 PB (usable) volume on a Fedora or Red Hat platform. The volume (it has to be a single volume) will be shared out over NFS. My considerations, in order of importance, are:
1. Data loss
2. Cost
3. Speed
I am currently considering some less expensive storage arrays using 4TB SAS disks in 5+1 RAID 5 configurations (each array would be responsible for 10 such volumes). This will give me one level of protection. Then I'm going to propose a DR site with a full second set of storage. That will give me a 2nd level of protection.
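A quick back-of-the-envelope check on that layout (a sketch, assuming decimal terabytes, no hot spares, and ignoring filesystem overhead):

```python
# Sizing 1 PB usable out of 5+1 RAID 5 sets of 4 TB disks.
# Assumes decimal units (1 PB = 1000 TB); spares and FS overhead ignored.
DISK_TB = 4
DATA_DISKS_PER_SET = 5      # 5+1 RAID 5: five data disks, one parity
DISKS_PER_SET = 6

usable_per_set_tb = DATA_DISKS_PER_SET * DISK_TB    # 20 TB per RAID 5 set
sets_needed = 1000 // usable_per_set_tb             # 50 sets for 1 PB
total_disks = sets_needed * DISKS_PER_SET           # 300 disks

print(usable_per_set_tb, sets_needed, total_disks)  # 20 50 300
```

So the plan lands at roughly 300 spindles before any spares are added.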
Now I would like a 3rd level of protection and I thought about using GlusterFS. If I read it right, it would duplicate the data so that it doesn't reside in a single place.
So, has anyone used Gluster? Does anyone have any other suggestions that might work for a situation like this? All options are currently on the table so it's play time! :)
Thanks!
On 01/09/2013 05:09 PM, aragonx@dcsnow.com wrote:
Hi,
I'm looking to build a 1PB (usable) volume on a Fedora or Redhat platform. The volume (has to be a single volume) will be shared out by NFS. My considerations in the order of importance are:
At this size I imagine you need some speed with multiple concurrent accesses. NFS is not up to the task; I would advise using Lustre, which will also give you the aggregation of space that you want.
1. Data loss
2. Cost
3. Speed
I am currently considering some less expensive storage arrays using 4TB SAS disks in 5+1 RAID 5 configurations (each array would be responsible
If you value your data I would advise strongly against RAID 5 (the array is lost as soon as you have 1 disk down and 1 transient read error during the rebuild). IMHO you need RAID 6, with volumes of 15 disks (in a 60-HDD 4U JBOD you can have 4 RAID 6 volumes).
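To put numbers behind that warning, here is a rough sketch of the chance that a RAID 5 rebuild hits an unrecoverable read error (URE); the URE rates are assumptions based on commonly quoted vendor figures, not measurements:

```python
# Rough chance of hitting an unrecoverable read error (URE) while rebuilding
# a failed disk in a 5+1 RAID 5 of 4 TB disks -- the "1 disk down plus 1
# transient read error" failure mode. URE rates are assumptions: vendors
# commonly quote ~1e-15 errors/bit for enterprise SAS, ~1e-14 for nearline.
surviving_disks = 5
bits_to_read = surviving_disks * 4e12 * 8   # every surviving bit is read

for ure_rate in (1e-15, 1e-14):
    p = 1 - (1 - ure_rate) ** bits_to_read
    print(f"URE {ure_rate:g}/bit: ~{p:.0%} chance the rebuild hits an error")
```

Even at the optimistic enterprise rate this comes out around 15% per rebuild; at the nearline rate it approaches 80%, which is why RAID 6's second parity disk matters at these disk sizes.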
HTH, Adrian
Hi there, some thoughts on that:
On 09.01.2013 17:09, aragonx@dcsnow.com wrote:
Data loss
This is a matter of backup strategy; do you mean availability here?
Cost Speed
I am currently considering some less expensive storage arrays using 4TB SAS disks in 5+1 RAID 5 configurations (each array would be responsible for 10 such volumes).
AFAIK the biggest SAS disks are 900 GB; do you mean NL-SAS? Even with the 4 TB disks (which I would not recommend, since they are ridiculously overpriced compared to 3 TB) in RAID + hot spare, we're talking ~300 disks here. How do you plan to house, connect, power and monitor them?
Now I would like a 3rd level of protection and I thought about using GlusterFS. If I read it right, it would duplicate the data in a way so that it doesn't reside in a single place.
So you want a redundant storage system? Over what link do you plan to sync the two storages? This all sounds very ambitious to me. Maybe you should consider buying an entry-level or even mid-range storage system like an IBM DS3512 or similar...
Thanks!
regards Jens
Hi there, some thoughts on that:
On 09.01.2013 17:09, aragonx@dcsnow.com wrote:
Data loss
This is a matter of backup strategy; do you mean availability here?
This is a long-term archive and this data will only reside here.
AFAIK the biggest SAS disks are 900 GB; do you mean NL-SAS? Even with the 4 TB disks (which I would not recommend, since they are ridiculously overpriced compared to 3 TB) in RAID + hot spare, we're talking ~300 disks here. How do you plan to house, connect, power and monitor them?
I calculated we would need 3 racks for the arrays, servers and switches. We have the space in our datacenter so that won't be an issue.
So you want a redundant storage system? Over what link do you plan to sync the two storages?
The application would probably write to both clusters at the same time. The application also runs at two sites, and I would locate this storage close to each.
This all sounds very ambitious to me. Maybe you should consider buying an entry-level or even mid-range storage system like an IBM DS3512 or similar...
We have been looking at a Dell PowerVault group of arrays or the Compellent (BSD isn't so bad). I'll take a closer look at the IBM too. But since this data will exist nowhere else, I would like some additional redundancy beyond the protection the array and DR site provide.
So far, complete solutions to handle all of this go well above my budget.
Thank you!
On 01/09/2013 06:04 PM, aragonx@dcsnow.com wrote:
I calculated we would need 3 racks for the arrays, servers and switches. We have the space in our datacenter so that won't be an issue.
Why 3 racks? In a 42U rack you can fit 2 of: 1 x 2U server with 2 x RAID HBA and 1 x 10 Gbit NIC (2 ports), plus 4 x 4U 60-HDD JBOD.
That means 36U, and with 4 RAID 6 volumes on each JBOD you get 8 * (60 - 8) = 416 data disks... with 3 TB disks I think you can reach 1 PiB of formatted space.
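That arithmetic looks right; a quick sketch (assuming decimal 3 TB disks, binary PiB, and ignoring filesystem overhead):

```python
# One 42U rack: 2 x (2U server + 4 x 60-HDD JBOD) = 8 JBODs in 36U.
# Each JBOD carries 4 x 15-disk RAID 6 volumes, i.e. 8 parity disks per JBOD.
jbods = 2 * 4
data_disks = jbods * (60 - 8)           # disks actually holding data
raw_bytes = data_disks * 3e12           # 3 TB disks, decimal units

print(data_disks)                       # 416
print(round(raw_bytes / 2**50, 2))      # ~1.11 PiB before filesystem overhead
```

So a single rack of eight 60-disk JBODs clears the 1 PiB target with a little headroom for filesystem overhead.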
HTH, Adrian