Hi everyone,
We've been discussing RAID support for quite some time as a team, and I wanted to extend a few questions to our existing user base to hopefully get a better idea of what our users would prefer/expect from RAID support.

Just to do some level-setting: Stratis's goals have always been a little different from LVM's. We aim to be an opinionated, easy-to-use option with reasonable defaults that covers the large percentage of users who have typical storage use cases. We do not aim to cover every feature or configuration, and we would recommend something like LVM with a filesystem for the more complicated, advanced cases we do not support.

That being said, we have been discussing a few different options for RAID support, and wanted to hear from our user base about what cases they would ideally like to see supported.

The short list of options that we've generally considered is:
* RAID 1 only
* RAID 10 only
* RAID 1, RAID 10, and RAID 6 without supporting takeover between them

RAID 1 only
Pros:
* From a simplicity perspective, this is one of the simpler options. It's mirroring across devices, so it would be very easy to understand from a usage perspective.
* Easy to change the redundancy factor on a RAID array later (move from two copies of data to three, etc.)

Cons:
* Number of disks would have to be a multiple of the redundancy factor (redundancy factor of 2 means a multiple of 2 disks).
* Large storage overhead (minimum of 1/2 of the disk space is used for replication)

RAID 10 only
Pros:
* This allows a lot of flexibility, particularly in terms of adding devices. Adding a single device to an array is relatively easy because of the mirroring/striping combination of RAID 10, so there is no requirement that the number of disks be a multiple of the redundancy factor.
* Making a RAID array larger is very simple. With a redundancy factor of 2, adding a single device adds half of that device's capacity to the RAID array's capacity.

Cons:
* The redundancy factor cannot easily be changed after creating a RAID array, so changing it may require creating a new pool and copying the data over.
* Large storage overhead (minimum of 1/2 of the disk space is used for replication)
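
To make the disk-count and capacity tradeoffs between the first two options concrete, here is a rough Python sketch. It is an idealized model, not a description of how Stratis would implement either option: it assumes equal-sized disks, ignores metadata, and the function names and the `copies` parameter (standing in for the redundancy factor) are hypothetical.

```python
def raid1_usable(num_disks: int, disk_size: int, copies: int = 2) -> int:
    """Usable capacity when whole disks are grouped into mirror sets.

    Assumed model: each mirror set holds `copies` equal-sized disks, so the
    disk count must be a multiple of `copies`.
    """
    if copies < 2:
        raise ValueError("mirroring needs at least two copies")
    if num_disks < copies or num_disks % copies != 0:
        raise ValueError("disk count must be a multiple of the redundancy factor")
    return (num_disks // copies) * disk_size


def raid10_usable(num_disks: int, disk_size: int, copies: int = 2) -> int:
    """Usable capacity when each chunk is stored `copies` times across all disks.

    Assumed model: chunks are striped over every disk, so any disk count of
    at least `copies` works, including odd counts.
    """
    if copies < 2:
        raise ValueError("mirroring needs at least two copies")
    if num_disks < copies:
        raise ValueError("need at least as many disks as copies")
    return (num_disks * disk_size) // copies


size = 4000  # GiB per disk, an arbitrary example value
print(raid1_usable(4, size))                             # 8000
print(raid10_usable(5, size))                            # 10000
print(raid10_usable(6, size) - raid10_usable(5, size))   # 2000
```

In this model, growing a RAID 10 array from five to six 4000 GiB disks adds 2000 GiB (half of the new device), while `raid1_usable(5, size)` raises an error because five is not a multiple of two.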

RAID 1, RAID 10, and RAID 6 without takeover support

(For those who are unfamiliar, RAID 6 is usually best suited to archival storage: it has the least storage overhead of the options listed above, but its write performance is much slower than that of the other two. It also tolerates the failure of any two disks in the array with no data loss.)

Pros:
* This would allow users to select the RAID level that best suits their use case

Cons:
* This could be hard to implement in practice if different RAID levels require different amounts of metadata
* May be confusing to users unfamiliar with RAID if each option has different requirements for the number, or minimum number, of disks.
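
For anyone weighing the storage-overhead side of this option, the following Python sketch compares the fraction of raw space spent on redundancy at each level. It is a simplified model under stated assumptions: equal-sized disks, no metadata, two disks' worth of parity for RAID 6, and a hypothetical `overhead_fraction` helper with a `copies` parameter standing in for the redundancy factor. Real implementations impose their own minimum disk counts for RAID 6 (commonly four or more); the check below only rules out counts that leave no room for data.

```python
from fractions import Fraction


def overhead_fraction(level: str, num_disks: int, copies: int = 2) -> Fraction:
    """Fraction of raw capacity used for redundancy in an idealized array."""
    if level in ("raid1", "raid10"):
        # Every block is stored `copies` times; all but one copy is overhead.
        return Fraction(copies - 1, copies)
    if level == "raid6":
        if num_disks < 3:
            raise ValueError("two parity disks leave no room for data")
        # Two disks' worth of parity, regardless of array size.
        return Fraction(2, num_disks)
    raise ValueError(f"unknown RAID level: {level}")


for disks in (4, 8, 12):
    print(disks, overhead_fraction("raid10", disks), overhead_fraction("raid6", disks))
```

Under these assumptions, mirroring always costs half of the raw space, while the RAID 6 overhead shrinks as disks are added: 1/2 with four disks, 1/4 with eight, and 1/6 with twelve.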

With all of the tradeoffs outlined, which of these would be preferable? Which ones would you likely use Stratis for? Any feedback you're willing to share in the context of current usage would be appreciated.

Thanks!

--
John
he/him
Principal Software Engineer, Stratis team