I need to buy a new RAID for our company server. The one we have is both too slow and too full. I started thinking about this during this year’s NAB as I was talking to a variety of storage vendors about their latest products.
NOTE: A RAID (Redundant Array of Inexpensive Drives) is a collection of hard disks that are stored in a single box, connect via a single cable and, when attached to the computer, act as though they were one very large, very fast hard drive. RAIDs are used when you need more speed or storage capacity than a single hard drive can provide.
Our server is a fairly new Mac Mini with a Thunderbolt port, so I’m looking for a Thunderbolt RAID 5. (Here’s an article that describes what the different RAID levels mean.)
Our office network includes about twelve computers, three of which are wireless. Three wired computers do audio and video editing, while the rest handle standard office and web work.
The network is wired for gigabit Ethernet, as fiber or 10-gig Ethernet is way outside my budget. What this means is that the maximum data transfer rate between the server and wired computers is limited by the gigabit Ethernet Protocol, which is about 110 MB/second after overhead. Wireless devices will be much slower, depending upon which wireless protocol they support.
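To put that 110 MB/second figure in practical terms, here is a minimal sketch of the arithmetic. The file size is a made-up example, and the function name is mine, not part of any tool.

```python
def transfer_seconds(file_size_mb: float, rate_mb_per_s: float = 110.0) -> float:
    """Seconds needed to move a file at a sustained rate (MB divided by MB/second)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return file_size_mb / rate_mb_per_s

# Raw gigabit Ethernet: 1,000 megabits/second divided by 8 bits per byte.
RAW_GIGABIT_MB_PER_S = 1000 / 8   # 125.0, before protocol overhead

# A hypothetical 11,000 MB media file at the ~110 MB/second quoted above.
print(transfer_seconds(11_000))   # 100.0
```

The gap between 125 and 110 MB/second is the protocol overhead mentioned above. Real transfers will also vary with the switch and with how many computers are active at once.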
The genesis of this article came when I started to think about what gear to buy.
NOTE: Your network switch will often determine how efficient your network is. Low-cost switches often can’t support transferring data at full speed to all connected computers at the same time. For this reason, I upgraded to a Cisco SMB switch. It costs more, but avoids bottlenecks.
FACTORS TO CONSIDER
In thinking about this, I realized that there are five main factors to consider when buying a RAID:
Storage capacity
Connection protocol
Data transfer rate
IOPS
RAID controller
The importance of each of these factors changes depending upon your needs. So, let me explain what they are and when you need to consider them for your own system.
STORAGE CAPACITY
The storage capacity of a RAID (or any hard disk) is measured in either Gigabytes or Terabytes. A Gigabyte is 1,024 Megabytes, while a Terabyte is 1,024 Gigabytes.
NOTE: In an effort to make storage numbers more accessible, many technology marketing departments describe a gigabyte as 1,000 MB, or a terabyte as 1,000 GB. However, hard disks don’t read marketing literature. After formatting, RAIDs will always store less data than printed on the packaging.
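The gap between the two ways of counting can be sketched in a few lines. This only covers the 1,000 versus 1,024 difference; formatting overhead is separate and depends on the file system, so it is not modeled here. The function names and the 4 TB example are illustrative assumptions.

```python
def marketed_tb_to_binary_tb(marketed_tb: float) -> float:
    """Convert a capacity advertised in decimal TB (10**12 bytes)
    into binary terabytes (2**40 bytes)."""
    return marketed_tb * 10**12 / 2**40

def shortfall_percent(power: int) -> float:
    """Percent gap between 1000**power and 1024**power
    (power 1 = kilo, 2 = mega, 3 = giga, 4 = tera)."""
    return (1 - 1000**power / 1024**power) * 100

# A drive sold as "4 TB" holds about 3.64 binary terabytes.
print(round(marketed_tb_to_binary_tb(4), 2))   # 3.64

# The gap widens as the units get bigger.
for name, power in [("KB", 1), ("MB", 2), ("GB", 3), ("TB", 4)]:
    print(name, round(shortfall_percent(power), 1), "percent")
```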
We are all familiar with picking a hard disk based on storage capacity. If we are storing small files, a smaller capacity is fine. If we are storing large files, more space will be necessary.
As you are deciding what size RAID to buy, keep in mind the adage: “It is impossible to buy a hard disk that is either too big or too fast.” Always buy a bit more than you need.
CONNECTION PROTOCOL
In the past, we had two choices on how to connect a hard drive to our computer:
Those protocols were good for the time, but neither was very fast. In fact, the protocol was slower than the internal speed of a hard disk. (FireWire 800, for example, transfers data at about 85 MB/sec.)
Now, we have several additional new choices:
For Mac users, the last two options, mini-SAS and eSATA, require PCIe cards or conversion boxes. While the protocols themselves are excellent, if you are starting fresh, use Thunderbolt because it is easier to connect and as fast as, or faster than, mini-SAS or eSATA.
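I’ve been told that USB is optimized for smaller files – think office files – while Thunderbolt is optimized for larger files – think media files. When it comes to speed, these new protocols are EXTREMELY fast:
USB 3 has a theoretical speed of 640 MB/second
Thunderbolt 1 has a theoretical speed of 1.1 GB/second
Thunderbolt 2 has a theoretical speed of 2.2 GB/second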
However, as you’ll see in the next section, theoretical speed and actual speed are radically different.
DATA TRANSFER RATE
The data transfer rate is the speed that data travels between the computer and the RAID (or hard disk). For RAIDs and other storage, we measure this in MB/second; higher numbers indicate faster performance.
What you need to understand about standard hard drives – also called “spinning media” – is that a single hard drive can only transfer data at about 120 MB/second. (SSD, or Flash drives, are much faster and we’ll talk about them in a minute.)
This means that if you need speeds faster than 120 MB/second, you need to group multiple hard drives to work together. This is what a RAID does – it groups a bunch of standard hard disks together so they can transfer data faster.
Just to help you think about this, here are some data transfer speeds of different codecs (actual data transfer rates will vary with image size and frame rate):
For multicam editing, multiply the speed of the codec you are using by the number of cameras you are editing.
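The multicam rule and the 120 MB/second per-drive figure combine into a quick estimate. This is a rough sketch that assumes speed scales perfectly with the number of drives and ignores parity and controller overhead, so treat the result as a floor. The 18 MB/second codec rate and the nine cameras are hypothetical numbers chosen only for illustration.

```python
import math

SINGLE_DRIVE_MB_PER_S = 120  # per-drive figure quoted in the article

def multicam_rate(codec_mb_per_s: float, cameras: int) -> float:
    """Data rate for a multicam edit: codec rate times number of cameras."""
    return codec_mb_per_s * cameras

def min_drives_for(rate_mb_per_s: float,
                   per_drive_mb_per_s: float = SINGLE_DRIVE_MB_PER_S) -> int:
    """Fewest spinning drives whose combined speed covers the rate,
    assuming ideal scaling (an optimistic assumption)."""
    return max(1, math.ceil(rate_mb_per_s / per_drive_mb_per_s))

needed = multicam_rate(18, 9)        # hypothetical codec at 18 MB/second, 9 cameras
print(needed)                        # 162
print(min_drives_for(needed))        # 2
```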
Data transfer rate is the most important spec we need to consider for direct-attached RAIDs and drives. However, if we are attaching a RAID to a server, the data transfer rate is determined by the network protocol, in my case gigabit Ethernet, not the speed of the RAID. For a server RAID, a fast data transfer rate is important, but not critical.
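One way to picture this: the speed a user sees is set by the slowest link in the chain. A minimal sketch, where the 700 MB/second RAID speed is a hypothetical figure and 110 MB/second is the gigabit Ethernet number from earlier:

```python
def effective_rate(*stage_rates_mb_per_s: float) -> float:
    """End-to-end rate is limited by the slowest stage in the path."""
    if not stage_rates_mb_per_s:
        raise ValueError("need at least one stage")
    return min(stage_rates_mb_per_s)

# Hypothetical RAID at 700 MB/second behind gigabit Ethernet at 110 MB/second.
print(effective_rate(700, 110))   # 110
```

Making the RAID twice as fast changes nothing in this example, because the network remains the limit.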
A SIDE NOTE ON SSD DRIVES
SSD stands for “Solid State Drive.” It takes a bunch of RAM and makes it look like a hard disk to the computer. SSDs provide all the speed of RAM with the permanent memory of spinning media.
The good news is that SSDs are very, VERY fast. The bad news is that they are very expensive and don’t store as much as spinning media.
Depending upon which controller the SSD uses and the type of NAND-based flash memory, SSD drives can attain speeds of more than 1.0 GB per second when playing back a single file sequentially. This is fast enough to fully fill a Thunderbolt 1 connection. However, SSD speeds slow down dramatically when performing random reads and writes, which is what a server requires.
In general, SSDs are an excellent choice for boot drives and, if you can afford them, for media drives that are direct attached. However, for small businesses, a standard hard drive is a better option for server storage because most of that speed is lost when transferring data over the network.
Here’s an excellent article that compares SSD drives with standard hard drives.
IOPS
While the data transfer rate is critical for direct-attached RAIDs, when we are attaching a RAID to a server, a different measurement becomes more important: IOPS (pronounced: “eye opps”). This is the number of Input / Output oPerations per Second the RAID can perform.
With a server, multiple users are accessing different files on the same RAID at the same time. IOPS measure how quickly the RAID can respond to all these different requests.
Since the overall data transfer rate is determined by the network – which is FAR slower than the native data transfer rate of the RAID – we need to concentrate on which RAID can process the greatest number of requests in the least amount of time within the budget that we have to work with.
NOTE: As you might expect, high-performance storage with high IOPS to meet the needs of hundreds of users costs in the tens-of-thousands of dollar range, and fills entire equipment racks with drives. While providing vast performance and storage, they are beyond the budget of most smaller shops, like mine. As with all things, we need to balance performance against budget.
Calculating IOPS involves some tricky math and varies depending upon the RAID level you are using. (Do a Google search for “Calculate IOPS” and you’ll see what I mean.) However, when you are buying a RAID for a server, check its IOPS rating. The higher the IOPS rating, the better the RAID will perform when multiple users are accessing the RAID at the same time.
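For a feel of what that math looks like, here is a rough sketch based on commonly quoted rules of thumb, not on any vendor's rating method. It assumes a spinning drive's IOPS is set by seek time plus rotational latency, and that each RAID level carries a "write penalty" (the number of back-end operations per write). The penalty table, the 8.5 ms seek time, and the 70/30 read/write mix are all assumptions for illustration.

```python
# Commonly quoted write penalties: back-end I/Os needed per front-end write.
WRITE_PENALTY = {"RAID 0": 1, "RAID 1": 2, "RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

def drive_iops(avg_seek_ms: float, rpm: int) -> float:
    """Rough IOPS for one spinning drive."""
    # On average the platter must turn half a revolution to reach the data.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)

def array_iops(drives: int, per_drive_iops: float,
               raid_level: str, read_fraction: float) -> float:
    """Rough front-end IOPS for an array with a given read/write mix."""
    if not 0 <= read_fraction <= 1:
        raise ValueError("read_fraction must be between 0 and 1")
    raw = drives * per_drive_iops
    write_fraction = 1 - read_fraction
    # Each read costs one back-end I/O; each write costs `penalty` of them.
    cost_per_request = read_fraction + write_fraction * WRITE_PENALTY[raid_level]
    return raw / cost_per_request

one_drive = drive_iops(avg_seek_ms=8.5, rpm=7200)   # hypothetical 7200 RPM drive
print(round(one_drive))                             # roughly 79
print(round(array_iops(4, one_drive, "RAID 5", read_fraction=0.7)))
```

Real arrays add caching and controller effects that this ignores, which is one reason a published IOPS rating is more useful than a hand calculation.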
RAID CONTROLLER
There are two types of RAID controllers: hardware-based and software-based.
When performance is important, look for a hardware RAID controller. When flexibility is more important, a software RAID controller may be a better choice.
SUMMARY
The media storage industry is in the process of transitioning from older protocols to Thunderbolt 1 and 2. I saw this in all the announcements that were made at NAB in April of this year. Supporting Thunderbolt, or USB 3, means that storage can be attached to any Mac without needing a PCIe card or converter box.
And, for users with a deep investment in PCIe cards, expansion options were offered from a wide variety of vendors, including ATTO, Sonnet, mLogic and Akitio.
However, most of these new products won’t be shipping for a while. So this gives me time to do my research and figure out which drives make the most sense for what I need.
Ultimately, I will be buying two RAIDs: one for the server and one for high-performance direct-attached editing. Given what I’ve learned in researching this article, they won’t be the same product, because they don’t do the same job.
I’ll let you know what I decide. In the meantime, I’m always interested in your thoughts.
UPDATE – April 21, 2014
After I published this article, I realized that I forgot to include a link to other articles I’ve written on storage. There is a wealth of information here that can save you a ton of headaches: Storage Basics – Collected Articles
21 Responses to Specs to Consider When Buying a RAID [u]
Hi Larry,
Have you done a recent review on Drobo drives? I would like to hear your detailed assessment of them.
P.S. I don’t work for them, I have worked for clients that use them.
Phil
Philip:
I’ve reviewed most of their recent units. Use the search box at the top of my website and search for “Drobo.”
Larry
First note – we dropped “Inexpensive” for Independent long ago, so Redundant Array of Independent Disks. Inexpensive only applied during the time when there were two types of drives – drives that were aimed at small systems (inexpensive) and drives that were aimed at midrange to mainframe systems (very expensive).
Regarding performance, things can look very different from the trenches. Everything that you say about performance in the article is true, but not necessarily correct when sitting in the trenches.
For example, you mention the theoretical top speeds for the interfaces:
USB 3 has a theoretical speed of 640 MB/second
Thunderbolt 1 has a theoretical speed of 1.1 GB/second
Thunderbolt 2 has a theoretical speed of 2.2 GB/second
However, in reality users can reliably expect to see performance of:
USB 3 has a real speed of ~100 MB/second for a SINGLE drive, ~245 MB/sec for an array
Thunderbolt 1 has a real speed of ~700MB/second when using an array
Thunderbolt 2 has a real speed of ~1.6 GB/second when using an array
When referring to IOPS, unless a configuration is being defined to support 100s of users on a very fast connection layer (think Fibre Channel or Infiniband), the level of IOPS created by even 10 or 15 editors accessing shared data is minuscule by comparison to American Express’ data center, where IOPS are the real limiting factor of their operations.
Also, it’s important that users pay close attention to the case of the “B” in MB/Mb and the like. Big “B” means Bytes while little “b” means bits. For safe, easy comparisons, if the number is in little “b” bits, divide by 10 (even though there are 8 bits per byte) for a more real-world Bytes-related number.
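The divide-by-10 shortcut can be sketched as follows. The function name is made up for this example; the divisor of 10 is the rule of thumb above, and 8 is the exact bits-per-byte conversion.

```python
def megabits_to_megabytes(megabits: float, divisor: float = 10) -> float:
    """Convert a little-b (bits) figure to big-B (Bytes).
    divisor=10 is the real-world rule of thumb; divisor=8 is the exact math."""
    return megabits / divisor

# Gigabit Ethernet is 1,000 megabits per second.
print(megabits_to_megabytes(1000))      # 100.0  (rule of thumb)
print(megabits_to_megabytes(1000, 8))   # 125.0  (exact, before overhead)
```

The article's figure of about 110 MB/second for gigabit Ethernet sits between those two results.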
As for hardware versus software-based RAID control, the difference is truly only noticed in very large scale implementations. For most arrays of 8 to 16 disks, either will do a good job for a normal user. The problems that users will run into are more often related to filesystem and interconnect type. Don’t plan on sharing a software based RAID array with more than 2 users. OTOH, a more expensive hardware RAID solution will provide for more robust management and later expansion.
None of these differences will hamper most editing operations, but it’s important that users know what to expect while their storage is in operation.
Tim:
Thanks for writing. I have seen SO many versions of what RAID stands for (Independent/Inexpensive and Drives/Disks/Devices) that I knew whatever I wrote would raise flags somewhere. I’m happy to use Independent in the future.
You are also VERY correct when you say there is a big difference between theoretical and practical speeds. I could not agree more! The big thing many people don’t understand is that regardless of how you connect a single drive, it will never fill the full bandwidth of USB 3 or any version of Thunderbolt. In general, on a new Mac Pro, I’m seeing faster speeds from Thunderbolt 1 than you are reporting.
I appreciate your comments on IOPS, this is something I’m still getting my head around.
However, I disagree with you on speeds from software vs hardware RAID controllers. My experience has been that even with smaller systems of 4 – 8 drives, hardware RAID controllers are more than twice as fast as software controllers.
By the way, I forgot to list other articles I’ve written about storage that go into more details. I’ll add an update with a link to that article.
Larry
Understood. The RAID Advisory Board adopted “Redundant Array of Independent Disks” as the official term back in 1994. My suspicion is they are the most correct source :).
“However, I disagree with you on speeds from software vs hardware RAID controllers. My experience has been that even with smaller systems of 4 – 8 drives, hardware RAID controllers are more than twice as fast as software controllers.”
We’ve now attacked this in our lab from the perspective of 4 different vendors’ SW-based RAID controllers and the only one that falls short is the Drobo and we relate that more to the 4800/5400RPM drives than their RAID algorithms. Promise is also getting dinged on performance of their RAID when in reality it’s their use of MUCH slower hard drives that causes the performance drop versus other solutions. What we have uncovered is that, assuming all OTHER elements are equal (7200RPM 64MB Cache drives, dedicated SAS I/O channel, more than 3 spindles), we don’t see the fall over in performance using a SW-based RAID algorithm until after 14 – 16 drives.
Tim:
1994, hunh? Hmmm… I should catch up on my reading.
Thanks for the update on software vs hardware RAID controllers. My experience was principally with Drobo, which tends to be desperately slow.
This is all good stuff – thanks!
Larry
Larry,
Good intro to a complex subject. I own RAIDs from WD, Promise and Drobo and produce videos. A couple of modest amendments:
– A gigabyte – GB – is officially 1,000 MB. SI, IEC and IEEE have all specified that mega, giga, tera etc. are base 10 metrics. If you want base 2 you need to specify it with prefixes like kibi. Macs started using the correct metric in Snow Leopard. The difference is important because as you go to larger capacities the percentage difference between base 10 and base 2 grows.
– SSDs use flash memory – like on a USB thumb drive – not DRAM, which is why SSD capacity costs so much less than DRAM. For media users, flash drives are excellent for large files if you can afford them. For servers a flash drive makes a fine boot drive, but because it is costly a RAID array makes more sense for large files.
One CRITICAL point: RAID arrays and SSDs can and do FAIL. They MUST be backed up just like any other single storage device or you could lose everything. Too many people don’t understand that and suffer enormous heartburn because of it.
Robin
“One CRITICAL point: RAID arrays and SSDs can and do FAIL. They MUST be backed up just like any other single storage device or you could lose everything. Too many people don’t understand that and suffer enormous heartburn because of it.”
Amen, brother Robin! Also, the false sense of security of a RAID array doesn’t cover the “oops” factor of “Oops, I didn’t mean to delete that file.” or “Oops, I didn’t mean to overwrite that file.” Only a solid backup can help there.
Hi Larry,
I have a Thunderbolt 2 RAID from Promise on order. I plan on using RAID 5. Will this preclude me from having a back-up drive of my files due to the redundancy on RAID 5? I currently have 2 smaller G-RAIDs, and I copy my files using GoodSync.
Thanks,
Frank
@Frank – RAID-5 only costs you the capacity of one disk and the redundancy is related to a parity/ECC stripe operation. This has no impact on the way that you utilize the resulting volume. What are you trying to accomplish with the new RAID unit?
Tim, I am buying for speed and extra capacity, but I want to protect my files from a drive failure. I back up my FCPX projects by Duplicating Projects by Snapshot. Am I looking at RAID 5 incorrectly? Thanks
Frank, you can back up – make that MUST back up – from your RAID to another storage device – hard drive, network storage, or another RAID – if the latter has enough capacity.
The RAID firmware or software handles the redundancy under the covers: all your PC sees is a storage device like any other. Thus the fact that you have a RAID is irrelevant – except for the larger capacity – to backing up.
Hope this helps. BTW, I’ve been very pleased with my Thunderbolt 1 Promise array.
Robin
Frank:
I agree with everyone. Backups, which make a copy of your files, are essential.
What a RAID-5 does is guard against losing data when a drive inside the RAID fails. However, there is only one copy of your files on the RAID, which is why a backup is critical.
Larry
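For readers curious how RAID 5 survives a single drive failure while costing only one disk of capacity, here is a simplified sketch of the parity idea using XOR on one stripe. Real controllers rotate parity across the drives and work on much larger blocks, so this is an illustration of the principle, not how any particular product is built. The block contents and drive sizes are made up.

```python
from functools import reduce

def raid5_usable(drives: int, drive_size: float) -> float:
    """Usable capacity of a RAID 5: one drive's worth goes to parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drives - 1) * drive_size

def xor_blocks(blocks: list) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

print(raid5_usable(4, 3))            # 9  (four hypothetical 3 TB drives)

data = [b"AAAA", b"BBBB", b"CCCC"]   # one block on each of three data drives
parity = xor_blocks(data)            # stored on a fourth drive

# The drive holding b"BBBB" fails: rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])            # True
```

Note what the sketch does not do: it keeps no second copy of the files. Delete or overwrite a file and the parity is updated to match, which is why the backup advice above still applies.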
Tim, Robin, Larry, thanks for your help. A two-part follow-up, if I may: Can I make my 2 smaller G-RAIDs (8GB and 6GB) into one virtual RAID drive as my back-up, or does it make sense to partition my soon-to-arrive 24gb Pegasus into two partitions for backup purposes? Thanks, Frank
I would use Disk Utility to Strip (RAID 0) the two G-RAID units into one 14GB array. You could then readily use that resulting volume as your backup destination.
Strip should have been stripe…
Thanks Tim
Not wanting to step on Larry’s toes here, but for more in-depth discussion of RAID and drive speeds, check out my blog on the Cow from last month:
http://blogs.creativecow.net/Tim-Jones/archive/2014/02
Tim:
Not a problem – good information is always welcome.
Larry
Hi, Mary here. I’m still a newbie at all this, but considering purchase of a RAID drive. Our needs have more to do with flexibility and packability, since we travel frequently between two states. Right now we have a fusion iMac and a Thunderbolt 2TB ( plug-in ) LaCie drive for our media files, which we back up periodically to a 3Tb hard drive (firewire 2) The latter 3T is, sadly, partitioned (half devoted to still images).
I’d like to devote the 3TB drive exclusively to photo media backup. Does it make sense to replace the 3T LaCie with a mirrored RAID on grounds that it would take the place of this HD and be constantly backing up? (Though I suppose one could argue we still need the back-up, perhaps to the old 2TB LaCie.) Or does the process of writing simultaneously to 2 drives really slow down FCP X?
If so, what would be the best configuration? And what’s this about controllers? What is ideal in my situation, where compactness and flexibility is needed. I have no idea what’s involved in setting up nor why required. Sorry!