[ Updated July 9, 2024, with more accurate technical information on chunk size – see the end of this article. ]
Recently, David Young asked an interesting question:
“While setting up a RAID using three NVMe OWC Aura P12 SSDs, Disk Utility asks for the ‘chunk’ size. Answers on the Internet range from small (16K) to large (256K). Which should I pick?”
My recommendation was to use 64K for most video work, but I told him I hadn’t run any tests to verify this. So, David decided to run some performance tests and send me the results. Here’s what we both learned.
WHAT IS CHUNK SIZE?
A RAID records data across multiple drives – either HDD or SSD – to increase storage capacity or improve speed.
Just as a pixel is the smallest discrete component of an image, a “chunk” is the smallest “block” in which to store data for each drive in the RAID.
For example, if you have a 10 KB text file and the chunk size is 256 KB, then that 10 KB of data is stored in a 256 KB block, with the rest of the block left empty. Conversely, with 16 KB chunks, there is much less wasted space when storing that 10 KB file.
UPDATE: This explanation is essentially correct for file systems, but not for RAIDs. Please see the technical note at the end for a more accurate answer.
The problem is that some applications, like databases, prefer large chunks, while large files, like media, prefer smaller chunks. Smaller chunks save space; larger chunks are more efficient with i/ops – to a point. Both very small and very large chunks can be inefficient.
Yup, it’s confusing.
LARRY’S EXECUTIVE SUMMARY
While the chunk size makes a difference in system performance, the answer is more nuanced than “one size fits all.”
If you principally work with databases, large chunks are better because they allow faster i/o operations. If you create smaller office-type files, smaller chunk sizes are the better choice because you save space.
If you principally create larger files, somewhere in the middle is better. For most video work, a good option is to select 64K, or one step smaller.
NOTE: Chunk size is specified when the RAID is created. Changing the chunk size will delete all existing data on a RAID, so this should only be done when the RAID is new or empty.
DAVID’S GEAR
David writes: I typically edit music videos, using both 1080 and 4K frame sizes. I ran these tests using a 2020 iMac, with a 3.9 GHz 8-Core Intel i7 with 64 GB of RAM.
I was formatting an OWC Express 4M2 using three Aura P12 Pro M.2 NVMe SSDs. (Larry adds: These are typical high-speed SSDs which either run stand-alone or grouped into a RAID.)
DAVID’S PROCESS
The RAID was formatted using Apple Disk Utility and performance speed was measured using AJA System Test Lite (version 16.2.3). Each speed test was run once.
IMPORTANT NOTE: There is significant variation from one run of AJA System Test to the next. I strongly recommend running each speed test 3 to 5 times, then averaging the results. The changes David reports could easily be due to this variability.
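To illustrate the averaging recommended above, here is a minimal Python sketch. The run speeds are made-up placeholder numbers, not David's measurements, and the function names are hypothetical. The rule of thumb at the end (treat a gap no larger than the run-to-run spread as noise) is an assumption for illustration, not a formal statistical test.

```python
from statistics import mean, stdev


def summarize_runs(speeds_mb_s):
    """Return (average, spread) for repeated runs of one speed test.

    The spread is the sample standard deviation, a rough measure of
    how much the result varies from run to run.
    """
    if len(speeds_mb_s) < 2:
        raise ValueError("need at least two runs to estimate variability")
    return mean(speeds_mb_s), stdev(speeds_mb_s)


def percent_difference(a, b):
    """Percent by which a exceeds b (negative if a is smaller)."""
    return (a - b) / b * 100.0


# Placeholder numbers for illustration only (NOT measured results).
runs_16k = [2000.0, 2100.0, 1900.0]
runs_64k = [2050.0, 2150.0, 2100.0]

avg_16k, spread_16k = summarize_runs(runs_16k)
avg_64k, spread_64k = summarize_runs(runs_64k)

gap = abs(avg_64k - avg_16k)
noise = max(spread_16k, spread_64k)

print(f"16K average: {avg_16k:.0f} MB/s (spread {spread_16k:.0f})")
print(f"64K average: {avg_64k:.0f} MB/s (spread {spread_64k:.0f})")
print(f"Difference: {percent_difference(avg_64k, avg_16k):.1f}%")
print("Likely noise" if gap <= noise else "Likely a real difference")
```

With only one run per setting, there is no spread to compare against, which is why a single-run difference of a few percent is hard to interpret.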
I used the ProRes codecs.
(Larry adds: Selecting a specific codec slows down performance tests for both AJA and Blackmagic Design software. Why? Because all NLEs convert video from whatever codec you are using into uncompressed 16-bit RGB files for editing. Part of what these speed test utilities do is measure the time it takes to convert between the codec and uncompressed RGB, because that’s what the NLE needs to do as well. If you want a more accurate measurement of storage performance, use 16-bit RGB as the codec type.)
LARRY INTERPRETS PERFORMANCE
This chart illustrates that smaller frame sizes benefit from smaller chunks. However, there’s only a 4% difference in write speed between 16 KB and 64 KB chunks, and exactly a 1% difference in read speed.
In other words, smaller is better for this frame size, but not by a lot.
Here, the larger files of 4K video benefit from a slightly larger chunk size. While 64 KB now wins, there is only a 6.2% difference in write speeds and a 1.4% difference in reads when compared to 16 KB.
Here, larger is marginally better, but, again, not by a lot.
NOTE: In none of David’s tests did a 256 KB chunk size deliver the best performance for video files.
A CLOSER LOOK AT CODECS
This chart illustrates the time it takes to decode a codec – in this case ProRes – compared to fully uncompressed video according to AJA System Test Lite. Keep in mind that every codec requires time to decode, not just ProRes.
16-bit RGB is 25.6% faster writing and 17.1% faster reading 1080p files.
NOTE: However, ProRes files are multiple TIMES smaller than 16-bit uncompressed video. For virtually all editing, this performance “hit” is not significant.
When we move to UHD frame sizes, larger files transfer faster. 16-bit RGB has write speeds 20.3% faster, while read speeds are the same as 1080p at 17.1% faster.
SUMMARY
RAID chunk size determines the size of each block of data that is written to the drives in an SSD or HDD RAID. As a general rule, when creating a RAID for media work, select a 64 KB chunk size.
TECHNICAL UPDATE
Recently, a reader challenged me about the accuracy of my definition of chunk size. So, I contacted Tim Standing, VP of Software Engineering at OWC. Tim has decades of experience writing storage drivers for single hard disks, SSDs and RAIDs. I asked him if my definition was correct.
Your description of chunk size is correct for file systems. For APFS, the chunk size is always 4 KB, so a 1-byte file takes 4 KB and a 4 KB + 1 byte file takes 8 KB. For HFS+, it is a bit more complicated, as the chunk size starts out at 4 KB and then grows by multiples of 2 as the volume gets bigger. The chunk size changes to 8 KB blocks for HFS+ volumes larger than 17.5 TB, and changes again to 16 KB blocks for volumes above 35 TB, etc.
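The file-system arithmetic above can be sketched in a few lines of Python. The function names are hypothetical. The second function assumes the HFS+ thresholds follow from a 32-bit count of blocks per volume; the article does not state this reason, so treat it as an assumption that happens to land close to the 17.5 TB and 35 TB figures quoted.

```python
KB = 1024  # file systems count in binary kilobytes


def allocated_bytes(file_size, block_size):
    """Disk space used when a file is stored in whole blocks.

    Simplification: ignores file-system metadata and treats an
    empty file as occupying 0 bytes.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size > 0")
    blocks = -(-file_size // block_size)  # integer ceiling division
    return blocks * block_size


def max_volume_bytes(block_size, max_blocks=2**32):
    """Largest volume addressable with max_blocks blocks.

    ASSUMPTION: a 32-bit block count, which the article does not state.
    """
    return block_size * max_blocks


# The 4 KB examples from the text: 1 byte, exactly 4 KB, 4 KB + 1 byte.
for size in (1, 4 * KB, 4 * KB + 1):
    used_kb = allocated_bytes(size, 4 * KB) // KB
    print(f"{size} bytes -> {used_kb} KB on disk")

# Volume size at which each block size runs out of block numbers.
for block_kb in (4, 8):
    limit_tb = max_volume_bytes(block_kb * KB) / 1e12
    print(f"{block_kb} KB blocks cover volumes up to about {limit_tb:.1f} TB")
```

The same function reproduces the earlier 10 KB example: `allocated_bytes(10 * KB, 256 * KB)` uses a full 256 KB block, while `allocated_bytes(10 * KB, 16 * KB)` uses 16 KB.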
With RAID volumes, the chunk size is called a “stripe unit size.” Another term, the “stripe size,” is equal to the stripe unit size times the number of disks which hold data in the RAID volume. For RAID 0 volumes, the number of data disks is the number of disks used for the RAID volume. For RAID 4 and 5, it is one less than the number of disks used and for RAID 6 volumes, it is 2 less than the number of disks. So a RAID 0 volume with 4 disks and a stripe unit size of 16 KB has a 64 KB stripe size whereas the same disks and stripe unit size has a stripe size of 48 KB for RAID 5.
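As a small worked example of this stripe-size arithmetic, here is a Python sketch. The function and table names are hypothetical; the number of non-data disks per RAID level comes directly from the description above.

```python
# Disks that do not hold data, per RAID level, as described in the text:
# none for RAID 0, one for RAID 4 and 5, two for RAID 6.
NON_DATA_DISKS = {0: 0, 4: 1, 5: 1, 6: 2}


def stripe_size_kb(raid_level, total_disks, stripe_unit_kb):
    """Stripe size = stripe unit size x number of data disks."""
    if raid_level not in NON_DATA_DISKS:
        raise ValueError(f"RAID level {raid_level} is not covered here")
    data_disks = total_disks - NON_DATA_DISKS[raid_level]
    if data_disks < 1:
        raise ValueError("not enough disks for this RAID level")
    return data_disks * stripe_unit_kb


# The two cases from the text: 4 disks, 16 KB stripe unit.
print(stripe_size_kb(0, 4, 16))  # RAID 0 -> 64
print(stripe_size_kb(5, 4, 16))  # RAID 5 -> 48
```

If a three-drive set like David's is configured as RAID 0 (an assumption, since the RAID level is not stated), a 64 KB chunk gives `stripe_size_kb(0, 3, 64)`, or a 192 KB stripe.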
RAID volumes operate at the speed of a single disk when accessing files which are approximately the same size as the stripe unit size. The speed increases as the file size approaches the stripe size because you are accessing more than one disk and all the disks can operate in parallel, giving you greater total speed. Once the file size exceeds the stripe size, the performance will plateau and reach a steady level, for any file size larger than the stripe size.
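The plateau described here can be sketched with a deliberately idealized model. The function name is hypothetical, and the model assumes the file starts on a stripe unit boundary, every disk is equally fast, and there is no overhead, so it shows only the shape of the curve, not real speeds.

```python
def disks_touched(file_kb, stripe_unit_kb, data_disks):
    """Number of data disks that hold part of a file.

    ASSUMPTIONS: the file starts on a stripe unit boundary and is laid
    out across the disks in order, one stripe unit per disk.
    """
    if file_kb <= 0 or stripe_unit_kb <= 0 or data_disks < 1:
        raise ValueError("all arguments must be positive")
    units = -(-file_kb // stripe_unit_kb)  # stripe units needed, rounded up
    return min(units, data_disks)          # cannot exceed the disk count


# RAID 0 with 4 disks and a 16 KB stripe unit (64 KB stripe size).
for file_kb in (16, 32, 48, 64, 128, 1024):
    n = disks_touched(file_kb, 16, 4)
    print(f"{file_kb:>5} KB file -> {n} disk(s) working in parallel")
```

In this model, the count rises from one disk at the stripe unit size to all four at the stripe size, then stays flat for anything larger.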
The above is true for all cases except writing to RAID 4 and 5 volumes. In this case, there is a penalty when writing the start and end of a file if they don’t perfectly line up with a stripe boundary. With files much larger than the stripe size, this penalty is much less as it only applies to the start and end, which now make up a much smaller percentage of the entire file.
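To see why the penalty shrinks for large files, this Python sketch counts how much of a write lands in stripes that are only partly covered. The function names, the starting offset, and the file sizes are hypothetical values chosen for illustration; the 48 KB stripe size is the RAID 5 example from above.

```python
def partial_stripe_kb(offset_kb, length_kb, stripe_kb):
    """KB of a write that fall in stripes the write only partly covers."""
    if offset_kb < 0 or length_kb <= 0 or stripe_kb <= 0:
        raise ValueError("offset must be >= 0; length and stripe must be > 0")
    end_kb = offset_kb + length_kb
    # First stripe boundary at or after the start of the write.
    first_full = -(-offset_kb // stripe_kb) * stripe_kb
    # Last stripe boundary at or before the end of the write.
    last_full = (end_kb // stripe_kb) * stripe_kb
    full_kb = max(0, last_full - first_full)  # completely covered stripes
    return length_kb - full_kb


def partial_fraction(offset_kb, length_kb, stripe_kb):
    """Share of the write (0.0 to 1.0) that is subject to the penalty."""
    return partial_stripe_kb(offset_kb, length_kb, stripe_kb) / length_kb


# Hypothetical writes starting 10 KB into a stripe, 48 KB stripe size.
for length_kb in (48, 1024, 102400):
    share = partial_fraction(10, length_kb, 48) * 100
    print(f"{length_kb:>6} KB write: {share:.2f}% in partial stripes")
```

The amount of data at the misaligned start and end stays roughly constant, so its share of the total falls as the file grows.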
THANKS!
I’m indebted to David Young for taking time to run these tests and share his data. Here’s a PDF of his test results. I’m also grateful to Tim Standing for explaining this in more accurate detail.