This article is part of the Technology Insight series, made possible with funding from Intel.
In our previous discussion of these competing protocols for solid state drives (SSDs), there was a big unanswered question: The NVMe specification first arrived eight years ago. Where’s all the adoption?
Sometimes it takes an industry insider to shed new light on tough subjects. We turned to Eric Pike, formerly of Fusion-io and now director of Western Digital’s enterprise devices group. The company looks to bolster its hard disk lineup (which increasingly focuses on high-capacity data center needs) with NVMe SSDs. Pike sees the technology as a perfect fit for the storage behemoth’s next generation. Our forty-minute discussion covered a lot of ground, but we’ve condensed it to focus on core issues for business.
VB: Where are we at now with NVMe adoption in the SMB and enterprise markets?
EP: It depends on whose numbers you look at, but for cloud providers, you have somewhere around 80% to 85% on a unit basis as well as a petabyte storage basis. But if you flip it around to SMBs and mainstream enterprises, you’re between 12% and 15%. And that includes direct-attached devices and appliances like EMC and NetApp.
VB: That seems…really low.
EP: If you’ve ever read the book Crossing the Chasm, that actually represents the front-end distribution you see before you cross the chasm. But why haven’t we crossed it? Fusion-io kind of jump-started putting storage on the PCIe bus, but we had a proprietary protocol, and everybody was waiting for a standard. Once the standard arrived, people thought, “This is the death of SATA and SAS. All of this performance with PCIe, now coupled with a standard interface — that’s it.” Fast forward, though, and here we are at 12% to 15% overall.
Source: Inside Track Research Note, NVMe – The State of Play, Freeform Dynamics Ltd, 2019 (via WD blog)
VB: Who are the early adopters in that little wedge?
EP: The ones who can take advantage of it, right? They’re running high-frequency transactional systems; large-scale, high-performance databases; highly virtualized environments where the throughput of an NVMe device is very valuable. Even on Western Digital’s site and around the web, you’ll see copious examples of where NVMe applications make sense. That said, 85% of the application environment is still satisfied with “good enough.”
VB: Why isn’t NVMe’s performance good enough to get us over the chasm?
EP: The very first comment [the author] makes in that book is that you have to take consideration of the whole product, which means it’s more than just a performance statement. And it’s not always about price, because there are pockets of the market where the price delta could be argued to be negligible. But more broadly, there’s a premium at the system level. Before you even put an NVMe device inside of a box, most of the ODM/OEMs — the guys building the boxes — have to add costs. They add higher-end solutions around NVMe. We’re trying to understand why that is. What are the drivers behind that? It’s probably too early to share anything, but we have some theories we’re looking through.
VB: So, system builders pack higher-end components around higher-end storage to get higher-end prices? There must be more to it than that.
EP: Oh, sure. When a customer buys a server, high availability, HA, is important. It’s one of the things that differentiates enterprise customers from client or even workstation customers. So, they tend to put drives in boxes using RAID configurations. However, the availability of hardware RAID for large-scale installations is still in its early stages on NVMe. You can get RAID controllers that support 12, 20, 24 SATA devices, but most state-of-the-art NVMe RAID controllers support four devices. If a customer wants to put in a RAID with six or eight NVMe drives, they either have to create two RAID sets with two RAID controllers or use software RAID. Now, we know from the early days of NVMe and PCIe technology, you lower latency by direct-connecting to a CPU for software RAID, but those early instantiations were notoriously (I’ll use somebody else’s term) performance-challenged. You end up using a lot of CPU cycles. It’s getting better, though, and we are seeing improvements in that area.
VB: But not better enough?
EP: We still have remnants of barriers. The issue hasn’t yet been addressed to the point where a customer could have the equivalent of a SATA installation experience with an NVMe installation.
VB: So, what are the best roles today for NVMe with small/medium business and enterprise customers?
EP: Think of high-end traders and high-end analytics work. End users in those scenarios love NVMe. In terms of business value, the difference of a couple of hundred dollars in price just starts to sound like unnecessary noise.
VB: And tomorrow?
EP: The short answer is eventually everyone, because NVMe has value with just about any workload in high-performance environments. And even without the raw performance benefits, you still have TCO benefits over the long run.
VB: Such as?
EP: You’ve probably seen reports about data center resource utilization numbers that are in the teens, right? Part of that is because they design their environment for peak workloads. There’s a lot of overprovisioning, and that also applies to storage performance. To hit a certain performance level, you need X sort of storage. But with hyperconverged infrastructure, you have a lot more control over scaling individual resources. You don’t have to overprovision everything all together. With hyperconvergence, NVMe is going to give you a lot more scaling efficiency.
VB: What specifically do you mean by scaling efficiency?
EP: It’s tied to access density, meaning how quickly I can access all the data on a drive. The higher the access density, the more efficient my use of that storage is. Think about the RAID rebuild time on a 16TB SATA drive. I don’t even know how long it is. Hours? Even days? NVMe over PCIe gen 4 is a fraction of the time, and gen 5 is going to halve that again in the near future. NVMe lets you scale storage capacity with scalable access density so you can access these large amounts of data.
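To put rough numbers on the rebuild question, here is a minimal back-of-the-envelope sketch. It assumes the roughly 450 MB/s SATA throughput and the 6.5X NVMe multiplier that Pike cites later in this interview, an NVMe drive of the same 16TB capacity, and a rebuild that streams the whole drive at full sequential speed. The function name and constants are illustrative only; real rebuilds compete with production I/O and parity work, so treat the results as a floor rather than a prediction.

```python
def full_drive_read_hours(capacity_tb: float, throughput_mb_s: float) -> float:
    """Best-case hours to stream an entire drive at a sustained sequential rate."""
    capacity_mb = capacity_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    seconds = capacity_mb / throughput_mb_s
    return seconds / 3600


CAPACITY_TB = 16             # drive size used in the interview's example
SATA_MB_S = 450              # SATA throughput figure quoted in the interview
NVME_MB_S = SATA_MB_S * 6.5  # assumption: the interview's 6.5X performance multiplier

sata_hours = full_drive_read_hours(CAPACITY_TB, SATA_MB_S)
nvme_hours = full_drive_read_hours(CAPACITY_TB, NVME_MB_S)

print(f"SATA best case: {sata_hours:.1f} hours")
print(f"NVMe best case: {nvme_hours:.1f} hours")
```

Under these assumptions the SATA drive needs close to ten hours just to be read end to end, while the NVMe drive needs about an hour and a half. Because capacity is held constant, the gap between the two is simply the throughput ratio.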
VB: For businesses, is there more to the TCO discussion than scaling efficiency?
EP: At a large scale, power is definitely a TCO factor. Recall throughput. A SATA device does around 450 MB/s, right? An NVMe device capable of saturating the bus is going to give you six to seven times the performance. Now, that SATA device will run on 7W or 8W, while NVMe is about 25W. So, we’re talking about, say, 3.5X on the power delta but 6.5X on the performance delta. NVMe is more power efficient. There are also value NVMe devices down in the 11W to 14W range. You still get about four times the performance of a SATA device for less than two times the power. So again, if you’re talking about one drive in one system, you may not notice this, but if you have a fairly large-scale installation and TCO matters, these things start adding up.
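The power comparison above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the answer; where Pike gives a range (7W to 8W, 11W to 14W), the midpoint is an assumption made here for illustration, and the names are hypothetical.

```python
def mb_per_second_per_watt(throughput_mb_s: float, watts: float) -> float:
    """Throughput delivered for each watt of drive power."""
    return throughput_mb_s / watts


SATA_MB_S = 450  # SATA throughput quoted in the interview

# (throughput in MB/s, power in W); midpoints of quoted ranges are assumptions
drives = {
    "SATA": (SATA_MB_S, 7.5),              # 7W to 8W
    "NVMe (high end)": (SATA_MB_S * 6.5, 25.0),  # 6.5X performance at about 25W
    "NVMe (value)": (SATA_MB_S * 4, 12.5),       # about 4X performance, 11W to 14W
}

efficiency = {
    name: mb_per_second_per_watt(mb_s, watts)
    for name, (mb_s, watts) in drives.items()
}

for name, value in efficiency.items():
    relative = value / efficiency["SATA"]
    print(f"{name}: {value:.0f} MB/s per watt ({relative:.2f}x SATA)")
```

With these inputs, SATA comes out at about 60 MB/s per watt, the high-end NVMe device at about 117, and the value NVMe device at about 144. That is roughly twice SATA's efficiency or better in both cases, which is the basis for the claim that the difference adds up across a large installation.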