A Storage Story: NVMe over Fabrics and Fibre Channel

The next quantum leap in storage technology is driving the need for faster storage fabrics to carry the resulting data surge. Such an acceleration in storage demand and solutions requires storage transports to deliver better overall performance as well as the ability to guarantee Service Level Agreements (SLAs). Of course, the storage solution I’m speaking about is Non-Volatile Memory Express (NVMe) and its latest expansion into NVMe over Fabrics.

First, some housekeeping. References to NVMe-oF and FC-NVMe are prevalent and cause confusion; the terms refer to two distinctly different things. NVMe-oF describes a set of transport protocols that use remote direct memory access (RDMA) to move data to and from NVMe storage systems. It also refers to the NVMe.org specification that defines the message passing and control parameters between the NVMe protocol and those transports. FC-NVMe describes the transport of NVMe traffic over the Fibre Channel (FC) transport protocol; it also refers to the Fibre Channel standard that defines how the Fibre Channel Protocol (FCP) binds to NVMe.

The T11 committee created the FC-NVMe standard, published in August 2017. This document defines the Fibre Channel transport bindings required by the NVMe storage protocol. Current FC-NVMe-2 development within T11 focuses on areas such as sequence-level error recovery, queue arbitration, and command cleanup.

The most prevalent storage protocol of today and yesterday is the Small Computer System Interface (SCSI). SCSI started in a similar fashion to NVMe, as a local, direct-attached protocol inside the server. It’s efficient, but compared to NVMe it is limited in queue depth and burdened by a deeper protocol stack. The newer NVMe protocol solves these issues by providing up to 64K queues, each up to 64K commands deep, along with a leaner protocol stack that requires less copying of data during transactions.
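To make that queuing model concrete, here is a minimal sketch in C of a simplified submission queue entry and queue pair. The structures and names are illustrative only; the NVMe specification defines 64-byte entries with many more fields. What matters is the pattern: each core can own its own queue pair and post work by ringing a doorbell, with no shared lock in the I/O path.

    #include <stdint.h>

    /* Illustrative sketch only -- a simplified view of an NVMe
     * submission queue entry; the real 64-byte layout is defined
     * in the NVMe specification. */
    struct nvme_sqe {
        uint8_t  opcode;       /* command to execute, e.g. read or write */
        uint8_t  flags;
        uint16_t cid;          /* command ID, echoed back in the completion */
        uint32_t nsid;         /* namespace the command targets */
        uint64_t prp1, prp2;   /* pointers to the host data buffers */
        uint64_t slba;         /* starting logical block address */
        uint16_t nlb;          /* number of logical blocks (zero-based) */
    };

    /* Each of up to 64K I/O queue pairs has its own submission ring,
     * so cores can issue I/O independently of one another. */
    struct nvme_queue_pair {
        struct nvme_sqe   *sq;          /* submission ring in host memory */
        volatile uint32_t *sq_doorbell; /* device register: new tail value */
        uint16_t           sq_tail;
        uint16_t           depth;
    };

    /* Post a command: copy the entry into the ring, then ring the
     * doorbell so the controller knows there is work to fetch. */
    static void nvme_submit(struct nvme_queue_pair *qp,
                            const struct nvme_sqe *cmd)
    {
        qp->sq[qp->sq_tail] = *cmd;
        qp->sq_tail = (uint16_t)((qp->sq_tail + 1) % qp->depth);
        *qp->sq_doorbell = qp->sq_tail;
    }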

NVMe can’t be discussed without mentioning solid-state drives (SSDs). One of the larger legacy latency issues in storage is the mechanical nature of disk and tape drives. SSD technologies have drastically reduced that storage latency. That reduction, coupled with the lower latency of the NVMe protocol, creates today’s best latency-optimized storage solution. But how do we share this ultra-fast storage with the world?
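To put rough numbers on the mechanical penalty: on average, a spinning platter must rotate half a revolution before the requested sector passes under the head. A back-of-the-envelope sketch in C (the 7,200 RPM figure is just a common example):

    #include <stdio.h>

    int main(void)
    {
        double rpm        = 7200.0;               /* common disk speed */
        double ms_per_rev = 60.0 * 1000.0 / rpm;  /* one full revolution */
        double avg_rot_ms = ms_per_rev / 2.0;     /* average rotational wait */
        printf("avg rotational latency: %.2f ms\n", avg_rot_ms); /* ~4.17 ms */
        /* A NAND flash read completes in tens of microseconds -- roughly
         * two orders of magnitude faster, before seek time is even counted. */
        return 0;
    }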

Let’s Network It

The old networking transport wars were won by the Ethernet forces, but they weren’t able to stamp out all other transport factions such as Fibre Channel. Fibre Channel survived because of its focus on storage and reliability. It delivers the necessary performance through a simpler protocol stack with zero-copy support, and it offers superior SLA guarantees thanks to its buffer-to-buffer credit flow control.
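Buffer-to-buffer credits are what make Fibre Channel lossless: a frame leaves a port only when the far side is known to have a free receive buffer. The sketch below is purely illustrative (real ports implement this in silicon, and the names here are hypothetical), but it captures the mechanism:

    #include <stdbool.h>
    #include <stdint.h>

    /* Each port starts with a credit count equal to the receive
     * buffers its link partner advertised at login. */
    struct fc_port {
        uint32_t bb_credit;  /* frames we may still send without overrun */
    };

    /* A frame may be transmitted only if a receive buffer is known
     * to be free on the far side -- no credit, no send, no drop. */
    static bool fc_try_send_frame(struct fc_port *p)
    {
        if (p->bb_credit == 0)
            return false;    /* hold the frame until credit returns */
        p->bb_credit--;      /* consume one credit per frame sent */
        /* ... transmit the frame on the wire ... */
        return true;
    }

    /* The receiver returns an R_RDY primitive each time it frees
     * a buffer, replenishing one credit. */
    static void fc_on_r_rdy(struct fc_port *p)
    {
        p->bb_credit++;
    }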

The speed aspects of Fibre Channel continue to play hopscotch with those of Ethernet, each leapfrogging the other to claim the title of fastest transport available. Fibre Channel is also well suited as an NVMe fabric transport because it allows SCSI and NVMe traffic to run on the same fabric simultaneously. Competing forces are taking different approaches to solve this new problem. NVMe-oF also encompasses RDMA transport technologies such as RoCEv2, iWARP, and InfiniBand, each with its own advantages and disadvantages. There’s also ongoing work on NVMe-TCP, which carries NVMe traffic over ordinary TCP/IP networks without requiring RDMA-capable hardware.
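As a taste of the TCP flavor: every NVMe-TCP protocol data unit begins with a small common header. The sketch below is my reading of that 8-byte header, with field names abbreviated and byte order and wire encoding glossed over; the NVMe/TCP specification is the authoritative source.

    #include <stdint.h>

    /* Rough sketch of the common header that prefixes every
     * NVMe-TCP protocol data unit (PDU). */
    struct nvme_tcp_common_hdr {
        uint8_t  pdu_type;  /* e.g. command capsule, data, ready-to-transfer */
        uint8_t  flags;
        uint8_t  hlen;      /* length of the PDU header */
        uint8_t  pdo;       /* offset of the data within the PDU */
        uint32_t plen;      /* total PDU length, header plus data */
    };

Because the framing rides on plain TCP, NVMe-TCP can reach any host with a standard Ethernet NIC, trading some of RDMA’s latency advantage for ubiquity.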