What is a computational storage drive? Much-needed help for CPUs
The inevitable slowing of Moore’s Law has pushed the computing industry through a paradigm shift from traditional CPU-only homogeneous computing to heterogeneous computing. With this change, CPUs are complemented by special-purpose, domain-specific computing fabrics. The shift shows in the tremendous growth of hybrid CPU/GPU computing, significant investment in AI/ML processors, wide deployment of SmartNICs, and, more recently, the emergence of computational storage drives.
Not surprisingly, as a new entrant into the computing landscape, the computational storage drive is unfamiliar to most people, and many questions naturally arise. What is a computational storage drive? Where should a computational storage drive be used? What kind of computational function or capability should a computational storage drive provide?
Resurgence of a simple and decades-old idea
The essence of computational storage is to empower data storage devices with additional data processing or computing capabilities. Loosely speaking, any data storage device that can carry out data processing tasks beyond its core data storage duty can be called a computational storage drive, regardless of the storage technology it is built on, such as flash memory or magnetic recording.
The simple idea of empowering data storage devices with additional computing capability is certainly not new. It can be traced back more than 20 years to the intelligent RAM (IRAM) and intelligent disks (IDISKs) papers from Professor David Patterson’s group at UC Berkeley around 1997. Fundamentally, computational storage complements host CPUs to form a heterogeneous computing platform.
Early academic research showed that such a heterogeneous computing platform can significantly improve performance or energy efficiency for a variety of applications, such as databases, graph processing, and scientific computing. However, the industry chose not to adopt the idea for real-world applications, simply because storage professionals at the time could not justify investing in such a disruptive concept while CPUs kept advancing steadily. As a result, the topic lay largely dormant for two decades.
Fortunately, the idea has recently seen a significant resurgence of interest from both academia and industry. It is driven by two major industry trends:
- There is a growing consensus that heterogeneous computing must play an increasingly important role as CMOS technology scaling slows down.
- The significant progress of high-speed, solid-state data storage technologies pushes the system bottleneck from data storage to computing.
The concept of computational storage naturally matches both trends, which explains the renewed interest in the topic over the past few years, not only from academia but also, and arguably more importantly, from industry. This momentum was highlighted when the NVMe standard committee recently commissioned a working group to extend NVMe to support computational storage drives, and SNIA (Storage Networking Industry Association) formed a working group to define the programming model for computational storage drives.