You’ve been reading plenty of “What’s coming in 2015” articles. Pfft, what fun are they? Robin Harris, a.k.a. StorageMojo, peers into his crystal ball to predict what storage will be like in 2025. And, he says, the next 10 years will be the most exciting and explosive in the history of data storage.
Today, storage has assumed the central place in IT infrastructures, just as computer pioneer Alan Turing noted in 1947:
“Speed is necessary if the machine is to work fast enough for the machine to be commercially valuable, but a large storage capacity is necessary if it is to be capable of anything more than rather trivial operations. The storage capacity is therefore the more fundamental requirement.” [Italics added]
I’ve been observing the storage market for over 35 years. In that time, many technologies, products, companies and markets emerged, matured, and – sometimes – disappeared.
Yet some trends have persisted for decades. Ever-decreasing cost per byte. Growing capacity demand. Storage growing as a percentage of data center spend. Storage as the critical management problem.
Storage is the most difficult infrastructure problem because data needs to persist. The ever-present enemy of storage is entropy, the universal process of organized systems – such as your data – becoming less organized over time.
Despite the challenges, though, storage is set for an explosion of new and better choices thanks to a combination of technical, business, and application factors. The next 10 years will be the most exciting and explosive in the history of data storage.
For decades we had RAM, disk, and tape limiting our storage palette. But now: Get ready for Technicolor storage in 3D.
The shadow IT industry – the secretive cloud-scale IaaS suppliers – is a key piece. But so are resistance RAM (RRAM), low-latency architectures, new applications, and the commoditization of most storage. Here’s an overview of some of the most critical changes affecting the next decade in storage.
The shadow IT industry is a couple of dozen Web scale service providers and the dozens of companies that serve them. It will continue to grow in power and influence. This numerically small but economically powerful group is creating products, such as Shingled Magnetic Recording (SMR) disks or terabyte optical discs for cloud vendors, that often will find broader applications in the enterprise.
The larger challenge is that enterprises will no longer have access to the latest and most cost-effective storage technology. Cloud providers will offer services with which few enterprises can compete.
However, this creates opportunities for people inside the cloud companies to spin out new ventures that bring those specialized technologies to enterprises and consumers. It's a tricky balancing act, since scaling down is almost as difficult as scaling up.
This is already happening. For instance, the founders of Nutanix, a hyperconverged systems company, included Google alumni who worked on its Web scale GFS storage system.
NAND flash has revolutionized storage in the last decade. But flash is hardly the ideal storage medium.
Flash writes are slow – in the worst case, slower than disk. Endurance is limited; 1,000 write cycles is a common specification. And shrinking feature sizes make endurance and speed worse, not better. Power efficiency is poor too, since programming a cell takes about 20 volts.
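To see how that endurance limit bites, here is a back-of-envelope lifetime estimate. Every figure below – capacity, write amplification, daily write volume – is an illustrative assumption, not a datasheet value:

```python
# Back-of-envelope flash drive lifetime under a 1,000-cycle endurance spec.
# All figures are illustrative assumptions.
capacity_tb = 1.0            # drive capacity, terabytes
pe_cycles = 1_000            # program/erase endurance per cell (per the text)
write_amplification = 3.0    # internal extra writes from garbage collection
daily_writes_tb = 0.5        # host writes per day, terabytes

# Total host data the drive can absorb before cells wear out.
total_writes_tb = capacity_tb * pe_cycles / write_amplification
lifetime_days = total_writes_tb / daily_writes_tb
print(f"Rated host writes: {total_writes_tb:.0f} TB")
print(f"Estimated lifetime: {lifetime_days / 365:.1f} years")
```

With these assumptions, a 1 TB drive rated at 1,000 cycles survives under two years of moderate writing – which is why controllers spend so much engineering effort on wear leveling and write reduction.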
Engineering creativity has worked around most of these issues, but new Resistive RAM (RRAM) technologies are coming that eliminate flash’s problems.
There are several forms of RRAM, but they all store data by changing the resistance of a memory site, instead of placing electrons in a quantum trap, as flash does. RRAM promises better scaling, fast byte-addressable writes, much greater power efficiency, and thousands of times flash’s endurance.
RRAM’s properties should enable significant architectural leverage, even if it is more costly per bit than flash is today. For example, a fast and high endurance RRAM cache would simplify metadata management while reducing write latency.
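As an illustration of that architectural leverage, here is a minimal write-back cache sketch, with a dict standing in for each storage tier. The class, names, and flush threshold are hypothetical, not any product's design:

```python
# Sketch: a fast, high-endurance tier (e.g. RRAM) absorbing writes and
# batching them to a slower medium (e.g. flash). Purely illustrative.

class WriteBackCache:
    def __init__(self, backing, flush_threshold=4):
        self.backing = backing          # dict standing in for flash
        self.dirty = {}                 # fast tier: pending writes
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        # Acknowledge immediately from the fast tier.
        self.dirty[key] = value
        if len(self.dirty) >= self.flush_threshold:
            self.flush()

    def read(self, key):
        # Serve from the fast tier first, then the backing store.
        return self.dirty.get(key, self.backing.get(key))

    def flush(self):
        # One batched write to the slow tier instead of many small ones.
        self.backing.update(self.dirty)
        self.dirty.clear()

flash = {}
cache = WriteBackCache(flash)
for i in range(3):
    cache.write(f"k{i}", i)
assert flash == {}        # nothing flushed yet: fast tier absorbed the writes
cache.write("k3", 3)      # hits the threshold, triggers one batched flush
```

The fast tier acknowledges writes immediately and turns many small writes into fewer, larger ones to the slow tier – exactly the role a high-endurance RRAM cache could play in front of flash.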
Everyone loves the massive Input/Output Operations Per Second (IOPS) of flash storage. But those IOPS have uncovered another bottleneck: storage stack latency.
For decades, our working assumption has been that hard drives are latency’s limiting factor. We ignored the storage software stack’s contribution to latency.
Now the storage stack's own latency has come to the fore. The rest of the stack must be re-architected to exploit low-latency media rather than mask slow disks.
Look at TPC-C benchmarks. You’ll see that even flash-based storage systems may have long tail latencies into the dozens of seconds. That simply shouldn’t be. Application and operating system software stacks shouldn’t have to deal with latencies like this when the underlying storage is capable of much higher responsiveness.
Reduced latency enables servers to do much more work with the same CPU cycles and cache capacity. More important: Long latencies create problems that are difficult to reliably engineer around.
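A toy calculation shows why averages hide the problem. The latency figures are hypothetical, chosen only to illustrate the shape of a long-tailed distribution:

```python
# Sketch: mean vs. tail latency for a long-tailed I/O distribution.
# Hypothetical sample: 99% fast flash I/Os plus 1% stalled requests.
latencies_ms = [0.1] * 9900 + [100.0] * 100

mean = sum(latencies_ms) / len(latencies_ms)
p99 = sorted(latencies_ms)[int(0.99 * len(latencies_ms))]
print(f"mean = {mean:.2f} ms, p99 = {p99:.1f} ms")
```

The mean stays close to the fast case while the 99th percentile is a thousand times worse – and any application that fans a request out across many devices will hit that tail routinely.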
Storage systems and network routers are the two last bastions of vertical integration. In the 1980s, the server world converted from vertically-integrated companies (such as DEC and Data General) to horizontal integration, with CPUs from Intel, operating systems from Microsoft, databases from Oracle, and applications from everywhere.
As a result of this transition, the server industry saw gross margins drop from 60%-70% to 30%. Servers became a box that a business ordered online and received a few days later.
That transition is long overdue in data storage. In fact, it is already happening – look at what Google and Amazon charge for storage. And if enterprises hope to compete with cloud economics, they will demand similar total cost of ownership (TCO) from their own infrastructure.
It won’t be an easy transition, especially for legacy applications designed for specific underlying storage. But the sooner enterprises start using scale-out storage – as the big cloud providers already do – the better they will adapt.
Streaming data analysis is one example of a new application with unique storage requirements. Image analysis, video analysis, and other demanding storage-centric applications will unlock new forms of data value – but they bring infrastructure-busting performance and capacity demands.
Consider the relatively simple case of police body cam video. Capturing the footage is the easy part; to make it useful, a legal chain of custody has to be established, the data has to be searchable, and, of course, the storage has to be extremely cost-effective.
Think of the output of 50,000 high-definition (HD) police body cams. That's a tremendous amount of data, and it also needs to be readily searchable at high speed. High-performance, exabyte-capacity, low-cost, tamper-proof storage – that is a 21st century challenge!
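To put a number on "tremendous," here is a back-of-envelope estimate of one year's output from 50,000 HD body cams. The bitrate, shift length, and working days are all illustrative assumptions:

```python
# Back-of-envelope: yearly raw storage for 50,000 HD body cams.
# All figures are illustrative assumptions, not measured data.
cams = 50_000
bitrate_mbps = 5.0        # rough HD video bitrate, megabits per second
hours_per_shift = 8
shifts_per_year = 250

seconds = hours_per_shift * 3600 * shifts_per_year
bytes_per_cam = bitrate_mbps * 1e6 / 8 * seconds   # bits -> bytes
total_pb = cams * bytes_per_cam / 1e15             # bytes -> petabytes
print(f"~{total_pb:.0f} PB per year")
```

Multiply that by several years of retention, plus duplicate copies for chain of custody, and exabyte-scale totals are not far off.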
Electronic health records (EHRs) are another opportunity – and concern. Once the current range of EHR incompatibilities is ironed out (I'm thinking 5-10 years), researchers and healthcare professionals will be able to examine outcomes for millions of people in virtual drug trials.
We are just barely scratching the surface of what Big Data means for infrastructure architecture. There will be many surprises.
None of these trends is isolated. They all have spillover effects that, over the decade, will reshape the industry.
Few of today’s storage companies will emerge unscathed by the large drop in gross margins. Enterprises will have to rethink how they architect, justify, and provision infrastructure. The shadow IT industry will struggle with commercializing specialized cloud products.
But the good news is that by 2025 data storage will be much more robust, scalable, performant, and cost-effective than it is today. That’s a future I can live with.