Flash Is Dead: The Next Storage Revolution Is About CPUs and RAM

By August 18, 2014Storage

Alright, flash is not dead, it’s thriving. People love SSDs in their phones and laptops, because it’s so much faster than traditional hard drives. They are faster because they have lower latency, which is to say that they allow the computer to “wait less.”

SSDs operate in the millisecond and tenth of a millisecond time span, whereas typical mechanical hard drives range in the 6-10 millisecond range. That’s about 10x lower latency, which equates to twice as fast in the real world. You often do not find technologies that are ten times faster than the ones before them. But imagine something one hundred or a thousand times faster than even SSDs.

No problem for super computers costing millions of dollars. Just use CPUs and RAM, because they operate in the nanosecond realm, which is 1,000 times faster than a microsecond. We’re talking about going from 10x improvement in performance, to 100x or 1000x. Imagine an entire datacenter running in RAM with no disks. Stanford University has made the case, and they are calling it RamCloud.

“But imagine something one hundred or a thousand times faster than even SSDs.”

Cost aside, let’s try to put these types of potential speed increases into perspective and solve for cost later.

A Scenario To Consider

Let’s assume a typical 3.0GHz processor today in 2014 can perform some basic calculations and transfer data inside the chip itself in 10 nanoseconds, or 30 clock cycles. Perhaps the human equivalent in speed would be someone asking you to solve a simple math problem—what’s 2+4+3+4 in your head. You quickly add up that it’s 13 and it takes you 2 seconds from start to finish.

Now suppose that the CPU has to go back to DRAM for this data, because it doesn’t have the information handy to respond immediately. Going back to DRAM can take an additional 9-13 nanoseconds, even with today’s faster DDR3-1866 RAM. DRAM still runs at 200-300Mhz as a base clock speed, even if the bus speed itself is a lot higher. So going back to DRAM can double the time it takes for a CPU to execute on a task. It would take a human 4 seconds instead of 2. But twice as slow is nothing compared to how much slower storage is compared to RAM.

Continuing along the same lines as the math problem analogy, suppose that the math problem required all data to come from fast SSD-backed storage running at 500 microseconds (half a millisecond, or 20x quicker than a mechanic hard drive). Even with this fast storage, the simple math problem would require the computer’s CPU to spend 50,000 nanoseconds to complete the answer, when in fact it could do it roughly five thousand times faster than before. In human terms, the four second calculation of adding 2+4+3+4 would take you nearly three hours to complete.

“But twice as slow is nothing compared to how much slower storage is compared to RAM.”

man doing math on chalk board

In reality, you wouldn’t need storage to make such a simple calculation, because you could afford to keep your code small enough to cache in DRAM or in the CPU hardware registers itself. But the problem becomes much more pronounced when you have to go get real data from storage, which happens all the time with systems.

Perhaps a more complex math problem would best illustrate. Advanced math problems can require reading a paragraph describing the problem, looking at the textbook for a hint, going over notes from class, and finally scribbling it down on paper. This process can take minutes per problem.

If we used the same storage analogy in computing terms, a five-minute problem that could be solved completely in your head would instead take seventeen days to complete if we had to do it via the human equivalent of storage systems, which is to say, going back to the disk storage system and sending data back and forth. And that’s with SSDs. If we had mechanical drives, it would take nearly three months to do the problem. Imagine working on something all winter and completing it just as spring starts in. It had better be worthwhile, and I would say that looking for the cure to cancer, predicting tornadoes, or developing automated cars certainly are.

Today

So how does this play out in the real world today in 2014? Well, companies like Microsoft, EMC, Nimble, PureStorage, SAP, etc. are all taking advantage of using CPUs and RAM to accelerate their storage solutions. Today’s applications and users can wait milliseconds for data, because they were built to be used with mechanical hard drives, WAN connections, and mobile phones. So the storage companies are using CPUs and RAM to take in IO, organize data, compress it, dedupe it, secure it, place it in specific locations, replicate it, and snapshot it, because CPUs have so much time on their hands and can afford the nanoseconds to do so. They are using off the shelf Intel CPUs and DRAM to do this.

But the idea of waiting milliseconds today will seem absurd in the future. This lazy approach will someday soon change as CPUs and RAM continue to get faster than SSDs. In time, SSDs are going to be much too slow in computing terms, so we are going to see further advancements on the storage front for faster storage and memory technologies.

Things like PCM (Phase Change Memory), Spin-torque tunneling MRAM, Racetrack memory or DWM (Domain-Wall Memory) are technologies in development today. CPU frequencies are not increasing, but parallelism is, so the goal will be to place more RAM and storage closer to the CPU than before, and use more threads and cores to execute on data.

“In time, SSDs are going to be much too slow in computing terms, so we are going to see further advancements on the storage front for faster storage and memory technologies.”

Tommorow

If you have to wonder why CPUs and RAM are the keys to future storage performance, the reason is simply because CPU and RAM are hundreds of times faster than even the fastest storage systems out there today. Cost can be reduced with compression and deduplication.

And I’m betting that this speed discrepancy gap will continue for a while longer, at least over the next 3-5 years. Take a look at Intel and Micron’s Xeon Phi idea using TSV’s, which should make its way to commoditization in a few years. This will augment other advances in memory and storage technologies, driving the discussion from dealing with milliseconds of storage latency to microseconds and nanoseconds in the years to come.

Photo credits via Flickr, in order of appearance: Pier-Luc Bergeron; stuartpilbrow.