Analyzing Intel-Micron 3D XPoint: The Next Generation Non-Volatile Memory
by Kristian Vättö, Ian Cutress & Ryan Smith on July 31, 2015 11:00 AM ESTThe current mainstream memory technologies, namely DRAM (quick memory accessed by the processor) and NAND (solid-state storage), have been around for decades. While the cell designs have evolved over the years to allow scaling to 20nm and below, the fundamental physics behind DRAM and NAND operation haven't changed a bit and both technologies have their unique technological limitations. DRAM offers nanosecond-level latency and unlimited endurance, but this comes at the cost of large cell size, cell volatility, and power consumption. Since DRAM cells need to be constantly refreshed, the cells don't retain data in an off state, requiring quite a bit of power and making DRAM unsuitable for permanent storage. NAND, on the other hand, has much higher latency (especially write operations) and has a limited number of write cycles, but the cells are non-volatile and the structure is much more efficient, enabling low cost and suitability for storage.
Combining DRAM and NAND at the system-level architecture provides the best of both worlds, which is why modern computers use DRAM as a memory/cache and NAND for storage. However, there's still a latency and capacity gap between DRAM and NAND, so the question arises: what if you were to combine the best of DRAM and NAND at the silicon level? The mission of next generation memory technology across the industry has been to develop a new type of memory that provides low latency and high endurance while offering a small and scalable cell size.
We have seen numerous startups, such as Crossbar and Nantero, discuss and demonstrate their next generation memory technologies, but we have yet to see the established DRAM and NAND vendors come out with their solutions. Intel and Micron are here to change that with the announcement of their new 3D XPoint (Cross Point) non-volatile memory technology this week.
First and foremost, Intel and Micron are making it clear that they are not positioning 3D XPoint as a replacement technology for either NAND or DRAM, and in that scale it has been talked about more in its applications nearer NAND than DRAM. It's supposed to complement both and provide a technology that sits in between the two by filling the latency and cost gap exists between DRAM and NAND. Basically, 3D XPoint is a new tier in the computer architecture because it can be used as either slower, non-volitile memory or much faster storage.
DRAM | 3D XPoint | NAND | |
Endurance (P/E Cycles) | 10^15 | 10^7 | 10^3 |
Read Latency | Nanoseconds | 10s of Nanoseconds | ~100 Microseconds |
Intel and Micron are claiming that 3D XPoint provides up to a thousand times higher endurance than NAND. Assuming that the numbers are relative to modern (15-20nm) MLC NAND, the endurance should be in the order of a few million P/E cycles; though the marketing materials are claiming up to tens of millions of write cycles. If we assume 3 million write cycles (1000x of what modern MLC has), a 256GB 3D XPoint based drive would have a total write endurance of 768 petabytes. That's equivalent to 420TB per day for five years, or 4.9GB per second. For storage applications that currently rely on NAND, 3D XPoint will eliminate any potential endurance concerns, but it's not durable enough to challenge DRAM in that front since DRAM endurance is essentially infinite. Whether 3D XPoint provides enough endurance to replace DRAM ultimately depends on the application, but especially in certain enterprise workloads there's a need for DRAM.
3D XPoint latency should be in the order of 10s of nanoseconds, but the companies didn't specify whether this is read or write latency. Judging by the graphs provided by Intel, it seems to be read latency because NAND write latency would measured in milliseconds (typically 1-2ms for a full page write), whereas the graph puts NAND latency at tens of microseconds that is in line with NAND read latency. Write latency is likely higher than that, probably at least 100s of nanoseconds or even a few microseconds given Intel and Micron's claims of "up to 1000x faster than NAND", but what complicates things is that 3D XPoint is accessible at the bit-level whereas NAND is page-level, so comparing the latency of the two without extended context is quite difficult. In any case, 3D XPoint performance should be closer to DRAM than NAND, but since Intel and Micron aren't discussing any specific latencies yet it's too early to make any final conclusions.
Meanwhile unlike many next generation memory technologies out there at the moment, 3D XPoint is the furthest along and doesn't only exist on paper or in a lab. Intel and Micron are currently sampling the first generation die that is being produced at the companies' jointly owned fab in Lehi, Utah. The die is 128Gbit (16GB) in capacity, whereas the products that startup memory companies have in production are in the order of dozens of megabytes. The die is built on a 20nm node and consists of two layers, and in the future scaling will happen through both lithography shrinks and by increasing the number of layers.
The Utah fab has been producing 20nm NAND for now since Intel didn't invest on the 16nm shrink and all initial 3D NAND production will take place in Micron's Singapore fab, but it's unclear whether the full fab with its 20,000 wafers per month capacity will be dedicated to 3D XPoint from now on. My guess would be that 3D XPoint will gradually take over the full wafer capacity in Utah depending on how the market reacts to the new technology and how high demand Intel and Micron are seeing. 3D XPoint does require some new equipment for manufacturing since 3D XPoint deals with a whole new set of materials, but Intel and Micron said that the transition is quite similar to a new NAND node and allows some of the existing equipment to be used.
The companies aren't quoting any price per gigabyte yet, but since the whole function of 3D XPoint is to fill the gap between DRAM and NAND, it will also be priced accordingly. A quick look at NewEgg puts DRAM pricing at approximately $5-6 per gigabyte, whereas the high-end enterprise SSDs are in the range of $2-3. While client SSDs can be had for as low as $0.35, they aren't really a fair comparison because at least initially 3D XPoint will be aimed for enterprise applications. My educated guess is that the first 3D XPoint based products will be priced at about $4 per gigabyte, possibly even slightly lower depending on how DRAM and NAND pricess fall within a year.
80 Comments
View All Comments
FunBunny2 - Friday, July 31, 2015 - link
If you want to know what's being sold, go back and look up Unity Semiconductor's CMOx tech. Rambus bought them, then Rambus and Micron settled, including a patent sharing arrangement. The last Unity CEO said, just before Rambus bought them, that 2015 was production year. Could be.nwarawa - Friday, July 31, 2015 - link
I can't wait for this to be a normal conversation:A:"How much storage do you have?"
B:"256GB"
A:"RAM or on your drive?"
B:"Yes."
ajp_anton - Friday, July 31, 2015 - link
10^15 P/E cycles for DRAM? How does this work?, as typical DRAM does on the order of 10^16 cycles in a year. I'm assuming a P/E cycle is the same as a clock cycle because of the constant refreshing, is this wrong?Crazy1 - Saturday, August 1, 2015 - link
I had to look this up, but the DDR3 standard calls for at least 8 refresh commands every 7.8 usec. Rounding down to the nearest 50ns, means to one refresh every 950 ns. When calculated out, that equals roughly 3.32x10^13 cycles/year. That means DDR3 should survive up to 30 years with a 10^15 P/E cycles rating, while never turning off your computer or putting it in hibernate.In a refresh cycle, the information in a cell is read, then rewritten. There is no erase. I'm not sure the speed a typical P/E cycle occurs when erasing and writing new data is required. If it is significantly quicker than 950ns, there may be a decrease in lifespan from 30 years. However, unless you run intensive programs that delete and write new information to all memory cells every 32ns, you are not going to exceed the 10^15 P/E cycles in a year.
TallestJon96 - Friday, July 31, 2015 - link
Excellent work. Anandtech always has the best information and reviews, even if they are the last.This is pretty exciting stuff. If storage can become fast enough, then perhaps we will not need memory. Theoretically this would be a massive improvement to efficiency and performance. I would argue that the perfect computer would only have a processor and extremely fast storage. This is not enough to fill the gap, but storage is certainly catching up.
As a gamer, the idea of having my game loaded onto storage that is fast enough to not need to load into the memory is pretty appealing. Zero load time, no texture streaming issues, and potentially larger scale.
I have to wonder about bandwidth with this tech. Latency is clearly between ram and SSDs, but is closer to ram. But I haven't seen any solid bandwidth stats.
Freakie - Friday, July 31, 2015 - link
In the article they mention that gamers already can by-pass slow NAND and HDD speeds by just creating a RAMDisk. If you have 32GB of RAM, you could take 8GB of it for your system memory, turn the other 24GB into a RAM disk, and put all of your game files onto it and then your games will load their resources at the speed of your RAM.And DDR4 is coming down in price very quickly so it isn't such a crazy idea. The cheapest 32GB DDR4 kit I can find is $176 which means 64GB will cost you $350 for games that have 40GB of resources. While not incredibly cheap, it's also not totally unreasonable especially if you're already complaining about SSD's not loading game resources fast enough.
Friendly0Fire - Saturday, August 1, 2015 - link
Sadly, 24GB is a bit short for modern games and 8GB for the OS and the game is also a bit on the low side. Games are finally taking advantage of 64-bit executables (and thus far larger memory cap) and it's showing up as a dramatic increase in asset size, both on disk and in memory.64GB of RAM might get you there, but I think 32's on the short-ish side. 3D XPoint would side-step the issue by providing far more storage than contemporary games would likely need.
lordken - Sunday, August 2, 2015 - link
As said by Friendly0Fir 24GB is unfortunately nothing today, many games today have 20-50GB disk requirments (not sure if devs are plain lazy to optimize or they really need that much space for stuff)Plus dont forget that you need to first fetch data into ramdisk after boot, and wait it to flush it out before shutdown. So personally I would not bother with ramdisks, and probably load times doesnt solely depend on read time from storage only. On some games I didnt seen much difference between HDD and SSD load performance (which shows either bad game engine/coding or some other bottleneck, maybe my CPU).
And not to say leaving only 8GB for OS is really not that great.
JKflipflop98 - Monday, August 3, 2015 - link
Not to mention it's a giant pain in the butt to have to create the ram drive, copy all the files over, and then create all the links needed to actually run the game. By the time you're done futzing around with all that crap, you've cost yourself 10x the time you've saved in loading screens.lordken - Sunday, August 2, 2015 - link
"This is pretty exciting stuff. If storage can become fast enough, then perhaps we will not need memory. "imho this will "never" be true, RAM will always be faster, no matter how much you make storage faster you can still also improve RAM which in turn will always keep ahead of storage. Plus as shown in article it is much closer to CPU and thus better perf/latencies etc.
Maybe in case when Xpoint v3 reach performance level of DDR3/4 then diminishing returns could start to kick in , but still by that time we will probably have DDR5/6 or HBM3. So I think RAM will stick around, even if it could perhaps shift into CPU L4 like cache with HBM for example.