In RAMCloud, each server is effectively disk-less (disks are used only for backup, not to service application reads and writes). All data is placed in DRAM main memory. Each server is configured with high memory capacity, and every processor has access via the network to the memory space of all servers in the RAMCloud. It is easy to see that such a system should offer high performance because high-latency disk access (a few milliseconds) is replaced by low-latency DRAM-plus-network access (microseconds).
An immediate architectural concern is cost. DRAM has a dollar-per-GB purchase price that is 50-100x higher than that of disk. A server with 10 TB of disk space costs $2K, while a server with 64 GB of DRAM and no disk costs $3K (2009 data from the FAWN paper). Comparing the power consumption of DRAM and disk is trickier. An individual access to DRAM consumes much less energy than an individual access to disk, but DRAM has a higher static energy overhead (the cost of refresh). If the access rate is high enough, DRAM is more energy-efficient than disk. For the same servers as above, the 10 TB disk-based server has a power rating of 250 W, whereas the 64 GB DRAM-based server has a power rating of 280 W (again 2009 data from the FAWN paper). This is not quite an apples-to-apples comparison, because the DRAM-bound server services many more requests at 280 W than the disk-bound server does at 250 W. But it is clear that in terms of operating (energy) cost per GB, DRAM is again much more expensive than disk. Note that total cost of ownership (TCO) is the sum of capital expenditure (capex) and operational expenditure (opex). The above data points make it appear that RAMCloud incurs a huge TCO penalty.
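The per-GB gap is easy to see from the server prices quoted above. Note that this is a whole-server cost per GB (it folds in the CPU, board, and so on), so the ratio comes out even larger than the 50-100x raw-media figure:

```python
# Back-of-the-envelope cost-per-GB using the 2009 FAWN-era figures above.
disk_server_cost, disk_capacity_gb = 2000, 10 * 1000   # $2K, 10 TB (decimal GB)
dram_server_cost, dram_capacity_gb = 3000, 64          # $3K, 64 GB

disk_per_gb = disk_server_cost / disk_capacity_gb      # dollars per GB, disk server
dram_per_gb = dram_server_cost / dram_capacity_gb      # dollars per GB, DRAM server
print(f"disk: ${disk_per_gb:.2f}/GB, DRAM: ${dram_per_gb:.2f}/GB, "
      f"ratio: {dram_per_gb / disk_per_gb:.0f}x")
# → disk: $0.20/GB, DRAM: $46.88/GB, ratio: 234x
```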
However, at least to my initial surprise, the opposite is true for a large class of workloads. Assume that an application has a fixed, high data bandwidth demand, and that this demand is the key determinant of overall performance. Each disk offers very low bandwidth because of the low rotational speed of the spindle, especially for random accesses. To meet the application's bandwidth demand, you would need several disks, and hence several of the 250 W, 10 TB servers. If the data were instead placed in DRAM (as in RAMCloud), the same demand could be met with just a few of the 280 W, 64 GB servers. The difference in data bandwidth between DRAM and disk is over 600x. So even though each DRAM server in the example above is more expensive in terms of both capex and opex, RAMCloud needs up to 600 times fewer servers. This allows the overall TCO of RAMCloud to be lower than that of a traditional disk-based platform.
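This crossover can be sketched numerically. The 600x bandwidth ratio, the server prices, and the power ratings come from the text; the absolute per-server disk bandwidth, the aggregate demand, the 3-year horizon, and the $0.10/kWh energy price are illustrative assumptions of mine:

```python
import math

# Bandwidth-driven sizing: how many servers does each design need?
demand_mb_s = 100_000                            # assumed aggregate bandwidth demand
disk_server_bw = 50                              # MB/s per disk server (assumed)
dram_server_bw = disk_server_bw * 600            # 600x ratio from the text

def tco(servers, capex_per_server, watts, years=3, usd_per_kwh=0.10):
    """Capex plus energy opex over the deployment lifetime."""
    opex = servers * watts / 1000 * 24 * 365 * years * usd_per_kwh
    return servers * capex_per_server + opex

disk_servers = math.ceil(demand_mb_s / disk_server_bw)   # 2000 servers
dram_servers = math.ceil(demand_mb_s / dram_server_bw)   # 4 servers
print(f"disk TCO: ${tco(disk_servers, 2000, 250):,.0f}")
print(f"DRAM TCO: ${tco(dram_servers, 3000, 280):,.0f}")
```

Despite the higher per-server capex ($3K vs. $2K) and power (280 W vs. 250 W), the DRAM deployment's TCO is orders of magnitude lower here, simply because it needs hundreds of times fewer servers to deliver the same bandwidth.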
I really like Figure 2 in the RAMCloud CACM paper (derived from the FAWN paper and reproduced below). It shows that in terms of TCO, for a given capacity requirement, DRAM is a compelling design point at high access rates. In short, if data bandwidth is the bottleneck, it is cheaper to use technology (DRAM) that has high bandwidth, even if it incurs a much higher energy and purchase price per byte.
[Figure: TCO as a function of capacity and access rate. Source: RAMCloud CACM paper]
If architectures like RAMCloud become popular in the coming years, they open up a slew of interesting problems for architects:
1. Already, the DRAM main memory system is a huge energy bottleneck. RAMCloud amplifies the contribution of the memory system to overall datacenter energy, making memory energy efficiency a top priority.
2. Queuing delays at the memory controller are a major concern. With RAMCloud, a single memory controller will service many requests from many servers, increasing the importance of the memory scheduling algorithm.
3. With each new DDR generation, fewer main memory DIMMs can be attached to a single high-speed electrical channel. To support high memory capacity per server, innovative channel designs are required.
4. If the Achilles' heel of disks is their low bandwidth, are there ways to design disk and server architectures that prioritize disk bandwidth/dollar over other metrics?