Datacenter efficiency is the new buzzword these days. Cloud computing (another buzzword!) essentially dictates the need for datacenter energy efficiency, and tree-hugging engineers are more than happy to embrace this shift in computer design.
There are a lot of sub-systems in a datacenter that can be re-designed or optimized for efficiency, ranging from power distribution all the way down to the individual compute cores of a chip multi-processor. In this post, I am mostly going to talk about the general-purpose microprocessor instruction set architecture (ISA) battle that is currently raging at a datacenter near you.
To sum it up, Intel and ARM are getting ready for a battle. This is the kind of battle that Intel has fought in the past against IBM and Sun: Reduced Instruction Set Computing (RISC) versus Complex Instruction Set Computing (CISC). To some extent, this battle is still going on. Intel's x64 ISA is CISC, and its RISC competitors are Oracle's SPARC ISA and IBM's POWER ISA. Intel also has its Itanium line (an EPIC design rather than a classic RISC architecture), and IBM has its mainframe z/Architecture, which is CISC. We will ignore these last two because they serve niche market segments, and will focus instead on the commodity server business, which is largely a shipment-volume-driven business.
One clarification to be made here is that Intel's x64 chips are "CISC-like" rather than true CISC CPUs. Modern x64 chips have a front end that decodes x86 ISA instructions into RISC-like "micro-ops". These micro-ops are what the rest of the CPU actually executes, unlike the true CISC operations executed by IBM's mainframe z CPUs, for example. To clarify this distinction, any reference to x64 CPUs as CISC processors in this article actually refers to x64 CPUs with this decoder front end. The front-end decoding logic in x64 CPUs is reported to be a large consumer of resources and energy, and it is entirely possible that it is a significant contributor to the lower energy efficiency of x64 CPUs compared to true RISC CPUs.
ARM is the new kid on the block in the server racket, and it touts higher efficiency as its unique selling point. Historically, the arguments put forth for RISC claim that it is a more energy-efficient platform due to the lower complexity of the cores. The reasoning boils down to the fact that some of the complexity of computation is moved from the hardware to the compiler. This energy efficiency is what ARM and other RISC CPU manufacturers claim to be their raison d'être.
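To make the last two paragraphs concrete, here is a minimal sketch in C. The function and variable names (bump, counter) are purely illustrative, and the instruction sequences in the comments are representative of typical compiler output rather than taken from any specific compiler or CPU: on x64 a read-modify-write of memory can be encoded as a single memory-operand instruction that the front end then cracks into micro-ops, while on 32-bit ARM the load, add, and store are separate instructions emitted by the compiler.

```c
#include <stdio.h>

/* Increment a counter that lives in memory. */
static void bump(int *counter)
{
    /*
     * A typical x64 compiler can encode this read-modify-write as a single
     * memory-operand instruction, roughly:
     *     add dword ptr [rdi], 1
     * The CPU's front end then cracks that one instruction into RISC-like
     * micro-ops (load, add, store) before execution.
     *
     * A typical 32-bit ARM compiler has to spell out the same operation as
     * separate instructions, roughly:
     *     ldr r1, [r0]
     *     add r1, r1, #1
     *     str r1, [r0]
     * so the decomposition is visible in the ISA (and done by the compiler)
     * instead of being hidden behind a decoder front end.
     */
    (*counter)++;
}

int main(void)
{
    int hits = 0;
    bump(&hits);
    printf("hits = %d\n", hits);
    return 0;
}
```

The point of the sketch is simply that the same decomposition happens either way; the RISC argument is about where (and at what hardware cost) it happens.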
There is currently a lot of effort going into developing ARM-based servers, especially since recent ARM designs support ECC and Physical Address Extension (PAE), allowing the cores to be "server grade". This interest is unabated even though the cores are still 32-bit. Reports, however, claim that ARM is working on a 64-bit architecture. What is unclear at this point is how energy efficient ARM cores will be when they finally make it into servers.
The uncertainty about the efficiency of ARM-based servers stems from the fact that ARM does not manufacture its own cores. ARM is an IP company; it licenses its core designs to third parties such as TI and NVIDIA. This gives ARM both a strategic advantage and a reason to be wary of Intel's strategy.
The advantage of licensing IP is that ARM is not invested heavily in the fab business, which is fast becoming an extremely difficult business to be in. AMD spun off its fab business as it was getting hard for it to compete with Intel. Intel is undoubtedly the big kahuna in the fab industry, with the most advanced technology. This allows Intel to improve x64 performance by leveraging manufacturing process superiority, even though x64 might be inherently energy hungry; the Atom processor is an example of this phenomenon. Intel's manufacturing technology superiority is therefore a reason for concern for ARM.
The datacenter energy efficiency puzzle is a complex one. From the CPU point of view, there are five primary players: AMD, IBM, Intel, NVIDIA, and Oracle. AMD is taking its traditional head-on approach against Intel, while NVIDIA is attacking the problem sideways by advancing GPU-based computing. NVIDIA is also reportedly building high-performance ARM-based CPUs to work in conjunction with its GPUs, evidently to deliver higher compute efficiency through asymmetric compute capabilities.
IBM's and Oracle's strategies are two-fold. Besides their RISC offerings, they are also getting ready to peddle their "appliances": vertically integrated systems that try to squeeze every bit of performance out of the underlying hardware. Ultimately, these appliances might rule the roost when it comes to energy efficiency, in which case one can expect HP, Cisco, and possibly Oracle and IBM themselves to come out with x64- and/or ARM-based appliances.
It's unlikely that just one of these players will be able to dominate the entire market. Hopefully, in the end, the entire computing ecosystem will be more energy efficient because of the innovation driven by these competing forces. This should make all the tree-hugging engineers and tree-hugging users of the cloud happy. So watch out for the ISA of that new server in your datacenter, unless of course you are blinded by your cloud!
Some interesting links:
Read the following article this morning, quite interesting perspective from NVIDIA.
http://arstechnica.com/business/news/2011/02/nvidia-30-and-the-riscification-of-x86.ars
This is really getting interesting now, with Intel trying another trick using Atom processors >>
http://www.theregister.co.uk/2011/03/01/seamicro_intel_proxy/
And today, Intel took this game to a whole new level!
http://www.theregister.co.uk/2011/03/16/intel_xeon_e3_chips_micro_servers/
Hi Kshitij!
My name is Gregory.
First of all, I'd like to thank you for your very interesting blog. Secondly, I've read in the blog: "The front-end decoding logic in x64 CPUs is reported to be a large consumer of resources and energy, and it is entirely possible that it is a significant contributor to the lower energy efficiency of x64 CPUs compared to true RISC CPUs." So my question is: do you have any actual numbers for the power consumption of an Intel x64 CPU broken down by module? If you have, please share them with me via email (grimoroz@mail.ru). Thank you in advance!
Hi Gregory,
There's not a lot of public documentation from Intel that breaks down the power consumption of the front-end decode unit, or of the front end as a whole. Part of my comment was based on non-Intel estimates I've read in the past.
One reference for a core-level power breakdown is the following. It gives the breakdown for a Pentium Pro core.
http://www.princeton.edu/~mrm/tutorial/Sigmetrics2001_tutorial.pdf