2010 Poster Sessions : Energy Efficient Computing

Student Name : Curt Harting
Advisor : William Dally
Research Areas: Computer Systems
Computing systems today are power limited and undergoing an unprecedented shift toward parallelism. This trend is occurring in both embedded and supercomputing systems. This poster focuses on the research currently taking place in Professor William Dally's Energy Efficient Computing group, in both the embedded ELM project and the supercomputing-based ESC project.

To enable efficient embedded programmable solutions, the ELM project addresses one of the largest deficiencies of traditional processors: the energy consumed by data movement in instruction and data caches. More than half of the energy dissipated by a traditional RISC processor is expended in the cache subsystem. The ELM data supply consists of one large traditional register file shared between two VLIW pipelines and smaller register files located near the functional units. This arrangement allows different levels of the working set to be captured in small, efficient memories. To reduce instruction issue energy, we use VLIW techniques and instruction register files. VLIW-style instruction issue works particularly well on the dense linear algebra common to compute-intensive embedded applications. Instruction registers allow the tight loops common to embedded applications to be captured in a small, efficient memory.

The Efficient Supercomputing (ESC) project builds on the knowledge gained from the ELM project and focuses on supercomputing chips and systems. Often the energy spent on the actual floating-point operation is an order of magnitude less than the total energy per instruction. This energy overhead is spent on data and instruction supply through complex cache hierarchies and coherence protocols. In supercomputing, a tradeoff exists between energy-efficient multiprocessors and those that are easily programmable. Our goal is to reduce this unnecessary overhead in the memory subsystem by exposing existing architectural features and adding new ones, while maintaining roughly the same programming model that is used in scientific computing today. The ESC architecture maintains a global, shared address space with hardware coherence but exposes more of the memory hierarchy than is currently typical. We support block, stride, and gather operations within the memory hierarchy. We also add active messaging support, which relocates instructions instead of data. This allows for efficient atomic operations and thread creation. Exposing the concept of processor locality enables threads that share the same data to be colocated, minimizing communication costs. Through these features, we hope to significantly reduce the energy per flop while maintaining programmability.
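
The block, stride, and gather operations mentioned above differ only in which addresses a single request covers. The sketch below models those three access patterns on a flat word-addressed memory. It is an illustration of the general patterns only: the function names, signatures, and the list-based memory are assumptions made for this example and are not taken from the ESC design.

```python
# Hypothetical model of three bulk access patterns over a flat memory.
# All names here are illustrative; none come from the ESC architecture.

def block_read(mem, base, count):
    """Contiguous transfer: `count` consecutive words starting at `base`."""
    return mem[base:base + count]

def stride_read(mem, base, count, stride):
    """Strided transfer: `count` words, each `stride` words apart."""
    return [mem[base + i * stride] for i in range(count)]

def gather_read(mem, indices):
    """Gather: words at arbitrary addresses, returned in request order."""
    return [mem[i] for i in indices]

# 32-word memory in which the word at address a holds the value 100 + a.
memory = list(range(100, 132))

row = block_read(memory, 8, 4)         # addresses 8, 9, 10, 11
column = stride_read(memory, 1, 4, 8)  # addresses 1, 9, 17, 25
sparse = gather_read(memory, [3, 30, 7])
```

If the memory is read as a 4 x 8 row-major matrix, the block read returns part of one row, the strided read returns one column, and the gather returns scattered elements, each expressed as one request rather than one request per word.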

Curt Harting is an Electrical Engineering PhD student in Professor Bill Dally's group. His research is primarily focused on the design and implementation of energy-efficient architectures. Harting received his MS in electrical engineering from Stanford University and his BSE from Duke University.