The Parallel Systems and Computer Architecture Lab

PASCAL

Power Efficient Cache Design:

Introduction

The power wall has long been a major challenge in computer design. It has recently driven a change of design paradigm: the advent of parallel processing to keep overall power dissipation manageable. Indeed, a parallel design such as an on-chip many-core processor allows finer-grained power control, for example voltage scaling or power gating applied to individual cores, with a lower performance penalty. However, as manufacturing technology continues to shrink the basic devices, considerations other than dynamic power consumption come into play.

Power dissipation consists of three parts: dynamic power, short-circuit power, and static power. Dynamic power has long been the main concern because it accounted for the largest share of the overall power budget. However, the growing importance of low-power design and the move to sub-micron technology make static power dissipation an increasingly significant problem. As Figure 1 shows, leakage power has been increasing exponentially over time and now dominates the overall power consumption. Leakage power itself has several components, such as sub-threshold leakage, substrate leakage, and leakage through the gate oxide. Among these, gate oxide leakage has been significantly reduced by the introduction of high-K materials. Therefore, our attention will be on controlling the sub-threshold leakage (note that the sub-threshold leakage is known to be roughly proportional to exp(-VTh / (n·vT)), where vT = kT/q is the thermal voltage, n is a process-dependent factor, and VTh decreases as the technology scales down).
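To illustrate why a shrinking VTh makes sub-threshold leakage such a concern, here is a minimal sketch based on the common textbook model in which leakage is proportional to exp(-VTh / (n·vT)). The slope factor n = 1.5, the 300 K temperature, and the VTh values are illustrative assumptions, not figures from this project.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temperature_k=300.0):
    """Thermal voltage vT = kT/q in volts."""
    return BOLTZMANN * temperature_k / ELECTRON_CHARGE


def relative_subthreshold_leakage(vth, vth_ref, n=1.5, temperature_k=300.0):
    """Leakage at vth relative to leakage at vth_ref.

    Assumes leakage is proportional to exp(-VTh / (n * vT)); n is an
    assumed, process-dependent slope factor.
    """
    vt = thermal_voltage(temperature_k)
    return math.exp((vth_ref - vth) / (n * vt))


if __name__ == "__main__":
    # Hypothetical threshold voltages, compared against a 0.40 V baseline.
    for vth in (0.40, 0.35, 0.30, 0.25):
        ratio = relative_subthreshold_leakage(vth, vth_ref=0.40)
        print(f"VTh = {vth:.2f} V -> {ratio:.1f}x the leakage at 0.40 V")
```

Under these assumptions, each reduction of VTh by a few tens of millivolts multiplies the leakage, which is why the trend compounds as technology scales down.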

Many power-saving schemes have been proposed to address leakage power, but many issues remain unsolved. For example, the peripheral circuits of a cache have received very little attention compared to its memory cells. Also, the issues raised by applying power control schemes to the shared cache of a many-core processor have not been explored enough.

Goals and Approach

Compared to previous research projects, our approach takes a new direction:

Consider the overall cache design to save power. For example, many research projects have focused only on memory cell leakage power and have left out the peripheral circuits.
Instead of adding power control mechanisms to the existing cache, we are considering new, power-efficient cache layouts to achieve a more scalable design.

Current Results

We focused on peripheral circuits such as the word line drivers. These circuits are quite large because they must drive many memory cells, and this in turn causes significant leakage power consumption.

In [1], mechanisms and policies to control the leakage power of the peripheral circuits of the L2 (last-level) cache were discussed. Our power control policy and mechanism hide latency efficiently. In the simulation setup of [1], the wake-up penalty is 10 cycles and the L2 cache latency is 20 cycles. The scheme achieves a 30% static power saving at the cost of only a 2% performance loss. The hardware overhead is also very small: less than 2% additional transistors.
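The trade-off behind these numbers (sleeping peripheral circuits save static power, but an access that finds them asleep pays a wake-up penalty) can be illustrated with a toy model. The sketch below is not the policy from [1]: it assumes a hypothetical idle-timeout rule in which the peripherals are put to sleep after a fixed number of idle cycles. Only the 10-cycle wake-up penalty and the 20-cycle L2 latency come from the setup above; the access trace, the threshold, and the function name are made up for illustration.

```python
def simulate_idle_timeout(access_cycles, total_cycles, idle_threshold,
                          wakeup_penalty=10, hit_latency=20):
    """Toy model of a hypothetical idle-timeout sleep policy.

    The peripherals go to sleep once idle_threshold cycles have passed
    since the last access. An access that arrives while they are asleep
    pays wakeup_penalty on top of hit_latency.

    Returns (fraction of cycles spent asleep, average access latency).
    """
    sleep_cycles = 0
    total_latency = 0
    last_access = 0  # the cache is assumed awake and just used at cycle 0

    for t in sorted(access_cycles):
        gap = t - last_access
        if gap > idle_threshold:
            # Asleep from (last_access + idle_threshold) until this access.
            sleep_cycles += gap - idle_threshold
            total_latency += hit_latency + wakeup_penalty
        else:
            total_latency += hit_latency
        last_access = t

    # Account for idle time after the final access.
    tail = total_cycles - last_access
    if tail > idle_threshold:
        sleep_cycles += tail - idle_threshold

    n = len(access_cycles)
    avg_latency = total_latency / n if n else 0.0
    return sleep_cycles / total_cycles, avg_latency


if __name__ == "__main__":
    # Made-up access trace: two bursts and one isolated access.
    trace = [5, 10, 200, 205, 600]
    asleep, latency = simulate_idle_timeout(trace, total_cycles=1000,
                                            idle_threshold=50)
    print(f"asleep {asleep:.1%} of the time, average latency {latency} cycles")
```

In this model, a lower idle_threshold increases the time spent asleep (more static power saved) but exposes more accesses to the wake-up penalty, which is the tension a good policy has to resolve.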

Current Status

Building on our experience with [1], we are now turning our efforts to the shared cache of many-core architectures.

Publications

[1] Adaptive Techniques for Leakage Power Management in L2 Cache Peripheral Circuits
H. Homayoun, A. Veidenbaum, and J-L. Gaudiot
Proceedings of the XXVI IEEE International Conference on Computer Design (ICCD 2008), Lake Tahoe, California, October 12-15, 2008

More is coming…