Lab 9: Power Estimation
Introduction
In embedded systems research there are several metrics we focus on: size, performance, and power. In systems with a limited power supply, such as mobile phones, PDAs, and laptops, power consumption is of great concern. To see whether our research efforts have any effect, we must be able to measure the difference in power consumption between the original design and the modified one.
There are several ways to obtain the power consumption of a given design, using tools such as Synopsys, Wattch, SimplePower, and Spice. These tools range from estimating power based on instruction traces to estimating power from the layout. As you can imagine, there are tradeoffs among power estimation tools as well. An instruction trace is easier and quicker to obtain than a full layout design. However, power estimation using the layout is more accurate than instruction-level estimation, because at the layout level you know exactly where the transistors are located, how many wires exist, and how long the wires are. On the other hand, synthesizing your design to the layout level takes more time and requires expensive tools, and simulating a design at the layout level takes far longer.
PowerStone Benchmark Suite
How do you compare the modification you made with the modifications made by another research group? If each research group comes up with its own set of experiments, the results cannot be compared. Instead, there needs to be a common set of tests that lets research groups compare results and test how a given modification will perform in a particular class of applications. PowerStone is a group of programs (also called benchmarks) that represent various embedded applications. By using this suite, research groups can compare their results with each other fairly and more accurately.
| Benchmark | Description |
| --- | --- |
| bcnt | Bit Manipulation |
| binary | Binary Insertion |
| crc | Cyclic Redundancy Check |
| matmul | Matrix Multiplication |
| summin | Handwriting Recognition |
Obtaining Power Consumption of an i8051 VHDL Core
The 8051 is an 8-bit microprocessor originally designed in the 1980s by Intel that has gained great popularity since its introduction. Its standard form includes several on-chip peripherals, including timers, counters, and UARTs, plus 4 Kbytes of on-chip program memory and 128 bytes (note: bytes, not Kbytes) of data memory, making single-chip implementations possible. Its hundreds of derivatives, manufactured by several different companies (such as Philips), include even more on-chip peripherals, such as analog-digital converters, pulse-width modulators, and I2C bus interfaces. Costing only a few dollars per IC, the 8051 is estimated to be used in a large percentage (perhaps as much as half) of all embedded system products.
The 8051 memory architecture includes 128 bytes of data memory that are directly accessible by its instructions. A 16-byte segment of this 128-byte memory block is bit addressable by a subset of the 8051 instructions, namely the bit instructions. External memory of up to 64 Kbytes is accessible through a special "movx" instruction. Up to 4 Kbytes of program instructions can be stored in the internal memory of the 8051, or the 8051 can be configured to use up to 64 Kbytes of external program memory. The majority of the 8051's instructions are executed within 12 clock cycles.
Suppose we wanted to see if we can modify this 8051 core to consume less power. The first thing we would want to do is measure the power the original core consumes while executing a given program. If we were using Synopsys to obtain a gate-level power estimate, we would perform the following steps:
Behavioral Synthesis and Simulation


We have gone through the simplified, high-level steps required to obtain the power of a design using Synopsys (for more detailed steps on synthesis and simulation, refer to the dalton page). For the i8051 code, it takes only a couple of minutes to synthesize and simulate a behavioral-level design. However, synthesizing the i8051 source code to gate level can take up to an hour. Simulating a gate-level design can take anywhere from an hour to a week (sometimes more), depending on the code we are trying to simulate. The power analysis can take just as long.
What happens when we want to run 10 benchmarks on anywhere from one to a dozen different i8051 configurations? We would end up waiting weeks or even months to obtain power data! Is there any way we could obtain power data in a fraction of the time? Remember, the lower we go (from behavior to layout), the longer it takes to simulate the design. Therefore, if we could estimate power consumption at a higher level, we could obtain power data in a fraction of the time: instead of hours, we would need only seconds. The problem with estimating power at a higher level is that we lose accuracy.
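To get a feel for these numbers, the following back-of-the-envelope sketch adds up the time for gate-level runs using the ranges quoted above (synthesis up to an hour, simulation from an hour to a week, power analysis about as long as simulation). It assumes, purely for illustration, that runs execute one after another on a single machine and that each configuration is synthesized once; the function name and the cost model are hypothetical, not part of the lab.

```python
HOURS_PER_WEEK = 7 * 24


def total_hours(benchmarks, configs, synth_h, sim_h, power_h):
    """Sequential wall-clock hours for a full gate-level experiment.

    Assumed model: each configuration is synthesized once, and every
    (benchmark, configuration) pair is simulated and then power-analyzed.
    """
    synthesis = configs * synth_h
    runs = benchmarks * configs
    return synthesis + runs * (sim_h + power_h)


# 10 benchmarks on a dozen configurations, fastest and slowest quoted times.
best_case = total_hours(10, 12, synth_h=1, sim_h=1, power_h=1)
worst_case = total_hours(10, 12, synth_h=1,
                         sim_h=HOURS_PER_WEEK, power_h=HOURS_PER_WEEK)

print(f"best case:  {best_case} hours ({best_case / 24:.1f} days)")
print(f"worst case: {worst_case} hours ({worst_case / HOURS_PER_WEEK:.1f} weeks)")
```

Even with one-hour simulations, this model gives more than ten days of sequential compute, which is the motivation for estimating at a higher level.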
Power Estimation Tool
In this lab you will write a high-level power estimation tool.
Since we have limited resources and time, the estimation tool
will determine how much power an ALU would consume as opposed
to a microprocessor core such as the i8051. You will be given:
Your job is to implement a high-level methodology for estimating power consumption while trying to preserve accuracy. Then, using your power estimation tool, measure the power the ALU consumes for each of these benchmarks.
(Note: You should analyze i8051_lib.vhd before you
analyze i8051_alu.vhd).
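One possible starting point, shown here only as a minimal sketch and not as the required method, is an operation-level model: count how often each ALU operation appears in a benchmark's trace and weight each count by a per-operation energy cost. The operation names, the energy values in `ENERGY_PJ`, and the trace format below are all hypothetical placeholders; in practice the costs would have to be characterized from a more accurate (for example, gate-level) analysis of the ALU.

```python
from collections import Counter

# HYPOTHETICAL per-operation energy in picojoules. These numbers are
# placeholders for illustration, not measured values for any real ALU.
ENERGY_PJ = {"add": 1200, "sub": 1300, "and": 600, "or": 600, "mul": 4500}


def estimate_energy_pj(trace, energy_table=ENERGY_PJ):
    """Sum per-operation energy over a trace of ALU operation names."""
    counts = Counter(trace)
    unknown = sorted(set(counts) - set(energy_table))
    if unknown:
        raise KeyError(f"no energy entry for operations: {unknown}")
    return sum(energy_table[op] * n for op, n in counts.items())


def average_power_mw(energy_pj, cycles, clock_hz):
    """Average power in milliwatts: energy divided by execution time."""
    seconds = cycles / clock_hz
    return (energy_pj * 1e-12) / seconds * 1e3


# Tiny made-up trace; assumes 12 clock cycles per operation at 12 MHz.
trace = ["add", "add", "mul", "and"]
energy = estimate_energy_pj(trace)
power = average_power_mw(energy, cycles=12 * len(trace), clock_hz=12_000_000)
print(energy, "pJ,", round(power, 3), "mW")
```

A model like this runs in seconds, but it ignores data-dependent switching activity, so its accuracy depends entirely on how well the per-operation costs were characterized.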
In previous labs, we gave you an introduction to SimpleScalar, an 8051 simulator, and Synopsys. You can use these tools, or any others you can find, to assist you in developing your power estimation tool. Since these simulation tools are processor dependent, you may choose whether the ALU will be a component of an 8051 or a MIPS processor.
Lab Questions
In your lab report, be sure to discuss the following: