| Instructor: | Frank Vahid (vahid@cs.ucr.edu). Office hours TR 2-3, Bourns A207 |
| Class meeting time | TR 3:10-4:30, GEOL 1429 (course call# 16330) |
| Prerequisites | None, though digital design knowledge helpful |
| Textbooks | None -- all readings will be online papers. |
| Grade | Based on presentations, participation and a few possible homeworks. |
Title: International Technology Roadmap for Semiconductors
1999 (ITRS)
Summary:
Title: VSIA's Architecture Document
Summary: The VSIA architecture document describes requirements
for Virtual Component (VC) designers to aid design reuse. Its
main focus is to provide a set of standards for different levels
of VCs and their requirements. It classifies virtual components as
soft VCs (RTL descriptions), firm VCs (ranging from partially
mapped RTL descriptions to fully placed netlists), or hard VCs
(fully mapped and placed netlists). Each kind of VC comes with a
set of documentation and required models designed to reduce the
effort involved in integrating the VC into a design. These
requirements include documentation on interface, timing,
electrical characteristics, test procedures, and models.
Title: VSIA's System Level Model Taxonomy Document
Summary: This document was created by the VSIA to provide a
commonly accepted nomenclature and classification for models, so
that design information can be transferred between designers and
quickly understood. The document discusses many classifications and
levels of modeling. It first defines System Models, Architectural
Models, Hardware Models, and Software Models. Within each category
it further defines types of models, namely behavioral, functional,
structural, interface, performance, and dataflow graph models. For
each of these classifications it describes the precision of the
model; that is, it defines which aspects of the design need to be
implemented in a particular model and which attributes might be
included in the model.
Title: Future of Computing Architectures
Summary: This document evaluates past, present, and predicted
computer architectures. The authors acknowledge that in the past
the emphasis was on desktop systems, whereas future trends point
toward personal mobile computing. This type of system has different
requirements, such as size, portability, power
consumption/efficiency, real-time response, and design scalability.
They believe that present-day architectures are heavily biased
toward past architecture designs, which is not sensible given the
shift from desktop to embedded systems. Different architectures
were examined in this paper, and IA-64 and Raw were thought to be
among the better architectures because their designs were not based
on the past but looked to future demands for design ideas.
Title: Philips Silicon Platforms
Summary: For well-defined applications, "silicon platforms"
(SOC) must be defined which combine efficient implementations with
programmability. The large architecture space leads to
possibilities for new architectures. However, with so large a space
we eventually run into the problems of time to market and the
design gap. Some of the proposed solutions were reuse, libraries,
and breaking up the architecture into differing levels of
granularity. A multiwindow TV application was chosen to explore
system-level architectures. In the process the authors also came up
with an architecture template and heuristics to aid in the design
of the system.
Title: Heterogeneous Reconfigurable Processors
Summary: With the paradigm shift in reconfigurable systems,
many implementations and techniques are used in their design.
This paper discusses some of these, their tradeoffs, and the issues
associated with them. Custom designs yield the best solutions, but
they do not fare well on issues such as time to market,
flexibility, and adaptivity. What about programmable or
configurable architectures? These were found in the past to be too
confining. The authors bring up some potential expansions of
configurable architectures to remedy this, and classify future
systems-on-a-chip into three categories: homogeneous arrays of
general-purpose processing elements, application-specific
combinations of processing elements, and heterogeneous combinations
of processing elements.
Title: EXPRESSION Architecture Description Language (UCI)
Summary: An Architecture Description Language (ADL) is presented
that captures an architecture's behavior as well as its structure.
The behavioral representation describes the operations of the
processor and their operand types, while the structural
representation describes the pipeline structure and data-path
construction. In addition, EXPRESSION allows the memory hierarchy
of the architecture to be described. From an EXPRESSION
description, reservation tables for each operation are obtained
automatically. Using the EXPRESSION grammar, the described
architecture is also automatically verified for correctness. The
EXPRESSION description is further used to automatically generate
tool-kits such as simulators and compilers for the architecture.
Title: Adapting Cache Line Size to Application
Behaviour (UCI)
Summary: This paper describes the implementation of a cache system
that adapts its cache-line size dynamically. The paper gives
details on the hardware for such an adaptive cache and describes
the policies required for such a system. The main idea is that
during the execution of an application, the cache line that
caused a hit is either merged with its neighbor or broken into
smaller cache lines. There are 48 possible ways of configuring
the proposed cache architecture. Each configuration has a slightly
different policy or algorithm for increasing or decreasing the
cache line size.
Title: Philips' Retargetable Simulator
Summary: A methodology (Y-Chart) is described for designing
programmable systems in the domain of high-performance video signal
processing. The approach allows various architectures and sets of
applications to be described. This description is then used to
derive an architecture instance (in which all the configurable
parameters are defined). The architecture instance is processed to
create an executable architecture instance, which is instrumented
so that it can collect performance metrics. The application is then
mapped and executed via the simulator to obtain performance
metrics. These steps are repeated to refine the system.
Object-oriented principles are used to capture the architecture
instance. A multithreaded library is used for the concurrent
execution model. The execution model is capable of performing 10K
coarse-grain instructions per second.
Title: Philips' approach: an MPEG example
Summary: This paper presents a case study of an architecture
for MPEG-2 decoding. The objective is to validate the System
Level Performance Analysis and Design Space Exploration (SPADE)
methodology. Using this methodology, one can concurrently
break an application into a Kahn process model and specify a
parameterizable architecture. The Kahn process model is then
mapped onto this architecture. Afterwards, the system and the
architecture are simulated, and the design space is explored based
on performance analysis. The Kahn process representation of the
application is executed in a multithreaded object-oriented
environment where each process runs as a thread and communication
is performed using blocking-read and write primitives. The
application execution model is augmented with code to output
performance metrics and execution traces. Then, after manual
mapping onto the architecture, the performance of the overall
system is evaluated via trace-driven simulation. The SPADE
methodology is an example of the Y-Chart methodology.
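The thread-per-process execution of a Kahn process model with blocking
reads can be sketched roughly as follows. This is only an illustration
of the idea, not SPADE's actual library: the process names, the use of
Python's `queue.Queue` as an unbounded FIFO channel, and the `None`
end-of-stream marker are all assumptions made for the example.

```python
import queue
import threading


def producer(out_q, n):
    """Source process: emit n tokens, then an end-of-stream marker."""
    for i in range(n):
        out_q.put(i)          # write never blocks on an unbounded FIFO
    out_q.put(None)           # hypothetical end-of-stream convention


def scale(in_q, out_q, factor):
    """Filter process: read a token, transform it, write it on."""
    while True:
        token = in_q.get()    # blocking read: waits until a token arrives
        if token is None:
            out_q.put(None)
            return
        out_q.put(token * factor)


def run_network(n=5, factor=3):
    """Run producer -> scale as threads and collect the output stream."""
    a, b = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=scale, args=(a, b, factor)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        token = b.get()
        if token is None:
            break
        results.append(token)
    for t in threads:
        t.join()
    return results
```

Because every read blocks until data is available, the output stream
does not depend on how the threads happen to be scheduled.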
Title: Programmable interconnect
Summary: This paper discusses a number of interconnect
architectures for SOCs. It divides these architectures into
"global" and "local" interconnects. Global interconnects
are those that have a fixed communication cost regardless of
the distance between the communicating components. Local
interconnects are networks that provide cheap interconnect among
local components while minimizing the cost of communication
between distant components. For global interconnects, the
crossbar, multi-stage (Omega), and multi-bus networks are
described. For local interconnects, the mesh, the generalized mesh,
and hierarchical networks such as fat-trees are described. The
paper concludes that the generalized hierarchical mesh interconnect
is best suited for reconfigurable SOCs. This network breaks the
components into clusters and provides a local generalized mesh
within each cluster. Likewise, a generalized mesh interconnect is
used for communication across clusters.
Title: Java Driven Codesign and Prototyping of Networked
Embedded Systems
Summary: This paper discusses the challenges of codesign and
presents a method and set of tools called JaCOP that support
co-synthesis and prototyping of networked embedded systems.
These tools aid in hardware/software development, profiling, and
managing the interaction of software and hardware components. The
proposed design flow starts with the initial Java specification and
profiling data. The data is then analyzed and animated to help the
designer with partitioning. Functions are implemented and
synthesized using high-level and logic synthesis tools. Previously
designed hardware components kept in a library can then be reused.
These and the software components are put into a pool. As the
design process continues, portions of the chip can be reconfigured
while other sections are being worked on. This way we have actual
hardware and software codesign. The paper elaborates on the details
of each of these steps.
Title: Description and Simulation of Hardware/Software Systems
with Java
Summary: This paper suggests using JavaBeans to model systems.
This way, complexities that arise in designing integrated systems
can be avoided by using higher and higher levels of abstraction.
The system is composed of structure and behavior at the system
level, algorithmic level, and register transfer level. The authors
suggest that the problem can be solved by an object model based on
differing interpretations of objects. Some of the interpretations
are objects as components, hardware objects, and objects as
connections. The paper shows the use of tools, classes, and
libraries built on Java, together with an extension of Java, to
design these systems. The authors also present a design flow and
examples.
Title: Representation of Function Variants for Embedded System
Optimization and Synthesis
Summary: This paper proposes a novel approach for the
coherent representation and selection of function variants in the
different phases of the design process. The authors use a real
example from the video processing domain to illustrate the
approach. A System Property Interval model is used and extended to
express the concept of function variants. They define clusters and
interfaces using the function variants and explain the selection
process.
Title: Fast Prototyping: a system design flow applied to a
complex System-On-Chip multiprocessor design
Summary: This paper introduces CoWare N2C, which is intended
to reduce time to market and to enable concurrent hardware and
software development, early verification, and productive reuse of
intellectual property. The authors describe the challenges of
system-on-a-chip design and illustrate how this product provides
solutions. The tools and strategies used in the actual development
of their product are also discussed. Among the benefits of the
product are the ability to refine behavioral C descriptions down to
clocked C (similar to VHDL RTL), cross-checking capabilities,
functional validation, fully emulated prototypes, and a
co-simulation engine that allows hybrid prototyping. They describe
the prototyping of the megacell and of the overall system. Their
system was designed in four to five months, which is significantly
shorter than would be expected.
Title: Code Compression for Embedded Systems
Summary: Code compression can provide substantial savings
in memory size. Two algorithms are proposed in the paper to
compress code in a way that is space efficient and simple to
decompress.
a) SAMC (Semi Adaptive Markov Compression) uses a binary arithmetic
coder driven by a Markov model. The idea is to divide instructions
into 4 streams of eight bits each (for a 32-bit word) and build
a Markov model for each of them.
b) SADC (Semi Adaptive Dictionary Compression) uses a semi-adaptive
dictionary method to compress opcodes, opcode-register combinations,
and opcode-immediate combinations. The basic idea is that opcodes in
an instruction sequence tend to exhibit some dependence between
them. These dependencies are exploited by generating new augmented
opcodes which combine several opcodes together.
The compression results showed significant improvements over
previous attempts. SAMC is targeted at RISC instruction sets with
fixed-size instructions and can work for any architecture.
Its compression ratios are comparable to UNIX compress. SADC works
on a specific program and instruction set; as a result, it can
achieve significantly better compression. Because it is a
dictionary method, it allows for fast hardware implementations.
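Two of the ideas above lend themselves to a small sketch: splitting
32-bit instruction words into four byte streams (the first step
described for SAMC), and merging a frequently occurring pair of
adjacent opcodes into one augmented opcode (the idea behind SADC). The
function names and the "most frequent pair" selection rule are
assumptions for illustration; the paper's actual Markov models,
arithmetic coder, and dictionary construction are not reproduced here.

```python
from collections import Counter


def split_streams(words):
    """Split 32-bit words into four 8-bit streams (stream 0 = top byte)."""
    streams = [[], [], [], []]
    for w in words:
        for i in range(4):
            shift = 24 - 8 * i
            streams[i].append((w >> shift) & 0xFF)
    return streams


def most_common_pair(opcodes):
    """Return the most frequent pair of adjacent opcodes, or None."""
    pairs = Counter(zip(opcodes, opcodes[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(opcodes, pair, new_symbol):
    """Replace each non-overlapping occurrence of pair by new_symbol."""
    out, i = [], 0
    while i < len(opcodes):
        if i + 1 < len(opcodes) and (opcodes[i], opcodes[i + 1]) == pair:
            out.append(new_symbol)   # augmented opcode stands for both
            i += 2
        else:
            out.append(opcodes[i])
            i += 1
    return out
```

For example, in the sequence `lw add sw lw add beq` the pair
`(lw, add)` occurs twice, so merging it shortens the opcode stream from
six symbols to four.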
Title: Customized Instruction-Sets for Embedded Processors
Summary: The Instruction Set Architecture (ISA) is the most
visible aspect of a processor. It is the contract between
the hardware and the software. The major motivation for breaking
the ISA is that doing so can lead to performance or
performance/price gains.
There are 5 barriers that stand in the way of breaking the ISA:
1. Existing binaries barrier
2. Toolchain development and maintenance costs
3. Lost savings / higher chip cost due to lower volumes
4. Hardware development costs: each variant processor needs a new
chip design.
5. The product development cycle for embedded products
The paper discusses why each of the above is a barrier and offers
some solutions. It then outlines the factors that the author
believes will cause ISAs to become performance-driven families of
what are now incompatible ISAs.
Title: Instruction Fetch Energy Reduction Using Loop
Caches For Embedded Applications with Small Tight Loops
Summary: This paper proposes using a small instruction buffer
called a loop cache to reduce instruction fetch energy when
executing tight program loops. Instructions are supplied to the
execution core either from the loop cache or from the main cache.
The proposed technique is based on a special class of branch
instructions called the short backward branch (sbb). When an sbb is
detected in the instruction stream and found to be taken, the loop
cache controller assumes that the second iteration of a program
loop is starting and tries to fill the loop cache. From the third
iteration onwards, the controller directs all instruction requests
to the loop cache and shuts off the main cache completely.
The loop cache controller knows, well ahead of time, precisely
when the next instruction request will hit in the loop cache.
The loop cache has no address tag store and can be implemented as a
direct-mapped array. There is no valid bit associated with each
loop cache entry. It does not require program loops to be aligned
to any particular address boundary. Most importantly, there is
no cycle count penalty or cycle time degradation associated with
this technique.
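The controller behavior described above can be sketched as a
three-state machine driven by an instruction trace. This is a
simplified model under stated assumptions: the state names, the trace
format `(pc, is_sbb, taken)`, and the rule that a not-taken sbb returns
the controller to idle are illustrative choices, and loop-cache
capacity and other changes of control flow are ignored.

```python
IDLE, FILL, ACTIVE = "idle", "fill", "active"


def simulate(trace):
    """Count fetches served by the main cache and by the loop cache."""
    state = IDLE
    main_fetches = loop_fetches = 0
    for pc, is_sbb, taken in trace:
        if state == ACTIVE:
            loop_fetches += 1      # main cache is shut off
        else:
            main_fetches += 1      # IDLE or FILL: fetch from main cache
        if is_sbb:
            if not taken:
                state = IDLE       # loop exits
            elif state == IDLE:
                state = FILL       # second iteration: fill the loop cache
            elif state == FILL:
                state = ACTIVE     # third iteration onwards
    return main_fetches, loop_fetches


def loop_trace(body_len, iterations):
    """Build a trace for one loop whose last instruction is the sbb."""
    trace = []
    for it in range(iterations):
        for pc in range(body_len):
            is_sbb = pc == body_len - 1
            taken = is_sbb and it < iterations - 1
            trace.append((pc, is_sbb, taken))
    return trace
```

For a 3-instruction loop executed 4 times, the first two iterations are
fetched from the main cache and the last two from the loop cache, so
the longer a loop runs, the larger the share of fetches that avoid the
main cache.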
Title: Selective Instruction Compression for Memory
Energy Reduction in Embedded Systems
Summary: The authors propose a technique for reducing the energy
required to execute firmware code on embedded systems. The main
idea is that firmware running on a given embedded processor
uses only a small subset of the instructions supported by the
processor. By replacing such instructions with binary patterns of
limited width (i.e. log2 N), memory bandwidth usage can be reduced,
thus decreasing the total energy. Each time an instruction is
fetched from memory, it is first decompressed by means of an
instruction decompression table and then passed to the processor's
decoding logic. The authors propose to compress only a subset of
fixed cardinality (256 elements) of the instructions used in the
program. The paper discusses four architectural schemes for the
compression. Based on experimental results, the authors state that
dynamic memory utilization is substantially improved for all the
compression schemes. The main advantage of their scheme is that it
does not require any modification of the processor, since it always
executes full-size instructions.
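A minimal sketch of the table-based scheme, assuming the table is
filled with the most frequently occurring instruction words (the
selection criterion is an assumption, as are the function names and the
`("idx", i)` / `("raw", word)` tagging). How compressed and
uncompressed words are told apart in memory is one of the things the
paper's four architectural schemes address and is not modeled here.

```python
from collections import Counter


def build_table(program, size=256):
    """Pick the `size` most frequently used instruction words."""
    return [word for word, _ in Counter(program).most_common(size)]


def compress(program, table):
    """Replace words found in the table by their index; keep the rest."""
    index = {word: i for i, word in enumerate(table)}
    return [("idx", index[w]) if w in index else ("raw", w) for w in program]


def decompress(stream, table):
    """Table lookup done on each fetch, before the processor's decoder."""
    return [table[v] if kind == "idx" else v for kind, v in stream]


def fetched_bits(stream, table, word_bits=32):
    """Bits read from memory: ceil(log2 N) per compressed word."""
    index_bits = max(1, (len(table) - 1).bit_length())
    return sum(index_bits if kind == "idx" else word_bits
               for kind, _ in stream)
```

With a 256-entry table each compressed instruction is fetched as 8 bits
instead of 32, while the processor still receives the full-size word
after the lookup.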
Title: Address Bus Encoding Techniques for System-Level Power
Optimization
Summary: This paper analyzes different bus encoding schemes with
regard to the average number of bus line transitions per clock
cycle. The authors first discuss the Bus Invert and T0 encodings,
and then describe the use of mixed encodings. The combination of
Bus Invert and T0 encoding resulted in the lowest average number of
bus line transitions per clock cycle. Furthermore, they analyze the
impact of these encodings on power consumption. Their research
shows that there is a break-even point beyond which the mixed
encoding schemes reduce power consumption. At low capacitance, the
logic needed to implement the encoding circuitry consumes more
power than the encoding saves. As the capacitance increases, this
overhead becomes relatively smaller, and at some point using the
mixed encoding results in lower power consumption.
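The transition counts being compared can be illustrated with a small
model of plain binary, Bus Invert, and T0 encoding. This is a sketch
based on the commonly described forms of these codes (Bus Invert sends
the complemented word plus an extra invert line when more than half the
lines would toggle; T0 freezes the bus and raises an extra increment
line for in-sequence addresses). The initial bus state of zero and the
function names are assumptions, and the mixed encoding from the paper
is not modeled.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def binary_transitions(addrs):
    """Transitions when addresses are driven unencoded."""
    total, bus = 0, 0
    for a in addrs:
        total += hamming(bus, a)
        bus = a
    return total


def bus_invert_transitions(addrs, width=32):
    """Transitions with Bus Invert, counting the extra invert line."""
    mask = (1 << width) - 1
    total, bus, inv = 0, 0, 0
    for a in addrs:
        if hamming(bus, a) > width // 2:
            new_bus, new_inv = ~a & mask, 1   # send the complement
        else:
            new_bus, new_inv = a, 0
        total += hamming(bus, new_bus) + (inv ^ new_inv)
        bus, inv = new_bus, new_inv
    return total


def t0_transitions(addrs, stride=1):
    """Transitions with T0, counting the extra increment line."""
    total, bus, inc, prev = 0, 0, 0, None
    for a in addrs:
        if prev is not None and a == prev + stride:
            new_bus, new_inc = bus, 1         # bus lines stay frozen
        else:
            new_bus, new_inc = a, 0
        total += hamming(bus, new_bus) + (inc ^ new_inc)
        bus, inc, prev = new_bus, new_inc, a
    return total
```

On a sequential address stream T0 removes almost all transitions, while
Bus Invert helps most when consecutive values differ in many bits,
which is one reason a combination of the two can do better than either
alone.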
Title: Motorola's IP Interface Standard
Summary: Motorola's IP interface provides a standard similar to
the VSIA on-chip bus standard. However, Motorola's interface is
divided into a number of bus lines that correspond to different
portions of the bus interface, such as the main system bus, the
peripheral bus, interrupts, DMA control, etc. In particular, the
blue line describes the peripheral virtual component interface,
which is referred to as a gasket. This interface is a two-signal
handshake protocol that provides either a point-to-point connection
between a bus gasket and a VC or a one-to-many connection between a
single bus gasket and many VCs.
Title: VSIA's On-Chip Bus
Summary: The on-chip bus described in the VSIA document can be
classified into two different interfaces: the full Virtual
Component Interface (VCI) and the Peripheral Virtual Component
Interface (PVCI). The PVCI defines the interface between a
peripheral core and a bus wrapper. The bus wrapper is used to
interface the VC with the on-chip bus. This allows designers
to provide multiple bus wrappers for different standard buses. It
also allows VC integrators to design bus wrappers for proprietary
buses. The PVCI is a simple two-signal handshake protocol that
provides a point-to-point connection between a bus wrapper and a VC.
Title: Profile-Driven Program Synthesis for Evaluation of
System Power Dissipation
Summary: This paper provides algorithms for synthesizing
an execution trace from an original program's execution trace
such that the power consumption of the two traces is identical.
The simulation time (evaluation of power) is thus reduced. In
their approach, the authors use integer linear programming to find
a best-fit basic-block template for the original program. They then
select a set of matching instructions for each block, assign
operands, and allocate memory based on mathematical models. They
report a reduction in simulation time of 10 to 10,000 times while
keeping the error in estimated power below 5%.
Title: A Power Estimation Framework for Designing Low Power
Portable Video Applications
Summary: This paper presents a hierarchical, mixed-level
simulation environment targeted at data processing applications. In
particular, the authors use an MPEG encoder example to illustrate
their methodology. In their environment, they model each component
at both the functional level and a low structural level, using C
and VHDL. A designer can use the high-level model to check
functionality and capture intermediate data that is later applied
to the low-level structural models for accurate power and
performance metrics.
Title: Cycle-Accurate Simulation of Energy Consumption in
Embedded Systems
Summary: This paper gives a methodology for performing
cycle-accurate simulation (of power and function) of embedded
systems. The work focuses on a set of board-level components.
The authors assume that only the interface and the manufacturer's
datasheet are available to the designer for these discrete
components. They model the processor and components in a high-level
language and simulate them, capturing power consumption by
evaluating simple mathematical models for the cache, CPU, memory,
and so on.
Title: Memory Exploration for Low Power Embedded
Systems
Summary: This paper presents a memory exploration strategy
based on three performance metrics. They study cache size, line
size, set associativity and tiling. They outline an exhaustive
exploration algorithm that tries to find a set of values for the
parameters just mentioned that minimizes power consumption while
meeting timing constraints.
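An exhaustive exploration of this kind can be sketched as a loop over
every combination of the four parameters, keeping the lowest-energy
configuration that meets the timing constraint. The `toy_estimate` cost
model below is a made-up placeholder, not the paper's model, and the
function names and parameter values are likewise assumptions chosen
only to make the search loop runnable.

```python
from itertools import product


def explore(sizes, line_sizes, assocs, tiles, estimate, max_cycles):
    """Return (energy, config) of the best feasible point, or None."""
    best = None
    for cfg in product(sizes, line_sizes, assocs, tiles):
        cycles, energy = estimate(*cfg)
        if cycles > max_cycles:
            continue                      # violates the timing constraint
        if best is None or energy < best[0]:
            best = (energy, cfg)
    return best


def toy_estimate(size, line, assoc, tile):
    """Placeholder model: larger caches run faster but cost more energy."""
    cycles = 100000 // size + (10 * line) // assoc + tile
    energy = size * assoc + 4 * line + tile
    return cycles, energy
```

Calling `explore([64, 128], [4, 8], [1, 2], [1], toy_estimate, 1000)`
rejects every 64-byte configuration as too slow under the toy model and
returns the cheapest 128-byte one.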
Title: New Chips Move Networking onto Silicon
Summary: Networking technology today is moving towards faster,
less expensive, and more functional networks. The growing trend is
towards providing many networking functions via a single
internetworking chip rather than previous approaches that use
either multiple ASICs or software on general-purpose RISC
processors. Internetworking chips are integrated, specialized
chipsets optimized to perform high-level networking functions.
They take advantage of improvements in processor technology
that permit lower chip prices as well as more sophisticated
decision making in hardware. They are therefore customizable like
software but fast like ASICs.
The advantages of internetworking chips are that they help meet
the demand for faster, less expensive, and more functional networks
by offering better performance than software that runs on a
general-purpose processor. They also offer lower prices, faster
time to market, and more flexibility than ASICs. They provide a
standardized approach that could permit more interoperability. On
the other hand, they have the disadvantage that generic functions
hardwired into commoditized chipsets may not be sufficient to add
some of the specialized features that vendors want to incorporate
in their products. Also, networking vendors may not want to put
aside big investments in in-house ASIC design teams in favor of
buying off-the-shelf internetworking chips.
Title: Prototyping Networked Embedded Systems
Summary: The low-cost, consumer-oriented, fast time-to-market
mentality that dominates embedded system design today forces design
teams to use hardware-software codesign to cope with growing
design complexity. New codesign methodologies and tools must
support a key characteristic of next-generation embedded systems:
the capability to communicate over networks and adapt to
different operating environments.
The authors developed a co-synthesis method that makes the most
efficient assignment of tasks to either software or hardware. They
begin with an initial Java specification of the desired
functionality. Software profiling by the Java virtual machine then
identifies bottlenecks and computation-intensive tasks. The
designer then seeks candidates for hardware implementation based
on the profiling results and a reuse library of available hardware
components. The designer uses a high-level synthesis tool to
transform Java methods into VHDL. At the same time, the tool
generates an appropriate interface description for each hardware
block.
Title: Designing the Next Step in Internet
Applications
Summary: Web-enabled information appliances are showing up
in many places, but the challenge is supplying web pages with
dynamic content. In order to support dynamic content the authors
suggest the following steps:
1. Generate web content using an HTML editor of choice. For dynamic
content pages, proprietary tags are inserted where the dynamic
portions are to appear.
2. Use a conversion tool to convert the HTML pages, images, and
applets into embedded application source code.
3. Implement the generated shell routines that are specific
to the overall application running in the embedded device.
4. Compile and link the resulting source code.
The embedded server can support both static and dynamic web
content as well as form processing. By using a conversion tool,
designers spend their time implementing only the web portions
specific to their device. This results in very large savings in
application development time.
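The tag-and-convert flow in the steps above can be sketched as follows.
The tag syntax `<!--DYN name-->`, the function names, and the handler
dictionary are all hypothetical; the article's actual proprietary tags
and generated source code are not shown in this summary, so this only
illustrates the general idea of splitting a page into static text and
application-supplied routines.

```python
import re

# Hypothetical tag format marking where dynamic content goes.
TAG = re.compile(r"<!--DYN (\w+)-->")


def compile_page(html):
    """Conversion step: split a page into static and dynamic segments."""
    parts = TAG.split(html)
    # With one capture group, re.split alternates static text (even
    # positions) and captured tag names (odd positions).
    return [("dyn" if i % 2 else "static", p) for i, p in enumerate(parts)]


def render(compiled, handlers):
    """Serve a page: static text as-is, dynamic parts via handlers."""
    return "".join(
        handlers[p]() if kind == "dyn" else p for kind, p in compiled
    )
```

Here the handler functions play the role of the shell routines in step
3: the designer writes only those, for example
`render(compile_page("<p>Temp: <!--DYN temp--> C</p>"), {"temp": lambda: "21"})`.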
Title: Information Appliances: From Web Phones to Smart
Refrigerators
Summary: Information appliances (IAs) hide their operating
systems and computational abilities behind a task-oriented
interface. Using LANs, the Internet, and wireless technologies,
these embedded systems will provide connectivity to nearly every
kind of electronic device manufactured in the coming years. Some
categories of IA are thin clients, set-top web browsers, web
phones, smart cellular phones, and pagers. The embedded system
within all these devices performs the low-level protocol processing
and link control invisibly, enabling smart machines to talk to each
other. IA technologies are making it much easier not only to build
intelligence into a product but also to allow it to communicate
with other devices across a variety of infrastructures. The biggest
impact of IAs will be in the invisible realm (meter reading,
data capture, and industrial automation), which benefits from a
steady flow of information.