UCR CS220: Synthesis of Digital Systems


CS220 covers the synthesis and simulation of digital systems. Topics include synthesis at the system, behavioral, register-transfer, and logic levels; application-specific processors; simulation; and emerging system-on-a-chip design methodologies.

Course information

Instructor

Harry Hsieh (harry@cs.ucr.edu), Engineering Unit II, Room 339

Office hours: Tue and Wed, 11:00am-noon, or by appointment

Class meeting

Engineering Unit II, Room 315, TR 9:40AM-11AM

Textbooks

Recommended:
Giovanni De Micheli, Synthesis and Optimization of Digital Circuits, McGraw-Hill, 1994. ISBN: 0-07-016333-2
Hassoun and Sasao (editors), Logic Synthesis and Verification, Kluwer Academic Publishers, 2002. ISBN: 0-7923-7606-4

Additional reading will be distributed throughout the quarter.

Prerequisite

CS/EE120B (Digital Systems), CS141, CS161, and consent of instructor

Call # and units

12300, 4 units.

Grade

Examinations 60%, Research Project / Research Review 20%, Homework 15%, Attendance/Discussion 5%

Projects are to be done individually. The project topic must be chosen by the third week of class (Tue, 10/10). The research project involves individual research work carried out throughout the quarter. Weekly meetings will be held to make sure sufficient progress is being made. Two presentations and a final project report are required. The student gives a "progress" presentation to introduce the research work to the class, and a final project presentation concludes the project. The idea is that the result of the project, possibly with one more quarter of independent study, will have enough technical content for a conference or workshop publication.

Letter grades are assigned according to the usual 85/70/60/50 rule: 85% and above corresponds to an A, 70% and above to a B, 60% and above to a C, 50% and above to a D, and less than 50% to an F. +/- grades will be given. Curving may be done on individual items only if it helps the class. You are NOT competing against one another -- you can all earn A's (and that happens often), so work together and help each other to succeed.

 

Lecture Topics (schedule is tentative and subject to change)

| Date | Topic | Corresponding Reading | Lecture notes |
| --- | --- | --- | --- |
| Th 9/28 | Introduction to microelectronics and synthesis; Background: graphs, optimization | De Micheli, chapter 1 | pdf_6, pdf_2 |
| Tu 10/3 | Hardware Modeling | De Micheli, chapter 1, 3 | pdf_6, pdf_2 |
| Th 10/5 | Architectural Synthesis | De Micheli, chapter 3, 4 | pdf_6, pdf_2 |
| Tu 10/10 | Scheduling Algorithm; Project selection | De Micheli, chapter 5 | pdf_6, pdf_2 |
| Th 10/12 | Scheduling Algorithm, Resource Sharing, and Binding | De Micheli, chapter 5, 6 | pdf_6, pdf_2 |
| Tu 10/17 | Two-Level Combinational Logic Optimization; Homework #1 Due | De Micheli, chapter 7 | pdf_6, pdf_2 |
| Th 10/19 | Two-Level Combinational Logic Optimization | De Micheli, chapter 7 | pdf_6, pdf_2 |
| Tu 10/24 | Review; Homework #2 Due | | |
| Th 10/26 | Review | | |
| Tu 10/31 | Midterm Examination | | formulae and algorithm |
| Th 11/2 | Two-Level Combinational Logic Optimization | De Micheli, chapter 7 | pdf_6, pdf_2 |
| Tu 11/7 | Student Presentations | | |
| Th 11/9 | Two-Level Combinational Logic Optimization | De Micheli, chapter 7 | pdf_6, pdf_2 |
| Tu 11/14 | Multi-Level Combinational Logic Optimization | De Micheli, chapter 8 | pdf_6, pdf_2 |
| Th 11/16 | Multi-Level Combinational Logic Optimization Binding | De Micheli, chapter 8 | pdf_6, pdf_2 |
| Tu 11/21 | Sequential Logic Optimization, Cell-Library; Homework #3 Due | De Micheli, 9, 10 | pdf_6, pdf_2 |
| Th 11/23 | Happy Thanksgiving!!! | | |
| Tu 11/28 | System Level Design: Hardware Accelerator, Clock Partitioning, Parameter Tuning, Design Space Exploration | | Accelerators_6, Accelerators_2, Tuning_6, Tuning_2 |
| Th 11/30 | Formal Verification | | |
| Tu 12/5 | Student Presentation #2; Homework #4 Due | | |
| Th 12/7 | Midterm Examination #2 | | formulae and algorithm |

 

Homeworks

Individual Projects

1)     SystemC behavioral simulation with performance estimation

2)     Case study of Picture-in-Picture MPEG-2 Decoder running in MPSoC

3)     Communication implementation for Kahn Process Network applications in MPSoC

4)     Combinational RTL-gate correlation engine

5)     Formal and semi-formal technique for RTL-gate correlation

Detailed descriptions:

1)     SystemC behavioral simulation with performance estimation

System simulations using cycle-accurate or cycle-approximate processor models are too slow to simulate large SoC designs, such as an MPEG-2 decoder, in a reasonable amount of time. This limits developers' ability to explore different architectures and different source code transformations, especially for applications running on multiprocessor system-on-chip designs. Behavioral simulation runs much faster than simulation with processor models in SystemC, but it cannot measure the performance of the application running on the target platform. We can annotate the behavioral code with information for estimating the performance of the system. Such annotations may include the runtime of every basic block on different processors, the number of memory accesses of each kind, and delay information for various kinds of interconnections. We evaluate the result by comparing the simulation runtime and the estimated performance against the runtime and performance obtained from system simulation using cycle-accurate or cycle-approximate processor models.

a.      Build a SystemC behavioral model for a KPN application (compiled code)

b.     Simulate the KPN application in SystemC with cycle-accurate processor models and interconnection models

c.      Compare the performance and runtime of behavioral model (compiled code) and cycle-accurate model

d.     Figure out a list of annotations and calculations for performance estimation in behavioral model (compiled code)

e.      Annotate design with appropriate numbers and implement in behavioral model (compiled code) / SystemC kernel

f.       Estimate performance in the annotated behavioral model. Compare the performance and runtime to simulation with cycle-accurate processor models and interconnection models

g.      Try different KPN applications and different inputs
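The annotation idea in steps d through f can be sketched as follows. This is only a minimal illustration under stated assumptions, not the method the project is expected to arrive at: the names `BlockAnnotation` and `estimate_cycles`, the processor names, and all cycle counts are hypothetical. Each basic block carries an annotated per-processor runtime and per-kind memory access counts, and the estimate sums these, weighted by how often the behavioral run executed each block.

```python
from dataclasses import dataclass, field


@dataclass
class BlockAnnotation:
    """Hypothetical annotation attached to one basic block."""
    cycles: dict                                      # processor name -> cycles per execution
    mem_accesses: dict = field(default_factory=dict)  # access kind -> accesses per execution


def estimate_cycles(annotations, exec_counts, processor, mem_latency):
    """Estimate total cycles from basic-block execution counts.

    annotations: block id -> BlockAnnotation
    exec_counts: block id -> times the block ran in the behavioral simulation
    mem_latency: access kind -> assumed cycles per access
    """
    total = 0
    for block, count in exec_counts.items():
        ann = annotations[block]
        per_exec = ann.cycles[processor]
        per_exec += sum(n * mem_latency[kind] for kind, n in ann.mem_accesses.items())
        total += count * per_exec
    return total


# Made-up numbers for two basic blocks of one process.
annotations = {
    "bb0": BlockAnnotation(cycles={"cpu_a": 12, "cpu_b": 20}, mem_accesses={"local": 2}),
    "bb1": BlockAnnotation(cycles={"cpu_a": 5, "cpu_b": 7}, mem_accesses={"shared": 1}),
}
exec_counts = {"bb0": 100, "bb1": 40}
mem_latency = {"local": 1, "shared": 10}

estimated = estimate_cycles(annotations, exec_counts, "cpu_a", mem_latency)
```

With these made-up numbers, `bb0` costs 14 cycles per execution and `bb1` costs 15 on `cpu_a`, so the estimate is 100 * 14 + 40 * 15 = 2000 cycles. In the project, such an estimate would be compared against the cycle count reported by the cycle-accurate simulation.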

 

2)     Case study of Picture-in-Picture MPEG-2 Decoder running in MPSoC.

Multiprocessor systems are commonly used today to deliver multimedia solutions on embedded platforms. The architectures are usually application-specific in order to meet the performance, power, and area constraints of the application. In this case study, we would like to develop appropriate multiprocessor system architectures for a Picture-in-Picture MPEG-2 Decoder application, with 2, 4, and 8 processors, using a Virtex FPGA platform.

a.     Understand how to put soft processors on an FPGA and program them

b.     Understand what other components can be programmed onto the FPGA

c.     Understand the KPN application and create an appropriate architecture for 2, 4, and 8 processors

d.     Program the KPN application to run on the architectures

e.     Debug the KPN application running on the architectures

f.     Compare the performance / power / area of the KPN application running on the architectures

 

3)     Communication implementation for Kahn Process Network applications in MPSoC.

A number of programming methodologies have been proposed for developing applications on MPSoCs. The two main methodologies are the threading model (e.g. POSIX threads) and the message passing model (e.g. MPI, YAPI). Programming methodology and architecture are often considered inseparable. Applications written in the threading model are naturally mapped to shared memory with coherency support, and applications written in the message passing model are mapped to dedicated hardware FIFO queues or to shared memory with a software/hardware FIFO protocol. However, as both represent data sharing between processes, there may not be a real difference between the two methodologies. For applications written in the message passing model, such as Kahn Process Networks, we could convert data communicated through FIFO queues into shared-memory accesses. Since all data communication is explicit in a Kahn Process Network, analysis can be done on how the data is shared between processes. In this study, we try to develop several different ways to implement FIFO queues in shared memory. The implementations may differ in efficiency, hardware, coherency requirements, and usage constraints. Data flow analysis may be necessary to understand how the data read from and written to FIFO queues is generated and used. Transformations of the source code may be needed to match the implementations. System simulation can be used to show the results.

a.     Simulate the KPN application in SystemC with cycle-accurate processor models and interconnection models

b.     Understand TTL and how to implement FIFO queues in shared memory with a simple software FIFO protocol

c.     Perform source code analysis on the KPN application to understand how the data communicated using FIFO queues is used in the processes

d.     Come up with several different ways to implement FIFO queues in shared memory

e.     Manually or automatically change the source code to implement the FIFO queues in different ways

f.     Run system-level simulation and compare the performance of the implementations
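One simple way to implement a FIFO queue in shared memory with a software protocol is a circular buffer with separate read and write indices. The sketch below is only an illustration under stated assumptions: the class name `SharedMemoryFifo`, the memory layout, and the non-blocking `try_write`/`try_read` interface are hypothetical, the shared memory is modeled as a plain Python list, and a single producer and a single consumer are assumed. A real KPN read blocks until a token is available; here an empty queue simply returns `None`.

```python
class SharedMemoryFifo:
    """Single-producer, single-consumer FIFO laid out in a flat 'shared memory' list.

    Hypothetical layout: word 0 holds the read index, word 1 the write index,
    and the next capacity + 1 words hold the data slots. One slot is always
    left empty so that the full and empty states can be told apart.
    """

    HEADER = 2  # words used for the two indices

    def __init__(self, memory, base, capacity):
        self.mem = memory
        self.base = base
        self.slots = capacity + 1
        self.mem[base] = 0      # read index, written only by the consumer
        self.mem[base + 1] = 0  # write index, written only by the producer

    def try_write(self, token):
        """Store one token; return False if the queue is full."""
        rd, wr = self.mem[self.base], self.mem[self.base + 1]
        nxt = (wr + 1) % self.slots
        if nxt == rd:
            return False
        self.mem[self.base + self.HEADER + wr] = token
        self.mem[self.base + 1] = nxt  # publish the index after the data is stored
        return True

    def try_read(self):
        """Remove and return the oldest token, or None if the queue is empty."""
        rd, wr = self.mem[self.base], self.mem[self.base + 1]
        if rd == wr:
            return None
        token = self.mem[self.base + self.HEADER + rd]
        self.mem[self.base] = (rd + 1) % self.slots
        return token


memory = [0] * 16
fifo = SharedMemoryFifo(memory, base=4, capacity=3)
accepted = [fifo.try_write(t) for t in ("a", "b", "c", "d")]
drained = [fifo.try_read() for _ in range(4)]
```

Because each index is written by only one side, this variant needs no lock, but on real hardware it still depends on the ordering and coherency of the index and data writes. That dependence is exactly the kind of requirement the different implementations in this project may trade off.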

 

4)     Combinational RTL-gate correlation engine

At the RTL level, combinational circuits are usually modeled as programming statements such as arithmetic operations and conditional statements. After synthesis to the gate level, these objects are modeled as combinational gates and nets. Due to the optimizations present in state-of-the-art logic synthesis tools, a combinational object at the gate level may not have an exact correspondence at the RTL level. In this project, we assume sequential objects such as registers and latches are correlated exactly. Therefore, the remaining work is to correlate combinational blocks between sequential boundaries. The goal is to correlate an RTL statement to a minimum set of gates and a gate to a minimum set of RTL statements, given a typical industrial design. Different RTL language constructs are handled differently in synthesis, so they should be studied case by case in correlation.

a.      Study and understand the logic synthesis process and its common transformations (e.g. Synopsys Design Compiler).

b.     Study and understand the current gate-to-RTL correlation technique and its limitations.

c.      Study RTL-level HDL constructs and their synthesis algorithms for combinational objects, such as arithmetic operations, if-else statements, case statements, and assignments. For each category of objects, propose a mechanism for correlating them to gate-level objects. Devise a metric for correlation, keeping in mind that the mapping will not be exclusive (multiple RTL statements may correlate to the same gate). How do sharing and common sub-expression extraction affect the correlation?

d.     Correlate each individual gate to the RTL statements that may “have something to do with” the gate. Devise a metric for correlation.

e.      (bonus) Can we sacrifice optimality in design for 100% RTL-to-gate correlation? What’s the cost? Justify your answer.

Expected Product:

A correlation engine for RTL->gate on a statement-by-statement basis, and a correlation engine for gate->RTL on a gate-by-gate basis.
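As a starting point for thinking about a correlation metric, the sketch below scores an RTL statement against a gate by the overlap of the boundary signals they depend on. This is only one possible metric, chosen here for illustration and not prescribed by the project: the functions `jaccard` and `correlate`, the threshold, and the tiny example design are all hypothetical. It relies on the project's assumption that sequential boundaries are already correlated exactly, so signal names at those boundaries can be compared directly.

```python
def jaccard(a, b):
    """Overlap of two sets: |a & b| / |a | b|, with 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def correlate(rtl_support, gate_support, threshold=0.5):
    """Return (rtl_to_gates, gate_to_rtl) maps for pairs scoring >= threshold.

    rtl_support:  RTL statement id -> set of boundary signals it reads
    gate_support: gate id -> set of boundary signals in its fan-in cone
    """
    rtl_to_gates = {stmt: set() for stmt in rtl_support}
    gate_to_rtl = {gate: set() for gate in gate_support}
    for stmt, s_sup in rtl_support.items():
        for gate, g_sup in gate_support.items():
            if jaccard(s_sup, g_sup) >= threshold:
                rtl_to_gates[stmt].add(gate)
                gate_to_rtl[gate].add(stmt)
    return rtl_to_gates, gate_to_rtl


# Made-up example: two statements that share the signal "b".
rtl_support = {
    "y = a & b": {"a", "b"},
    "z = b | c": {"b", "c"},
}
gate_support = {
    "U1": {"a", "b"},       # gate implementing y
    "U2": {"b", "c"},       # gate implementing z
    "U3": {"a", "b", "c"},  # gate shared by both after optimization
}
rtl_to_gates, gate_to_rtl = correlate(rtl_support, gate_support)
```

In this example the shared gate `U3` correlates to both statements, which shows the non-exclusive mapping mentioned in step c. The two returned maps correspond to the statement-by-statement and gate-by-gate views of the expected product.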

 

5)     Formal and semi-formal technique for RTL-gate correlation

Current design correlation techniques rely on user-specified renaming rules and heuristic name changes rather than formal methods, because of performance concerns. However, this can introduce inaccuracy by correlating two objects that do not correspond exactly, which makes later verification and debugging error-prone. This project is to propose an effective yet efficient technique for verifying the correctness of the design correlation using formal or semi-formal methods. The idea is to start with correlation and apply formal and semi-formal methods locally to keep the complexity in check.

a.      Study and understand the logic synthesis process and its common transformations (e.g. Synopsys Design Compiler).

b.     Study the current gate-to-RTL correlation technique. Understand the limitations that may introduce inaccuracy in the correlation results.

c.      Study the efficiency and effectiveness of current equivalence checking engines (e.g. Synopsys Formality)

d.     Study possible techniques or their combinations that can be used to verify the correctness of the correlation including both formal and semi-formal techniques, which include but are not limited to local equivalence checking and random simulation.  Implement a simple tool/script and test it out on a large industrial design.

e.      (bonus) Analyze the theoretical value of correlation alone, and of correlation followed by equivalence checking. How much of the correlation must be declared “correct”, and how much more equivalence checking is needed, in order to declare equivalence in its entirety?

Expected Product:

Propose a practical solution for the verification of design correlation between different levels of abstraction. Implement a simple tool/script that can “call” the correlation and equivalence engines appropriately.
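Random simulation, one of the semi-formal techniques mentioned in step d, can be sketched as follows. This is only an illustration under stated assumptions: the function `random_simulation_check` and the three example functions are hypothetical, and each correlated object is modeled as a Python function from an input assignment to a single output bit. Finding no mismatch is evidence that a correlated pair matches, not a proof, so a real flow would follow up with local equivalence checking.

```python
import random


def random_simulation_check(rtl_fn, gate_fn, inputs, trials=1000, seed=0):
    """Compare two combinational functions on random input vectors.

    Returns None if no mismatch was found, otherwise the first input
    assignment on which the two functions disagree.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        vector = {name: rng.randint(0, 1) for name in inputs}
        if rtl_fn(vector) != gate_fn(vector):
            return vector
    return None


# Hypothetical correlated pair: an RTL mux and its gate-level form.
def rtl_mux(v):
    return v["a"] if v["sel"] else v["b"]


def gate_mux(v):
    return (v["sel"] & v["a"]) | ((1 - v["sel"]) & v["b"])


# A wrongly correlated gate cone: the select polarity is inverted.
def gate_wrong(v):
    return (v["sel"] & v["b"]) | ((1 - v["sel"]) & v["a"])


good = random_simulation_check(rtl_mux, gate_mux, ["a", "b", "sel"])
bad = random_simulation_check(rtl_mux, gate_wrong, ["a", "b", "sel"])
```

Here `good` is `None` because the two mux forms agree on every input, while `bad` holds a counterexample in which `a` and `b` differ. Applying such a check locally, between correlated sequential boundaries, is one way to keep the cost low on a large design.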