UCR EE/CS 120B Digital Systems
Winter 2000
Professor Frank Vahid

(Updated 3/7/00 10:30 a.m.: i++ changed to i--)

Homework 3

Due: 3/9, BEFORE lecture. You can turn it in before lecture begins at 11:10, or put it under the door of A207 before 11:00.

Remember to turn in a neat homework. Don't forget the required "This is my own original work" statement at the top. SHOW YOUR WORK for every problem. You may want to use online drawing tools for 1 and 2, since 2 is an extension of 1 and thus could just be copied and then extended.

1. Design a simple microprocessor.

The datapath has an ALU and a register file (RF), identical to those described in lecture on 2/29. In particular, the RF has two read ports A and B, one write port, and 8 registers, and the following control signals: wa, we, raa, rea, rab, reb.

The ALU has 8 operations and thus 3 input select lines alu2, alu1, alu0, with the operations for encodings 0 through 7 being:
0: no-operation
1: add
2: sub
3: and
4: or
5: not A
6: A+1
7: shift left A

The ALU has two status outputs, aluc (operation generated a carry) and aluz (result is zero). The RF and ALU are 8 bits wide.

The write input to the RF can come from the ALU, or from IR(7..0). A 2x1 mux selects with control rfs (1: IR, 0: ALU). The ALU output can be externally output by enabling a tri-state buffer using a signal oe.

We want to design the controller for the following instructions.
* Mov Rx, #constant -- Move a constant into register Rx (opcode 00001)
* Out Rx -- Externally output register Rx (opcode 00010)
* Add Rx, Ry -- Rx = Rx + Ry (opcode 00011)
* Inv Rx -- Rx = Rx' (opcode 00100)
* Sub Rx, Ry -- Rx = Rx - Ry (opcode 00101)
* Jz Rx, #offset -- PC = PC + offset if Rx==0 (Jump) (opcode 00110)
Note: offset is in two's complement, so jump could be backwards (negative).

(a) Draw the architecture of the microprocessor.
Include the datapath architecture (with ALU, mux, RF, and tri-state buffer components), but don't draw the internals of those components. Also include the controller architecture (with program ROM, IR, PC, control logic and state-register). Note: this part is nearly straight from 2/29's lecture, except that our PC can be loaded. Draw and label all signals VERY CLEARLY.

(b) Draw a Moore-type FSMD for the controller. The FSMD may include arithmetic operations in the states.

(c) Draw a pure Moore-type FSM for the controller. The pure FSM may only include bit operations, such as setting control signals and reading status signals.

(d) Convert this FSM to a state table in the form that we used earlier in the quarter (state bits on the left followed by control logic inputs, then next state bits on the right followed by control logic outputs). This table would be huge if you tried to complete it. So instead, just complete the first 3 rows. Make sure that the state encoding of 000 corresponds to the fetch state.

(e) Derive minimized logic for only the we control signal and the Q0n next-state signal. Derive the initial equation directly from your FSM (which tells you when those signals should be set to 1). Use algebraic techniques to minimize the logic (the problem is too big for K-maps).

(f) Draw this logic inside the combinational logic block of your microprocessor architecture.

You just built a simple microprocessor! Congratulations! Look it over carefully, note how it works.

(g) Write a small MACHINE program for your microprocessor to execute the following. First write an ASSEMBLY program, then translate to 0's and 1's (machine program):

for (i=20; i!=0; i--) R0 = R0 + 2

2. A typical microprocessor also has a memory for storing data.

(a) Extend your architecture from 1(a) to include a 256x8 RAM for connecting to the datapath.
Let's extend your instruction set with the following two instructions:
* Mov Rx, d -- Rx = DataMemory(d) (opcode 00111)
* Mov d, Rx -- DataMemory(d) = Rx (opcode 01000)

(b) Redraw your initial FSMD to include these instructions.

(c) Redraw your pure FSM to include these instructions.

(d) Rederive your minimized logic for only the we control signal and the Q0n next-state signal.

(e) Write a small machine program for your microprocessor to execute the following:

for (i=20; i!=0; i--) A = A + B

where A,B are locations 32,33 in DataMemory, respectively.

3. Binary multiplication can be achieved by using shifts and adds.

(a) Write an assembly program (not a machine program) to implement binary multiplication of two registers on the machine created above, using the assembly instructions listed above.

(b) Let's make multiplication faster. Let's create a new instruction "Mul Rx, Ry" that multiplies two numbers. When Mul is executed, the controller will carry out the shift and add algorithm. Thus, the Mul instruction may take many more cycles than a typical instruction. Redraw your FSMD to include the states needed to carry out the Mul instruction.

(c) Let's make multiplication even faster. ASSUME you have built a multiplier functional unit that requires 4 cycles. Redraw your datapath to include this functional unit, and then redraw your FSMD to use the functional unit.

4. Assume we design an architecture with a three-stage pipeline: fetch, decode and execute. Each stage requires only one cycle.

(a) Graphically show how 10 sequential instructions, numbered I0 through I9, flow through the pipeline.

(b) Assume that pipelining the architecture reduced the clock period by the ideal amount of 1/3 (since there are 3 stages). What is the speedup (# of non-pipelined cycles / # of pipelined cycles) for 5 instructions? 50? 500?
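
The cycle counting behind problem 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the assignment: it assumes the non-pipelined machine spends one cycle in each of the 3 stages per instruction, that the pipeline never stalls, and that the pipelined machine therefore needs 2 extra cycles to fill before finishing one instruction per cycle. The function names (nonpipelined_cycles, pipelined_cycles, speedup, pipeline_rows) are hypothetical and chosen just for this sketch.

```python
# Assumed stage names for the three-stage pipeline: fetch, decode, execute.
STAGES = ["F", "D", "E"]


def nonpipelined_cycles(n, stages=3):
    """Each instruction passes through every stage before the next one starts."""
    return n * stages


def pipelined_cycles(n, stages=3):
    """The first instruction takes `stages` cycles; each later one adds 1 cycle."""
    return n + stages - 1 if n > 0 else 0


def speedup(n, stages=3):
    """Ratio of non-pipelined cycles to pipelined cycles, as defined in 4(b)."""
    return nonpipelined_cycles(n, stages) / pipelined_cycles(n, stages)


def pipeline_rows(n):
    """One text row per instruction; column c corresponds to clock cycle c."""
    total = pipelined_cycles(n, len(STAGES))
    rows = []
    for i in range(n):
        cells = ["."] * total
        for s, name in enumerate(STAGES):
            cells[i + s] = name  # instruction i is in stage s during cycle i + s
        rows.append(f"I{i}: " + " ".join(cells))
    return rows


if __name__ == "__main__":
    for row in pipeline_rows(4):
        print(row)
    print(speedup(4))  # 12 non-pipelined cycles / 6 pipelined cycles = 2.0
```

Calling pipeline_rows with a larger count gives a text version of the kind of diagram asked for in 4(a), and speedup can be evaluated for any instruction count to see how the ratio behaves as the fill cost is amortized.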