CS 201

Project Description

Handed Out: Jan. 15, 2008.
You are to implement the Ball-Larus's efficient path profiling algorithm. Given the IR for a piece of code, you should be able to instrument the IR in such a way that executing the instrumented IR code collects the frequencies of Ball-Larus paths in the original IR code. Given a function, Ball-Larus paths are a set of acyclic control flow paths that start at the function ENTRY node or a LOOP ENTRY node and terminate at the function EXIT node or a loop BACKEDGE. For more precise definition of BL path, please refer to the following paper. You are not required to implement the optimizations described in the paper. However, your program should be able to identify all the BL paths and the instrumentation should be able to collect the execution frequencies of the BL paths. You will carry out this project in two steps:
  1. In the first step you will read the intermediate code provided in an input file and then construct the corresponding Control Flow Graph. In addition, you will carry out the analysis needed to identify the loops in the control flow graph. (Due date: Feb. 5, 2008)
  2. In the second step you will instrument the intermediate code and collect path profiles by running the instrumented code. (Due date: March 11, 2008)
More details about the languages, tools, and output requirements can be found in the following:

Reporting problems

Please send email to vijay@cs.ucr.edu. Thanks.

Mini C language

This language supports a subset of the features of C language. This is an example of Mini C program.
#define read(x)         scanf("%d", &(x))
#define print(x)        printf("%d\n", (x))

int main()
{
    int a[2], n, m, k, j, i, s0, s1, s2, s3, s4;

    read(a[0]);
    n = a[0];
    m = a[0];
    read(a[1]);
    k = a[1];
    j = 1;
    while ( n > 0 ) {
        s3 = k + a[j];
        j = j - 1;

        s1 = a[j] + m;
        if ( n % 2 == 0 ) {
            m = -m;
            s0 = a[j] + m;
        }
        else {
            s0 = a[j] + m;
            m = -m;
        }
        s2 = a[j] + m;
        n = n - 1;

        j = j + 1;
        s4 = k + a[j];
    }
    print(s0);
    print(s1);
    print(s2);
    print(s3);
    print(s4);
}

The Intermediate Representation (IR) Language

Below is a list of some constructs of the intermediate language we will use in our projects this semester. This intermediate language is not designed to be a complete one (for example, function definitions and calls are omitted for now; more constructs will be added later if necessary); it is intended to be simple to parse, analyze, optimize, and generate. With this in mind, you should not be surprised that one line can only contain one such statement, including label declaration statements.

The following is the IR code for the above example:

        .[]     _a, 2
        .       _n
        .       _m
        .       _k
        .       _j
        .       _i
        .       _s0
        .       _s1
        .       _s2
        .       _s3
        .       _s4
        .       t1
        .       t2
        .       t4
        .       t5
        .       t6
        .       t7
        .       t8
        .       t9
        .       t10
        .       t11
        .       t12
        .       t13
        .       t14
        .       t15
        .       t16
        .       t17
        .       t18
        .       t19
        .       t20
        .       t21
        .       t22
        .       p0
: START
        .[]<    _a, 0
        =[]     t1, _a, 0
        =       _n, t1
        =[]     t2, _a, 0
        =       _m, t2
        .[]<    _a, 1
        =[]     t4, _a, 1
        =       _k, t4
        =       _j, 1
: L2
        <=      p0, _n, 0
        ?:=     L3, p0
        =[]     t5, _a, _j
        +       t6, _k, t5
        =       _s3, t6
        -       t7, _j, 1
        =       _j, t7
        =[]     t8, _a, _j
        +       t9, t8, _m
        =       _s1, t9
        %       t10, _n, 2
        !=      p0, t10, 0
        ?:=     L0, p0
        -       t11, 0, _m
        =       _m, t11
        =[]     t12, _a, _j
        +       t13, t12, _m
        =       _s0, t13
        :=      L1
: L0
        =[]     t14, _a, _j
        +       t15, t14, _m
        =       _s0, t15
        -       t16, 0, _m
        =       _m, t16
: L1
        =[]     t17, _a, _j
        +       t18, t17, _m
        =       _s2, t18
        -       t19, _n, 1
        =       _n, t19
        +       t20, _j, 1
        =       _j, t20
        =[]     t21, _a, _j
        +       t22, _k, t21
        =       _s4, t22
        :=      L2
: L3
        .>      _s0
        .>      _s1
        .>      _s2
        .>      _s3
        .>      _s4

Specifications of IR language

Syntax Semantics
Variable declaration statements:
. name declares a name for a scalar variable
.[] name, n declares a name for a vector/array variable consisting of n (must be a positive whole number) elements, with name[0] being the first element
Copy statements:
= dst, src dst = src, src can be an immediate
array access statements:
=[] dst, src, index dst = src[index], index can be an immediate
[]= dst, index, src dst[index] = src, index can be an immediate
Input/Output statements:
.< dst read a value into dst from the standard input
.[]< dst, index read a value into dst[index] from the standard input
.> src write the value of src into the standard output
.[]> src, index write the value of src[index] into the standard output
fixed-point arithmetic op statements(one or both source operands can be immediates):
+ dst, src1, src2 dst = src1 + src2
- dst, src1, src2 dst = src1 - src2
* dst, src1, src2 dst = src1 * src2
/ dst, src1, src2 dst = src1 / src2
% dst, src1, src2 dst = src1 % src2
comparison statements(one or both source operands can be immediates):
< dst, src1, src2 dst = src1 < src2
<= dst, src1, src2 dst = src1 <= src2
!= dst, src1, src2 dst = src1 != src2
== dst, src1, src2 dst = src1 == src2
>= dst, src1, src2 dst = src1 >= src2
> dst, src1, src2 dst = src1 > src2
logical-op statements(one or both source operands can be immediates):
|| dst, src1, src2 logical OR, dst = src1 || src2
&& dst, src1, src2 logical AND, dst = src1 && src2
! dst, src logical NOT, dst = ! src
label declaration statements:
: label declares a label
branch statements:
:= label goto label
?:= label, predicate if predicate is true(1) goto label

Getting the Tools

The binaries are linux-x86 executables.

Download the front end (c2ir) which turns Mini C code into IR code.

Download the interpreter (irRun) which executes IR code. The interpreter executes code written in the mini-intermediate-represeantation language. It assumes that

Statistics are written in a file named "milRun.stat" in the current directory.

Download test cases in pp_test.tar.gz and use them to test/debug your program.

Output Requirements