cs152 Project Phase 3: Code Generation

Due Date: Wednesday, March 11, 2009 at 11:59 p.m.

Grade Weight: 15% of total course grade

The project must be completed individually; you are not allowed to collaborate with your classmates. Any collaboration or code sharing (including code submitted in past quarters) will be considered a violation of academic integrity and dealt with as set forth in the academic integrity policy.

Overview

In the previous phases of the class project, you used the flex and bison tools to create a lexical analyzer and a parser for the "MINI-L" programming language. In this phase of the class project, you will take a syntactically correct MINI-L program (a program that can be parsed without any syntax errors), verify that it has no semantic errors, and then generate its corresponding intermediate code. The generated code can then be executed (using a tool we will provide) to run the compiled MINI-L program.

[The MINI-L source-code language is described in detail here (this is the same as for the prior project phases).]

You should perform one-pass code generation and directly output the generated code. There is no need to build/traverse a syntax tree. However, you will need to maintain a symbol table during code generation.
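As an illustration of the symbol table mentioned above, here is a minimal sketch in C using a linked list. The names (Symbol, SymKind, sym_lookup, sym_insert) and the choice of a linked list are assumptions made for this example, not part of the project specification. Which semantic checks you build on top of the table is determined by the MINI-L language description.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol kinds; adjust to what the MINI-L description requires. */
typedef enum { SYM_INT, SYM_ARRAY } SymKind;

typedef struct Symbol {
    char *name;
    SymKind kind;
    int size;              /* element count for arrays, 0 for scalars */
    struct Symbol *next;
} Symbol;

static Symbol *symbols = NULL;   /* head of the linked list */

/* Return the entry for `name`, or NULL if it has not been declared. */
Symbol *sym_lookup(const char *name) {
    assert(name != NULL);
    for (Symbol *s = symbols; s != NULL; s = s->next) {
        if (strcmp(s->name, name) == 0) return s;
    }
    return NULL;
}

/* Add a declaration. Returns 1 on success, 0 if `name` is already
   present (the caller can report this) or if allocation fails. */
int sym_insert(const char *name, SymKind kind, int size) {
    if (sym_lookup(name) != NULL) return 0;
    Symbol *s = malloc(sizeof *s);
    if (s == NULL) return 0;
    s->name = malloc(strlen(name) + 1);
    if (s->name == NULL) { free(s); return 0; }
    strcpy(s->name, name);
    s->kind = kind;
    s->size = size;
    s->next = symbols;
    symbols = s;
    return 1;
}
```

In a bison action, sym_insert could be called when a declaration is reduced and sym_lookup when an identifier is used, so that both checks happen during the single parsing pass.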

The intermediate code you will generate is called "MIL" code. We will provide you with an interpreter called mil_run that can be used to execute the MIL code.

[The MIL intermediate code representation is described in detail here.]
[Download the mil_run MIL interpreter here.]

The output of your code generator will be a file containing the generated MIL code. If any semantic errors are encountered by the code generator, then appropriate error messages should be emitted and no other output should be produced.
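One possible way to guarantee that no code is written when a semantic error occurs is to accumulate the generated code in memory and write the file only at the end. The sketch below assumes this buffering strategy. The function names (emit, semantic_error, flush_code) and the error message format are hypothetical, and the instruction text passed to emit is treated as an opaque string.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char  *code_buf = NULL;  /* generated code accumulated so far */
static size_t code_len = 0;     /* bytes used, not counting the final NUL */
static size_t code_cap = 0;     /* bytes allocated */
static int    error_count = 0;  /* semantic errors reported so far */

/* Report a semantic error; the message format here is a placeholder. */
void semantic_error(int line, const char *msg) {
    fprintf(stderr, "Error line %d: %s\n", line, msg);
    error_count++;
}

/* Append one instruction, followed by a newline, to the buffer. */
void emit(const char *instr) {
    assert(instr != NULL);
    size_t n = strlen(instr);
    if (code_len + n + 2 > code_cap) {       /* room for '\n' and NUL */
        size_t cap = (code_len + n + 2) * 2;
        char *p = realloc(code_buf, cap);
        if (p == NULL) { perror("realloc"); exit(EXIT_FAILURE); }
        code_buf = p;
        code_cap = cap;
    }
    memcpy(code_buf + code_len, instr, n);
    code_len += n;
    code_buf[code_len++] = '\n';
    code_buf[code_len] = '\0';
}

/* Write the buffer to `path` only if no errors were reported.
   Returns 1 if the file was written, 0 otherwise. */
int flush_code(const char *path) {
    if (error_count > 0) return 0;
    FILE *f = fopen(path, "w");
    if (f == NULL) { perror(path); return 0; }
    if (code_len > 0) fwrite(code_buf, 1, code_len, f);
    fclose(f);
    return 1;
}
```

With this design, flush_code would be called once after parsing finishes, so a semantic error found late in the program still prevents any output file from being created.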

[The required output format for your code generator is described here.]

The `mil_run` MIL interpreter

We are providing an interpreter for MIL intermediate code (mil_run), which can be used to execute the MIL code generated by your code generator. The mil_run interpreter takes as its argument the name of a file containing the MIL code to execute. For example, if your MIL code is in a file called mil_code.mil, you can execute it with the following command:

mil_run mil_code.mil

If the MIL code itself requires input data, the data can be written to a file and redirected to the executing MIL code. For example, if the input values are written to a text file called input.txt, they can be passed to the executing MIL program as follows:

mil_run mil_code.mil < input.txt

The mil_run interpreter will generate a file called milRun.stat that contains some statistics about the MIL code that was just executed. You may ignore this file.

mil_run makes the following assumptions.

Each line in the MIL file contains at most one MIL instruction
Each line is at most 254 characters long
All variables are defined before they are used

You must ensure that your generated MIL code meets the above three requirements.
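The first two requirements can be checked mechanically before an instruction is written out. The following is a small sketch of such a check. The helper name mil_line_ok is hypothetical, and the code assumes that the 254-character limit does not count the terminating newline; if in doubt, keep lines well below the limit.

```c
#include <assert.h>
#include <string.h>

/* Limit stated for mil_run; assumed here to exclude the newline. */
#define MIL_MAX_LINE 254

/* Return 1 if `instr` can be written as a single MIL line, 0 otherwise. */
int mil_line_ok(const char *instr) {
    assert(instr != NULL);
    if (strchr(instr, '\n') != NULL) return 0;   /* would span two lines */
    return strlen(instr) <= MIL_MAX_LINE;
}
```

A check like this could be placed inside whatever routine writes instructions, so that a violation is caught in your compiler rather than showing up as unexpected behavior in mil_run.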

mil_run is a Linux executable and can be run on bass.

Detailed Requirements

The following tasks will need to be performed to complete this phase of the project.

You will need to modify your bison specification file from the previous phase of the class project so that it no longer outputs the list of productions taken during parsing.
Implement the code generator. This will most likely require some enhancements to your bison specification file. You may also want to create additional implementation files. The requirements for your implementation are as follows.
1. You do not need to do anything special to handle lexical or syntax errors in this phase of the class project. If any lexical or syntax errors are encountered, your compiler should emit appropriate error message(s) and terminate the same way as was done in previous phases.
2. You need to check for semantic errors in the input MINI-L program. During code generation, if any semantic errors are encountered, then appropriate error messages should be emitted and no other output should be produced (i.e., no code should be generated).
3. If no semantic errors are encountered, then the appropriate MIL intermediate code should be generated and written to a file called XXXX.mil, where XXXX is the name of the MINI-L program as specified in the program declaration statement within the MINI-L source code.
4. When generating the intermediate code, be careful that you do not accidentally create a temporary variable with the same name as one of the variables specified in the original MINI-L program.
Compile everything together into a single executable. The particular commands needed to compile your code generator will depend on the implementation files you create.
Use the mil_run MIL interpreter to test your implementation. For each program written in MINI-L source code, compile it down to MIL code using your implementation. Then invoke the MIL code using mil_run to verify that the compiled program behaves as expected.
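Item 4 above warns against temporary variables that clash with names from the MINI-L program. One way to avoid a clash is to test each candidate name against the set of user-declared names and skip any that are taken. The sketch below assumes that approach, and also assumes that all user declarations have been registered before the first temporary is requested. The names (register_user_name, name_taken, new_temp), the fixed-size array, and the "t" prefix are hypothetical choices for illustration; in a real implementation the lookup would consult your symbol table.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 256

/* User-declared names; the pointers must stay valid while in use. */
static const char *user_names[MAX_NAMES];
static int user_count = 0;
static unsigned next_temp = 0;   /* counter for candidate temporaries */

/* Record a name declared in the source program. Returns 0 if full. */
int register_user_name(const char *name) {
    assert(name != NULL);
    if (user_count >= MAX_NAMES) return 0;
    user_names[user_count++] = name;
    return 1;
}

/* Return 1 if `name` was declared in the source program. */
int name_taken(const char *name) {
    for (int i = 0; i < user_count; i++) {
        if (strcmp(user_names[i], name) == 0) return 1;
    }
    return 0;
}

/* Write a fresh temporary name into `buf`, skipping any candidate
   that collides with a user-declared name. */
void new_temp(char *buf, size_t buflen) {
    assert(buf != NULL && buflen >= 16);
    do {
        snprintf(buf, buflen, "t%u", next_temp++);
    } while (name_taken(buf));
}
```

Because the counter only increases, two temporaries never share a name, and a user variable that happens to be called t0 or t2 is simply skipped.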

A Note about Runtime Errors

Some errors cannot always be detected at compile time and may only occur at run time, such as array index out-of-bounds errors and division by zero. Your implementation need not handle these errors. You may assume that when we grade your programs, we will not use any MINI-L programs that would lead to run-time errors. Note also that the mil_run MIL interpreter we are providing may behave unexpectedly if you run it on a program that leads to run-time problems (such as an out-of-bounds array access). Thus, when you are testing your implementation, try to make sure your MINI-L programs will not cause any run-time errors.

Example Usage

Suppose your code generator is in the executable named my_compiler. Then for the MINI-L program primes.min (which is syntactically and semantically correct), your code generator should be invoked as follows:

cat primes.min | my_compiler

The file primes.mil should then be created and should contain the generated MIL code (it is okay if your generated code looks slightly different, but it should have the same behavior when executed). Next, you can test your generated code using the mil_run MIL interpreter to ensure that it behaves as expected. Suppose we want to run the compiled primes program with input "100":


echo 100 > input.txt 

mil_run primes.mil < input.txt

When the compiled primes program is executed with the above input "100", all the prime numbers up to 100 are printed to the screen.

Submission instructions

Submit via Moodle. Turn in your source code and the Makefile (or a build script) in a .tar.gz or .zip file. Make sure you can uncompress and compile your submission on bass; otherwise, your score for this project will be 0.

Extra credit for implementing your project in OCaml

Similar to projects 1 and 2, if you implement this project in Objective Caml (an ML variant that supports functional, object-oriented, and imperative programming) you will receive 10% extra credit on this project's score.

We have installed OCaml on bass. It comes with its own parser generator, ocamlyacc (similar to bison), and an impressive array of libraries.

If you are having technical difficulties with OCaml, contact the instructor. The TAs do not have the time or resources to help students with the OCaml implementation.

OCaml resources

The OCaml manual
Jason Hickey's book

Implementing the project in a functional language other than OCaml

This might be possible, but check with the instructor first.