CS152 Project Phase 1: Lexical Analyzer Generation Using flex

Due Date: Wednesday, Jan 28, 2009 at 11:59 p.m.

Grade Weight: 10% of total course grade

The project must be completed individually; you are not allowed to collaborate with your classmates. Any collaboration or code sharing (including code submitted in past quarters) will be considered a violation of academic integrity and dealt with as set forth in the academic integrity policy.

Overview

For this first part of the class project, you will use the flex tool to generate a lexical analyzer for a high-level programming language called "MINI-L". The lexical analyzer should take a MINI-L program as input, scan it, and output the sequence of lexical tokens that make up the program.

[The MINI-L language is described in detail here.]
[The required output format for your lexical analyzer is described here.]


Flex

Flex is a tool for generating lexical analyzers. Lexical analyzers scan text (a sequence of characters) and look for lexical patterns in the text. Flex takes as input a specification file that describes the lexical analyzer to generate. From this specification, flex automatically creates a C program (called lex.yy.c) that performs the lexical analysis.
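
To make "scan text and look for lexical patterns" concrete, here is a minimal hand-written sketch in C of the kind of work a lexical analyzer performs. It is only an illustration, not the code flex emits (flex generates a table-driven scanner from your patterns), and the token classes TOK_NUMBER, TOK_IDENT, and TOK_SYMBOL, as well as the function name next_token, are hypothetical names chosen for this example rather than the MINI-L token set or output format.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token classes for illustration; NOT the MINI-L token set. */
typedef enum { TOK_END, TOK_NUMBER, TOK_IDENT, TOK_SYMBOL } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* first character of the matched text */
    size_t length;      /* number of characters matched */
} Token;

/* Scan one token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor) {
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;
    Token tok;

    /* Skip whitespace between tokens. */
    while (*p != '\0' && isspace((unsigned char)*p)) p++;

    tok.start = p;
    if (*p == '\0') {
        tok.kind = TOK_END;                      /* end of input */
    } else if (isdigit((unsigned char)*p)) {
        /* Pattern: one or more digits, like [0-9]+ */
        while (isdigit((unsigned char)*p)) p++;
        tok.kind = TOK_NUMBER;
    } else if (isalpha((unsigned char)*p)) {
        /* Pattern: a letter followed by letters or digits */
        while (isalnum((unsigned char)*p)) p++;
        tok.kind = TOK_IDENT;
    } else {
        /* Anything else is treated as a one-character symbol. */
        p++;
        tok.kind = TOK_SYMBOL;
    }

    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}
```

Calling next_token repeatedly on a string until it returns TOK_END yields the token sequence for that string. In a flex specification, each branch of the if/else chain above would instead be written as a pattern with an associated action, and flex would generate the matching code for you.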

In our department, flex is installed and can be used on the machine "bass".

[A brief introduction to flex can be found here.]
[The detailed manual for flex can be found here.]


Detailed Requirements

The following tasks will need to be performed to complete this phase of the project.
  1. Write the specification for a flex lexical analyzer for the MINI-L language. For this phase of the project, your lexical analyzer need only output the list of tokens identified in an input MINI-L program.
    Example: write the flex specification in a file named mini_l.lex.
  2. Run flex to generate the lexical analyzer for MINI-L using your specification.
    Example: execute the command flex mini_l.lex. This will create a file called lex.yy.c in the current directory.
  3. Compile your MINI-L lexical analyzer. This will require the -lfl flag for gcc, which links against the flex library.
    Example: compile your lexical analyzer into the executable lexer with the following command: gcc -o lexer lex.yy.c -lfl. The program lexer should now be able to convert an input MINI-L program into the corresponding list of tokens.

Example Usage

Suppose your lexical analyzer is in the executable named lexer. Then for the MINI-L program primes.min, your lexical analyzer should be invoked as follows:

cat primes.min | lexer

The list of tokens output by your lexical analyzer should then appear as it does here. The tokens can be printed to the screen (standard output).

Another example: for the program mytest.min, the output tokens should look like this.


Submission instructions

Submit via Moodle. Turn in your source code and the Makefile (or a build script) in a .tar.gz or .zip file. Make sure you can uncompress and compile your submission on bass; otherwise, your score for this project will be 0.


Extra credit for implementing your project in OCaml

If you implement this project in Objective Caml (an ML variant that supports functional, object-oriented, and imperative programming) you will receive 10% extra credit on this project's score.

Functional languages are an excellent choice for writing a compiler: they feature strong static typing, pattern matching, algebraic (variant) data types, parametric polymorphism, type inference, and automatic garbage collection. These traits permit rapid program development and lead to a concise and elegant implementation.

We have installed OCaml on bass. It comes with its own lexer generator, ocamllex (similar to flex), and an impressive array of libraries.

OCaml resources

Implementing the project in a functional language other than OCaml

This might be possible, but check with the instructor first.