FLEX Tutorial
Lan Gao

What is a Scanner?
The scanner performs lexical analysis of a program (in our case, a Simple program). It reads the source program as a sequence of characters and recognizes "larger" textual units called tokens. For example, if the source program contains the characters

VAR ics142: INTEGER; // variable declaration

the scanner would produce the tokens

VAR  ID(ics142)  COLON  ID(INTEGER)  SEMICOLON

to be processed in later phases of the compiler. Note that the scanner discards white space and comments between the tokens, i.e. they are "filtered" and not passed on to later phases. Examples of nontokens are tabs, line feeds, carriage returns, etc.


How to use FLEX?
FLEX (Fast LEXical analyzer generator) is a tool for generating scanners. Instead of writing a scanner from scratch, you only need to identify the vocabulary of a language (e.g. Simple) and write a specification of its patterns using regular expressions (e.g. DIGIT [0-9]); FLEX will then construct a scanner for you. FLEX is generally used as follows:

First, FLEX reads a specification of a scanner, either from an input file *.lex or from standard input, and generates as output a C source file, lex.yy.c. Then lex.yy.c is compiled and linked with the flex library (the "-lfl" linker option) to produce an executable, a.out. Finally, a.out analyzes its input stream and transforms it into a sequence of tokens.

A *.lex file consists mainly of pairs of regular expressions and C code. (sample1.lex, sample2.lex)
lex.yy.c defines a routine yylex() that uses the specification to recognize tokens.
a.out is actually the scanner!
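
To make the pieces above concrete, here is a small specification offered as a sketch. It is not one of the tutorial's sample files; the file name example.lex, the definition names LETTER and DIGIT, and the UNKNOWN catch-all are assumptions made for illustration. It is meant to turn the declaration from the "What is a Scanner?" section into the token sequence shown there, discarding white space and // comments.

```lex
%{
/* Hypothetical example.lex: a sketch, not the tutorial's sample1.lex or sample2.lex. */
#include <stdio.h>
%}

LETTER  [A-Za-z]
DIGIT   [0-9]

%%
"VAR"                         { printf("VAR "); }
{LETTER}({LETTER}|{DIGIT})*   { printf("ID(%s) ", yytext); }
":"                           { printf("COLON "); }
";"                           { printf("SEMICOLON "); }
"//".*                        { /* discard a comment up to the end of the line */ }
[ \t\r\n]+                    { /* discard white space */ }
.                             { printf("UNKNOWN(%s) ", yytext); }
%%

int main(void)
{
    yylex();        /* scan standard input until end of file */
    printf("\n");
    return 0;
}
```

Two matching rules of FLEX are at work here. The scanner always takes the longest match, so VARIABLE is reported as an ID rather than as VAR followed by an ID. When two rules match the same text, the rule listed first wins, which is why "VAR" is placed before the identifier rule. Because this sketch supplies its own main(), only yywrap() is taken from the library when linking with -lfl.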


Practice
  • Get familiar with FLEX
    1. Try sample*.lex
    2. Command Sequence:
          flex sample1.lex      # likewise for sample2.lex
          gcc lex.yy.c -lfl
          ./a.out

  • Understand the input file
    1. Format:
          definitions
          %%
          rules
          %%
          user code
    2. The definitions section: "name definition"
      The rules section: "pattern action"
      The user code section: C code copied verbatim into lex.yy.c, e.g. a main() that calls the yylex() routine
    3. Try to answer questions listed in the sample files

  • Write a scanner for 32-bit hexadecimal numbers. Here is the answer...
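
One possible approach to the hexadecimal exercise is sketched below. It is not the tutorial's own answer, and it rests on an assumption the exercise does not state: a 32-bit hexadecimal number is written as 0x or 0X followed by one to eight hexadecimal digits. The names HEXDIGIT, HEX32 and TOO_LONG are hypothetical.

```lex
%{
/* Sketch only. Assumes the form 0x followed by 1 to 8 hex digits. */
#include <stdio.h>
%}

HEXDIGIT  [0-9A-Fa-f]

%%
0[xX]{HEXDIGIT}{1,8}    { printf("HEX32(%s)\n", yytext); }
0[xX]{HEXDIGIT}{9,}     { printf("TOO_LONG(%s)\n", yytext); }
[ \t\r\n]+              { /* skip white space */ }
.                       { /* ignore any other character */ }
%%

int main(void)
{
    yylex();        /* scan standard input until end of file */
    return 0;
}
```

The second rule is there because of longest-match behavior: without it, a number with nine or more digits would be split silently into an accepted eight-digit prefix plus leftover characters. With it, the longer match wins and the overflow is reported. The sketch can be built with the same command sequence as the samples.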

Resources
  1. FLEX homepage
  2. FLEX manual
  3. man flex