Bison Tutorial
Lan Gao
[Home]    [What is a parser?]    [How to use Bison?]    [Practice]    [Resources]

What is a Parser?
Parsing is the process of matching grammar symbols to elements in the input data, according to the rules of the grammar. The parser obtains a sequence of tokens from the lexical analyzer, and recognizes it's structure in the form of a parse tree. The parse tree expresses the hierarchical structure of the input data, and is a mapping of grammar symbols to data elements. Tree nodes represent symbols of the grammar (non-terminals or terminals), and tree edges represent derivation steps.

There are two basic parsing approaches: top-down and bottom-up. Intuitively, a top-down parser begins with the start symbol. By looking at the input string, it traces a leftmost derivation of the string. By the time it is done, a parse tree is generated top-down. While a bottom-up parser generates the parse tree bottom-up. Given the string to be parsed and the set of productions, it traces a rightmost derivation in reverse by starting with the input string and working backwards to the start symbol.


How to use Bison?
Bison is a general-purpose parser generator that converts a grammar description (Bison Grammar Files) for an LALR(1) context-free grammar into a C program to parse that grammar. The Bison parser is a bottom-up parser. It tries, by shifts and reductions, to reduce the entire input down to a single grouping whose symbol is the grammar's start-symbol.

Steps to use Bison:

Write a lexical analyzer to process input and pass tokens to the parser (calc.lex).
Write the grammar specification for bison (calc.y), including grammar rules, yyparse() and yyerror().
Run Bison on the grammar to produce the parser. (Makefile)
Compile the code output by Bison, as well as any other source files.
Link the object files to produce the finished product.


Practice
  • Get familiar with Bison: Write a desk calculator which performs '+' and '*' on unsigned integers
    1. Create a Directory: "mkdir calc"
    2. Save the five files (calc.lex, calc.y, Makefile, main.cc, and heading.h) to directory "calc"
    3. Command Sequence: "make"; "./calc"
    4. Use input programs (or stdin) which contain expressions with integer constants and operators + and *, then press Ctrl-D to see the result

  • Understand the input file
    1. Format:
          %{
          C Declarations
          %}
          Bison Declarations
          %%
          Grammar Rules
          %%
          Additional C Code
    2. Useful Bison definitions:
          %token, %union, %type, %left, %right, %nonassoc, ...
      Format of the grammar rules section:
          result: components ...
              ;

      Important data structure and functions:
          yylval, YYSTYPE, yyerror(), yyparse()

Resources
  1. Bison homepage
  2. Bison manual
  3. man bison