Conversion of DFAs or NFAs to Regular Expressions

**Theorem:**
Any regular language can be described by a regular expression.
In other words, if *L = L(A)* for some DFA *A* then there
exists a regular expression *R* such that
*L(R) = L*.

To prove the theorem, given *A*, we will
construct *R* equivalent to *A*. The proof
is by state elimination. We start with our original
automaton (actually, it will be slightly modified),
and gradually remove states. The intermediate steps
will not really be FAs nor regular expressions. They
will be hybrid creatures that have states and transitions,
but the transitions will be labelled with regular
expressions. For example, we may have a transition
from *p* to *q* on an expression, say,
*E = 00(0+11)^{*}0*. The meaning of this
is that while in state *p*, the automaton can read any
string that matches *E* and move to state *q*.

We now describe the construction. First, we will
convert *A* to an equivalent automaton
that has no transitions into the initial state,
and no transitions out of the final states.
We will also convert multiple transitions into
a single transition. Then we will execute state
elimination. When we're done, we will be left
with two states, one initial and the other
final, with just one transition between them.
The expression labeling this transition is our
desired expression *R*.

**Initialization.** We modify *A* as follows.

- Add a new state *q'_{0}* and a λ-transition from *q'_{0}* to *q_{0}*. The initial state of the new automaton is *q'_{0}*.
- Add a new state *q_{f}* and a λ-transition from each (old) final state *q* to *q_{f}*. The new automaton has just one final state, *q_{f}*.
- For any pair of states *p* and *q*, if *a, b, c, ...* are on the transitions from *p* to *q*, replace all these transitions by one transition labelled with the regular expression *a+b+c+...*.
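The three initialization steps can be sketched in Python as follows. This is only an illustration under some assumptions of ours: the automaton is given as a list of `(p, symbol, q)` triples, labels are plain strings, and the names `initialize`, `LAMBDA`, `"q0'"` and `"qf"` are hypothetical, not part of the construction itself.

```python
from collections import defaultdict

LAMBDA = "\u03bb"  # label for the empty string (lambda)

def initialize(states, start, finals, transitions):
    """Return (labels, new_start, new_final).

    transitions is an iterable of (p, symbol, q) triples;
    labels maps each pair (p, q) to one regular expression string.
    """
    new_start, new_final = "q0'", "qf"  # hypothetical names for the fresh states
    assert new_start not in states and new_final not in states

    symbols = defaultdict(list)
    for p, a, q in transitions:
        if a not in symbols[(p, q)]:
            symbols[(p, q)].append(a)

    # Step 1: new initial state with a lambda-transition to the old one.
    symbols[(new_start, start)].append(LAMBDA)
    # Step 2: lambda-transitions from every old final state to the new final state.
    for f in sorted(finals):
        symbols[(f, new_final)].append(LAMBDA)

    # Step 3: merge parallel transitions into one label a+b+c+...
    labels = {pair: "+".join(syms) for pair, syms in symbols.items()}
    return labels, new_start, new_final
```

After this step no transition enters the new initial state and none leaves the new final state, which is what the elimination phase relies on.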

**State Elimination.**
We now iterate the following process until there are
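only two states left. Pick any state *q* other
than *q'_{0}* and *q_{f}*. For each pair of states
*p* and *s* other than *q*, let *E_{pq}*, *E_{qq}*,
*E_{qs}* and *E_{ps}* be the expressions on the transitions
from *p* to *q*, from *q* to *q*, from *q* to *s* and
from *p* to *s*. Replace the label on the transition
from *p* to *s* by *E_{ps} + E_{pq}(E_{qq})^{*}E_{qs}*,
leaving out *E_{ps}* or *(E_{qq})^{*}* if the corresponding
transition does not exist.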

After we do the above for all pairs *p* and *s*,
we remove *q*. (We actually only need to worry about
pairs *p* and *s* that have transitions
*p* -> *q* and *q* -> *s*.)
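The elimination loop can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the labels are stored in a dictionary mapping state pairs to expression strings, it uses the standard update in which every path *p* -> *q* -> *s* is replaced by a direct transition from *p* to *s*, and the helper names (`wrap`, `concat`, `star`, `eliminate_state`, `to_regex`) are our own. It parenthesises generously, so the output is correct but not simplified.

```python
LAMBDA = "\u03bb"  # the empty string (lambda)

def wrap(e):
    """Parenthesise e unless it is a single symbol."""
    return e if len(e) == 1 else "(" + e + ")"

def concat(*parts):
    """Concatenate expressions, dropping lambda factors."""
    kept = [wrap(e) for e in parts if e != LAMBDA]
    return "".join(kept) if kept else LAMBDA

def star(e):
    """Kleene star; the star of lambda is lambda."""
    return LAMBDA if e == LAMBDA else wrap(e) + "*"

def eliminate_state(labels, q):
    """Return new labels with state q removed.

    labels maps pairs (p, s) to regular expression strings.
    """
    loop = labels.get((q, q))
    incoming = [(p, e) for (p, t), e in labels.items() if t == q and p != q]
    outgoing = [(s, e) for (t, s), e in labels.items() if t == q and s != q]
    rest = {pair: e for pair, e in labels.items() if q not in pair}
    for p, e_pq in incoming:
        for s, e_qs in outgoing:
            middle = [star(loop)] if loop is not None else []
            path = concat(e_pq, *middle, e_qs)
            # Add the new path to any existing p -> s label (p may equal s).
            rest[(p, s)] = rest[(p, s)] + "+" + path if (p, s) in rest else path
    return rest

def to_regex(labels, start, final):
    """Eliminate every state except start and final; return the last label."""
    states = {x for pair in labels for x in pair} - {start, final}
    for q in sorted(states):
        labels = eliminate_state(labels, q)
    return labels.get((start, final))
```

Only pairs with transitions into and out of the eliminated state are visited, matching the remark in parentheses above. The order of elimination (here alphabetical) affects the shape of the result but not the language it describes.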

Note that it is possible that *p = s*, in which case
the new transition is a self-loop at *p*.

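When we are done, we will be left with only two
states, *q'_{0}* and *q_{f}*, with a single transition
from *q'_{0}* to *q_{f}*. The expression labelling this
transition is *R*.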

Note that in the above construction we never used
the fact that *A* is deterministic. So the
construction works for NFAs as well.

**Example.**
Let us apply the construction to the following NFA:

Note that this automaton accepts the set of strings
whose 2nd or 3rd last letter is *0*. After initialization, we get:

After eliminating *q_{3}*, we get:

After eliminating *q_{1}* and *q_{2}*, we get:

And finally, after eliminating *q_{0}*, we get:

So the resulting expression is
*(0+1)^{*}0(0+1)(0+1+λ)*,
and we are done.
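As a sanity check, the final expression can be compared by brute force against the description of the language (strings whose 2nd or 3rd last letter is *0*). The sketch below translates the expression into Python's `re` syntax by hand, writing `|` for `+` and making the last factor optional with `?`. The function names are our own, and the check covers only short strings, so it is evidence rather than a proof.

```python
import re
from itertools import product

# The expression from the example in Python syntax:
# "+" becomes "|", and the empty-string alternative becomes "?".
R = re.compile(r"(0|1)*0(0|1)(0|1)?")

def second_or_third_last_is_zero(w):
    """Direct definition of the language accepted by the example NFA."""
    return (len(w) >= 2 and w[-2] == "0") or (len(w) >= 3 and w[-3] == "0")

def agree_up_to(n):
    """Check that R and the direct definition agree on all strings of length <= n."""
    for k in range(n + 1):
        for letters in product("01", repeat=k):
            w = "".join(letters)
            if bool(R.fullmatch(w)) != second_or_third_last_is_zero(w):
                return False
    return True
```

Calling `agree_up_to(10)` checks every binary string of length at most 10.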