- Using DNA to Solve SAT, by Richard Lipton, January 1995. (Or here.)
- An introductory version of "On the Computational Power of DNA". See lecture 1 summary below.
- On the Computational Power of DNA, by Boneh, Dunworth, Lipton and Sgall, October 1995.
- "We show how DNA based computers can be used to solve the satisfiability problem for boolean circuits. Furthermore, we show how DNA computers can solve optimization problems directly without first solving several decision problems..." (see lecture 2 summary below)
- Breaking DES Using a Molecular Computer
- Boneh, Dunworth, Lipton. (See lecture 2 summary below.) Some details about the biology.

- Bionet **newsgroup** on information theory.
- Description of talks at the Princeton workshop '95
- Interesting, brief summaries of the results presented at DNA Based Computers: A Mini DIMACS Workshop, April 1995
- An annotated on-line bibliography on Molecular Computation maintained by Don Beaver at PSU.
- On Constructing a Molecular Computer. Adleman.
- Suggestions on how to expand earlier results. Non-DNA molecular computing, "combinatorial chemistry", rational drug design. Some discussion of practical considerations (by the man who actually did a biological computation!)

In more detail we discussed **Using DNA to Solve
Satisfiability**, by Lipton. The idea is to generate DNA for all
possible assignments to the variables (by mixing strands that glue together in
the right ways similar to Adleman's method) and then to select only those that
satisfy the formula. To select "X or Y", for instance, extract the
assignments that have X true, separately extract those that have Y true, and
then take the union of the two tubes (pour the two test tubes together). To do
"X AND Y", first extract the assignments with X true, then from the result
extract those with Y true. This works for Boolean
**expressions**.
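The extract-and-pour idea can be sketched in a few lines, modeling a test tube as a Python set of assignments. The names `all_assignments` and `extract` are our own, chosen for illustration:

```python
from itertools import product

# Model a test tube as a set of assignments; each assignment maps a
# variable name to a bool. Frozensets of (var, value) pairs keep them hashable.
def all_assignments(variables):
    """One strand per possible truth assignment."""
    return {frozenset(zip(variables, bits))
            for bits in product([False, True], repeat=len(variables))}

def extract(tube, var, value):
    """Keep only the strands whose bit for `var` equals `value` (a separation step)."""
    return {s for s in tube if (var, value) in s}

variables = ["X", "Y"]
tube = all_assignments(variables)

# "X or Y": extract the X-true strands, extract the Y-true strands,
# then pour the two tubes together (set union).
or_tube = extract(tube, "X", True) | extract(tube, "Y", True)

# "X and Y": extract the X-true strands first, then extract Y-true from the result.
and_tube = extract(extract(tube, "X", True), "Y", True)

print(len(or_tube), len(and_tube))  # 3 satisfying assignments for OR, 1 for AND
```

Note that OR becomes a union of two separations while AND becomes a composition of separations, exactly mirroring the tube operations.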

**Second meeting:** On the Computational Power of
DNA, Boneh et al. This paper is a little more technical and develops some
more powerful ideas. The basic new operation is **extension**: given a tube
of DNA, each strand can be extended by attaching some given strand at one end.
For instance, to construct DNA strands representing all 2^N N-bit strings,
start with a "null" sequence in a tube T and repeat the following: duplicate
the current tube to get tubes T and T'. Extend the strings in T by DNA
representing 0 and those in T' by DNA representing 1. Pour T' back into T and
continue. After N steps, each N-bit string will occur exactly once.
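The duplicate-and-extend loop translates directly into a small simulation, with a Python list standing in for the tube:

```python
def all_bit_strings(n):
    """Generate all 2^n bit strings by the duplicate-and-extend protocol:
    start with a single "null" strand, then at each step split the tube
    into T and T', extend T with a 0 and T' with a 1, and pour T' back into T."""
    tube = [""]                       # tube T holds the single null sequence
    for _ in range(n):
        t0 = [s + "0" for s in tube]  # extend the strands in T with a 0
        t1 = [s + "1" for s in tube]  # extend the strands in T' with a 1
        tube = t0 + t1                # pour T' back into T
    return tube

strands = all_bit_strings(3)
print(sorted(strands))  # each 3-bit string occurs exactly once
```

The tube doubles in size each round, so N extension steps suffice to build all 2^N strings.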

Extension can also be used to solve **circuit satisfiability**. Given a
circuit, consider the corresponding directed acyclic graph where each interior
node represents a logic gate and each leaf represents a variable or its
negation. The "root" node of the circuit is the value computed. Now use a
different approach than mixing and separating, as follows. Start with all
possible 2^N inputs in a test tube. Let, say, "x1,x2,x3,...,xN" denote a
typical input string. Consider the gates sequentially in order of increasing
distance from the leaves. After consideration of the Ith gate (for each I),
each string will be extended to "x1,x2,...,xN,g1,g2,...,gI" where gI is the
value of the Ith gate when the circuit is given input "x1,...,xN". This is
accomplished as follows. If the Ith gate has inputs xJ and xK, for instance,
separate the contents of the current tube into 4 tubes T00, T01, T10, T11,
depending on the values of xJ and xK. Extend the strings in the tube T00 with
f(0,0), where f is the function computed by the Ith gate. Similarly extend the
other tubes. Now pour all tubes back into tube T. When all gates have been
computed, extract the strings in which the last gate value (the root of the
circuit) represents 1.
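A minimal sketch of the separate-into-four-tubes-and-extend step, assuming each gate has two inputs; the circuit encoding and the function `evaluate_by_extension` are ours, not from the paper:

```python
from itertools import product

def evaluate_by_extension(n_vars, gates):
    """Append gate values to every input string, one gate at a time.
    `gates` is a list of (f, j, k): a two-input boolean function and the
    string positions of its inputs (variables or earlier gates)."""
    tube = ["".join(bits) for bits in product("01", repeat=n_vars)]
    for f, j, k in gates:
        tubes = {"00": [], "01": [], "10": [], "11": []}
        for s in tube:                        # separate by the values at j and k
            tubes[s[j] + s[k]].append(s)
        tube = []
        for key, strands in tubes.items():    # extend each tube with the gate value
            g = str(int(f(key[0] == "1", key[1] == "1")))
            tube += [s + g for s in strands]
    return tube

# Circuit for (x0 AND x1) OR x2: gate at position 3 is x0 AND x1,
# gate at position 4 (the root) is g3 OR x2.
gates = [(lambda a, b: a and b, 0, 1), (lambda a, b: a or b, 3, 2)]
final = evaluate_by_extension(3, gates)
satisfying = [s[:3] for s in final if s[-1] == "1"]  # extract strands whose root is 1
print(sorted(satisfying))
```

Each strand ends up carrying its input bits followed by every gate value, so the final extraction only needs to look at the last symbol.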

Boneh et al. observe that this idea has other uses. **Optimization**
problems such as finding a satisfying assignment with a maximum number of true
variables can be solved by applying the technique with longer subsequences
representing "1"s than "0"s, and then using electrophoresis to select the
longest strands at the end of the process. Similar approaches work for
MAX-CLIQUE, etc. An exercise: can you figure out how to, given a formula in
conjunctive normal form (an "and" of a bunch of clauses, each clause being an
"or" of several variables and negations of variables), find the maximum number
of clauses that can be satisfied simultaneously by one assignment?
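The length trick can be sketched as follows for the "maximum number of true variables" problem; the codewords `ONE` and `ZERO` are hypothetical, and only their relative lengths matter:

```python
from itertools import product

# Encode a 1-bit with a longer DNA subsequence than a 0-bit, so that after
# filtering for satisfaction, electrophoresis (here: max by length) picks
# the satisfying assignment with the most true variables.
ONE, ZERO = "GGGG", "GG"   # hypothetical codewords; only the lengths matter

def encode(bits):
    return "".join(ONE if b else ZERO for b in bits)

def formula(x):            # example formula: (x0 or x1) and (not x0 or x2)
    return (x[0] or x[1]) and ((not x[0]) or x[2])

satisfying = [bits for bits in product([False, True], repeat=3) if formula(bits)]
tube = [encode(bits) for bits in satisfying]
longest = max(tube, key=len)    # electrophoresis selects the longest strand
best = max(satisfying, key=lambda b: sum(b))
print(best, len(longest))
```

Here the all-true assignment satisfies the formula, so the longest strand encodes it; the same weighting idea underlies the MAX-CLIQUE and MAX-SAT variants.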

**Random sampling** of satisfying assignments is straightforward as well,
because the process produces exactly one copy of each satisfying assignment; by
standard results, this means that estimating the *number* of satisfying
assignments is also possible. To **complexity theorists**, this is quite
interesting, because the class **#P** ("sharp-P") which represents hard counting
problems (such as counting satisfying assignments), is notoriously intractable.
(If one can count satisfying assignments, then any problem in the polynomial
hierarchy can be solved with a polynomial number of such queries --- see Toda
and Ogiwara, "Counting Classes are at Least as Hard as the Polynomial-Time
Hierarchy", Structure in Complexity Theory, 1991.)

A more direct approach solves problems K levels up in the **polynomial
hierarchy** with O(K) extra steps. For instance, suppose a problem is in the
second level of the polynomial hierarchy, specifically, suppose the language
encoding the problem can be expressed as "L= { X : for all Y, there exists Z
such that P(X,Y,Z) }" where P is a predicate computable in time polynomial in
the size of X. For instance, "L=graphs such that, for all two-colorings of the
graph (i.e., all ways of partitioning the vertices into two sets), one of the
two colors contains a clique of a given size." Even for complete graphs, this
problem (in this case called computing the "Ramsey Number") is notoriously
hard.

Boneh et al. use the ability to "complement" (in the set-theoretic sense) the strands of DNA in the tube to solve such problems. For the L above, the method works as follows (assuming we have a circuit to compute P). Using the previous methods, generate a tube with those (X,Y,Z) inputs that satisfy P. Now chop off the "Z" portion of each string, leaving in the tube (X,Y) pairs. Now generate a tube that has exactly those (X,Y) pairs not in the current tube by starting with all possible (X,Y) pairs and filtering out those matching the DNA in the first tube. This leaves in the tube pairs (X,Y) such that there is no Z satisfying P(X,Y,Z). Now chop off the "Y" part of each pair. This leaves the strings "X" such that there exists a "Y" such that for all Z, P(X,Y,Z) is false. Now complement this set of X's. The result is exactly those X's in L.
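The complement-and-chop method can be traced with sets standing in for tubes. The bit widths and the predicate below are toy choices for illustration:

```python
from itertools import product

def members(p, nx, ny, nz):
    """Strings x in L = { x : for all y, exists z with p(x, y, z) },
    computed by the complement-and-chop method with sets as tubes."""
    all_x = set(product([0, 1], repeat=nx))
    all_y = set(product([0, 1], repeat=ny))
    all_xy = set(product(all_x, all_y))
    # Tube 1: all (x, y, z) satisfying p, built by the earlier circuit method.
    sat = {(x, y, z) for (x, y) in all_xy
           for z in product([0, 1], repeat=nz) if p(x, y, z)}
    xy = {(x, y) for (x, y, z) in sat}     # chop off the z portion
    bad_xy = all_xy - xy                   # complement: no z works for (x, y)
    bad_x = {x for (x, y) in bad_xy}       # chop off the y portion
    return all_x - bad_x                   # complement: every y has some z

# Toy predicate: z equals the bitwise AND of x and y.
p = lambda x, y, z: z == tuple(a & b for a, b in zip(x, y))
result = members(p, 2, 2, 2)
print(len(result))  # every x is in L here, since the AND of x and y always exists
```

Each set complement corresponds to one "filter all possible strands against the current tube" step, so a problem K levels up in the hierarchy costs O(K) extra chop-and-complement rounds.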

Finally, it appears that the method gives an approach to cracking DES. The circuit that computes DES, given the 56-bit key K and a 64-bit plaintext P, produces a ciphertext C. If a (P,C) pair is known, the key can be found as follows. Compute the circuit simultaneously for all 2^56 keys and the P for which C is known. Use the previous technique that appends the circuit gate values to each input. At the end, each string will encode the inputs, the intermediate gates, and the outputs. Select the string that encodes the correct output and recover the key from it. The paper estimates that this can be done in 3-4 months. Conventional techniques are faster (requiring 2^43 or so operations, maybe one week on a fast computer) but require roughly 2^43 plaintext/ciphertext pairs, so they more or less require that the encryption device being cracked is available.
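The logical shape of the attack is a known-plaintext exhaustive key search: run the cipher for every key against the known plaintext and keep the keys whose output matches the known ciphertext. The 8-bit "cipher" below is a stand-in for illustration, not DES:

```python
# Toy known-plaintext key search in the spirit of the DES attack.
def toy_encrypt(key, plaintext):
    # Arbitrary 8-bit mixing function standing in for the DES circuit.
    return ((plaintext ^ key) * 167 + key) % 256

secret_key = 0b10110101
plaintext = 0b01100110
ciphertext = toy_encrypt(secret_key, plaintext)

# "Compute the circuit for all keys at once", then select the matching output.
candidates = [k for k in range(256) if toy_encrypt(k, plaintext) == ciphertext]
print(secret_key in candidates)  # True
```

With DNA, the loop over keys happens in parallel in one tube, which is what makes searching 2^56 keys plausible at all.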

Since only the first few **Ramsey numbers** are known, I propose the following
**challenge**: use these techniques to compute some new Ramsey numbers.
Last modified: Sat Nov 2 05:35:52 EST 2002