DNA COMPUTING

Dax Burkhart
CS85, Prof. Neal Young
Spring '96, Dartmouth College


-----------------------------
   FINAL PROJECT
-----------------------------

My initial project proposal was to find information pertaining more to
the biology side of DNA Computing.  In my search to do this I found
that the web was scattered with pages about DNA Computing and there
was no real central place to search for it.  Also, I spoke with
Biology Professor Bob Gross  (Dartmouth College) about the problems
that could go wrong with some of the procedures used in some papers
(such as Merge, Separate, Copy, Detect, Amplify, and Extract), and he
seemed to think that any procedure done on DNA was not perfect.  You
have to depend on the probability that it is correct when using the
precedures.  So, in light of these two things, I decided to put
together a web page that included many links to papers and other ideas
on DNA Computing and Molecular Biology.  Although it does not include
everyone or everything having to do with DNA Computing, it is an
excellent compilation of sources.

DNA COMPUTING

This page is designed to organize the topics related to DNA Computing into one page, and bring in some aspects of Molecular Biology. It includes links to papers, jump stations, home pages, articles, and more for the layperson and researcher alike.

People - (papers/pages)

ADLEMAN	AMENYO	AMOS	BAUM	BEAVER	BONEH	GLASER	KAPLAN
KARI	LIPTON	REIF	ROOß	ROTHEMUNG	SEEMAN	SMITH	WINFREE

Organizations

Miscellaneous

PEOPLE

Len Adelman
On Constructing a Molecular Computer Len Adelman (20 pages).

Abstract. It has recently been suggested that under some circumstances computers based on molecular interactions may be a viable alternative to computers based on electronics. Here, some practical aspects of constructing a molecular computer are considered.
John Amenyo
BIOCOMPUTERS: Architecture, Engineering and Technology John Amenyo.

Introduction. In what manner can computers built using nucleotide-based biomolecules, in the style of L. Adleman [1] be compared to extant electronic mainframes, microcomputers and other digital information processing systems [2]?

In this note, I would like to revisit this question and give a more "engineering-oriented" answer. Namely, I try to envision what the "hardware" of a bio-computer (Adleman-style) would look like when it becomes available. Of course, there are related questions such as the following.

Report on DNA-based Computers Workshop John Amenyo.

What is a computer? John Amenyo.
Martyn Amos (home page)
Error-resistant Implementation of DNA Computations Martyn Amos, Alan Gibbons and David Hodgson (10 pages).
Abstract. This paper introduces a new model of computation that employs the tools of molecular biology whose implementation is far more error-resistant than extant proposals. We describe an abstraction of the model which lends itself to natural algorithmic description, particularly for problems in the complexity class NP. In addition we describe a number of linear-time algorithms within our model, particularly for NP-complete problems. We describe an in vitro realisation of the model and conclude with a discussion of future work.
Eric Baum (home page)
DNA Sequences Useful for Computation Eric Baum (6 pages).
Abstract. Recent proposals for DNA based computing encode Boolean vector component values with sequences of DNA. It has previously been assumed that sufficient length random subsequences could be used to encode component values. However use of such subsequences will inadvertently result in long complementary subsequences. Complementary subsequences of sufficient length would stick to each other and cause mistakes or delays in computation. We suggest some constraints on DNA subsequences to be used in encodings, and describe maximal sets of subsequences satisfying these constraints.

Running dynamic programming algorithms on a DNA computer Eric B. Baum and Dan Boneh (7 pages).
Abstract. We show that DNA computers are useful for running algorithms based on dynamic programming. This class of algorithm takes advantage of the large memory capacity of a DNA computer. We present algorithms for solving certain instances of the knapsack problem, using a dynamic programming approach. Unlike other algorithms for DNA computers, which are brute force, dybamic programming is the same algorithm one would use to solve (smaller) problems on a conventional computer.

Computing with DNA Eric Baum (24 pages).
"These are transparencies from a talk given at the Organo Electronic Materials Conference. Unfortunately, the drawings are not included."
Don Beaver (home page)
Molecular Computation Bibliography

Molecular Computing Don Beaver (13 pages).

Abstract. We design a molecular Turig Machine and determine the complexity of the problems solvable by molecular computers. In [A94], a combinatorial molecular experiment to solve the NP-complete problem of Hamiltonian Path was proposed and implemented. Using our design, we show that such molecular computers can in fact compute PSPACE, under the generous assumptions implicit in [A94] fails to satisfy, we show that molecular computers are limited to solving problems in P.

Computing With DNA Don Beaver (10 pages).

Abstract. We consider molecular models for computing and derive a DNA-based mechanism for solving intractable problems through massive parallelism. In principle, such methods might reduce the effect needed to solve otherwise difficult tasks, such as factoring large numbers, a computationally-intensive task whose intractability forms the basis for much of modern cryptography.
Dan Boneh (home page)
Breaking DES Using a Molecular Computer Dan Boneh, Richard Lipton, and Chris Dunworth (20 pages).

Abstract. Recently Adleman has shown that a small traveling salesman problem can be solved by molecular operations. In this paper we show how the same principles can be applied to breaking the Data Encryption Standard (DES). Our method is based on an encoding technique presented by Lipton. We describe in detail a library of operations which are useful when working with a molecular computer. We estimate that given one arbitrary (plain-text, cipher-text) pair, one can recover the DES key in about 4 months of work. Furthermore, if one is given cipher-text, but the plain text is only known to be one of several candidates then it is still possible to recover the key in about 4 months of work. Finally, under chosen cipher-text attack it is possible to recover the DES key in one day using some preprocessing.

On the Computational Power of DNA Dan Boneh, Richard Lipton, Chris Dunworth, and Jiri Sgall (15 pages).

Abstract. We extend the results of Lipton to show how DNA based computers can be used to solve the satisfiability problem for circuits. We show that our methods enable random sampling of satisfying assignemnts. Furthermore, we show how one can solve optimization problems directly without first solving several decision problems. Finally we suggest a procedure for evaluating functions in the polynomial hierarchy.

Making DNA Computers Error Resistant Dan Boneh and Richard Lipton (9 pages).

Abstract. Recently Lipton showed that the formula satisfaction problem can be solved using a DNA based computer. The algorithm ignored the effects of errors that occur during biological experiments. In this paper we show that Lipton's algorithm can be made resistant to errors. In addition, we present a new circuit satisfaction algorithm which can be made error resistant using the same techniques.

A Divide and Conquer Approach to Sequencing Dan Boneh and Richard Lipton (6 pages).
Abstract. We suggest an approach to sequencing based on a "divide and conquer method Thisapproach eliminates the need for solving a hard NP-complete problem which arises when using traditional sequencing techniques. At the present time this algorithm has not been tried out in the lab. However, computer simulations using parts of the human genome provide some encouraging data.

Batching DNA Computations Dan Boneh and Richard Lipton (2 pages).

Abstract. In this short note we point out that in some cases batching computations on a molecular computer is very cheap. This is especially useful for breaking cryptosystems and can be useful for other applications as well.
Elton Glaser (home page)
DNA Models and Algorithms for NP-complete Problems Eric Bach, Anne Condon, Elton Glaser, and Celena Tanguay (22 pages).
Peter Kaplan (home page)
Topics in Biomolecular Computation
Note: This page is still under construction but when it is finished I think it will be worthwhile to look here.
Lila Kari (home page)
DNA Computing: the Arrival of Biological Mathematics Lila Kari (18 pages).

DNA Computing Based on Splicing: The Existence of Universal Computers Lila Kari, Rudolf Freund and Ghcorghc Paun (31 pages).

Abstract. Splicing systems are generative mechanisms based on the splicing operation introduced by Tom Head as a model of DNA recombination. We prove that the generative power of finite extended splicing systems equals that of Turing machines, provided we consider multisets or provided a control mechanism is added. We also show that there exist universal splicing systems with the properties above, i.e. there exists a universal splicing system with fixed components which can simulate the behaviour of any given splicing system, when an encoding of the particular splicing system is added to its set of axioms. In this way the possibility of designing programmable DNA computers based on the splicing operation is proven.

DNA Computing Based on Splicing: Universality Results Lila Kari, Rudolf Freund, Ghcorghc Paun, and Erzsebet Csuhaj-Varju.

Abstract. The paper extends some of the most recently obtained results on the computational universality of extended H systems (with regular sets of rules respectively with finite sets of rules used with simple additional mechanisms) and shows the possibility to obtain universal systems based on these extended H systems, i.e. the theoretical possibility to design programmable universal DNA computers based on the splicing operation. The additional mechanisms considered here are: multisets (counting the numbers of copies of each available string), checking the presence/absence of certain symbols in the spliced strings, and organizing the work of the system in a distributed way (like in a parallel communicating grammar system). In the case of multisets we also consider the way of simulating a Turing machine (computing a partial recursive function) by an equivalent H system (computing the same function), in the other cases we consider the interpretation of algorithms as language generating devices, hence the aim is to reach the power of Chomsky type-0 grammars, the standard model for representing algorithms being equivalent with Turing machines taken as language generators.

Test Tube Distributed Systems Based on Splicing Lila Kari, Ghcorghc Paun, and Erzsebet Csuhaj-Varju (21 pages).

Abstract. We define a symbol processing mechanism with the components (test tubes) working as splicing schemes in the sense of T. Head and communicating by redistributing the contents of tubes (in a similar way to the separate operation of Lipton-Adelman). (These systems are similar to the distributed generative mechanisms called Parallel Communicating Grammar Systems.) Systems with finite initial contents of tubes and finite sets of splicing rules associated to each component are computationally complete, they characterize the family of recursively enumberable languages. The existence of universal test tube distributed systems is obtained on this basis, hence the theoretical proof of the possibility to design universal programmable computers with the structure of such a system.
Richard Lipton (home page)
Using DNA to Solve SAT Richard Lipton (8 pages).
Abstract. We show how to use DNA experiments to solve the famous "SAT" problem of Computer Science. This is a special case of a more general method of [3] to solve NP-complete problems. The advantage of these results is the huge parallelism inherent in DNA based computing. It has the potential to yield vast speedups over conventional electronic based computers for such search problems.
John Reif (home page)
Parallel Molecular Computation: Models and Simulations and Figures John Reif (35 pages).

(shortened abstract)
This paper describes techniques for massively parallel computation at the molecular scale, which we refer to as molecular parallelism. While this may at first appear to be purely science fiction, already Adelman [A 94] has employed molecular parallelism in the solution of the Hamiltonian path problem of n nodes using O(n^2) lab steps on recombinant DNA of length O(n log n) base pairs. He successfully tested his techniques in a small lab experiment on DNA for a 7 node Hamiltonian path problem. Lipton [L 94] showed that finding the satisfying inputs to a Boolean formula of size n can be done in O(n) lab steps using DNA of length O(n log n) base pairs. This recent work by Adleman and Lipton in molecular parallelism considered only the solutionof NP search problems, and provided no way of quickly executing lengthy computations by purely molecular means, the number of lab steps depended linearly on the size of the simulated formula.

This paper describes techniques for quickly executing lengthy computations by the use of molecular parallelism. We demonstrate that molecular computations can be done using short DNA strands by more or less conventional biotechnology engineering techniques within a small number of lab steps.
Diana Rooß (home page)
On the Power of DNA-Computers Diana Rooß and Klaus Wagner (20 pages).

Abstract. In [Adl94] Adleman used biological manipulations with DNA strings to solve some instances of the Directed Hamiltonian Path Problem. Lipton [Lip94] showed how to extend this idea to solve any NP problem. We prove that exactly the problems in P^NP = delta(sub 2, super P) can be solved in polynomial time using Lipton's model. Various modifications of Lipton's model are investigated, and it is proved that their computational power in polynomial time can be characterized by one of the complexity classes P, delta(sub 2, super P), delts(sub 3, super P) or even PSPACE. Restricting Lipton's model to DNA strings of logarithmic length one can compute exactly the problems in L.
Paul Rothemung (home page)
A DNA and Restriction Enzyme Implementation of Turing Machines Paul Wilhelm Karl Rothemund.

Abstract. Bacteria employ restriction enzymes to cut or restrict DNA at or near specific words in a unique way. Many restriction enzymes cut the two strands of double-stranded DNA at different positions leaving overhangs of single-stranded DNA. Two pieces of DNA may be rejoined or ligated if their terminal overhangs are complementary. Using these operations fragments of DNA, or oligonucleotides, may be inserted and deleted from a circular piece of plasmid DNA. We propose an encoding for the transition table of a Turing machine in DNA oligonucleotides and a corresponding series of restrictions and ligations of those oligonucleotides that, when performed on circular DNA encoding an instantaneous description of a Turing machine, simulate the operation of the Turing machine encoded in those oligonucleotides. DNA based Turing machines have been proposed by Charles Bennett but they invoke imaginary enzymes to perform the state-symbol transitions. Our approach differs in that every operation can be performed using commercially available restriction enzymes and ligases.
Nadrian Seeman (home page)
Ned Seeman's Laboratory
"Our laboratory is investigating unusual DNA molecules in model systems that use synthetic molecules. Our interest in these molecules was originally stimulated by a desire to characterize Holliday junctions. These are four-arm branched DNA molecules that are found to be structural intermediates in genetic recombination. The focus of the work on these unusual molecules is to characterize their structure, dynamics and thermodynamics, and to establish the relationship between these properties and their biological function. In the last few years, the symmetry and crossover topology of the 4-arm junction has been characterized and analyzed. The study of recombination intermediates has been extended by constructing and analyzing molecules with double crossovers, and by exploring broader classes of multi-stranded molecules, called antijunctions and mesojunctions."
Warren Smith (home page)
List of publications by Smith, including some (non-hyperlinked) DNA papers.
An opinionated, but reasonably short, summary of the Mini DIMACS Workshop on DNA based computers, held at Princeton University on April 4 1995. Warren Smith (4 pages).
Erik Winfree (home page)
Molecular Computation Page

Complexity of Restricted and Unrestricted Models of Molecular Computation Erik Winfree (8 pages).

Abstract. In [Li1] and [Ad2] a formal model for molecular computing was proposed, which makes focused use of affinity purification. The use of PCR was suggested to expand the range of feasible computations, resulting in a second model. In this note, we give a precise characterization of these two models in terms of recognized computational complexity classes, namely branching programs (BP) and nondeterministic branching programs (NBP) respectively. This allows us to give upper and lower bounds on the complexity of desired computations. Examples are given of computable and uncomputable problems, given limited time.

On the Computational Power of DNA Annealing and Ligation Erik Winfree (10 pages).

Abstract. In [Winfree] it was shown that the DNA primitives of Separate, Merge, and Amplify were not sufficiently powerful to invert functions defined by circuits in linear time. Dan Boneh et al [Boneh] show that the addition of a ligation primitive, Append, provides the missing power. The question becomes, "How powerful is ligation? Are Separate, Merge, and Amplify necessary at all?" This paper proposes to informally explore the power of annealing and ligation for DNA computation. We conclude, in fact, that annealing and ligation alone are theoretically capable of universal computation.

MOLECULAR BIOLOGY

Nanotechnology
Nanotechnology

Engines of Creation: The Coming Era of Nanotechnology by K. Eric Drexler.

Molecular engineering: An approach to the development of general capabilities for molecular manipulation K. Eric Drexler.
Abstract. Development of the ability to design protein molecules will open a path to the fabrication of devices to complex atomic specifications, thus sidestepping obstacles facing conventional microtechnology . This path will involve construction of molecular machinery able to position reactive groups to atomic precision. It could lead to great advances in computational devices and in the ability to manipulate biological materials. The existence of this path has implications for the present.

Small is Beautiful: a collection of nanotechnology links

Prometheus Returns
Imagine a technology so powerful that it will allow such feats as desktop manufacturing, cellular repair, artificial intelligence, inexpensive space travel, clean and abundant energy, and environmental restoration; a technology so portable that everyone can reap its benefits; a technology so fundamental that it will radically change our economic and political systems; a technology so imminent that most of us will see its impact within our lifetimes. Such is the promise of Nanotechnology.

Paul Hengren (home page)
Lots of biology info and jump points.

PCR
PCR Reference Information What it is, what it's not.
BioGuide - PCR

Electrophoresis
What is Electrophoresis ?

The World of Electrophoresis

2-D Gel Electrophoresis And The Quantitation Of Gene Expression

Two-dimensional gel electrophoresis

Biology - Miscellaneous
Molecular Biology, Biochemistry Databases

Molecular Information Theory and the Theory of Molecular Machines

Principles of Biotechnology

Steps Toward Molecular Manufacturing

Restriction Enzymes - Background Paper
Introduction. Watson and Crick's description, in 1953, of the double helical structure of theDNA molecule (See Classic Collection- The Structure of DNA) opened the door to a new era in biological understanding and research. Scientists, now knowing the molecular structure of the hereditary molecule, could begin both to elucidate and to manipulate its function. These new studies were, however, dependent on the discovery and use of the many enzymes that are able to modify or join existing DNA molecules, or to aid in the synthesis of new DNA molecules.

REBASE Home Page
Dr. Richard J. Roberts' Restriction Enzyme Database

EL.B.A (Electronics and Biotechnolgy Advanced) Project: Towards Biocomputation from the Commission of the European Community

ORGANIZATIONS

Center for Discrete Mathematics and Theoretical Computer Science
DNA Sequence Classification Using Compression-Based Induction David Loewenstern, Haym Hirsh, Peter Yianilos, and Michiel Noordewier (14 pages).

Abstract. Inductive learning methods, such as neural networks and decision trees, have become a popular approach to developing DNA sequence identification tools. Such methods attempt to form models of a collection of training data that can be used to predict future data accurately. The common approach to using such methods on DNA sequence identification problems forms models that depend on the absolute locations of nucleotides and assume independence of consecutive nucleotide locations. This paper describes a new class of learning methods, called compression-based induction (CBI), that is geared towards sequence learning problems such as those that arise when learning DNA sequences. The central idea is to use text compression techniques on DNA sequences as the means for generalizing from sample sequences. The resulting methods form models that are based on the more important relative locations of nucleotides and on the dependence of consecutive locations. They also provide a suitable framework into which biological domain knowledge can be injected into the learning process. We present initial explorations of a range of CBI methods that demonstrate the potential of our methods for DNA sequence identification tasks.

Computing Research Association

The Journal of Computational Biology

The National Center for Biotechnology Information

Joslin Diabetes Center DNA Core Facility

NEC Research Institute Home Page

DNA Based Computers: A Mini DIMACS Workshop

2nd Annual DIMACS Workshop on DNA Based Computers

MISCELLANEOUS

THE HISTORY OF COMPUTING

DNA COMPUTERS Yali Friedman.
Abstract. On November 11 1994, Leonard Adleman published a paper in Science[1] describing the "Molecular Computation of Solutions of Combinatorial Problems". This was the first ever implementation of a DNA-based computer. Since then, many advancements have been proposed to refine the protocol for programming a DNA computer to reduce the complexity of the operations and eliminate errors. [2,3] Despite their respective complexities, biological and mathematical operations have some similarities:

The very complex structure of a living being is the result of applying simple operations to initial information encoded in a DNA sequence.
The result f(w) of applying a computable function to an argument w can be obtained by applying a combination of basic simple functions to w.

For the same reasons that DNA was presumably selected for living organisms as a genetic material, its stability and predictability in reactions, DNA strings can also be used to encode information for mathematical systems.

(NP-complete) problems
Explanations of some NP-complete problems.

Molecular Computation of Solutions to Combinatorial Problems from Science.
Introduction. The tools of molecular biology were used to solve an instance of the directed Hamiltonian path problem. A small graph was encoded in molecules of DNA, and the "operations" of the computation were performed with standard protocols and enzymes. This experiment demonstrates the feasibility of carrying out computations at the molecular level.

A VAT OF DNA MAY BECOME THE FAST COMPUTER OF THE FUTURE from NY Times.

DNA Computing from Science.

Computer In A Bottle: Your laptop may soon lose its cool hard silicon chips, in exchange for DNA. The stuff of life itself.

Gene Genie by Thomas A. Bass.
It's a hundred times faster than the best serial supercomputer. It's a billion times more energy efficient. It's a trillion times denser than the best storage media. It's a teaspoonful of DNA that's a computer! And Leonard Adleman invented it.

Produced by Dax Burkhart '96, Dartmouth College,
for COSC 85/185: Novel Frameworks in Theoretical Computer Science,
Professor Neal Young.
26 May 1996

This final project varies greatly from original project proposition because after searching the web, I found that the links to DNA Computing papers and pages were scattered throughout the web. So I decided to bring them together. I admit that the above does not contain ALL sources for DNA Computing, but it is an excellent start.

CONTENTS

PEOPLE

MOLECULAR BIOLOGY

ORGANIZATIONS

MISCELLANEOUS