R

Algorithms and Computational Biology Lab
Computer Science & Engineering Department
University of California
Riverside, CA 92521

Research interests: Combinatorial algorithms, Bioinformatics, Epigenetics, Metagenomics.
Advisor: Dr. Stefano Lonardi

Computing the Microbiome: Faster, more Accurate and more Efficient Methods for the Analysis of Metagenomes

Abstract of the Dissertation

   Metagenomics is revolutionizing microbial ecology and has unlocked unprecedented opportunities in many domains of the life sciences. For instance, metagenomics has allowed the discovery of new forms of life in unexplored marine habitats. In medicine, metagenomics can enable faster and more accurate diagnosis than standard laboratory procedures. In pathogen surveillance for public health, or biosurveillance, it has been applied successfully with limited resources to monitor outbreaks in epidemic areas.

   Sequencing technologies have improved considerably in speed and cost over the past decade, and the number of reference sequences in public databases is growing exponentially. Faster computational methods that remain accurate and efficient are therefore needed to analyze these large data sets. The research presented in this dissertation focuses on (i) how to build faster, more accurate and more efficient sequence classification methods to determine the microbial composition of metagenomic samples (i.e., the CLARK series), and (ii) how to infer or recover the microbial composition from missing or contaminated data, for example in the context of city-scale biosurveillance.

   Our classification system is composed of several tools, namely CLARK, CLARK-l and CLARK-S, which are already used by several research teams worldwide as state-of-the-art methods. CLARK classifies sequences with high accuracy and unprecedented speed, while CLARK-S, a variant of CLARK, achieves higher accuracy than CLARK while remaining fast. We also show that the new sequence analysis methods are versatile and applicable to several contexts of sequence classification, for example, BAC clones in the context of the barley genome.

Keywords: Microbiome, metagenomics, genomics, comparative microbiomics, classification, prediction, inference, sequence analysis, light-weight algorithm, k-mers, discriminative spaced k-mers, target-specific k-mers.
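
To give a rough idea of what "target-specific k-mers" means in the keywords above, here is a minimal Python sketch of classification by k-mers that occur in only one reference. It is a toy illustration under stated assumptions, not the CLARK implementation: the function names, the reference sequences, the taxon labels and the value k=5 are all hypothetical, and reverse complements, ties and confidence scores are ignored.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Map each k-mer found in exactly one target to that target.

    k-mers shared by two or more targets are discarded, so every
    remaining k-mer is evidence for a single target.
    """
    owners = defaultdict(set)
    for target, seq in references.items():
        for km in kmers(seq, k):
            owners[km].add(target)
    return {km: next(iter(ts)) for km, ts in owners.items() if len(ts) == 1}


def classify(read, index, k):
    """Return the target with the most target-specific k-mer hits, or None."""
    hits = Counter(index[km] for km in kmers(read, k) if km in index)
    if not hits:
        return None  # no target-specific k-mer found in the read
    return hits.most_common(1)[0][0]


# Hypothetical toy references (not real genomes).
references = {
    "taxon_A": "ACGTACGTGGCCTTAA",
    "taxon_B": "TTGACCGTACGATCGA",
}
index = build_index(references, k=5)
print(classify("GTGGCCTT", index, k=5))
```

In this toy data the k-mers CGTAC and GTACG appear in both references, so they are dropped from the index and a read made only of them is left unclassified.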

Ph.D. Final Defense date

Date: Wednesday, January 11th 2017 at 11:15am
Room: WCH, 415
Document: Dissertation