Weihua Pan

Ph.D. Candidate
Advisor: Dr. Stefano Lonardi
Department of Computer Science and Engineering
Bourns College of Engineering
University of California, Riverside

University of California, Riverside
Winston Chung Hall, Room 362 (Algorithms and Computational Biology Lab)
Riverside, CA 92521

wpan005 AT ucr DOT edu                       

I'm currently a Ph.D. student in Computer Science and an M.S. student in Statistics at the University of California, Riverside. I received my B.E. in Computer Science and Technology from Nanjing Normal University (NNU), Nanjing, P.R. China in July 2011 and my M.E. in Computer Software and Theory from the University of Science and Technology of China (USTC), Hefei, P.R. China in June 2014. Here is my CV.

Education
  • Ph.D. Computer Science (2014.9 to present)
    University of California, Riverside

  • M.S. Statistics (2016.9 to present)
    University of California, Riverside

  • M.E. Computer Software and Theory (2011.9 to 2014.6)
    University of Science and Technology of China

  • B.E. Computer Science and Technology (2007.9 to 2011.7)
    Nanjing Normal University
Research Interests
  • Computational Biology

    Contiguity Improvement for Genome Assembly

    Nucleosome Movement and its relationship with Gene Expression and Function

    Large-scale Haplotype Inference

    Metagenomics Sequence Analysis and Functional Analysis

  • Combinatorial Optimization and Approximation Algorithms
  • Statistical Modeling / Machine Learning / Deep Learning / Data Mining
Research Projects
  • Nucleosome Movement and its relationship with Gene Expression and Function (in progress)

    We have proposed a method, ThIEF, for tracking nucleosomes across multiple time points. I am now trying to characterize nucleosome movement using the tracks generated by ThIEF and to study its relationship with gene expression and gene function. If a strong covariance is found, I plan to apply deep learning methods to predict gene expression levels and gene function from nucleosome movement information.
  • Chimeric Contigs Correction for de novo Genome Assemblies via Low-quality Optical Maps (almost finished)

    I propose a novel chimeric contig correction method called Novo&Chimeric that uses contigs and optical maps to correct chimeric errors in each other. Deciding whether to correct the contigs or the optical maps when they conflict is modeled as a weighted vertex cover problem. Experiments show that our tool can outperform manual correction by experienced genome assembly researchers while taking much less time.
  • De novo Genome Assemblies Reconciliation via Optical Maps (finished)

    I propose a novel assembly reconciliation method called Novo&Stitch that takes advantage of optical maps to carry out assembly reconciliation accurately. Combinatorial optimization models and techniques such as graph models, dynamic programming, weighted vertex cover on hypergraphs, greedy strategies, and linear programming are used to solve subproblems such as data reduction, error correction, and post-processing. Extensive experimental results demonstrate that our tool can significantly improve the contiguity of de novo genome assemblies without introducing misassemblies or reducing completeness.
  • Large-scale Haplotype Inference for Population Data (finished)

    I improved the haplotype inference algorithm WinHAP to version 2 in two respects:
    (1) a divide-and-conquer strategy addresses the huge memory requirement of existing algorithms. The basic idea is to screen long chromosomes for haplotypes within consecutive 1,000-SNP windows. Thus, the memory needed by the algorithm depends only on one segment and no longer grows with the length of the sequences;
    (2) OpenMP parallel computing is implemented to utilize all the computing power in a multi-core computer cluster.
    The improved WinHAP algorithm can phase 500 genotypes with 1,000,000 SNPs using just 12.8 MB of memory and 2.5 hours on a personal computer.
  • Metagenome NGS Reads Binning (finished)

    I propose a supervised metagenomic read binning method, MetaObtainer, that combines similarity-based and composition-based binning methods. It can deal with very short NGS reads because a similarity-based method is used to pre-group reads before characterization, and it is much faster than most similarity-based methods because it is alignment-free.
  • Metagenomics Functional Analysis Pipeline (finished)

    I cooperated with Dr. Kang Ning's group at the Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, to improve a tool called Parallel-META to version 2. Parallel-META 2.0 enhances the taxonomical analysis based on multiple databases, improves the efficiency of parallel computation, and enables interactive visualization of results. Furthermore, it includes functional analysis for metagenomic samples based on dual computational engines: one based on the SEED database and another based on the GO (Gene Ontology) hierarchical structure.
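
As an illustration of the weighted vertex cover formulation mentioned for Novo&Chimeric above, the following is a minimal Python sketch. It is not the tool's implementation: the vertex names, the weights, and the choice of the classic pricing-method 2-approximation are all assumptions made for the example. Each vertex stands for a contig or an optical map, each edge for a conflict between the two, and a vertex placed in the cover is the one that would be corrected.

```python
def approx_weighted_vertex_cover(weights, edges):
    """Pricing-method 2-approximation for weighted vertex cover.

    weights: dict mapping vertex -> non-negative weight (hypothetical cost
             of correcting that contig or optical map)
    edges:   iterable of (u, v) pairs, one per conflict
    Returns a set of vertices that touches every edge.
    """
    residual = dict(weights)  # weight each vertex can still "pay"
    cover = set()
    for u, v in edges:
        if u in cover or v in cover:
            continue  # this conflict is already resolved
        # Charge the edge as much as both endpoints can afford.
        pay = min(residual[u], residual[v])
        residual[u] -= pay
        residual[v] -= pay
        # An endpoint whose residual weight is exhausted joins the cover.
        if residual[u] == 0:
            cover.add(u)
        if residual[v] == 0:
            cover.add(v)
    return cover


# Hypothetical conflicts between two contigs and two optical maps.
example_weights = {"contig_1": 3, "contig_2": 1, "map_A": 2, "map_B": 4}
example_conflicts = [
    ("contig_1", "map_A"),
    ("contig_2", "map_A"),
    ("contig_2", "map_B"),
]
example_cover = approx_weighted_vertex_cover(example_weights, example_conflicts)
print(sorted(example_cover))
```

The pricing method guarantees a cover whose total weight is at most twice the optimum. How Novo&Chimeric assigns weights and solves its instances is not stated here, so this sketch only shows the shape of the decision problem.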
Publications
  • Novo&Stitch: Accurate Reconciliation of Multiple de novo Genome Assemblies via Optical Maps.
    Weihua Pan, Steve I. Wanamaker, Audrey M.V. Ah-Fong, Howard Judelson, and Stefano Lonardi.
    Submitted to RECOMB 2018.

  • ThIEF: Finding Genome-wide Trajectories of Epigenetics Marks.
    Anton Polishko, Md. Abid Hasan, Weihua Pan, Evelien M. Bunnik, Karine Le Roch and Stefano Lonardi.
    WABI 2017, 19:1-19:16.

  • MetaObtainer: a tool for obtaining specified species from metagenomic reads of next-generation sequencing
    Weihua Pan, Bo Chen and Yun Xu.
    Interdisciplinary Sciences: Computational Life Sciences, 7(4), pp.405-413.

  • WinHAP2: an extremely fast haplotype phasing program for long genotype sequences
    Weihua Pan, Yanan Zhao, Yun Xu and Fengfeng Zhou.
    BMC Bioinformatics, 2014, 15:164.

  • Parallel-META 2.0: Enhanced Metagenomic Data Analysis with Functional Annotation, High Performance Computing and Advanced Visualization
    Xiaoquan Su $, Weihua Pan $, Baoxing Song, Jian Xu and Kang Ning.
    PLOS One, 2014, 9(3): e89323. (co-first author)

    Software

    • Novo&Chimeric
      Novo&Chimeric is an optical-map-based tool for correcting chimeric contigs.

    • Novo&Stitch
      Novo&Stitch is an optical-map-based tool for reconciling de novo genome assemblies.

    • WinHAP
      WinHAP is an extremely fast tool for large-scale haplotyping: inferring haplotype sequences from genotype sequences.

    • Parallel-META
      Parallel-META is a GPGPU and multi-core CPU based software tool that analyzes massive metagenomic data structures in parallel and reports their classification, construction, and distribution at the phylogenetic, taxonomic, and functional levels.

    • MetaObtainer
      MetaObtainer is a tool for obtaining the specified species from next-generation sequencing short reads.

    Paper Review

    • 17th Workshop on Algorithms in Bioinformatics (WABI 2017)
    • 7th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2016)
    • 6th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2015)
    • 15th Workshop on Algorithms in Bioinformatics (WABI 2015)
    • 2015 RECOMB Workshop on Massively Parallel Sequencing (RECOMB-Seq 2015)
    • 19th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2015)
    Teaching Assistant

    • Software Construction (CS100), University of California, Riverside, Spring quarter, 2016.
    • Software Construction (CS100), University of California, Riverside, Winter quarter, 2016.
    • Design and Analysis of Algorithms (CS218), University of California, Riverside, Fall quarter, 2015.

    • Design and Analysis of Algorithms, University of Science and Technology of China, 2012.

    Honors and Awards

        As a Ph.D. student:
    • Dean's Distinguished Fellowship. 2014-2016.
        As a master's student:
    • "Global Digital" Scholarship. October, 2013.
    • "Guanghua" Scholarship. November, 2012.
        As an undergraduate student:
    • National Encouragement Scholarship. November, 2008.
    • National Mathematical Contest in Modeling, second prize. December, 2009.
    • Merit Student of Jiangsu Province. May, 2010.
    • Excellent Bachelor's Thesis of Jiangsu Province, third prize. April, 2012.
    • Mathematical Contest in Modeling (MCM), Successful Participant. 2010.
    • National Contest of Software Design and Development, third prize of Jiangsu zone. July, 2010.
    • Outstanding Graduates of Nanjing Normal University. June, 2011.
    • Scholarship of Nanjing Normal University, first level (6 times), second level (1 time), over 4 years.
    • Pacemaker to Merit Student of Nanjing Normal University. November, 2010.
    • Merit Student of Nanjing Normal University (2 times).
    • "Xiaotong Fei" Scholarship. November, 2010.
    • "Sifang culture" Scholarship. April, 2010.
    • Mathematical Contest in Modeling of Nanjing Normal University, first prize (in 2010), third prize (in 2009). June, 2009 and June, 2010.
    • Information Security Contest of Nanjing Normal University, third prize. March, 2010.
    • Programming Contest of Nanjing Normal University, third prize. April, 2009.
    • Excellent Student Cadre of Nanjing Normal University. December, 2009.

    ACM Digital Library      IEEE Xplore     

    National Center for Biotechnology Information      HapMap Project      International Society for Computational Biology

    ISMB Conference      RECOMB Conference      Bioinformatics Journal

    Weihua Pan
    Last modified: Weds October 25, 2017