Weihua Pan

Ph.D. Student
Supervisor: Dr. Stefano Lonardi
Department of Computer Science and Engineering
Bourns College of Engineering
University of California, Riverside

University of California, Riverside
Winston Chung Hall, Room 362 (Algorithms and Computational Biology Lab)
Riverside, CA 92521

wpan005 AT ucr DOT edu                       

I'm currently a Ph.D. student in Computer Science at the University of California, Riverside. My supervisor is Dr. Stefano Lonardi. I received my B.S. in Computer Science and Technology from Nanjing Normal University (NNU), Nanjing, P.R. China, in July 2011 and my M.S. in Computer Software and Theory from the University of Science and Technology of China (USTC), Hefei, P.R. China, in June 2014. Here is my CV.

  • Ph.D. Computer Science (2014.9 to present)
    University of California, Riverside

  • M.S. Computer Software and Theory (2011.9 to 2014.6)
    University of Science and Technology of China

  • B.S. Computer Science and Technology (2007.9 to 2011.7)
    Nanjing Normal University
Research Interests
  • Computational Molecular Biology

    Computational Epigenomics

    Next-generation Sequencing Data Analysis

    Metagenomics Sequence Analysis

  • Large-scale Biological Data Mining
  • Approximation Algorithms and Heuristic Algorithms
  • Dynamics of Nucleosomes (in progress)

    I'm trying to create an efficient method to analyze the dynamics of nucleosomes. The method will track nucleosomes at different time points and eventually determine which nucleosomes are stable and which are unstable.
  • Large-scale haplotype phasing algorithm (finished)

    I have improved the WinHAP algorithm, which was proposed in 2012, in two respects:
    (1) a divide-and-conquer strategy addresses the huge memory requirement of existing algorithms. The basic idea is to screen long chromosomes for haplotypes within consecutive 1,000-SNP windows. Thus, the memory needed by the algorithm depends only on one segment and no longer grows with the length of the sequences;
    (2) OpenMP parallel computing is implemented to utilize all the computing power of a multi-core computer cluster.
    The improved WinHAP algorithm can phase 500 genotypes with 1,000,000 SNPs using just 12.8 MB of memory and 2.5 hours on a personal computer.
  • Metagenomic short-read binning (finished)

    My recent work includes unsupervised and supervised methods for binning metagenomic short reads and methods for obtaining virus reads from metagenomic samples.

    I have proposed a binning tool called MetaObtainer. MetaObtainer combines some of the newest technologies for processing short reads, so it can achieve better performance than other tools. It can (1) handle next-generation sequencing reads shorter than 100 bp with very high accuracy (both precision and recall are above 90%); (2) find an unknown species using the reference genomes of similar species; (3) perform well when the dataset contains very few reads of the specified species; (4) handle genomes of similar abundance levels as well as different abundance levels (1:10).
  • Metagenomic functional analysis (finished)

    I have collaborated with Dr. Kang Ning's group at the Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, and improved a tool called Parallel-META to version 2. Parallel-META 2.0 enhances taxonomic analysis based on multiple databases, improves the efficiency of parallel computation, and enables interactive visualization of results. Furthermore, it includes functional analysis for metagenomic samples, which is based on dual computational engines: one based on the SEED database and another based on the GO (Gene Ontology) hierarchical structure.
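
    As a rough illustration of the divide-and-conquer idea in the haplotype phasing item above, the Python sketch below splits a long genotype sequence into consecutive 1,000-SNP windows and handles one window at a time, so working memory depends on the window size rather than on the sequence length. The names iter_windows, phase_in_windows and identity_phaser are hypothetical, and the per-window step is only a placeholder; this is not the actual WinHAP implementation, which is not described here in enough detail to reproduce.

```python
from typing import Callable, Iterator, List, Sequence

WINDOW_SIZE = 1000  # SNPs per window, as stated in the description above


def iter_windows(n_snps: int, size: int = WINDOW_SIZE) -> Iterator[range]:
    """Yield consecutive, non-overlapping index ranges covering n_snps SNPs."""
    for start in range(0, n_snps, size):
        yield range(start, min(start + size, n_snps))


def phase_in_windows(
    genotype: Sequence[int],
    phase_window: Callable[[Sequence[int]], List[int]],
    size: int = WINDOW_SIZE,
) -> Iterator[List[int]]:
    """Apply phase_window to one window at a time and yield each result.

    Results are yielded rather than accumulated, so the caller can write
    each window out before the next one is processed.
    """
    for window in iter_windows(len(genotype), size):
        segment = genotype[window.start:window.stop]
        yield phase_window(segment)


def identity_phaser(segment: Sequence[int]) -> List[int]:
    """Placeholder (assumption): stands in for a real per-window phasing step."""
    return list(segment)


if __name__ == "__main__":
    genotype = [0, 1, 2] * 900  # 2,700 SNPs in a toy 0/1/2 encoding
    for i, phased in enumerate(phase_in_windows(genotype, identity_phaser)):
        print(f"window {i}: {len(phased)} SNPs")
```

    Yielding per-window results is one way to keep memory bounded by a single segment; how neighbouring windows are reconciled is left out of this sketch.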
Teaching Assistant

    At USTC:

Honors and Awards

    As a Ph.D. student:
  • Dean's Distinguished Fellowship. 2014-2016.
    As a master's student:
  • "Global Digital" Scholarship. October, 2013.
  • "Guanghua" Scholarship. November, 2012.
    As an undergraduate student:
  • National Encouragement Scholarship. November, 2008.
  • National Mathematical Contest in Modeling, second prize. December, 2009.
  • Merit Student of Jiangsu Province. May, 2010.
  • Excellent Bachelor's Thesis of Jiangsu Province, third prize. April, 2012.
  • Mathematical Contest in Modeling (MCM), Successful Participant. 2010.
  • National Contest of Software Design and Development, third prize of Jiangsu zone. July, 2010.
  • Outstanding Graduate of Nanjing Normal University. June, 2011.
  • Scholarship of Nanjing Normal University, first level (6 times), second level (1 time). Over 4 years.
  • Pacemaker to Merit Student of Nanjing Normal University. November, 2010.
  • Merit Student of Nanjing Normal University (2 times).
  • "Xiaotong Fei" Scholarship. November, 2010.
  • "Sifang culture" Scholarship. April, 2010.
  • Mathematical Contest in Modeling of Nanjing Normal University, first prize (in 2010), third prize (in 2009). June, 2009 and June, 2010.
  • Information Security Contest of Nanjing Normal University, third prize. March, 2010.
  • Programming Contest of Nanjing Normal University, third prize. April, 2009.
  • Excellent Student Cadre of Nanjing Normal University. December, 2009.

Selected Publications
  • MetaObtainer: a tool for obtaining specified species from metagenomic reads of next-generation sequencing
    Weihua Pan, Bo Chen and Yun Xu.
    Interdisciplinary Sciences: Computational Life Sciences, accepted.

  • WinHAP2: an extremely fast haplotype phasing program for long genotype sequences
    Weihua Pan, Yanan Zhao, Yun Xu and Fengfeng Zhou.
    BMC Bioinformatics, 2014, 15:164.

  • Parallel-META 2.0: Enhanced Metagenomic Data Analysis with Functional Annotation, High Performance Computing and Advanced Visualization
    Xiaoquan Su $, Weihua Pan $, Baoxing Song, Jian Xu and Kang Ning.
    PLOS One, 2014, 9(3): e89323. (co-first author)

Software (developed by me or with my participation)
    • WinHAP
    • WinHAP is an extremely fast tool for large-scale haplotyping: inferring haplotype sequences from genotype sequences.

    • Parallel-META
    • Parallel-META is GPGPU- and multi-core-CPU-based software that analyzes massive metagenomic data structures in parallel and reports the classification, construction and distribution at the phylogenetic, taxonomic and functional levels.

    • MetaObtainer
    • MetaObtainer is a tool for obtaining the specified species from next-generation sequencing short reads.
    ACM Digital Library      IEEE Xplore

    National Center for Biotechnology Information      HapMap Project      International Society for Computational Biology

    ISMB Conference      RECOMB Conference      Bioinformatics Journal

    Weihua Pan
    Last modified: Wed, January 28, 2015