TALKS
  1. KickStarter: Fast and Accurate Computations on Streaming Graphs via Trimmed Approximations
    at SoCalPLS (Southern California Programming Languages and Systems), University of California, Irvine, November 2016.
  2. Load the Edges You Need: A Generic I/O Optimization for Disk-based Graph Processing
    at ATC'16, Denver, Colorado, June 2016.
  3. Efficient Processing of Large Graphs via Input Reduction
    at HPDC'16, Kyoto, Japan, June 2016.
  4. ASPIRE: Exploiting Asynchronous Parallelism in Iterative Algorithms using a Relaxed Consistency based DSM
    at OOPSLA'14, Portland, Oregon, October 2014.
  5. CuSha: Vertex-Centric Graph Processing on GPUs
    at HPDC'14, Vancouver, Canada, June 2014.
  6. A Relaxed Consistency based DSM for Asynchronous Parallelism
    at SoCalPLS (Southern California Programming Languages and Systems), Harvey Mudd College, Claremont, May 2014.
    Abstract:
    Many vertex-centric graph algorithms can be expressed via asynchronous parallelism by relaxing certain read-after-write data dependences, allowing threads to compute vertex values using stale (i.e., not the most recent) values of their neighboring vertices. We observe that on distributed shared memory (DSM) systems, converting synchronous algorithms into their asynchronous counterparts makes them tolerant of high inter-node communication latency. However, high latency can also lead to excessive use of stale values, increasing the number of iterations the algorithm needs to converge. In this talk we present a relaxed memory consistency model and consistency protocol that simultaneously tolerate communication latency and minimize the use of stale values. We demonstrate that, for a range of asynchronous graph algorithms, on average our approach outperforms algorithms based on prior relaxed memory models that allow stale values by at least 2.27x, and the Bulk Synchronous Parallel (BSP) model by 4.2x. We also show that our approach performs well in comparison to GraphLab, a popular distributed graph processing framework.
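    The core idea in the abstract can be illustrated with a toy sketch (this is not the speaker's DSM system; the graph, update rule, and function names are invented for illustration). A synchronous step double-buffers so every read sees the previous iteration's values, while an asynchronous step updates in place, so some reads observe already-updated (or, on a DSM, stale cached) neighbor values — yet the iteration still converges.

    ```python
    # Illustrative sketch, not the talk's implementation. Toy update rule:
    # each vertex repeatedly takes the average of its neighbors' values,
    # which converges to a common value on this small connected graph.

    def sync_step(graph, values):
        """Synchronous (BSP-style): reads only see the previous iteration."""
        return {v: sum(values[u] for u in nbrs) / len(nbrs)
                for v, nbrs in graph.items()}

    def async_step(graph, values):
        """Asynchronous: updates in place, so vertices processed later may
        read values already updated in this same iteration (or, on a DSM,
        stale cached copies) -- the relaxed-dependence execution."""
        for v, nbrs in graph.items():
            values[v] = sum(values[u] for u in nbrs) / len(nbrs)
        return values

    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # a triangle
    vals = {0: 0.0, 1: 3.0, 2: 6.0}
    for _ in range(50):
        vals = async_step(graph, vals)
    # All vertex values converge to the same number despite stale reads.
    ```

    The asynchronous variant avoids a global barrier per iteration, which is what makes it tolerant of communication latency; the trade-off the abstract describes is that overly stale reads can slow convergence, motivating a protocol that bounds staleness.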

STUDENTS MENTORED
  1. Andy Thio (Undergraduate), High Performance Parallel Sokoban Solver, June 2017
  2. Shawn Lee (Undergraduate), Efficient Parallel Sudoku Solver via Thread Management & Data Sharing Methods, June 2017
  3. Bin Wu (Masters), Distributed Out-of-Core Graph Processing, August 2015
  4. Zhongqi Wang (Undergraduate), TIGRAPH - A Tiny Graph Processing System, June 2015
  5. Sihan He (Undergraduate), TIGRAPH - A Tiny Graph Processing System, June 2015
  6. Bryan Duane Rowe II (Masters), An Evaluation of Graph Processing Frameworks, February 2015