I have been an Assistant Professor in the Computer Science and Engineering (CSE) Department at the University of California, Riverside (UCR) since 2020.

Prior to that, I was a postdoctoral associate at MIT CSAIL in 2019, working with Julian Shun. Before that, I finished my PhD at Carnegie Mellon University in 2018, advised by Guy Blelloch, and I received my Bachelor’s degree in Computer Science from Tsinghua University in 2012. You can find my [CV] here.

My current research interest is designing efficient (usually parallel) algorithms for large-scale data that also achieve good performance in practice. You can find more information about our research group here.

We always welcome highly self-motivated students to join our group, but admission can be competitive as we only have a limited number of slots per year. We do not necessarily check the general application pool. If you want to join our research group, please fill out this form and drop me an email. If you are currently at UCR and are interested in working with me, or want me to write you a letter, please read the recruiting page.

I will teach CS 218 in Winter 2024. More info can be found [here]. The homepage for CS 119L in Winter 2024 is [here].

This is the homepage of my lovely wife.

View my publication list:

Google Scholar | DBLP | By Year

A full list of my publications can be found here.

Research Interests and Selected Topics

My research aims at designing efficient algorithms and software for the most fundamental problems in computer science (e.g., on graphs and geometry). The two main approaches are (1) understanding emerging hardware and developing algorithm design principles on top of it, and (2) designing new algorithms that are faster and/or have better theoretical guarantees than the best previous ones. In many cases, the two approaches overlap: algorithms designed for a certain piece of hardware often work well in a wide variety of scenarios, thanks to the new angle from which these problems are considered.

Parallel Algorithms:

Shared-memory parallel algorithms have been studied for over 50 years. Still, quite surprisingly, there seems to be a lot of room for improvement, both theoretically and practically, even for the most fundamental problems. The new algorithms I have designed fall into the following categories.

The new parallel graph algorithms I have designed include: a variety of stepping algorithms (rho-stepping, radius-stepping, delta*-stepping) for single-source shortest paths (SSSP), the BGSS algorithm for strongly connected components (SCC), BC labeling for biconnectivity, and many other distance-based or connectivity-based algorithms.
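
As a concrete (and much simplified) illustration of the bucketing idea that the stepping algorithms share, here is a small sequential C++ sketch of delta-stepping-style SSSP. The graph representation, the parameter `delta`, and the uniform treatment of light/heavy edges are simplifications for illustration only; the actual parallel algorithms batch and parallelize the relaxations within each bucket.

```cpp
// Sequential sketch of bucket-based SSSP relaxation (delta-stepping style).
#include <cstdint>
#include <iostream>
#include <limits>
#include <map>
#include <set>
#include <utility>
#include <vector>

struct Edge { int to; double w; };
using Graph = std::vector<std::vector<Edge>>;

std::vector<double> sssp_delta_stepping(const Graph& g, int src, double delta) {
  const double INF = std::numeric_limits<double>::infinity();
  std::vector<double> dist(g.size(), INF);
  std::map<int64_t, std::set<int>> buckets;      // bucket id -> queued vertices
  dist[src] = 0.0;
  buckets[0].insert(src);
  while (!buckets.empty()) {
    int64_t cur = buckets.begin()->first;        // smallest nonempty bucket
    std::set<int> frontier = std::move(buckets.begin()->second);
    buckets.erase(buckets.begin());
    while (!frontier.empty()) {                  // keep relaxing inside this bucket
      std::set<int> again;
      for (int u : frontier)
        for (const Edge& e : g[u]) {
          double nd = dist[u] + e.w;
          if (nd < dist[e.to]) {
            if (dist[e.to] < INF) {              // drop the stale bucket entry
              auto old_bucket = buckets.find((int64_t)(dist[e.to] / delta));
              if (old_bucket != buckets.end()) old_bucket->second.erase(e.to);
            }
            dist[e.to] = nd;
            int64_t b = (int64_t)(nd / delta);
            if (b == cur) again.insert(e.to);    // reprocess in this phase
            else buckets[b].insert(e.to);        // defer to a later bucket
          }
        }
      frontier = std::move(again);
    }
  }
  return dist;
}

int main() {
  Graph g(4);
  g[0] = {{1, 1.0}, {2, 4.0}};
  g[1] = {{2, 2.0}, {3, 6.0}};
  g[2] = {{3, 1.0}};
  for (double d : sssp_delta_stepping(g, 0, 2.0)) std::cout << d << ' ';
  std::cout << '\n';                             // expected: 0 1 3 4
}
```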

The new parallel geometric algorithms I have designed include: the first theoretically efficient parallel algorithms for incremental convex hull and Delaunay triangulation, several algorithms for constructing spatial-partitioning data structures, and several spatial clustering algorithms that are efficient both theoretically and practically.

I have been working on parallel tree structures, both in theory and in practice. Some interesting work includes: parallel priority queues, compressed parallel search trees, and parallel data structures for range and nearest-neighbor searches in low dimensions.

We have also studied parallel algorithms under the randomized incremental construction (RIC) framework, and showed that many incremental algorithms in this framework are naturally highly parallel (i.e., they have shallow dependence depth).
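
The intuition behind shallow dependence depth can be seen even in a toy setting: if n values are inserted in a uniformly random order and we only track the running minimum, the minimum changes only about ln n times in expectation, so the chain of insertions that truly depend on one another is logarithmic. The small C++ experiment below (with illustrative parameters) demonstrates this; it is only an analogy for the RIC analysis, not any of the actual algorithms.

```cpp
// Toy experiment: how often does the running minimum change under a random order?
#include <algorithm>
#include <cmath>
#include <iostream>
#include <numeric>
#include <random>
#include <vector>

int main() {
  const int n = 1'000'000;
  std::vector<int> order(n);
  std::iota(order.begin(), order.end(), 0);
  std::mt19937 rng(42);
  std::shuffle(order.begin(), order.end(), rng);

  int running_min = n;   // sentinel larger than any inserted value
  int changes = 0;       // length of the "dependence chain" in this toy model
  for (int x : order)
    if (x < running_min) { running_min = x; ++changes; }

  std::cout << "n = " << n << ", minimum changed " << changes
            << " times (ln n ~ " << std::log((double)n) << ")\n";
}
```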

Another group of my work aims at better utilizing the new non-volatile main memories (NVRAMs) for computing; these memories provide significantly larger capacity than regular DRAM. However, writing to NVRAM is significantly more costly than reading from it, so reducing the number of writes an algorithm performs is interesting both theoretically and practically.
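
As a toy illustration of this read/write asymmetry (a generic C++ example, not tied to any specific NVRAM programming interface or to the actual write-efficient algorithms), the sketch below contrasts a filter-then-count computation, which performs Θ(n) writes to large memory, with a direct streaming count, which reads the same data but writes only O(1) words.

```cpp
// Counting elements satisfying a predicate: write-heavy vs. write-efficient.
#include <iostream>
#include <vector>

int main() {
  std::vector<int> data(10'000'000);
  for (size_t i = 0; i < data.size(); ++i) data[i] = (int)(i % 1000);

  // Write-heavy: materialize the matching elements, then take the size.
  std::vector<int> evens;
  for (int x : data)
    if (x % 2 == 0) evens.push_back(x);        // ~n/2 writes to large memory
  std::cout << "filter-then-count: " << evens.size() << '\n';

  // Write-efficient: stream over the input and keep a single counter.
  size_t count = 0;
  for (int x : data)
    if (x % 2 == 0) ++count;                   // O(1) words written
  std::cout << "direct count:      " << count << '\n';
}
```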

Space-efficient algorithms consider computing with fast memories of limited size (i.e., DRAM). Nowadays, DRAM capacity is sufficient for many applications that previously were only feasible in distributed settings, on supercomputers, or on disk, which greatly improves performance. However, DRAM capacity is still limited and relatively small (e.g., compared to NVRAM). Hence, how to use DRAM efficiently for such computations remains an interesting topic to explore.

Processing-in-Memory (PIM) is an architectural idea that puts processors inside memory chips to reduce data movement between the CPU and main memory. Algorithmically, this new architecture can be viewed as a hybrid of shared-memory computing (the CPU side) and distributed computing (the PIM side). While the setting is very different from previous models, how to coordinate the CPU and the PIM modules to balance the overall computation remains an open problem.
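
The following toy C++ simulation (plain threads over disjoint slices of an array; no real PIM hardware or API is involved) illustrates the division of labor described above: each "module" computes a partial result over the data it owns, and the CPU side only merges the per-module results. The balancing question is how much of the work to leave on each side.

```cpp
// Toy CPU/PIM split: per-module partial sums, merged on the CPU side.
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

int main() {
  const int num_modules = 8;
  std::vector<long long> data(8'000'000);
  std::iota(data.begin(), data.end(), 1LL);

  std::vector<long long> partial(num_modules, 0);
  std::vector<std::thread> modules;
  size_t chunk = data.size() / num_modules;
  for (int m = 0; m < num_modules; ++m)
    modules.emplace_back([&, m] {
      size_t lo = m * chunk;
      size_t hi = (m + 1 == num_modules) ? data.size() : lo + chunk;
      long long s = 0;
      for (size_t i = lo; i < hi; ++i) s += data[i];   // "near-memory" work
      partial[m] = s;
    });
  for (auto& t : modules) t.join();

  // The CPU side only touches num_modules words instead of the whole array.
  long long total = std::accumulate(partial.begin(), partial.end(), 0LL);
  std::cout << "total = " << total << '\n';            // expected: n*(n+1)/2
}
```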

Other Topics

I have also been working on some other interesting research topics, such as computer graphics and image processing. See the full list of my papers for more details.

Ph.D. Thesis:

Write-efficient algorithms, 2018