Tin Vu

Database and Data mining Lab · Department of Computer Science and Engineering · University of California, Riverside, CA 92521 · tin.vu@email.ucr.edu

I am a PhD candidate and software engineer who is passionate about turning challenges into opportunities with inspired data. My research focuses on applying machine learning techniques to improve big data management systems, especially in spatial databases. Here is my CV.

Update: I have graduated with a PhD degree and joined Microsoft as an Applied Scientist.

Experience

Research Assistant

University of California, Riverside

Spatial Partitioning using Deep Learning: utilize the power of deep learning to build a model that can predict the best partitioning technique for a given spatial dataset. Source Code.

Query Optimization using Deep Learning: explore the capabilities of deep learning in the context of query optimization.

Indexing Techniques for Big Spatial Data: build a big data management system which fully supports spatial data processing. Source Code.

09/2016 - Present

Teaching Assistant

University of California, Riverside

CS 014 - Introduction to Data Structures and Algorithms (Fall 2017, Summer 2018): CS 014 introduces the students to the fundamental data structures and algorithmic analysis techniques such as lists, stacks, queues, search trees, sorting algorithms, hash tables, and graphs.

CS 141 - Intermediate Data Structures and Algorithms (Winter 2018, Spring 2018): CS 141 provides the basic background for a computer scientist in the area of data structures and algorithms. During this course, students learn problem-solving techniques, how to compare them, and how to apply them to real problems.

CS 218 - Design and Analysis of Algorithms (Fall 2018): Study of efficient data structures and algorithms for solving problems from a variety of areas such as sorting, searching, selection, linear algebra, graph theory, and computational geometry. Worst-case and average-case analysis using recurrence relations, generating functions, upper and lower bounds, and other methods.

CS 167 - Introduction to Big Data (Spring 2020): CS 167 covers the data management and systems aspects of big data platforms such as Hadoop, Spark, and AsterixDB. In this course, you will learn how the data is stored in a distributed file system and how the queries run in parallel.

09/2017 - Present

Software Engineer Intern

Microsoft Corporation

Map & GeoSpatial Group, Microsoft AI & Research: explored how to leverage machine learning, deep learning, and geospatial technology to improve the quality of the Bing Maps geocoding system.

09-12/2019 and 06-09/2020

Software Development Engineer Intern

Environmental Systems Research Institute (ESRI)

ArcGIS GeoDatabase Group: applied parallel processing techniques to improve the performance and scalability of Utility Network operations; won 2nd Place and the Best Presentation Award at the ESRI Intern Hackathon.

06/2019 - 09/2019

Data Engineer

VNG Corporation

R&D Division: developed a database system for the largest online game service in Vietnam with 5 million customers.

04/2014 - 04/2016

Research Assistant

Mimas Research Group

Emotion recognition system: developed a machine learning system to identify human emotions from EEG signals.

04/2010 - 07/2012

Education

University of California, Riverside

PhD Candidate in Computer Science

Advisor: Dr. Ahmed Eldawy

I also collaborate on projects with Dr. Vassilis Tsotras (UC Riverside), Dr. Vagelis Hristidis (UC Riverside), and Dr. Michael J. Carey (UC Irvine).

09/2016 - Present

Hanoi University of Science and Technology

Bachelor of Engineering in Information Technology

09/2008 - 06/2013

Publications

Please visit my Google Scholar profile for the most up-to-date list of publications.

2022

  • Tin Vu, Ahmed Eldawy, Vagelis Hristidis and Vassilis J. Tsotras. "Incremental Partitioning for Efficient Spatial Data Analytics". Proceedings of the VLDB Endowment (PVLDB), Volume 15, Issue 3, 2022. DOI>10.14778/3494124.3494150 -- PDF

  • Tin Vu, Alberto Belussi, Sara Migliorini, and Ahmed Eldawy. "Towards a Learned Cost Model for Distributed Spatial Join: Data, Code & Models". In Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM), 2022. DOI>10.1145/3511808.3557712 -- PDF

2021

  • Tin Vu, Alberto Belussi, Sara Migliorini, and Ahmed Eldawy. "A Learned Query Optimizer for Spatial Join", In ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2021, 10 pages, 2021. DOI>10.1145/3474717.3484217 -- PDF

  • Ahmed Eldawy, Vagelis Hristidis, Saheli Ghosh, Majid Saeedan, Akil Sevim, A.B. Siddique, Samriddhi Singla, Ganesh Sivaram, Tin Vu, and Yaming Zhang. "Beast: Scalable Exploratory Analytics on Spatio-temporal Data", In International Conference on Information and Knowledge Management (CIKM), 12 pages, 2021. DOI>10.1145/3459637.3481897 -- PDF

2020

  • Tin Vu, Solluna Liu, Renzhong Wang, and Kumarswamy Valegerepura. "Noise Prediction for Geocoding Queries using Word Geospatial Embedding and Bidirectional LSTM", In Proceedings of the 28th International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2020, November, 2020. DOI>10.1145/3397536.3422201 -- PDF

  • Puloma Katiyar, Tin Vu, Sara Migliorini, Alberto Belussi, and Ahmed Eldawy. "SpiderWeb: A Spatial Data Generator on the Web", In 28th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL 2020, November, 2020. DOI>10.1145/3397536.3422351 -- PDF

  • Tin Vu, Alberto Belussi, Sara Migliorini, and Ahmed Eldawy. "Using Deep Learning for Big Spatial Data Partitioning", In ACM Transactions on Spatial Algorithms and Systems (TSAS), 2020. DOI>10.1145/3402126 -- PDF

  • Tin Vu and Ahmed Eldawy, "DeepSampling: Selectivity Estimation with Predicted Error and Response Time", DeepSpatial2020, 1st ACM SIGKDD Workshop on Deep Learning for Spatiotemporal Data, Applications, and Systems. DOI>10.1145/0000000.0000000 -- PDF

  • Tin Vu and Ahmed Eldawy. "R*-Grove: Balanced Spatial Partitioning for Large-Scale Datasets", In Frontiers in Big Data, August, 2020. DOI>10.3389/fdata.2020.00028 -- PDF

2019

  • Saheli Ghosh, Tin Vu, Mehrad Amin Eskandari and Ahmed Eldawy, "UCR-STAR: the UCR spatio-temporal active repository", SIGSPATIAL Special 11, no. 2 (2019): 34-40. DOI>10.1145/3377000.3377005 -- PDF

  • Tin Vu, Sara Migliorini, Ahmed Eldawy, and Alberto Belussi. "Spatial Data Generators", In 1st ACM SIGSPATIAL International Workshop on Spatial Gems (SpatialGems 2019), 2019. Best Paper Award. DOI>10.1145/0000000.0000000 -- PDF

  • Tin Vu, "Deep Query Optimization", In Proceedings of the 2019 International Conference on Management of Data (pp. 1856-1858). DOI>10.1145/3299869.3300104 -- PDF

2018

  • Tin Vu and Ahmed Eldawy. "R-Grove: growing a family of R-trees in the big-data forest", In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, (SIGSPATIAL 2018), November, Seattle, WA, pages 532-535, 2018. DOI>10.1145/3274895.3274984 -- PDF

2015

  • Thanh Nguyen Trung, Tin Vu, Minh Nguyen, "BFC: High performance distributed big file cloud storage based on key value store", 16th IEEE/ ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/ Distributed Computing, Takamatsu, Japan, 06/2015. DOI>10.1109/SNPD.2015.7176209 -- PDF

Skills

Programming Languages
  • Java
  • C++
  • Python
Platforms & tools
  • Apache Hadoop
  • Apache Spark
  • TensorFlow
  • Scikit-learn, matplotlib

Interests

I love running. I run ~3 miles every day.

I also like books, especially historical books. This is my reading list.

Other links: Ha Tran, the most incredible female vocalist in Vietnam.

Awards & Certifications

  • 2nd Place and Best Presentation Award, Esri Weekend of Innovation Intern Hackathon 2019
  • Dean's Distinguished Fellowship, College of Engineering, University of California, Riverside (2016-2017)
  • NSF Student Travel Awards (SIGSPATIAL 2017, SIGSPATIAL 2018, SIGSPATIAL 2019)
  • 1st Prize of Microsoft Start-up Students Contest (10 teams in the Final Round), April 2013.
  • 1st Prize of Samsung SmartTV Development Contest, October 2013
  • 1st Prize of WOWZAPP 2012, worldwide hackathon for Windows 8, Microsoft (ranked 1st among 82 teams in the contest), November 2012
  • Certificate of Merit for graduating with a Very Good degree, Hanoi University of Science and Technology, July 2013
  • Government-sponsored Scholarship, granted annually based on academic performance; Ministry of Education and Training, 2008-2013
  • PVFCCo scholarship: Encouragement Scholarship for outstanding students; PetroVietnam Fertilizers & Chemicals Corporation (PVFCCo), Vietnam Oil and Gas Group, 2011-2012
  • Acer scholarship: Scholarship for outstanding students from Acer Inc, 2010
  • Itochu scholarship: Scholarship for outstanding students from Itochu Corporation, Japan, 2009