Hardware/Software Accelerated Data Mining for Real-Time Monitoring of Streaming Pediatric ICU Data

Welcome to the UCR Pediatric Intensive Care Unit Site

This page reports progress on our NSF-funded project: IIS-1161997 II: Medium: Hardware/Software Accelerated Data Mining for Real-Time Monitoring of Streaming Pediatric ICU Data. We gratefully acknowledge the financial support of NSF, and the moral support of our program director.

Our team consists of Eamonn Keogh, Vassilis Tsotras, Walid Najjar, and Randall Wetzel, along with some very talented grad and undergrad researchers (listed below).

Abstract:

On any given day in America, there are at least one thousand children fighting for their lives in Pediatric Intensive Care Units (PICUs). In the PICU, the patient's condition is carefully monitored. Most of this data is shown in a five-minute “sliding window” display, so a doctor summoned to a patient’s bedside always has the patient’s most recent history to consider. However, what happens to the data that “falls off” this sliding window? In most PICUs, a tiny fraction of it is coarsely aggregated and recorded but, surprisingly, most of it is simply discarded. Even when most or all of the data is recorded, its sheer volume overwhelms researchers and analysts; very few tools exist to help them make sense of and learn from this data.

We believe that this currently discarded data is a potential goldmine of actionable knowledge that could improve outcomes (decreased mortality/morbidity, reduced pain, etc.) and reduce costs (implicit in reduced length of stay). However, the very nature of this data (multivariate, heterogeneous, high dimensional, temporal, noisy, biased, and high frequency) poses significant challenges for traditional analytical techniques from statistics and data mining.

In this proposal we plan to investigate two tightly coupled ideas:

Mining archives of annotated PICU data to find regularities and patterns that can be used to aid in diagnostics and prediction of outcomes.
Monitoring ICU telemetry in real time to detect whether the patterns and rules discovered in the offline step have occurred and can be used to guide interventions (actions by the doctor).
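
As an illustration only, the second idea can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: a single univariate signal, z-normalized Euclidean distance as the similarity measure, and a hypothetical pattern and threshold. The names `znorm` and `monitor` are invented for this example and do not come from the project's software, which may work quite differently.

```python
import math
from collections import deque


def znorm(xs):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in xs]


def monitor(stream, pattern, threshold):
    """Yield (index, distance) whenever the latest window matches the pattern.

    index is the stream position of the last sample in the matching window.
    The pattern and threshold are hypothetical inputs, standing in for
    whatever an offline mining step might have discovered.
    """
    m = len(pattern)
    target = znorm(pattern)
    window = deque(maxlen=m)  # the oldest sample "falls off" automatically
    for i, sample in enumerate(stream):
        window.append(sample)
        if len(window) < m:
            continue  # not enough history yet
        current = znorm(window)
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(current, target)))
        if dist <= threshold:
            yield i, dist


# Made-up readings: a flat signal, then a rise-and-fall shaped like the pattern.
readings = [5, 5, 5, 5, 5, 10, 12, 14, 12, 10, 5, 5]
hits = list(monitor(readings, pattern=[0, 1, 2, 1, 0], threshold=0.1))
```

Because both the window and the pattern are z-normalized, the match depends on shape rather than on absolute level or scale. Recomputing the normalization for every window is the simplest possible approach; it is not meant to suggest how a real-time or hardware-accelerated system would do it.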

Our project brings together experts in data mining (Keogh, Tsotras), high performance computing (Najjar) and medicine (Wetzel) to investigate holistic solutions to the above problems.

Publications:

Gustavo E. A. P. A. Batista, Eamonn J. Keogh, Oben Moses Tataw, Vinicius M. A. de Souza. CID: An Efficient Complexity-Invariant Distance for Time Series. Data Mining and Knowledge Discovery 2013

Alessandro Camerra, Jin Shieh, Themis Palpanas, Thanawin Rakthanmanon, Eamonn Keogh (2013) Beyond one billion time series: indexing and mining very large time series collections with iSAX2+. Knowledge and Information Systems. February 2013

Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, Eamonn Keogh. Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping. Transactions on Knowledge Discovery from Data. 2013

Thanawin Rakthanmanon, Eamonn Keogh, Stefano Lonardi, Scott Evans. MDL-Based Time Series Clustering. Knowledge and Information Systems 2012.

Thanawin Rakthanmanon and Eamonn Keogh. Fast-Shapelets: A Scalable Algorithm for Discovering Time Series Shapelets. SDM 2013

Bing Hu, Yanping Chen and Eamonn Keogh. Time Series Classification under More Realistic Assumptions. SDM 2013

Mahbub Hasan, Abdullah Mueen, Vassilis Tsotras, Eamonn Keogh (2012). Diversifying Query Results on Semi-Structured Data. Proceedings of CIKM 2012.

Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, Eamonn Keogh (2012). Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping. SIGKDD 2012. Best paper award winner.

R. Halstead, J. Villarreal, W. Najjar. Compiling irregular applications for reconfigurable systems, to appear in International Journal of High Performance Computing and Networking (IJHPCN)

R. Moussalli, W. Najjar, X. Luo and A. Khan. A High Throughput No-Stall Golomb-Rice Hardware Decoder, to appear in The 21st IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM) April 28-30, 2013, Seattle, Washington

Tutorials:

None yet; expect the first tutorial in 2014.

Data:

We plan to release all our data. In general, the data will be made available within six months of being used in a publication. However, if you have a pressing need, just ask.

Funded Students: To date, the following students have been partly or fully funded by this grant.

Bing Hu (Ph.D. ongoing)
Jesin Zakaria (Ph.D. ongoing)
Xiaoyin Ma (Ph.D. ongoing)
Roger Moussalli, PhD student CSE, graduated 3/2013, now at IBM Watson.
Reaz Uddin
Thanawin (Art) Rakthanmanon (Ph.D. 2012; Kasetsart University)
Bilson Campana (Ph.D. 2012; Google Research)