Welcome to the MK Motif Discovery Page

This page was built in support of our SDM 2009 paper,
**A. Mueen, E. Keogh, Q. Zhu, S. Cash & B. Westover** (2009). **Exact Discovery of Time Series Motifs** [pdf].
You can download all the datasets we used in the paper from
here. If you want the code,
read on.

Time series motifs are pairs of individual time series, or subsequences of a longer time series, which are very similar to each other. Since time series motifs were formalized in 2002, dozens of researchers have used them in domains as diverse as medicine, entertainment, biology, telemedicine, telepresence and severe weather prediction. Below we show a concrete example.

Consider the time series of insect telemetry below. Do you see any recurring patterns?

As it happens, there is a very interesting repeated pattern, as shown to the right (zoomed in). By referencing the accompanying video at the relevant locations, we are able to determine that this repeated pattern is not a coincidence: it corresponds to a behavior that occurs immediately after phloem (plant sap) ingestion has taken place. This example gives an intuition as to what a time series motif is (see the paper for formal details). See this file for more information on the insect problem.

A naive brute-force search would take 544,500,000 Euclidean distance calculations to find this motif. Using the MK algorithm, we can reduce this number by several orders of magnitude.
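To make the brute-force cost concrete, here is a minimal Python sketch of the naive baseline: it compares every pair of series and keeps the closest one, so a collection of n series needs n(n-1)/2 Euclidean distance calculations. This is not the MK algorithm and not the code distributed on this page; the function names (`euclidean`, `brute_force_motif`) and the toy data are assumptions made for illustration. It also treats the input as a set of individual, equal-length series; finding motifs among subsequences of one long series would additionally require excluding overlapping (trivial) matches.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_motif(series_list):
    """Hypothetical baseline: return (distance, (i, j), calls) for the closest pair.

    Every unordered pair is compared exactly once, so `calls` equals
    n * (n - 1) / 2 for n input series.
    """
    best_dist, best_pair = math.inf, None
    calls = 0
    for i, j in combinations(range(len(series_list)), 2):
        d = euclidean(series_list[i], series_list[j])
        calls += 1
        if d < best_dist:
            best_dist, best_pair = d, (i, j)
    return best_dist, best_pair, calls


# Toy data (made up): series 0 and 2 are near-copies of each other.
data = [
    [0.0, 1.0, 2.0, 1.0],
    [5.0, 5.0, 5.0, 5.0],
    [0.1, 1.1, 2.0, 0.9],
    [9.0, 0.0, 9.0, 0.0],
]
dist, pair, calls = brute_force_motif(data)
print(pair, calls)  # (0, 2) 6
```

The quadratic growth of `calls` is what makes this approach impractical on large datasets, and it is the count that an exact algorithm has to cut down without missing the true closest pair.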

Time series motifs were introduced in our ICDM 2002 paper,
and since then there have been dozens of follow-up papers. However, the MK algorithm
is the first non-trivial algorithm to discover *exact* motifs in large
datasets.

Do you want the user-friendly code to find motifs? Download and read this user's manual first (PowerPoint/PDF), and then download the code.

**Acknowledgements**:

- Thanks to all the donors of datasets. Funded by NSF 0803410 and NSF 0808770.