Welcome to the SAX (Symbolic Aggregate approXimation) Homepage!

SAX is the first symbolic representation for time series that allows for dimensionality reduction and indexing with a lower-bounding distance measure. In classic data mining tasks such as clustering, classification, and indexing, SAX is as good as well-known representations such as the Discrete Wavelet Transform (DWT) and the Discrete Fourier Transform (DFT), while requiring less storage space. In addition, the representation allows researchers to avail themselves of the wealth of data structures and algorithms in bioinformatics and text mining, and also provides solutions to many challenges associated with current data mining tasks. One example is motif discovery, a problem which we defined for time series data. There is great potential for extending and applying the discrete representation to a wide class of data mining tasks.
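For readers who want a concrete picture of the representation, here is a minimal Python sketch. It is not the code distributed on this page. It assumes the construction usually given in the SAX papers: z-normalize the series, reduce it with Piecewise Aggregate Approximation (PAA), then map each PAA value to a symbol using breakpoints that cut a standard normal curve into equiprobable regions. The function names (znorm, paa, breakpoints, sax_word, mindist) and the choice of population standard deviation are illustrative assumptions.

```python
from bisect import bisect_right
from math import sqrt
from statistics import NormalDist, mean, pstdev

SYMBOLS = "abcdefghijklmnopqrst"  # illustrative alphabet, up to 20 symbols


def znorm(series):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)  # constant series: nothing to scale
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width frame."""
    if len(series) % n_segments != 0:
        raise ValueError("this sketch needs len(series) divisible by n_segments")
    width = len(series) // n_segments
    return [mean(series[i * width:(i + 1) * width]) for i in range(n_segments)]


def breakpoints(alphabet_size):
    """Cut points that split a standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]


def sax_word(series, n_segments, alphabet_size):
    """Convert a raw series into a word of n_segments symbols."""
    cuts = breakpoints(alphabet_size)
    reduced = paa(znorm(series), n_segments)
    return "".join(SYMBOLS[bisect_right(cuts, v)] for v in reduced)


def mindist(word_a, word_b, original_length, alphabet_size):
    """Distance between two words that lower-bounds the Euclidean distance
    between the z-normalized series they were built from."""
    if len(word_a) != len(word_b):
        raise ValueError("words must have the same length")
    cuts = breakpoints(alphabet_size)
    total = 0.0
    for a, b in zip(word_a, word_b):
        lo, hi = sorted((SYMBOLS.index(a), SYMBOLS.index(b)))
        if hi - lo > 1:  # identical or adjacent symbols contribute zero
            total += (cuts[hi - 1] - cuts[lo]) ** 2
    return sqrt(original_length / len(word_a)) * sqrt(total)
```

With these definitions, sax_word([1, 2, 3, 4, 5, 6, 7, 8], 4, 4) yields the word "abcd", and mindist between that word and the word of the reversed series stays below the Euclidean distance between the two z-normalized series, which is what makes the representation usable for indexing.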

SAX was invented by Eamonn Keogh and Jessica Lin in 2002 (Jessica came up with the name!). Jessica Lin is now an Assistant Professor at GMU. Her SAX page is here.

Edward Tufte was kind enough to mention that SAX allows a sparkline-like visualization of data. The relevant paper is this one [pdf].

Li Wei has generalized the SAX code to handle the case where N/n is not an integer, and to allow alphabet sizes up to 20. Download this zip file for the code and details.
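To see why the non-integer case needs special handling, here is a small Python sketch of one way to do it. It assumes N is the number of data points and n is the number of segments, and it may not match what the code in the zip file does. The idea is to repeat every point n times, so the stretched series has N*n points, and then average consecutive blocks of N points. A point that straddles two frames then contributes to both in proportion to its overlap. The name paa_any_length is hypothetical.

```python
def paa_any_length(series, n_segments):
    """PAA that also works when len(series) is not a multiple of n_segments."""
    big_n = len(series)
    if not 1 <= n_segments <= big_n:
        raise ValueError("need 1 <= n_segments <= len(series)")
    # Repeat each point n_segments times: the stretched series has
    # big_n * n_segments points, which always splits into equal blocks.
    stretched = [x for x in series for _ in range(n_segments)]
    return [
        sum(stretched[i * big_n:(i + 1) * big_n]) / big_n
        for i in range(n_segments)
    ]
```

For example, paa_any_length([1, 2, 3, 4, 5], 2) gives approximately [1.8, 4.2]: the middle point is shared equally between the two frames. When N/n is an integer the result is the ordinary frame mean.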

If you want a copy of my SAX time series/Shape tutorial, download this.

Here is a video of Dr. Keogh giving a talk at Google about using SAX for various problems, including shape mining.

NEWS: Much of the utility of SAX has now been subsumed by iSAX, which is a generalization of SAX that allows indexing and mining of massive datasets. Visit the iSAX page.

  • the performance SAX enables is amazing, and I think a real breakthrough. As an example, we can find similarity searches using edit distance over 10,000 time series in 50 milliseconds. Ray Cromwell, Timepedia.org
  • SAX represents the state-of-the-art in time series data streams analysis due to its generality. Gaber and Gama, Tutorial PKDD07
  • In order to characterize the expression waveforms we follow the basic SAX  formalism for time-series analysis presented by Keogh and Lin... Androulakis et al.
  • To circumvent the limitations of our previous work, we now rely on a similarity measure that is based on a recent technique called symbolic aggregate approximation (SAX). Almotairi, Saleh et al. 2007.
  • (SAX based VizTree is).. a way to do such analysis more systematically Edward Tufte.
  • The method is based on the notion of so called time signature of the clusters, introduced in Lin & Keogh and obtained using a recent time series analysis method called the Symbolic Aggregate approximation (SAX). Pouget, M. Dacier, J. Zimmerman, A. Clark, and G. Mohay. Journal of Information Assurance and Security 1 (2006) 21-32
  • Our goal is to identify those transcripts that share significant components of their expression patterns. In order to do so, we explore the SAX idea of Lin and Keogh... Yang et al.
  • we take SAX Motif developed by Keogh in order to support a medical expert in discovering interesting knowledge. Kitaguchi, S.
  • In order to symbolize a street data, we utilize the SAX approach. Jalili and Alipour. 
  • ...we examine another interesting query, the Time Relaxed Spatiotemporal Trajectory Join (TRSTJ)... we address the TRSTJ problem using SAX... Bakalov, Hadjieleftheriou and Tsotras. 
  • We have decided to use SAX (to detect sophisticated attack tools ).. SAX is a recent and popular method with interesting proven properties. F. Pouget, G. Urvoy-Keller, and M. Dacier 
  • ..we are currently using (Lin and Keoghs SAX) approach to creating discrete data from continuous data. Amy McGovern et al.
  • (to find repeated patterns in protein unfolding data) ...we adopted a two step approach called SAX Ferreira et al.
  • ..we use SAX and Keogh's Tarzan algorithm to do anomaly detection in network traffic. Kyoji Umemura et al.
  • SAX has already prove efficient in a large variety of domains Fabian Pouget, Telecom Paris.
  • SAX representation of abstracted data makes analysis (of anterior-posterior center of pressure) more easy and accurate.   Bhatkar et al.
  • Our Symbolic Transformation (based on SAX method) can be use to discover novel gene relations by mining similar subsequences in time-series microarray data. Vincent Shin-Mu Tseng
  • ..we use SAX bitmap matrices to compute an anomaly score for acoustic signals, enabling the extraction of bird vocalizations and other acoustic events Kasten, McKinley and Gage. 2007
  • Using the time-series data as an input, it takes too much computation amount to extract motifs from the human motion information. Therefore, we use Symbolic Aggregate approXimation (SAX). Araki, Arita and Taniguchi 2006
  • SAX has the advantage of dimensionality and noise reductions. It also allows real valued data to remain the original characteristics with only an infinitesimal time and space overhead...we therefore use it for to determine behavior of system... Lavangnananda and Wongwattanakarn. SMCai07.
  • SAX demonstrates some promising properties for the field of anomaly detection in a marine engine. Morgan, Liu, Turnbull, and Brown 2007.
  • ..motivated by recent advances in the symbolic representation of streaming data (SAX), effectively reduces the dimensionality of.. Annu. Rev. Biomed. Eng. 2007.
  • (we use SAX to create a) ..secure multiparty protocol for the privacy preserving pattern discovery problem. Costa da Silva and  Klusch 2007.
  • By using SAX with the sensor network data, we are able to detect such complex patterns with good accuracy .. SAX is a very mature and robust solution for mining time-series data. Zoumboulakis and Roussos 2007
  • we apply the (SAX based) motif discovery approach the analysis of responses obtained by tactile stimulation of different body areas. Fabri et al. IJCNN07
  • We extend the  SAX approach.. to support Query-by-Singing/Humming. Duda, Nurnberger and Stober (2007).
  • We use an algorithm based on SAX (Symbolic Aggregate approXimation) to discover human skills.. Makio, Tanaka, and Uehara 2007
  • symbolic aggregate approximation (SAX) outperform other dimensionality reduction techniques like singular value decomposition or discrete fourier transform (SVD, DFT) for time series data.. Assent, Krieger, Afschari and Seidl EDBT 2008
  • Portions of our work have been inspired by Symbolic Aggregate Approximation (SAX).. Cohen, Bjornsson, Temple, Banker, and Roysam. PAMI 2008
  • we apply a technique that has demonstrated success with the interpretation of univariate data, named SAX to visualize patterns that may differentiate between medical conditions such as renal and respiratory failure.  Ordonez et al. 2008 AMIA
  • The continuous attributes are transformed into ordered categories using the transformation technique presented in SAX. Ralph Krieger 2008

Papers by Keogh and collaborators that use SAX (in random order).

In [1] we show how to use SAX to find time series discords, which are unusual time series. In [2] we consider a special case of SAX, which has an alphabet size of 2 and a word size equal to the length of the raw data, and show that we can use this bit-level representation for a variety of data mining tasks. In [3] we show how to use SAX to create time series bitmaps, which allow visualization of time series data directly within a standard GUI such as MS Windows. In [4] we further show how to use time series bitmaps to do anomaly detection. In [5] we show that SAX can support parameter-lite data mining of time series, including classification and clustering. In [7] we show that SAX can replace standard representations of time series (e.g., DWT, DFT) for all classic data mining problems, including classification, clustering and indexing. We first used SAX to find time series motifs (exactly, and somewhat fast) in [9], and later showed a blindingly fast probabilistic algorithm in [8]. In [10] we tentatively showed how to use SAX to meaningfully cluster time series streams. In [12] we show an application of SAX to a shape mining problem, and in [11] we generalize the time series bitmap concept to more general datasets. In [13] we show how to use SAX to find approximately duplicated shapes (shape motifs) in large databases. Paper [14] is a journal paper reviewing SAX's first two years. Paper [15] shows how to find motifs under uniform scaling. Paper [16] introduces iSAX. Paper [17] shows how to do SAX on resource-limited sensors.
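The special case described for [2] (alphabet size 2, one symbol per data point) is easy to picture in code. The Python sketch below is an illustration only: it assumes the single breakpoint sits at the series mean (which is where it falls after z-normalization) and that values equal to the mean map to 0. See paper [2] for the actual definition. The names clip_to_bits and hamming are hypothetical.

```python
from statistics import mean


def clip_to_bits(series):
    """One bit per data point: 1 if the value is above the series mean, else 0."""
    mu = mean(series)
    return [1 if x > mu else 0 for x in series]


def hamming(bits_a, bits_b):
    """Count the positions at which two equal-length bit sequences differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("bit sequences must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))
```

For instance, clip_to_bits([2, 9, 4, 7, 1, 8]) returns [0, 1, 0, 1, 0, 1], so each data point costs a single bit of storage.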

  1. E. Keogh, J. Lin and A. Fu (2005). HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence. In Proc. of the 5th IEEE International Conference on Data Mining (ICDM 2005), pp. 226-233, Houston, Texas, Nov 27-30, 2005. [pdf]. More info on discords and HOT SAX is here. Also KAIS journal paper.

  2. Ratanamahatana, C., Keogh, E., Bagnall, T. and Lonardi, S. (2005). A Novel Bit Level Time Series Representation with Implications for Similarity Search and Clustering. PAKDD 05. [pdf] Also DMKD journal paper.

  3. Kumar, N., Lolla, N., Keogh, E., Lonardi, S., Ratanamahatana, C. A. and Wei, L. (2005). Time-series Bitmaps: A Practical Visualization Tool for working with Large Time Series Databases. In proceedings of SIAM International Conference on Data Mining (SDM '05), Newport Beach, CA, April 21-23. pp. 531-535 [pdf]

  4. Li Wei, Nitin Kumar, Venkata Nishanth Lolla, Eamonn Keogh, Stefano Lonardi, Chotirat Ann Ratanamahatana (2005). Assumption-Free Anomaly Detection in Time Series. In Proc. of the 17th International Scientific and Statistical Database Management Conference (SSDBM 2005), Santa Barbara, CA, U.S.A., June 27-29, 2005.

  5. Keogh, E., Lonardi, S. and Ratanamahatana, C. (2004). Towards Parameter-Free Data Mining. In proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Seattle, WA, Aug 22-25, 2004. [pdf, slides ] Also DMKD journal paper.

  6. Lin, J., Keogh, E., Lonardi, S., Lankford, J. P. & Nystrom, D. M. (2004). Visually Mining and Monitoring Massive Time Series. In proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Seattle, WA, Aug 22-25, 2004. [pdf, slides] Also Information Visualization journal paper.

  7. Lin, J., Keogh, E., Lonardi, S. & Chiu, B. (2003) A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. In proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. San Diego, CA. June 13. [pdf, slides]

  8. Chiu, B. Keogh, E., & Lonardi, S. (2003). Probabilistic Discovery of Time Series Motifs. In the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. August 24 - 27, 2003. Washington, DC, USA. pp 493-498. [Expanded Version pdf]

  9. Patel, P., Keogh, E., Lin, J., & Lonardi, S. (2002). Mining Motifs in Massive Time Series Databases. In proceedings of the 2002 IEEE International Conference on Data Mining. Maebashi City, Japan. Dec 9-12.

  10. E. Keogh, J. Lin, and W. Truppel. (2003). Clustering of Time Series Subsequences is Meaningless: Implications for Past and Future Research. In proceedings of the 3rd IEEE International Conference on Data Mining. Melbourne, FL. Nov 19-22. pp 115-122. [pdf] Also KAIS journal paper.

  11. Eamonn Keogh, Li Wei, Xiaopeng Xi, Stefano Lonardi, Jin Shieh, Scott Sirowy  (2006). Intelligent Icons: Integrating Lite-Weight Data Mining and Visualization into GUI Operating Systems. ICDM  2006. [pdf]

  12. Li Wei, Eamonn Keogh and Xiaopeng Xi (2006) SAXually Explicit Images: Finding Unusual Shapes. ICDM 2006. [pdf]. Now a Data Mining and Knowledge Discovery Journal paper.

  13. Xiaopeng Xi, Eamonn Keogh, Li Wei, Agenor Mafra-Neto (2007). Finding Motifs in a Database of Shapes. SIAM International Conference on Data Mining.

  14. Jessica Lin, Eamonn Keogh, Li Wei and Stefano Lonardi (2007) Experiencing SAX: a Novel Symbolic Representation of Time Series. DMKD Journal.

  15. Dragomir Yankov, Eamonn Keogh, Jose Medina, Bill Chiu, and Victor Zordan (2007). Detecting Motifs Under Uniform Scaling. SIGKDD 2007. [pdf] Supporting webpage with video and datasets.

  16. Jin Shieh and Eamonn Keogh (2008). iSAX: Indexing and Mining Terabyte Sized Time Series. SIGKDD 2008.

  17. Shashwati Kasetty, Candice Stafford, Gregory P. Walker, Xiaoyue Wang, Eamonn Keogh (2008). Real-Time Classification of Streaming Sensor Data. 20th IEEE Int'l Conference on Tools with Artificial Intelligence. [pdf]

Selected papers by others that use SAX. 

The letters below refer to the entries of the numbered list that follows, in order: [A] is entry 1, [Z] is entry 26, [AA] is entry 27, and [AT] is entry 46. In [A], "New Approaches for Representing, Analyzing and Visualizing Complex Kinetic Mechanisms", the author notes: "The procedure is based on the methodology recently proposed by (Lin and Keogh) for the analysis of multi-dimensional time series". Papers [B, C, D, E] use SAX and random projection (see [8] above) to discover motifs in telemedicine time series. In paper [F] the authors convert palmprints to time series, then to SAX, and then do biometric recognition. Paper [G] says "we take Motif developed by Keogh in order to support a medical expert in discovering interesting knowledge". Paper [H] uses SAX and random projection (see [8] above) to mine motion capture data. Paper [I] uses SAX to find repeated patterns in motion capture data. Paper [J] uses SAX to find rules in time series. Paper [K] uses SAX to find motifs of unspecified length. Paper [L] uses SAX to find repeated patterns in robot sensors. Androulakis et al. [M] use SAX for selecting maximally informative genes to enable temporal expression profiling. In paper [N] the authors use SAX to do spatiotemporal trajectory joins. In [O] the authors use SAX motifs to "analyze respiration wave during cello performance". Paper [P] uses SAX to "detect multi-headed stealthy attack tools". Paper [Q] uses SAX to "Understand the formation of tornadoes"! Paper [R] uses SAX and time series motifs for the selection of informative genes in time-course gene expression data. Paper [S] makes minor extensions to [8] above to allow it to handle the multidimensional case. Ph.D. thesis [T] uses SAX for a variety of tasks in network traffic analysis. Paper [U] uses SAX to do anomaly detection in network traffic. Paper [V] uses SAX to do prediction of severe weather phenomena such as tornadoes, thunderstorms, hail, and floods. Paper [W] uses a modification of SAX to discover novel gene relations by mining similar subsequences in time-series microarray data.
Paper [X] uses SAX for classification of environmental sounds. Paper [Y] uses SAX for financial data mining. Paper [Z] uses SAX for motif discovery. Paper [AA] uses SAX to find motifs in motion capture data. Paper [AB] uses SAX-based motifs to mine system call sequences. Paper [AC] uses SAX to classify control chart patterns. Papers [AD] and [AE] extend SAX for segmentation of time series into natural episodes. Paper [AF] uses SAX to find anomalies in a marine engine. Paper [AG] uses SAX for the selection of informative genes. Paper [AH] uses SAX to detect complex events in wireless sensor networks. Paper [AI] uses SAX to mine MRIs. Paper [AJ] uses SAX to mine motion capture data. Paper [AK] uses SAX for privacy-preserving discovery of frequent patterns in time series. Paper [AL] uses SAX to find association rules in time series. Paper [AM] uses SAX and VizTree to find patterns in CPU traces. Paper [AN] uses SAX for similarity search. Paper [AO] uses SAX for assessing the wellbeing of unsupervised, vulnerable individuals. Paper [AP] uses SAX for characterizing the mechanism of action of anti-inflammatory drugs. Paper [AQ] uses SAX to visualize patterns that may differentiate between medical conditions such as renal and respiratory failure. Paper [AR] uses SAX to understand malicious internet traffic by mining honeypot traces. Paper [AS] uses SAX to mine ECG data. Paper [AT] uses SAX to tokenize gestures.

  1. Androulakis, I. P. (2005). New Approaches for Representing, Analyzing and Visualizing Complex Kinetic Mechanisms. In proceedings of the 15th European Symposium on Computer Aided Process Engineering. Barcelona, Spain. May 29-June 1.

  2. Silvent, A., Dojat, M. & Garbay, C. (2004). Multi-level Temporal Abstraction for Medical Scenario Construction. International Journal of Adaptive Control and Signal Processing.

  3. Silvent, A. S., Garbay, C., Carry, P. Y. & Dojat, M. (2003). Data, Information and Knowledge for Medical Scenario Construction. In proceedings of the Intelligent Data Analysis In Medicine and Pharmacology Workshop (IDAMAP 2003). October. Protaras, Cyprus.

  4. F. Duchene, C. Garbay, V. Rialle, "Mining heterogeneous multivariate time-series for learning meaningful patterns: Application to home health telecare," Research Report 1070-I, Institut d'Informatique et Mathematiques Appliquees de Grenoble (IMAG), Grenoble, France, 2004.

  5. F. Duchene and C. Garbay, Apprentissage de motifs temporels, multidimensionnels et heterogenes -- Application a la telesurveillance medicale, Conference francophone sur l'apprentissage automatique (CAP), Nice, France, 31 mai - 3 juin 2005. Presses Universitaires de Grenoble.

  6. Chen, J. S., Moon, Y. S. & Yeung, H. W. (2005). Palmprint Authentication Using Time Series. In proceedings of the 5th International Conference on Audio- and Video-Based Biometric Person Authentication. Hilton Rye Town, NY. July 20-22.

  7. Kitaguchi, S. (2004). Extracting Feature based on Motif from a Chronic Hepatitis Dataset. In proceedings of the 18th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI). Kanazawa, Japan. June 2-4.

  8. Tanaka, Y. & Uehara, K. (2004). Motif Discovery Algorithm from Motion Data. In proceedings of the 18th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI). Kanazawa, Japan. June 2-4.

  9. Celly, B. & Zordan, V. B. (2004). Animated People Textures. In proceedings of the 17th International Conference on Computer Animation and Social Agents (CASA 2004). July 7-9. Geneva, Switzerland.

  10. Ohsaki, M., Sato, Y., Yokoi, H. & Yamaguchi, T. (2003). A Rule Discovery Support System for Sequential Medical Data In the Case Study of a Chronic Hepatitis Dataset. ECML 2003

  11. Tanaka, Y. & Uehara, K. (2003). Discover Motifs in Multi Dimensional Time-Series Using the Principal Component Analysis and the MDL Principle. In proceedings of the 3rd International Conference on Machine Learning and Data Mining in Pattern Recognition. pp.252-265.

  12. Koji Murakami Yoshikazu Yano Shinji Doki Shigeru Okuma (2004). Behavior extraction from a series of observed robot motion . RoboMec2004

  13. Androulakis, I.P., J. Wu, J. Vitolo and C. Roth, Selecting maximally informative genes to enable temporal expression profiling analysis, Proceedings of Foundations of Systems Biology in Engineering, Santa Barbara, CA, (2005)

  14. P. Bakalov, M. Hadjieleftheriou, V. J. Tsotras, (2005). Time Relaxed Spatiotemporal Trajectory Joins. Proc. of the ACM International Symposium on Advances in Geographic Information Systems (ACM-GIS), Bremen, Germany, November 2005.

  15. Keita Kinjo Tomonobu Ozaki Keigo Sawai Koichi Furukawa (2005) Knowledge acquisition from time series data by association rule and network analysis. The 19th Annual Conference of the Japanese Society for Artificial Intelligence, 2005

  16. F. Pouget, G. Urvoy-Keller, and M. Dacier Time Signatures to detect multi-headed stealthy attack tools In 18th Annual FIRST Conference Baltimore, Maryland, USA June 2006.

  17. Amy McGovern, Univ. of Oklahoma, Norman, OK; and A. Kruger, D. Rosendahl, and K. Droegemeier. (2007) Understanding the formation of tornadoes through data mining. Fifth Conference on Artificial Intelligence Applications to Environmental Science.

  18. Eric Yang and Ioannis Androulakis (2006) Selection of Informative Genes in Time-Course Gene Expression Data. AIChE 2006.

  19. David Minnen, Thad Starner, Irfan Essa, and Charles Isbell. Improving Activity Discovery with Automatic Neighborhood Estimation. In Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI), 2007.

  20. Fabian Pouget (2005). Distributed System of Honeypot Sensors. Telecom Paris

  21. Masayuki Okabe, Taeko Miwa, Kyoji Umemura (2006) Anomaly Detection in Network Traffic based on String Analysis. IC2006.

  22. McGovern, Amy, and Rosendahl, Derek H., and Kruger, Adrianna, and Beaton, Meredith G., and Brown, Rodger A., and Droegemeier, Kelvin K. (2007) Understanding the formation of tornadoes through data mining.  Fifth Conference on Artificial Intelligence and its Applications to Environmental Sciences at the American Meteorological Society annual conference.

  23. Vincent Shin-Mu Tseng, L. C. Chen and J. J. Liu (2006) Discovering Novel Gene Relations by Mining Similar Subsequences in Time-Series Microarray Data. in Proc. Intl Workshop on Science of Artificial, Taiwan, 2005

  24. Automated Ensemble Extraction and Analysis of Acoustic Data Streams. Technical Report MSU-CSE-06-40. December 2006. Eric P. Kasten and Philip K. McKinley Stuart H. Gage

  25. Application Research of a New Symbolic Approximation Method-SAX in Time Series Mining (2006) LIU Yi,BAO De-pei,YANG Ze-hong. COMPUTER ENGINEERING AND APPLICATIONS 2006 Vol.42 No.27

  26. Motif Detection Inspired by Immune Memory (2007) William Wilson, Phil Birkin, and Uwe Aickelin

  27. Motion motif extraction from high-dimensional motion information. Araki , Arita and Taniguchi 2006

  28. Wilson Will, Feyereisl Jan and Aickelin Uwe (2007): Detecting Motifs in System Call Sequences, Proceedings of the 8th International Workshop on Information Security Applications (WISA 2007), Lecture Notes in Computer Science, pp, Jeju, Korea

  29. K. Lavangnananda, and C. Wongwattanakarn (2007) Utilizing Symbolic Representation and Evolutionary Computation in Classification of Control Chart Patterns. Soft Computing in Industrial Applications 2007

  30. T. Armstrong and T. Oates. RIPTIDE: Segmenting Data Using Multiple Resolutions. In the Proceedings of the 6th IEEE International Conference on Development and Learning (ICDL), 2007.

  31. T. Armstrong and T. Oates. UNDERTOW: Multi-Level Segmentation of Real-Valued Time Series. In the Proceedings of the 22nd AAAI Conference on Artificial Intelligence (AAAI) (student abstract), 2007.

  32. Time Discretisation Applied to Anomaly Detection in a Marine Engine. Morgan, Liu, Turnbull, and Brown 2007.

  33. Yang, E., F. Berthiamume, M. L. Yarmush, and I. P. Androulakis. SeLection of INformative Genes via Symbolic Hashing Of Time Series. Proceedings of the Joint 9th International Symposium, Processing Systems Engineering and 16th European Symposium, 2006.

  34. M. Zoumboulakis and G. Roussos, Escalation: Complex Event Detection in Wireless Sensor Networks, in Proceedings of 2nd European Conference on Smart Sensing and Context (EuroSSC), 23-25 Oct 2007, Lake District, UK.

  35. M. Fabri, G. Mascioli, G. Palonara, A. M. Perdon, S. R. Viola (2007) Activation and delay in FMRI brain signals of selective attention. in Proceedings of Int. IJCNN07 Workshop on Neurodynamics, Orlando, Florida, USA, August 17, 2007.

  36. Kosuke Makio, Yoshiki Tanaka, and Kuniaki Uehara (2007) Discovery of Skills from Motion Data. New Frontiers in Artificial Intelligence

  37. Da Silva, J.C.; Klusch, M. (2007): Privacy-Preserving Discovery of Frequent Patterns in Time Series. Proceedings of the 7th Industrial Conference on Data Mining ICDM, Leipzig, Germany, Springer.

  38. Discovery Association Rules in Time Series Data. Kittipong Warasup and Chakarida Nukoolkit

  39. Ooi Boon Yaik Chan Huah Yong Fazilah Haron (2006) CPU Usage Pattern Discovery Using Suffix Tree. Distributed Frameworks for Multimedia Applications, 2006.

  40. Combining SAX and Piecewise Linear Approximation to Improve Similarity Search on Financial Time Series. Hung, Nguyen Quoc Viet and Anh, Duong Tuan. Information Technology Convergence, 2007. ISITC 2007.

  41.  Julia Hunter and Martin Colley (2007) Feature Extraction from Sensor Data Streams for Real-Time Human Behaviour Recognition. PKDD2007

  42. Analysis of Regulatory Interaction Networks from Clusters of Co-expressed Genes (2008) E. Yang et al

  43. Visualizing Multivariate Time Series Data to Detect Specific Medical Conditions. Ordonez et al. AMIA 2008.

  44. Almotairi, Saleh I. et al (2007) Extracting Inter-arrival Time Based Behaviour from Honeypot Traffic using Cliques.

  45. Kulahcioglu B., Ozdemir S., Kumova B.I., Application of Symbolic Piecewise Aggregate Approximation (PAA) Analysis to ECG Signals, The 17th IASTED International Conference on Applied Simulation and Modelling (ASM 2008).

  46. Tokenization for Gesture Space Modelling. Aaron Licata, Alexandra Psarrou. 13th International Conference on Applications of Natural Language to Information Systems, Doctoral Symposium (NLDB'08-DS)

Some other papers that reference SAX.

  1. Chen, J. S., Moon, Y. S. & Yeung, H. W. (2005). Palmprint Authentication Using Time Series. In proceedings of the 5th International Conference on Audio- and Video-Based Biometric Person Authentication. Hilton Rye Town, NY. July 20-22.

  2. Gaber, M. M., Zaslavsky, A. & Krishnaswamy, S. (2005). Mining Data Streams: A Review. ACM SIGMOD Record, Vol. 34, No. 1. June 2005.

  3. Chen, L & Ozsu, M. T. (2005). Using Multi-Scale Histograms to Answer Pattern Existence and Shape Match Queries. In proceedings of the 17th International Conference on Scientific and Statistical Database Management (SSDBM). Santa Barbara, CA. June 27-29.

  4. Morchen, F. & Ultsch, A. (2005). Optimizing Time Series Discretization for Knowledge Discovery. In proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Chicago, IL. Aug 21-24.

  5. Morchen, F., Ultsch, A. & Hoos, O. (2005). Extracting Interpretable Muscle Activation Patterns with Time Series Knowledge Mining. International Journal of Knowledge-Based & Intelligent Engineering Systems.

  6. Morchen, F., Ultsch, A., Thies, M., Lohken, I., Nocker, M., Stamm, C., Efthymiou, N. & Kummerer, M. (2005). MusicMiner: Visualizing Timbre Distance of Music as Topographical Maps. Tech Report. Department of Mathematics and Computer Science, University of Marburg, Germany.

  7. Makio, K., Tanaka, Y. & Uehara, K. (2005). Discovery of Skills from Motion Data. Tech Report

  8.  Zuo, X. & Jin, X. (2005). Accurate Symbolization of Time Series. In proceedings of the 9th Pacific-Asia Conference on Knowledge Discovery and Data Mining. Hanoi, Vietnam. May 18-20.

  9. Liu, Z., Yu, J. X., Lin, X., Lu, H. & Wang, W. (2005). Where Are the Motifs in Time-Series Data. In proceedings of the 9th Pacific-Asia Conference on Knowledge Discovery and Data Mining. Hanoi, Vietnam. May 18-20.

  10.  Megalooikonomou, V., Wang, Q., Li, G. & Faloutsos, C. (2005). Multiresolution Symbolic Representation of Time Series. In proceedings of the 21st IEEE International Conference on Data Engineering (ICDE). Tokyo, Japan. Apr 5-9.

  11. Hetland, M. L. & Satrom, P. (2005). Evolutionary Rule Mining in Time Series Databases. Machine Learning.

  12. Bagnall, A. & Janacek, G. (2005). Clustering Time Series with Clipped Data. Machine Learning.

  13. Jalili, S. & Alipour, M. A. (2004). Incremental Relation Exploration within Urban Traffic Flows. In proceedings of the 6th International Conference on Applied Computational Intelligence. Blankenberghe, Belgium. Sept 1-3.

  14. Nukoolkit, C. & Rattanamahawichai, S. (2004). Clustering and Similarity Matching of Time Series Data with Sequence Alignment. In proceedings of the 1st Thailand Computer Science Conference. Bangkok, Thailand. Dec 16-17. pp. 18-23.

  15. Goebel, V. & Plagemann, T. (2004). Tutorial: Data Stream Management Systems (DSMS) - Applications, Concepts, and Systems. In proceedings of the 2nd International Workshop on Multimedia Interactive Protocols and Systems (MIPS) Grenoble, France. Nov 16-19.

  16. Fu, T. C., Chung, F. L., Luk, R. & Ng, C. M. (2004). Financial Time Series Indexing Based on Low Resolution Clustering. In proceedings of the Workshop on Temporal Data Mining: Algorithms, Theory and Applications, at the 4th IEEE International Conference on Data Mining (ICDM) Brighton, UK. Nov 1.

  17. Gaudin, R. & Nicoloyannis, N. (2004). Apprentissage non supervise de series temporelles a l'aide des k-Means et d'une nouvelle methode d'agregation de series.

  18. Tan, Z. & Tung, A. H. (2004). Substructure Clustering on Sequential 3d Object Datasets. In proceedings of the 20th International Conference on Data Engineering (ICDE). Boston, MA. Mar 30 - Apr 2.

  19. Wu, Y. & Chang, E. Y. (2004). Distance Function Design and Fusion for Sequence Data. In proceedings of the 13th International Conference on Information and Knowledge Management (CIKM). Washington DC. Nov 8-13.

  20. Megalooikonomou, V., Li, G., Wang, Q. & Faloutsos, C. (2004). A Dimensionality Reduction Technique for Efficient Similarity Analysis of Time Series Databases. In proceedings of the 13th ACM Conference on Information and Knowledge Management (CIKM). Washington, D.C. Nov 8-13.

  21. Denton, A. (2004). Density-Based Clustering of Time Series Subsequences. In proceedings of the 3rd Workshop on Mining Temporal and Sequential Data (TDM), in conjunction with the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Seattle, WA. Aug 22.

  22.  Tanaka, Y. & Uehara, K. (2004). Motif Discovery Algorithm from Motion Data. In proceedings of the 18th Annual Conference of the Japanese Society for Artificial Intelligence (JSAI). Kanazawa, Japan. June 2-4.

  23. Zhang, H., Ho, T. B. & Lin, M. S. (2004). A Non-Parametric Wavelet Feature Extractor for Time-Series Classification. In proceedings of the 8th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD). May 26-28. Sydney, Australia.

  24. Moerchen, F. & Ultsch, A. (2004). Mining Hierarchical Temporal Patterns in Multivariate Time Series. In proceedings of the 27th German Conference on Artificial Intelligence (KI). Sept 20-24. Ulm, Germany.

  25. Morchen, F. & Ultsch, A. (2004). Discovering Temporal Knowledge in Multivariate Time Series.

  26. Hofmann, U., Miloucheva, I., Pfeiffenberger, T. & Strohmeier, F. (2004). Active Monitoring Toolkit for Longterm QoS Analysis in Large Scale Internet. In proceedings of the 2nd International Workshop on Inter-Domain Performance and Simulation (IPS). March 22-23. Budapest, Hungary.

  27.  Rombo, S. & Terracina, G. (2004). Discovering Representative Models in Large Time Series Databases. In proceedings of the 6th International Conference On Flexible Query Answering Systems (FQAS 2004). June 24-26. Lyon, France. Lecture Notes in Computer Science, Springer-Verlag.

  28. Udechukwu, A., Barker, K. & Alhajj, R. (2004). An Efficient Framework for Time Series Trend Mining. In proceedings of the 6th International Conference on Enterprise Information Systems (ICEIS 2004). April 14-17. Porto, Portugal.

  29. Zhang, H., Ho, T.B., Zhang, Y., Lin, M.S. (2005). Unsupervised Feature Extraction for Time Series Clustering Using Orthogonal Wavelet Transform, Journal Informatica

  30. Udechukwu, A., Barker, K. & Alhajj, R. (2004). Discovering All Frequent Trends in Time Series. In proceedings of the 2004 Winter International Symposium on Information and Communication Technologies (WISICT 2004). Jan 5-8. Cancun, Mexico.

  31. Bagnall, A. J. & Janacek, G. (2004). Clustering Time Series from ARMA Models with Clipped Data. Technical Report CMP-C04-01. School of Computing Science, University of East Anglia. February.

  32.  Somayajulu G. Sripada, Ehud Reiter, Jim Hunter and Jin Yu (2003). Generating English Summaries of Time Series Data using the Gricean Maxims. In the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. August 24 - 27. Washington, DC, USA.

  33.  Tanaka, Y. & Uehara, K. (2003). Discover Motifs in Multi Dimensional Time-Series Using the Principal Component Analysis and the MDL Principle. In proceedings of the 3rd International Conference on Machine Learning and Data Mining in Pattern Recognition. pp.252-265.

  34. Hui Zhang, Tu Bao Ho, Mao-Song Lin, and Wei Huang (2005). Combining the Global and Partial Information for Distance-Based Time Series Classification and Clustering. Journal of Advanced Computational Intelligence and Intelligent Informatics, Vol. 10, No. 1, 2006.

  35. Siba Haidar, Philippe Joly, Bilal Chebaro: Style Similarity Measure for Video Documents Comparison. CIVR 2005: 307-317

  36. Xinqiang Zuo, Xiaoming Jin (2005) A Multi-Hierarchical Representation for Similarity Measurement of Time Series. PAKDD

  37. Joao Gama and Carlos Pinto (2005). Discretization from Data Streams: Applications to Histograms and Data Mining. PKDD 2005.

  38. Chen, L., Ozsu, T. & Oria, V. (2003). Symbolic Representation and Retrieval of Moving Object Trajectories. Technical Report CS-2003-30. University of Waterloo.

  39. Tak-chung Fu (2005). SBT-Forest, an Indexing Approach for Specialized Binary Tree. Third International Conference on Information Technology and Applications (ICITA'05), Volume 1, pp. 149-154.

  40. Chen, L. & Ozsu, T. (2003). Multi-Scale Histograms for Answering Queries over Time Series Data. In proceedings of the 20th International Conference on Data Engineering (ICDE). Mar 30 - Apr 2. Boston, MA.

  41. Shmueli G. and Fienberg, S. E. (2006), Current and Potential Statistical Methods for Monitoring Multiple Data Streams for Bio-Surveillance, in Statistical Methods in Counter-Terrorism, Eds: A Wilson and D Olwell, Springer.

  42. Sankalp Balachandran, Dipankar Dasgupta, Fernando Nino, Deon Garrett. A General Framework for Evolving Multi-Shaped Detectors in Negative Selection. Submitted to the IEEE Transactions on Evolutionary Computation, January 2006.

  43. A.M.M. Sharif Ullah and Khalifa H. Harib (2005). Knowledge Extraction from Time Series and its application to Surface Roughness Simulation. Information, Knowledge and Systems Management

  44. Yu Suzuki, Kyoji Kawagoe (2006) Extended SAX, Extension of Symbolic Aggregate Approximation, for Financial Time Series Data Representation

  45. Duan Guifang, Yu Suzuki, Kyoji Kawagoe (2006) Grid Representation of Time Series Data

  46. Claudia Bauzer-Medeiros (2006). Vers un entrepot de donnees pour le trafic routier.

  47. Battuguldur Lkhagva, Yu Suzuki and Kyoji Kawagoe (2006). "New Time Series Data Representation ESAX for Financial Applications". International Special Workshop on Databases for Next-Generation Researchers (SWOD 2006), in conjunction with the International Conference on Data Engineering (ICDE 2006), pp. 17-22. Atlanta, USA, April 2006.

  48. Tak Chung Fu (2007) Visualizing Frequently Appearing and Surprising Patterns from Time Series across Different Resolutions. ADMA 2007

  49. Pouget, M. Dacier, J. Zimmerman, A. Clark, and G. Mohay. Internet Attack Knowledge Discovery via Clusters and Cliques of Attack Traces Journal of Information Assurance and Security 1 (2006) 21-32

  50. Jai-Jin Lim and Shin, K.G. (2005). Energy-efficient self-adapting online linear forecasting for wireless sensor network applications. Mobile Adhoc and Sensor Systems Conference, 2005.

  51. Bernard Hugueney (2006). Adaptive Segmentation-Based Symbolic Representations of Time Series for Better Modeling and Lower Bounding Distance Measures

  52. McGovern, Amy, Kruger, Adrianna, Rosendahl, Derek, and Droegemeier, Kelvin. (2006) Open problem: Dynamic Relational Models for Improved Hazardous Weather Prediction. Presented at the ICML Workshop on Open Problems in Statistical Relational Learning.

  53. Pedro G. Ferreira, Paulo J. Azevedo, Candida G. Silva, Rui M.M. Brito (2006). Mining Approximate Motifs in Time Series. DS 2006.

  54. Xiang Lian and Lei Chen (2006). Efficient Methods on Predictions for Similarity Search over Stream Time Series. 18th International Conference on Scientific and Statistical Database Management (SSDBM'06), pp. 241-250.

  55. Aleks Aris, Ben Shneiderman, Catherine Plaisant, Galit Shmueli and Wolfgang Jank. (2006). Representing Unevenly-Spaced Time Series Data for Visualization and Interactive Exploration

  56. Faisal I. Bashir, Ashfaq A. Khokhar, and Dan Schonfeld (2006). Real-Time Motion Trajectory-Based Indexing and Retrieval of Video Sequences

  57. Battuguldur Lkhagva, Yu Suzuki and Kyoji Kawagoe (2006) New Time Series Data Representation ESAX for Financial Applications

  58. Guifang Duan, Yu Suzuki, Kyoji Kawagoe (2006). Grid Representation for Efficient Similarity Search in Time Series Databases.

  59. Battuguldur Lkhagva, Yu Suzuki and Kyoji Kawagoe (2006). Extended SAX: Extension of Symbolic Aggregate Approximation for Financial Time Series Data Representation. DEWS2006 4A-i8

  60. Guifang Duan, Yu Suzuki, Kyoji Kawagoe (2006). Grid Representation of Time Series Data for Similarity Search. DEWS2006 2A-i5

  61. David Minnen, Thad Starner, Irfan Essa, and Charles Isbell (2006). Discovering Characteristic Actions from On-Body Sensor Data

  62. Junzhi Li, Guoping Xia (2006). Association Rules Mining from Time Series Based on Rough Set. Sixth International Conference on Intelligent Systems Design and Applications (ISDA'06), pp. 509-516.

  63. E Y, Foteinou PT, King KR, Yarmush ML, Androulakis (2007). A novel non-overlapping bi-clustering algorithm for network generation using Living Cell Array data. Bioinformatics (7 September 2007).

  64. Weimin Li, Liangxu Liu and Jiajin Le. Clustering Streaming Time Series Using CBC.