iSAX: indexing and mining terabyte sized time series

iSAX: indexing and mining terabyte sized time series. Shieh, J. & Keogh, E. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, KDD '08, pages 623–631, New York, NY, USA, August 2008. Association for Computing Machinery.

Current research in indexing and mining time series data has produced many interesting algorithms and representations. However, the algorithms and the size of data considered have generally not been representative of the increasingly massive datasets encountered in science, engineering, and business domains. In this work, we show how a novel multi-resolution symbolic representation can be used to index datasets which are several orders of magnitude larger than anything else considered in the literature. Our approach allows both fast exact search and ultra fast approximate search. We show how to exploit the combination of both types of search as sub-routines in data mining algorithms, allowing for the exact mining of truly massive real world datasets, containing millions of time series.
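The abstract does not spell out the representation, so the following is only a minimal sketch of one common way to build a multi-resolution symbolic word, not the authors' implementation. It assumes z-normalization, piecewise aggregate approximation (PAA), and equiprobable breakpoints from the standard normal distribution; all function names (`znorm`, `paa`, `breakpoints`, `sax_word`, `demote`) are hypothetical, and the series length is assumed to be divisible by the number of segments.

```python
import bisect
from statistics import NormalDist, mean, pstdev


def znorm(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Average the series over equal-width segments (length must divide evenly)."""
    size = len(series) // segments
    return [mean(series[i * size:(i + 1) * size]) for i in range(segments)]


def breakpoints(cardinality):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / cardinality) for i in range(1, cardinality)]


def sax_word(series, segments, cardinality):
    """Map each PAA value to the index of the region it falls in."""
    cuts = breakpoints(cardinality)
    return [bisect.bisect_right(cuts, v) for v in paa(znorm(series), segments)]


def demote(symbol, from_bits, to_bits):
    """Lower a symbol's resolution by keeping only its leading bits."""
    return symbol >> (from_bits - to_bits)


series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
coarse = sax_word(series, segments=4, cardinality=4)   # 2 bits per symbol
fine = sax_word(series, segments=4, cardinality=8)     # 3 bits per symbol
demoted = [demote(s, from_bits=3, to_bits=2) for s in fine]
print(coarse, fine, demoted)
```

In this sketch the breakpoints for cardinality 4 are a subset of those for cardinality 8, so dropping the last bit of a fine symbol yields the coarse symbol. That prefix property is what lets words at different resolutions be compared, which is the sense in which the representation is "multi-resolution" here.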

@inproceedings{shieh_isax_2008,
	address = {New York, NY, USA},
	series = {{KDD} '08},
	title = {{iSAX}: indexing and mining terabyte sized time series},
	isbn = {978-1-60558-193-4},
	shorttitle = {\textit{i}{SAX}},
	url = {https://doi.org/10.1145/1401890.1401966},
	doi = {10.1145/1401890.1401966},
	abstract = {Current research in indexing and mining time series data has produced many interesting algorithms and representations. However, the algorithms and the size of data considered have generally not been representative of the increasingly massive datasets encountered in science, engineering, and business domains. In this work, we show how a novel multi-resolution symbolic representation can be used to index datasets which are several orders of magnitude larger than anything else considered in the literature. Our approach allows both fast exact search and ultra fast approximate search. We show how to exploit the combination of both types of search as sub-routines in data mining algorithms, allowing for the exact mining of truly massive real world datasets, containing millions of time series.},
	urldate = {2020-10-01},
	booktitle = {Proceedings of the 14th {ACM} {SIGKDD} international conference on {Knowledge} discovery and data mining},
	publisher = {Association for Computing Machinery},
	author = {Shieh, Jin and Keogh, Eamonn},
	month = aug,
	year = {2008},
	keywords = {data mining, indexing, representations, time series},
	pages = {623--631},
}
