Variable-length Subsequence Clustering in Time Series. Duan, J. & Guo, L. IEEE Transactions on Knowledge and Data Engineering, 2020.
Subsequence clustering is an important issue in time series data mining. Observing that most time series consist of various patterns with different unknown lengths, we propose an optimization framework to adaptively estimate the lengths and representations for different patterns. Our framework minimizes the inner subsequence cluster errors with respect to subsequence clusters and segmentation under a time series cover constraint, where the subsequence cluster lengths can be variable. To optimize our framework, we first generate abundant initial subsequence clusters with different lengths. Then, three cluster operations, i.e., cluster splitting, combination, and removal, are used to iteratively refine the cluster lengths and representations by, respectively, splitting clusters consisting of different patterns, joining neighboring clusters belonging to the same pattern, and removing clusters until the predefined cluster number is reached. During each cluster refinement, we employ an efficient algorithm based on dynamic programming to alternately optimize subsequence clusters and segmentation. Our method can automatically and efficiently extract the unknown variable-length subsequence clusters in the time series. Comparative experiments against state-of-the-art methods are conducted on various synthetic and real time series, and the quantitative and qualitative results demonstrate the effectiveness of our method.
@article{duan_variable-length_2020,
	title = {Variable-length {Subsequence} {Clustering} in {Time} {Series}},
	issn = {1558-2191},
	doi = {10.1109/TKDE.2020.2986965},
	abstract = {Subsequence clustering is an important issue in time series data mining. Observing that most time series consist of various patterns with different unknown lengths, we propose an optimization framework to adaptively estimate the lengths and representations for different patterns. Our framework minimizes the inner subsequence cluster errors with respect to subsequence clusters and segmentation under a time series cover constraint, where the subsequence cluster lengths can be variable. To optimize our framework, we first generate abundant initial subsequence clusters with different lengths. Then, three cluster operations, i.e., cluster splitting, combination, and removal, are used to iteratively refine the cluster lengths and representations by, respectively, splitting clusters consisting of different patterns, joining neighboring clusters belonging to the same pattern, and removing clusters until the predefined cluster number is reached. During each cluster refinement, we employ an efficient algorithm based on dynamic programming to alternately optimize subsequence clusters and segmentation. Our method can automatically and efficiently extract the unknown variable-length subsequence clusters in the time series. Comparative experiments against state-of-the-art methods are conducted on various synthetic and real time series, and the quantitative and qualitative results demonstrate the effectiveness of our method.},
	journal = {IEEE Transactions on Knowledge and Data Engineering},
	author = {Duan, Jiangyong and Guo, Lili},
	year = {2020},
	keywords = {Adaptation models, Clustering algorithms, Clustering methods, Data mining, Feature extraction, Optimization, Time series analysis, subsequence clustering, time series data mining, time series segmentation, variable-length patterns},
	pages = {1--1},
}
