Synchronization-based clustering on evolving data stream. Shao, J., Tan, Y., Gao, L., Yang, Q., Plant, C., & Assent, I. Information Sciences, 501:573–587, October 2019.
Clustering streams of data is of increasing importance in many applications. In this paper, we propose a new synchronization-based clustering approach for evolving data streams, called SyncTree, which maintains all micro-clusters at different levels of granularity depending upon the data recency. Instead of using a sliding window or decay function to focus on recent data, SyncTree summarizes all continuously-arriving objects as synchronized micro-clusters sequentially in a batch fashion. Owing to the powerful concept of synchronization, the derived micro-clusters truly reflect the intrinsic cluster structure rather than summarize statistics of data, and old micro-clusters can be intuitively summarized at a higher level by iterative clustering to fit memory constraints. Building upon the hierarchical micro-clusters, SyncTree allows investigating the cluster structure of the data stream between any two time stamps in the past, and also provides a principled way to analyze the cluster evolution. Empirical results demonstrate that our method has good performance compared to state-of-the-art algorithms.
@article{shao_synchronization-based_2019,
	title = {Synchronization-based clustering on evolving data stream},
	volume = {501},
	issn = {0020-0255},
	url = {https://www.sciencedirect.com/science/article/pii/S0020025518307400},
	doi = {10.1016/j.ins.2018.09.035},
	abstract = {Clustering streams of data is of increasing importance in many applications. In this paper, we propose a new synchronization-based clustering approach for evolving data streams, called SyncTree, which maintains all micro-clusters at different levels of granularity depending upon the data recency. Instead of using a sliding window or decay function to focus on recent data, SyncTree summarizes all continuously-arriving objects as synchronized micro-clusters sequentially in a batch fashion. Owing to the powerful concept of synchronization, the derived micro-clusters truly reflect the intrinsic cluster structure rather than summarize statistics of data, and old micro-clusters can be intuitively summarized at a higher level by iterative clustering to fit memory constraints. Building upon the hierarchical micro-clusters, SyncTree allows investigating the cluster structure of the data stream between any two time stamps in the past, and also provides a principled way to analyze the cluster evolution. Empirical results demonstrate that our method has good performance compared to state-of-the-art algorithms.},
	language = {en},
	urldate = {2021-10-18},
	journal = {Information Sciences},
	author = {Shao, Junming and Tan, Yue and Gao, Lianli and Yang, Qinli and Plant, Claudia and Assent, Ira},
	month = oct,
	year = {2019},
	keywords = {Clustering, Data stream, Evolving analysis, Synchronization},
	pages = {573--587},
}
