ESA-Stream: Efficient Self-Adaptive Online Data Stream Clustering

ESA-Stream: Efficient Self-Adaptive Online Data Stream Clustering. Li, Y., Li, H., Wang, Z., Liu, B., Cui, J., & Fei, H. IEEE Transactions on Knowledge and Data Engineering, 2020.

Many big data applications produce a massive amount of high-dimensional, real-time, and evolving streaming data. Clustering such data streams with both effectiveness and efficiency is critical for these applications. Although there are well-known data stream clustering algorithms that are based on the popular online-offline framework, these algorithms still face some major challenges. Several critical questions are still not answered satisfactorily: How to perform dimensionality reduction effectively and efficiently in the online dynamic environment? How to enable the clustering algorithm to achieve complete real-time online processing? How to make algorithm parameters learn in a self-supervised or self-adaptive manner to cope with high-speed evolving streams? In this paper, we focus on tackling these challenges by proposing a fully online stream clustering algorithm (called ESA-Stream) that can learn parameters online dynamically in a self-adaptive manner, speed up dimensionality reduction, and cluster data streams effectively and efficiently in an online and dynamic environment. Experiments on a wide range of synthetic and real-world data streams show that ESA-Stream outperforms state-of-the-art baselines considerably in both effectiveness and efficiency.

@article{li_esa-stream_2020,
	title = {{ESA}-{Stream}: {Efficient} {Self}-{Adaptive} {Online} {Data} {Stream} {Clustering}},
	issn = {1558-2191},
	shorttitle = {{ESA}-{Stream}},
	doi = {10.1109/TKDE.2020.2990196},
	abstract = {Many big data applications produce a massive amount of high-dimensional, real-time, and evolving streaming data. Clustering such data streams with both effectiveness and efficiency is critical for these applications. Although there are well-known data stream clustering algorithms that are based on the popular online-offline framework, these algorithms still face some major challenges. Several critical questions are still not answered satisfactorily: How to perform dimensionality reduction effectively and efficiently in the online dynamic environment? How to enable the clustering algorithm to achieve complete real-time online processing? How to make algorithm parameters learn in a self-supervised or self-adaptive manner to cope with high-speed evolving streams? In this paper, we focus on tackling these challenges by proposing a fully online stream clustering algorithm (called ESA-Stream) that can learn parameters online dynamically in a self-adaptive manner, speed up dimensionality reduction, and cluster data streams effectively and efficiently in an online and dynamic environment. Experiments on a wide range of synthetic and real-world data streams show that ESA-Stream outperforms state-of-the-art baselines considerably in both effectiveness and efficiency.},
	journal = {IEEE Transactions on Knowledge and Data Engineering},
	author = {Li, Yanni and Li, Hui and Wang, Zhi and Liu, Bing and Cui, Jiangtao and Fei, Hang},
	year = {2020},
	note = {Conference Name: IEEE Transactions on Knowledge and Data Engineering},
	keywords = {Clustering algorithms, Clustering methods, Data Stream, Dimensionality reduction, Heuristic algorithms, Indexes, Online Clustering, Partitioning algorithms, Real-time systems, Self-Adaptive},
	pages = {1--1},
}

