Density-Based Data Streams Clustering over Sliding Windows. Ren, J. & Ma, R. In 2009 Sixth International Conference on Fuzzy Systems and Knowledge Discovery, volume 5, pages 248–252, August, 2009.
Data stream clustering is an important task in data stream mining. In this paper, we propose SDStream, a new method for performing density-based data streams clustering over sliding windows. SDStream adopts CluStream clustering framework. In the online component, the potential core-micro-cluster and outlier micro-cluster structures are introduced to maintain the potential clusters and outliers. They are stored in the form of exponential histogram of cluster feature (EHCF) in main memory and are maintained by the maintenance of EHCFs. Outdated micro-clusters which need to be deleted are found by the value of t in temporal cluster feature (TCF). In the offline component, the final clusters of arbitrary shape are generated according to all the potential core-micro-clusters maintained online by DBSCAN algorithm. Experimental results show that SDStream which can generate clusters of arbitrary shape has a much higher clustering quality than CluStream which generates spherical clusters.
@inproceedings{ren_density-based_2009,
	title = {Density-{Based} {Data} {Streams} {Clustering} over {Sliding} {Windows}},
	volume = {5},
	doi = {10.1109/FSKD.2009.553},
	abstract = {Data stream clustering is an important task in data stream mining. In this paper, we propose SDStream, a new method for performing density-based data streams clustering over sliding windows. SDStream adopts CluStream clustering framework. In the online component, the potential core-micro-cluster and outlier micro-cluster structures are introduced to maintain the potential clusters and outliers. They are stored in the form of exponential histogram of cluster feature (EHCF) in main memory and are maintained by the maintenance of EHCFs. Outdated micro-clusters which need to be deleted are found by the value of t in temporal cluster feature (TCF). In the offline component, the final clusters of arbitrary shape are generated according to all the potential core-micro-clusters maintained online by DBSCAN algorithm. Experimental results show that SDStream which can generate clusters of arbitrary shape has a much higher clustering quality than CluStream which generates spherical clusters.},
	booktitle = {2009 {Sixth} {International} {Conference} on {Fuzzy} {Systems} and {Knowledge} {Discovery}},
	author = {Ren, Jiadong and Ma, Ruiqing},
	month = aug,
	year = {2009},
	keywords = {Cities and towns, Clustering algorithms, Data mining, Data structures, Educational institutions, Electronic mail, Fuzzy systems, Histograms, Partitioning algorithms, Shape, data stream, density-based clustering, sliding windows},
	pages = {248--252},
}
