Big Data Clustering: A Review. Shirkhorshidi, A. S., Aghabozorgi, S., Wah, T. Y., & Herawan, T. In Murgante, B., Misra, S., Rocha, A. M. A. C., Torre, C., Rocha, J. G., Falcão, M. I., Taniar, D., Apduhan, B. O., & Gervasi, O., editors, Computational Science and Its Applications – ICCSA 2014, Lecture Notes in Computer Science, pages 707–720, Cham, 2014. Springer International Publishing.
Clustering is an essential data mining tool for analyzing big data. Applying clustering techniques to big data is difficult due to the new challenges that big data raises. As big data refers to terabytes and petabytes of data and clustering algorithms come with high computational costs, the question is how to cope with this problem and how to deploy clustering techniques on big data and get results in a reasonable time. This study aims to review the trend and progress of clustering algorithms in coping with big data challenges, from the very first proposed algorithms to today’s novel solutions. The algorithms and the challenges targeted in producing improved clustering algorithms are introduced and analyzed, and afterward a possible future path toward more advanced algorithms is outlined based on today’s available technologies and frameworks.
@inproceedings{shirkhorshidi_big_2014,
	address = {Cham},
	series = {Lecture {Notes} in {Computer} {Science}},
	title = {Big {Data} {Clustering}: {A} {Review}},
	isbn = {978-3-319-09156-3},
	shorttitle = {Big {Data} {Clustering}},
	doi = {10.1007/978-3-319-09156-3_49},
	abstract = {Clustering is an essential data mining tool for analyzing big data. Applying clustering techniques to big data is difficult due to the new challenges that big data raises. As big data refers to terabytes and petabytes of data and clustering algorithms come with high computational costs, the question is how to cope with this problem and how to deploy clustering techniques on big data and get results in a reasonable time. This study aims to review the trend and progress of clustering algorithms in coping with big data challenges, from the very first proposed algorithms to today’s novel solutions. The algorithms and the challenges targeted in producing improved clustering algorithms are introduced and analyzed, and afterward a possible future path toward more advanced algorithms is outlined based on today’s available technologies and frameworks.},
	language = {en},
	booktitle = {Computational {Science} and {Its} {Applications} – {ICCSA} 2014},
	publisher = {Springer International Publishing},
	author = {Shirkhorshidi, Ali Seyed and Aghabozorgi, Saeed and Wah, Teh Ying and Herawan, Tutut},
	editor = {Murgante, Beniamino and Misra, Sanjay and Rocha, Ana Maria A. C. and Torre, Carmelo and Rocha, Jorge Gustavo and Falcão, Maria Irene and Taniar, David and Apduhan, Bernady O. and Gervasi, Osvaldo},
	year = {2014},
	keywords = {Big Data, Clustering, MapReduce, Parallel Clustering},
	pages = {707--720},
}
