In this paper, we propose a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus. In the latent semantic space derived by the non-negative matrix factorization (NMF), each axis captures the base topic of a particular document cluster, and each document is represented as an additive combination of the base topics. The cluster membership of each document can be easily determined by finding the base topic (the axis) with which the document has the largest projection value. Our experimental evaluations show that the proposed document clustering method surpasses the latent semantic indexing and the spectral clustering methods not only in the easy and reliable derivation of document clustering results, but also in document clustering accuracies.
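
The clustering rule described above can be illustrated with a minimal sketch. It assumes a term-document matrix X (terms by documents) factored as X ≈ U Vᵀ, so that row j of V holds the projection of document j onto each base topic, and the cluster label is the index of the largest entry in that row. The abstract does not say how the factorization is computed; the multiplicative update rule used here, the helper names `nmf` and `cluster_labels`, and the toy matrix are all assumptions made for illustration, not the authors' implementation.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(X, k, iters=200, seed=0, eps=1e-9):
    """Approximate X (m x n, non-negative) as U @ V^T with U (m x k), V (n x k).

    Uses multiplicative updates for the squared-error objective.
    This choice of optimizer is an assumption; the abstract does not name one.
    """
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    # Strictly positive random initialization keeps every factor non-negative.
    U = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    V = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    Xt = transpose(X)
    for _ in range(iters):
        # U <- U * (X V) / (U V^T V), element-wise
        XV = matmul(X, V)
        UVtV = matmul(U, matmul(transpose(V), V))
        U = [[U[i][a] * XV[i][a] / (UVtV[i][a] + eps) for a in range(k)]
             for i in range(m)]
        # V <- V * (X^T U) / (V U^T U), element-wise
        XtU = matmul(Xt, U)
        VUtU = matmul(V, matmul(transpose(U), U))
        V = [[V[j][a] * XtU[j][a] / (VUtU[j][a] + eps) for a in range(k)]
             for j in range(n)]
    return U, V

def cluster_labels(V):
    """Assign each document to the base topic (axis) with the largest projection."""
    return [max(range(len(row)), key=row.__getitem__) for row in V]

# Hypothetical toy corpus: 6 terms (rows) x 4 documents (columns).
# Documents 0 and 1 share terms 0-2; documents 2 and 3 share terms 3-5.
X = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 3],
    [0, 0, 3, 2],
    [0, 0, 1, 1],
]

U, V = nmf(X, k=2)
labels = cluster_labels(V)
print(labels)  # documents 0 and 1 should share one label, documents 2 and 3 the other
```

Because the factors are non-negative, each document is a purely additive mix of topics, which is what makes the arg-max assignment meaningful without a separate clustering step on the reduced space. Which topic index ends up attached to which group depends on the random initialization, so only the grouping, not the label values, should be relied on.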