Topical Clustering of Tweets. Rosa, K., D., Shah, R., Lin, B., Gershman, A., & Frederking, R.
In the emerging field of micro-blogging and social communication services, users post millions of short messages every day. Keeping track of all the messages posted by your friends, and of the conversation as a whole, can become tedious or even impossible. In this paper, we present a study on automatically clustering and classifying Twitter messages, also known as "tweets", into different categories, inspired by the approaches taken by news aggregating services like Google News. Our results suggest that the clusters produced by traditional unsupervised methods can often be incoherent from a topical perspective, but that a supervised methodology that uses hashtags as indicators of topics produces surprisingly good results. We also discuss the temporal effects of our methodology and training set size considerations. Lastly, we describe a simple method of finding the most representative tweet in a cluster, and provide an analysis of the results.
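
The abstract does not spell out how hashtags become topic labels or how the representative tweet is chosen, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes bag-of-words term counts, treats each hashtag as a candidate topic label, and picks the tweet whose vector is closest (by cosine similarity) to the summed vector of its cluster. The function names `hashtag_labels` and `most_representative` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (the '#' sign is dropped)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def hashtag_labels(text):
    """Return the hashtags in a tweet, lowercased, as candidate topic labels.

    Assumption: a hashtag stands in for a topic label, as the abstract suggests.
    """
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_representative(tweets):
    """Return the tweet most similar to the cluster's summed term vector.

    Assumption: 'most representative' means closest to the centroid; the
    paper may define it differently.
    """
    vectors = [Counter(tokenize(t)) for t in tweets]
    centroid = Counter()
    for vector in vectors:
        centroid.update(vector)
    scores = [cosine(vector, centroid) for vector in vectors]
    best = max(range(len(tweets)), key=scores.__getitem__)
    return tweets[best]


cluster = [
    "Lakers win the game tonight #nba",
    "What a game by the Lakers #nba",
    "I love pizza",
]
print(hashtag_labels(cluster[0]))
print(most_representative(cluster))
```

A centroid-based choice is used here because it needs no training data; a supervised classifier trained on hashtag-labeled tweets would replace the clustering step, not this selection step.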
