Discovering geographical topics in the twitter stream. Hong, L., Ahmed, A., Gurumurthy, S., Smola, A., & Tsioutsiouliklis, K. In Mille, A., Gandon, F., Misselis, J., Rabinovich, M., & Staab, S., editors, pages 769–778, 2012. ACM.
Micro-blogging services have become indispensable communication tools for online users for disseminating breaking news, eyewitness accounts, individual expression, and protest groups. Recently, Twitter and other online social networking services such as Foursquare, Gowalla, Facebook and Yelp have started supporting location services in their messages, either explicitly, by letting users choose their places, or implicitly, by enabling geo-tagging, which associates messages with latitudes and longitudes. This functionality allows researchers to address an exciting set of questions: 1) how is information created and shared across geographical locations, 2) how do spatial and linguistic characteristics of people vary across regions, and 3) how can human mobility be modeled. Although many attempts have been made to tackle these problems, previous methods are either too complicated to implement or so oversimplified that they cannot yield reasonable performance. It is a challenging task to discover topics and identify users' interests from these geo-tagged messages due to the sheer amount of data and the diversity of language variations used on these location-sharing services. In this paper we focus on Twitter and present an algorithm that models diversity in tweets based on topical diversity, geographical diversity, and an interest distribution of the user. Furthermore, we take the Markovian nature of a user's location into account. Our model exploits sparse factorial coding of the attributes, thus allowing us to deal with a large and diverse set of covariates efficiently. Our approach is vital for applications such as user profiling, content recommendation and topic tracking. We show high accuracy in location estimation based on our model. Moreover, the algorithm identifies interesting topics based on location and language.
