An Evaluation of Topic Modelling Techniques for Twitter. Jónsson, E. & Stolee, J.
In this paper, we evaluate several topic modelling algorithms and examine their performance on Twitter tweets. LDA [1] is an algorithm that is often used to model topics within text, and it has proven effective; however, LDA may not perform well on documents that are short in length [2, 3, 4]. We compare LDA to three models that offer potential improvements over the shortcomings of LDA when modelling tweets. These include a variation of LDA, referred to as LDA-U, which aggregates data on a per-user basis in an effort to improve the standard LDA model's performance [3]. We also evaluate two other models specifically designed to work with short text: the "biterm topic model" (BTM), and a "word2vec Gaussian mixture model", which models topics as a distribution over words in semantic space [4].
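
The per-user aggregation behind LDA-U can be illustrated with a minimal sketch. It assumes tweets arrive as (user, text) pairs and uses naive lowercase whitespace tokenization; the function name `pool_tweets_by_user` and the sample data are hypothetical and are not taken from the paper. Each pooled token list would then serve as one document for a standard LDA implementation.

```python
from collections import defaultdict


def pool_tweets_by_user(tweets):
    """Merge all tweets by the same user into one pseudo-document.

    `tweets` is an iterable of (user, text) pairs (an assumed input format).
    Returns a dict mapping each user to a single list of tokens.
    """
    pooled = defaultdict(list)
    for user, text in tweets:
        # Naive tokenization: lowercase, split on whitespace (illustrative only).
        pooled[user].extend(text.lower().split())
    return dict(pooled)


# Hypothetical sample data for illustration.
tweets = [
    ("alice", "Topic models for short text"),
    ("bob", "Coffee first"),
    ("alice", "LDA struggles with tweets"),
]

docs = pool_tweets_by_user(tweets)
for user, tokens in docs.items():
    print(user, len(tokens), tokens)
```

Pooling lengthens each document, which gives LDA more word co-occurrences per document than a single tweet provides; the trade-off is that one user's tweets may span several unrelated topics.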
