Short and Sparse Text Topic Modeling via Self-Aggregation. Quan, X., Kit, C., Ge, Y., & Pan, S. J.
The overwhelming amount of short text data on social media and elsewhere has posed great challenges to topic modeling due to the sparsity problem. Most existing attempts to alleviate this problem resort to heuristic strategies to aggregate short texts into pseudo-documents before the application of standard topic modeling. Although such strategies cannot be well generalized to more general genres of short texts, the success has shed light on how to develop a generalized solution. In this paper, we present a novel model towards this goal by integrating topic modeling with short text aggregation during topic inference. The aggregation is founded on general topical affinity of texts rather than particular heuristics, making the model readily applicable to various short texts. Experimental results on real-world datasets validate the effectiveness of this new model, suggesting that it can distill more meaningful topics from short texts.
