Comparative summarisation of rich media collections. Bista, U. In ACM International Conference on Web Search and Data Mining (WSDM '19), Doctoral Consortium, Melbourne, VIC, Australia, 2019.
The goal of this thesis is to develop techniques for comparative summarisation of multimodal document collections. Comparative summarisation is extractive summarisation in comparative settings, where documents form two or more groups, e.g. articles on the same topic but from different sources. Comparative summarisation involves not only selecting representative and diverse samples within groups, but also selecting samples that highlight commonalities and differences between the groups. We posit that comparative summarisation is a fruitful problem for diverse use cases, such as comparing content over time, across authors, or across distinct viewpoints. We formulate the problem of comparative summarisation by reducing it to a binary classification problem, and define objectives that incorporate representativeness, diversity and comparativeness. We design new automatic and crowd-sourced evaluation protocols for summarisation that scale much better than evaluations requiring manually created ground-truth summaries. We show the efficacy of the approach on a newly curated dataset of controversial news topics. We plan to develop new collection comparison methods for multimodal document collections.
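To make the three objectives named in the abstract concrete, the following is a minimal sketch of a greedy selection heuristic. It is not the classification-based formulation the thesis describes. It is an illustrative stand-in that assumes documents are already represented as fixed-length feature vectors. The function names, the RBF similarity, and the weights `lam` and `mu` are all hypothetical choices made for this example.

```python
import math

def rbf(a, b, gamma=1.0):
    """RBF similarity between two equal-length feature vectors (assumed choice)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_sim(v, docs):
    """Average similarity of vector v to a non-empty list of vectors."""
    return sum(rbf(v, d) for d in docs) / len(docs)

def comparative_summary(groups, k, lam=1.0, mu=1.0):
    """Greedily pick up to k documents per group.

    groups: dict mapping a group label to a list of feature vectors.
    lam, mu: hypothetical weights for comparativeness and diversity.
    """
    summary = {}
    for label, docs in groups.items():
        # All documents that belong to any other group.
        others = [d for other, ds in groups.items() if other != label for d in ds]
        chosen = []  # indices into docs, in selection order
        while len(chosen) < min(k, len(docs)):
            best_i, best_score = None, -math.inf
            for i, v in enumerate(docs):
                if i in chosen:
                    continue
                rep = mean_sim(v, docs)                      # representativeness
                comp = mean_sim(v, others) if others else 0.0  # overlap with other groups
                div = max((rbf(v, docs[j]) for j in chosen), default=0.0)  # redundancy
                score = rep - lam * comp - mu * div
                if score > best_score:
                    best_i, best_score = i, score
            chosen.append(best_i)
        summary[label] = [docs[i] for i in chosen]
    return summary

# Toy data: two groups of 2-D points, each with one point drifting toward the other group.
groups = {
    "source_a": [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)],
    "source_b": [(3.0, 3.0), (3.1, 3.0), (3.0, 3.1), (2.0, 2.0)],
}
summary = comparative_summary(groups, k=2)
```

In this sketch the score rewards a candidate for being close to its own group, and penalises it for being close to other groups or to items already selected. It highlights differences between groups only; surfacing commonalities, which the abstract also mentions, would need a different term.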
