Towards capturing and preserving changes on the Web of Data. Umbrich, J., Mrzelj, N., & Polleres, A. In Managing the Evolution and Preservation of the Data Web - First Diachron Workshop at ESWC 2015, pages 50–65, Portorož, Slovenia, May, 2015.
Existing Web archives aim to capture and preserve the changes of documents on the Web and provide data corpora of high value which are used in various areas (e.g. to optimise algorithms or to study the Zeitgeist of a generation). So far, Web archives have concentrated their efforts on capturing the large Web of documents with periodic snapshot crawls. Little attention has been paid to preserving the continuously growing Web of Data and to keeping track of the real frequency of its changes. In this work we present our efforts to capture and archive the changes on the Web of Data. We describe our infrastructure and focus on evaluating strategies to accurately capture data changes and to estimate the crawl time for a given set of URLs, with the aim of optimally scheduling the revisiting of URLs with limited resources.
