Generating Links by Mining Quotations. Kolak, O. & Schilit, B. N. In pages 117-126.
doi  abstract   bibtex   
Scanning books, magazines, and newspapers has become a widespread activity because people believe that much of the worlds information still resides off-line. In general after works are scanned they are indexed for search and processed to add links. This paper describes a new approach to automatically add links by mining popularly quoted passages. Our technique connects elements that are semantically rich, so strong relations are made. Moreover, link targets point within a work, facilitating navigation. This paper makes three contributions. We describe a scalable algorithm for mining repeated word sequences from extremely large text corpora. Second, we present techniques that filter and rank the repeated sequences for quotations. Third, we present a new user interface for navigating across and within works in the collection using quotation links. Our system has been run on a digital library of over 1 million books and has been used by thousands of people.
@inproceedings{ kol08,
  crossref = {acmht08},
  author = {Okan Kolak and Bill N. Schilit},
  title = {Generating Links by Mining Quotations},
  pages = {117-126},
  doi = {10.1145/1379092.1379117},
  abstract = {Scanning books, magazines, and newspapers has become a widespread activity because people believe that much of the worlds information still resides off-line. In general after works are scanned they are indexed for search and processed to add links. This paper describes a new approach to automatically add links by mining popularly quoted passages. Our technique connects elements that are semantically rich, so strong relations are made. Moreover, link targets point within a work, facilitating navigation. This paper makes three contributions. We describe a scalable algorithm for mining repeated word sequences from extremely large text corpora. Second, we present techniques that filter and rank the repeated sequences for quotations. Third, we present a new user interface for navigating across and within works in the collection using quotation links. Our system has been run on a digital library of over 1 million books and has been used by thousands of people.}
}
Downloads: 0