Urban Web Crawling. Ahlers, D. & Boll, S. In pages 25-32.
Local search is increasingly becoming a major focus of research interest as a widely recognized speciality search with a large application area. Its data is usually aggregated from a variety of sources. One as yet largely untapped source of location data is the WWW. Today, the Web does not explicitly reveal its location relation; rather, this information is hidden somewhere within pages' contents. To exploit such location information, we need to find, extract, and geospatially index relevant Web pages. For effective retrieval of such content, this paper examines the application of focused Web crawling to the geospatial domain. We describe our approach for geo-aware focused crawling of urban areas and other regions with a high building density and present our experimental results, which give us insight into spatial Web information such as location density and link distance between topical pages. Our crawls and evaluations back our hypothesis that adaptive focused crawling yields good results on the urban geospatial topic.
@inproceedings{ ahl08,
  crossref = {locweb2008},
  author = {Dirk Ahlers and Susanne Boll},
  title = {Urban Web Crawling},
  pages = {25--32},
  abstract = {Local search is increasingly becoming a major focus of research interest as a widely recognized speciality search with a large application area. Its data is usually aggregated from a variety of sources. One as yet largely untapped source of location data is the WWW. Today, the Web does not explicitly reveal its location relation; rather, this information is hidden somewhere within pages' contents. To exploit such location information, we need to find, extract, and geospatially index relevant Web pages. For effective retrieval of such content, this paper examines the application of focused Web crawling to the geospatial domain. We describe our approach for geo-aware focused crawling of urban areas and other regions with a high building density and present our experimental results, which give us insight into spatial Web information such as location density and link distance between topical pages. Our crawls and evaluations back our hypothesis that adaptive focused crawling yields good results on the urban geospatial topic.}
}
