Multi-level Semantic Labelling of Numerical Values. Neumaier, S., Umbrich, J., Parreira, J., & Polleres, A. In Proceedings of the 15th International Semantic Web Conference (ISWC 2016) - Part I, volume 9981 of Lecture Notes in Computer Science (LNCS), pages 428–445, Kobe, Japan, October 2016. Springer. Nominated for best student paper award.
With the success of Open Data, a huge number of tabular data sources became available that could potentially be mapped and linked into the Web of (Linked) Data. Most existing approaches to "semantically label" such tabular data rely on mappings of textual information to classes, properties, or instances in RDF knowledge bases in order to link – and eventually transform – tabular data into RDF. However, as we will illustrate, Open Data tables typically contain a large portion of numerical columns and/or non-textual headers; therefore, solutions that focus solely on textual "cues" are only partially applicable for mapping such data sources. We propose an approach to find and rank candidates of semantic labels and context descriptions for a given bag of numerical values. To this end, we apply hierarchical clustering over information taken from DBpedia to build a background knowledge graph of possible "semantic contexts" for bags of numerical values, over which we perform a nearest neighbour search to rank the most likely candidates. Our evaluation shows that our approach can assign fine-grained semantic labels when there is enough supporting evidence in the background knowledge graph. In other cases, our approach can nevertheless assign high-level contexts to the data, which could potentially be used in combination with other approaches to narrow down the search space of possible labels.
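The nearest neighbour ranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name a distance measure, so the two-sample Kolmogorov-Smirnov statistic is assumed here. The `background` dictionary is invented toy data standing in for labelled bags drawn from DBpedia, the hierarchical clustering that builds the background knowledge graph is omitted, and the function names `ks_distance` and `rank_labels` are hypothetical.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (assumed distance measure):
    the largest gap between the empirical CDFs of the two bags."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def rank_labels(query, background, k=3):
    """Return the k labels whose bags are closest to the query bag."""
    scored = sorted(
        (ks_distance(query, values), label)
        for label, values in background.items()
    )
    return [(label, dist) for dist, label in scored[:k]]


# Toy background knowledge (hypothetical values, not taken from DBpedia).
background = {
    "height of person (m)": [1.60, 1.70, 1.75, 1.80, 1.90],
    "population of city": [12000, 85000, 250000, 1800000],
    "elevation of mountain (m)": [2100, 3500, 4800, 8000],
}

# An unlabelled numerical column to be matched against the background bags.
query = [1.65, 1.72, 1.78, 1.85]
ranking = rank_labels(query, background)
for label, dist in ranking:
    print(f"{dist:.2f}  {label}")
```

On this toy input the height bag ranks first because its empirical distribution overlaps the query, while the other two bags lie entirely above it and sit at the maximum distance of 1.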
