Multi-level Semantic Labelling of Numerical Values. Neumaier, S., Umbrich, J., Parreira, J., & Polleres, A. In Proceedings of the 15th International Semantic Web Conference (ISWC 2016) - Part I, volume 9981 of Lecture Notes in Computer Science (LNCS), pages 428–445, Kobe, Japan, October 2016. Springer. Nominated for best student paper award.

With the success of Open Data a huge amount of tabular data sources became available that could potentially be mapped and linked into the Web of (Linked) Data. Most existing approaches to ``semantically label'' such tabular data rely on mappings of textual information to classes, properties, or instances in RDF knowledge bases in order to link – and eventually transform – tabular data into RDF. However, as we will illustrate, Open Data tables typically contain a large portion of numerical columns and/or non-textual headers; therefore solutions that solely focus on textual ``cues'' are only partially applicable for mapping such data sources. We propose an approach to find and rank candidates of semantic labels and context descriptions for a given bag of numerical values. To this end, we apply a hierarchical clustering over information taken from DBpedia to build a background knowledge graph of possible ``semantic contexts'' for bags of numerical values, over which we perform a nearest neighbour search to rank the most likely candidates. Our evaluation shows that our approach can assign fine-grained semantic labels, when there is enough supporting evidence in the background knowledge graph. In other cases, our approach can nevertheless assign high level contexts to the data, which could potentially be used in combination with other approaches to narrow down the search space of possible labels.
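The nearest neighbour ranking mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name a distance measure, so the two-sample Kolmogorov-Smirnov statistic used here is an assumption, the hierarchical background knowledge graph is reduced to a flat dictionary, and the function names, labels, and values are all hypothetical.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (assumed distance measure):
    the largest gap between the empirical CDFs of the two bags."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def rank_labels(query, background, k=3):
    """Return the k labels whose bags of values are closest to the query bag."""
    scored = sorted(
        (ks_distance(query, values), label)
        for label, values in background.items()
    )
    return [(label, dist) for dist, label in scored[:k]]


# Hypothetical background knowledge: label -> bag of numerical values.
background = {
    "height of person (m)": [1.60, 1.70, 1.75, 1.80, 1.90],
    "population of city": [12000, 85000, 430000, 1500000],
    "elevation of mountain (m)": [2500, 3100, 4200, 4800, 8000],
}

# An unlabelled numerical column, treated as a bag of values.
query = [1.65, 1.72, 1.78, 1.85]
ranked = rank_labels(query, background)
for label, dist in ranked:
    print(f"{label}: {dist:.2f}")
```

A distribution-based distance is used because a bag of values has no meaningful row order; only the shape and range of the values carry information about what the column might describe.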
@inproceedings{neum-etal-2016ISWC,
author = {Neumaier, Sebastian and Umbrich, J\"urgen and Parreira, Josiane and Polleres, Axel},
title = {Multi-level Semantic Labelling of Numerical Values},
abstract = {With the success of Open Data a huge amount of tabular data sources became available that could potentially be mapped and linked into the Web of (Linked) Data. Most existing approaches to ``semantically label'' such tabular data rely on mappings of textual information to classes, properties, or instances in RDF knowledge bases in order to link -- and eventually transform -- tabular data into RDF. However, as we will illustrate, Open Data tables typically contain a large portion of numerical columns and/or non-textual headers; therefore solutions that solely focus on textual ``cues'' are only partially applicable for mapping such data sources. We propose an approach to find and rank candidates of semantic labels and context descriptions for a given bag of numerical values. To this end, we apply a hierarchical clustering over information taken from DBpedia to build a background knowledge graph of possible ``semantic contexts'' for bags of numerical values, over which we perform a nearest neighbour search to rank the most likely candidates. Our evaluation shows that our approach can assign fine-grained semantic labels, when there is enough supporting evidence in the background knowledge graph. In other cases, our approach can nevertheless assign high level contexts to the data, which could potentially be used in combination with other approaches to narrow down the search space of possible labels.},
note = {\textbf{Nominated for best student paper award}},
month = oct,
day = {17--21},
pages = {428--445},
year = 2016,
booktitle = {Proceedings of the 15th International Semantic Web Conference (ISWC 2016) - Part I},
volume = 9981,
address = {Kobe, Japan},
series = LNCS,
publisher = {Springer},
url = {http://polleres.net/publications/neum-etal-2016ISWC.pdf},
doi = {10.1007/978-3-319-46523-4_26}
}