A Compact In-Memory Dictionary for RDF data

A Compact In-Memory Dictionary for RDF data. Bazoobandi, H. R, De Rooij, S., Urbani, J., Ten Teije, A., Van Harmelen, F., & Bal, H. In twelfth European Semantic Web Conference, ESWC, (LNCS 9088), pages 205–220, 2015. Springer.

While almost all dictionary compression techniques focus on static RDF data, we present a compact in-memory RDF dictionary for dynamic and streaming data. To do so, we analysed the structure of terms in real-world datasets and observed a high degree of common prefixes. We studied the applicability of Trie data structures on RDF data to reduce the memory occupied by common prefixes and discovered that all existing Trie implementations lead to either poor performance, or an excessive memory wastage. In our approach, we address the existing limitations of Tries for RDF data, and propose a new variant of Trie which contains some optimizations explicitly designed to improve the performance on RDF data. Furthermore, we show how we use this Trie as an in-memory dictionary by using as numerical ID a memory address instead of an integer counter. This design removes the need for an additional decoding data structure, and further reduces the occupied memory. An empirical analysis on real-world datasets shows that with a reasonable overhead our technique uses 50-59% less memory than a conventional uncompressed dictionary.
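
The idea of using a node's location as the term ID can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain character trie in Python, and the class name `TrieDictionary`, its methods `encode` and `decode`, and the use of an index into a node arena as a stand-in for a memory address are all assumptions made for illustration. It shows only that shared prefixes are stored once and that an ID which points at the final trie node can be decoded by walking parent links, with no separate ID-to-string table.

```python
class TrieDictionary:
    """Illustrative character trie; a node's arena index serves as the term ID.

    The arena index is a hypothetical stand-in for the memory address
    mentioned in the abstract.
    """

    def __init__(self):
        # Parallel lists form the node arena; index 0 is the root.
        self.parent = [-1]    # parent index of each node
        self.char = [""]      # character on the edge leading to each node
        self.children = [{}]  # per-node map: character -> child index

    def encode(self, term):
        """Insert term if absent and return the index of its last node."""
        node = 0
        for ch in term:
            nxt = self.children[node].get(ch)
            if nxt is None:
                nxt = len(self.parent)
                self.parent.append(node)
                self.char.append(ch)
                self.children.append({})
                self.children[node][ch] = nxt
            node = nxt
        return node

    def decode(self, term_id):
        """Rebuild the term by following parent links back to the root."""
        chars = []
        node = term_id
        while node != 0:
            chars.append(self.char[node])
            node = self.parent[node]
        return "".join(reversed(chars))


d = TrieDictionary()
id_a = d.encode("http://example.org/a")
id_b = d.encode("http://example.org/b")
print(d.decode(id_a), d.decode(id_b))
```

In this sketch the two example terms share every node of their common prefix, so the second insertion adds a single node. A real implementation along the lines the abstract describes would additionally need the compactness and performance optimizations that the paper targets, which this sketch does not attempt.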

@inproceedings{Bazoobandi2015,
abstract = {While almost all dictionary compression techniques focus on static RDF data, we present a compact in-memory RDF dictionary for dynamic and streaming data. To do so, we analysed the structure of terms in real-world datasets and observed a high degree of common prefixes. We studied the applicability of Trie data structures on RDF data to reduce the memory occupied by common prefixes and discovered that all existing Trie implementations lead to either poor performance, or an excessive memory wastage. In our approach, we address the existing limitations of Tries for RDF data, and propose a new variant of Trie which contains some optimizations explicitly designed to improve the performance on RDF data. Furthermore, we show how we use this Trie as an in-memory dictionary by using as numerical ID a memory address instead of an integer counter. This design removes the need for an additional decoding data structure, and further reduces the occupied memory. An empirical analysis on real-world datasets shows that with a reasonable overhead our technique uses 50-59{\%} less memory than a conventional uncompressed dictionary.},
author = {Bazoobandi, Hamid R and {De Rooij}, Steven and Urbani, Jacopo and {Ten Teije}, Annette and {Van Harmelen}, Frank and Bal, Henri},
booktitle = {twelfth European Semantic Web Conference, ESWC, (LNCS 9088)},
pages = {205--220},
publisher = {Springer},
title = {{A Compact In-Memory Dictionary for RDF data}},
url = {http://www.cs.vu.nl/{~}frankh/postscript/ESWC15.pdf http://www.cs.vu.nl/{~}annette/papers-pdf/2015ESWC-RDFVault.pdf},
year = {2015}
}
