TAPER: query-aware, partition-enhancement for large, heterogenous graphs. Firth, H. & Missier, P. Distributed and Parallel Databases, 2017.
TAPER: query-aware, partition-enhancement for large, heterogenous graphs [link]Paper  doi  abstract   bibtex   
Graph partitioning has long been seen as a viable approach to addressing Graph DBMS scalability. A partitioning, however, may introduce extra query processing latency unless it is sensitive to a specific query workload, and optimised to minimise inter-partition traversals for that workload. Additionally, it should also be possible to incrementally adjust the partitioning in reaction to changes in the graph topology, the query workload, or both. Because of their complexity, current partitioning algorithms fall short of one or both of these requirements, as they are designed for offline use and as one-off operations. The TAPER system aims to address both requirements, whilst leveraging existing partitioning algorithms. TAPER takes any given initial partitioning as a starting point, and iteratively adjusts it by swapping chosen vertices across partitions, heuristically reducing the probability of inter-partition traversals for a given path queries workload. Iterations are inexpensive thanks to time and space optimisations in the underlying support data structures. We evaluate TAPER on two different large test graphs and over realistic query workloads. Our results indicate that, given a hash-based partitioning, TAPER reduces the number of inter-partition traversals by \$\$\textbackslashbackslashsim \$\$ ? 80%; given an unweighted Metis partitioning, by \$\$\textbackslashbackslashsim \$\$ ? 30%. These reductions are achieved within eight iterations and with the additional advantage of being workload-aware and usable online.
@article{firth_taper_2017,
	title = {{TAPER}: query-aware, partition-enhancement for large, heterogenous graphs},
	issn = {1573-7578},
	url = {http://dx.doi.org/10.1007/s10619-017-7196-y},
	doi = {10.1007/s10619-017-7196-y},
	abstract = {Graph partitioning has long been seen as a viable approach to addressing Graph DBMS scalability. A partitioning, however, may introduce extra query processing latency unless it is sensitive to a specific query workload, and optimised to minimise inter-partition traversals for that workload. Additionally, it should also be possible to incrementally adjust the partitioning in reaction to changes in the graph topology, the query workload, or both. Because of their complexity, current partitioning algorithms fall short of one or both of these requirements, as they are designed for offline use and as one-off operations. The TAPER system aims to address both requirements, whilst leveraging existing partitioning algorithms. TAPER takes any given initial partitioning as a starting point, and iteratively adjusts it by swapping chosen vertices across partitions, heuristically reducing the probability of inter-partition traversals for a given path queries workload. Iterations are inexpensive thanks to time and space optimisations in the underlying support data structures. We evaluate TAPER on two different large test graphs and over realistic query workloads. Our results indicate that, given a hash-based partitioning, TAPER reduces the number of inter-partition traversals by \$\${\textbackslash}backslashsim \$\$ ? 80\%; given an unweighted Metis partitioning, by \$\${\textbackslash}backslashsim \$\$ ? 30\%. These reductions are achieved within eight iterations and with the additional advantage of being workload-aware and usable online.},
	journal = {Distributed and Parallel Databases},
	author = {Firth, Hugo and Missier, Paolo},
	year = {2017},
	pages = {1--31}
}

Downloads: 0