Optimizing Energy Consumption and Parallel Performance for Betweenness Centrality using GPUs. McLaughlin, A., Riedy, J., & Bader, D. A. In The IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, September, 2014. ``Rising Stars'' section
doi  abstract   bibtex   
Applications of high-performance graph analysis range from computational biology to network security and even transportation. These applications often consider graphs under rapid change and are moving beyond HPC platforms into energy-constrained embedded systems. This paper optimizes one successful and demanding analysis kernel, betweenness centrality, for NVIDIA GPU accelerators in both environments. Our algorithm for static analysis is capable of exceeding 2 million traversed edges per second per watt (MTEPS/W). Optimizing the parallel algorithm and treating the dynamic problem directly achieves a 6.39$\times$ average speed-up and 84% average reduction in energy consumption.
@inproceedings{bc-hpec14,
  author = {Adam McLaughlin and Jason Riedy and David A. Bader},
  ejr-withauthor = {Adam McLaughlin and David A. Bader},
  title = {Optimizing Energy Consumption and Parallel Performance for Betweenness Centrality using {GPU}s},
  opttags = {parallel; graph; energy},
  booktitle = {The IEEE High Performance Extreme Computing Conference (HPEC)},
  year = 2014,
  month = sep,
  address = {Waltham, MA},
  note = {``Rising Stars'' section},
  dom = 11,
  role = {proceedings},
  doi = {10.1109/HPEC.2014.7040980},
  file = {material/Optimizing_BC_HPEC14.pdf},
  abstract = {Applications of high-performance graph analysis range from computational biology to network security and even transportation.  These applications often consider graphs under rapid change and are moving beyond HPC platforms into energy-constrained embedded systems.  This paper optimizes one successful and demanding analysis kernel, betweenness centrality, for NVIDIA GPU accelerators in both environments.  Our algorithm for static analysis is capable of exceeding 2 million traversed edges per second per watt (MTEPS/W).  Optimizing the parallel algorithm and treating the dynamic problem directly achieves a 6.39$\times$ average speed-up and 84\% average reduction in energy consumption.},
  projtag = {xscala, grateful, hpda},
  keywords = {hpda, graph analysis, parallel algorithm},
  ejr-proj = {high-performance-data-analysis, graph-analysis},
  ejr-grant = {grateful, xscala}
}

Downloads: 0