Contributions to high-performance big data computing

Contributions to high-performance big data computing. Fox, G., Qiu, J., Crandall, D., von Laszewski, G., Beckstein, O., Paden, J., Paraskevakos, I., Jha, S., Wang, F., Marathe, M., Vullikanti, A., & Cheatham III, T. E. In Grandinetti, L., Joubert, G. R., Michielsen, K., Mirtaheri, S. L., Taufer, M., & Yokota, R., editors, Future Trends of HPC in a Disruptive Scenario, volume 34 of Advances in Parallel Computing, pages 34–81. IOS Press, 2019.

Our project is at the interface of Big Data and HPC – High-Performance Big Data computing and this paper describes a collaboration between 7 collaborating Universities at Arizona State, Indiana (lead), Kansas, Rutgers, Stony Brook, Virginia Tech, and Utah. It addresses the intersection of High-performance and Big Data computing with several different application areas or communities driving the requirements for software systems and algorithms. We describe the base architecture, including the HPC-ABDS, High-Performance Computing enhanced Apache Big Data Stack, and an application use case study identifying key features that determine software and algorithm requirements. We summarize middleware including Harp-DAAL collective communication layer, Twister2 Big Data toolkit, and pilot jobs. Then we present the SPIDAL Scalable Parallel Interoperable Data Analytics Library and our work for it in core machine-learning, image processing and the application communities, Network science, Polar Science, Biomolecular Simulations, Pathology, and Spatial systems. We describe basic algorithms and their integration in end-to-end use cases.

@incollection{fox_contributions_2019,
	series = {Advances in {Parallel} {Computing}},
	title = {Contributions to high-performance big data computing},
	volume = {34},
	url = {https://doi.org/10.3233/APC190005},
	abstract = {Our project is at the interface of Big Data and HPC – High-Performance Big Data computing and this paper describes a collaboration between 7 collaborating Universities at Arizona State, Indiana (lead), Kansas, Rutgers, Stony Brook, Virginia Tech, and Utah. It addresses the intersection of High-performance and Big Data computing with several different application areas or communities driving the requirements for software systems and algorithms. We describe the base architecture, including the HPC-ABDS, High-Performance Computing enhanced Apache Big Data Stack, and an application use case study identifying key features that determine software and algorithm requirements. We summarize middleware including Harp-DAAL collective communication layer, Twister2 Big Data toolkit, and pilot jobs. Then we present the SPIDAL Scalable Parallel Interoperable Data Analytics Library and our work for it in core machine-learning, image processing and the application communities, Network science, Polar Science, Biomolecular Simulations, Pathology, and Spatial systems. We describe basic algorithms and their integration in end-to-end use cases.},
	booktitle = {Future {Trends} of {HPC} in a {Disruptive} {Scenario}},
	publisher = {IOS Press},
	author = {Fox, Geoffrey and Qiu, Judy and Crandall, David and von Laszewski, Gregor and Beckstein, Oliver and Paden, John and Paraskevakos, Ioannis and Jha, Shantenu and Wang, Fusheng and Marathe, Madhav and Vullikanti, Anil and Cheatham III, Thomas E.},
	editor = {Grandinetti, Lucio and Joubert, Gerhard R. and Michielsen, Kristel and Mirtaheri, Seyedeh Leili and Taufer, Michela and Yokota, Rio},
	year = {2019},
	pages = {34--81},
}
