From describing to prescribing parallelism: Translating the SPEC ACCEL OpenACC suite to OpenMP target directives

From describing to prescribing parallelism: Translating the SPEC ACCEL OpenACC suite to OpenMP target directives. Juckeland, G., Hernandez, O., Jacob, A. C., Neilson, D., Larrea, V. G. V., Wienke, S., Bobyr, A., Brantley, W. C., Chandrasekaran, S., Colgrove, M., Grund, A., Henschel, R., Joubert, W., Müller, M. S., Raddatz, D., Shelepugin, P., Whitney, B., Wang, B., & Kumaran, K. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9945 LNCS:470-488, Springer Verlag, 2016.


Current and next-generation HPC systems will exploit accelerators and self-hosting devices within their compute nodes to accelerate applications. This comes at a time when programmer productivity and the ability to produce portable code have been recognized as major concerns. One of the goals of OpenMP and OpenACC is to allow the user to specify parallelism via directives so that compilers can generate device-specific code and optimizations. However, the challenge of porting codes becomes more complex because of the different types of parallelism and memory hierarchies available on different architectures. In this paper we discuss our experience with porting the SPEC ACCEL benchmarks from OpenACC to OpenMP 4.5 using a performance-portable style that lets the compiler make platform-specific optimizations to achieve good performance on a variety of systems. The ported SPEC ACCEL OpenMP benchmarks were validated on different platforms, including Xeon Phi, GPUs, and CPUs. We believe that this experience can help the community and compiler vendors understand how users plan to write OpenMP 4.5 applications in a performance-portable style. © Springer International Publishing AG 2016.

@article{Juckeland2016470,
 title = {From describing to prescribing parallelism: Translating the SPEC ACCEL OpenACC suite to OpenMP target directives},
 type = {article},
 year = {2016},
 keywords = {Application programming interfaces (API),Benchmarking,Codes (s,Offloading,OpenMP,OpenACC,SPEC,SPEC ACCEL},
 pages = {470-488},
 volume = {9945 LNCS},
 websites = {https://www.scopus.com/inward/record.uri?eid=2-s2.0-84992549244&doi=10.1007%2F978-3-319-46079-6_33&partnerID=40&md5=9655cf681ac699c57e5a2bdef2a48e0c},
 publisher = {Springer Verlag},
 id = {61bf5bfe-ccf6-3949-93c5-87e33f167000},
 created = {2019-10-01T17:20:36.936Z},
 file_attached = {false},
 profile_id = {42d295c0-0737-38d6-8b43-508cab6ea85d},
 last_modified = {2019-10-01T17:25:25.871Z},
 read = {false},
 starred = {false},
 authored = {true},
 confirmed = {true},
 hidden = {false},
 citation_key = {Juckeland2016470},
 source_type = {article},
 notes = {cited By 0; Conference of International Workshops on High Performance Computing, ISC High Performance 2016 and Workshop on 2nd International Workshop on Communication Architectures at Extreme Scale, ExaComm 2016, Workshop on Exascale Multi/Many Core Computing Systems, E-MuCoCoS 2016, HPC I/O in the Data Center, HPC-IODC 2016, Application Performance on Intel Xeon Phi – Being Prepared for KNL and Beyond, IXPUG 2016, International Workshop on OpenPOWER for HPC, IWOPH 2016, International Workshop on Performance Portable Programming Models for Accelerators, P^3MA 2016, Workshop on Virtualization in High-Performance Cloud Computing, VHPC 2016, Workshop on Performance and Scalability of Storage Systems, WOPSSS 2016 ; Conference Date: 19 June 2016 Through 23 June 2016; Conference Code:185039},
 folder_uuids = {22c3b665-9e84-4884-8172-710aa9082eaf},
 private_publication = {false},
 abstract = {Current and next-generation HPC systems will exploit accelerators and self-hosting devices within their compute nodes to accelerate applications. This comes at a time when programmer productivity and the ability to produce portable code have been recognized as major concerns. One of the goals of OpenMP and OpenACC is to allow the user to specify parallelism via directives so that compilers can generate device-specific code and optimizations. However, the challenge of porting codes becomes more complex because of the different types of parallelism and memory hierarchies available on different architectures. In this paper we discuss our experience with porting the SPEC ACCEL benchmarks from OpenACC to OpenMP 4.5 using a performance-portable style that lets the compiler make platform-specific optimizations to achieve good performance on a variety of systems. The ported SPEC ACCEL OpenMP benchmarks were validated on different platforms, including Xeon Phi, GPUs, and CPUs. We believe that this experience can help the community and compiler vendors understand how users plan to write OpenMP 4.5 applications in a performance-portable style. © Springer International Publishing AG 2016.},
 bibtype = {article},
 author = {Juckeland, G and Hernandez, O and Jacob, A C and Neilson, D and Larrea, V G V and Wienke, S and Bobyr, A and Brantley, W C and Chandrasekaran, S and Colgrove, M and Grund, A and Henschel, R and Joubert, W and Müller, M S and Raddatz, D and Shelepugin, P and Whitney, B and Wang, B and Kumaran, K},
 editor = {Kunkel, J M and Taufer, M and Mohr, B},
 doi = {10.1007/978-3-319-46079-6_33},
 journal = {Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)}
}
