Efficient ranking and selection in high performance computing environments. Ni, E. C., Ciocan, D. F., Henderson, S. G., & Hunter, S. R. Operations Research, 65(3):821–836, 2017.
@article{nietal17,
	abstract = {The goal of ranking and selection (R&S) procedures is to identify the best stochastic system from among a finite set of competing alternatives. Such procedures require constructing estimates of each system's performance, which can be obtained simultaneously by running multiple independent replications on a parallel computing platform. Nontrivial statistical and implementation issues arise when designing R&S procedures for a parallel computing environment. We propose several design principles for parallel R&S procedures that preserve statistical validity and maximize core utilization, especially when large numbers of alternatives or cores are involved. These principles are followed closely by our parallel Good Selection Procedure (GSP), which, under the assumption of normally distributed output, (i) guarantees to select a system in the indifference zone with high probability, (ii) in tests on up to 1,024 parallel cores runs efficiently, and (iii) in an example uses smaller sample sizes compared to existing parallel procedures, particularly for large problems (over $10^6$ alternatives). In our computational study we discuss three methods for implementing GSP on parallel computers, namely the Message-Passing Interface (MPI), Hadoop MapReduce, and Spark, and show that Spark provides a good compromise between the efficiency of MPI and robustness to core failures.},
	author = {Eric C. Ni and Dragos F. Ciocan and Shane G. Henderson and Susan R. Hunter},
	date-added = {2016-09-30 16:55:01 +0000},
	date-modified = {2017-05-17 11:28:11 +0000},
	journal = {Operations Research},
	number = {3},
	pages = {821--836},
	title = {Efficient ranking and selection in high performance computing environments},
	url_paper = {pubs/ParallelRS.pdf},
	volume = {65},
	year = {2017}}