Robust Statistical Methods for Hit Selection in RNA Interference High-Throughput Screening Experiments. Zhang, X. D.; Yang, X. C.; Chung, N.; Gates, A.; Stec, E.; Kunapuli, P.; Holder, D. J.; Ferrer, M.; and Espeseth, A. S. Pharmacogenomics, 7(3):299--309, Apr, 2006.
RNA interference (RNAi) high-throughput screening (HTS) experiments carried out using large (>5000 short interfering [si]RNA) libraries generate a huge amount of data. In order to use these data to identify the most effective siRNAs tested, it is critical to adopt and develop appropriate statistical methods. To address the questions in hit selection of RNAi HTS, we proposed a quartile-based method which is robust to outliers, true hits and nonsymmetrical data. We compared it with the more traditional tests, mean +/- k standard deviation (SD) and median +/- k median of absolute deviation (MAD). The results suggested that the quartile-based method selected more hits than mean +/- k SD under the same preset error rate. The number of hits selected by median +/- k MAD was close to that by the quartile-based method. Further analysis suggested that the quartile-based method had the greatest power in detecting true hits, especially weak or moderate true hits. Our investigation also suggested that platewise analysis (determining effective siRNAs on a plate-by-plate basis) can adjust for systematic errors in different plates, while an experimentwise analysis, in which effective siRNAs are identified in an analysis of the entire experiment, cannot. However, experimentwise analysis may detect a cluster of true positive hits placed together in one or several plates, while platewise analysis may not. To display hit selection results, we designed a specific figure called a plate-well series plot. We thus suggest the following strategy for hit selection in RNAi HTS experiments. First, choose the quartile-based method, or median +/- k MAD, for identifying effective siRNAs. Second, perform the chosen method experimentwise on transformed/normalized data, such as percentage inhibition, to check the possibility of hit clusters. If a cluster of selected hits is observed, repeat the analysis based on untransformed data to determine whether the cluster is due to an artifact in the data. If no clusters of hits are observed, select hits by performing platewise analysis on transformed data. Third, adopt the plate-well series plot to visualize both the data and the hit selection results, as well as to check for artifacts.
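The three cutoff rules compared in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: the function names are hypothetical, the MAD is scaled by 1.4826 (the usual consistency factor under normality), and the quartile-based rule is written as Tukey-style fences Q1 - c*IQR and Q3 + c*IQR, with c chosen so that the fences match mean +/- k SD for normally distributed data. The paper's exact constants may differ.

```python
import statistics

Z75 = 0.6744897501960817  # 75th percentile of the standard normal distribution
MAD_SCALE = 1.4826        # assumed scaling so MAD estimates SD under normality


def mean_sd_limits(values, k=3.0):
    """Cutoffs at mean +/- k * sample standard deviation."""
    m = statistics.mean(values)
    s = statistics.stdev(values)
    return m - k * s, m + k * s


def median_mad_limits(values, k=3.0):
    """Cutoffs at median +/- k * scaled median absolute deviation."""
    med = statistics.median(values)
    mad = MAD_SCALE * statistics.median([abs(v - med) for v in values])
    return med - k * mad, med + k * mad


def quartile_limits(values, k=3.0):
    """Assumed quartile-based cutoffs: Q1 - c*IQR and Q3 + c*IQR.

    c is chosen so the fences coincide with mean +/- k SD for normal data.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    c = (k - Z75) / (2 * Z75)
    return q1 - c * iqr, q3 + c * iqr


def select_hits(values, limits):
    """Return indices of wells whose value falls outside the cutoffs."""
    lo, hi = limits
    return [i for i, v in enumerate(values) if v < lo or v > hi]


# Toy plate (made-up numbers): 18 inactive wells, one moderate hit, one strong hit.
values = [-1, 0, 1] * 6 + [8, 30]

print("mean +/- 3 SD   :", select_hits(values, mean_sd_limits(values)))
print("median +/- 3 MAD:", select_hits(values, median_mad_limits(values)))
print("quartile-based  :", select_hits(values, quartile_limits(values)))
```

On this toy plate the strong hit inflates the standard deviation, so mean +/- 3 SD flags only the strong hit, while the two robust rules also flag the moderate one. Calling the same functions on each plate separately corresponds to a platewise analysis; calling them once on the pooled wells corresponds to an experimentwise analysis.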
@article{Zhang:2006ph,
	Abstract = {RNA interference (RNAi) high-throughput screening (HTS) experiments carried out using large (>5000 short interfering [si]RNA) libraries generate a huge amount of data. In order to use these data to identify the most effective siRNAs tested, it is critical to adopt and develop appropriate statistical methods. To address the questions in hit selection of RNAi HTS, we proposed a quartile-based method which is robust to outliers, true hits and nonsymmetrical data. We compared it with the more traditional tests, mean +/- k standard deviation (SD) and median +/- k median of absolute deviation (MAD). The results suggested that the quartile-based method selected more hits than mean +/- k SD under the same preset error rate. The number of hits selected by median +/- k MAD was close to that by the quartile-based method. Further analysis suggested that the quartile-based method had the greatest power in detecting true hits, especially weak or moderate true hits. Our investigation also suggested that platewise analysis (determining effective siRNAs on a plate-by-plate basis) can adjust for systematic errors in different plates, while an experimentwise analysis, in which effective siRNAs are identified in an analysis of the entire experiment, cannot. However, experimentwise analysis may detect a cluster of true positive hits placed together in one or several plates, while platewise analysis may not. To display hit selection results, we designed a specific figure called a plate-well series plot. We thus suggest the following strategy for hit selection in RNAi HTS experiments. First, choose the quartile-based method, or median +/- k MAD, for identifying effective siRNAs. Second, perform the chosen method experimentwise on transformed/normalized data, such as percentage inhibition, to check the possibility of hit clusters. If a cluster of selected hits is observed, repeat the analysis based on untransformed data to determine whether the cluster is due to an artifact in the data. If no clusters of hits are observed, select hits by performing platewise analysis on transformed data. Third, adopt the plate-well series plot to visualize both the data and the hit selection results, as well as to check for artifacts.},
	Author = {Zhang, Xiaohua Douglas and Yang, Xiting Cindy and Chung, Namjin and Gates, Adam and Stec, Erica and Kunapuli, Priya and Holder, Dan J and Ferrer, Marc and Espeseth, Amy S},
	Crdt = {2006/04/14 09:00},
	Date = {2006 Apr},
	Date-Added = {2009-04-06 15:51:54 -0400},
	Date-Modified = {2009-04-06 15:53:53 -0400},
	Doi = {10.2217/14622416.7.3.299},
	Issn = {1462-2416},
	Journal = {Pharmacogenomics},
	Jt = {Pharmacogenomics},
	Keywords = {clustering; cluster; rnai; ncgc},
	Mhda = {2006/06/13 09:00},
	Month = {Apr},
	Number = {3},
	Pages = {299--309},
	Pl = {England},
	Pst = {ppublish},
	Sb = {IM},
	Title = {Robust Statistical Methods for Hit Selection in {RNA} Interference High-Throughput Screening Experiments},
	Volume = {7},
	Year = {2006},
	Bdsk-Url-1 = {http://dx.doi.org/10.2217/14622416.7.3.299}}