Net Reclassification Index and Integrated Discrimination Index Are Not Appropriate for Testing Whether a Biomarker Improves Predictive Performance

Net Reclassification Index and Integrated Discrimination Index Are Not Appropriate for Testing Whether a Biomarker Improves Predictive Performance. Burch, P. M., Glaab, W. E., Holder, D. J., Phillips, J. A., Sauer, J., & Walker, E. G. Toxicological Sciences: An Official Journal of the Society of Toxicology, 156(1):11–13, 2017.

One of the goals of the Critical Path Institute's Predictive Safety Testing Consortium (PSTC) is to promote best practices for evaluating novel markers of drug-induced injury. This includes the use of sound statistical methods. For rat studies, these practices have centered on comparing the area under the receiver-operator characteristic curve for each novel injury biomarker to those for the standard markers. In addition, the PSTC has previously used the net reclassification index (NRI) and integrated discrimination index (IDI) to assess the increased certainty provided by each novel injury biomarker when added to the information already provided by the standard markers. Due to their relatively simple interpretations, NRI and IDI have generally been popular measures of predictive performance. However, recent literature suggests that significance tests for NRI and IDI can have inflated false positive rates, and thus tests based on these metrics should not be relied upon. Instead, when parametric models are employed to assess the added predictive value of a new marker, the PSTC follows Pepe et al. (Pepe, M. S., Kerr, K. F., Longton, G., and Wang, Z. (2013). Testing for improvement in prediction model performance. Stat. Med. 32, 1467-1482) in recommending that likelihood-based methods be used for significance testing.
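The abstract does not spell out a procedure, so the following is only a minimal sketch of one likelihood-based test: a likelihood ratio test comparing nested logistic regression models, with and without the novel marker. The choice of logistic regression, the simulated data, and all names (`fit_logistic`, `likelihood_ratio_test`, `standard`, `novel`) are illustrative assumptions, not the PSTC's or the authors' implementation.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def sigmoid(eta):
    """Numerically stable logistic function."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def fit_logistic(X, y, iters=50, tol=1e-10):
    """Fit logistic regression by Newton-Raphson.
    X rows must include a leading 1.0 for the intercept.
    Returns (coefficients, maximized log-likelihood)."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            mu = sigmoid(sum(b * x for b, x in zip(beta, row)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    hess[j][k] += w * row[j] * row[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    loglik = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        # log-likelihood term: y*eta - log(1 + exp(eta)), computed stably
        loglik += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return beta, loglik

def likelihood_ratio_test(X_base, X_full, y):
    """LR test for ONE added parameter (chi-square reference, 1 df)."""
    _, ll_base = fit_logistic(X_base, y)
    _, ll_full = fit_logistic(X_full, y)
    stat = max(0.0, 2.0 * (ll_full - ll_base))
    # For 1 df, P(chi2 > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical simulated data: the novel marker truly adds information.
random.seed(0)
n = 300
standard = [random.gauss(0, 1) for _ in range(n)]
novel = [random.gauss(0, 1) for _ in range(n)]
y = [1 if random.random() < sigmoid(-0.5 + 1.0 * s + 0.8 * m) else 0
     for s, m in zip(standard, novel)]

X_base = [[1.0, s] for s in standard]
X_full = [[1.0, s, m] for s, m in zip(standard, novel)]
stat, p_value = likelihood_ratio_test(X_base, X_full, y)
print(f"LR statistic = {stat:.3f}, p-value = {p_value:.3g}")
```

The sketch handles only a single added coefficient, which is why the chi-square tail probability can be written with `math.erfc`; adding several markers at once would need a chi-square distribution with matching degrees of freedom. In practice an established statistics package would replace the hand-written Newton-Raphson fit.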

@article{burch_net_2017,
	title = {Net {Reclassification} {Index} and {Integrated} {Discrimination} {Index} {Are} {Not} {Appropriate} for {Testing} {Whether} a {Biomarker} {Improves} {Predictive} {Performance}},
	volume = {156},
	issn = {1096-0929},
	doi = {10.1093/toxsci/kfw225},
	abstract = {One of the goals of the Critical Path Institute's Predictive Safety Testing Consortium (PSTC) is to promote best practices for evaluating novel markers of drug induced injury. This includes the use of sound statistical methods. For rat studies, these practices have centered around comparing the area under the receiver-operator characteristic curve for each novel injury biomarker to those for the standard markers. In addition, the PSTC has previously used the net reclassification index (NRI) and integrated discrimination index (IDI) to assess the increased certainty provided by each novel injury biomarker when added to the information already provided by the standard markers. Due to their relatively simple interpretations, NRI and IDI have generally been popular measures of predictive performance. However recent literature suggests that significance tests for NRI and IDI can have inflated false positive rates and thus, tests based on these metrics should not be relied upon. Instead, when parametric models are employed to assess the added predictive value of a new marker, following (Pepe, M. S., Kerr, K. F., Longton, G., and Wang, Z. (2013). Testing for improvement in prediction model performance. Stat. Med. 32, 1467-1482), the PSTC recommends that likelihood based methods be used for significance testing.},
	language = {eng},
	number = {1},
	journal = {Toxicological Sciences: An Official Journal of the Society of Toxicology},
	author = {Burch, Peter M. and Glaab, Warren E. and Holder, Daniel J. and Phillips, Jonathan A. and Sauer, John-Michael and Walker, Elizabeth G.},
	year = {2017},
	pmid = {27815493},
	pmcid = {PMC5837334},
	keywords = {Animals, Biomarkers, Drug Evaluation, Preclinical, Drug-Related Side Effects and Adverse Reactions, Drugs, Investigational, False Positive Reactions, Humans, IDI, Models, Statistical, Muscular Diseases, NRI, Organizations, Nonprofit, Predictive Value of Tests, ROC Curve, Renal Insufficiency, Toxicity Tests, United States, Xenobiotics, biomarkers {\textless} Safety Evaluation, statistics.},
	pages = {11--13},
}
