Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended GBLUP and properties of the categorical epistasis model (CE). Martini, J. W. R., Gao, N., Cardoso, D. F., Wimmer, V., Erbe, M., Cantet, R. J. C., & Simianer, H. BMC Bioinformatics, 18(1):3, 2017.
@article{Martini2017Genomic,
abstract = {BACKGROUND
Epistasis marker effect models incorporating products of marker values as predictor variables in a linear regression approach (extended GBLUP, EGBLUP) have been assessed as potentially beneficial for genomic prediction, but their performance depends on marker coding. Although this fact has been recognized in literature, the nature of the problem has not been thoroughly investigated so far.
RESULTS
We illustrate how the choice of marker coding implicitly specifies the model of how effects of certain allele combinations at different loci contribute to the phenotype, and investigate coding-dependent properties of EGBLUP. Moreover, we discuss an alternative categorical epistasis model (CE) eliminating undesired properties of EGBLUP and show that the CE model can improve predictive ability. Finally, we demonstrate that the coding-dependent performance of EGBLUP offers the possibility to incorporate prior experimental information into the prediction method by adapting the coding to already available phenotypic records on other traits.
CONCLUSION
Based on our results, for EGBLUP, a symmetric coding $\{-1,1\}$ or $\{-1,0,1\}$ should be preferred, whereas a standardization using allele frequencies should be avoided. Moreover, CE can be a valuable alternative since it does not possess the undesired theoretical properties of EGBLUP. However, which model performs best will depend on characteristics of the data and available prior information. Data from previous experiments can for instance be incorporated into the marker coding of EGBLUP.},
author = {Martini, Johannes W. R. and Gao, Ning and Cardoso, Diercles F. and Wimmer, Valentin and Erbe, Malena and Cantet, Rodolfo J. C. and Simianer, Henner},
year = {2017},
title = {Genomic prediction with epistasis models: on the marker-coding-dependent performance of the extended {GBLUP} and properties of the categorical epistasis model {(CE)}},
keywords = {gen;phd},
pages = {3},
volume = {18},
number = {1},
journal = {BMC Bioinformatics},
doi = {10.1186/s12859-016-1439-1},
url = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5209948},
pmid = {28049412},
howpublished = {refereed}
}