A multiple linear regression model with multiplicative log-normal error term for atmospheric concentration data. Liao, K., Park, E. S., Zhang, J., Cheng, L., Ji, D., Ying, Q., & Yu, J. Z. SCIENCE OF THE TOTAL ENVIRONMENT, ELSEVIER, RADARWEG 29, 1043 NX AMSTERDAM, NETHERLANDS, MAY 1, 2021.
The homoscedasticity assumption (the variance of the error term is the same across all the observations) is a key assumption in the ordinary least squares (OLS) solution of a linear regression model. The validity of this assumption is examined for a multiple linear regression model used to determine the source contributions to the observed black carbon concentrations at 12 background monitoring sites across China using a hybrid modeling approach. Residual analysis from the traditional OLS method, which assumes that the error term is additive and normally distributed with a mean of zero, shows pronounced heteroscedasticity based on the Breusch-Pagan test for 11 datasets. Noticing that the atmospheric black carbon data are log-normally distributed, we make a new assumption that the error terms are multiplicative and log-normally distributed. When the coefficients of the multilinear regression model are determined using the maximum likelihood estimation (MLE), the distribution of the residuals in 8 out of the 12 datasets is in good accordance with the revised assumption. Furthermore, the MLE computation under this novel assumption could be proved mathematically identical to minimizing a log-scale objective function, which considerably reduces the complexity in the MLE calculation. The new method is further demonstrated to have clear advantages in numerical simulation experiments of a 5-variable multiple linear regression model using synthesized data with prescribed coefficients and lognormally distributed multiplicative errors. Under all 9 simulation scenarios, the new method yields the most accurate estimations of the regression coefficients and has significantly higher coverage probability (on average, 95% for all five coefficients) than OLS (79%) and weighted least squares (WLS, 72%) methods. (C) 2020 Elsevier B.V. All rights reserved.
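The equivalence stated in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a model of the form y_i = (x_i . beta) * exp(eps_i) with eps_i ~ N(0, sigma^2), for which maximizing the likelihood over beta amounts to minimizing sum_i (log y_i - log(x_i . beta))^2. The function name fit_log_scale, the Gauss-Newton solver, the three-variable synthetic data, and all numeric settings are assumptions chosen for illustration.

```python
import numpy as np


def fit_log_scale(X, y, n_iter=50, tol=1e-10):
    """Minimize sum((log y - log(X beta))**2) with Gauss-Newton steps.

    Hypothetical helper: assumes y > 0 and that X.dot(beta) stays
    positive, which holds here because X and the coefficients are positive.
    """
    # Ordinary least squares on the raw scale gives a starting point.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    for _ in range(n_iter):
        mu = X.dot(beta)                 # predicted concentrations
        r = np.log(y) - np.log(mu)       # log-scale residuals
        J = X / mu[:, None]              # Jacobian of log(X beta) w.r.t. beta
        step, *_ = np.linalg.lstsq(J, r, rcond=None)
        # Halve the step until every prediction stays positive.
        while np.any(X.dot(beta + step) <= 0):
            step = step / 2
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Synthetic data with prescribed coefficients and multiplicative
# log-normal errors (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n = 5000
beta_true = np.array([2.0, 0.5, 1.5])
X = rng.uniform(0.1, 3.0, size=(n, 3))
y = X.dot(beta_true) * np.exp(rng.normal(0.0, 0.2, size=n))

beta_hat = fit_log_scale(X, y)
print("estimated coefficients:", beta_hat)
```

Starting from the OLS estimate keeps the first iterate close to the solution. The step halving only guards against non-positive predictions, for which the logarithm is undefined.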
@article{ WOS:000617681100015,
Author = {Liao, Kezheng and Park, Eun Sug and Zhang, Jie and Cheng, Linjun and Ji,
Dongsheng and Ying, Qi and Yu, Jian Zhen},
Title = {{A multiple linear regression model with multiplicative log-normal error
term for atmospheric concentration data}},
Journal = {{SCIENCE OF THE TOTAL ENVIRONMENT}},
Year = {{2021}},
Volume = {{767}},
Month = {{MAY 1}},
Abstract = {{The homoscedasticity assumption (the variance of the error term is the
same across all the observations) is a key assumption in the ordinary
least squares (OLS) solution of a linear regression model. The
validity of this assumption is examined for a multiple linear regression
model used to determine the source contributions to the observed black
carbon concentrations at 12 background monitoring sites across China
using a hybrid modeling approach. Residual analysis from the traditional
OLS method, which assumes that the error term is additive and normally
distributed with a mean of zero, shows pronounced heteroscedasticity
based on the Breusch-Pagan test for 11 datasets. Noticing that the
atmospheric black carbon data are log-normally distributed, we make a
new assumption that the error terms are multiplicative and log-normally
distributed. When the coefficients of the multilinear regression model
are determined using the maximum likelihood estimation (MLE), the
distribution of the residuals in 8 out of the 12 datasets is in good
accordance with the revised assumption. Furthermore, the MLE computation
under this novel assumption could be proved mathematically identical to
minimizing a log-scale objective function, which considerably reduces
the complexity in the MLE calculation. The new method is further
demonstrated to have clear advantages in numerical simulation experiments
of a 5-variable multiple linear regression model using synthesized data
with prescribed coefficients and lognormally distributed multiplicative
errors. Under all 9 simulation scenarios, the new method yields the most
accurate estimations of the regression coefficients and has
significantly higher coverage probability (on average, 95\% for all five
coefficients) than OLS (79\%) and weighted least squares (WLS, 72\%)
methods. (C) 2020 Elsevier B.V. All rights reserved.}},
Publisher = {{ELSEVIER}},
Address = {{RADARWEG 29, 1043 NX AMSTERDAM, NETHERLANDS}},
Type = {{Article}},
Language = {{English}},
Affiliation = {{Yu, JZ (Corresponding Author), Hong Kong Univ Sci \& Technol, Dept Chem, Kowloon, Clear Water Bay, Hong Kong, Peoples R China.
Ying, Q (Corresponding Author), Texas A\&M Univ, Zachry Dept Civil \& Environm Engn, College Stn, TX 77843 USA.
Liao, Kezheng; Yu, Jian Zhen, Hong Kong Univ Sci \& Technol, Dept Chem, Kowloon, Clear Water Bay, Hong Kong, Peoples R China.
Park, Eun Sug, Texas A\&M Univ, Texas A\&M Transportat Inst, College Stn, TX 77843 USA.
Zhang, Jie; Ying, Qi, Texas A\&M Univ, Zachry Dept Civil \& Environm Engn, College Stn, TX 77843 USA.
Cheng, Linjun, China Natl Environm Monitoring Ctr, Beijing 100012, Peoples R China.
Ji, Dongsheng, Chinese Acad Sci, Inst Atmospher Phys, State Key Lab Atmospher Boundary Layer Phys \& Atm, Beijing 100191, Peoples R China.
Ji, Dongsheng, Chinese Acad Sci, Ctr Excellence Reg Atmospher Environm, Inst Urban Environm, Xiamen 361021, Peoples R China.}},
DOI = {{10.1016/j.scitotenv.2020.144282}},
Article-Number = {{144282}},
ISSN = {{0048-9697}},
EISSN = {{1879-1026}},
Keywords = {{Log-normal distribution; Multilinear regression; Maximum likelihood
estimation; Residual; Source attribution}},
Research-Areas = {{Environmental Sciences \& Ecology}},
Web-of-Science-Categories = {{Environmental Sciences}},
Author-Email = {{qying@civil.tamu.edu
chjianyu@ust.hk}},
ResearcherID-Numbers = {{Yu, Jian Zhen/A-9669-2008
Ji, Dongsheng/E-3807-2018}},
ORCID-Numbers = {{Yu, Jian Zhen/0000-0002-6165-6500
Ji, Dongsheng/0000-0002-7889-4417}},
Funding-Acknowledgement = {{Hong Kong Research Grants Council {[}R6011-18]; National Institutes of
Health (NIH), United States Department of Health \& Human Services {[}R01
ES029509]; Hong Kong PhD Fellowship}},
Funding-Text = {{This work was partially supported by the Hong Kong Research Grant
Council (R6011-18) to J.Z. Yu and a Hong Kong PhD Fellowship to K.Z.
Liao. Q. Ying, E.S. Park and J. Zhang are partially supported by a grant
from the National Institutes of Health (R01 ES029509). The CMAQ model
simulations were performed using the computer clusters at the Texas A\&M
High Performance Research Computing (https://hprc.tamu.edu/).}},
Number-of-Cited-References = {{36}},
Times-Cited = {{3}},
Usage-Count-Last-180-days = {{6}},
Usage-Count-Since-2013 = {{21}},
Journal-ISO = {{Sci. Total Environ.}},
Doc-Delivery-Number = {{QG6GE}},
Unique-ID = {{WOS:000617681100015}},
DA = {{2021-12-02}},
}