An original imputation technique of missing data for assessing exposure of newborns to perchlorate in drinking water. Caron, A., Clement, G., Heyman, C., Aernout, E., Chazard, E., & Le Tertre, A. Studies in Health Technology and Informatics, 210:860–864, 2015.
INTRODUCTION: Incompleteness of epidemiological databases is a major drawback when it comes to analyzing data. We conceived an epidemiological study to assess the association between newborn thyroid function and exposure to perchlorates found in the tap water of the mother's home. Perchlorate exposure was known for only 9% of newborns. The aim of our study was to design, test and evaluate an original method for imputing the perchlorate exposure of newborns based on their maternity ward of birth.
METHODS: A first database held an exhaustive collection of newborns' thyroid function measurements taken during systematic neonatal screening. In this database, the municipality of residence of the newborn's mother was available only for 2012. For 2004 to 2011, the closest available information was the municipality of the maternity ward of birth. Exposure was assessed using a second database, which contained the perchlorate levels for each municipality. We computed the catchment area of every maternity ward from the French nationwide exhaustive database of inpatient stays. Municipality, and consequently perchlorate exposure, was imputed by a weighted draw in the catchment area. Missing values for the remaining covariates were imputed by chained equations. A linear mixture model was computed on each imputed dataset. We compared odds ratios (ORs) and 95% confidence intervals (95% CI) estimated on real versus imputed 2012 data. The same model was then fitted to the whole imputed database.
RESULTS: The ORs estimated on 36,695 observations by our multiple imputation method are comparable to those from the real 2012 data. On the 394,979 observations of the whole database, the ORs remain stable but the 95% CIs tighten considerably.
DISCUSSION: The model estimates computed on imputed data are similar to those calculated on real data. The main advantage of multiple imputation is that it provides unbiased estimates of the ORs while maintaining their variances. Thus, our method will be used to increase the statistical power of future studies by including all 394,979 newborns.
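The weighted draw described in the methods can be illustrated with a minimal sketch. This is not the authors' code: the ward names, municipality names, stay counts and perchlorate levels below are made up, and the abstract does not state how the weights were defined, so the sketch assumes each municipality is weighted by its number of inpatient stays at the ward.

```python
import random

# Hypothetical catchment areas: for each maternity ward, the number of
# inpatient stays by mothers' municipality of residence (made-up values).
CATCHMENT = {
    "ward_A": {"town_1": 600, "town_2": 300, "town_3": 100},
    "ward_B": {"town_2": 50, "town_4": 950},
}

# Hypothetical perchlorate levels per municipality, in micrograms per litre.
PERCHLORATE = {"town_1": 2.0, "town_2": 8.5, "town_3": 16.0, "town_4": 0.5}


def impute_municipality(ward, rng):
    """Draw one municipality with probability proportional to its stay count.

    Weighting by stay count is an assumption of this sketch.
    """
    counts = CATCHMENT[ward]
    towns = list(counts)
    weights = [counts[town] for town in towns]
    return rng.choices(towns, weights=weights, k=1)[0]


def impute_datasets(wards, n_imputations, seed=0):
    """Build n_imputations completed datasets of (ward, municipality, exposure)."""
    rng = random.Random(seed)
    datasets = []
    for _ in range(n_imputations):
        rows = []
        for ward in wards:
            town = impute_municipality(ward, rng)
            rows.append((ward, town, PERCHLORATE[town]))
        datasets.append(rows)
    return datasets
```

Calling `impute_datasets(["ward_A"] * 1000 + ["ward_B"] * 1000, n_imputations=5)` would return five completed datasets. Each one could then be analyzed separately, as the abstract describes for the model computed on each imputed dataset.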
@article{caron_original_2015,
	title = {An original imputation technique of missing data for assessing exposure of newborns to perchlorate in drinking water},
	volume = {210},
	issn = {0926-9630},
	abstract = {INTRODUCTION: Incompleteness of epidemiological databases is a major drawback when it comes to analyzing data. We conceived an epidemiological study to assess the association between newborn thyroid function and exposure to perchlorates found in the tap water of the mother's home. Perchlorate exposure was known for only 9\% of newborns. The aim of our study was to design, test and evaluate an original method for imputing the perchlorate exposure of newborns based on their maternity ward of birth.
METHODS: A first database held an exhaustive collection of newborns' thyroid function measurements taken during systematic neonatal screening. In this database, the municipality of residence of the newborn's mother was available only for 2012. For 2004 to 2011, the closest available information was the municipality of the maternity ward of birth. Exposure was assessed using a second database, which contained the perchlorate levels for each municipality. We computed the catchment area of every maternity ward from the French nationwide exhaustive database of inpatient stays. Municipality, and consequently perchlorate exposure, was imputed by a weighted draw in the catchment area. Missing values for the remaining covariates were imputed by chained equations. A linear mixture model was computed on each imputed dataset. We compared odds ratios (ORs) and 95\% confidence intervals (95\% CI) estimated on real versus imputed 2012 data. The same model was then fitted to the whole imputed database.
RESULTS: The ORs estimated on 36,695 observations by our multiple imputation method are comparable to those from the real 2012 data. On the 394,979 observations of the whole database, the ORs remain stable but the 95\% CIs tighten considerably.
DISCUSSION: The model estimates computed on imputed data are similar to those calculated on real data. The main advantage of multiple imputation is that it provides unbiased estimates of the ORs while maintaining their variances. Thus, our method will be used to increase the statistical power of future studies by including all 394,979 newborns.},
	language = {eng},
	journal = {Studies in Health Technology and Informatics},
	author = {Caron, Alexandre and Clement, Guillaume and Heyman, Christophe and Aernout, Eva and Chazard, Emmanuel and Le Tertre, Alain},
	year = {2015},
	pmid = {25991277},
	keywords = {Adolescent; Adult; Computer Simulation; Drinking Water; Environmental Exposure; Female; France; Humans; Infant, Newborn; Infant, Newborn, Diseases; Middle Aged; Models, Statistical; Perchlorates; Pregnancy; Prenatal Exposure Delayed Effects; Prevalence; Reproducibility of Results; Risk Assessment; Sample Size; Sensitivity and Specificity; Thyroid Diseases; Treatment Outcome; Water Pollutants, Chemical; Water Pollution, Chemical; Young Adult},
	pages = {860--864},
}
