Addressing data privacy in matched studies via virtual pooling. Saha-Chaudhuri, P. & Weinberg, C. BMC Medical Research Methodology, 17(1):136, December, 2017.
Background: Data confidentiality and shared use of research data are two desirable but sometimes conflicting goals in research with multi-center studies and distributed data. While ideal for straightforward analysis, confidentiality restrictions forbid creation of a single dataset that includes covariate information of all participants. Current approaches such as aggregate data sharing, distributed regression, meta-analysis and score-based methods can have important limitations. Methods: We propose a novel application of an existing epidemiologic tool, specimen pooling, to enable confidentiality-preserving analysis of data arising from a matched case-control, multi-center design. Instead of pooling specimens prior to assay, we apply the methodology to virtually pool (aggregate) covariates within nodes. Such virtual pooling retains most of the information used in an analysis with individual data and since individual participant data is not shared externally, within-node virtual pooling preserves data confidentiality. We show that aggregated covariate levels can be used in a conditional logistic regression model to estimate individual-level odds ratios of interest. Results: The parameter estimates from the standard conditional logistic regression are compared to the estimates based on a conditional logistic regression model with aggregated data. The parameter estimates are shown to be similar to those without pooling and to have comparable standard errors and confidence interval coverage. Conclusions: Virtual data pooling can be used to maintain confidentiality of data from multi-center study and can be particularly useful in research with large-scale distributed data.
@article{saha-chaudhuri_addressing_2017-1,
	title = {Addressing data privacy in matched studies via virtual pooling},
	volume = {17},
	issn = {1471-2288},
	url = {https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-017-0419-0},
	doi = {10.1186/s12874-017-0419-0},
	abstract = {Background: Data confidentiality and shared use of research data are two desirable but sometimes conflicting goals in research with multi-center studies and distributed data. While ideal for straightforward analysis, confidentiality restrictions forbid creation of a single dataset that includes covariate information of all participants. Current approaches such as aggregate data sharing, distributed regression, meta-analysis and score-based methods can have important limitations. Methods: We propose a novel application of an existing epidemiologic tool, specimen pooling, to enable confidentiality-preserving analysis of data arising from a matched case-control, multi-center design. Instead of pooling specimens prior to assay, we apply the methodology to virtually pool (aggregate) covariates within nodes. Such virtual pooling retains most of the information used in an analysis with individual data and since individual participant data is not shared externally, within-node virtual pooling preserves data confidentiality. We show that aggregated covariate levels can be used in a conditional logistic regression model to estimate individual-level odds ratios of interest. Results: The parameter estimates from the standard conditional logistic regression are compared to the estimates based on a conditional logistic regression model with aggregated data. The parameter estimates are shown to be similar to those without pooling and to have comparable standard errors and confidence interval coverage. Conclusions: Virtual data pooling can be used to maintain confidentiality of data from multi-center study and can be particularly useful in research with large-scale distributed data.},
	language = {en},
	number = {1},
	urldate = {2020-06-10},
	journal = {BMC Medical Research Methodology},
	author = {Saha-Chaudhuri, P. and Weinberg, C.R.},
	month = dec,
	year = {2017},
	pages = {136},
}
