Critical Provocations for Synthetic Data. Susser, D. & Seeman, J. Surveillance & Society, 22(4), December 2024.
Training artificial intelligence (AI) systems requires vast quantities of data, and AI developers face a variety of barriers to accessing the information they need. Synthetic data has captured researchers’ and industry’s imagination as a potential solution to this problem. While some of the enthusiasm for synthetic data may be warranted, in this short paper we offer critical counterweight to simplistic narratives that position synthetic data as a cost-free solution to every data-access challenge—provocations highlighting ethical, political, and governance issues the use of synthetic data can create. We question the idea that synthetic data, by its nature, is exempt from privacy and related ethical concerns. We caution that framing synthetic data in binary opposition to “real” measurement data could subtly shift the normative standards to which data collectors and processors are held. And we argue that by promising to divorce data from its constituents—the people it represents and impacts—synthetic data could create new obstacles to democratic data governance.
@article{susser_critical_2024,
	title = {Critical {Provocations} for {Synthetic} {Data}},
	volume = {22},
	copyright = {https://creativecommons.org/licenses/by-nc-nd/4.0},
	issn = {1477-7487},
	url = {https://ojs.library.queensu.ca/index.php/surveillance-and-society/article/view/18335},
	doi = {10.24908/ss.v22i4.18335},
	abstract = {Training artificial intelligence (AI) systems requires vast quantities of data, and AI developers face a variety of barriers to accessing the information they need. Synthetic data has captured researchers’ and industry’s imagination as a potential solution to this problem. While some of the enthusiasm for synthetic data may be warranted, in this short paper we offer critical counterweight to simplistic narratives that position synthetic data as a cost-free solution to every data-access challenge—provocations highlighting ethical, political, and governance issues the use of synthetic data can create. We question the idea that synthetic data, by its nature, is exempt from privacy and related ethical concerns. We caution that framing synthetic data in binary opposition to “real” measurement data could subtly shift the normative standards to which data collectors and processors are held. And we argue that by promising to divorce data from its constituents—the people it represents and impacts—synthetic data could create new obstacles to democratic data governance.},
	number = {4},
	urldate = {2024-12-12},
	journal = {Surveillance \& Society},
	author = {Susser, Daniel and Seeman, Jeremy},
	month = dec,
	year = {2024},
}
