Towards Fairness in Synthetic Healthcare Data: A Framework for the Evaluation of Synthetization Algorithms. Warnecke, Y., Kuhn, M., Diederichs, F., Brix, T. J., Clever, L., Bergmann, R., Heider, D., & Storck, M. Studies in Health Technology and Informatics, 331:25–34, IOS Press, Netherlands, 2025.

Synthetic data generation is a rapidly evolving field, with significant potential for improving data privacy. However, evaluating the performance of synthetic data generation methods, especially the tradeoff between fairness and utility of the generated data, remains a challenge. In this work, we present our comprehensive framework, which evaluates fair synthetic data generation methods, benchmarking them against state-of-the-art synthesizers. The proposed framework consists of selection, evaluation, and application components that assess fairness, utility, and resemblance in real-world scenarios. The framework was applied to state-of-the-art data synthesizers, including TabFairGAN, DECAF, TVAE, and CTGAN, using a publicly available medical dataset. The results reveal the strengths and limitations of each synthesizer, including their bias mitigation strategies and trade-offs between fairness and utility, thereby showing the framework's effectiveness.
@article{Warnecke2025FairnessSynthetic,
title = {Towards Fairness in Synthetic Healthcare Data: A Framework for the Evaluation of Synthetization Algorithms},
author = {Warnecke, Yannik and Kuhn, Martin and Diederichs, Felix and Brix, Tobias J and Clever, Lena and Bergmann, Ralph and Heider, Dominik and Storck, Michael},
journal = {Studies in Health Technology and Informatics},
volume = {331},
pages = {25--34},
year = {2025},
doi = {10.3233/SHTI251376},
issn = {1879-8365},
publisher = {IOS Press},
address = {Netherlands},
abstract = {Synthetic data generation is a rapidly evolving field, with significant potential for improving data privacy. However, evaluating the performance of synthetic data generation methods, especially the tradeoff between fairness and utility of the generated data, remains a challenge. In this work, we present our comprehensive framework, which evaluates fair synthetic data generation methods, benchmarking them against state-of-the-art synthesizers. The proposed framework consists of selection, evaluation, and application components that assess fairness, utility, and resemblance in real-world scenarios. The framework was applied to state-of-the-art data synthesizers, including TabFairGAN, DECAF, TVAE, and CTGAN, using a publicly available medical dataset. The results reveal the strengths and limitations of each synthesizer, including their bias mitigation strategies and trade-offs between fairness and utility, thereby showing the framework's effectiveness.},
pmid = {40899524},
keywords = {Algorithms, Artificial Intelligence, Data Quality, Delivery of Health Care, Health Equity, Medical Informatics}
}