Realistic data synthesis using enhanced generative adversarial networks. Baowaly, M. K., Liu, C. L., & Chen, K. T. In Proceedings - IEEE 2nd International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2019, pages 289–292, June 2019. Institute of Electrical and Electronics Engineers Inc.
Real data with privacy and confidentiality concerns are often unavailable or too expensive to obtain in terms of both time and money. In this situation, synthetic data is a good alternative. The objective of this research is to generate realistic synthetic data that people can use freely. We propose a synthetic data generation model based on boundary-seeking generative adversarial networks (BGANs), designated medical BGAN or medBGAN, and compare its performance with an existing method, medical GAN (medGAN). We perform the investigation on several datasets in two different domains: electronic health records (EHRs) in the medical domain and a crime dataset from the City of Los Angeles Police Department. First, we train the models and generate synthetic data using the trained models. We then analyze and compare the models' performance by applying statistical methods (dimension-wise average and Kolmogorov-Smirnov test) and two machine learning tasks (association rule mining and prediction). The comprehensive analysis of this study shows that the proposed model is more efficient than medGAN in generating realistic synthetic data.
@inproceedings{baowaly_realistic_2019,
	title = {Realistic data synthesis using enhanced generative adversarial networks},
	isbn = {978-1-72811-488-0},
	doi = {10.1109/AIKE.2019.00057},
	abstract = {Real data with privacy and confidentiality concerns are often unavailable or too expensive to obtain in terms of both time and money. In this situation, synthetic data is a good alternative. The objective of this research is to generate realistic synthetic data that people can use freely. We propose a synthetic data generation model based on boundary-seeking generative adversarial networks (BGANs), designated medical BGAN or medBGAN, and compare its performance with an existing method, medical GAN (medGAN). We perform the investigation on several datasets in two different domains: electronic health records (EHRs) in the medical domain and a crime dataset from the City of Los Angeles Police Department. First, we train the models and generate synthetic data using the trained models. We then analyze and compare the models' performance by applying statistical methods (dimension-wise average and Kolmogorov-Smirnov test) and two machine learning tasks (association rule mining and prediction). The comprehensive analysis of this study shows that the proposed model is more efficient than medGAN in generating realistic synthetic data.},
	booktitle = {Proceedings - {IEEE} 2nd {International} {Conference} on {Artificial} {Intelligence} and {Knowledge} {Engineering}, {AIKE} 2019},
	publisher = {Institute of Electrical and Electronics Engineers Inc.},
	author = {Baowaly, Mrinal Kanti and Liu, Chao Lin and Chen, Kuan Ta},
	month = jun,
	year = {2019},
	keywords = {Boundary seeking GANs, Data synthesis, Electronic health records, Generative adversarial networks, Synthetic data generation},
	pages = {289--292},
}
