Fair and Private Data Preprocessing through Microaggregation. González-Zelaya, V., Salas, J., Megías, D., & Missier, P. ACM Trans. Knowl. Discov. Data, Association for Computing Machinery, New York, NY, USA, December 2023.
@article{10.1145/3617377,
author = {Gonz\'{a}lez-Zelaya, Vladimiro and Salas, Juli\'{a}n and Meg\'{\i}as, David and Missier, Paolo},
title = {Fair and Private Data Preprocessing through Microaggregation},
year = {2023},
issue_date = {April 2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {18},
number = {3},
issn = {1556-4681},
url = {https://doi.org/10.1145/3617377},
doi = {10.1145/3617377},
abstract = {Privacy protection for personal data and fairness in automated decisions are fundamental requirements for responsible Machine Learning. Both may be enforced through data preprocessing and share a common target: data should remain useful for a task, while becoming uninformative of the sensitive information. The intrinsic connection between privacy and fairness implies that modifications performed to guarantee one of these goals may have an effect on the other, e.g., hiding a sensitive attribute from a classification algorithm might prevent a biased decision rule from using that attribute as a criterion. This work resides at the intersection of algorithmic fairness and privacy. We show how the two goals are compatible, and may be simultaneously achieved, with a small loss in predictive performance. Our results are competitive with both state-of-the-art fairness-correcting algorithms and hybrid privacy-fairness methods. Experiments were performed on three widely used benchmark datasets: Adult Income, COMPAS, and German Credit.},
journal = {ACM Trans. Knowl. Discov. Data},
month = dec,
articleno = {49},
numpages = {24},
keywords = {ethical AI, privacy-preserving data mining, algorithmic fairness, responsible machine learning, fair classification}
}