Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions. Thakur, H., Jain, A., Vaddamanu, P., Liang, P. P., & Morency, L. In Rogers, A., Boyd-Graber, J., & Okazaki, N., editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 340–351, Toronto, Canada, July 2023. Association for Computational Linguistics.
Societal biases present in pre-trained large language models are a critical issue as these models have been shown to propagate biases in countless downstream applications, rendering them unfair towards specific groups of people. Since large-scale retraining of these models from scratch is both time- and compute-expensive, a variety of approaches have been previously proposed that debias a pre-trained model. While the majority of current state-of-the-art debiasing methods focus on changes to the training regime, in this paper, we propose data intervention strategies as a powerful yet simple technique to reduce gender bias in pre-trained models. Specifically, we empirically show that by fine-tuning a pre-trained model on only 10 debiased (intervened) training examples, the tendency to favor any gender is significantly reduced. Since our proposed method only needs a few training examples, we argue that our few-shot debiasing approach is highly feasible and practical. Through extensive experimentation, we show that our debiasing technique performs better than competitive state-of-the-art baselines with minimal loss in language modeling ability.
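
The core procedure the abstract describes — fine-tuning a pre-trained model on a handful of gender-balanced (intervened) sentences — can be sketched in a few lines. The following is a minimal, illustrative sketch, not the authors' released code: it assumes a BERT-style masked language model via the Hugging Face transformers library, and the ten example sentences and all hyperparameters are hypothetical placeholders, not values taken from the paper.

import torch
from torch.utils.data import DataLoader
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling)

# Hypothetical "intervened" examples: each profession sentence appears in
# both gendered variants, so neither gender is favored during fine-tuning.
intervened_sentences = [
    "He is a nurse.", "She is a nurse.",
    "He is an engineer.", "She is an engineer.",
    "He is a receptionist.", "She is a receptionist.",
    "He is a surgeon.", "She is a surgeon.",
    "He is a librarian.", "She is a librarian.",
]  # 10 examples, matching the few-shot budget reported in the abstract

# Model choice is illustrative; the paper does not prescribe this checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

encodings = tokenizer(intervened_sentences, truncation=True, padding=True)
examples = [
    {"input_ids": ids, "attention_mask": mask}
    for ids, mask in zip(encodings["input_ids"], encodings["attention_mask"])
]

# Standard masked-LM fine-tuning: the collator randomly masks tokens
# and builds the corresponding labels.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm_probability=0.15)
loader = DataLoader(examples, batch_size=2, shuffle=True, collate_fn=collator)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # illustrative
model.train()
for epoch in range(3):  # a few epochs over 10 examples is cheap
    for batch in loader:
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

After fine-tuning, one simple way to probe the effect is to compare the masked-token probabilities the model assigns to "he" versus "she" in templated sentences (e.g., "[MASK] is a nurse."); the abstract reports that this few-shot intervention significantly reduces the tendency to favor either gender while costing little language-modeling ability.
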
@inproceedings{thakurLanguageModelsGet2023,
	address = {Toronto, Canada},
	title = {Language {Models} {Get} a {Gender} {Makeover}: {Mitigating} {Gender} {Bias} with {Few}-{Shot} {Data} {Interventions}},
	shorttitle = {Language {Models} {Get} a {Gender} {Makeover}},
	url = {https://aclanthology.org/2023.acl-short.30},
	doi = {10.18653/v1/2023.acl-short.30},
	abstract = {Societal biases present in pre-trained large language models are a critical issue as these models have been shown to propagate biases in countless downstream applications, rendering them unfair towards specific groups of people. Since large-scale retraining of these models from scratch is both time- and compute-expensive, a variety of approaches have been previously proposed that debias a pre-trained model. While the majority of current state-of-the-art debiasing methods focus on changes to the training regime, in this paper, we propose data intervention strategies as a powerful yet simple technique to reduce gender bias in pre-trained models. Specifically, we empirically show that by fine-tuning a pre-trained model on only 10 debiased (intervened) training examples, the tendency to favor any gender is significantly reduced. Since our proposed method only needs a few training examples, we argue that our few-shot debiasing approach is highly feasible and practical. Through extensive experimentation, we show that our debiasing technique performs better than competitive state-of-the-art baselines with minimal loss in language modeling ability.},
	urldate = {2024-07-29},
	booktitle = {Proceedings of the 61st {Annual} {Meeting} of the {Association} for {Computational} {Linguistics} ({Volume} 2: {Short} {Papers})},
	publisher = {Association for Computational Linguistics},
	author = {Thakur, Himanshu and Jain, Atishay and Vaddamanu, Praneetha and Liang, Paul Pu and Morency, Louis-Philippe},
	editor = {Rogers, Anna and Boyd-Graber, Jordan and Okazaki, Naoaki},
	month = jul,
	year = {2023},
	pages = {340--351},
}
