Measuring Fairness of Text Classifiers via Prediction Sensitivity. Krishna, S., Gupta, R., Verma, A., Dhamala, J., Pruksachatkun, Y., & Chang, K.-W. In Muresan, S., Nakov, P., & Villavicencio, A., editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5830–5842, Dublin, Ireland, May, 2022. Association for Computational Linguistics.
With the rapid growth in language processing applications, fairness has emerged as an important consideration in data-driven solutions. Although various fairness definitions have been explored in the recent literature, there is a lack of consensus on which metrics most accurately reflect the fairness of a system. In this work, we propose a new formulation – accumulated prediction sensitivity – which measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features. The metric attempts to quantify the extent to which a single prediction depends on a protected attribute, where the protected attribute encodes the membership status of an individual in a protected group. We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness. It also correlates well with humans' perception of fairness. We conduct experiments on two text classification datasets – Jigsaw Toxicity and Bias in Bios – and evaluate the correlations between metrics and manual annotations on whether the model produced a fair outcome. We observe that the proposed fairness metric based on prediction sensitivity is statistically significantly more correlated with human annotation than the existing counterfactual fairness metric.
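To make the core idea concrete, the minimal sketch below illustrates prediction sensitivity in its simplest form: perturb a protected-attribute feature, measure how much a differentiable classifier's output changes, and accumulate that quantity over a dataset. The toy logistic-regression model, the feature layout, and the finite-difference step size are illustrative assumptions; the paper's accumulated prediction sensitivity metric uses its own weighting scheme, which is not reproduced here.

# Minimal sketch (illustrative, not the paper's exact formulation): estimate how
# sensitive each prediction is to a protected-attribute feature via a
# finite-difference perturbation, then average over a dataset.
import numpy as np

rng = np.random.default_rng(0)

def predict_proba(x, w, b):
    # Toy differentiable classifier: logistic regression on dense features.
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def prediction_sensitivity(x, w, b, protected_idx, eps=1e-4):
    # |f(x + eps * e_protected) - f(x)| / eps approximates the partial
    # derivative of the prediction with respect to the protected feature.
    x_plus = x.copy()
    x_plus[protected_idx] += eps
    return abs(predict_proba(x_plus, w, b) - predict_proba(x, w, b)) / eps

# Toy data: 200 examples, 10 features; feature 0 is assumed to encode the
# protected attribute (an assumption made for this sketch only).
X = rng.normal(size=(200, 10))
w = rng.normal(size=10)
b = 0.0

scores = [prediction_sensitivity(x, w, b, protected_idx=0) for x in X]
print("mean prediction sensitivity to the protected feature:", float(np.mean(scores)))

In the paper's text-classification setting, the sensitivity would be taken with respect to the model's input representation of a text example rather than a hand-placed dense feature; this sketch only conveys the perturb-and-accumulate idea.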
@inproceedings{krishna-etal-2022-measuring,
    title = "Measuring Fairness of Text Classifiers via Prediction Sensitivity",
    author = "Krishna, Satyapriya  and
      Gupta, Rahul  and
      Verma, Apurv  and
      Dhamala, Jwala  and
      Pruksachatkun, Yada  and
      Chang, Kai-Wei",
    editor = "Muresan, Smaranda  and
      Nakov, Preslav  and
      Villavicencio, Aline",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.401",
    doi = "10.18653/v1/2022.acl-long.401",
    pages = "5830--5842",
    abstract = "With the rapid growth in language processing applications, fairness has emerged as an important consideration in data-driven solutions. Although various fairness definitions have been explored in the recent literature, there is lack of consensus on which metrics most accurately reflect the fairness of a system. In this work, we propose a new formulation {--} accumulated prediction sensitivity, which measures fairness in machine learning models based on the model{'}s prediction sensitivity to perturbations in input features. The metric attempts to quantify the extent to which a single prediction depends on a protected attribute, where the protected attribute encodes the membership status of an individual in a protected group. We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness. It also correlates well with humans{'} perception of fairness. We conduct experiments on two text classification datasets {--} Jigsaw Toxicity, and Bias in Bios, and evaluate the correlations between metrics and manual annotations on whether the model produced a fair outcome. We observe that the proposed fairness metric based on prediction sensitivity is statistically significantly more correlated with human annotation than the existing counterfactual fairness metric.",
}
