"What We Can’t Measure, We Can’t Understand": Challenges to Demographic Data Procurement in the Pursuit of Fairness

Authors

McKane Andrus*

Elena Spitzer*

Jeffrey Brown*

Alice Xiang

* External authors

Venue

FAccT-2021

Date

2021

Abstract

As calls for fair and unbiased algorithmic systems increase, so too does the number of individuals working on algorithmic fairness in industry. However, these practitioners often do not have access to the demographic data they feel they need to detect bias in practice. Even with the growing variety of toolkits and strategies for working towards algorithmic fairness, they almost invariably require access to demographic attributes or proxies. We investigated this dilemma through semi-structured interviews with 38 practitioners and professionals either working in or adjacent to algorithmic fairness. Participants painted a complex picture of what demographic data availability and use look like on the ground, ranging from not having access to personal data of any kind to being legally required to collect and use demographic data for discrimination assessments. In many domains, demographic data collection raises a host of difficult questions, including how to balance privacy and fairness, how to define relevant social categories, how to ensure meaningful consent, and whether it is appropriate for private companies to infer someone’s demographics. Our research suggests challenges that must be considered by businesses, regulators, researchers, and community groups in order to enable practitioners to address algorithmic bias in practice. Critically, we do not propose that the overall goal of future work should be to simply lower the barriers to collecting demographic data. Rather, our study surfaces a swath of normative questions about how, when, and whether this data should be procured, and, in cases where it is not, what should still be done to mitigate bias.

Related Publications

A View From Somewhere: Human-Centric Face Representations

ICLR, 2023
Jerone T. A. Andrews, Przemyslaw Joniak, Alice Xiang

Few datasets contain self-identified sensitive attributes, inferring attributes risks introducing additional biases, and collecting attributes can carry legal risks. Besides, categorical labels can fail to reflect the continuous nature of human phenotypic diversity, making i…

Considerations for Ethical Speech Recognition Datasets

WSDM, 2023
Orestis Papakyriakopoulos, Alice Xiang

Speech AI Technologies are largely trained on publicly available datasets or by the massive web-crawling of speech. In both cases, data acquisition focuses on minimizing collection effort, without necessarily taking the data subjects’ protection or user needs into considerat…

Causality for Temporal Unfairness Evaluation and Mitigation

NeurIPS, 2022
Aida Rahmattalabi, Alice Xiang

Recent interests in causality for fair decision-making systems has been accompanied with great skepticism due to practical and epistemological challenges with applying existing causal fairness approaches. Existing works mainly seek to remove the causal effect of social categ…
