Authors

* External authors

Venue

Date

Share

A View From Somewhere: Human-Centric Face Representations

Jerone T. A. Andrews

Przemyslaw Joniak*

Alice Xiang

* External authors

ICLR 2023

2023

Abstract

Few datasets contain self-identified sensitive attributes, inferring attributes risks introducing additional biases, and collecting attributes can carry legal risks. Besides, categorical labels can fail to reflect the continuous nature of human phenotypic diversity, making it difficult to compare the similarity between same-labeled faces. To address these issues, we present A View From Somewhere (AVFS)—a dataset of 638,180 human judgments of face similarity. We demonstrate the utility of AVFS for learning a continuous, low-dimensional embedding space aligned with human perception. Our embedding space, induced under a novel conditional framework, not only enables the accurate prediction of face similarity, but also provides a human-interpretable decomposition of the dimensions used in the human-decision making process, and the importance distinct annotators place on each dimension. We additionally show the practicality of the dimensions for collecting continuous attributes, performing classification, and comparing dataset attribute disparities.

Related Publications

GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data

ICLR, 2026
Zhiteng Li, Lele Chen, Jerone Andrews, Yunhao Ba, Yulun Zhang, Alice Xiang

We propose a generative agent that augments training datasets with synthetic datafor model fine-tuning. Unlike prior work, which uniformly samples synthetic data,our agent iteratively generates relevant samples on-the-fly, aligning with the targetdistribution. It prioritizes…

Responsibly Training Foundation Models: Actualizing Ethical Principles for Curating Large-Scale Training Datasets in the Era …

ACM SIGCHI, 2025
Morgan Klaus Scheuerman, Dora Zhao*, Jerone T. A. Andrews, Abeba Birhane, Q. Vera Liao*, Georgia Panagiotidou*, Pooja Chitre*, Kathleen Pine, Shawn Walker*, Jieyu Zhao*, Alice Xiang

AI technologies have become ubiquitous, influencing domains from healthcare to finance and permeating our daily lives. Concerns about the values underlying the creation and use of datasets to develop AI technologies are growing. Current dataset practices often disregard crit…

A Taxonomy of Challenges to Curating Fair Datasets

NeurIPS, 2024
Dora Zhao*, Morgan Klaus Scheuerman, Pooja Chitre*, Jerone Andrews, Georgia Panagiotidou*, Shawn Walker*, Kathleen H. Pine*, Alice Xiang

Despite extensive efforts to create fairer machine learning (ML) datasets, there remains a limited understanding of the practical aspects of dataset curation. Drawing from interviews with 30 ML dataset curators, we present a comprehensive taxonomy of the challenges and trade…

JOIN US

Shape the Future of AI with Sony AI

We want to hear from those of you who have a strong desire
to shape the future of AI.