Query by Activity Video in the Wild

Tao Hu*

William Thong

Pascal Mettes*

Cees Snoek*

* External authors

ICIP 2023

2023

Abstract

This paper considers retrieval of videos containing human activity from just a video query. In the literature, a common assumption is that all activities have sufficient labelled examples when learning an embedding for retrieval. However, this assumption does not hold in practice, as only a portion of activities have many examples, while other activities are only described by few examples. In this paper, we propose a visual-semantic embedding network that explicitly deals with the imbalanced scenario for activity retrieval. Our network contains two novel modules. The visual alignment module performs a global alignment between the input video and visual feature bank representations for all activities. The semantic module performs an alignment between the input video and semantic activity representations. By matching videos with both visual and semantic activity representations over all activities, we no longer ignore infrequent activities during retrieval. Experiments on a new imbalanced activity retrieval benchmark show the effectiveness of our proposal.
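As a rough illustration of the idea in the abstract, the sketch below describes each video by its similarity to every activity, frequent or rare, and then ranks gallery videos against a video query. It is not the paper's network. It assumes each activity has one visual vector (standing in for the visual feature bank) and one semantic vector, that videos and semantic vectors share an embedding space, and that cosine similarity is an adequate matching function. All names (`activity_profile`, `retrieve`, `visual_bank`, `semantic_bank`) and the toy data are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def activity_profile(video, visual_bank, semantic_bank):
    """Describe a video by its similarity to every activity.

    visual_bank and semantic_bank map each activity name to one vector.
    They are hypothetical stand-ins for the visual feature bank and the
    semantic activity representations mentioned in the abstract.
    """
    activities = sorted(visual_bank)
    visual = [cosine(video, visual_bank[a]) for a in activities]
    semantic = [cosine(video, semantic_bank[a]) for a in activities]
    return visual + semantic


def retrieve(query, gallery, visual_bank, semantic_bank):
    """Rank gallery video ids by profile similarity to the query, best first."""
    q = activity_profile(query, visual_bank, semantic_bank)
    scores = {
        vid: cosine(q, activity_profile(vec, visual_bank, semantic_bank))
        for vid, vec in gallery.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy 2-D data, made up for illustration only.
visual_bank = {"running": [1.0, 0.0], "juggling": [0.0, 1.0]}
semantic_bank = {"running": [0.9, 0.1], "juggling": [0.1, 0.9]}
gallery = {"clip_a": [0.8, 0.2], "clip_b": [0.1, 0.9]}
ranking = retrieve([0.2, 0.8], gallery, visual_bank, semantic_bank)
```

Because every activity contributes an entry to the profile regardless of how many training examples it has, an infrequent activity such as the hypothetical "juggling" still influences the ranking. In the paper these alignments are learned by the network's two modules rather than computed from fixed vectors as done here.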

