I See, Therefore I Do: Estimating Causal Effects for Image Treatments
A Thorat
R Kolla
* External authors
KDD-25
2025
Abstract
Causal effect estimation in observational studies is challenging due to the lack of ground-truth data and treatment assignment bias. Although various methods exist in the literature for addressing this problem, most of them ignore multi-dimensional treatment information by treating the treatment as a scalar, either continuous or discrete. Recently, certain works have demonstrated the utility of incorporating this rich yet complex treatment information into the estimation process, resulting in better causal effect estimation. However, these works have been demonstrated only on graph or textual treatments. There is a notable gap in the existing literature in addressing higher-dimensional data such as images, which have a wide variety of applications. In this work, we propose a model named NICE (Network for Image treatments Causal effect Estimation) for estimating individual causal effects when treatments are images. NICE demonstrates an effective way to use the rich multidimensional information present in image treatments, which helps in obtaining improved causal effect estimates. To evaluate the performance of NICE, we propose a novel semi-synthetic data simulation framework that generates potential outcomes when images serve as treatments. Empirical results on these datasets, under various setups including the zero-shot case, demonstrate that NICE significantly outperforms existing models that incorporate treatment information for causal effect estimation.
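To make the idea of generating potential outcomes for image treatments concrete, here is a minimal toy sketch in Python. It is not the simulation framework proposed in the paper, whose details the abstract does not give. The name `make_simulator`, the random-projection image features, and the linear-plus-interaction outcome function are all assumptions chosen purely for illustration.

```python
import numpy as np


def make_simulator(n_covariates, image_shape, n_features=4, seed=0):
    """Build a toy, noise-free potential-outcome function y(x, image).

    Hypothetical illustration only: the image is reduced to a few
    features by a fixed random projection, and the outcome combines
    covariates, image features, and their interaction.
    """
    rng = np.random.default_rng(seed)
    n_pixels = int(np.prod(image_shape))
    # Fixed random projection standing in for an image feature extractor.
    proj = rng.normal(size=(n_pixels, n_features)) / np.sqrt(n_pixels)
    w_x = rng.normal(size=n_covariates)                  # covariate weights
    w_t = rng.normal(size=n_features)                    # treatment weights
    w_xt = rng.normal(size=(n_covariates, n_features))   # interaction weights

    def potential_outcome(x, image):
        phi = np.tanh(image.reshape(-1) @ proj)          # image features
        return float(x @ w_x + phi @ w_t + x @ w_xt @ phi)

    return potential_outcome


rng = np.random.default_rng(1)
outcome = make_simulator(n_covariates=3, image_shape=(8, 8))

x = rng.normal(size=3)                 # one unit's covariates
image_a = rng.uniform(size=(8, 8))     # first image treatment
image_b = rng.uniform(size=(8, 8))     # second image treatment

# Individual causal effect of receiving image_a rather than image_b.
effect_ab = outcome(x, image_a) - outcome(x, image_b)
print(effect_ab)
```

Because the simulator can be evaluated on any image, both potential outcomes are available for every unit, including for images never assigned in the observed data. This is the kind of ground truth needed to score effect estimates in a zero-shot setting.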