Authors
- W. Bradley Knox*
- Stephane Hatgis-Kessell*
- Sigurdur Orn Adalgeirsson*
- Serena Booth*
- Anca Dragan*
- Peter Stone
- Scott Niekum*
* External authors
Venue
- AAAI 2024
Date
- 2024
Learning Optimal Advantage from Preferences and Mistaking it for Reward
Abstract
We consider algorithms for learning reward functions from human preferences over pairs of trajectory segments, as used in reinforcement learning from human feedback (RLHF), including those used to fine-tune ChatGPT and other contemporary language models. Most recent work on such algorithms assumes that human preferences are generated based only upon the reward accrued within those segments, which we call their partial return. But if this assumption is false because people base their preferences on information other than partial return, then what type of function are these algorithms learning from preferences? We argue that this function is better thought of as an approximation of the optimal advantage function, not as a partial return function as previously believed.
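To make the distinction concrete, here is a minimal Python sketch of two ways a preference between segments could be generated. It rests on assumptions that the abstract does not state: a logistic (Bradley-Terry style) preference model, which is common in the RLHF literature, and made-up per-step rewards and optimal advantage values. All function and variable names are hypothetical, and this is an illustration rather than the authors' implementation.

```python
import math


def preference_probability(score_1, score_2):
    """Logistic probability that segment 1 is preferred over segment 2.

    Assumption: preferences follow a logistic model over the difference
    of two scalar segment scores.
    """
    return 1.0 / (1.0 + math.exp(-(score_1 - score_2)))


# Hypothetical per-step rewards for two trajectory segments.
segment_1_rewards = [0.0, 0.0, 1.0]
segment_2_rewards = [0.5, 0.0, 0.0]

# Partial return: the sum of the rewards accrued within a segment.
partial_return_1 = sum(segment_1_rewards)
partial_return_2 = sum(segment_2_rewards)
p_partial_return = preference_probability(partial_return_1, partial_return_2)

# Hypothetical optimal advantage values for the same transitions.
# A value of 0 marks an optimal action; negative values mark suboptimal ones.
segment_1_advantages = [0.0, -2.0, 0.0]
segment_2_advantages = [0.0, 0.0, 0.0]

# Alternative assumption: people score a segment by its summed optimal advantage.
p_advantage = preference_probability(
    sum(segment_1_advantages), sum(segment_2_advantages)
)

print(f"P(segment 1 preferred | partial return):    {p_partial_return:.3f}")
print(f"P(segment 1 preferred | optimal advantage): {p_advantage:.3f}")
```

With these invented numbers the two models disagree about which segment is more likely to be preferred. A learner that assumes partial return but is trained on preferences generated the second way would therefore fit something other than the reward function.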