Authors
- Miguel Vasco*
- Takuma Seno
- Kenta Kawamoto
- Kaushik Subramanian
- Pete Wurman
- Peter Stone
* External authors
Date
- 2024
A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo
Reinforcement Learning Conference (RLC), 2024
Abstract
Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Turismo. However, this agent relied on global features that require instrumentation external to the car. This paper introduces, to the best of our knowledge, the first super-human car racing agent whose sensor input is purely local to the car, namely pixels from an ego-centric camera view and quantities that can be sensed from on-board the car, such as the car's velocity. By leveraging global features only at training time, the learned agent is able to outperform the best human drivers in time trial (one car on the track at a time) races using only local input features. The resulting agent is evaluated in Gran Turismo 7 on multiple tracks and cars. Detailed ablation experiments demonstrate the agent's strong reliance on visual inputs, making it the first vision-based super-human car racing agent.
Designing a Super-Human Vision Racing Agent
Observations
We build multimodal observations for our racing agent from local and global features. For local features, we consider the ego-centric camera image and proprioceptive information (e.g., the velocity and acceleration of the car). For global features, we consider information related to the shape of the course. A minimal sketch of how such a multimodal observation might be organized is shown below (the field names and shapes are illustrative assumptions, not the exact interface used in the paper):
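```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Observation:
    # Local features (available to the agent at execution time).
    image: np.ndarray           # ego-centric camera frame, e.g. (H, W, 3) uint8
    proprioception: np.ndarray  # e.g. velocity, acceleration, previous steering
    # Global features (used only during training, see the training scheme below).
    course_shape: np.ndarray    # e.g. curvature / centerline points ahead of the car

def local_features(obs: Observation) -> dict:
    """Return only the inputs the deployed agent is allowed to see."""
    return {"image": obs.image, "proprioception": obs.proprioception}
```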
Actions
Our agent outputs a delta steering angle and a combined throttle and brake value. Gear shifting is handled by an automatic transmission. We set the control frequency to 10 Hz, and the game, running at 60 Hz, linearly interpolates the steering angle between control steps. A rough sketch of how such an action could be applied at a 10 Hz control rate follows (the limits and names are assumptions for illustration; the 60 Hz interpolation is handled by the simulator, not by this code):
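```python
import numpy as np

STEER_LIMIT = 1.0  # assumed normalized steering range [-1, 1]
CONTROL_DT = 0.1   # 10 Hz control frequency

def apply_action(prev_steering: float, action: np.ndarray) -> tuple[float, float]:
    """action[0]: delta steering angle, action[1]: combined throttle/brake in [-1, 1]."""
    steering = float(np.clip(prev_steering + action[0], -STEER_LIMIT, STEER_LIMIT))
    throttle_brake = float(np.clip(action[1], -1.0, 1.0))  # > 0 throttle, < 0 brake
    return steering, throttle_brake
```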
Reward Function
We designed a multi-component reward function that rewards track progression and penalizes collisions, off-track driving, and inconsistent driving. A hedged sketch of a reward with this shape is given below; the individual terms and weights are placeholders, not the coefficients used in the paper:
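```python
def reward(progress_m: float, collided: bool, off_track: bool,
           steering_change: float,
           w_progress: float = 1.0, w_collision: float = 1.0,
           w_off_track: float = 1.0, w_smooth: float = 0.1) -> float:
    """Reward course progress; penalize collisions, off-track driving,
    and inconsistent (jerky) steering. All weights are illustrative."""
    r = w_progress * progress_m
    r -= w_collision * float(collided)
    r -= w_off_track * float(off_track)
    r -= w_smooth * abs(steering_change)
    return r
```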
Training Scheme
We train our agent with an asymmetric actor-critic architecture, in which the critic uses different input modalities from the policy: the policy network \(\pi_{\phi}\) receives proprioceptive information \(o^p\) and image features \(h^I\), encoded with a convolutional neural network \(q_{\theta}\), and outputs actions \(\mathbf{a}\). The critic network \(Q_{\psi}\) receives both local proprioceptive observations and global observations \(o^g\) to predict quantiles of the distribution of future returns. During execution, our agent only receives local features from the Gran Turismo™ 7 simulator. A minimal PyTorch sketch of this asymmetric structure is shown below, assuming a small CNN encoder \(q_{\theta}\), a Gaussian policy over the two actions, and a quantile critic; layer sizes and the number of quantiles are illustrative assumptions:
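```python
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):          # q_theta: camera image -> feature vector h^I
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(), nn.Flatten(),
        )
        self.fc = nn.LazyLinear(feat_dim)

    def forward(self, image: torch.Tensor) -> torch.Tensor:
        return self.fc(self.conv(image))

class Policy(nn.Module):                # pi_phi(a | h^I, o^p): local inputs only
    def __init__(self, feat_dim: int, prop_dim: int, act_dim: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + prop_dim, 256), nn.ReLU(),
            nn.Linear(256, 2 * act_dim),        # mean and log-std per action
        )

    def forward(self, h_img: torch.Tensor, o_prop: torch.Tensor):
        mean, log_std = self.net(torch.cat([h_img, o_prop], dim=-1)).chunk(2, dim=-1)
        return mean, log_std

class QuantileCritic(nn.Module):        # Q_psi(o^p, o^g, a): also sees global features
    def __init__(self, prop_dim: int, glob_dim: int,
                 act_dim: int = 2, n_quantiles: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(prop_dim + glob_dim + act_dim, 256), nn.ReLU(),
            nn.Linear(256, n_quantiles),        # quantiles of the return distribution
        )

    def forward(self, o_prop, o_glob, action):
        return self.net(torch.cat([o_prop, o_glob, action], dim=-1))
```
Because the critic is discarded after training, the global course information it consumes never has to be sensed by the deployed car; only the encoder and policy run at execution time.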
Results
We evaluate our agent in time trial races, where the goal is to complete a lap across the track in the minimum time possible. We consider three scenarios in Gran Turismo™ 7 with different combinations of cars, tracks, and conditions (track time and weather): Monza, Tokyo, and Spa, modeled after real-world circuits and roads.
We compare our agent against over 130K human drivers in each scenario. In all three scenarios, our agent achieves super-human performance, with lap times significantly faster than those of the best human player.
In the paper we additionally highlight: (i) the importance of the asymmetric training scheme; (ii) novel driving behavior in comparison with the best human reference drivers; and (iii) the strong dependence of our agent's decision-making on image features.
BibTeX
@InProceedings{RLC24-sophy,
author="Miguel Vasco and Takuma Seno and Kenta Kawamoto and Kaushik Subramanian and PeterR.\ Wurman and Peter Stone",
title="A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in {G}ran {T}urismo",
booktitle="Reinforcement Learning Conference (RLC)",
month="August",
year="2024",
location="Amherst, MA, USA",
abstract={Racing autonomous cars faster than the best human drivers
has been a longstanding grand challenge for the fields of
Artificial Intelligence and robotics. Recently, an
end-to-end deep reinforcement learning agent met this
challenge in a high-fidelity racing simulator, Gran
Turismo. However, this agent relied on global features
that require instrumentation external to the car. This
paper introduces, to the best of our knowledge, the first
super-human car racing agent whose sensor input is purely
local to the car, namely pixels from an ego-centric camera
view and quantities that can be sensed from on-board the
car, such as the car's velocity. By leveraging global
features only at training time, the learned agent is able
to outperform the best human drivers in time trial (one
car on the track at a time) races using only local input
features. The resulting agent is evaluated in Gran Turismo
7 on multiple tracks and cars. Detailed ablation
experiments demonstrate the agent's strong reliance on
visual inputs, making it the first vision-based
super-human car racing agent.}
}