Design of a game-theoretic adaptive cruise control (ACC) algorithm using a one-player dynamic game model with a full information structure. We apply a receding horizon control algorithm that predicts the actions of the other players/drivers to yield the optimal acceleration.
For this project we constrain the autonomous driving challenge to one dimension, where the autonomous driver's objective is to achieve its set speed while avoiding collisions.
Fig.1 - Simplified Adaptive cruise control model.
With full information structure the autonomous driver (in red) is able to read the following state variables:
Fig.2 - Model state variables
With the state variables defined we can now formulate our ACC one-player dynamic game.
Fig.3 - One-Player Dynamic Game formulation
Here the action uk represents the action taken at time step k. But how do we determine the optimal uk that balances our objectives? For this challenge, we introduce a performance metric known as the cost function J(xk, uk). The cost function should penalize two main phenomena in this scenario:
- Events where a collision is likely
- Deviations from our velocity setpoint
For measuring the likelihood of a potential crash we use the time-to-collision (TTC) metric, whose derivation is shown in Figure 4.
Fig.4 - TTC Derivation
TTC is simply the relative distance divided by the relative (closing) velocity. The derivation only needs to handle the sign cases of relative distance and velocity to reach a compact and correct form of the metric. The TTC metric is robust in scenarios where another vehicle converges on the driver's tail at high speed, and it works in the best interest of traffic overall since it deters premature braking.
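The sign handling described above can be sketched in code (a minimal sketch; the function name and example numbers are illustrative assumptions, not values from the project):

```python
def time_to_collision(gap, closing_speed, eps=1e-6):
    """TTC = relative distance / closing speed.

    gap: distance to the other car in meters (positive).
    closing_speed: rate at which the gap shrinks (m/s);
                   positive means the cars are converging.
    Returns infinity when the cars are separating, i.e. no
    collision on the current course.
    """
    if closing_speed <= eps:
        return float('inf')
    return gap / closing_speed

# Front car: ego at 25 m/s, front car at 20 m/s, 40 m ahead -> closing at 5 m/s.
ttc_front = time_to_collision(gap=40.0, closing_speed=25.0 - 20.0)
# Rear car: rear car at 30 m/s, 30 m behind the 25 m/s ego -> closing at 5 m/s.
ttc_rear = time_to_collision(gap=30.0, closing_speed=30.0 - 25.0)
```

Returning infinity for a separating pair is what makes the metric deter premature braking: a vehicle pulling away contributes no collision cost at all.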
With our TTC defined, we insert the metric into an asymptotic function to map the TTC to a cost scalar.
Fig.5 - TTC to Cost Mapping
This asymptotic function lets us penalize the TTC ever more harshly as it approaches zero (i.e. a collision). α in this equation is a weight that controls the TTC threshold at which the cost starts exploding to large values (e.g. start penalizing when TTC < 15 seconds).
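One plausible asymptotic mapping consistent with this description is α/TTC, which blows up as TTC approaches zero and decays toward zero for large TTC (a sketch only; the exact function in Fig. 5 may differ, and the α value here is an assumption):

```python
def ttc_cost(ttc, alpha=15.0):
    """Map a TTC value (seconds) to a cost scalar.

    Assumed form: alpha / TTC. The cost grows without bound as TTC -> 0
    and becomes negligible for TTC well above alpha seconds.
    """
    if ttc == float('inf'):
        return 0.0  # cars are separating: no collision penalty
    return alpha / ttc
```

With α = 15, a TTC of 15 s costs 1.0, while a TTC of 1 s costs 15.0, matching the intent that the penalty only becomes significant below roughly the 15-second threshold.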
When we substitute the TTC for the front and rear cars into this function, we obtain the following cost function for avoiding collisions.
Fig.6 - Collision likelihood cost function
The cost of deviation from our velocity can be represented simply through the following equation:
Fig.7 - Velocity Setpoint Cost Function
Where β represents a weight on how much we want the car to prioritize reaching the setpoint.
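A quadratic penalty is the natural reading of this setpoint cost (a sketch under that assumption; the exact form is in Fig. 7 and the β value here is illustrative):

```python
def velocity_cost(v, v_set, beta=1.0):
    """Assumed quadratic penalty on deviation from the velocity setpoint.

    beta trades off setpoint tracking against collision avoidance:
    a larger beta makes the car fight harder to hold v_set.
    """
    return beta * (v - v_set) ** 2
```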
Finally, we arrive at the cost function that we want to minimize with respect to our action uk.
Fig.8 - Complete ACC Cost Function.
Intuitively, if the action commits to the maximum acceleration it crashes into the front vehicle and if it commits to the minimum acceleration the rear vehicle will crash into it. In both cases the cost function explodes to infinity. So how do we find the middle ground that minimizes the cost function? We implement a receding horizon control algorithm that works as follows:
- Discretize the action space u∈[-amax,amax] into M distinct accelerations
- Read the state variables xk
- At time step k, simulate N time steps ahead for all possible actions, assuming the front and rear vehicles will commit to the same velocity for those time steps. Evaluate the states for each action time step pair.
- Evaluate the cost function for all states
- Sum the cost across the time steps and select the action that yields the lowest total cost
- Apply action and iterate for next step.
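The steps above can be sketched as a single receding-horizon step (a minimal sketch: the function name, parameter values, and the α/TTC and quadratic cost forms are assumptions, not the project's exact implementation):

```python
def acc_action(x_e, v_e, x_f, v_f, x_r, v_r,
               a_max=3.0, M=21, N=10, dt=0.1,
               alpha=15.0, beta=0.05, v_set=30.0):
    """One receding-horizon step for the ego vehicle.

    (x_e, v_e): ego position/velocity; (x_f, v_f): front car; (x_r, v_r): rear car.
    Returns the acceleration in [-a_max, a_max] whose N-step trajectory has the
    lowest total cost, assuming the other cars hold their current speeds.
    """
    # Step 1: discretize the action space into M accelerations.
    actions = [-a_max + 2.0 * a_max * i / (M - 1) for i in range(M)]
    best_a, best_cost = 0.0, float('inf')
    for a in actions:
        xe, ve, xf, xr = x_e, v_e, x_f, x_r
        cost = 0.0
        for _ in range(N):
            # Step 3: propagate simple kinematics; neighbors keep constant speed.
            xe += ve * dt + 0.5 * a * dt ** 2
            ve += a * dt
            xf += v_f * dt
            xr += v_r * dt
            gap_f = max(xf - xe, 1e-3)  # clamp so a crash yields a huge cost
            gap_r = max(xe - xr, 1e-3)
            # Step 4: collision cost alpha/TTC, zero when the gap is growing...
            if ve > v_f:
                cost += alpha * (ve - v_f) / gap_f
            if v_r > ve:
                cost += alpha * (v_r - ve) / gap_r
            # ...plus the quadratic setpoint-tracking cost.
            cost += beta * (ve - v_set) ** 2
        # Step 5: keep the action with the lowest summed cost.
        if cost < best_cost:
            best_cost, best_a = cost, a
    # Step 6: the caller applies this action and repeats next time step.
    return best_a
```

For example, with a fast rear car closing from behind and a clear road ahead, the minimizer accelerates away; with a slow front car close ahead, it brakes, exactly the middle-ground behavior described above.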
For simulation we assume the following initial states and model parameters.
Initial States and Model Parameters
ACC_Demo.mp4
Fig.9 - Adaptive Cruise Control Demonstration
From this result we conclude that the ACC is a success! The ego vehicle, operating under the ACC algorithm, appropriately applies its maximum acceleration to avoid being rear-ended. After avoiding the collision, we can see it attempting to reach its setpoint of 100 m/s while being limited by the front vehicle's position.
In summary, this ACC algorithm simulates the performance of all actions before selecting the best one. During this simulation the algorithm assumes the other agents will hold their speeds for the next N time steps and evaluates the states for all possible actions over those N steps. Once the states are computed through simple kinematic equations, we evaluate each action by applying the cost function along its trajectory. Then we select the action with the best performance (i.e. the lowest cost).
With the logic of the game theoretic adaptive cruise control in place, some future improvements for this ACC include:
- Enable ego car to detect front and rear relative distances instead of just passing the values through in simulation
- Make distance and velocity estimates robust to noise/disturbances
- Properly model the car's gas -> position plant
- Remodel the cars to occupy physical length instead of being point agents
- Include an objective of conserving fuel or minimizing braking
- Add additional minimum safety distances for margin of error
- Develop accurate models of human drivers and their driving policies
This was the final project of UCSB's ECE 270 course, Non-Cooperative Game Theory, taught by Professor Joao Hespanha in Fall 2021.