Model Predictive Control (MPC)

This blog contains a practical introduction to model predictive control featuring the ‘hacks’ we pulled off to win the IROS-RSJ Navigation and Manipulation Challenge. Most of these tricks were cooked up to solve a gird world problem but they are valid for continuous real world tasks. These in no way are the most elegant solutions (we were undergrads powered by caffeine and deadlines) but they work 😛. The goal here is to make MPC approachable and beginner-friendly. I’ve also thrown in simple ROS2 and TurtleBot code so you can get your hands dirty right away.

An analogy: Imagine you are an artist about to create/recreate a piece. You have a vision for the final piece. You start laying brushstrokes on your sketchbook focusing on one part of the drawing at a time. Along the way you lay brushstrokes that are not exactly what you intended them to be, maybe they are a little too thick, maybe in the wrong orientation or you just accidentally laid the brush at the wrong place. You observe these errors and work from there, planning your further brushstrokes accordingly. This continual adjustment process is key.

Model Predictive Control works on a similar principle of iterative correction and planning:

It solves an optimal control problem for a finite prediction horizon.

Using the current state as the initial state and the system model, a set of optimal control actions are calculated for the prediction horizon of which only the first control action is applied to the system.

The control problem is then again solved using the new state as the current state and the process is repeated.

The last step ensures that we are continually taking care of any uncertainties in the system model as the models are generally only an estimate of the real world dynamics. The beauty of this is that it allows us to get away with using simpler system models to reduce the complexity of high DOF systems like quadrupeds and humanoids.

MPC is generally quite robust to noises since it plans how to recover in finite time (our prediction horizon), rather than recovering in a single time step producing jerky movements.

Optimal Control Problem/ Goal	Final artwork
Prediction Horizon	The part of drawing you are currently focusing on
Applying the first control action	Single brushstroke
Uncertainties in the system model	Slightly off brushstrokes
Noise	Accidental or unintentional brushstrokes

A visual representation of discrete time MPC (By Martin Behrendt - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=7963069)

Onto the mathematics:

The discrete time system dynamics of any system can be represented as:

x_{t+1}=f(x_t\ ,u_t)

$x_t$ → state at time $t$

$u_t$ → control action to be taken at time $t$

$f$ → state dynamics

As can be seen from the above graph, MPC can provide discontinuous control laws that are not easily achieved using conventional control techniques.

Note if, $x_{t+1}=f(x_t) + g(x_t)u_t$ → control affine dynamic update equation

Now let $l(x_i , u_i)$ be the cost-to-go at state $x_i$ applying a control action $u_i$ and $l(x_n)$ be the cost of being at state $x_n$ . Then the cost function $J$ is defined to be the sum of cost-to-go from the current and predicted states:

J\left(X\ ,\ U\right)\ =\ \sum_{i\ =\ 0}^{N-1}l\left(x_i\ ,\ u_i\right)\ +\ l\left(x_n\right)

$U$ → Vector containing the corresponding N-1 control actions

$X$ → Vector containing N states

The MPC control problem can now be framed as a constrained optimization problem:

\begin{split} min\ J\left(X\ ,\ U\right)\ &=\ \sum_{i\ =\ 0}^{N-1}l\left(x_i\ ,\ u_i\right)\ +\ l\left(x_n\right)\\ \text{s.t. } x_{i+1}\ &=\ f\left(x_i\ ,\ u_i\right) \end{split} \\ x_i\ \in\ \mathbb{X}\ \forall\ i\ \in\ \left[0,N\right] \text{ and} \\u_i\ \in\ \mathbb{U}\ \forall\ i\ \in\ \left[0,N-1\right]

$\mathbb{X}$ →Set of permitted action values $\mathbb{U}$ →Set of permitted control values

To solve the above optimization problem we will convert it into a Non-linear Programming Problem (NLP) and use the open source CasADi solver to find a numerical solution. We will be using two approaches namely Single Shooting and Multiple Shooting.

1) Single Shooting:

Let $W$ be a vector containing the optimization variables, i.e. $W=\left[u_0,u_1\ldots u_{N-1}\right]$ .

Given the initial state $x_0$ , we can now get the next N states (predicted) as a function of W as follows:

X\ =\ \left[\begin{matrix}x_0\\\begin{matrix}x_1=f\left(x_0,u_0\right)\\\begin{matrix}\vdots\\x_N=f\left(f\left(x_{N-2},u_{N-2}\right),u_{N-1}\right)\\\end{matrix}\\\end{matrix}\\\end{matrix}\right]

The NLP can now be represented as:

\begin{split} {_W ^{min\ }}&J(X(W),W) \\ \text{s.t. }& g\left(X(W),W\right)\le0 \end{split}

where the function $g$ is now used to represent the equality and inequality constraints for the NLP. The actuator limits and the state constraints are represented this way. The above equation can now be solved using any NLP solver.

Note:

Single shooting is simple and intuitive to implement.

However, it scales poorly with increasing prediction horizon $N$ due to the recursive computation of states

The state trajectory is not part of the optimization variables and must be recomputed from scratch each time.

This also makes warm-starting difficult, as only the control inputs (not states) can be initialized with previous solutions.

The recursive structure increases non-linearity, making the problem harder to solve as $N$ increases.

2) Multiple Shooting:

The key idea behind multiple shooting MPC is to break down the prediction horizon time interval into shorter partitions and add the system model as a state constraint at each optimization step instead of solving for it recursively. Basically, include both the control inputs and state variables in the optimization vector:

So, $W=\left[u_0,u_1\ldots u_{N-1},x_0,x_1\ldots x_N\right]$

Now the cost becomes a function of controls and the states ( $W$ ) unlike in single shooting where it was a recursive function of the optimization variables. This reformulation makes the NLP ‘less non-linear’ and hence decreases the computation time. Also, instead of computing the state trajectory recursively, the system dynamics are enforced as equality constraints at each step. This decouples the dynamics from the cost function and improves numerical stability.

Now the NLP problem will have two sets constraints, i.e. inequality and equality constraints:

$g_1$ → the controls and state constraints

$g_2$ → to make our predicted states equal to the actual states.

The NLP can now be formulated as:

\begin{split} &\quad\quad\quad\mathop{\min}\limits_{W}J(W)\\ \text{s.t. }& g_1\left(X,W\right)\le0, \text{ and} \\g_2=&\left[\begin{matrix}{\overline{x_0}-\ x}_0\\\begin{matrix}f\left(x_0,u_0\right)-x_1\\\begin{matrix}\vdots\\f(x_{N-1},\ u_{N-1}) - x_{N}\\\end{matrix}\\\end{matrix}\\\end{matrix}\right]=0 \end{split}

Benefits:

The full trajectory (controls and states) is optimized jointly, enabling better warm-starting.

The problem is typically less nonlinear compared to single shooting, reducing computation time.

More robust to poor initial guesses, especially in long horizons or unstable systems.

On to the hacks:

Here we will go through the hacks we used to augment MPC for better control. You can refer the code for implementation of these hacks The problem we solve looks like this:

The world our differential drive robot operates in

The co-ordinate system we assign to the robot

Depending on the reference state and control values used, a point tracking or a path tracking MPC can be deployed.

In point tracking, only the next point is provided as the reference.

In path tracking, the next $N$ points are provided as reference, with the initial point fixed as the robot’s current state.

We used a path tracking MPC for our purpose, giving it the next $N$ points from the path planning module. This allows the controller to follow a reference velocity across the entire trajectory instead of slowing down at every point and accelerating again, as typically seen in point tracking.

To get a more accurate estimate of the state trajectory, we use the 4th-order Runge-Kutta method instead of simple Euler discretization. (Note: refer to this amazing lecture by Prof. Zachary Manchester on why never to use Euler’s method for state prediction)

Adaptive Weights: Classical MPC uses a fixed set of weights for your robot, which often requires over-tuning for a specific hardware setup or environment. The problem with this is that even small changes, like deploying the same code on a similar robot or operating in a slightly different environment, can break performance. Adaptive weights help solve this by letting the controller adjust depending on the situation.

By selecting weights based on the path or environment the robot is going to face, the same MPC framework can generalize across similar systems and programmed scenarios. It also makes tuning much easier since you no longer need to overfit the weights to every specific setup. For instance:

The weight on linear velocity $R_{11}$ can be reduced if there's a straight path ahead.

The weight on yaw angle can be increased or equivalently, the weight on angular velocity actuation $R_{22}$ can be decreased if there's a sharp turn coming up.

To do this efficiently, we initialize the MPC problem in variable form offline to save computation. Then, instead of just passing the reference state and control values as parameters, we also append the weights to the parameter matrix. These are substituted into the MPC problem at runtime and passed to CasADi to compute the optimal control.

Resolution for Path Planning: We use a variable resolution for path planning, higher resolution during turns (as shown in the figure below) to maintain accuracy, and lower resolution for straight segments to save computation. This way, we reduce the number of waypoints the optimizer needs to track without losing important path information.

The reasoning here is simple: a straight path doesn’t require dense waypoints. As long as the yaw is maintained correctly, the robot will stay on course. We ensure this by assigning a higher weight to yaw in the cost function during straight-line segments.

This approach not only reduces computational load but also leads to a higher average linear speed, since the controller has fewer waypoints to track and doesn’t need to slow down as often to precisely match every intermediate target.

Corrective Turns: Paths generated in grid worlds often include sharp turns. In a standard path-tracking MPC, the controller tries to follow the next $N$ waypoints rather than just the immediate one. As a result, it tends to take a smooth arc through the turn instead of a sharp one. While this looks good in open spaces, it can be problematic when there are obstacles near the corner, since the oversmoothed trajectory can lead to collisions. Note that we have a separte planning module (refer to the paper) for waypoint calculation and adding the obstacles as constraints would increase the time it takes for the solver to compute the trajectory. So we instead employ a simpler hack.

While it’s possible to enforce tighter turns (like in-place rotations) by over-tuning or modifying the cost function, doing so usually destabilizes the controller in other scenarios, like straight or diagonal motion. We initially tried handling this by switching to point-tracking at tight corners, but that introduced its own set of problems like slowing down at the waypoints(speed was an important factor for winning :p) .

In our path-tracking MPC, we tried increasing the position cost in the $Q$ matrix to force tighter turns. Although this helped avoid obstacles, the resulting behavior became jerky and slower, especially at turns which wasn’t ideal. To fix this, we developed a corrective turning algorithm.

Here's how it works:

If the bot is still far from a turning point (distance greater than a threshold), we pre-process the path before feeding it to the controller.

The filter removes local waypoints just after the turn, effectively telling the controller to prioritize getting to the turn itself.

Once the bot is within the threshold distance, the filter is removed and the full path is restored.

The threshold is easy to set, it can be based on obstacle size or the local environment.

This simple filtering step, when combined with adaptive weights, led to smooth and fast behaviors, even in tight, obstacle-filled paths.You can clearly see the difference in the following figure:

The yellow dotted line shows the original orthogonal path from the discrete planner.

The red line shows the path produced by the MPC tracker without correction, it overshoots the corners.

The blue line, with turn correction and adaptive weights, gives the final smooth and safe trajectory

If you notice mistakes and errors in this post, please contact me at [shivamsood2000 at gmail dot com] and I will be more than happy to correct them :D

References:

[1] Amazing video and MATLAB implementation: https://github.com/MMehrez/MPC-and-MHE-implementation-in-MATLAB-using-Casadi

[2] Our code and paper for the winning solution in the IROS-RSJ Robotic Challenge:

Paper: https://arxiv.org/abs/2209.08663

Code: https://github.com/thisisjaskaran/multi-waypoint-indoor-navigation