
Model Predictive Control (MPC)

This blog is a practical introduction to model predictive control, featuring the ‘hacks’ we pulled off to win the IROS-RSJ Navigation and Manipulation Challenge. Most of these tricks were cooked up to solve a grid world problem, but they are valid for continuous real-world tasks. They are by no means the most elegant solutions (we were undergrads powered by caffeine and deadlines) but they work 😛. The goal here is to make MPC approachable and beginner-friendly. I’ve also thrown in simple ROS2 and TurtleBot code so you can get your hands dirty right away.

An analogy: Imagine you are an artist about to create or recreate a piece. You have a vision for the final piece. You start laying brushstrokes in your sketchbook, focusing on one part of the drawing at a time. Along the way some brushstrokes come out differently than you intended: maybe a little too thick, maybe in the wrong orientation, or maybe you just accidentally laid the brush in the wrong place. You observe these errors and work from there, planning your further brushstrokes accordingly. This continual adjustment process is key.

Model Predictive Control works on a similar principle of iterative correction and planning:

  • Measure (or estimate) the current state of the system.
  • Using a model of the system dynamics, predict the system's evolution over a finite horizon and solve an optimal control problem for the best sequence of control actions over that horizon.
  • Apply only the first control action from this sequence.
  • At the next time step, measure the state again and repeat the whole process.

The last step ensures that we are continually taking care of any uncertainties in the system model as the models are generally only an estimate of the real world dynamics. The beauty of this is that it allows us to get away with using simpler system models to reduce the complexity of high DOF systems like quadrupeds and humanoids.

MPC is generally quite robust to noise, since it plans a recovery over a finite window (the prediction horizon) rather than trying to correct everything in a single time step, which would produce jerky movements.

Optimal control problem / goal → The final artwork
Prediction horizon → The part of the drawing you are currently focusing on
Applying the first control action → A single brushstroke
Uncertainties in the system model → Slightly off brushstrokes
Noise → Accidental or unintentional brushstrokes

A visual representation of discrete time MPC (By Martin Behrendt - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=7963069)

Onto the mathematics:

The discrete time system dynamics of any system can be represented as:

$$x_{t+1} = f(x_t, u_t)$$

$x_t$ → state at time $t$

$u_t$ → control action taken at time $t$

$f$ → state dynamics
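As a concrete example, here is a minimal discrete-time model for the differential-drive (unicycle) robot used later in this post, with state $x = [p_x, p_y, \theta]$ and control $u = [v, \omega]$. The simple Euler discretization and the time step `dt = 0.1` are illustrative choices, not the ones from our code:

```python
import numpy as np

def f(x, u, dt=0.1):
    """Discrete-time differential-drive (unicycle) dynamics.

    x = [px, py, theta] (position and yaw), u = [v, omega]
    (linear and angular velocity). Simple Euler discretization.
    """
    px, py, theta = x
    v, omega = u
    return np.array([
        px + dt * v * np.cos(theta),
        py + dt * v * np.sin(theta),
        theta + dt * omega,
    ])
```

Calling `f` repeatedly rolls the model forward, which is exactly how the predicted states over the horizon are generated.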

As the figure above suggests, MPC can produce discontinuous control laws that are hard to obtain with conventional control techniques.

Note: if $x_{t+1} = f(x_t) + g(x_t)\,u_t$, the system is said to have control-affine dynamics.

Now let $l(x_i, u_i)$ be the cost-to-go of being at state $x_i$ and applying control action $u_i$, and let $l(x_N)$ be the terminal cost of ending at state $x_N$. The cost function $J$ is then defined as the sum of the cost-to-go over the current and predicted states:

$$J(X, U) = \sum_{i=0}^{N-1} l(x_i, u_i) + l(x_N)$$

$U$ → vector containing the $N$ control actions $[u_0, \ldots, u_{N-1}]$

$X$ → vector containing the $N+1$ states $[x_0, \ldots, x_N]$
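To make the cost concrete, here is a sketch of a common choice for $l$: quadratic penalties on state error and control effort. The weight matrices `Q`, `R`, `Qf` and the function names are my own illustration, not taken from our code:

```python
import numpy as np

def stage_cost(x, u, x_ref, Q, R):
    """l(x_i, u_i): quadratic penalty on state error and control effort."""
    dx = x - x_ref
    return dx @ Q @ dx + u @ R @ u

def total_cost(X, U, X_ref, Q, R, Qf):
    """J(X, U): sum of stage costs over the horizon plus a terminal cost."""
    J = sum(stage_cost(X[i], U[i], X_ref[i], Q, R) for i in range(len(U)))
    dxN = X[-1] - X_ref[-1]
    return J + dxN @ Qf @ dxN  # l(x_N): terminal cost
```

Tuning `Q` versus `R` trades tracking accuracy against control effort; the terminal weight `Qf` is often chosen larger to pull the horizon toward the goal.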

The MPC control problem can now be framed as a constrained optimization problem:

$$\begin{aligned} \min\ J(X, U) &= \sum_{i=0}^{N-1} l(x_i, u_i) + l(x_N) \\ \text{s.t. } x_{i+1} &= f(x_i, u_i) \\ x_i &\in \mathbb{X}\quad \forall\, i \in [0, N] \\ u_i &\in \mathbb{U}\quad \forall\, i \in [0, N-1] \end{aligned}$$

$\mathbb{X}$ → set of permitted state values

$\mathbb{U}$ → set of permitted control values

To solve the above optimization problem we will transcribe it into a Non-linear Programming problem (NLP) and use the open-source CasADi framework to find a numerical solution. We will be using two transcription approaches, namely Single Shooting and Multiple Shooting.

1) Single Shooting:

Let $W$ be the vector containing the optimization variables, i.e. $W = [u_0, u_1, \ldots, u_{N-1}]$.

Given the initial state $x_0$, we can obtain the next $N$ (predicted) states as a function of $W$ by rolling the dynamics forward:

$$X = \begin{bmatrix} x_0 \\ x_1 = f(x_0, u_0) \\ \vdots \\ x_N = f\big(f(x_{N-2}, u_{N-2}), u_{N-1}\big) \end{bmatrix}$$

The NLP can now be represented as:

$$\begin{aligned} \min_W\ &J(X(W), W) \\ \text{s.t. }\ &g(X(W), W) \le 0 \end{aligned}$$

where the function $g$ now represents both the equality and inequality constraints of the NLP; the actuator limits and the state constraints are expressed this way. The above problem can then be handed to any NLP solver.
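To make this concrete, here is a dependency-light single-shooting sketch. Our actual implementation uses CasADi (see the linked repo); here I substitute `scipy.optimize.minimize` as the NLP solver, and the dynamics, horizon, weights, and bounds are all illustrative:

```python
import numpy as np
from scipy.optimize import minimize

dt, N = 0.1, 10  # time step and prediction horizon (illustrative values)

def f(x, u):
    # discrete unicycle dynamics: x = [px, py, theta], u = [v, omega]
    return np.array([x[0] + dt * u[0] * np.cos(x[2]),
                     x[1] + dt * u[0] * np.sin(x[2]),
                     x[2] + dt * u[1]])

def rollout(x0, W):
    # single shooting: W holds only the N controls;
    # the states are recomputed recursively from x0 on every evaluation
    U = W.reshape(N, 2)
    X = [x0]
    for u in U:
        X.append(f(X[-1], u))
    return np.array(X), U

def J(W, x0, x_goal):
    X, U = rollout(x0, W)
    # quadratic tracking cost plus a small control-effort penalty
    return sum(np.sum((x - x_goal) ** 2) for x in X[1:]) + 0.01 * np.sum(U ** 2)

x0, goal = np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0])
res = minimize(J, np.zeros(2 * N), args=(x0, goal),
               bounds=[(-2.0, 2.0)] * (2 * N))  # actuator limits as box bounds
```

In a receding-horizon loop, only the first control `res.x[:2]` would be applied before re-solving at the next time step.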

Note:

  • Single shooting is simple and intuitive to implement.
  • However, it scales poorly with increasing prediction horizon $N$ due to the recursive computation of states.
  • The state trajectory is not part of the optimization variables and must be recomputed from scratch each time.
  • This also makes warm-starting difficult, as only the control inputs (not the states) can be initialized from previous solutions.
  • The recursive structure increases non-linearity, making the problem harder to solve as $N$ increases.

2) Multiple Shooting:

The key idea behind multiple shooting MPC is to break the prediction horizon into shorter partitions and add the system model as a constraint at each optimization step, instead of solving for the states recursively. In other words, we include both the control inputs and the state variables in the optimization vector:

$$W = [u_0, u_1, \ldots, u_{N-1}, x_0, x_1, \ldots, x_N]$$

Now the cost becomes a direct function of the controls and the states ($W$), unlike in single shooting where it was a recursive function of the optimization variables. This reformulation makes the NLP ‘less non-linear’ and hence decreases the computation time. Also, instead of computing the state trajectory recursively, the system dynamics are enforced as equality constraints at each step. This decouples the dynamics from the cost function and improves numerical stability.

Now the NLP problem will have two sets of constraints, i.e. inequality and equality constraints:

$g_1$ → the control and state inequality constraints

$g_2$ → the equality constraints that make the predicted states consistent with the system dynamics (and the measured initial state).


The NLP can now be formulated as:

$$\begin{aligned} \min_W\ &J(W) \\ \text{s.t. }\ &g_1(X, W) \le 0, \text{ and} \\ &g_2 = \begin{bmatrix} \overline{x}_0 - x_0 \\ f(x_0, u_0) - x_1 \\ \vdots \\ f(x_{N-1}, u_{N-1}) - x_N \end{bmatrix} = 0 \end{aligned}$$

where $\overline{x}_0$ is the measured (actual) initial state.
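Again as a hedged sketch (our code uses CasADi; scipy's SLSQP stands in here, and all names and values are illustrative), the optimization vector now stacks the controls and the states, and the dynamics enter as the equality constraint $g_2$:

```python
import numpy as np
from scipy.optimize import minimize

dt, N, nx, nu = 0.1, 5, 3, 2  # horizon and state/control dimensions (illustrative)

def f(x, u):
    # discrete unicycle dynamics: x = [px, py, theta], u = [v, omega]
    return np.array([x[0] + dt * u[0] * np.cos(x[2]),
                     x[1] + dt * u[0] * np.sin(x[2]),
                     x[2] + dt * u[1]])

def unpack(W):
    # W = [u_0 ... u_{N-1}, x_0 ... x_N]: controls AND states are variables
    U = W[:N * nu].reshape(N, nu)
    X = W[N * nu:].reshape(N + 1, nx)
    return U, X

def cost(W, x_goal):
    U, X = unpack(W)
    return np.sum((X - x_goal) ** 2) + 0.01 * np.sum(U ** 2)

def g2(W, x0_bar):
    # equality constraints: initial condition plus the dynamics defects
    U, X = unpack(W)
    defects = [x0_bar - X[0]]
    defects += [f(X[i], U[i]) - X[i + 1] for i in range(N)]
    return np.concatenate(defects)

x0_bar = np.array([0.0, 0.0, 0.0])
goal = np.array([0.5, 0.0, 0.0])
W0 = np.zeros(N * nu + (N + 1) * nx)  # both states and controls can be warm-started
res = minimize(cost, W0, args=(goal,), method='SLSQP',
               constraints={'type': 'eq', 'fun': g2, 'args': (x0_bar,)})
```

Note how the cost never calls `f` recursively; the solver is free to violate the dynamics during iterations and is only forced to satisfy them at the solution.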

Benefits:

  • The NLP is ‘less non-linear’, so the solver typically converges faster.
  • Both the states and the controls can be warm-started from the previous solution.
  • Enforcing the dynamics as constraints rather than recursion improves numerical stability.

On to the hacks:

Here we will go through the hacks we used to augment MPC for better control. You can refer to the code for the implementation of these hacks. The problem we solve looks like this:

The world our differential drive robot operates in
The co-ordinate system we assign to the robot

Depending on the reference state and control values used, a point tracking or a path tracking MPC can be deployed.

We used a path tracking MPC for our purpose, giving it the next $N$ points from the path planning module. This allows the controller to follow a reference velocity across the entire trajectory instead of slowing down at every point and accelerating again, as typically seen in point tracking.

To get a more accurate estimate of the state trajectory, we use the 4th-order Runge-Kutta method instead of simple Euler discretization. (Note: refer to this amazing lecture by Prof. Zachary Manchester on why never to use Euler’s method for state prediction)
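A minimal sketch of what an RK4 step looks like for the unicycle model (the function names and time step are mine, not from our code):

```python
import numpy as np

def xdot(x, u):
    # continuous-time unicycle dynamics: x = [px, py, theta], u = [v, omega]
    return np.array([u[0] * np.cos(x[2]), u[0] * np.sin(x[2]), u[1]])

def rk4_step(x, u, dt):
    """One 4th-order Runge-Kutta integration step of the dynamics."""
    k1 = xdot(x, u)
    k2 = xdot(x + 0.5 * dt * k1, u)
    k3 = xdot(x + 0.5 * dt * k2, u)
    k4 = xdot(x + dt * k3, u)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
```

On turning motion the robot traces an arc, which Euler integration cuts as a straight chord; RK4 stays very close to the true arc even for coarse time steps.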

Adaptive Weights: Classical MPC uses a fixed set of weights for your robot, which often requires over-tuning for a specific hardware setup or environment. The problem with this is that even small changes, like deploying the same code on a similar robot or operating in a slightly different environment, can break performance. Adaptive weights help solve this by letting the controller adjust depending on the situation.

By selecting weights based on the path or environment the robot is going to face, the same MPC framework can generalize across similar systems and programmed scenarios. It also makes tuning much easier since you no longer need to overfit the weights to every specific setup. For instance, on straight segments we increase the yaw weight so the robot holds its heading, while near tight turns we increase the position weights so the corner is tracked more closely.

To do this efficiently, we initialize the MPC problem in variable form offline to save computation. Then, instead of just passing the reference state and control values as parameters, we also append the weights to the parameter matrix. These are substituted into the MPC problem at runtime and passed to CasADi to compute the optimal control.
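As a toy illustration of the weight-selection idea (the threshold and weight values here are made up for this post, not the values we used):

```python
import numpy as np

def select_weights(waypoints, i, straight_thresh=0.1):
    """Pick the state weight matrix Q based on upcoming path curvature."""
    a = waypoints[i + 1] - waypoints[i]
    b = waypoints[i + 2] - waypoints[i + 1]
    # heading change between consecutive path segments (2D cross and dot products)
    dtheta = abs(np.arctan2(a[0] * b[1] - a[1] * b[0], a @ b))
    if dtheta < straight_thresh:
        # straight segment: emphasize yaw so the robot holds its heading
        return np.diag([1.0, 1.0, 5.0])
    # turn ahead: emphasize position so the corner is tracked tightly
    return np.diag([5.0, 5.0, 1.0])
```

At runtime the selected weights are simply appended to the solver's parameter vector alongside the reference states and controls, so no re-initialization of the NLP is needed.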

Resolution for Path Planning: We use a variable resolution for path planning, higher resolution during turns (as shown in the figure below) to maintain accuracy, and lower resolution for straight segments to save computation. This way, we reduce the number of waypoints the optimizer needs to track without losing important path information.

The reasoning here is simple: a straight path doesn’t require dense waypoints. As long as the yaw is maintained correctly, the robot will stay on course. We ensure this by assigning a higher weight to yaw in the cost function during straight-line segments.

This approach not only reduces computational load but also leads to a higher average linear speed, since the controller has fewer waypoints to track and doesn’t need to slow down as often to precisely match every intermediate target.

Adjustment of waypoint resolution
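A simple sketch of such a resolution filter (the angle threshold and spacing are illustrative; the real values live in the linked code):

```python
import numpy as np

def adjust_resolution(path, angle_thresh=0.2, keep_every=4):
    """Drop intermediate waypoints on straight stretches, keep all near turns."""
    path = np.asarray(path, dtype=float)
    keep = [0]
    for i in range(1, len(path) - 1):
        a, b = path[i] - path[i - 1], path[i + 1] - path[i]
        # heading change at waypoint i (2D cross and dot products)
        turn = abs(np.arctan2(a[0] * b[1] - a[1] * b[0], a @ b))
        if turn > angle_thresh or (i - keep[-1]) >= keep_every:
            keep.append(i)
    keep.append(len(path) - 1)
    return path[keep]
```

Straight runs collapse to a few sparse waypoints while every corner waypoint survives, so the optimizer sees fewer targets without losing the shape of the path.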

Corrective Turns: Paths generated in grid worlds often include sharp turns. In a standard path-tracking MPC, the controller tries to follow the next $N$ waypoints rather than just the immediate one. As a result, it tends to take a smooth arc through the turn instead of a sharp one. While this looks good in open spaces, it can be problematic when there are obstacles near the corner, since the oversmoothed trajectory can lead to collisions. Note that we have a separate planning module (refer to the paper) for waypoint calculation, and adding the obstacles as constraints would increase the time the solver takes to compute the trajectory. So we instead employ a simpler hack.

While it’s possible to enforce tighter turns (like in-place rotations) by over-tuning or modifying the cost function, doing so usually destabilizes the controller in other scenarios, like straight or diagonal motion. We initially tried handling this by switching to point tracking at tight corners, but that introduced its own set of problems, like slowing down at the waypoints (speed was an important factor for winning :p).

In our path-tracking MPC, we tried increasing the position cost in the $Q$ matrix to force tighter turns. Although this helped avoid obstacles, the resulting behavior became jerky and slower, especially at turns, which wasn’t ideal. To fix this, we developed a corrective turning algorithm.

Here's how it works: before passing the reference waypoints to the MPC, we check the heading error between the robot's current yaw and the direction of the upcoming path segment. If this error exceeds a threshold (i.e. a sharp corner is coming up), we filter the reference so that the robot first turns toward the new segment (a corrective turn) before resuming normal path tracking.

This simple filtering step, when combined with adaptive weights, led to smooth and fast behaviors, even in tight, obstacle-filled paths. You can clearly see the difference in the following figure:
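A stripped-down version of the check that triggers a corrective turn might look like this (the threshold is illustrative; our actual values and the full filtering logic are in the linked code):

```python
import numpy as np

def needs_corrective_turn(x, next_wp, yaw_thresh=np.pi / 3):
    """Return True when the heading to the next waypoint differs from the
    robot's yaw by more than yaw_thresh, i.e. a sharp corner is ahead and
    the robot should rotate toward it before resuming path tracking."""
    px, py, theta = x
    desired = np.arctan2(next_wp[1] - py, next_wp[0] - px)
    err = (desired - theta + np.pi) % (2 * np.pi) - np.pi  # wrap to [-pi, pi)
    return abs(err) > yaw_thresh
```

The angle-wrapping line matters: without it, a heading error of 350° would be treated as a huge turn instead of a small -10° correction.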

If you notice mistakes and errors in this post, please contact me at [shivamsood2000 at gmail dot com] and I will be more than happy to correct them :D

References:

[1] Amazing video and MATLAB implementation: https://github.com/MMehrez/MPC-and-MHE-implementation-in-MATLAB-using-Casadi

[2] Our code and paper for the winning solution in the IROS-RSJ Robotic Challenge: