Model Predictive Control

  • Model Predictive Control (MPC) is a feedback control algorithm that uses a model of the plant to predict the future outputs of the system.
  • MPC is more convenient than PID control for Multiple-Input Multiple-Output (MIMO) systems because it handles the interactions between inputs and outputs within a single controller. With PIDs, a lot of effort is needed to design separate loops for plants where certain inputs influence several outputs.
  • The vanilla MPC works as follows:
    • Simulate multiple trajectories for a specific time-horizon H that follow certain specified constraints
    • Use online optimization techniques to optimize trajectories and choose the best one
    • Move one time step along the chosen trajectory
    • Advance the horizon by one time step
    • Repeat entire cycle until the goal state is reached
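  • A minimal sketch of the cycle above, under loud assumptions: the plant is a toy integrator, the candidate trajectories come from exhaustively enumerating a small set of input levels (standing in for a real online optimizer), and the names `step`, `rollout_cost`, `mpc_step` and `run_mpc` are hypothetical.

```python
from itertools import product

DT = 0.1                      # sample time (assumed value)
H = 5                         # prediction horizon in steps (assumed value)
U_LEVELS = (-1.0, 0.0, 1.0)   # allowed inputs; also acts as a hard input constraint


def step(x, u):
    """Toy plant model (an assumption): a pure integrator."""
    return x + DT * u


def rollout_cost(x, inputs, goal):
    """Summed squared deviation of one simulated trajectory from the goal."""
    cost = 0.0
    for u in inputs:
        x = step(x, u)
        cost += (x - goal) ** 2
    return cost


def mpc_step(x, goal):
    """Simulate every candidate input sequence over the horizon and
    return the first input of the cheapest one."""
    best = min(product(U_LEVELS, repeat=H),
               key=lambda seq: rollout_cost(x, seq, goal))
    return best[0]


def run_mpc(x0, goal, tol=1e-6, max_steps=100):
    """Receding-horizon loop: plan, apply one input, shift the horizon, repeat."""
    x, history = x0, [x0]
    for _ in range(max_steps):
        if abs(x - goal) < tol:   # goal state reached
            break
        x = step(x, mpc_step(x, goal))
        history.append(x)
    return history


trajectory = run_mpc(0.0, 1.0)
print(len(trajectory) - 1, "steps, final state", trajectory[-1])
```

    Only the first input of each plan is ever applied; the rest of the plan is discarded and recomputed from the new state at the next cycle.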
  • The cost of online optimization is usually the bottleneck in MPC, since the optimization is performed at every time step.
  • The optimization minimizes the error between the reference trajectory and the simulated trajectories.
  • A cost function J is used to determine the amount of deviation of a trajectory from the reference trajectory.
  • For each cycle, the trajectory with the smallest J is chosen.
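  • The post does not give a specific form for J. One common choice, shown here only as an assumed example, is a quadratic cost that penalizes tracking error and control effort. H is the prediction horizon from above; the output y, the reference r, the input change Δu, the control horizon H_c and the weight matrices Q and R are notation introduced for this sketch.

```latex
J = \sum_{k=1}^{H} \left( y_{t+k} - r_{t+k} \right)^{\top} Q \left( y_{t+k} - r_{t+k} \right)
  + \sum_{k=0}^{H_c - 1} \Delta u_{t+k}^{\top} R \, \Delta u_{t+k}
```

    A larger Q favors tight tracking of the reference, while a larger R favors smaller changes in the input.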
  • Repeat the MPC cycle until the goal state is reached
  • MPC is considered a receding horizon control method because the horizon is pushed forward by one time step in every cycle.
  • Some parameters for designing MPCs are
    • Sample Time: This is the interval at which the MPC algorithm is run. If it is too long, the controller will have a slow reaction time. If it is too short, the computation resources might be exhausted quickly.
    • Prediction Horizon: This is how far into the future the model is simulated. If it is too short, the controller cannot foresee upcoming events early enough to act on them in time. If it is too long, unexpected events might invalidate the later part of the plan, so the computation spent on it is wasted.
    • Control Horizon: This is the number of future control moves the optimizer computes in each cycle. Each control move is a free variable in the optimization, and beyond the control horizon the input is typically held constant. The higher the number, the greater the computational expense; the smaller the number, the less smooth the moves will be.
    • Constraints: These are restrictions on the simulated trajectories. Hard constraints must never be violated. Soft constraints are more liberal and can be violated if necessary, usually at a penalty in the cost. It is generally not a good idea to have hard constraints on both inputs and outputs, since they can conflict and leave the optimizer without a feasible solution. The recommendation is to make output constraints soft and to avoid hard constraints on both the inputs and the rate of change of the inputs.
    • Weights: Weights express the relative importance of the controller's goals. It is important to weight the input and output goals relative to each other so as to maintain smooth control.
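  • As a rough illustration of how these parameters fit together, the sketch below groups them in one place. Everything in it is an assumption made for illustration: the `MPCParams` container, its default values, and the helper names `expand_moves` and `cost`. Input bounds are treated as hard (clipped), the output bound as soft (penalized), and moves beyond the control horizon hold the last value.

```python
from dataclasses import dataclass


@dataclass
class MPCParams:
    sample_time: float = 0.1       # seconds between controller runs
    prediction_horizon: int = 10   # steps simulated into the future
    control_horizon: int = 3       # free control moves per cycle
    u_min: float = -1.0            # hard input bounds
    u_max: float = 1.0
    y_max: float = 1.2             # soft output bound
    w_track: float = 1.0           # weight on tracking error
    w_move: float = 0.1            # weight on input changes (smoothness)
    w_slack: float = 100.0         # penalty for violating the soft bound

    def __post_init__(self):
        if not 1 <= self.control_horizon <= self.prediction_horizon:
            raise ValueError("control horizon must lie in [1, prediction horizon]")


def expand_moves(moves, p):
    """Clip the free moves to the hard input bounds, then hold the last
    move for the remainder of the prediction horizon."""
    clipped = [min(max(u, p.u_min), p.u_max) for u in moves]
    return clipped + [clipped[-1]] * (p.prediction_horizon - len(clipped))


def cost(outputs, reference, inputs, p):
    """Weighted sum of tracking error, input changes and soft-constraint violation."""
    track = sum((y - r) ** 2 for y, r in zip(outputs, reference))
    move = sum((b - a) ** 2 for a, b in zip(inputs, inputs[1:]))
    slack = sum(max(0.0, y - p.y_max) ** 2 for y in outputs)
    return p.w_track * track + p.w_move * move + p.w_slack * slack


params = MPCParams()
print(expand_moves([2.0, 0.5, -3.0], params))
print(cost([1.0, 1.3], [1.0, 1.0], [0.0, 0.5], params))
```

    Because the output bound only adds a penalty, the optimizer can always find some solution, which is the reason for keeping output constraints soft.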
  • There are various types of MPC controllers, based on the complexity of the plant model and the computation resources available. The prominent types are as follows:
    • Linear-Time-Invariant MPC: This is an MPC used to control linear plant dynamical systems with linear constraints and a quadratic cost function, which makes the optimization problem convex.
    • Adaptive MPC: This MPC is used for nonlinear plant models with linear constraints and a quadratic cost function. The plant model is linearized at the current operating point in each MPC cycle so that a regular LTI MPC can be used to control this linearized model. The linearized model only works well near that particular operating point, so it is imperative that the plant is re-linearized at each time step. The structure of the optimization problem, i.e. the number of states and constraints, remains the same at each operating point.
    • Gain-scheduled MPC: With this method, the plant is linearized offline at each operating point of interest, and a separate MPC is designed for each one, where each model can have its own number of states and constraints. When running, a switching algorithm selects the appropriate MPC at each time step. This consumes a lot more memory than the previous MPC types since multiple MPCs are stored in memory. It is, however, computationally cheaper online than adaptive MPC, since the linearization and controller design are performed offline; at run time only the switching algorithm and the selected controller's optimization are executed.
    • Non-Linear MPC: This MPC is used for nonlinear plant models with nonlinear cost functions and nonlinear constraints. As such, the optimization is usually non-convex, with many local optima and a hard-to-find global optimum. Solving this optimization problem requires significant computation resources.
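  • The difference between adaptive and gain-scheduled MPC comes down to when the linearization happens. The sketch below shows both on a made-up scalar nonlinear plant, using central finite differences; `plant`, `linearize`, `adaptive_model`, `scheduled_model` and the chosen operating points are all hypothetical, and a real gain-scheduled design would store a full controller per operating point rather than just a model.

```python
DT = 0.1  # sample time (assumed value)


def plant(x, u):
    """Hypothetical nonlinear plant: x_next = x + DT * (-x**3 + u)."""
    return x + DT * (-x ** 3 + u)


def linearize(f, x_op, u_op, eps=1e-5):
    """Central-difference slopes (a, b) such that, near the operating point,
    x_next ~= f(x_op, u_op) + a * (x - x_op) + b * (u - u_op)."""
    a = (f(x_op + eps, u_op) - f(x_op - eps, u_op)) / (2 * eps)
    b = (f(x_op, u_op + eps) - f(x_op, u_op - eps)) / (2 * eps)
    return a, b


def adaptive_model(x, u_prev):
    """Adaptive MPC: re-linearize at the current operating point every cycle."""
    return linearize(plant, x, u_prev)


# Gain-scheduled MPC: linearize once, offline, at chosen operating points.
OPERATING_POINTS = (0.0, 1.0, 2.0)
SCHEDULE = {x_op: linearize(plant, x_op, 0.0) for x_op in OPERATING_POINTS}


def scheduled_model(x):
    """Online switching rule: pick the stored model nearest to the current state."""
    nearest = min(OPERATING_POINTS, key=lambda x_op: abs(x - x_op))
    return nearest, SCHEDULE[nearest]


print(adaptive_model(1.0, 0.0))
print(scheduled_model(1.4))
```

    The adaptive version pays for a linearization at every step but keeps a single model in memory; the scheduled version stores one model per operating point and only does a lookup online.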
  • Some ways to speed-up an MPC are:
    • Reducing the order/complexity of the plant model
    • Reducing the prediction horizon
    • Reducing the control horizon
    • Reducing the number of constraints
    • Limiting the number of optimizer iterations in each MPC cycle. This might lead to suboptimal solutions though.
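  • To make the last trade-off concrete, here is a small sketch of an iteration cap, assuming a one-step scalar problem solved by plain gradient descent. The cost, the model gain `b`, the weight `r` and the step size are illustrative choices tuned for this toy problem, not part of any particular MPC library.

```python
def cost(u, x, goal, b=1.0, r=0.1):
    """One-step cost: squared tracking error plus weighted input effort."""
    return (x + b * u - goal) ** 2 + r * u ** 2


def solve_input(x, goal, max_iters, b=1.0, r=0.1, step_size=0.4, tol=1e-9):
    """Gradient descent on the cost above, stopped either when the gradient
    is small enough or when the iteration budget runs out."""
    u, used = 0.0, 0
    while used < max_iters:
        grad = 2.0 * b * (x + b * u - goal) + 2.0 * r * u
        if abs(grad) < tol:
            break
        u -= step_size * grad
        used += 1
    return u, used


u_fast, n_fast = solve_input(0.0, 1.0, max_iters=2)    # tight budget
u_full, n_full = solve_input(0.0, 1.0, max_iters=50)   # generous budget
print(n_fast, u_fast, cost(u_fast, 0.0, 1.0))
print(n_full, u_full, cost(u_full, 0.0, 1.0))
```

    With the tight budget the solver returns a usable but slightly worse input; with the generous budget it stops early on its own once the gradient is small.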
  • Below is a video of MPC, with iterative LQR as the optimizer, controlling the joints of simulated robots to reach certain goal states in the MuJoCo simulator.
  • I plan to implement similar behaviors with an MPC + iLQG optimizer in OpenAI's Roboschool environment in the PyBullet simulation environment, since it is open-source and free, unlike MuJoCo. I will detail my implementation journey, code and results in a future blog post.
