I’m on a journey through the murky lairs of model-based learning. Over the weekend I explored model-based control methods, mainly MPC, and I have since tried to implement a paper by Yuval Tassa, Tom Erez and Emanuel Todorov, Synthesis and Stabilization of Complex Behaviors through Online Trajectory Optimization. In this work, they used MPC with iLQR as its optimizer to get complex humanoids to perform short-horizon tasks such as standing up and recovering their posture in the face of external perturbations. I hoped to implement this work, or at least look up an existing implementation. To my chagrin, the implementations I found weren’t exactly beginner-friendly and would probably take a while to understand and reproduce. So I decided to take it slowly. One line at a time.
In my search I came across mujoco-py, OpenAI’s Python wrapper for MuJoCo, and I found it pretty easy to use, as most of the gory details seem to have been abstracted away. So I decided to spend some time exploring its salient features before trying to implement MPC to control a humanoid in the environment. To start, I ran a few of the bundled examples and went through their corresponding Python scripts. The one closest to the joint-control applications I was interested in was the Tosser example. Its script was easy to follow, so I thought it wise to annotate it with comments explaining what each line does.
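To show just how little boilerplate that involves, here’s a minimal sketch of loading a model and stepping it with mujoco-py (the XML path is a placeholder; any valid MuJoCo model file would do):

```python
from mujoco_py import load_model_from_path, MjSim, MjViewer

# Parse the MJCF/XML model and wrap it in a simulation
model = load_model_from_path("xmls/tosser.xml")  # placeholder path
sim = MjSim(model)
viewer = MjViewer(sim)

for _ in range(500):
    sim.step()       # advance the physics by one timestep
    viewer.render()  # draw the current state in the on-screen viewer
```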
Here’s what the simulation does when tosser.py is run:
Here’s my commented version of the tosser.py script from the mujoco-py GitHub repository:
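The script loads tosser.xml, saves the initial simulation state, and then replays a hard-coded, open-loop control schedule over and over. The sketch below reflects the example as I remember it; the exact control values and timings may differ slightly from the version currently in the repo.

```python
#!/usr/bin/env python3
"""
mujoco-py's Tosser example: flick a falling object into a container.
"""
import os
from mujoco_py import load_model_from_path, MjSim, MjViewer

# Load the MJCF model describing the tosser mechanism and the object
model = load_model_from_path("xmls/tosser.xml")

# MjSim bundles the model together with its simulation state
sim = MjSim(model)

# On-screen viewer for watching the simulation as it runs
viewer = MjViewer(sim)

# Save the initial state so that every toss starts from the same configuration
sim_state = sim.get_state()

while True:
    # Reset to the saved state at the start of each toss
    sim.set_state(sim_state)

    for i in range(1000):
        # Open-loop control schedule: do nothing for the first 150
        # timesteps, then drive the actuators to flick the object
        if i < 150:
            sim.data.ctrl[:] = 0.0
        else:
            sim.data.ctrl[:] = -1.0
        sim.step()       # advance the physics by one timestep
        viewer.render()  # draw the current frame

    # Exit after one toss when running under a test harness
    if os.getenv('TESTING') is not None:
        break
```

The entries of sim.data.ctrl map one-to-one onto the actuators declared in the model XML, so writing to that array is all it takes to command the joints. The MPC-with-iLQR setup I’m working towards would replace this hard-coded schedule with optimized controls.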