
o-Oscar/rl_toolbox


rl_toolbox - Actual implementation of PPO for quadrupedal locomotion

The name rl_toolbox comes from the many RL algorithm implementations I tested before settling on the PPO implementation from Stable Baselines 3.

This repo contains the environment as well as the helper algorithms needed to train a quadruped in simulation and deploy the resulting neural networks on a real machine.

Key features

  • A Gym environment simulating a quadruped (specifically Idef'X) with the Erquy simulator.
  • Helper functions to compute inverse kinematics (IK) and build a small library of motions for the RL policy to build on.
  • A small reimplementation of Stable Baselines 3's PPO that enforces a symmetrical gait and avoids excessively large KL divergence between policy updates.
  • A transfer algorithm in which a teacher, trained with RL on the full observation space, supervises a student that only has access to measurable physical properties. Inspiration heavily drawn from this paper.
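The symmetry constraint above can be sketched as an auxiliary loss: the action the policy takes for a mirrored observation should be the mirror of its original action. This is only an illustration, not the repo's actual code; the joint index sets for Idef'X are assumptions.

```python
import numpy as np

# Illustrative index sets swapping left-side and right-side leg joints.
# The real Idef'X joint ordering is not given here, so these are assumptions.
LEFT = np.array([0, 1, 2, 6, 7, 8])
RIGHT = np.array([3, 4, 5, 9, 10, 11])

def mirror(x):
    """Reflect a joint-space vector about the robot's sagittal plane (sketch)."""
    m = x.copy()
    m[..., LEFT] = x[..., RIGHT]
    m[..., RIGHT] = x[..., LEFT]
    return m

def symmetry_loss(policy, obs):
    """Penalize asymmetric behavior: mirror(policy(mirror(obs))) should
    equal policy(obs). Added to the PPO objective as a regularizer."""
    a = policy(obs)
    a_mirrored = mirror(policy(mirror(obs)))
    return float(np.mean((a - a_mirrored) ** 2))
```

A left/right-symmetric policy incurs zero loss, while a policy that favors one side is penalized, which biases training toward symmetrical gaits.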
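The teacher-student transfer can likewise be sketched as supervised regression: the student, which only sees the measurable slice of the observation, is trained to imitate the actions of a teacher that sees everything. The linear "networks", observation split, and learning rate below are all stand-in assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N_PRIV, N_MEAS, N_ACT = 8, 12, 12  # assumed sizes: privileged obs, measurable obs, actions

# Stand-in "teacher": a fixed linear policy over the full observation
# (privileged + measurable). The real teacher would be the RL-trained network.
W_teacher = 0.1 * rng.standard_normal((N_ACT, N_PRIV + N_MEAS))
teacher = lambda full_obs: W_teacher @ full_obs

# Student: sees only the measurable slice, starts from scratch.
W_student = np.zeros((N_ACT, N_MEAS))

# Fixed evaluation batch to track imitation quality.
eval_batch = rng.standard_normal((256, N_PRIV + N_MEAS))

def eval_mse(W):
    """Mean squared imitation error of the student on the eval batch."""
    preds = eval_batch[:, N_PRIV:] @ W.T
    targets = eval_batch @ W_teacher.T
    return float(np.mean((preds - targets) ** 2))

mse_before = eval_mse(W_student)

# Behavior-cloning-style training: regress the student's actions onto
# the teacher's actions with plain SGD on 0.5 * ||error||^2.
lr = 0.05
for _ in range(2000):
    full_obs = rng.standard_normal(N_PRIV + N_MEAS)
    meas_obs = full_obs[N_PRIV:]
    err = W_student @ meas_obs - teacher(full_obs)  # imitation error
    W_student -= lr * np.outer(err, meas_obs)       # SGD step

mse_after = eval_mse(W_student)
```

The residual error that remains after training reflects what the teacher inferred from privileged observations the student cannot measure, which is exactly the gap this kind of distillation is meant to minimize.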

Resulting policy

The baseline policy we obtain is a simple walking policy, robust to small pushes on the robot.
