
This is a platform for building simple RL demonstrations; team members will improve it step by step. Hold on! Aoligei! (Let's go!)


ReinforcementLearning-StudyNote/ReinforcementLearning


Installation

This platform runs on both Windows and Ubuntu as long as Python 3 is installed. There are currently no ROS-related packages; it only requires PyTorch, NumPy, OpenCV, and Matplotlib.

Package versions

  • Python: 3.8.10
  • numpy: 1.22.2
  • opencv: 4.7.0
  • pytorch: 1.13.1 + cu117 (a CPU-only build also works for this platform)
  • matplotlib: 3.7.1

Installation

  • Windows: Anaconda3 is recommended because it is easy to install and uninstall. Extra packages (torch, etc.) can be installed after Anaconda3 is set up.
  • Ubuntu: Ubuntu 20.04 is recommended because it already integrates Python 3. However, Anaconda3 is still required if your Ubuntu version is lower than 20.04.
  • The PyTorch version depends on your device: choose a CPU-only build or a specific CUDA build. We have tested the code with different versions of PyTorch and they all work. Example setup commands are given below.
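
For example, a fresh setup with Anaconda3 might look like this (a sketch: 'rl' is an arbitrary environment name, and a CUDA build of PyTorch can be selected on pytorch.org instead of the default):

conda create -n rl python=3.8
conda activate rl
pip install numpy opencv-python matplotlib torch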

ReinforcementLearning

Currently, this repository consists of five directories: algorithm, common, datasave, environment, and simulation.

Algorithm

Algorithm includes some commonly used reinforcement learning algorithms.
The following table lists the RL algorithms in their corresponding directories; a sketch of the rl_base inheritance pattern follows the table.

Directory      Algorithms                    Description
actor_critic   A2C, DDPG, SAC, TD3
policy_base    PPO, DPPO, DPPO2              DPPO2 does not work
value_base     DQN, DoubleDQN, DuelingDQN
rl_base        ----                          basic classes inherited by the other algorithms
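
The relationship between rl_base and the concrete algorithms can be pictured roughly as follows. This is a hypothetical sketch: the class and method names are illustrative, not the repository's actual API.

class RLBase:
    """Shared machinery for all agents (illustrative only)."""

    def __init__(self, state_dim: int, action_dim: int, gamma: float = 0.99):
        self.state_dim = state_dim
        self.action_dim = action_dim
        self.gamma = gamma  # discount factor

    def choose_action(self, state):
        raise NotImplementedError  # each algorithm supplies its own policy


class DQN(RLBase):
    """A value-based agent would add a Q-network, a replay buffer, etc."""

    def __init__(self, state_dim: int, action_dim: int):
        super().__init__(state_dim, action_dim)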

Common

Common includes common_func.py and common_cls.py, which contain some basic functions and classes.
The following table lists the contents of the two files; an illustrative replay-buffer sketch follows it.

File            Description
common_cls.py   ReplayBuffer, RolloutBuffer, OUNoise, NeuralNetworks, etc.
common_func.py  basic mathematical functions, geometry operations, etc.
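
For orientation, a replay buffer of the kind common_cls.py provides usually looks like the following. This is a minimal sketch, not the file's exact interface.

import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """FIFO transition store with uniform random minibatch sampling."""

    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted first

    def store(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)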

Datasave

Datasave stores the networks trained by the RL algorithms, together with some data files.

Environment

Environment contains the physical models, which are called 'environments' in RL.
The 'config' directory contains the *.xml files, the model-description files of all environments.
The 'envs' directory contains the ODEs of the physical environments.
The following table lists all current environments; a sketch of the common reset/step interface follows the table.
Environment Directory Description
CartPole ./CartPole/ continuous, position and angle
CartPoleAngleOnly ./CartPole/ continuous, just angle
CartPoleAngleOnlyDiscrete ./CartPole/ discrete, just angle
FlightAttitudeSimulator ./FlightAttitudeSimulator/ discrete
FlightAttitudeSimulator2StateContinuous ./FlightAttitudeSimulator/ continuous, states are only theta and dtheta
FlightAttitudeSimulatorContinuous ./FlightAttitudeSimulator/ continuous
UAVHover ./UAV/ continuous, other files in ./UAV are not RL environments
UGVBidirectional ./UGV/ continuous, the vehicle can move forward and backward
UGVForward ./UGV/ continuous, the vehicle can only move forward
UGVForwardDiscrete ./UGV/ discrete, the vehicle can only move forward
UGVForwardObstacleContinuous ./UGV/ continuous, the vehicle needs to avoid obstacles
UGVForwardObstacleDiscrete ./UGV/ discrete, the vehicle needs to avoid obstacles
UGVForward_pid ./UGV_PID/ UGV forward with PID controller tuned by RL
UGVBidirectional_pid ./UGV_PID/ UGV bidirectional with PID controller tuned by RL
TwoLinkManipulator ./RobotManipulators/ continuous, full drive
BallBalancer1D ./RobotManipulators/ continuous, 1D ball balanced by a manipulator
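
All environments follow the usual reset/step pattern. The sketch below is hypothetical: the class name, method names, and state layout are illustrative, and the real classes live in './environment/envs/'.

import numpy as np


class EnvSketch:
    """Illustrative environment skeleton: integrate an ODE one step per action."""

    def __init__(self, dt: float = 0.02):
        self.dt = dt  # integration time step
        self.state = np.zeros(4)

    def reset(self):
        self.state = np.zeros(4)  # e.g. [x, dx, theta, dtheta]
        return self.state

    def step(self, action):
        # one explicit-Euler step of x' = f(x, u); real envs implement f
        self.state = self.state + self.dt * self._dynamics(self.state, action)
        reward = -float(np.sum(self.state ** 2))  # toy quadratic cost
        done = bool(abs(self.state[2]) > 0.5)     # e.g. pole fell over
        return self.state, reward, done

    def _dynamics(self, x, u):
        return np.zeros_like(x)  # placeholder for the model-specific ODE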
Simulation

Simulation is where we implement our simulation experiments, that is, running different algorithms in different environments. A minimal interaction loop is sketched below.
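
Concretely, a simulation script pairs an agent from './algorithm/' with an environment from './environment/'. The sketch below reuses the hypothetical EnvSketch above and a random policy as a stand-in for a real agent's choose_action.

import random

env = EnvSketch()
for episode in range(3):
    state = env.reset()
    episode_return = 0.0
    for step in range(200):             # cap the episode length
        action = random.choice([0, 1])  # stand-in for agent.choose_action(state)
        state, reward, done = env.step(action)
        episode_return += reward
        if done:
            break
    print(f"episode {episode}: return {episode_return:.2f}")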

Demos

Currently, we have the following well-trained controllers:

DDPG

A DDPG controller for

  • FlightAttitudeSimulator
  • UGVBidirectional (motion planner)
  • UGVForward (motion planner)
  • UGVForwardObstacleAvoidance (motion planner)

DQN

A DQN controller for

  • FlightAttitudeSimulator
  • SecondOrderIntegration
  • SecondOrderIntegration_Discrete

A Dueling DQN controller for

  • FlightAttitudeSimulator

TD3

A TD3 trajectory planner for:

  • UGVForwardObstacleAvoidance
  • CartPole
  • CartPoleAngleOnly
  • FlightAttitudeSimulator
  • SecondOrderIntegration
  • UGVForward_pid

PPO

A PPO controller for:

  • CartPoleAngleOnly
  • FlightAttitudeSimulator2State
  • SecondOrderIntegration_Discrete
  • UGVForward_pid
  • UGVBidirectional_pid
  • TwoLinkManipulator

DPPO

A DPPO controller for:

  • CartPoleAngleOnly
  • CartPole
  • FlightAttitudeSimulator2State
  • SecondOrderIntegration
  • UGVBidirectional_pid
  • TwoLinkManipulator
  • BallBalancer1D

Run the scripts

All runnable scripts are in './simulation/'.

A DQN controller for a flight attitude simulator.

In 'DQN-4-Flight-Attitude-Simulator.py', set the following flags (set TRAIN to True if you want to train a new controller). A sketch of how these flags typically gate the script follows the code.

 TRAIN = False
 RETRAIN = False
 TEST = not TRAIN
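
These flags typically gate the script body along the following lines (a hypothetical outline, not the script's exact code):

if TRAIN:
    pass  # build a fresh agent and train it from scratch
if RETRAIN:
    pass  # load a previously saved network (e.g. from './datasave/') and continue training
if TEST:
    pass  # load a trained network and run evaluation episodes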

In a command window:

cd simulation/DQN_based/
python3 DQN-4-Flight-Attitude-Simulator.py

The result should be similar to the demo animation in the repository.

A DDPG motion planner with obstacle avoidance for a forward-only UGV.

In 'DDPG-4-UGV-Forward-Obstacle.py', set the following flags (set TRAIN to True if you want to train a new motion planner):

 TRAIN = False
 RETRAIN = False
 TEST = not TRAIN

In a command window:

cd simulation/PG_based/
python3 DDPG-4-UGV-Forward-Obstacle.py

The result should be similar to the demo animation in the repository.

A DPPO controller for the SecondOrderIntegration system.

The result should be similar to the demo animation in the repository.

A PPO controller for the TwoLinkManipulator system.

The result should be similar to the demo animation in the repository.

A DPPO controller for the CartPole system (both position and angle).

The result should be similar to the demo animation in the repository.

A DPPO controller for the BallBalancer1D system.

The result should be similar to the demo animation in the repository.

TODO

Algorithms

  • Add A2C
  • Add A3C
  • Add PPO
  • Add DPPO
  • Add D4PG

Demo

  • Train controllers for CartPole
  • Add some PPO demos
  • Add some DPPO demos
  • Add some A3C demos

Environments

  • Modify UGV (add acceleration loop)
  • Add a UAV regulator
  • Add a UAV tracker
  • Add a 2nd-order integration system
  • Add a dual-joint robotic arm
  • Add a 2nd-order cartpole (optional)

Debug

  • Debug DPPO2
  • Debug DQN-based algorithms (multi-action agents)
