Basic implementation of Deep Deterministic Policy Gradient (DDPG)

mlcuva/ddpg_demo

MLC @ UVA DDPG Baseline

A PyTorch implementation of Deep Deterministic Policy Gradient (DDPG) for simple continuous control tasks.

More info: Continuous control with deep reinforcement learning
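For readers who have not yet read the paper, the two update rules at the heart of DDPG can be sketched in a few lines. This is a minimal, dependency-free illustration using plain floats rather than PyTorch tensors, and it is not taken from this repository's code. The function names `td_target` and `soft_update` are hypothetical, the default values of `gamma` and `tau` are the ones reported in the paper (this repo's defaults may differ), and skipping the bootstrap term on terminal transitions is common practice rather than something this README states.

```python
from typing import List


def td_target(reward: float, next_q: float, done: bool, gamma: float = 0.99) -> float:
    """Critic regression target: y = r + gamma * Q'(s', mu'(s')).

    `next_q` stands for the target critic's value of the target actor's
    action in the next state. On terminal transitions the bootstrap term
    is dropped (a common convention, assumed here).
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target: List[float], online: List[float], tau: float = 0.001) -> List[float]:
    """Polyak averaging of target weights: theta' <- tau * theta + (1 - tau) * theta'."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


# Toy usage: one transition and a single-weight "network".
y = td_target(reward=1.0, next_q=2.0, done=False, gamma=0.5)
new_target = soft_update(target=[0.0], online=[1.0], tau=0.5)
print(y, new_target)
```

In a real implementation the same arithmetic would be applied to batched tensors and to every parameter of the target actor and target critic.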

Pretrained Agents

To watch a pretrained agent on Pendulum-v0:

python run.py --env Pendulum-v0 --agent saves/pretrained_pendulum --episodes 10

or on MountainCarContinuous-v0:

python run.py --env MountainCarContinuous-v0 --agent saves/pretrained_mountaincar --episodes 10

Train Agents

python ddpg.py

There are many command-line flags. See the bottom of ddpg.py for the full list; the most important ones are:

  • --env is the Gym environment id. Options are MountainCarContinuous-v0 and Pendulum-v0.
  • --num_episodes is the number of episodes of experience to collect during training. Defaults to 500.
  • --batch_size is the number of sampled transitions passed through the networks at once during training. Defaults to 128. This may need to be reduced when running on a CPU.
  • --render is either 0 or 1. Setting it to 1 renders the environment so you can watch the agent as it learns, which slows training down.
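As a rough illustration of how the four flags above fit together, here is a minimal argparse sketch. It is an assumption about the general shape of such a parser, not a copy of the definitions at the bottom of ddpg.py, which remain the authoritative source. The helper name `build_parser` is hypothetical, and the defaults for `--env` and `--render` are guesses because the text above does not state them; only the defaults for `--num_episodes` (500) and `--batch_size` (128) come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the four flags described above."""
    parser = argparse.ArgumentParser(description="Train a DDPG agent (illustrative sketch)")
    parser.add_argument(
        "--env",
        type=str,
        default="Pendulum-v0",  # assumed default; not stated in the README
        choices=["Pendulum-v0", "MountainCarContinuous-v0"],
        help="Gym environment id",
    )
    parser.add_argument(
        "--num_episodes", type=int, default=500,
        help="episodes of experience to collect during training",
    )
    parser.add_argument(
        "--batch_size", type=int, default=128,
        help="transitions sampled per training step; reduce on CPU",
    )
    parser.add_argument(
        "--render", type=int, default=0, choices=[0, 1],  # assumed default of 0
        help="1 renders the environment during training (slower)",
    )
    return parser


# Parse an explicit argument list, equivalent to:
#   python ddpg.py --env MountainCarContinuous-v0 --batch_size 64
args = build_parser().parse_args(["--env", "MountainCarContinuous-v0", "--batch_size", "64"])
print(args.env, args.num_episodes, args.batch_size, args.render)
```

Restricting `--env` and `--render` with `choices` makes a mistyped environment id or an invalid render value fail at startup rather than partway through training.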
