dir_duck_lr0.00003
images
MinitaurDuck_SAC_h420_lr0.00003_b128.ipynb
README.md
replay_memory.py
sac_agent.py

README.md

Project - MinitaurBulletDuckEnv with Soft Actor Critic (SAC)

Introduction

Solving the environment requires an average total reward above 5.0 over 100 consecutive episodes.
We solved the MinitaurBulletDuckEnv environment in 12888 episodes and about 75 hours of training using the SAC algorithm;
see the original paper, SAC: Off-Policy Maximum Entropy Deep RL with a Stochastic Actor.
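
The solving criterion can be illustrated with a small rolling-window check. This is only a sketch of the criterion as stated above, not the code from the notebook; the class name `SolveTracker` and its methods are hypothetical.

```python
from collections import deque

SOLVED_THRESHOLD = 5.0  # average score required, as stated in this README
WINDOW = 100            # number of consecutive episodes that are averaged


class SolveTracker:
    """Hypothetical helper: tracks the mean score of the last `window` episodes."""

    def __init__(self, threshold=SOLVED_THRESHOLD, window=WINDOW):
        self.threshold = threshold
        # A bounded deque drops the oldest score once the window is full.
        self.scores = deque(maxlen=window)

    def add(self, score):
        """Record the total reward of one finished episode."""
        self.scores.append(score)

    def average(self):
        """Mean score over the current window (0.0 if no episode was recorded)."""
        if not self.scores:
            return 0.0
        return sum(self.scores) / len(self.scores)

    def solved(self):
        """True only when the window is full and its mean exceeds the threshold."""
        return len(self.scores) == self.scores.maxlen and self.average() > self.threshold
```

In a training loop one would call `add(episode_score)` after each episode and stop once `solved()` returns `True`.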

Training Score

Steps of episodes

The graph shows the number of steps per episode, averaged over 100 episodes.

The last few lines from the log

...
Ep.: 12780, Tot.St.: 3899776, Avg.Num.St.: 402.1, Min-Max.Sc.: (0.26, 12.11), Avg.Score: 4.503, Time: 74:22:39
Ep.: 12790, Tot.St.: 3903794, Avg.Num.St.: 413.3, Min-Max.Sc.: (0.26, 12.11), Avg.Score: 4.695, Time: 74:28:13
Ep.: 12800, Tot.St.: 3906399, Avg.Num.St.: 401.2, Min-Max.Sc.: (0.26, 12.11), Avg.Score: 4.539, Time: 74:31:50
Ep.: 12810, Tot.St.: 3909642, Avg.Num.St.: 397.4, Min-Max.Sc.: (0.26, 12.11), Avg.Score: 4.451, Time: 74:36:19
Ep.: 12820, Tot.St.: 3912919, Avg.Num.St.: 384.0, Min-Max.Sc.: (0.26, 12.11), Avg.Score: 4.304, Time: 74:40:52
Ep.: 12830, Tot.St.: 3917483, Avg.Num.St.: 384.9, Min-Max.Sc.: (1.18, 12.11), Avg.Score: 4.351, Time: 74:47:11
Ep.: 12840, Tot.St.: 3921720, Avg.Num.St.: 395.2, Min-Max.Sc.: (1.18, 12.11), Avg.Score: 4.434, Time: 74:53:04
Ep.: 12850, Tot.St.: 3927608, Avg.Num.St.: 413.8, Min-Max.Sc.: (0.56, 12.51), Avg.Score: 4.612, Time: 75:01:13
Ep.: 12860, Tot.St.: 3933131, Avg.Num.St.: 415.6, Min-Max.Sc.: (0.34, 12.51), Avg.Score: 4.642, Time: 75:08:51
Ep.: 12870, Tot.St.: 3938640, Avg.Num.St.: 438.9, Min-Max.Sc.: (0.34, 12.51), Avg.Score: 4.924, Time: 75:16:29
Ep.: 12880, Tot.St.: 3942984, Avg.Num.St.: 432.1, Min-Max.Sc.: (0.34, 12.51), Avg.Score: 4.845, Time: 75:22:32
Solved environment with Avg Score: 5.023
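
For further analysis, log lines in the format shown above can be parsed into numeric fields. The sketch below is not part of the repository. The field names are assumptions read off the abbreviations (`Tot.St.` as total steps, `Avg.Num.St.` as average number of steps, `Min-Max.Sc.` as minimum and maximum score), and `parse_log_line` is a hypothetical helper.

```python
import re

# Pattern for one progress line of the training log shown above.
LOG_PATTERN = re.compile(
    r"Ep\.: (?P<episode>\d+), "
    r"Tot\.St\.: (?P<total_steps>\d+), "
    r"Avg\.Num\.St\.: (?P<avg_steps>[\d.]+), "
    r"Min-Max\.Sc\.: \((?P<min_score>-?[\d.]+), (?P<max_score>-?[\d.]+)\), "
    r"Avg\.Score: (?P<avg_score>-?[\d.]+), "
    r"Time: (?P<time>\d+:\d{2}:\d{2})"
)


def parse_log_line(line):
    """Return a dict of numeric fields, or None if the line is not a progress line."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    # Elapsed time is logged as hours:minutes:seconds; hours may exceed 24.
    hours, minutes, seconds = (int(part) for part in match.group("time").split(":"))
    return {
        "episode": int(match.group("episode")),
        "total_steps": int(match.group("total_steps")),
        "avg_steps": float(match.group("avg_steps")),
        "min_score": float(match.group("min_score")),
        "max_score": float(match.group("max_score")),
        "avg_score": float(match.group("avg_score")),
        "elapsed_seconds": hours * 3600 + minutes * 60 + seconds,
    }
```

Applied to the last progress line above, `total_steps / elapsed_seconds` gives about 14.5 environment steps per second over the whole run.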

Trials not reaching the threshold

lr = 0.0005
batch size = 128
40000 episodes
maximum average score = 2.6

The graph shows the number of steps per episode, averaged over 100 episodes.

Other SAC projects

Video

See the video "You can sleep while I drive, Minitaur with Duck" on YouTube.

Credit

Based on Pranjal Tandon's code (https://github.com/pranz24).
