
Exploring Imitation Learning (DAGGER), RL (Policy Gradients and Soft Actor-Critic) and Imitation-Seeded RL for training MuJoCo Environments in OpenAI's Gym


Panjete/mujocoagents


Dimensions

  • Hopper: observation dim 11, action dim 3
  • Half Cheetah: observation dim 17, action dim 6
  • Ant: observation dim 27, action dim 8
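These dimensions fix the input/output shapes of the policy networks. A minimal sketch of a linear policy sized from this table, in NumPy (the `DIMS` dict and `make_linear_policy` helper are illustrative names, not part of the repo):

```python
import numpy as np

# (observation dim, action dim) per environment, from the table above.
DIMS = {"Hopper": (11, 3), "HalfCheetah": (17, 6), "Ant": (27, 8)}

def make_linear_policy(env_name, rng=None):
    """Return (W, b) for a linear policy a = W @ obs + b (hypothetical helper)."""
    rng = rng or np.random.default_rng(0)
    obs_dim, act_dim = DIMS[env_name]
    W = rng.normal(scale=0.01, size=(act_dim, obs_dim))
    b = np.zeros(act_dim)
    return W, b

W, b = make_linear_policy("Hopper")
obs = np.zeros(11)
action = W @ obs + b  # shape (3,), one action per Hopper actuator
```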

Imitation learning

Hopper v4

  • n_trajs: 50, max_len_trajs: 100, iters: 50, loss: MSE, NN: linear, beta: 0.5 → reward: 742

  • n_trajs: 50, max_len_trajs: 100, iters: 50, loss: MSE, NN: linear, beta: 1/(1 + sqrt(timesteps/1000)) → reward: ~720

  • More training iterations are needed: at 50 iterations the reward is still in a steady-increase phase.

  • Reward peaks at around iteration 75 and then falls off; it went as high as 1500 with a maximum trajectory length of 100.

  • n_trajs: 50, max_len_trajs: 100, iters: 75, loss: MSE, NN: linear, beta: 1/(1 + timesteps/1000) → reward: ~2200

  • n_trajs: 50, max_len_trajs: 400, iters: 75, loss: MSE, NN: linear, beta: 1/(1 + timesteps/1000) → reward: ~2400, and more stable around iteration 75
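The beta values above are the DAgger expert-mixing coefficient: with probability beta the expert's action is taken while collecting data, otherwise the learner's. A minimal sketch of the decaying schedule beta = 1/(1 + timesteps/1000) and the mixing step (function names are illustrative, not the repo's API):

```python
import numpy as np

def beta_schedule(t):
    """Expert-mixing coefficient beta_t = 1 / (1 + t/1000), decaying over timesteps."""
    return 1.0 / (1.0 + t / 1000.0)

def dagger_action(expert_action, policy_action, t, rng):
    """Act with the expert with probability beta_t, else with the learner
    (one common stochastic-mixing variant of DAgger)."""
    return expert_action if rng.random() < beta_schedule(t) else policy_action
```

At t = 0 the expert is always followed (beta = 1); by t = 1000 the learner acts half the time (beta = 0.5). The collected states are still relabeled with expert actions for the supervised update, regardless of who acted.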

Half Cheetah v4

  • n_trajs: 50, max_len_trajs: 400, iters: 75, loss: MSE, NN: linear, beta: 1/(1 + timesteps/1000) → reward: ~2400, though this level was reached fairly early in training

  • n_trajs: 50, max_len_trajs: 400, iters: 75, loss: MSE, NN: linear, beta: 1/(1 + timesteps/1000), optimiser: Adam → reward: ~2400, again reached fairly early
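The supervised step in all runs above is an MSE regression of policy actions onto expert actions. A self-contained NumPy sketch of that loss/gradient for a linear policy, plus a hand-rolled Adam update matching the last configuration (all function names are illustrative; the repo presumably uses PyTorch's built-in optimiser instead):

```python
import numpy as np

def mse_loss_and_grad(W, obs, expert_actions):
    """MSE between linear-policy actions (obs @ W.T) and expert actions,
    and its gradient with respect to W."""
    err = obs @ W.T - expert_actions        # (N, act_dim)
    loss = np.mean(err ** 2)
    grad_W = 2.0 * err.T @ obs / err.size   # (act_dim, obs_dim)
    return loss, grad_W

def adam_step(W, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: biased first/second moment estimates, then bias correction."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return W - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Fit a linear policy to synthetic "expert" data (Half Cheetah shapes: 17 -> 6).
rng = np.random.default_rng(0)
obs = rng.normal(size=(64, 17))
W_true = rng.normal(size=(6, 17))
expert_actions = obs @ W_true.T

W = np.zeros((6, 17))
m, v = np.zeros_like(W), np.zeros_like(W)
loss_start, _ = mse_loss_and_grad(W, obs, expert_actions)
for t in range(1, 201):
    _, grad = mse_loss_and_grad(W, obs, expert_actions)
    W, m, v = adam_step(W, grad, m, v, t)
loss_end, _ = mse_loss_and_grad(W, obs, expert_actions)
```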
