Have you tried using multiple CPUs with the A2C example here? #25
It works, but `render_all` does not, even when calling the method with `env.env_method(method_name='render_all')`.
Use this code instead:

```python
for e in env.envs:
    plt.figure(figsize=(16, 6))
    e.render_all()
    plt.show()
```
There is a fact you should consider: `DummyVecEnv.step_wait` resets a sub-environment as soon as it reports done, which wipes the episode history that `render_all` needs. You can monkey-patch it to keep the final state instead:

```python
from copy import deepcopy
import numpy as np

def step_wait(self):
    for env_idx in range(self.num_envs):
        obs, self.buf_rews[env_idx], self.buf_dones[env_idx], self.buf_infos[env_idx] = \
            self.envs[env_idx].step(self.actions[env_idx])
        if self.buf_dones[env_idx]:
            # save final observation where user can get it
            self.buf_infos[env_idx]['terminal_observation'] = obs
            # obs = self.envs[env_idx].reset()  # removed: do not wipe the finished episode
        self._save_obs(env_idx, obs)
    return (self._obs_from_buf(), np.copy(self.buf_rews), np.copy(self.buf_dones),
            deepcopy(self.buf_infos))

DummyVecEnv.step_wait = step_wait
```
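To see why the reset matters for `render_all`, here is a self-contained toy that needs no stable-baselines (`ToyEnv` and `run` are illustrative stand-ins, not the real API): with auto-reset, the env's internal history is wiped the moment an episode ends, so anything that tries to plot the whole episode afterwards sees a fresh env.

```python
class ToyEnv:
    """Illustrative stand-in for a gym env that records its episode history."""
    def __init__(self, length=3):
        self.length = length
        self.reset()

    def reset(self):
        self.t = 0
        self.history = []
        return self.t

    def step(self, action):
        self.t += 1
        self.history.append(action)
        done = self.t >= self.length
        return self.t, 0.0, done, {}

def run(env, auto_reset):
    obs = env.reset()
    done = False
    while not done:
        obs, rew, done, info = env.step(1)
        if done and auto_reset:
            obs = env.reset()  # what DummyVecEnv.step_wait normally does
    return len(env.history)

print(run(ToyEnv(), auto_reset=True))   # 0: the reset erased the episode history
print(run(ToyEnv(), auto_reset=False))  # 3: the history survives for render_all-style plotting
```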
Hello, removing the `.reset()` in `DummyVecEnv` results in this error at timesteps = 32000:

```
current_price = self.prices[self._current_tick]
IndexError: index 2335 is out of bounds for axis 0 with size 2335
```

I think 2335 is the length of the data frame.
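The number in the traceback lines up with running off the end of the price series: once the auto-reset is gone, the env's tick can advance one past the last row. A minimal numpy sketch of the failure mode (the array is an illustrative stand-in for the env's `prices`):

```python
import numpy as np

prices = np.zeros(2335)   # stand-in for self.prices: one entry per data-frame row
current_tick = 2335       # one past the last valid index, 2334

try:
    current_price = prices[current_tick]
except IndexError as e:
    print(e)  # index 2335 is out of bounds for axis 0 with size 2335
```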
```python
from copy import deepcopy
import numpy as np
import pandas as pd
import gym
import gym_anytrading
import quantstats as qs
from stable_baselines import A2C
from stable_baselines.common.vec_env import DummyVecEnv
import matplotlib.pyplot as plt

df = gym_anytrading.datasets.STOCKS_GOOGL.copy()

window_size = 10
start_index = window_size
end_index = len(df)

env_maker = lambda: gym.make(
    'stocks-v0',
    df=df,
    window_size=window_size,
    frame_bound=(start_index, end_index)
)

env = DummyVecEnv([env_maker for _ in range(16)])

policy_kwargs = dict(net_arch=[64, 'lstm', dict(vf=[128, 128, 128], pi=[64, 64])])
model = A2C('MlpLstmPolicy', env, verbose=1, policy_kwargs=policy_kwargs)
model.learn(total_timesteps=1000)

class DummyVecEnv2(DummyVecEnv):
    def step_wait(self):
        for env_idx in range(self.num_envs):
            obs, self.buf_rews[env_idx], self.buf_dones[env_idx], self.buf_infos[env_idx] = \
                self.envs[env_idx].step(self.actions[env_idx])
            if self.buf_dones[env_idx]:
                # save final observation where user can get it, but do not reset
                self.buf_infos[env_idx]['terminal_observation'] = obs
                # obs = self.envs[env_idx].reset()
            self._save_obs(env_idx, obs)
        return (self._obs_from_buf(), np.copy(self.buf_rews), np.copy(self.buf_dones),
                deepcopy(self.buf_infos))

env = DummyVecEnv2([env_maker for _ in range(16)])
observation = env.reset()

while True:
    # observation = observation[np.newaxis, ...]
    # action = env.action_space.sample()
    action, _states = model.predict(observation)
    observation, reward, done, info = env.step(action)
    # env.render()
    if done.all():
        print("info:", info)
        break

for e in env.envs:
    plt.figure(figsize=(16, 6))
    e.render_all()
    plt.show()
```
You are a guru! It works now. What you did was, after learning, override `DummyVecEnv` by removing the reset. Am I correct?
Thanks man :) Yeah, somewhat, but I didn't override the original `DummyVecEnv`; I subclassed it as `DummyVecEnv2`.
I am trying to use multiple CPUs for the example provided in this link. I changed the environment to multiple copies, but now I have a problem with `done` and `info` in stable-baselines: they seem to have turned into arrays, and there is an error in this code. Any suggestions, or has any of you done this? It seems LSTM policies in stable-baselines behave like this.
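The batching described here can be reproduced without stable-baselines: a vectorized wrapper over n environments returns one entry per sub-environment, so scalar-style checks like `if done:` become ambiguous and need `.all()` or `.any()`. A minimal numpy sketch (the values are made up for illustration):

```python
import numpy as np

num_envs = 16
# with a vectorized env, step() returns one flag/dict per sub-environment
done = np.array([True] * 15 + [False])
infos = [{'env_idx': i} for i in range(num_envs)]

print(done.any())  # True: at least one sub-env has finished
print(done.all())  # False: one sub-env is still running
```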