
[QUESTION] Stable Baselines render vectorized forex environment #1

Closed
simonesalvucci opened this issue Jan 29, 2020 · 8 comments

@simonesalvucci

Hello, I have a question...

I'm currently using the Stable Baselines library to train a model on your 'forex-v0' environment.

import gym
import gym_anytrading  # registers the 'forex-v0' environment
from stable_baselines import A2C
from stable_baselines.common.vec_env import DummyVecEnv

env = DummyVecEnv([lambda: gym.make('forex-v0', frame_bound=(10, 500), window_size=10)])
policy_kwargs = dict(net_arch=[64, 'lstm', dict(vf=[128, 128, 128], pi=[64, 64])])
model = A2C("MlpLstmPolicy", env, verbose=1, policy_kwargs=policy_kwargs)
model.learn(total_timesteps=5000)

After training the model I perform a test using your code:

observation = env.reset()
while True:
    action = model.predict(observation)
    observation, reward, done, info = env.step(action)
    # env.render()
    if done:
        print("info:", info)
        break

# Plotting results
plt.cla()
env.render_all()
plt.show()

But unfortunately I get a `DummyVecEnv has no render_all()` error, which makes sense to me because the environment is now wrapped in a vector.
What I don't understand is how I can call env.render_all() on the vectorized environment.
My confusion is that when I call env.render() everything works fine, but not when I call env.render_all().


AminHP commented Jan 29, 2020

Hi.
I took a look at the source code of the DummyVecEnv here. It seems you should use env.envs[0].render_all().

I didn't test it, so please check it out and let me know if it works.
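The unwrapping idea can be sketched with a toy stand-in for this pattern (`ToyTradingEnv` and `ToyVecEnv` below are illustrative names, not the real gym or stable-baselines classes): a vectorized wrapper only exposes the generic vectorized API, so environment-specific extras like render_all() have to be reached through the wrapper's list of underlying environments.

```python
class ToyTradingEnv:
    """Illustrative single environment with an extra, non-standard method."""
    def render_all(self):
        return "full history plot"

class ToyVecEnv:
    """Illustrative vectorized wrapper: holds a list of envs, like DummyVecEnv.envs."""
    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

env = ToyVecEnv([lambda: ToyTradingEnv()])

# The wrapper itself has no render_all()...
assert not hasattr(env, "render_all")

# ...but the underlying environment, reachable via env.envs[0], does.
print(env.envs[0].render_all())  # -> full history plot
```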


simonesalvucci commented Jan 30, 2020

Thank you @AminHP

It worked, but there was another issue in render_all().

line 140 in render_all
    if self._position_history[i] == Positions.Short:
IndexError: list index out of range

@AminHP AminHP closed this as completed in 1507088 Jan 30, 2020
AminHP added a commit that referenced this issue Jan 30, 2020
@AminHP AminHP reopened this Jan 30, 2020

AminHP commented Jan 30, 2020

It seems the problem is solved. Remove your gym-anytrading installation and reinstall it using the command below:

pip install https://github.com/AminHP/gym-anytrading/archive/master.zip

@simonesalvucci

The IndexError is fixed now, thank you.

But I think something is still going wrong; I've tried to debug the environment without success.
What I get from render_all() is a plot with just one short position, which is very weird...

[Screenshot: render_all() plot showing only a single short position]


AminHP commented Jan 30, 2020

Can you show me your code?

@simonesalvucci

env = DummyVecEnv([lambda: gym.make('forex-v0', frame_bound=(100, 5000), window_size=10)])

# Training Env
policy_kwargs = dict(net_arch=[64, 'lstm', dict(vf=[128, 128, 128], pi=[64, 64])])
model = A2C("MlpLstmPolicy", env, verbose=1, policy_kwargs=policy_kwargs)
model.learn(total_timesteps=1000)

# Testing Env 
observation = env.reset()
while True:
    # action = env.action_space.sample()
    action = model.predict(observation)
    observation, reward, done, info = env.step(action)
    # env.render()
    if done:
        print("info:", info)
        break

# Plotting results
plt.cla()
env.envs[0].render_all()
plt.show()


AminHP commented Jan 30, 2020

The problem is that DummyVecEnv resets the environment automatically once an episode is done, so the finished episode is never visible from outside the wrapper.
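This auto-reset behavior can be sketched with a toy stand-in (`ToyEnv` and `ToyVecEnv` are illustrative names, not the real stable-baselines classes): the wrapper calls reset() internally when an episode ends, so the caller never sees the terminal observation.

```python
class ToyEnv:
    """Illustrative environment whose episode ends after 3 steps."""
    def reset(self):
        self.t = 0
        return self.t
    def step(self, action):
        self.t += 1
        done = self.t >= 3
        return self.t, 0.0, done, {}

class ToyVecEnv:
    """Illustrative wrapper that, like DummyVecEnv, auto-resets on done."""
    def __init__(self, env):
        self.env = env
        self.env.reset()
    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if done:
            obs = self.env.reset()  # terminal observation is replaced
        return obs, reward, done, info

venv = ToyVecEnv(ToyEnv())
for _ in range(3):
    obs, reward, done, info = venv.step(0)

print(done, obs)  # -> True 0  (episode ended, but obs is already the reset one)
```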

Also, there was a mistake in your code: model.predict returns both the action and the recurrent states. Try this:

import gym
import gym_anytrading  # registers the 'forex-v0' environment
import numpy as np
import matplotlib.pyplot as plt
from stable_baselines import A2C
from stable_baselines.common.vec_env import DummyVecEnv

env_maker = lambda: gym.make('forex-v0', frame_bound=(100, 5000), window_size=10)
env = DummyVecEnv([env_maker])

# Training Env
policy_kwargs = dict(net_arch=[64, 'lstm', dict(vf=[128, 128, 128], pi=[64, 64])])
model = A2C("MlpLstmPolicy", env, verbose=1, policy_kwargs=policy_kwargs)
model.learn(total_timesteps=1000)

# Testing Env 
env = env_maker()
observation = env.reset()

while True:
    observation = observation[np.newaxis, ...]
    # action = env.action_space.sample()
    action, _states = model.predict(observation)
    observation, reward, done, info = env.step(action)
    # env.render()
    if done:
        print("info:", info)
        break

# Plotting results
plt.cla()
env.render_all()
plt.show()
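The `observation[np.newaxis, ...]` line is needed because the model was trained on a vectorized environment and expects a batch of observations; np.newaxis prepends a batch axis of size 1. A minimal numpy illustration (the (10, 2) shape is an assumption, standing in for a window_size=10 observation with two signal features):

```python
import numpy as np

# A single observation, e.g. from a window_size=10 env with two features
obs = np.zeros((10, 2))

# Prepend a batch axis of size 1, as model.predict expects
batched = obs[np.newaxis, ...]

print(obs.shape, batched.shape)  # -> (10, 2) (1, 10, 2)
```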

@simonesalvucci

Thank you @AminHP

I must have missed the reset of the environment in DummyVecEnv.
Now it makes much more sense.

The code above works perfectly!
