
AgentHive

AgentHive is the agents module for RoboHive. It contains trained agents as well as the primitives and helper scripts to train new agents for RoboHive environments.

Overview

AgentHive provides the tools and helper scripts for training agents, as well as for offline and online execution of pre-packaged trained agents. RoboHive can be used with any OpenAI Gym-compatible algorithmic baseline to train agents for its environments. RoboHive developers have used the following baseline frameworks during development, with the goal of expanding this list over time.

  1. TorchRL
  2. mjRL
  3. Stable Baselines

Pretrained baselines

AgentHive comes prepackaged with a set of pre-trained baselines. The goal of these baselines is to give users out-of-the-box capabilities with RoboHive. We are continuously accepting contributions to grow our pre-trained collection; please send us a pull request.

Agent Utilities

Environment wrappers

AgentHive provides environment wrappers specifically designed to work with RoboHive gym environments. Find examples in test/test_envs.py.

The basic usage is:

from agenthive.rl_envs import RoboHiveEnv

env = RoboHiveEnv(env_name="FrankaReachRandom_v2d-v0")
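
A quick sanity check (a minimal sketch, assuming RoboHiveEnv follows the standard TorchRL EnvBase interface) is to reset the environment and collect a short random rollout:

# continuing the snippet above
td = env.reset()                    # TensorDict holding the initial observation
rollout = env.rollout(max_steps=3)  # three random steps; rollout.batch_size is [3]
print(rollout)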

The following kitchen and Franka visual environments should be used, as they run without flattening/unflattening of the images (an expensive process):

env_list = ["visual_franka_slide_random-v3",
   "visual_franka_slide_close-v3",
   "visual_franka_slide_open-v3",
   "visual_franka_micro_random-v3",
   "visual_franka_micro_close-v3",
   "visual_franka_micro_open-v3",
   "visual_kitchen_knob1_off-v3",
   "visual_kitchen_knob1_on-v3",
   "visual_kitchen_knob2_off-v3",
   "visual_kitchen_knob2_on-v3",
   "visual_kitchen_knob3_off-v3",
   "visual_kitchen_knob3_on-v3",
   "visual_kitchen_knob4_off-v3",
   "visual_kitchen_knob4_on-v3",
   "visual_kitchen_light_off-v3",
   "visual_kitchen_light_on-v3",
   "visual_kitchen_sdoor_close-v3",
   "visual_kitchen_sdoor_open-v3",
   "visual_kitchen_ldoor_close-v3",
   "visual_kitchen_ldoor_open-v3",
   "visual_kitchen_rdoor_close-v3",
   "visual_kitchen_rdoor_open-v3",
   "visual_kitchen_micro_close-v3",
   "visual_kitchen_micro_open-v3",
   "visual_kitchen_close-v3"
]
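
For instance, one of these visual environments can be instantiated and inspected as follows (a minimal sketch; we assume the camera images appear under a "pixels"-like entry of the observation spec):

from agenthive.rl_envs import RoboHiveEnv

env = RoboHiveEnv(env_name="visual_franka_slide_random-v3")
print(env.observation_spec)  # should expose an entry for the camera images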

To use the environment in parallel, wrap it in a ParallelEnv:

from agenthive.rl_envs import RoboHiveEnv
from torchrl.envs import EnvCreator, ParallelEnv

env = ParallelEnv(3, EnvCreator(lambda: RoboHiveEnv(env_name="FrankaReachRandom_v2d-v0")))
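
The resulting environment stacks the workers' outputs along a leading batch dimension. A short rollout illustrates this (a sketch assuming standard ParallelEnv semantics):

rollout = env.rollout(max_steps=5)
print(rollout.batch_size)  # torch.Size([3, 5]): 3 workers, 5 steps each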

To use transforms (normalization, grayscale, etc.), use the environment transforms:

from agenthive.rl_envs import RoboHiveEnv
from torchrl.envs import EnvCreator, ParallelEnv, TransformedEnv, R3MTransform

base_env = ParallelEnv(3, EnvCreator(lambda: RoboHiveEnv(env_name="FrankaReachRandom_v2d-v0")))
env = TransformedEnv(
    base_env,
    R3MTransform(
        "resnet18",
        in_keys=["pixels"],
        out_keys=["pixels_embed"],
    ),
)

Make sure that the R3M or VIP transform is appended after the ParallelEnv; otherwise you will pass as many images as there are processes through the ResNet module (and quickly run into an OOM exception).

Finally, a typical data-collection script (executed asynchronously on 4 different GPUs) reads as follows:

import tqdm
from torchrl.collectors.collectors import MultiaSyncDataCollector, RandomPolicy
from agenthive.rl_envs import RoboHiveEnv
from torchrl.envs import ParallelEnv, TransformedEnv, GrayScale, ToTensorImage, Resize, ObservationNorm, EnvCreator, Compose, CatFrames

if __name__ == '__main__':
    # create a parallel env with 4 envs running independently.
    # we pass device='cuda:0' to show how to create an env on cuda (i.e. the output tensors will be on cuda),
    # but this will be overwritten in the collector below
    penv = ParallelEnv(4, EnvCreator(lambda: RoboHiveEnv('FrankaReachRandom_v2d-v0', device='cuda:0', from_pixels=True)))
    # we append a series of standard transforms, all running on cuda
    tenv = TransformedEnv(penv, Compose(ToTensorImage(), Resize(84, 84), GrayScale(), CatFrames(4, in_keys=['pixels']), ObservationNorm(in_keys=['pixels'])))
    # this is how you initialize your observation norm transform (the API will be improved shortly)
    tenv.transform[-1].init_stats(reduce_dim=(0, 1), cat_dim=1, num_iter=1000)
    # we cheat a bit by using a totally random policy. A CNN would obviously slow down collection a bit
    policy = RandomPolicy(tenv.action_spec)

    # we create an async collector on 4 different devices. "passing_devices" indicates where each env is placed, and "devices" where the policy is executed.
    # For maximum efficiency they should match. You can pass either a single string for these args (i.e. all devices match) or a list of strings/devices.
    collector = MultiaSyncDataCollector([tenv, tenv, tenv, tenv], policy=policy, frames_per_batch=400, max_frames_per_traj=1000, total_frames=1_000_000,
                                        passing_devices=['cuda:0', 'cuda:1', 'cuda:2', 'cuda:3'],
                                        devices=['cuda:0', 'cuda:1', 'cuda:2', 'cuda:3'])
    # a simple collection loop to log the speed
    pbar = tqdm.tqdm(total=1_000_000)
    for data in collector:
        pbar.update(data.numel())
    del collector
    del tenv
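
In a real training run, the collection loop above would typically feed a replay buffer rather than just log speed. Here is a minimal sketch using TorchRL's ReplayBuffer with memory-mapped storage (the capacity and the flattening of the [env, time] dims are illustrative choices, not part of the script above):

from torchrl.data import ReplayBuffer
from torchrl.data.replay_buffers import LazyMemmapStorage

buffer = ReplayBuffer(storage=LazyMemmapStorage(1_000_000))
for data in collector:
    # flatten the [num_envs, time] batch dims into single transitions before storing
    buffer.extend(data.reshape(-1))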

Model training

The safest way of coding up a model and training it is to start from the official torchrl examples.
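
As a minimal illustration of how a policy plugs into this stack (a sketch, not AgentHive's actual models; the "observation" key and the layer sizes are assumptions), any TensorDictModule mapping the environment's observation key to an action key will do:

import torch.nn as nn
from tensordict.nn import TensorDictModule

# hypothetical sizes; in practice read them from env.observation_spec / env.action_spec
net = nn.Sequential(nn.LazyLinear(64), nn.Tanh(), nn.LazyLinear(7))
policy = TensorDictModule(net, in_keys=["observation"], out_keys=["action"])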

Execution

AgentHive is optimized for the MUJOCO backend. Make sure to set the sim_backend environment variable to "MUJOCO" before running the code:

sim_backend=MUJOCO python script.py
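
Equivalently, the variable can be set from Python before importing the simulation stack (a sketch assuming the variable is read at import time):

import os
os.environ["sim_backend"] = "MUJOCO"  # must be set before robohive/env imports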

Installation

AgentHive has two core dependencies: torchrl and RoboHive. RoboHive relies on mujoco and mujoco-py for physics simulation and rendering. As of now, RoboHive requires the old mujoco bindings as well as gym v0.13. TorchRL provides detailed instructions on how to set up an environment with the old mujoco bindings.

See also the Getting Started markdown for more info on setting up your env.

For more complete instructions, check the installation pipeline in .circleci/unittest/linux/script/install.sh
