
Commit

Update mujoco-py to mujoco and gym to gymnasium + more (Farama-Founda…
reginald-mclean committed Jul 12, 2023
1 parent 3e38597 commit 19411bd
Showing 128 changed files with 2,332 additions and 1,267 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -146,3 +146,5 @@ MUJOCO_LOG.TXT
# tool
Pipfile
Pipfile.lock

mujoco_migration.py
30 changes: 15 additions & 15 deletions CONTRIBUTING.md
@@ -10,7 +10,7 @@ Ensure that your task and pull request:
* [ ] Can be performed by a real robot arm
* [ ] Is dissimilar from current tasks
* [ ] Contains meaningful internal variation (e.g. different object positions, etc.)
* [ ] Conforms to the action space, observation space, and reward functions conventions used by metaworld environments
* [ ] Conforms to the action space, observation space, and reward functions conventions used by Meta-World environments
* [ ] Uses existing assets if they exist, and that any new assets added are high-quality
* [ ] Follows the code quality, style, testing, and documentation guidelines outlined below
* [ ] Provides learning curves which show the task can be solved by PPO and SAC, using the implementations linked below
@@ -153,20 +153,20 @@ These are Meta-World specific rules which are not part of the aforementioned sty
```python
import collections

import gym.spaces
import gymnasium.spaces

from garage.tf.models import MLPModel

q = collections.deque(maxlen=10)
d = gym.spaces.Discrete(5)
d = gymnasium.spaces.Discrete(5)
m = MLPModel(output_dim=2)
```

*Don't*
```python
from collections import deque

from gym.spaces import Discrete
from gymnasium.spaces import Discrete
import tensorflow as tf

from garage.tf.models import MLPModel
@@ -239,14 +239,14 @@ Do's and Don'ts for avoiding accidental merge commits and other headaches:
* *Don't* use `git merge`
* *Don't* use `git pull` (unless git tells you that your branch can be fast-forwarded)
* *Don't* make commits in the `master` branch---always use a feature branch
* *Do* fetch upstream (`rlworkgroup/metaworld`) frequently and keep your `master` branch up-to-date with upstream
* *Do* fetch upstream (`Farama-Foundation/Metaworld`) frequently and keep your `master` branch up-to-date with upstream
* *Do* rebase your feature branch on `master` frequently
* *Do* keep only one or a few commits in your feature branch, and use `git commit --amend` to update your changes. This helps prevent long chains of identical merges during a rebase.

Please see [this guide](https://gist.github.com/markreid/12e7c2203916b93d23c27a263f6091a0) for a tutorial on the workflow. Note: unlike the guide, we don't use separate `develop`/`master` branches, so all PRs should be based on `master` rather than `develop`

### Commit message format
metaworld follows the git commit message guidelines documented [here](https://gist.github.com/robertpainsi/b632364184e70900af4ab688decf6f53) and [here](https://chris.beams.io/posts/git-commit/). You can also find an in-depth guide to writing great commit messages [here](https://github.com/RomuloOliveira/commit-messages-guide/blob/master/README.md)
Meta-World follows the git commit message guidelines documented [here](https://gist.github.com/robertpainsi/b632364184e70900af4ab688decf6f53) and [here](https://chris.beams.io/posts/git-commit/). You can also find an in-depth guide to writing great commit messages [here](https://github.com/RomuloOliveira/commit-messages-guide/blob/master/README.md)

In short:
* All commit messages have an informative subject line of 50 characters
@@ -257,20 +257,20 @@ In short:

These recipes assume you are working out of a private GitHub fork.

If you are working directly as a contributor to `rlworkgroup`, you can replace references to `rlworkgroup` with `origin`. You also, of course, do not need to add `rlworkgroup` as a remote, since it will be `origin` in your repository.
If you are working directly as a contributor to `Farama-Foundation`, you can replace references to `Farama-Foundation` with `origin`. You also, of course, do not need to add `Farama-Foundation` as a remote, since it will be `origin` in your repository.

#### Clone your GitHub fork and setup the rlworkgroup remote
#### Clone your GitHub fork and setup the Farama-Foundation remote
```sh
git clone git@github.com:<your_github_username>/metaworld.git
cd metaworld
git remote add rlworkgroup git@github.com:rlworkgroup/metaworld.git
git fetch rlworkgroup
git remote add Farama-Foundation git@github.com:Farama-Foundation/metaworld.git
git fetch Farama-Foundation
```

#### Update your GitHub fork with the latest from upstream
```sh
git fetch rlworkgroup
git reset --hard master rlworkgroup/master
git fetch Farama-Foundation
git reset --hard master Farama-Foundation/master
git push -f origin master
```

@@ -287,8 +287,8 @@ git push origin myfeaturebranch
#### Rebase a feature branch so it's up-to-date with upstream and push it to your fork
```sh
git checkout master
git fetch rlworkgroup
git reset --hard rlworkgroup/master
git fetch Farama-Foundation
git reset --hard Farama-Foundation/master
git checkout myfeaturebranch
git rebase master
# you may need to manually reconcile merge conflicts here. Follow git's instructions.
@@ -298,4 +298,4 @@ git push -f origin myfeaturebranch # -f is frequently necessary because rebases
## Release

### Modify CHANGELOG.md
For each release in metaworld, modify [CHANGELOG.md](https://github.com/rlworkgroup/metaworld/blob/master/CHANGELOG.md) with the most relevant changes from the latest release. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), which adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
For each release in metaworld, modify [CHANGELOG.md](https://github.com/Farama-Foundation/Metaworld/blob/master/CHANGELOG.md) with the most relevant changes from the latest release. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), which adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
40 changes: 23 additions & 17 deletions README.md
@@ -1,6 +1,8 @@
# Meta-World
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/rlworkgroup/metaworld/blob/master/LICENSE)
![Build Status](https://github.com/rlworkgroup/metaworld/workflows/MetaWorld%20CI/badge.svg)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/Farama-Foundation/metaworld/blob/master/LICENSE)
![Build Status](https://github.com/Farama-Foundation/Metaworld/workflows/MetaWorld%20CI/badge.svg)

# The current version of Meta-World is a work in progress. If you find any bugs/errors please open an issue.

__Meta-World is an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks.__ We aim to provide task distributions that are sufficiently broad to evaluate meta-RL algorithms' generalization ability to new behaviors.

@@ -20,23 +20,25 @@ __Table of Contents__
- [Acknowledgements](#acknowledgements)

## Join the Community

Metaworld is now maintained by the Farama Foundation! You can interact with our community and the new developers in our [Discord server](https://discord.gg/PfR7a79FpQ)

## Maintenance Status
The current roadmap for Meta-World can be found [here](https://github.com/Farama-Foundation/Metaworld/issues/409)

## Installation
Meta-World is based on MuJoCo, which has a proprietary dependency we can't set up for you. Please follow the [instructions](https://github.com/openai/mujoco-py#install-mujoco) in the mujoco-py package for help. Once you're ready to install everything, run:
To install everything, run:


```
pip install git+https://github.com/Farama-Foundation/Metaworld.git@master#egg=metaworld
```

Alternatively, you can clone the repository and install an editable version locally:

```
git clone https://github.com/rlworkgroup/metaworld.git
cd metaworld
```sh
git clone https://github.com/Farama-Foundation/Metaworld.git
cd Metaworld
pip install -e .
```
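
As a quick, optional sanity check that the install worked, here is a sketch — `metaworld.ML1.ENV_NAMES` is assumed to be the list of available task names; check your installed version if it differs:

```python
# Optional post-install sanity check (illustrative; ML1.ENV_NAMES is assumed).
import metaworld

print(len(metaworld.ML1.ENV_NAMES))         # the README advertises 50 tasks
print(sorted(metaworld.ML1.ENV_NAMES)[:3])  # peek at a few task names
```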

@@ -50,11 +54,11 @@ Here is a list of benchmark environments for meta-RL (ML*) and multi-task-RL (MT
* [__ML1__](https://meta-world.github.io/figures/ml1.gif) is a meta-RL benchmark environment which tests few-shot adaptation to goal variation within single task. You can choose to test variation within any of [50 tasks](https://meta-world.github.io/figures/ml45-1080p.gif) for this benchmark.
* [__ML10__](https://meta-world.github.io/figures/ml10.gif) is a meta-RL benchmark which tests few-shot adaptation to new tasks. It comprises 10 meta-train tasks, and 3 test tasks.
* [__ML45__](https://meta-world.github.io/figures/ml45-1080p.gif) is a meta-RL benchmark which tests few-shot adaptation to new tasks. It comprises 45 meta-train tasks and 5 test tasks.
* [__MT10__](https://meta-world.github.io/figures/mt10.gif), __MT1__, and __MT50__ are multi-task-RL benchmark environments for learning a multi-task policy that perform 10, 1, and 50 training tasks respectively. __MT1__ is similar to __ML1__ becau you can choose to test variation within any of [50 tasks](https://meta-world.github.io/figures/ml45-1080p.gif) for this benchmark. In the original Metaworld experiments, we augment MT10 and MT50 environment observations with a one-hot vector which identifies the task. We don't enforce how users utilize task one-hot vectors, however one solution would be to use a Gym wrapper such as [this one](https://github.com/rlworkgroup/garage/blob/master/src/garage/envs/multi_env_wrapper.py)
* [__MT10__](https://meta-world.github.io/figures/mt10.gif), __MT1__, and __MT50__ are multi-task-RL benchmark environments for learning a multi-task policy that perform 10, 1, and 50 training tasks respectively. __MT1__ is similar to __ML1__ because you can choose to test variation within any of [50 tasks](https://meta-world.github.io/figures/ml45-1080p.gif) for this benchmark. In the original Meta-World experiments, we augment MT10 and MT50 environment observations with a one-hot vector which identifies the task. We don't enforce how users utilize task one-hot vectors, however one solution would be to use a Gym wrapper such as [this one](https://github.com/rlworkgroup/garage/blob/master/src/garage/envs/multi_env_wrapper.py)
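
The MT10/MT50 bullet above mentions augmenting observations with a one-hot task ID. As an illustrative sketch only — this is not the Garage wrapper linked above; the `OneHotTaskWrapper` name and details are assumptions, and a `Box` observation space is assumed — one way to do this with a `gymnasium` observation wrapper:

```python
import numpy as np
import gymnasium


class OneHotTaskWrapper(gymnasium.ObservationWrapper):
    """Hypothetical helper: append a fixed one-hot task ID to every observation."""

    def __init__(self, env, task_index, num_tasks):
        super().__init__(env)
        self._one_hot = np.zeros(num_tasks, dtype=np.float64)
        self._one_hot[task_index] = 1.0
        # Extend the (assumed Box) observation space to cover the extra dimensions.
        low = np.concatenate([env.observation_space.low, np.zeros(num_tasks)])
        high = np.concatenate([env.observation_space.high, np.ones(num_tasks)])
        self.observation_space = gymnasium.spaces.Box(low=low, high=high, dtype=np.float64)

    def observation(self, obs):
        return np.concatenate([obs, self._one_hot])


# Usage sketch: wrap the i-th of N task environments.
# env = OneHotTaskWrapper(env, task_index=i, num_tasks=N)
```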


### Basics
We provide a `Benchmark` API, that allows constructing environments following the [`gym.Env`](https://github.com/openai/gym/blob/c33cfd8b2cc8cac6c346bc2182cd568ef33b8821/gym/core.py#L8) interface.
We provide a `Benchmark` API, that allows constructing environments following the [`gymnasium.Env`](https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/core.py#L21) interface.

To use a `Benchmark`, first construct it (this samples the tasks allowed for one run of an algorithm on the benchmark).
Then, construct at least one instance of each environment listed in `benchmark.train_classes` and `benchmark.test_classes`.
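
The full example is collapsed below; as a sketch of this pattern for __ML1__ (the task name `'pick-place-v2'` is an assumed example):

```python
import random

import metaworld

ml1 = metaworld.ML1('pick-place-v2')        # construct the benchmark; this samples the tasks
env = ml1.train_classes['pick-place-v2']()  # instantiate the single training environment class
task = random.choice(ml1.train_tasks)       # pick one of the sampled training tasks
env.set_task(task)                          # the environment now uses that task's goal
obs = env.reset()
```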
@@ -95,7 +99,7 @@ env.set_task(task) # Set task

obs = env.reset() # Reset environment
a = env.action_space.sample() # Sample an action
obs, reward, done, info = env.step(a) # Step the environoment with the sampled random action
obs, reward, done, info = env.step(a) # Step the environment with the sampled random action
```
__MT1__ can be run the same way except that it does not contain any `test_tasks`
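
A corresponding sketch for __MT1__ (same assumed task name); note there are no `test_tasks` to sample from:

```python
import random

import metaworld

mt1 = metaworld.MT1('pick-place-v2')        # assumed task name
env = mt1.train_classes['pick-place-v2']()
task = random.choice(mt1.train_tasks)       # MT1 provides train_tasks only
env.set_task(task)
obs = env.reset()
```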
### Running a benchmark
@@ -117,7 +121,7 @@ for name, env_cls in ml10.train_classes.items():
for env in training_envs:
obs = env.reset() # Reset environment
a = env.action_space.sample() # Sample an action
obs, reward, done, info = env.step(a) # Step the environoment with the sampled random action
obs, reward, done, info = env.step(a) # Step the environment with the sampled random action
```
Create an environment with test tasks (this only works for ML10 and ML45, since MT10 and MT50 don't have a separate set of test tasks):
```python
@@ -137,11 +141,11 @@ for name, env_cls in ml10.test_classes.items():
for env in testing_envs:
obs = env.reset() # Reset environment
a = env.action_space.sample() # Sample an action
obs, reward, done, info = env.step(a) # Step the environoment with the sampled random action
obs, reward, done, info = env.step(a) # Step the environment with the sampled random action
```
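
Both loops above iterate over `training_envs` / `testing_envs` lists whose construction is collapsed in this view. A sketch of the usual pattern — the `make_envs` helper is hypothetical, and `Task` objects are assumed to expose an `env_name` field:

```python
import random

import metaworld

ml10 = metaworld.ML10()  # construct the ML10 benchmark (samples train and test tasks)


def make_envs(env_classes, tasks):
    """Hypothetical helper: one env per task family, each assigned a matching sampled task."""
    envs = []
    for name, env_cls in env_classes.items():
        env = env_cls()
        task = random.choice([t for t in tasks if t.env_name == name])
        env.set_task(task)
        envs.append(env)
    return envs


training_envs = make_envs(ml10.train_classes, ml10.train_tasks)
testing_envs = make_envs(ml10.test_classes, ml10.test_tasks)
```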

## Accessing Single Goal Environments
You may wish to only access individual environments used in the Metaworld benchmark for your research.
You may wish to only access individual environments used in the Meta-World benchmark for your research.
We provide constructors for creating environments where the goal has been hidden (by zeroing out the goal in
the observation) and environments where the goal is observable. They are called GoalHidden and GoalObservable
environments respectively.
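
The imports and constructor calls for the example below are collapsed in this view. A sketch of how `env1`, `env2`, and `env3` are typically created — the dictionary keys, import path, and `seed` keyword are assumptions inferred from the surrounding lines:

```python
from metaworld.envs import (ALL_V2_ENVIRONMENTS_GOAL_HIDDEN,      # name -> GoalHidden class
                            ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE)  # name -> GoalObservable class

door_open_goal_observable_cls = ALL_V2_ENVIRONMENTS_GOAL_OBSERVABLE["door-open-v2-goal-observable"]

env1 = door_open_goal_observable_cls(seed=5)   # fixing the seed fixes the sampled goal
env2 = door_open_goal_observable_cls(seed=5)   # same seed -> same goal (see asserts below)
env3 = door_open_goal_observable_cls(seed=10)  # different seed -> different goal
```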
@@ -161,7 +165,7 @@ door_open_goal_hidden_cls = ALL_V2_ENVIRONMENTS_GOAL_HIDDEN["door-open-v2-goal-h
env = door_open_goal_hidden_cls()
env.reset() # Reset environment
a = env.action_space.sample() # Sample an action
obs, reward, done, info = env.step(a) # Step the environoment with the sampled random action
obs, reward, done, info = env.step(a) # Step the environment with the sampled random action
assert (obs[-3:] == np.zeros(3)).all() # goal will be zeroed out because env is HiddenGoal

# You can choose to initialize the random seed of the environment.
@@ -173,7 +177,8 @@ env1.reset() # Reset environment
env2.reset()
a1 = env1.action_space.sample() # Sample an action
a2 = env2.action_space.sample()
next_obs1, _, _, _ = env1.step(a1) # Step the environoment with the sampled random action
next_obs1, _, _, _ = env1.step(a1) # Step the environment with the sampled random action

next_obs2, _, _, _ = env2.step(a2)
assert (next_obs1[-3:] == next_obs2[-3:]).all() # 2 envs initialized with the same seed will have the same goal
assert not (next_obs2[-3:] == np.zeros(3)).all() # The env's are goal observable, meaning the goal is not zero'd out
@@ -183,7 +188,7 @@ env1.reset() # Reset environment
env3.reset()
a1 = env1.action_space.sample() # Sample an action
a3 = env3.action_space.sample()
next_obs1, _, _, _ = env1.step(a1) # Step the environoment with the sampled random action
next_obs1, _, _, _ = env1.step(a1) # Step the environment with the sampled random action
next_obs3, _, _, _ = env3.step(a3)

assert not (next_obs1[-3:] == next_obs3[-3:]).all() # 2 envs initialized with different seeds will have different goals
@@ -208,11 +213,12 @@ If you use Meta-World for academic research, please kindly cite our CoRL 2019 pa
```

## Accompanying Baselines
If you're looking for implementations of the baselines algorithms used in the Metaworld conference publication, please look at our sister directory, [Garage](https://github.com/rlworkgroup/garage).
If you're looking for implementations of the baselines algorithms used in the Meta-World conference publication, please look at our sister directory, [Garage](https://github.com/rlworkgroup/garage).

Note that these aren't the exact same baselines that were used in the original conference publication, however they are true to the original baselines.

## Become a Contributor
We welcome all contributions to Meta-World. Please refer to the [contributor's guide](https://github.com/rlworkgroup/metaworld/blob/master/CONTRIBUTING.md) for how to prepare your contributions.
We welcome all contributions to Meta-World. Please refer to the [contributor's guide](https://github.com/Farama-Foundation/Metaworld/blob/master/CONTRIBUTING.md) for how to prepare your contributions.

## Acknowledgements
Meta-World is a work by [Tianhe Yu (Stanford University)](https://cs.stanford.edu/~tianheyu/), [Deirdre Quillen (UC Berkeley)](https://scholar.google.com/citations?user=eDQsOFMAAAAJ&hl=en), [Zhanpeng He (Columbia University)](https://zhanpenghe.github.io), [Ryan Julian (University of Southern California)](https://ryanjulian.me), [Karol Hausman (Google AI)](https://karolhausman.github.io), [Chelsea Finn (Stanford University)](https://ai.stanford.edu/~cbfinn/) and [Sergey Levine (UC Berkeley)](https://people.eecs.berkeley.edu/~svlevine/).
21 changes: 6 additions & 15 deletions docker/Dockerfile
@@ -7,26 +7,17 @@ SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN apt-get -y update \
&& apt-get install --no-install-recommends -y \
libglu1-mesa-dev libgl1-mesa-dev libosmesa6-dev \
xvfb unzip patchelf ffmpeg cmake swig \
xvfb unzip patchelf ffmpeg cmake swig git\
&& apt-get autoremove -y \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/* \
# Download mujoco
&& mkdir /root/.mujoco \
&& cd /root/.mujoco \
&& wget -qO- 'https://github.com/deepmind/mujoco/releases/download/2.1.0/mujoco210-linux-x86_64.tar.gz' | tar -xzvf -

ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/root/.mujoco/mujoco210/bin"

# Build mujoco-py from source. Pypi installs wheel packages and Cython won't recompile old file versions in the Github Actions CI.
# Thus generating the following error https://github.com/cython/cython/pull/4428
RUN git clone https://github.com/openai/mujoco-py.git\
&& cd mujoco-py \
&& pip install -e .
&& rm -rf /var/lib/apt/lists/*

COPY . /usr/local/metaworld/
WORKDIR /usr/local/metaworld/

RUN free -g
RUN pip install .[testing]
RUN git clone https://github.com/reginald-mclean/Gymnasium.git
RUN pip install -e Gymnasium


ENTRYPOINT ["/usr/local/metaworld/docker/entrypoint"]
