
[Dataset Format Standard Modification] Record Unflattened Spaces in Datasets #77

Merged
merged 35 commits into from
Jun 16, 2023

Conversation

@rodrigodelazcano (Member) commented May 24, 2023

Description

This PR is the continuation of issue #57. It removes the automatic flattening of observation/action spaces in StepDataCallback so that Dict and Tuple spaces are stored in their original format. This will make it possible to create custom dataset spaces that differ from the environment's.
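To illustrate what changes (this is not Minari's storage code, just a hypothetical before/after): a sample from a Dict space used to be flattened into one vector, losing the key structure, while the unflattened form keeps the nesting so each key can map to its own HDF5 group or dataset.

```python
import json

# Hypothetical sample from Dict({"position": Box(...), "velocity": Box(...)})
obs = {"position": [0.1, 0.2], "velocity": [0.0, -0.3]}

def flatten(obs: dict) -> list:
    """Old behavior: concatenate all leaves into one flat vector."""
    flat = []
    for key in sorted(obs):
        flat.extend(obs[key])
    return flat

assert flatten(obs) == [0.1, 0.2, 0.0, -0.3]  # key structure is lost
assert json.loads(json.dumps(obs)) == obs     # nested form round-trips intact
```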

Next tasks

  • Find a solution for recording Tuple spaces (maybe name each HDF5 group/dataset element_{id})
  • Encode custom action/observation spaces for datasets (the spaces can be different from the environment's)
  • Fix pytest
  • Documentation update
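The Tuple-space task above could look something like this (a hypothetical helper; the actual group layout was still being decided in this PR):

```python
def tuple_to_groups(values: tuple) -> dict:
    """Map each element of a Tuple-space sample to a string-named key,
    since HDF5 groups and datasets require string names."""
    return {f"element_{i}": value for i, value in enumerate(values)}

assert tuple_to_groups((1.0, [2, 3])) == {"element_0": 1.0, "element_1": [2, 3]}
```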

Checklist:

  • I have run the pre-commit checks with pre-commit run --all-files (see CONTRIBUTING.md instructions to set it up)
  • I have run pytest -v and no errors are present.
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I solved any possible warnings that pytest -v has generated that are related to my code to the best of my knowledge.
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@rodrigodelazcano rodrigodelazcano marked this pull request as draft May 24, 2023 14:11
Member:

typos in the comments and debugging print statements

@@ -250,8 +265,8 @@ def reset(
assert STEP_DATA_KEYS.issubset(step_data.keys())

# If last episode in global buffer has saved steps, we need to check if it was truncated or terminated
# If not, then we need to auto-truncate the episode
if len(self._buffer[-1]["actions"]) > 0:
# If not (empty dictionary), then we need to auto-truncate the episode.
Member:

Comments should ideally be more understandable than this; I'd rephrase to "if the dictionary is not empty, ..."

Collaborator:

Rephrased.

- # If not, then we need to auto-truncate the episode
- if len(self._buffer[-1]["actions"]) > 0:
+ # If not (empty dictionary), then we need to auto-truncate the episode.
+ if self._buffer[-1]:
Member:

Feel like there's probably a cleaner way to test for this but I'd have to look into the code a bit more deeply to be sure

Collaborator:

I'm not sure, but I'm open to suggestions here.

Member:

Looking back this seems okay to do, the comment above explains it clearly enough
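A minimal sketch of the pattern under discussion (buffer contents here are illustrative, not Minari's exact data): an empty episode-buffer dict is falsy, so truthiness can replace the length check on one specific key.

```python
def needs_auto_truncation(episode_buffer: dict) -> bool:
    """An episode buffer holding any recorded step data must be auto-truncated
    before a new episode starts; an empty dict means there is nothing to do."""
    return bool(episode_buffer)

assert not needs_auto_truncation({})
assert needs_auto_truncation({"actions": [0], "observations": [0.0, 1.0]})
```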

@balisujohn balisujohn changed the title [Standard modification] Record unflatten spaces in datasets [Standard modification] Record Unflattened Spaces in Datasets Jun 6, 2023
@balisujohn balisujohn changed the title [Standard modification] Record Unflattened Spaces in Datasets [Dataset Format Standard Modification] Record Unflattened Spaces in Datasets Jun 6, 2023
@balisujohn (Collaborator):

This is ready for review again @younik @rodrigodelazcano, though I still need to write the doc update for it.

@balisujohn balisujohn requested a review from younik June 8, 2023 17:35
@balisujohn balisujohn mentioned this pull request Jun 9, 2023
| `env_spec` | `str` | json string of the Gymnasium environment spec.|
| `dataset_name` | `str` | Name tag of the Minari dataset. |
| `dataset_id` | `str` | Identifier of the Minari dataset. |
Member:

Definitely a better var name to use so this is good

Collaborator:

I changed it to match our datasets, which already seemed to use dataset_id instead of dataset_name.

| `code_permalink` | `str` | Link to a repository with the code used to generate the dataset.|
| `author` | `str` | Author's name that created the dataset. |
| `author_email` | `str` | Email of the author that created the dataset.|
| `algorithm_name` | `str` | Name of the expert policy used to create the dataset. |
| `action_space` | `str` | Serialized Gymnasium action space describing actions in dataset. |
| `observation_space` | `str` | Serialized Gymnasium observation space describing observations in dataset. |
Member:

I didn't look in too much detail, but maybe it's worth noting how they get serialized? I'm assuming pickle if it's like the rest of our stuff, but we may want to look into transforming them into safetensors, since pickled objects can be security vulnerabilities (no matter what precautions we take, if we use pickle, someone could maliciously upload something that runs code through a local install).

Collaborator:

We wrote a custom JSON serialization/deserialization strategy, which doesn't provide a straightforward path to eval() attacks as far as I know. That's not to say there's no way someone could figure out such an attack. I will add a note to the documentation describing the serialization format.
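Minari's actual serializer is more complete than this, but as a sketch of why a JSON strategy sidesteps pickle's arbitrary-code-execution risk (the function names and fields here are simplified stand-ins): json.loads can only ever produce plain data, never executable objects.

```python
import json

def serialize_box(low, high, shape, dtype):
    """Describe a Box-like space as JSON-safe plain data."""
    return json.dumps(
        {"type": "Box", "low": low, "high": high, "shape": shape, "dtype": dtype}
    )

def deserialize_space_dict(payload):
    data = json.loads(payload)  # yields only dicts/lists/numbers/strings
    if data["type"] != "Box":
        raise TypeError(f"Unsupported space type: {data['type']}")
    return data

payload = serialize_box([-1.0, -1.0], [4.0, 4.0], [2], "float32")
assert deserialize_space_dict(payload)["shape"] == [2]
```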

This callback can be overridden to add extra environment information in each step or
edit the observation, action, reward, termination, truncation, or info returns.
"""

def __init__(self, env: gym.Env):
Member:

Haven't looked in depth, but shouldn't there still be an __init__ even if it doesn't flatten things? I guess it's just a callback, though, so maybe there's no need.

Collaborator:

It seems to work OK without one; see for example the test at https://github.com/Farama-Foundation/Minari/blob/8ff86b9fab6229169eb7516f839c725223e4711a/tests/data_collector/callbacks/test_step_data_callback.py. I think we don't need to set the self.env property anymore.

# actions, observations, rewards, terminations, truncations, and infos
assert STEP_DATA_KEYS.issubset(step_data.keys())
# Check that the saved observation and action belong to the dataset's observation/action spaces
assert self.dataset_observation_space.contains(step_data["observations"])
assert self.dataset_action_space.contains(step_data["actions"])
@elliottower (Member), Jun 13, 2023:

In general we want error messages when an assert fails, just for simpler debugging, so something like assert x, f"x is not true: {x}"

Collaborator:

I added some more descriptive messages to the asserts in data_collector.py
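As a sketch of the suggested pattern (names are illustrative, not the exact data_collector.py code), attaching an f-string message surfaces the offending value in the AssertionError:

```python
def check_observation(in_space: bool, observation) -> None:
    # With a message attached, a failure reports the bad value directly
    # instead of raising a bare AssertionError.
    assert in_space, (
        f"Observation {observation} is not in the dataset's observation space"
    )

try:
    check_observation(False, [0.1, 9999.0])
except AssertionError as err:
    print(err)  # Observation [0.1, 9999.0] is not in the dataset's observation space
```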

@@ -425,7 +462,16 @@ def save_to_disk(self, path: str, dataset_metadata: Optional[Dict] = None):
for key, value in dataset_metadata.items():
self._tmp_f.attrs[key] = value

self._buffer.append({key: [] for key in STEP_DATA_KEYS})
assert "observation_space" not in dataset_metadata.keys()
assert "action_space" not in dataset_metadata.keys()
Member:

Again, it's a bit of a pain, but it's best to add assertion error messages so that when one fails you can see the value that was incorrect (and make it more informative than "AssertionError: 2 != 3"; I've seen that before and it makes it a headache to even figure out what went wrong).

Collaborator:

added

@@ -309,20 +331,20 @@ def update_dataset_from_buffer(self, buffer: List[dict]):
element compared to the number of action steps {len(eps_buff['actions'])} \
The initial and final observation must be included"
seed = eps_buff.pop("seed", None)
eps_group = clear_episode_buffer(
episode_group = clear_episode_buffer(
Member:

Good change imo; eps could be misunderstood as epsilon, and it's not that long to say episode, so might as well.

self._action_space = env.action_space

env.close()
# ww will default to using the reconstructed observation and action spaces from the dataset
Member:

Typo in “ww” (I’m assuming at least)

Collaborator:

Fixed

)
self._action_space = deserialize_space(f.attrs["action_space"])
else:
# checking if the base library of the environment is present in the environment
Member:

It's a pain to standardize them all, but ideally most of the comments should start with an uppercase letter and look consistent. Not a huge deal, and your commenting above was great, so maybe just make sure it all looks good.

Collaborator:

Fixed this instance

episodes = data.get_episodes(episode_indices)
# verify we have the right number of episodes, available at the right indices
assert data.total_episodes == len(episodes)
# verify the actions and observations are in the appropriate action space and observation space, and that the episode lengths are correct
Member:

Lowercase comment (minor, can ignore if you want to)

Collaborator:

fixed this instance

@elliottower (Member) left a comment:

Looks pretty much good to me besides a few very minor writing or code style things

@elliottower elliottower marked this pull request as ready for review June 13, 2023 16:46
@balisujohn (Collaborator):

@younik
Regarding #77 (comment): I refactored those functions into get_episodes, which calls _get_episode_from_h5, which in turn calls _h5_group_to_dict_recursive.
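The recursive reconstruction mentioned here can be sketched generically; this is a stand-in for _h5_group_to_dict_recursive, with a nested mapping playing the role of an h5py group so the snippet stays self-contained:

```python
from collections.abc import Mapping

def group_to_dict(group):
    """Recursively convert a group-like mapping into a plain dict:
    sub-groups recurse, leaf datasets are copied as-is."""
    result = {}
    for key, value in group.items():
        if isinstance(value, Mapping):
            result[key] = group_to_dict(value)  # nested group
        else:
            result[key] = value  # leaf dataset
    return result

episode = {"observations": {"position": [0.1], "velocity": [0.2]}, "actions": [1]}
assert group_to_dict(episode) == episode
```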

@balisujohn (Collaborator):

I addressed the reviews on this PR. I will also refactor the tests to reduce the duplication of helper function/env/space definitions, then I think it's ready for final review.

@balisujohn (Collaborator) commented Jun 13, 2023

Refactor for the test dependencies is done :^). The last thing I want to check is that the old hosted datasets can still temporarily be sampled from OK until we convert them. (Also, I will add a note about the serialization format to the documentation.)

@younik (Member) left a comment:

Two minor comments on tests; for me it's good to merge. Good job!

@pytest.mark.parametrize(
    "space",
    [
        gym.spaces.Box(low=-1, high=4, shape=(2,), dtype=np.float32),
Member:

Can you delegate to test_common?

Collaborator:

Oh sounds good, I'll do it for test_serialization as well.

Collaborator:

done

@@ -1,5 +1,238 @@
from typing import Iterable
Member:

My idea for test_common was to factor out repetition in tests (like a list of spaces or environments that we want to test), similar to https://github.com/Farama-Foundation/Gymnasium/blob/main/tests/testing_env.py

So it's good to have the reusable environments here; for the tests themselves I would keep the structure of the package as before (with test_minari_storage.py and test_minari_dataset.py).

To avoid confusion, a better name for test_common.py is probably just common.py.

Collaborator:

Yeah, that makes sense to avoid the test file name format; I'll make that change.

Collaborator:

done!

@balisujohn (Collaborator) commented Jun 13, 2023

One heads-up: I think this PR will break loading the Dict-spaced flattened datasets from the database, in particular pointmaze, since they are no longer in compliance with our dataset standard (unless we reintroduce optional flattening). I will test whether this is the case later today when I have fast internet, but I wanted to flag that this failure is likely. I added a dataset integrity test to test_dataset_download.py, and it works OK for the Box-spaced environments it downloads.

…ndencies file name to common.py, removed depdency duplication in serialization.py, added a dataset integrity check to test_download_dataset_from_farama_server
@balisujohn (Collaborator):

OK, ready for final review, assuming we aren't blocked by the fact that this will likely break loading of the outdated flattened-Dict pointmaze datasets until they are replaced with up-to-date ones.

Comment on lines 36 to 44
elif isinstance(space, gym.spaces.Tuple):
    result = {"type": "Tuple", "subspaces": []}
    for subspace in space.spaces:
        result["subspaces"].append(serialize_space(subspace, to_string=False))
if to_string:
    return json.dumps(result)
else:
    return result

Member:

Missing an else branch: when you have an unsupported space, this will throw an uninformative error at line 43 (result is not initialized). Better to have an informative error in the else branch.

Collaborator:

Addressed, and also added a test to check that a TypeError is correctly raised if unsupported space types are serialized.
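The fix described can be sketched like this (the space classes are minimal stand-ins for the gym.spaces types so the snippet stays self-contained, and the serialization rules are simplified):

```python
import json

class Box:  # stand-in for gym.spaces.Box
    pass

class Tuple:  # stand-in for gym.spaces.Tuple
    def __init__(self, spaces):
        self.spaces = spaces

def serialize_space(space, to_string=True):
    if isinstance(space, Box):
        result = {"type": "Box"}
    elif isinstance(space, Tuple):
        result = {
            "type": "Tuple",
            "subspaces": [serialize_space(s, to_string=False) for s in space.spaces],
        }
    else:
        # The added else branch: fail loudly and informatively instead of
        # hitting an uninitialized `result` below.
        raise TypeError(f"No serialization rule for space type {type(space)}")
    return json.dumps(result) if to_string else result

assert json.loads(serialize_space(Tuple([Box(), Box()])))["type"] == "Tuple"
try:
    serialize_space(object())
except TypeError as err:
    print(err)
```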

@younik younik merged commit 3579ad5 into Farama-Foundation:main Jun 16, 2023
12 checks passed