[Dataset Format Standard Modification] Record Unflattened Spaces in Datasets #77
Conversation
Force-pushed from 461eac7 to d39bd01
typos in the comments and debugging print statements
@@ -250,8 +265,8 @@ def reset(
assert STEP_DATA_KEYS.issubset(step_data.keys())

# If last episode in global buffer has saved steps, we need to check if it was truncated or terminated
# If not, then we need to auto-truncate the episode
if len(self._buffer[-1]["actions"]) > 0:
# If not (empty dicitionary), then we need to auto-truncate the episode.
Comments should ideally be more understandable than this; I'd rephrase to "if the dictionary is not empty, ..."
Rephrased.
# If not, then we need to auto-truncate the episode
if len(self._buffer[-1]["actions"]) > 0:
# If not (empty dicitionary), then we need to auto-truncate the episode.
if self._buffer[-1]:
Feel like there's probably a cleaner way to test for this but I'd have to look into the code a bit more deeply to be sure
I'm not sure, but I'm open to suggestions here.
Looking back this seems okay to do, the comment above explains it clearly enough
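For context, the `if self._buffer[-1]:` check relies on standard Python truthiness of dicts; a minimal illustration of the behavior it depends on:

```python
# An empty dict is falsy; a dict with any keys is truthy. This is what
# the `if self._buffer[-1]:` check relies on: an episode buffer that
# recorded anything at all still needs the truncation/termination check.
episode_buffer = {}
assert not episode_buffer  # nothing recorded, nothing to auto-truncate

episode_buffer = {"actions": [0, 1], "rewards": [0.0, 1.0]}
assert episode_buffer  # episode has data, run the check
```

Note that this differs subtly from the old `len(buffer["actions"]) > 0` check when the dict has keys whose values are empty lists (such a dict is still truthy).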
…ll as a test for tuple action spaces and a combo env with nested dict and tuple action spaces
…rvations for create_dataset_from_buffers, this may be inefficient and need refactoring
… observation and action space of data now saved in dataset.
…pss+1 observations were being loaded when calling get_episodes
This is ready for review again @younik @rodrigodelazcano, though I still need to write the doc update for it.
| `env_spec` | `str` | json string of the Gymnasium environment spec. |
| `dataset_name` | `str` | Name tag of the Minari dataset. |
| `dataset_id` | `str` | Identifier of the Minari dataset. |
Definitely a better var name to use, so this is good.
I changed it to match our datasets, which seemed to already use dataset_id instead of dataset_name.
| `code_permalink` | `str` | Link to a repository with the code used to generate the dataset. |
| `author` | `str` | Name of the author that created the dataset. |
| `author_email` | `str` | Email of the author that created the dataset. |
| `algorithm_name` | `str` | Name of the expert policy used to create the dataset. |
| `action_space` | `str` | Serialized Gymnasium action space describing actions in the dataset. |
| `observation_space` | `str` | Serialized Gymnasium observation space describing observations in the dataset. |
I didn't look in too much detail, but maybe it's worth noting how they get serialized? I'm assuming pickled, if it's like the rest of our stuff, but we may want to look into transforming them into safetensors, since pickled objects can be security vulnerabilities (if we use pickle, someone can maliciously upload stuff by modifying the source code and running it through a local install, no matter what precautions we take).
We wrote a custom JSON serialization/deserialization strategy that doesn't provide a straightforward path to eval() attacks, as far as I know. That's not to say there isn't any way someone could figure out how to do an eval attack. I will add a note to the documentation describing the serialization format.
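To make the safety argument concrete: a JSON-based format stores only data, never code. A hypothetical serialized form of a `Box` space (the real field names in `minari/serialization.py` may differ) could look like this:

```python
import json

# Hypothetical serialized form of a Box space. Every field is a plain
# JSON type, so json.loads() only ever reconstructs data -- unlike
# pickle.loads(), it has no mechanism for executing attacker-supplied code.
serialized = json.dumps({
    "type": "Box",
    "low": [-1.0, -1.0],
    "high": [1.0, 1.0],
    "shape": [2],
    "dtype": "float32",
})

spec = json.loads(serialized)
assert spec["type"] == "Box"
assert spec["shape"] == [2]
```

The point is only that deserialization is a data lookup, not an eval().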
This callback can be overridden to add extra environment information in each step or
edit the observation, action, reward, termination, truncation, or info returns.
"""

def __init__(self, env: gym.Env):
Haven't looked in depth, but shouldn't there still be an __init__ even if it doesn't flatten things? I guess it's just a callback, though, so maybe there's no need.
It seems to work OK without one; for example, in the test here https://github.com/Farama-Foundation/Minari/blob/8ff86b9fab6229169eb7516f839c725223e4711a/tests/data_collector/callbacks/test_step_data_callback.py I think we don't need to set the self.env property anymore.
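That matches ordinary Python inheritance: a subclass that defines no `__init__` simply inherits the base class's initializer. A generic sketch, not Minari's actual classes:

```python
class StepDataCallback:
    """Hypothetical base callback that keeps no per-instance state."""

    def __call__(self, obs, action, reward):
        # Default behavior: pass the step data through unchanged.
        return {"observations": obs, "actions": action, "rewards": reward}


class CustomCallback(StepDataCallback):
    # No __init__ defined here: object.__init__ is inherited, which is
    # fine because this callback never needs to store self.env.
    def __call__(self, obs, action, reward):
        step = super().__call__(obs, action, reward)
        step["infos"] = {"custom": True}  # add extra per-step information
        return step


step = CustomCallback()(obs=[0.0], action=1, reward=0.5)
assert step["infos"] == {"custom": True}
```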
# actions, observations, rewards, terminations, truncations, and infos
assert STEP_DATA_KEYS.issubset(step_data.keys())
# Check that the saved observation and action belong to the dataset's observation/action spaces
assert self.dataset_observation_space.contains(step_data["observations"])
assert self.dataset_action_space.contains(step_data["actions"])
In general we want error messages when an assert fails, just for simpler debugging: something like assert x, f"x is not true: {x}".
I added some more descriptive messages to the asserts in data_collector.py
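The pattern being suggested, as a hedged sketch (the exact messages in `data_collector.py` may differ):

```python
def validate_step_data(step_data: dict, required_keys: set) -> None:
    # Put the offending values in the assertion message so a failure
    # says what went wrong instead of raising a bare AssertionError.
    missing = required_keys - step_data.keys()
    assert not missing, (
        f"step_data is missing required keys {sorted(missing)}; "
        f"got keys {sorted(step_data.keys())}"
    )


validate_step_data({"actions": 0, "rewards": 1.0}, {"actions", "rewards"})  # passes
```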
@@ -425,7 +462,16 @@ def save_to_disk(self, path: str, dataset_metadata: Optional[Dict] = None):
for key, value in dataset_metadata.items():
    self._tmp_f.attrs[key] = value

self._buffer.append({key: [] for key in STEP_DATA_KEYS})
assert "observation_space" not in dataset_metadata.keys()
assert "action_space" not in dataset_metadata.keys()
Again, it's a bit of a pain, but it's best to add assertion error messages so that when one fails you can see the value that was incorrect (and make them more informative than "AssertionError: 2 != 3"; I've seen that before and it makes it a headache to even figure out what went wrong).
added
@@ -309,20 +331,20 @@ def update_dataset_from_buffer(self, buffer: List[dict]):
element compared to the number of action steps {len(eps_buff['actions'])} \
The initial and final observation must be included"
seed = eps_buff.pop("seed", None)
eps_group = clear_episode_buffer(
episode_group = clear_episode_buffer(
Good change IMO; eps could be misunderstood as epsilon, and it's not that long to say episode, so might as well.
minari/dataset/minari_storage.py (Outdated)
self._action_space = env.action_space

env.close()
# ww will default to using the reconstructed observation and action spaces from the dataset
Typo in “ww” (I’m assuming at least)
Fixed
minari/dataset/minari_storage.py (Outdated)
)
self._action_space = deserialize_space(f.attrs["action_space"])
else:
# checking if the base library of the environment is present in the environment
It's a pain to standardize them all, but ideally we want most of the comments to start with an uppercase letter and look consistent. Not a huge deal, and your commenting above was great, so maybe just make sure it all looks good.
Fixed this instance
episodes = data.get_episodes(episode_indices)
# verify we have the right number of episodes, available at the right indices
assert data.total_episodes == len(episodes)
# verify the actions and observations are in the appropriate action space and observation space, and that the episode lengths are correct
Lowercase comment (minor, can ignore if you want to)
fixed this instance
Looks pretty much good to me besides a few very minor writing or code style things
@younik I addressed the reviews on this PR. I will also refactor the tests to reduce the duplication of helper function/env/space definitions; then I think it's ready for final review.
Refactor for the test dependencies is done :^). The last thing I want to check is that the old hosted datasets can temporarily still be sampled from OK, until we convert them. (Also, I will add a note about the serialization format to the documentation.)
Two minor comments on tests; for me it's good to merge, good job!
tests/test_serialization.py (Outdated)
@pytest.mark.parametrize(
    "space",
    [
        gym.spaces.Box(low=-1, high=4, shape=(2,), dtype=np.float32),
Can you delegate to test_common?
Oh sounds good, I'll do it for test_serialization as well.
done
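The resulting layout might look roughly like this (illustrative names; the real shared list in `common.py` would hold Gymnasium space instances rather than plain dicts):

```python
# common.py (illustrative): shared test parameters defined in one place
TEST_SPACE_SPECS = [
    {"type": "Box", "low": -1, "high": 4, "shape": [2]},
    {"type": "Discrete", "n": 5},
]

# Each test module would then import the shared list instead of
# redefining it, e.g.:
#
#   from common import TEST_SPACE_SPECS
#
#   @pytest.mark.parametrize("space_spec", TEST_SPACE_SPECS)
#   def test_roundtrip(space_spec):
#       ...
#
# so the list of spaces under test is maintained in a single place.
assert all("type" in spec for spec in TEST_SPACE_SPECS)
```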
tests/test_common.py (Outdated)
@@ -1,5 +1,238 @@
from typing import Iterable
My idea for test_common was to factor out the repetitions in tests (like a list of spaces or environments that we want to test), similar to https://github.com/Farama-Foundation/Gymnasium/blob/main/tests/testing_env.py. So it's good to have here the environments that we can reuse in multiple tests; for the tests themselves, I would keep the structure of the package as before (with test_minari_storage.py and test_minari_dataset.py). To avoid confusion, probably a better name for test_common.py is just common.py.
Yeah, that makes sense to avoid the test file name format; I'll make that change.
done!
One heads up is that I think this PR will break loading of the outdated flattened dict pointmaze datasets.
…ndencies file name to common.py, removed depdency duplication in serialization.py, added a dataset integrity check to test_download_dataset_from_farama_server
OK, ready for final review, assuming we aren't blocked by the fact that we will likely break loading of the outdated flattened dict pointmaze datasets until they are replaced with up-to-date datasets.
minari/serialization.py (Outdated)
elif isinstance(space, gym.spaces.Tuple):
    result = {"type": "Tuple", "subspaces": []}
    for subspace in space.spaces:
        result["subspaces"].append(serialize_space(subspace, to_string=False))
if to_string:
    return json.dumps(result)
else:
    return result
Missing an else branch: when you have an unsupported space, this will throw an uninformative error at line 43 (result is not initialized). Better to raise an informative error in the else branch.
Addressed, and also added a test to check that a TypeError is correctly raised if unsupported space types are serialized.
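A toy sketch of the shape of that fix, using stand-in types instead of real Gymnasium spaces (the actual `serialize_space` in `minari/serialization.py` handles the real space classes):

```python
import json


def serialize_space(space, to_string=True):
    # Toy stand-ins: int plays the role of Discrete, tuple of Tuple.
    if isinstance(space, int):
        result = {"type": "Discrete", "n": space}
    elif isinstance(space, tuple):
        result = {"type": "Tuple", "subspaces": [
            serialize_space(sub, to_string=False) for sub in space
        ]}
    else:
        # Informative failure instead of `result` being unbound below.
        raise TypeError(f"Unsupported space type: {type(space)}")
    return json.dumps(result) if to_string else result


assert json.loads(serialize_space((3, 5)))["type"] == "Tuple"
```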
…corresponding test
* change episodes structure + refactor
* fix pre-commit
Description

This PR is the continuation of issue #57. The PR removes the automatic flattening of observation/action spaces in `StepDataCallback` to store `Dict` and `Tuple` spaces in their original format. This will allow creating custom dataset spaces different from the environment's.

Next tasks

- `Tuple` spaces (maybe name each hdf5 group/dataset as `element_{id}`)

Checklist:

- I have run the `pre-commit` checks with `pre-commit run --all-files` (see `CONTRIBUTING.md` instructions to set it up)
- I have run `pytest -v` and no errors are present.
- I have fixed any warnings that `pytest -v` has generated that are related to my code to the best of my knowledge.