Add DataTree.move to move a node to another place #9442

Open
wants to merge 1 commit into
base: main

Armavica
Contributor

@Armavica Armavica commented Sep 7, 2024

@TomNicholas TomNicholas added the topic-DataTree Related to the implementation of a DataTree class label Sep 7, 2024
Member

@TomNicholas TomNicholas left a comment


Thanks so much @Armavica !


Returns
-------
DataTree
Member

Suggested change
- DataTree
+ DataTree
+     Copied subtree with the node moved.

    The node to move.
destination: str
    The new node destination.
parents: bool, optional
Member

I think this argument should be renamed, though I'm not sure what to. Perhaps create_intermediates?

Contributor Author

I am not sure either, I took inspiration from pathlib: https://docs.python.org/3/library/pathlib.html#pathlib.Path.mkdir
Happy to change to anything.

Member

Ah I see. Taking inspiration from pathlib was a good idea. But I don't think it's a great name in this case...

The exist_ok arg in pathlib.Path.mkdir is also potentially relevant - the user might want to avoid replacing the subtree already at that path.

Contributor Author

Ah, good point. I will add an exist_ok argument for now, until we settle on names for these arguments.
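
For reference, the pathlib behaviour this thread draws on can be checked with a short script. It only exercises the standard-library `Path.mkdir`, whose keyword is spelled `exist_ok`; how (or whether) `DataTree.move` should mirror these flags is still open here, so nothing below describes the PR's implementation.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "other" / "path"

    # Without parents=True, a missing intermediate directory is an error.
    try:
        nested.mkdir()
        missing_parent_error = False
    except FileNotFoundError:
        missing_parent_error = True

    # parents=True creates "other" first, then "other/path".
    nested.mkdir(parents=True)
    created = nested.is_dir()

    # Calling mkdir again on an existing directory raises by default ...
    try:
        nested.mkdir(parents=True)
        exists_error = False
    except FileExistsError:
        exists_error = True

    # ... unless exist_ok=True is passed, in which case it is a no-op.
    nested.mkdir(parents=True, exist_ok=True)
```

Note that in pathlib `exist_ok=True` leaves the existing directory untouched, whereas for a move the analogous flag would have to decide whether an existing subtree may be replaced, so the analogy is only partial.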

@@ -1060,6 +1060,50 @@ def drop_nodes(
         result._replace_node(children=children_to_keep)
         return result
 
+    def move(self, origin: str, destination: str, parents: bool = False) -> DataTree:
Member

There's another possible way of writing this method: dt.move(destination) where it's assumed that the origin is the current node, and the returned result is the root of the new tree.

But I think your way is probably the less surprising of the two.

Contributor Author

I see. Should we do both, or would that be too confusing?

Member

I think your way is fine! I just wanted to comment to point this out for posterity :)

@Armavica
Contributor Author

There are actually several non-trivial design questions to answer for this feature.
The main one I see is the semantics of the origin and destination paths. What do we want to happen when we do dt.move("/to_move", "/other/path") for the following DataTree:

dt = DataTree.from_dict({"/to_move/child": None, "other/path/here": None})
<xarray.DataTree>
Group: /
├── Group: /to_move
│   └── Group: /to_move/child
└── Group: /other
    └── Group: /other/path
        └── Group: /other/path/here

Option 1: destination is the new name of origin

dt.move("/to_move", "/other/path")
<xarray.DataTree>
Group: /
└── Group: /other
    └── Group: /other/path
        └── Group: /other/path/child

With this option, if destination existed before the move, it is overwritten by origin.

Option 2: destination is the new root of origin

dt.move("/to_move", "/other/path")
<xarray.DataTree>
Group: /
└── Group: /other
    └── Group: /other/path
        ├── Group: /other/path/here
        └── Group: /other/path/to_move
            └── Group: /other/path/to_move/child

With this option, if destination existed before the move, it is not overwritten, but one of its children might be if it has the same name as origin.

Option 3: if destination exists it is the new root, otherwise it is the new name

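This seems closer to the semantics of the Unix mv command: moving onto an existing directory places the source inside it, while moving to a path that does not exist renames the source.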
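
To make the three options concrete, here is a toy model that uses plain nested dicts in place of DataTree groups. Everything in it is hypothetical: the `move` helper, its `mode` argument and the mode names "rename", "into" and "mv" are made up for illustration and are not the API proposed in this PR. It always creates missing intermediate groups and does not handle moving the root or moving a node into its own subtree.

```python
from copy import deepcopy


def _split(path):
    """Split "/a/b" (or "a/b") into ["a", "b"]."""
    return [part for part in path.split("/") if part]


def _exists(tree, parts):
    """Return True if the nested-dict path exists in tree."""
    node = tree
    for part in parts:
        if part not in node:
            return False
        node = node[part]
    return True


def move(tree, origin, destination, mode):
    """Return a copy of tree with origin moved (hypothetical helper).

    mode="rename": destination is the new name of origin (Option 1).
    mode="into":   destination is the new parent of origin (Option 2).
    mode="mv":     "into" if destination exists, else "rename" (Option 3).
    """
    result = deepcopy(tree)
    src, dst = _split(origin), _split(destination)

    # Detach the origin node from its parent.
    parent = result
    for part in src[:-1]:
        parent = parent[part]
    moved = parent.pop(src[-1])

    # Decide whether the node keeps its own name under destination.
    if mode == "into" or (mode == "mv" and _exists(result, dst)):
        dst = dst + [src[-1]]

    # Attach at the target path, creating intermediate groups as needed
    # and overwriting whatever was already stored there.
    parent = result
    for part in dst[:-1]:
        parent = parent.setdefault(part, {})
    parent[dst[-1]] = moved
    return result


tree = {"to_move": {"child": {}}, "other": {"path": {"here": {}}}}
option1 = move(tree, "/to_move", "/other/path", mode="rename")
option2 = move(tree, "/to_move", "/other/path", mode="into")
option3_existing = move(tree, "/to_move", "/other/path", mode="mv")
option3_missing = move(tree, "/to_move", "/other/new", mode="mv")
```

In this model `option1` drops `/other/path/here` and `option2` keeps it, matching the two trees shown above, while the "mv" mode behaves like Option 2 for an existing destination and like Option 1 otherwise.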
