Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: enable relative_to() between to cloud files #149

Closed
remi-braun opened this issue Jul 1, 2021 · 5 comments
Closed

ENH: enable relative_to() between to cloud files #149

remi-braun opened this issue Jul 1, 2021 · 5 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@remi-braun
Copy link

Hello,

Is there any drowbacks to implement relative_to for Cloud paths ?

For example my usecase is:

root = AnyPath("s3://my-bucket/")
file = AnyPath("s3://my-bucket/END_TO_END_RESULTS/INDEX/result.tif")

And I would like to retrieve the relative path between the 2, a.k.a : "/END_TO_END_RESULTS/INDEX/result.tif".

Does it make sense ? I am not very familiar with the cloud paths usages.

@pjbull
Copy link
Member

pjbull commented Jul 1, 2021

@remi-braun Definitely not an unreasonable request.

The reason that we don't currently have relative paths, just absolute paths. For example., Path.relative_to returns a valid Path object. If we do realtive_to on a CloudPath, once we strip the bucket info, etc. we'll just have a string left to return to the user, but not a CloudPath.

We could implement relative_to to return a string that's stripped on the left hand side. I'm inclined to leave this open and see how common of a use case this is.

That said, for the particular use case you lay out, .key will give you everything except the bucket on S3:

In [1]: from cloudpathlib import CloudPath

In [2]: CloudPath("s3://my-bucket/END_TO_END_RESULTS/INDEX/result.tif").key
Out[2]: 'END_TO_END_RESULTS/INDEX/result.tif'

@remi-braun
Copy link
Author

Oh I understand, your point is, in my opinion, very relevant.

My workaround is in my case (I need a str and it should be OK since there are no relative paths for CloudPath):

if isinstance(file, CloudPath):
   rel_path = str(file).replace(str(res_dir), "") 
else:
   rel_path = file.relative_to(res_dir)

@analog-cbarber
Copy link

Both relative_to and is_relative_to should be supported. The relative_to method should just return a non absolute PosixPath instance, which can already be used with CloudPath.joinpath.

Obviously there are some cases where the resulting Path cannot be used to recreate the original path
(e.g. when the returned relative path includes the S3 bucket), but I think that is ok.

Support for relative is important when you are mapping files between two different file systems relative to
some particular root on each, where the file systems could be local, or different cloud providers.

@jayqi jayqi added the enhancement New feature or request label Sep 10, 2021
@pjbull pjbull added the good first issue Good for newcomers label Mar 2, 2022
@pjbull
Copy link
Member

pjbull commented May 30, 2022

Some notes from @Gilthans from #229:

Hello!
Firstly - this is a wonderful library! I was about to implement something similar when I decided to look for it, and was delighted to see this. Such a clean, well designed project is hard to come by. Thanks to everyone involved.

I've seen most methods were implemented, and those that weren't had documentation on why. There are two methods I'd like to contest the decision though: relative_to and resolve.

  • resolve is stunted in a CloudPath since it is always absolute, but I'd argue it's better to return the same path instead of raising an error. This allows for using resolve to force into an absolute path, without knowing in advance if this is a cloud path or a local path.
  • relative_to I think should be implemented - relative_to is always used between two absolute paths, although the result is always a relative path. The result would be a PurePosixPath (and not a CloudPath), which after some thought does make sense.

I'd be happy to write a PR for this, if it would be accepted, and again thanks to all the contributors!

After some thought, I'd argue the argument for resolve applies to absolute and as_posix as well, so I'd be happy to add those too.

@pjbull
Copy link
Member

pjbull commented Jun 1, 2022

Fixed with #232

@pjbull pjbull closed this as completed Jun 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants