Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to enable trailing slashes #136

Open
bitzl opened this issue Nov 18, 2022 · 8 comments
Open

Option to enable trailing slashes #136

bitzl opened this issue Nov 18, 2022 · 8 comments

Comments

@bitzl
Copy link
Contributor

bitzl commented Nov 18, 2022

The WebDAV server I work with expectets trailing / at the end of directory urls and I can't figure out how to do this with this library (it's a 3rd-party service, so I can't change the server behaviour).

In any case of

client = Client("https://theuser.your-storagebox.de", auth=("theuser", "thepassword"))
client.exists("abc")
client.exists("/abc")
client.exists("abc/")
client.exists("/abc/")

the URL is normalized to https://theuser.your-storagebox.de/abc, which then yields a HTTPError because of 301 redirect to https://theuser.your-storagebox.de/abc/.

Is there a way to manually set an URL with trailing slashes or to enable trailing slashes as default? I'm happy to help if needed.

@dolfinus
Copy link

dolfinus commented Nov 30, 2022

It looks like client.exists(path) calls client.propfind(path) with default arguments:

self.propfind(path)

And client.propfind(path) has argument follow_redirects with default value False:

follow_redirects: bool = False,

Maybe True should be passed in client.exists?

@RomainTT
Copy link

RomainTT commented Dec 30, 2022

lighttpd webdav module also needs trailing slashes for directories. For instance client.ls("my/directory/") does not work because it requests <base_url>/my/directory. But ls cannot know in advance if the target is a file or a directory. I think the best thing it can do is keeping the trailing slash if there is one in the argument. Maybe an option like keep_trailing_slash=True would be nice.

@dolfinus
Copy link

Maybe this client should always follow redirects to handle both servers which require trailing slash and which require to trim it.

@cclienti
Copy link

cclienti commented Jan 29, 2023

Hello, I got the same issue.

At the moment I'm forced to dynamically patch (monkey patch) the wrap_fn to force follow_redirects=True.

I'm considering to propose a pull request based on the following.

I think that a valid possibility would be to remove the follow_redirects parameter in the wrap_fn call in order to rely on client options passed to the constructor in (WebdavFilesystem or Client classes). Maybe such technique would also allow to remove all follow_redirects=True in the client.py file

    def propfind(
        self,
        path: str,
        data: str = None,
        headers: "HeaderTypes" = None,
        # follow_redirects: bool = False,   # Remove this line, but API break
    ) -> "MultiStatusResponse":
        """Returns properties of the specific resource by propfind request."""
        call = wrap_fn(
            self._request,
            HTTPMethod.PROPFIND,
            path,
            content=data,
            headers=headers,
            # follow_redirects=follow_redirects
        )
        http_resp = self.with_retry(call)
        return parse_multistatus_response(http_resp)

If we want to not break the propfind API we can also add the follow_redirects in the wrap_fn function call only if it is true.

    def propfind(
        self,
        path: str,
        data: str = None,
        headers: "HeaderTypes" = None,
        follow_redirects: bool = False,
    ) -> "MultiStatusResponse":
        """Returns properties of the specific resource by propfind request."""
        client_opts = {"follow_redirects": True} if follow_redirects else {}
        call = wrap_fn(
            self._request,
            HTTPMethod.PROPFIND,
            path,
            content=data,
            headers=headers,
            **client_opts
        )
        http_resp = self.with_retry(call)
        return parse_multistatus_response(http_resp)

With such change, the WebdavFilesystem (or Client) should be instantiated with the follow_redirects=True in the client_opts like in the following example:

from webdav4.fsspec import WebdavFileSystem
fs2 = WebdavFileSystem(
    "https://my-webdav-repo.com",
    auth=(user, pass),
    follow_redirects=True
)

By doing so, we should also add an option to the CLI in order to add a --follow-redirects flag in order to instantiate properly the Webdav4 Client.

I'm new with webdav4 so maybe I missed something.

@skshetry
Copy link
Owner

Hi, sorry for the late response. follow_redirects is set to False given httpx default. This is done for security reasons of course, and I wanted to keep it.

The issue here is that the standard WebDAV servers expect the PROPFIND call for collections/dirs to be made to URL with a trailing slash, i.e. /dir/ since it's a collection.
And for non-collection resources, the URI is without the slash, i.e. /file. But for APIs like exists(), we cannot be sure whether or not we should add a trailing slash as we don't know if it's a collection or not. A lot of servers these days do use redirection or return identical responses on both. So that saves us from checking both responses.

Path handling definitely needs to improve (#3) in webdav4. I'm okay with setting follow_redirects=True by default for the timebeing.

@cclienti
Copy link

In my opinion, a clean solution would be to remove the parameter follow_redirects: bool = False in the propfind method as I presented previously. By doing so, the user is responsible for enabling or not the redirection when the WebDavFileSystem is instantiated.

@skshetry
Copy link
Owner

The problem still exists though, if follow_redirects=False, as URI for a collection requires a trailing slash. At the moment, we just strip trailing-slashes. That could be fixed, but we don't always know if the resource is a collection or not (eg: in exists).

The temporary fix here is to get rid of follow_redirects=False from propfind and enable it in client by default.

@dolfinus
Copy link

dolfinus commented Feb 20, 2023

That could be fixed, but we don't always know if the resource is a collection or not (eg: in exists).

exists returns True for both file and directory, the resource type does not matter. Someone should use isdir/isfile instead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants