#4605 match.params aren't URI decoded - further test #6578

RichardBradley · 2019-02-08T14:31:52Z

Hi,

I think that #4605 is still not fixed, as this test demonstrates.

I will attempt to fix, but I can't guarantee I will succeed. I think the fix needs to be in history, and that the path component decoding function needs to be decodeURIComponent, rather than decodeURI. For that to work, we need to rearrange the parsing so that decoding happens after parsing (I think this commit does it before parsing), otherwise the parse will be ambiguous.

I think that the current code shows confusion around types, e.g. the line I linked to above pathname = decodeURI(pathname) shows that the distinction between encoded URIs and values used as path parameters is not being properly maintained, which has led to this bug IMO. Decoded path parameters are of a different type to encoded URIs, so the two should not be stored in the same variable without inviting confusion and bugs.

Thanks,

Rich

RichardBradley · 2019-02-08T15:29:16Z

I have prepared a fix, which I have added to this PR.

I don't think that the URI decoding relevant to this bug should be happening in history at all, and I think that the decodeURI that was added in remix-run/history#442 is quite suspicious and may be incorrect. (Maybe try reverting it and see if the tests still pass after this commit?)

I believe that the history module should pass the paths to react-router, including any necessary URI encoding that is required for them to be valid URIs. When react-router extracts path params from these URIs, it should apply decodeURIComponent, because the values are moving from a URI typed domain to a string typed domain. I have done this in the attached commit (1967020) and all tests now pass.

Hope this helps,

Rich

RichardBradley · 2019-02-08T15:47:54Z

This will also fix #5607 and #5472, but I can't comment on those as they are both "locked" (why?)

pshrmn · 2019-04-02T00:54:00Z

With history v5, we will be removing pathname decoding. Both v4 and v5 of history will remain compatible with RRv5 (and v4). With the React Router and history v6 releases, we will be switching to non-decoded pathnames (they should always be encoded, but the user will be responsible for providing the correct encoding), but for the time being it will be possible for pathnames to be either encoded or non-encoded. I bring this up for two reasons:

I worry that we cannot safely call decodeURIComponent on the params until we are certain that the pathname is encoded. The primary edge case for this would be if the decoded pathname has a percent sign (either a valid octet that gets double decoded or an invalid octet that throws an error).

decodedLocation.pathname = "/message/%20 is the space octet"
<Route path="/message/:message" />
params = { message: "  is the space octet" }

We should provide a way for people using history v5 to have decoded params. Encoded pathnames is the safe choice, but having to constantly decode params yourself is annoying.

I haven't discussed this with any of the other maintainers, but I think that it might be beneficial to add a decode prop to the <Route> (only for v5), which would signal that we can safely decode the params.

<Route path="/artist/:name" decode ... />

That should solve the problem in this PR while letting anyone that knows that they have params that are unsafe to decode to skip the decoding. (The inverse solution of providing a prop to not decode is also possible)

v6 will expect pathnames to be encoded, so we should probably be able to safely decode params there.

Thoughts?

RichardBradley · 2019-04-02T09:23:48Z

I don't fully understand your comment. My initial reaction is that it doesn't really make sense, and you are trying to shoe-horn two incompatible concepts (URLs and some kind of more general "path") into a single library, which is causing you irreconcilable type errors.

I think you need to 1) actually specify what /type/ is held in the various values here and 2) enforce that consistently (which is hard in an untyped language). I think you are building on sand by proposing a user-specified "encoding" which may or may not be compatible with URL encoding.

In the meantime, I do think the proposed patch here is a) correct, b) an improvement on what is currently in the codebase and c) fixes a real bug. So I would suggest merging this patch and moving more theoretical discussions about "the user will be responsible for providing the correct encoding" in future versions to a different thread.

pshrmn · 2019-04-02T13:42:31Z

I will move the decoding issue elsewhere, but I'm against merging this PR at this point. It might be the correct decision in v6, but we'll cross that bridge when we get there.

RichardBradley · 2019-04-02T14:31:02Z

Are you saying that the path params should never be decoded?
So for a route like <Route path="/:parentId/:childId", a URL like /a%2fb/c should give values {parentId: "a%2fb", childId: "c"}?

I think that is unconventional to the point of being wrong, but I suppose it could work if it is documented and applied consistently.

Currently the codebase is somewhere half way between the two viewpoints.

pshrmn · 2019-04-02T15:14:53Z

I think that a param should be decoded. My worry is that if we call decodeURIComponent on a string that is already decoded, we run the risk of double decoding an octet-like string.

This is all stemming from the history v4 implementation where a pathname is a decoded string. Any percent-encoded octets in the string should not be considered encoded (in the param %20 is the space octet, %20 is a string containing three characters, not the octet for a single space character). If we call decodeURIComponent on that string, the octet will be converted to a space ( is the space octet).

I really just don't want to further break things by adding automatic decoding unless we can be certain that it will give the correct output. Once all pathnames are encoded, we will know that any percent-encoded octets are "real" octets, so decoding would be safe.

Maybe I'm overthinking it; v4's behavior would already make the %20 is... pathname buggy, so it is possible that automatic decoding wouldn't break anything that isn't already broken. 🤷‍♂️

RichardBradley · 2019-04-02T18:44:35Z

Maybe I'm overthinking it; v4's behavior would already make the %20 is... pathname buggy, so it is possible that automatic decoding wouldn't break anything that isn't already broken. 🤷‍♂️

I think so.

I would bet that everyone is using "web safe" path params, because they can't rely on the current behaviour. I don't think you need to worry about backwards-compatibility here, as there is clearly a bug, as evidenced by #4605, this issue and the unit tests I added.

pshrmn · 2019-04-02T19:04:07Z

I'm still slightly hesitant about blanket decoding every param. It is probably the ideal default behavior, but I wonder if anyone would want the ability to keep their params encoded. I'll leave the decision of whether to merge this as is or allow non-decoded params up to the other maintainers.

As an aside, I think that the title of this PR could be more clear. Something like "automatically decode params using decodeURIComponent" or "decode param characters not decoded by decodeURI". The latter describes what the PR fixes today, while the former describes what RR will want to do once pathnames are encoded.

// characters not decoded by decodeURI
; , / ? : @ & = + $ #

RichardBradley · 2019-04-03T09:08:07Z

I wonder if anyone would want the ability to keep their params encoded

I can't think of any reason why they might want this.
This doesn't make sense to me at all. There is no option in JSON.parse to keep the fields "encoded"; why should a URL decomposing lib offer an option to incorrectly decompose a URL?

I'll leave it with you.

As an aside, I think that the title of this PR could be more clear. Something like "automatically decode params using decodeURIComponent" or "decode param characters not decoded by decodeURI". The latter describes what the PR fixes today, while the former describes what RR will want to do once pathnames are encoded.

I disagree, I think "match.params aren't URI decoded" is a reasonable description of the bug.

Part of the confusion might be that the builtin function "decodeURI" is poorly named and not very useful, for historical reasons.
Notice that the MDN docs for decodeURI do not mention any RFCs or explain when it is valid to use this function :-(

This SO explains some of the differences: https://stackoverflow.com/a/3608791/8261

"decodeURIComponent"/"encodeURIComponent" is correct for URL path parameters, which is the context we are discussing.

stale · 2019-09-10T22:51:34Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

RichardBradley · 2019-09-11T09:23:52Z

I really hate "stale bot".

This is a real bug in react-router. Even though no-one is currently working on it at present, I don't think this issue should be closed.

max-degterev · 2019-09-11T14:04:32Z

What is the recommended workaround for this issue? Just noticed it upgrading from v3 to v5

loganvolkers · 2020-01-23T01:15:48Z

@suprMax We found a solution!

The biggest cause of this bug was actually pointed earlier in this thread. Huge thank you and props to @RichardBradley for the deep digging to find the baneful use of decodeURI.

I don't think that the URI decoding relevant to this bug should be happening in history at all, and I think that the decodeURI that was added in remix-run/history#442 is quite suspicious and may be incorrect. (Maybe try reverting it and see if the tests still pass after this commit?)

It appears that @pshrmn didn't really understand the URI specification and has introduce this situation.

The workaround is to:

Manually encodeURI the path before sending it to StaticRouter. Otherwise your URL will be partially decoded only some of the time.
Manually decode match.params

I've got a sandbox showing this off.

decodeURI decodes paths in a very strange way. This should hopefully solve remix-run#786. This will need to be coupled with a PR in react-router remix-run/react-router#6578.

vpstuart · 2020-05-19T07:22:51Z

Submitted PR on history project remix-run/history#797 that I'm hoping is the other half of this issue.

I am in favor of this PR, as it causes some weird gaps in support for %-encoded path segments for me which at the moment might force a home-grown replacement for a project, which I'd really rather avoid.

ryanflorence · 2021-12-09T19:30:23Z

Thanks for everybody's work on this. I'm closing issues that are old, if they are still problems please open a new issue and reference this one. We will be a lot more attentive now that our software is our sole focus and we survived 2020 😅. Thanks!

RichardBradley added 3 commits February 8, 2019 14:25

remix-run#4605 match.params aren't URI decoded - further test

4aecfe6

remix-run#4605 fix html encoding confusion in tests

91ca3ea

remix-run#4605 correctly uri decode params in paths

1967020

RichardBradley mentioned this pull request Feb 10, 2019

matchPath failing at unicode paths #5474

Closed

stale bot added the stale label Sep 10, 2019

stale bot removed the stale label Sep 11, 2019

mjackson force-pushed the master branch from 4195a96 to 375f320 Compare September 27, 2019 17:08

stale bot added stale and removed stale labels Dec 20, 2019

timdorr added the fresh label Dec 27, 2019

remix-run deleted a comment from RichardBradley Dec 27, 2019

remix-run deleted a comment from stale bot Dec 27, 2019

This was referenced May 19, 2020

Update LocationUtils.js remix-run/history#797

Closed

LocationUtils should not run pathname through decodeURI remix-run/history#786

Open

mjackson removed the fresh label Sep 9, 2021

mjackson force-pushed the main branch from 320be7a to 2e342ce Compare October 15, 2021 18:11

ryanflorence closed this Dec 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

#4605 match.params aren't URI decoded - further test #6578

#4605 match.params aren't URI decoded - further test #6578

RichardBradley commented Feb 8, 2019 •

edited

Loading

RichardBradley commented Feb 8, 2019 •

edited

Loading

RichardBradley commented Feb 8, 2019 •

edited

Loading

pshrmn commented Apr 2, 2019

RichardBradley commented Apr 2, 2019

pshrmn commented Apr 2, 2019

RichardBradley commented Apr 2, 2019

pshrmn commented Apr 2, 2019

RichardBradley commented Apr 2, 2019

pshrmn commented Apr 2, 2019

RichardBradley commented Apr 3, 2019

stale bot commented Sep 10, 2019

RichardBradley commented Sep 11, 2019

max-degterev commented Sep 11, 2019 •

edited

Loading

loganvolkers commented Jan 23, 2020

vpstuart commented May 19, 2020

ryanflorence commented Dec 9, 2021

#4605 match.params aren't URI decoded - further test #6578

#4605 match.params aren't URI decoded - further test #6578

Conversation

RichardBradley commented Feb 8, 2019 • edited Loading

RichardBradley commented Feb 8, 2019 • edited Loading

RichardBradley commented Feb 8, 2019 • edited Loading

pshrmn commented Apr 2, 2019

RichardBradley commented Apr 2, 2019

pshrmn commented Apr 2, 2019

RichardBradley commented Apr 2, 2019

pshrmn commented Apr 2, 2019

RichardBradley commented Apr 2, 2019

pshrmn commented Apr 2, 2019

RichardBradley commented Apr 3, 2019

stale bot commented Sep 10, 2019

RichardBradley commented Sep 11, 2019

max-degterev commented Sep 11, 2019 • edited Loading

loganvolkers commented Jan 23, 2020

vpstuart commented May 19, 2020

ryanflorence commented Dec 9, 2021

RichardBradley commented Feb 8, 2019 •

edited

Loading

RichardBradley commented Feb 8, 2019 •

edited

Loading

RichardBradley commented Feb 8, 2019 •

edited

Loading

max-degterev commented Sep 11, 2019 •

edited

Loading