Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

urlapi: "normalize" numerical IPv4 host names #6871

Closed
wants to merge 1 commit into from

Conversation

bagder
Copy link
Member

@bagder bagder commented Apr 8, 2021

When the host name in a URL is given as an IPv4 numerical address, the
address can be specified with dotted numericals in four different ways:
a32, a.b24, a.b.c16 or a.b.c.d and each part can be specified in
decimal, octal (0-prefixed) or hexadecimal (0x-prefixed).

Instead of passing on the name as-is and leaving the handling to the
underlying name functions, which made them not work with c-ares but work
with getaddrinfo, this change now makes the curl URL API itself detect
and "normalize" host names specified as IPv4 numericals.

The WHATWG URL Spec says this is an okay way to specify a host name in a
URL. RFC 3896 does not allow them, but curl didn't prevent them before
and it seems other RFC 3896-using tools have not either. Host names used
like this are widely supported by other tools as well due to the
handling being done by getaddrinfo and friends.

I decided to add the functionality into the URL API itself so that all
users of these functions get the benefits, when for example wanting to
compare two URLs. Also, it makes curl built to use c-ares now support
them as well and make curl builds more consistent.

The normalization makes HTTPS and virtual hosted HTTP work fine even
when curl gets the address specified using one of the "obscure" formats.

Test 1560 is extended to verify.

Fixes #6863

@bagder bagder added the URL label Apr 8, 2021
lib/urlapi.c Show resolved Hide resolved
@bagder bagder added the feature-window A merge of this requires an open feature window label Apr 8, 2021
@bagder
Copy link
Member Author

bagder commented Apr 8, 2021

As a little precaution I'll merge this in the next feature-window

When the host name in a URL is given as an IPv4 numerical address, the
address can be specified with dotted numericals in four different ways:
a32, a.b24, a.b.c16 or a.b.c.d and each part can be specified in
decimal, octal (0-prefixed) or hexadecimal (0x-prefixed).

Instead of passing on the name as-is and leaving the handling to the
underlying name functions, which made them not work with c-ares but work
with getaddrinfo, this change now makes the curl URL API itself detect
and "normalize" host names specified as IPv4 numericals.

The WHATWG URL Spec says this is an okay way to specify a host name in a
URL. RFC 3896 does not allow them, but curl didn't prevent them before
and it seems other RFC 3896-using tools have not either. Host names used
like this are widely supported by other tools as well due to the
handling being done by getaddrinfo and friends.

I decided to add the functionality into the URL API itself so that all
users of these functions get the benefits, when for example wanting to
compare two URLs. Also, it makes curl built to use c-ares now support
them as well and make curl builds more consistent.

The normalization makes HTTPS and virtual hosted HTTP work fine even
when curl gets the address specified using one of the "obscure" formats.

Test 1560 is extended to verify.

Fixes #6863
Closes #6871
@bagder bagder force-pushed the bagder/urlapi-normalize-ipv4 branch from 89356f0 to 6de63f5 Compare April 8, 2021 15:35
@bagder bagder closed this in 56a037c Apr 19, 2021
@bagder bagder deleted the bagder/urlapi-normalize-ipv4 branch April 19, 2021 06:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature-window A merge of this requires an open feature window URL
Development

Successfully merging this pull request may close these issues.

numerical host name not normalized
1 participant