Redesign HTTP::Client to implement missing features #6011

Open
straight-shoota opened this issue Apr 25, 2018 · 19 comments

@straight-shoota (Member) commented Apr 25, 2018

HTTP::Client is missing some quite essential features, and while there are individual issues about some of them, it is better to discuss the necessary design changes in a central place.

Some missing features:

The first group is about adding more flexibility to use different types of connections or to enhance their usage. This requires separating the implementation for establishing a connection from HTTP::Client.
#6001 is an incomplete experiment for that.

The concept of using a configurable transport is employed by many other implementations of HTTP clients:

  • Go: RoundTripper (an interface, request -> response) and Transport (the implementation, including proxy support, connection reuse, HTTP/2, and much of the basic HTTP workflow, such as redirects).
  • Java: com.google.api.client.http.HttpTransport builds a request (method, url -> request) and request.execute() returns the response.
  • Ruby: Faraday implements a middleware interface similar to HTTP::Handler.
  • Ruby: Hurley has a connection property, which is a lambda (request -> response).

It makes great sense to have HTTP::Client act as a high-level interface that delegates the heavy lifting to a replaceable low-level API. Users should be able to configure HTTP::Client to use a specific transport.

The default transport should allow connecting to any TCP server and maintain a connection pool to reuse idle sockets, even beyond the lifespan of a single client instance. Other transports could connect to a proxy or a Unix socket, or just mock a connection for testing purposes.

An open question is how the interface between HTTP::Client and the transport should be designed. In the experimental PR #6001 the transport returns an IO for a specific request, but the client still writes the request and reads the response.
I tend towards making the interface as simple as Request -> Response. This would obviously move most of the implementation of HTTP::Client to the transport, but I don't think it's a bad thing to move the internal heavy lifting to the lower level. The interface is also more versatile because it allows the transport to manipulate the request before writing and the response after reading. This is necessary for proxy requests, for example, because they need the full URL as the resource. An additional benefit is that requests and responses don't necessarily need to be sent through an IO: a transport could, for example, implement an in-memory client cache or directly create responses for non-HTTP protocols such as file.
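To illustrate, here is a minimal sketch of such a Request -> Response interface. Everything in it (the `Transport` class, its `call` method, `MockTransport`) is hypothetical and not part of the stdlib:

```crystal
require "http"

# Hypothetical transport interface: turn a request into a response.
# How that happens (TCP, proxy, Unix socket, in-memory) is entirely
# up to the implementation.
abstract class HTTP::Transport
  abstract def call(request : HTTP::Request) : HTTP::Client::Response
end

# A transport that never touches the network, e.g. for testing.
class MockTransport < HTTP::Transport
  def call(request : HTTP::Request) : HTTP::Client::Response
    HTTP::Client::Response.new(HTTP::Status::OK, "stubbed response for #{request.path}")
  end
end
```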

Such a simple interface would also make it easy to chain transports for more flexibility and might be a good place to inject middleware.
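Chaining could then be as simple as one transport wrapping another. Again purely a sketch, building on the hypothetical `Transport` interface above:

```crystal
# Hypothetical middleware-style transport: logs each request, then
# delegates to the wrapped transport.
class LoggingTransport < HTTP::Transport
  def initialize(@inner : HTTP::Transport)
  end

  def call(request : HTTP::Request) : HTTP::Client::Response
    start = Time.monotonic
    response = @inner.call(request)
    elapsed = Time.monotonic - start
    puts "#{request.method} #{request.path}: #{response.status_code} (#{elapsed})"
    response
  end
end

# transport = LoggingTransport.new(MockTransport.new)
```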

This brings us to the second group of missing, more high-level features. Some HTTP clients, such as Faraday, allow injecting custom middleware into the request-response chain, which can implement, for example, authentication or following redirects.

I am convinced that essential features such as following redirects, basic auth etc. should not require a middleware handler, but should be implemented directly in either the low- or high-level API. They need to be optional and configurable, but I wouldn't like to see an HTTP::Client setup using a herd of middleware handlers just to provide basic HTTP. Middleware should only be used for features that sit on top (for example OAuth) and can't reasonably be included in the default client implementation.

Related: #1330 (socket factory as transport)

@jwoertink (Contributor)

Yes, that proxy support would be amazing!

@rishavs commented Nov 8, 2018

@straight-shoota Just a suggestion: can you use checkboxes instead of bullet points in your OP? It would make it easier to follow the progress on individual points instead of having to check each item individually.

BTW, really love this list of proposed improvements. Proxy and HTTP/2 support are the biggest ones IMO. For basic auth, I am more used to the idea of using a middleware, à la Node, but I'd be fine with handling it in a separate layer.

@straight-shoota (Member, Author)

@rishavs We're currently talking about the rough design. The implementation of individual features is still a long way off.

@straight-shoota (Member, Author)

But I'd like to see this discussion moving forward.

In my opinion, the exposed API of HTTP::Client needs to be dead simple but built on a complex backend. It should be as easy as running curl or accessing a website in a browser. In almost all cases, the application doesn't care about how an HTTP response is acquired. Does it create a new TCP connection or reuse an existing one? Is it a simple HTTP/1.1 exchange or multiplexed through HTTP/2 -- at some point even HTTP/3? HTTP::Client should simply use the best available method supported by the server.
This is obviously a very sophisticated goal and will take a long time to reach, most likely till after 1.0.

Maybe this complex implementation doesn't even need to be the default -- or even included in the stdlib at all. But the stdlib HTTP::Client API should be designed in such a way that the transport adapter is easily exchangeable. Its interface should be simple (Request -> Response) and shouldn't assume anything about connection details. This makes it easy to improve the implementation step by step. The start would be a more or less simple, one-shot connection implementation for TCP/IP. It might even keep the connection alive and reuse it if the next request is for the same host; otherwise, a new connection would be established. Proxy and Unix socket support are further, relatively easy feature improvements. A big step later would be a reusable connection pool and HTTP/2. Eventually even HTTP/3.
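For illustration only, client construction under such a design could look something like this; the `transport` argument and all three transport classes are made-up names, not a concrete proposal:

```crystal
# Hypothetical usage: the client doesn't know or care how the
# transport produces its responses.
client  = HTTP::Client.new("example.com", transport: PooledTransport.new(max_idle: 10))
docker  = HTTP::Client.new("localhost", transport: UnixSocketTransport.new("/var/run/docker.sock"))
proxied = HTTP::Client.new("example.com", transport: ProxyTransport.new("proxy.internal", 3128))
```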

@PercussiveElbow

Has there been any movement recently on UNIX socket support?

@straight-shoota (Member, Author)

Not yet, unfortunately. But I'd like to move on with this. Possibly for 0.29.0.

@m-o-e (Contributor) commented Dec 17, 2019

> But I'd like to see this discussion moving forward.

+1 from me (a happy Crystal user who started missing persistent HTTP connections today 😞).

> It should be as easy as running curl

Perhaps it could make sense to rebuild HTTP::Client on top of libcurl?

I don't know how the core team would feel about introducing an external dependency for something as central as HTTP, but in my experience I end up reaching for the curl binding in most languages anyway (e.g. Typhoeus or PycURL), because the stdlib client usually lacks features and/or performance.

@straight-shoota (Member, Author)

@m-o-e I think this has been brought up in the discussion before, and it probably won't work to integrate libcurl into Crystal's event loop.

@m-o-e (Contributor) commented Dec 17, 2019

@straight-shoota Ah okay, understood. Sorry for the duplicate!

@yorci commented Jun 2, 2020

It would be great to have outgoing IP binding in HTTP::Client, which TCPSocket already supports. It's really useful when working with APIs whose requests-per-hour limits are based on IP.

@rishavs commented Oct 15, 2020

Are any of these part of the 1.0 plan?

@asterite (Member)

@rishavs no

@erdnaxeli (Contributor) commented Nov 23, 2020

> In almost all cases, the application doesn't care about how a HTTP response is acquired. Does it create a new TCP connection or re-use an existing one?

For this specific point (reusing connections) I think the application should be aware of what is happening. Sockets are a limited resource, and apps should control how they use them. Coming from Python (where the most used HTTP lib is requests), I think a concept of a session would be good:

```crystal
session = HTTP::Session.new
session.get("https://crystal-lang.org")
```

Inside a session, calls reuse connections (if possible), but simple calls with HTTP::Client should not.
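As a rough sketch of the idea, such a session could keep one client per host and reuse it. Everything below is hypothetical API, not a proposal for the exact implementation:

```crystal
require "http/client"

# Hypothetical session that keeps one HTTP::Client per host and reuses
# its connection. A real implementation would key on scheme, host and
# port, and would need to be concurrency-safe.
class HTTP::Session
  @clients = {} of String => HTTP::Client

  def get(url : String) : HTTP::Client::Response
    uri = URI.parse(url)
    client = @clients[uri.host.not_nil!] ||= HTTP::Client.new(uri)
    client.get(uri.request_target)
  end

  # Close all pooled connections when done.
  def close
    @clients.each_value(&.close)
  end
end
```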

@straight-shoota
Copy link
Member Author

Most apps really don't need to care about that. It's good to have a low-level API for managing connections manually. But for typical use cases you just want to send HTTP requests from different parts of an application without having to drag a session object around. HTTP connection management can easily be abstracted away.

@erdnaxeli (Contributor) commented Nov 24, 2020

But if we do that I am worried about an app leaking sockets without control over them…

I went and looked at how others do it: Ruby doesn't do it by default, Python doesn't either, Go does it by default (you must instantiate a client to change it).

So OK, why not: the default behavior would be to persist connections, and if you want to change that, you would need to instantiate a client object and change its parameters.

@didactic-drunk (Contributor)

Leaking sockets is a real problem. See taylorfinnell/awscr-s3#77. In this case the library could use a pool because all requests go to the same endpoint.

The same isn't true for the hundreds of one-off cases where a developer has a list of URLs and tries:

```crystal
urls.map { |url| HTTP::Client... }...
```

Often it will work in testing with a small list of URLs, then fail sporadically in production.

I think connection reuse should be opt-in rather than opt-out. Whether through a session or another API I don't know, but in real use I'm hitting socket issues from third-party libraries with no way to control them.
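For reference, with the current API the safe version of such a loop closes each client deterministically, for example via the block form of `HTTP::Client.new`, which closes the client when the block exits (the URLs here are made up):

```crystal
require "http/client"

urls = ["https://example.com/a", "https://example.com/b"]

urls.each do |url|
  uri = URI.parse(url)
  # The block form yields the client and closes it (and its socket)
  # when the block returns, so nothing leaks.
  HTTP::Client.new(uri) do |client|
    response = client.get(uri.request_target)
    puts "#{url}: #{response.status_code}"
  end
end
```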

@straight-shoota (Member, Author)

If you run one-off requests with HTTP::Client's class methods, the connection should be properly closed automatically. That's expected behaviour. If you happen to observe socket leakage, it's a bug.
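The two existing styles side by side, as a minimal example of the current stdlib API (not the proposed redesign):

```crystal
require "http/client"

# One-shot class method: opens a connection, performs the request,
# and closes the connection before returning.
response = HTTP::Client.get("https://example.com")
puts response.status_code

# Instance: keeps its connection open for reuse until explicitly closed.
client = HTTP::Client.new("example.com", tls: true)
response = client.get("/")
client.close
```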

mloughran added a commit to mloughran/twirp.cr that referenced this issue Jul 27, 2022
Ideally Crystal's HTTP client would manage a pool of connections and reuse them when appropriate, but it does not – see crystal-lang/crystal#6011 – so using a new TCP connection for each request seems like the safest approach.

The previous implementation was hopelessly broken due to the `@client.close`, which resulted in `Closed stream (IO::Error)` errors when used concurrently... However, IIRC, without the explicit close, other issues arise when TCP sockets time out of their own accord.

@crysbot commented May 10, 2024

This issue has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/avoiding-closuring-a-proc-thats-an-instance-variable/6834/6

@crysbot commented Jul 21, 2024

This issue has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/pooling-http-connections-automatically/7030/1
