Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for HTTP over UNIX sockets in Mojo::UserAgent #1053

Closed
wants to merge 2 commits into from

Conversation

salva
Copy link

@salva salva commented Feb 10, 2017

Summary

Allow Mojo::UserAgent to talk HTTP over UNIX sockets via the new scheme http+unix.

Example:

$ua->get('http+unix://'.url_escape('/var/run/docker.sock').'/version');

It also works for HTTP proxies:

$proxy->http('http+unix://'.url_escape('/var/run/squid.sock'));

Motivation

In my particular case, it was accessing docker listening on an UNIX socket, but it would apply to other services (see for instance, jhthorsen/mojo-redis2#14).

Discussion

The main decision to be taken when adding support for UNIX socket is how to tell the library (i.e. Mojo::UserAgent) that it should use a UNIX socket and then the path to that socket. There are at least two ways for that:

a) provide an extra method to instruct the UA to use an UNIX socket as, for instance, $ua->use_unix_sock($path). This is the approach taken by libcurl.

b) encode that information into the URL in some way.

In my oppinion b is preferable because the HTTP service parameters may come from several layers up on the software stack and that approach doesn't require passing an extra parameter down the stack just to support an uncommon operation.

The next issue is how to encode the socket path into the URL, a tricky business!

I know the following alternatives:

a) The easiest way is to just include the path into the URL as in http+unix:///var/run/docker.sock/version, but it requires touching the file system in order to discover which part is the socket path and which is the HTTP path (in this case /var/run/docker.sock and /version respectively) and what it is worse, it breaks common URL manipulation logic.

b) Encode the HTTP url as a parameter: "http+unix:///$socket_path?".url_escape($http_url).

I don't like this approach because it requires special handling when manipulating the URL. For instance, adding a new parameter involves unpacking $http_url from the URL, adding the parameter into $http_url and finally repacking it into the outer URL again.

c) Pass the socket path in the host slot URL-encoded. This goes against the RFCs but on the other side it seems to work well in practice and the resulting URLs don't require to be handled in special ways. It is the solution implemented in this patch.

d) Adding a subfield into the URL host slot as in http://unix_socket=$url_encoded_path;localhost/$path. This approach conforms to the RFCs, but the subfields support is an obscure feature of the URL specification. Mojo::URL doesn't even support it.

In conclusion, from my point of view, the selected approach works well in practice, without requiring extensive changes in either Mojo or the calling code in order to talk HTTP over UNIX sockets.

Regarding, the code modifications, I have tried to make then minimal.

As the interfaces provided by IO::Socket::IP and IO::Socket::Unix are not completely compatible, I have been forced to add a couple of ugly conditionals that check the $handle class, but on the other hand, I don't thing that would justify adding an adapter class just to get the two to behave identically.

Also, they don't always require the same flow. For instance, UNIX sockets do not require a name resolution step, or waiting for the connection to be established.

References

@kraih
Copy link
Member

kraih commented Feb 10, 2017

Very creative idea, love it! But wouldn't it work even better if you just used the proxy setting like $proxy->http('unix:///var/run/squid.sock'); and started requests normally like $ua->get('http://example.com/test')? Then the Host header would even be correct.

@kraih
Copy link
Member

kraih commented Feb 10, 2017

Regarding tests, i guess the easiest way would be to just add UNIX domain socket support to Mojo::Server::Daemon as well.

@kraih
Copy link
Member

kraih commented Feb 10, 2017

I suppose the squid use case would make $proxy->http('unix:///var/run/squid.sock'); weird, but that could then work with proxy->http('http+unix:///var/run/squid.sock');.

@kraih
Copy link
Member

kraih commented Feb 10, 2017

We would basically be handling UNIX domain sockets like SOCKS.

@salva
Copy link
Author

salva commented Feb 10, 2017

But wouldn't it work even better if you just used the proxy setting like $proxy->http('unix:///var/run/squid.sock'); and started requests normally like $ua->get('http://example.com/test')?

That was my first idea, but I discarded it because I wanted to be able to tell the UA to fetch some random URL that may come from a configuration file (and which in some cases would point to an UNIX socket and in others to a TCP service) without requiring any special handling.

Also, "abusing" the proxy support for that seemed conceptually wrong to me. I may want to use the same UA object to access several services, some over TCP other over UNIX sockets (even if in practice that is not a problem as I can use more than one UA object).

Then the Host header would even be correct.

Yes, currently this is an issue. Not every server likes getting a path in the Host header. But it can be easily worked around forcing a regular Host header. For instance:

$ua->get($http_unix_url, {Host => 'localhost'});

If you feel that is an important issue, then I would advocate for option d: adding the UNIX socket path as a subfield of the userinfo slot. I would have to reread the URI RFC carefully, but I think it should be something like this:

"http://;unix_path=$url_encoded_unix_path\@$hostname/$path"

or when user and/or password fields are required...

"http://$user:$passwd;unix_path=$url_encoded_unix_path\@$hostname/$path"

That allows us tu push both the path and the host name into the URL an it is standard compliant.

Anyway, if you are absolutely convinced the right way to support UNIX sockets is configuring then as proxies using a unix schema I would do it that way.

@salva
Copy link
Author

salva commented Feb 10, 2017

Regarding tests, i guess the easiest way would be to just add UNIX domain socket support to Mojo::Server::Daemon as well.

Well, the easiest way is to use some tool like socat to redirect the traffic from a unix to a tcp service, but I don't know if using an external tool like that would be acceptable for you?

I have not investigated how difficult would be to add support for UNIX sockets in the server side yet...

@kraih
Copy link
Member

kraih commented Feb 10, 2017

No, i'm afraid using external tools is not acceptable for us.

@kraih
Copy link
Member

kraih commented Feb 14, 2017

The latest changes broke tests.

@marcusramberg
Copy link
Member

I think in order for this to make sense it should be supported on both the client side and server side. In general I think the concept is quite useful tho.

@kraih
Copy link
Member

kraih commented Feb 14, 2017

Something i'm curious about, does this work for zero downtime software upgrades with Hypnotoad?

@shadowcat-mst
Copy link
Contributor

Note that if you want to test without adding unix socket support to the mojo servers just yet, I've run hypnotoad against a unix socket by pre-opening it and then setting the live restart env var to the fd.

@shadowcat-mst
Copy link
Contributor

http+unix or unix+http seems like the best way to be able to share a useragent object doing both - I rather like the idea of having proxy supported as well, though. e.g. that would make it easy to use spiped to have egress for http requests on another machine without needing to handle your own TCP security

@kraih
Copy link
Member

kraih commented Feb 14, 2017

I think documenting the current API would help a lot with gathering feedback, because all the different API variations mentioned in the comments can be rather confusing.

@kraih
Copy link
Member

kraih commented Feb 14, 2017

Next week is SUSE Hackweek, so if we can reach consensus on what the API should look like by then, i would have time to clean up and merge the pull request.

@salva
Copy link
Author

salva commented Feb 16, 2017

The latest changes broke tests.

Yes, it seems the tls.t test gets stuck consistently in Travis CI, but I am unable to reproduce the failure on my computer. Tests run fine here... I would try with other perl and OS combinations.

@kraih
Copy link
Member

kraih commented Feb 16, 2017

@salva It was a problem unrelated to this pull request and has been fixed already in Mojolicious and IO::Socket::SSL.

@kraih
Copy link
Member

kraih commented Feb 16, 2017

What's holding back this pull request right now is the lack of documentation for the chosen API.

salva and others added 2 commits February 16, 2017 09:52
Now it is possible to make Mojo listen on a UNIX socket running it as
follows:

  morbo -l http+unix://%2Ftmp%2Fmojo.sock script.pl
@salva
Copy link
Author

salva commented Feb 16, 2017

@salva It was a problem unrelated to this pull request and has been fixed already in Mojolicious and IO::Socket::SSL.

Then let's rebase...

@salva
Copy link
Author

salva commented Feb 16, 2017

What's holding back this pull request right now is the lack of documentation for the chosen API.

yes, yes, I would improve that, and also add tests for the new functionality.

@kraih
Copy link
Member

kraih commented Feb 17, 2017

Tests are not that important, i can add those too if necessary. But documentation is needed so we can discuss the API better.

@shadowcat-mst
Copy link
Contributor

Honestly, a single clear write-up of the API as a comment here would be a start, it can get turned into POD patches once we've finished figuring it out.

Looking at the commits, you seem to have support for http+unix in Mojo::UserAgent, the beginnings of support for listening on unix sockets, and the 'unix socket as proxy' feature sri suggested is totally missed (and I think that would be useful to have as well)

@salva
Copy link
Author

salva commented Feb 18, 2017

There is not much to document: the first patch adds support into Mojo::UserAgent for the new http+unix scheme. Those URIs are of the form as http+unix://$socket_path/$path?$query. The only tricky point is that as the $socket_path part may contain slashes (/) it must be url-escaped. For instance:

my $url = 'http+unix://'.Mojo::Util::url_escape($socket_path);
my $ua = Mojo::UserAgent->new;
$ua->get($url);
$ua->get($url, {Host => 'localhost'}); # for using a more server-friendly Host header

The second patch, adds support for http+unix into Mojo::Server::Daemon (I have tested it with morbo only), so that the following works:

morbo -l http+unix://%2Ftmp%2Fmorbo.sock/ web_app.pl

Or calling it from Perl, for instance:

my @urls = ['http+unix://%2Ftmp%2Fmorbo.sock/',   # %2F is url_escape('/')
            'http://localhost:8888'];
my $daemon = Mojo::Server::Daemon->new(listen => \@urls);

Is that clear enough? otherwise I would appreciate if you could formulate more specific questions and I would try to elaborate on those.

Regarding proxy support, the first patch already allows one to use the http+unix scheme in proxy URLs.

Adding support for the unix scheme as a proxy target which would just divert the connection to a Unix socket shouldn't be too difficult, but I fail to see it as something really useful. What are the use cases not already covered by supporting http+unix? and if you add unix, then, why not tcp too?

@kraih
Copy link
Member

kraih commented Feb 18, 2017

So far there doesn't appear to be much interest from the community in this patch. I don't know if this is because the patch is just very hard to understand, or a general disinterest in the feature. But i'm afraid so far it looks like this proposal will not get the required votes.

my %options = (Blocking => 0);
if (defined $args->{path}) {
$options{Peer} = $args->{path};
require IO::Socket::UNIX; # load it on demand
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you use IO::Socket::UNIX even on Windows?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also do not use that comment style.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you use IO::Socket::UNIX even on Windows?

Yes, IO::Socket::UNIX can be loaded on Windows, the ctor call will fail later.

}
$handle->blocking(0);

$self->_wait('_ready', $handle, $args);
if (defined $args->{path}) {
# UNIX sockets are inmediately ready!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not use that comment style.

$fd = fileno $handle;
my $reuse = $self->{reuse} = join ':', $address, $handle->sockport, $fd;
if (defined $path) {
require IO::Socket::UNIX;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, can't you use IO::Socket::UNIX on Windows?

$tx->remote_address($handle->peerhost)->remote_port($handle->peerport);
if ($handle->isa('IO::Socket::UNIX')) {
$tx->local_address($handle->hostpath)->local_port(-1);
$tx->remote_address($handle->peerpath)->remote_port(-1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a very sketchy decision, should the *_address attributes really contain paths?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the issue here is that those slots doesn't make sense for UNIX sockets, so, If you aim for correctness, probably, setting those to undef is the right thing to do.

After a shallow Inspection of the Mojo code, it seems to me that they are not used by Mojo internally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just correctness, there might also be security implications.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, would you consider leaving then as undef acceptable?


if ($handle->isa('IO::Socket::UNIX')) {
$tx->local_address($handle->hostpath)->local_port(-1);
$tx->remote_address($handle->peerpath)->remote_port(-1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, assigning paths to *_address just seems wrong.

@shadowcat-mst
Copy link
Contributor

Requiring path escaping is not the usual approach to this - using "how far do I get through this path before I get a file" is much more common, and given Mojo's "pass sane data or live with the consequences" approach to things seems fine to me.

@salva
Copy link
Author

salva commented Feb 18, 2017

using "how far do I get through this path before I get a file"

The problem with this approach is that it breaks with common URL manipulation logic. For instance, resolving an absolute link /foo using as base http+unix:///path/to/socket.sock/bar/doz would result into http+unix:///foo when what you want is http+unix:///path/to/socket.sock/foo.

@kraih
Copy link
Member

kraih commented Feb 20, 2017

This week is Hackweek and i have time to work on this, we just need to reach consensus on the API. It's all up to you.

@shadowcat-mst
Copy link
Contributor

@salva ok, that's not a bad reason. Care to write up a proper API proposal?

@kraih
Copy link
Member

kraih commented Feb 20, 2017

Normally i reject pull requests without documentation very quickly, so they can be resubmitted and then have a much higher chance of getting quality feedback. This time i wanted to give it a little more time since i really liked the idea, but that was a mistake. It has been 10 days now and we have yet to see a writeup of the API. Therefore i am rejecting this pull request, in the hopes that it will get resubmitted with more documentation, or that an API proposal gets discussed on IRC/the mailing-list first.

@kraih kraih closed this Feb 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants