
Fear of creating extra requests? #36

Closed

tino opened this issue Jun 25, 2014 · 4 comments
Comments

tino commented Jun 25, 2014

Hi,

I really like the fact that you are creating a protocol for uploads. Building an upload server and HTML5 client, I struggled a lot with deciding how to handle big, resumable uploads. I finally went with a protocol similar to the Amazon S3 multipart upload. It works really well and I am able to fully saturate the available upload bandwidth with parallel chunks (which is important because files are usually multiple gigabytes in size), pause uploads (also implicitly, when the network connection drops), and resume them. While going through the tickets that are currently open, I made the following observation.

It seems that you want to put every piece of information in as few requests as possible. I see mentions of max-content-length (#24), checksum support (#7), protocol discovery (#29), etc. to be implemented in the first POST request. Isn't this what an OPTIONS request is for?
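For illustration, a rough client-side sketch of what I mean by discovery over OPTIONS. The endpoint URL and header names below are assumptions on my part, not something the protocol defines today:

    # Hypothetical sketch: discovering server capabilities with a single
    # OPTIONS request instead of packing everything into the first POST.
    # The URL and the header names are assumptions, chosen only to
    # illustrate the idea behind #24, #7 and #29.
    import requests

    def discover(server_url):
        resp = requests.options(server_url)
        resp.raise_for_status()
        return {
            # maximum size the server accepts (cf. #24)
            "max_size": resp.headers.get("Max-Size"),
            # optional features such as checksum support (cf. #7)
            "extensions": (resp.headers.get("Extension") or "").split(","),
            # protocol versions the server speaks (cf. #29)
            "versions": (resp.headers.get("Version") or "").split(","),
        }

    if __name__ == "__main__":
        print(discover("https://example.com/files/"))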

Also, finishing the upload currently happens implicitly because the server knows the length of the data to be uploaded up front. Streaming uploads are to be implemented in the future, however, so why not leave room for, or implement, a "finalising" request? I think that is also the right place for a final checksum (or hash tree) to be exchanged, and a place to handle things like Entity-Location (#30). (In my case, all chunks are assembled by a background task; the last request returns the task's status location URL to be consumed by the client.)
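As a sketch of the kind of finalising request I have in mind (every path and header here is made up for illustration; nothing like it exists in the spec):

    # Hypothetical finalising request: the client signals that all data has
    # been sent and hands over a final checksum; the server replies with the
    # location of the background task assembling the chunks, which the
    # client can poll. All paths and headers are invented for this sketch.
    import hashlib
    import requests

    def finalize_upload(upload_url, local_path):
        sha1 = hashlib.sha1()
        with open(local_path, "rb") as f:
            for block in iter(lambda: f.read(1 << 20), b""):
                sha1.update(block)

        resp = requests.post(
            upload_url + "/finalize",                          # made-up endpoint
            headers={"Checksum": "sha1 " + sha1.hexdigest()},  # made-up header
        )
        resp.raise_for_status()
        # e.g. the status URL of the background task (cf. Entity-Location, #30)
        return resp.headers.get("Location")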

We are talking about a resumable upload protocol, so I assume most use cases concern big files, because otherwise a single POST or PUT would suffice. So I can't imagine the extra requests would be of any concern, but correct me if I am making the wrong assumptions.

I can see the beauty in the way the protocol is currently designed (a single start request, the rest managed by headers), but for me that beauty fades away quickly once the features Parallel Chunks, Checksums, and Streams are implemented within the same constraints.

Tino

qsorix commented Jun 25, 2014

@tino, you're making very good points.

My use case does not involve big files. Sizes range from a few kilobytes to tens of megabytes. But they're being sent over a terrible GPRS link, with round-trip times reaching 10 seconds, and frequent disconnections... if it weren't for those disconnections, I would not require resumable uploads, as the link's speed is decent. For small files, another request will significantly impact total time. It is possible to use one protocol for small files and another one for big ones, but that complicates the implementation of the clients... which in my case are small, dumb sensors that I try not to overwhelm with logic.

On the other hand, my clients have plenty of time. If they need to wait because of another request, I can live with that.

Yes, I agree with you, and I would still try to use as few requests as possible where it makes sense. OPTIONS is a perfect example of a request that makes sense on its own, because a client can query the server's options once and assume they won't change for some time, e.g. until the next 4xx error.
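A rough sketch of that caching behaviour, purely to show the idea; the Offset header and the invalidation rule are assumptions, not something the spec prescribes:

    # Sketch: cache the server's advertised options from one OPTIONS call
    # and only forget them after a 4xx response suggests they may be stale.
    import requests

    class Client:
        def __init__(self, server_url):
            self.server_url = server_url
            self._options = None

        def options(self):
            if self._options is None:
                resp = requests.options(self.server_url)
                resp.raise_for_status()
                self._options = dict(resp.headers)
            return self._options

        def patch_chunk(self, upload_url, offset, data):
            resp = requests.patch(
                upload_url,
                headers={"Offset": str(offset)},  # header name is an assumption
                data=data,
            )
            if 400 <= resp.status_code < 500:
                # server options may have changed; drop the cached copy
                self._options = None
            resp.raise_for_status()
            return resp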

vayam (Member) commented Jun 26, 2014

@qsorix is right. Resumable uploads make sense for mobile clients, and chunked uploads for desktops.
The way I see chunked uploads evolving is a /compose or /cat of several individually resumable uploads. It would only build on top of this core protocol.
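Roughly what I mean, as a client-side sketch; the /compose endpoint and its payload are made up for illustration and not part of the core protocol:

    # Hypothetical sketch: several uploads, each resumable on its own,
    # are combined by one final request. Endpoint and payload are invented.
    import requests

    def compose(server_url, part_urls):
        resp = requests.post(
            server_url + "/compose",       # made-up endpoint
            json={"parts": part_urls},     # URLs of the finished partial uploads
        )
        resp.raise_for_status()
        return resp.headers.get("Location")  # URL of the combined resource

    # Usage: each part is uploaded with the normal resumable flow first, then
    # compose("https://example.com/files", ["/files/a", "/files/b", "/files/c"])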

tino (Author) commented Jun 27, 2014

I see two use cases for this protocol in the open issues (the ones you mention above), and I think they conflict a little; they don't entirely need the same features.

  • On one side, there are the relatively small uploads over unreliable connections, as mentioned by @qsorix, which require a slim and fast way to resume an interrupted upload. This was probably the goal tus started with.
  • On the other hand, the features Parallel Chunks and Streams are oriented towards large uploads.

I have the feeling that it would be wrong to build the features for the large uploads on top of the "core" implementation. These features are a lot easier to implement when there is room for, say, a finalising request (with checksum, processing info, etc.), just as the AWS S3 API has a "normal" (PUT) upload method and a multipart method.

So my suggestion would be to keep the PATCH-ing (after getting the offset via a HEAD request) as the core (as it is now), but to switch to a "multipart" upload flow when more features are required (upload expiration, chunking, checksumming, etc.): an initialisation request, one or more chunks, and a finalising request.
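Spelled out as a client-side sketch, that flow could look roughly like this; every path and header below is made up and only mirrors the S3-style init/chunk/finalise shape:

    # Hypothetical "multipart" flow: 1) initialisation request,
    # 2) one or more chunk uploads, 3) finalising request.
    # All paths and headers are invented for this sketch.
    import requests

    CHUNK = 5 * 1024 * 1024  # 5 MiB per chunk, an arbitrary choice

    def multipart_upload(server_url, local_path):
        # 1) initialise the upload and get its URL
        init = requests.post(server_url, headers={"Upload-Mode": "multipart"})
        init.raise_for_status()
        upload_url = init.headers["Location"]

        # 2) send the chunks; each PATCH is independently retryable
        offset = 0
        with open(local_path, "rb") as f:
            while True:
                data = f.read(CHUNK)
                if not data:
                    break
                requests.patch(
                    upload_url,
                    headers={"Offset": str(offset)},
                    data=data,
                ).raise_for_status()
                offset += len(data)

        # 3) finalise: tell the server we are done (checksums etc. could go here)
        done = requests.post(upload_url + "/finalize")
        done.raise_for_status()
        return done.headers.get("Location")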

Acconut (Member) commented Dec 22, 2014

@tino See #29 (comment) for a draft of an OPTIONS request.

Acconut closed this as completed Oct 17, 2015