Skip to content

Commit

Permalink
lib: Curl_read/Curl_write clarifications
Browse files Browse the repository at this point in the history
- replace `Curl_read()`, `Curl_write()` and `Curl_nwrite()` to
  clarify when and at what level they operate
- send/recv of transfer related data is now done via
  `Curl_xfer_send()/Curl_xfer_recv()` which no longer has
  socket/socketindex as parameter. It decides on the transfer
  setup of `conn->sockfd` and `conn->writesockfd` on which
  connection filter chain to operate.
- send/recv on a specific connection filter chain is done via
  `Curl_conn_send()/Curl_conn_recv()` which get the socket index
  as parameter.
- rename `Curl_setup_transfer()` to `Curl_xfer_setup()` for
  naming consistency
- clarify that the special CURLE_AGAIN hangling to return
  `CURLE_OK` with length 0 only applies to `Curl_xfer_send()`
  and CURLE_AGAIN is returned by all other send() variants.
- fix a bug in websocket `curl_ws_recv()` that mixed up data
  when it arrived in more than a single chunk (to be made
  into a sperate PR, also)

Added as documented [in
CLIENT-READER.md](https://github.com/curl/curl/blob/5b1f31dfbab8aef467c419c68aa06dc738cb75d4/docs/CLIENT-READERS.md).

- old `Curl_buffer_send()` completely replaced by new `Curl_req_send()`
- old `Curl_fillreadbuffer()` replaced with `Curl_client_read()`
- HTTP chunked uploads are now formatted in a client reader added when
  needed.
- FTP line-end conversions are done in a client reader added when
  needed.
- when sending requests headers, remaining buffer space is filled with
  body data for sending in "one go". This is independent of the request
  body size. Resolves curl#12938 as now small and large requests have the
  same code path.

Changes done to test cases:

- test513: now fails before sending request headers as this initial
  "client read" triggers the setup fault. Behaves now the same as in
  hyper build
- test547, test555, test1620: fix the length check in the lib code to
  only fail for reads *smaller* than expected. This was a bug in the
  test code that never triggered in the old implementation.

Closes curl#12969
  • Loading branch information
icing authored and bagder committed Feb 28, 2024
1 parent 8d67c61 commit 9369c30
Show file tree
Hide file tree
Showing 26 changed files with 1,188 additions and 1,018 deletions.
93 changes: 93 additions & 0 deletions docs/CLIENT-READERS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
# curl client readers

Client readers is a design in the internals of libcurl, not visible in its public API. They were started
in curl v8.7.0. This document describes the concepts, its high level implementation and the motivations.

## Naming

`libcurl` operates between clients and servers. A *client* is the application using libcurl, like the command line tool `curl` itself. Data to be uploaded to a server is **read** from the client and **sent** to the server, the servers response is **received** by `libcurl` and then **written** to the client.

With this naming established, client readers are concerned with providing data from the application to the server. Applications register callbacks via `CURLOPT_READFUNCTION`, data via `CURLOPT_POSTFIELDS` and other options to be used by `libcurl` when the request is send.

## Invoking

The transfer loop that sends and receives, is using `Curl_client_read()` to get more data to send for a transfer. If no specific reader has been installed yet, the default one that uses `CURLOPT_READFUNCTION` is added. The prototype is

```
CURLcode Curl_client_read(struct Curl_easy *data, char *buf, size_t blen,
size_t *nread, bool *eos);
```
The arguments are the transfer to read for, a buffer to hold the read data, its length, the actual number of bytes placed into the buffer and the `eos` (*end of stream*) flag indicating that no more data is available. The `eos` flag may be set for a read amount, if that amount was the last. That way curl can avoid to read an additional time.

The implementation of `Curl_client_read()` uses a chain of *client reader* instances to get the data. This is similar to the design of *client writers*. The chain of readers allows processing of the data to send.

The definition of a reader is:

```
struct Curl_crtype {
const char *name; /* writer name. */
CURLcode (*do_init)(struct Curl_easy *data, struct Curl_creader *writer);
CURLcode (*do_read)(struct Curl_easy *data, struct Curl_creader *reader,
char *buf, size_t blen, size_t *nread, bool *eos);
void (*do_close)(struct Curl_easy *data, struct Curl_creader *reader);
bool (*needs_rewind)(struct Curl_easy *data, struct Curl_creader *reader);
};
struct Curl_creader {
const struct Curl_crtype *crt; /* type implementation */
struct Curl_creader *next; /* Downstream reader. */
Curl_creader_phase phase; /* phase at which it operates */
};
```

`Curl_creader` is a reader instance with a `next` pointer to form the chain. It as a type `crt` which provides the implementation. The main callback is `do_read()` which provides the data to the caller. The others are for setup and tear down. `needs_rewind()` is explained further below.

## Phases and Ordering

Since client readers may transform the data being read through the chain, the order in which they are called is relevant for the outcome. When a reader is created, it gets the `phase` property in which it operates. Reader phases are defined like:

```
typedef enum {
CURL_CR_NET, /* data send to the network (connection filters) */
CURL_CR_TRANSFER_ENCODE, /* add transfer-encodings */
CURL_CR_PROTOCOL, /* before transfer, but after content decoding */
CURL_CR_CONTENT_ENCODE, /* add content-encodings */
CURL_CR_CLIENT /* data read from client */
} Curl_creader_phase;
```

If a reader for phase `PROTOCOL` is added to the chain, it is always added *after* any `NET` or `TRANSFER_ENCODE` readers and *before* and `CONTENT_ENCODE` and `CLIENT` readers. If there is already a reader for the same phase, the new reader is added before the existing one(s).

### Example: `chunked` reader

In `http_chunks.c` a client reader for chunked uploads is implemented. This one operates at phase `CURL_CR_TRANSFER_ENCODE`. Any data coming from the reader "below" has the HTTP/1.1 chunk handling applied and returned to the caller.

When this reader sees an `eos` from below, it generates the terminal chunk, adding trailers if provided by the application. When that last chunk is fully returned, it also sets `eos` to the caller.

### Example: `lineconv` reader

In `sendf.c` a client reader that does line-end conversions is implemented. It operates at `CURL_CR_CONTENT_ENCODE` and converts any "\n" to "\r\n". This is used for FTP ASCII uploads or when the general `crlf` options has been set.

### Example: `null` reader

Implemented in `sendf.c` for phase `CURL_CR_CLIENT`, this reader has the simple job of providing transfer bytes of length 0 to the caller, immediately indicating an `eos`. This reader is installed by HTTP for all GET/HEAD requests and when authentication is being negotiated.

### Example: `buf` reader

Implemented in `sendf.c` for phase `CURL_CR_CLIENT`, this reader get a buffer pointer and a length and provides exactly these bytes. This one is used in HTTP for sending `postfields` provided by the application.

## Request retries

Sometimes it is necessary to send a request with client data again. Transfer handling can inquire via `Curl_client_read_needs_rewind()` if a rewind (e.g. a reset of the client data) is necessary. This asks all installed readers if they need it and give `FALSE` of none does.

## Summary and Outlook

By adding the client reader interface, any protocol can control how/if it wants the curl transfer to send bytes for a request. The transfer loop becomes then blissfully ignorant of the specifics.

The protocols on the other hand no longer have to care to package data most efficiently. At any time, should more data be needed, it can be read from the client. This is used when sending HTTP requests headers to add as much request body data to the initial sending as there is room for.

Future enhancements based on the client readers:
* delegate the actual "rewinding" to the readers. The should know how it is done, eliminating the `readrewind.c` protocol specifics in `multi.c`.
* `expect-100` handling: place that into a HTTP specific reader at `CURL_CR_PROTOCOL` and eliminate the checks in the generic transfer parts.
* `eos` detection: `upload_done` is partly triggered now by comparing the number of bytes sent to a known size. This is no longer necessary since the core readers obey length restrictions.
* `eos forwarding`: transfer should forward an `eos` flag to the connection filters. Filters like HTTP/2 and HTTP/3 can make use of that, terminating streams early. This would also eliminate length checks in stream handling.
1 change: 1 addition & 0 deletions docs/Makefile.am
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ EXTRA_DIST = \
CODE_OF_CONDUCT.md \
CODE_REVIEW.md \
CODE_STYLE.md \
CLIENT-READERS.md \
CLIENT-WRITERS.md \
CONNECTION-FILTERS.md \
CONTRIBUTE.md \
Expand Down
21 changes: 21 additions & 0 deletions lib/bufq.c
Original file line number Diff line number Diff line change
Expand Up @@ -417,6 +417,17 @@ ssize_t Curl_bufq_write(struct bufq *q,
return nwritten;
}

CURLcode Curl_bufq_cwrite(struct bufq *q,
const char *buf, size_t len,
size_t *pnwritten)
{
ssize_t n;
CURLcode result;
n = Curl_bufq_write(q, (const unsigned char *)buf, len, &result);
*pnwritten = (n < 0)? 0 : (size_t)n;
return result;
}

ssize_t Curl_bufq_read(struct bufq *q, unsigned char *buf, size_t len,
CURLcode *err)
{
Expand All @@ -440,6 +451,16 @@ ssize_t Curl_bufq_read(struct bufq *q, unsigned char *buf, size_t len,
return nread;
}

CURLcode Curl_bufq_cread(struct bufq *q, char *buf, size_t len,
size_t *pnread)
{
ssize_t n;
CURLcode result;
n = Curl_bufq_read(q, (unsigned char *)buf, len, &result);
*pnread = (n < 0)? 0 : (size_t)n;
return result;
}

bool Curl_bufq_peek(struct bufq *q,
const unsigned char **pbuf, size_t *plen)
{
Expand Down
7 changes: 7 additions & 0 deletions lib/bufq.h
Original file line number Diff line number Diff line change
Expand Up @@ -178,6 +178,10 @@ ssize_t Curl_bufq_write(struct bufq *q,
const unsigned char *buf, size_t len,
CURLcode *err);

CURLcode Curl_bufq_cwrite(struct bufq *q,
const char *buf, size_t len,
size_t *pnwritten);

/**
* Read buf from the start of the buffer queue. The buf is copied
* and the amount of copied bytes is returned.
Expand All @@ -187,6 +191,9 @@ ssize_t Curl_bufq_write(struct bufq *q,
ssize_t Curl_bufq_read(struct bufq *q, unsigned char *buf, size_t len,
CURLcode *err);

CURLcode Curl_bufq_cread(struct bufq *q, char *buf, size_t len,
size_t *pnread);

/**
* Peek at the head chunk in the buffer queue. Returns a pointer to
* the chunk buf (at the current offset) and its length. Does not
Expand Down
102 changes: 28 additions & 74 deletions lib/c-hyper.c
Original file line number Diff line number Diff line change
Expand Up @@ -117,8 +117,6 @@ size_t Curl_hyper_send(void *userp, hyper_context *ctx,
DEBUGF(infof(data, "Curl_hyper_send(%zu)", buflen));
result = Curl_conn_send(data, io_ctx->sockindex,
(void *)buf, buflen, &nwrote);
if(!result && !nwrote)
result = CURLE_AGAIN;
if(result == CURLE_AGAIN) {
DEBUGF(infof(data, "Curl_hyper_send(%zu) -> EAGAIN", buflen));
/* would block, register interest */
Expand Down Expand Up @@ -659,48 +657,13 @@ static CURLcode request_target(struct Curl_easy *data,
return result;
}

static int uploadpostfields(void *userdata, hyper_context *ctx,
hyper_buf **chunk)
{
struct Curl_easy *data = (struct Curl_easy *)userdata;
(void)ctx;
if(data->req.exp100 > EXP100_SEND_DATA) {
if(data->req.exp100 == EXP100_FAILED)
return HYPER_POLL_ERROR;

/* still waiting confirmation */
if(data->hyp.exp100_waker)
hyper_waker_free(data->hyp.exp100_waker);
data->hyp.exp100_waker = hyper_context_waker(ctx);
return HYPER_POLL_PENDING;
}
if(data->req.upload_done)
*chunk = NULL; /* nothing more to deliver */
else {
/* send everything off in a single go */
hyper_buf *copy = hyper_buf_copy(data->set.postfields,
(size_t)data->req.p.http->postsize);
if(copy)
*chunk = copy;
else {
data->state.hresult = CURLE_OUT_OF_MEMORY;
return HYPER_POLL_ERROR;
}
/* increasing the writebytecount here is a little premature but we
don't know exactly when the body is sent */
data->req.writebytecount += (size_t)data->req.p.http->postsize;
Curl_pgrsSetUploadCounter(data, data->req.writebytecount);
data->req.upload_done = TRUE;
}
return HYPER_POLL_READY;
}

static int uploadstreamed(void *userdata, hyper_context *ctx,
hyper_buf **chunk)
{
size_t fillcount;
struct Curl_easy *data = (struct Curl_easy *)userdata;
CURLcode result;
bool eos;
(void)ctx;

if(data->req.exp100 > EXP100_SEND_DATA) {
Expand All @@ -714,32 +677,15 @@ static int uploadstreamed(void *userdata, hyper_context *ctx,
return HYPER_POLL_PENDING;
}

if(data->req.upload_chunky && data->req.authneg) {
fillcount = 0;
data->req.upload_chunky = FALSE;
result = CURLE_OK;
}
else {
result = Curl_fillreadbuffer(data, data->set.upload_buffer_size,
&fillcount);
}
result = Curl_client_read(data, data->state.ulbuf,
data->set.upload_buffer_size,
&fillcount, &eos);
if(result) {
data->state.hresult = result;
return HYPER_POLL_ERROR;
}
if(!fillcount) {
if((data->req.keepon & KEEP_SEND_PAUSE) != KEEP_SEND_PAUSE)
/* done! */
*chunk = NULL;
else {
/* paused, save a waker */
if(data->hyp.send_body_waker)
hyper_waker_free(data->hyp.send_body_waker);
data->hyp.send_body_waker = hyper_context_waker(ctx);
return HYPER_POLL_PENDING;
}
}
else {

if(fillcount) {
hyper_buf *copy = hyper_buf_copy((uint8_t *)data->state.ulbuf, fillcount);
if(copy)
*chunk = copy;
Expand All @@ -751,8 +697,19 @@ static int uploadstreamed(void *userdata, hyper_context *ctx,
don't know exactly when the body is sent */
data->req.writebytecount += fillcount;
Curl_pgrsSetUploadCounter(data, data->req.writebytecount);
return HYPER_POLL_READY;
}
else if(eos) {
*chunk = NULL;
return HYPER_POLL_READY;
}
else {
/* paused, save a waker */
if(data->hyp.send_body_waker)
hyper_waker_free(data->hyp.send_body_waker);
data->hyp.send_body_waker = hyper_context_waker(ctx);
return HYPER_POLL_PENDING;
}
return HYPER_POLL_READY;
}

/*
Expand All @@ -773,7 +730,7 @@ static CURLcode bodysend(struct Curl_easy *data,
else {
hyper_body *body;
Curl_dyn_init(&req, DYN_HTTP_REQUEST);
result = Curl_http_req_send(data, &req, httpreq);
result = Curl_http_req_complete(data, &req, httpreq);

if(!result)
result = Curl_hyper_header(data, headers, Curl_dyn_ptr(&req));
Expand All @@ -782,18 +739,15 @@ static CURLcode bodysend(struct Curl_easy *data,

body = hyper_body_new();
hyper_body_set_userdata(body, data);
if(data->set.postfields)
hyper_body_set_data_func(body, uploadpostfields);
else {
result = Curl_get_upload_buffer(data);
if(result) {
hyper_body_free(body);
return result;
}
/* init the "upload from here" pointer */
data->req.upload_fromhere = data->state.ulbuf;
hyper_body_set_data_func(body, uploadstreamed);
result = Curl_get_upload_buffer(data);
if(result) {
hyper_body_free(body);
return result;
}
/* init the "upload from here" pointer */
data->req.upload_fromhere = data->state.ulbuf;
hyper_body_set_data_func(body, uploadstreamed);

if(HYPERE_OK != hyper_request_set_body(hyperreq, body)) {
/* fail */
result = CURLE_OUT_OF_MEMORY;
Expand Down Expand Up @@ -888,7 +842,7 @@ CURLcode Curl_http(struct Curl_easy *data, bool *done)
may be parts of the request that is not yet sent, since we can deal with
the rest of the request in the PERFORM phase. */
*done = TRUE;
Curl_cw_reset(data);
Curl_client_reset(data);

/* Add collecting of headers written to client. For a new connection,
* we might have done that already, but reuse
Expand Down
2 changes: 1 addition & 1 deletion lib/cf-h1-proxy.c
Original file line number Diff line number Diff line change
Expand Up @@ -1013,7 +1013,7 @@ static CURLcode cf_h1_proxy_connect(struct Curl_cfilter *cf,
data->req.header = TRUE; /* assume header */
data->req.bytecount = 0;
data->req.ignorebody = FALSE;
Curl_cw_reset(data);
Curl_client_reset(data);
Curl_pgrsSetUploadCounter(data, 0);
Curl_pgrsSetDownloadCounter(data, 0);

Expand Down
12 changes: 3 additions & 9 deletions lib/file.c
Original file line number Diff line number Diff line change
Expand Up @@ -293,10 +293,10 @@ static CURLcode file_upload(struct Curl_easy *data)
CURLcode result = CURLE_OK;
char *xfer_buf;
size_t xfer_blen;
char *uphere_save;
curl_off_t bytecount = 0;
struct_stat file_stat;
const char *sendbuf;
bool eos = FALSE;

/*
* Since FILE: doesn't do the full init, we need to provide some extra
Expand Down Expand Up @@ -340,21 +340,16 @@ static CURLcode file_upload(struct Curl_easy *data)
data->state.resume_from = (curl_off_t)file_stat.st_size;
}

/* Yikes! Curl_fillreadbuffer uses data->req.upload_fromhere to READ
* client data to! Please, someone fix... */
uphere_save = data->req.upload_fromhere;

result = Curl_multi_xfer_buf_borrow(data, &xfer_buf, &xfer_blen);
if(result)
goto out;

while(!result) {
while(!result && !eos) {
size_t nread;
ssize_t nwrite;
size_t readcount;

data->req.upload_fromhere = xfer_buf;
result = Curl_fillreadbuffer(data, xfer_blen, &readcount);
result = Curl_client_read(data, xfer_buf, xfer_blen, &readcount, &eos);
if(result)
break;

Expand Down Expand Up @@ -401,7 +396,6 @@ static CURLcode file_upload(struct Curl_easy *data)
out:
close(fd);
Curl_multi_xfer_buf_release(data, xfer_buf);
data->req.upload_fromhere = uphere_save;

return result;
}
Expand Down
Loading

0 comments on commit 9369c30

Please sign in to comment.