Use standard HTTP mechanism to surface truncated query responses #13492

@abhishekagarwal87 abhishekagarwal87 commented Dec 5, 2022

Description

Currently, executing a query can sometimes return truncated results. This happens when the server starts streaming the response to the client but fails before the whole response has been streamed back. The most common cause is a query timeout: if the results are not all streamed back within the timeout period, the response gets cut off. The client still sees a 200 OK response and tries to parse the truncated JSON.

We have built some workarounds for this. None of them is generic; each one is implemented per result format (CSV, Array, Object, etc.) and query protocol (SQL/JSON). We append some newline characters, and if the client doesn't see those newline characters at the end, it considers the response truncated. This, however, doesn't work with clients that only speak standard HTTP, such as curl.
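To illustrate why this convention is invisible to a generic HTTP client, here is a minimal sketch of the client-side check it requires. The trailer value (`"\n\n"`) and the class and method names are assumptions made for illustration only; the actual trailer depends on the result format and query protocol.

```java
public class TrailerCheckSketch {
    // ASSUMPTION: the exact trailer is format-specific; "\n\n" is only a placeholder.
    static final String ASSUMED_TRAILER = "\n\n";

    /** Returns true if the body ends with the assumed trailer, i.e. it looks complete. */
    public static boolean looksComplete(String body) {
        return body.endsWith(ASSUMED_TRAILER);
    }
}
```

A client has to know about the convention and run a check like this on every response. A tool such as curl has no knowledge of it and reports success either way.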

Chunked transfer encoding already provides a way to detect truncated responses. The server sends a final terminating chunk, and the client uses that terminating chunk to confirm that the response is complete. Jetty is supposed to do the same, but for some reason it sends a terminating chunk even when the response is incomplete. I verified this by running Wireshark locally and inspecting the frames.

For a truncated response
Screenshot 2022-12-05 at 1 08 19 PM

For a complete response

Screenshot 2022-12-05 at 1 09 07 PM
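As a rough illustration of the check a standard HTTP client performs, the sketch below walks a raw chunked body and reports whether it ends with the zero-length terminating chunk. It is a simplified reading of the chunked framing (it ignores trailer fields), and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ChunkedBodyCheck {
    /** Returns true only if the body reaches the zero-length terminating chunk. */
    public static boolean isComplete(InputStream in) throws IOException {
        while (true) {
            String sizeLine = readLine(in);
            if (sizeLine == null) {
                return false; // stream ended before a terminating chunk was seen
            }
            // Drop optional chunk extensions after ';' and parse the hex size.
            int semi = sizeLine.indexOf(';');
            String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
            int size;
            try {
                size = Integer.parseInt(hex, 16);
            } catch (NumberFormatException e) {
                return false; // malformed size line
            }
            if (size < 0) {
                return false;
            }
            if (size == 0) {
                return true; // terminating chunk (trailer fields are ignored here)
            }
            // Skip the chunk data plus the CRLF that follows it.
            for (int i = 0; i < size + 2; i++) {
                if (in.read() < 0) {
                    return false; // truncated in the middle of a chunk
                }
            }
        }
    }

    /** Convenience overload for a raw body held in a string. */
    public static boolean isComplete(String rawBody) {
        try {
            return isComplete(new ByteArrayInputStream(rawBody.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads one CRLF-terminated line; returns null if EOF arrives first. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = in.read()) >= 0) {
            if (b == '\n') {
                return sb.toString();
            }
            if (b != '\r') {
                sb.append((char) b);
            }
        }
        return null;
    }
}
```

With this framing, a body that the server cuts off without a terminating chunk is reported as incomplete, which is the behavior the fix relies on.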

But then, I also found this commit that clearly has a fix and even a test case to verify the behavior we want.

After a bit more digging, I realized that we close the ResultWriter in the SqlResource try-block, which in turn closes the response output stream. That effectively signals the end of the response, and the timeout exception in the query is thrown only afterward. At that point, throwing an exception likely has no effect on the response bytes being sent by Jetty.

To fix this, I have disabled auto-closing of the target output stream in all of the writers used in SqlResource. I have also updated the docs of ResultWriter#close to state explicitly that implementations must not close the output stream.
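The idea can be sketched as follows. This is not the actual Druid code: `LineWriter` and `TrackingStream` are hypothetical stand-ins for a result writer and the response output stream, and the exception simulates a query timeout. The point is that the writer's `close()` flushes but leaves the underlying stream open, so a failure surfacing from the try-block still finds the response unterminated.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class NonClosingWriterSketch {
    /** Hypothetical stand-in for a result writer; not Druid's ResultWriter. */
    static final class LineWriter implements Closeable {
        private final OutputStream out;

        LineWriter(OutputStream out) {
            this.out = out;
        }

        void writeRow(String row) throws IOException {
            out.write((row + "\n").getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void close() throws IOException {
            out.flush(); // flush buffered rows, but do NOT close the target stream
        }
    }

    /** Hypothetical response stream that records whether close() was called. */
    static final class TrackingStream extends FilterOutputStream {
        boolean closed = false;

        TrackingStream(OutputStream out) {
            super(out);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Returns whether the response stream was already closed when the failure surfaced. */
    public static boolean streamClosedBeforeFailure() {
        TrackingStream response = new TrackingStream(new ByteArrayOutputStream());
        try (LineWriter writer = new LineWriter(response)) {
            writer.writeRow("row-1");
            throw new IllegalStateException("simulated query timeout");
        } catch (IllegalStateException | IOException e) {
            // try-with-resources has already run writer.close() at this point.
            return response.closed;
        }
    }
}
```

If `LineWriter.close()` called `out.close()` instead, `streamClosedBeforeFailure()` would return `true`, which mirrors the original problem: the response is terminated normally before the error is seen.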

Here is what I get using a curl command to issue a query

Before the fix

< Vary: Accept-Encoding, User-Agent
< Content-Encoding: gzip
< Transfer-Encoding: chunked
<
{ [13 bytes data]
100 1295k    0 1295k  100   255   221k     43  0:00:05  0:00:05 --:--:--  5789
* Connection #0 to host localhost left intact

After the fix

< Vary: Accept-Encoding, User-Agent
< Content-Encoding: gzip
< Transfer-Encoding: chunked
<
{ [13 bytes data]
100 1443k    0 1443k  100   255   172k     30  0:00:08  0:00:08 --:--:--     0* transfer closed with outstanding read data remaining
100 1443k    0 1443k  100   255   172k     30  0:00:08  0:00:08 --:--:--     0
* Closing connection 0
curl: (18) transfer closed with outstanding read data remaining

Release note

Key changes

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

abhishekagarwal87 commented Dec 5, 2022

I haven't tested the router yet, but I think the router should just work without any changes. I will test that before I merge this PR.

@abhishekagarwal87 abhishekagarwal87 changed the title Use a more standard way to surface truncated query responses Use standard HTTP mechanism to surface truncated query responses Dec 5, 2022
@abhishekagarwal87 abhishekagarwal87 marked this pull request as draft July 12, 2023 12:45
This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions bot added the stale label Jan 13, 2024
This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

Successfully merging this pull request may close these issues.

Scan query gracefully closing connections which results in partial data read from datasource