Add explicit heartbeat interval for GetOperationStatus #385

benc-db · 2024-04-05T20:47:41Z

From my investigation, the current interval between GetOperationStatus requests appears to be enforced by the server: the delay falls between sending a request and receiving its response, and debug logging for urllib3 shows no retries. This PR adds an entry point for configuring how long the client waits between successive GetOperationStatus requests after a 200 response. I'm initially proposing 25 seconds, so that, added to the 5s server-side delay, we poll for statement completion every 30s.
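To make the proposal concrete, here is a minimal sketch of a polling loop with a client-side wait between status requests. It is not the code in this PR: `wait_until_finished`, the `get_operation_status` callable, and the `"RUNNING"` state string are hypothetical names used for illustration. Only the constant name and its value come from the diff.

```python
import time
from typing import Callable, Tuple

# Name and value taken from the diff in this PR.
DEFAULT_STATEMENT_HEARTBEAT_INTERVAL = float(25)


def wait_until_finished(
    get_operation_status: Callable[[], str],
    heartbeat_interval: float = DEFAULT_STATEMENT_HEARTBEAT_INTERVAL,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, int]:
    """Poll a (hypothetical) status callable until it stops reporting RUNNING.

    Returns the final state and the number of status requests made.
    The `sleep` parameter is injectable so the loop can be tested without waiting.
    """
    polls = 0
    while True:
        state = get_operation_status()  # stands in for one GetOperationStatus call
        polls += 1
        if state != "RUNNING":
            return state, polls
        # Client-side wait before the next request; any server-side
        # hold on the request itself comes on top of this.
        sleep(heartbeat_interval)
```

Passing a fake `sleep` that records its arguments makes it easy to check how many waits a given sequence of statuses triggers.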

github-actions · 2024-04-05T20:47:56Z

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Signed-off-by: Ben Cassell <ben.cassell@databricks.com>

kravets-levko · 2024-04-05T21:20:36Z

src/databricks/sql/thrift_backend.py

@@ -55,6 +55,7 @@

 TIMESTAMP_AS_STRING_CONFIG = "spark.thriftserver.arrowBasedRowSet.timestampAsString"
 DEFAULT_SOCKET_TIMEOUT = float(900)
+DEFAULT_STATEMENT_HEARTBEAT_INTERVAL = float(25)


To me, a 25-second delay between GetOperationStatus requests looks a bit too long. If query execution finishes sooner than this delay, the client will still wait out the full 25 seconds. In Node.js we poll for operation status at a 100ms interval, which may be too low but looks far more reasonable.

100ms? Do you actually see that in practice? Today we wait 0, and the server holds the request for up to 5 seconds. It seems weird to care about subsecond latency on queries that we know take more than 5 seconds.
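The trade-off in this thread can be worked through with a small model. This is a sketch under stated assumptions, not measured behavior: it assumes the server holds each status request for up to 5 seconds and answers as soon as the query finishes, and that the client then sleeps for a fixed interval before asking again. `detection_time` and its parameters are hypothetical names.

```python
def detection_time(
    query_duration: float,
    server_hold: float = 5.0,    # assumed server-side hold per request
    client_sleep: float = 25.0,  # client-side wait proposed in this PR
) -> float:
    """Return the time at which the client learns the query has finished.

    Model (an assumption, not verified server behavior): each request is held
    by the server for up to `server_hold` seconds and returns early if the
    query completes during the hold.
    """
    t = 0.0  # time at which the next status request is sent
    while True:
        if query_duration <= t + server_hold:
            # The query is done before the hold expires: the server replies
            # at once, or when the query finishes, whichever is later.
            return max(t, query_duration)
        # Otherwise the reply says "still running" after the full hold,
        # and the client sleeps before sending the next request.
        t += server_hold + client_sleep


print(detection_time(4.0))                    # finishes within the first hold
print(detection_time(6.0))                    # just misses the first hold
print(detection_time(6.0, client_sleep=0.0))  # no client-side wait
```

Under this model a 6-second query is noticed at 30 seconds with a 25-second client wait, and at 6 seconds with no client wait, which is the extra latency the review comment is pointing at.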

benc-db requested review from rcypher-databricks, yunbodeng-db, andrefurlan-db and jackyhu-db as code owners April 5, 2024 20:47

jackyhu-db approved these changes Apr 5, 2024

add heartbeat interval

c4e6cef

Signed-off-by: Ben Cassell <ben.cassell@databricks.com>

benc-db force-pushed the investigate_get_op_status branch from 6812ff5 to c4e6cef Compare April 5, 2024 20:51

benc-db had a problem deploying to azure-prod April 5, 2024 20:51 — with GitHub Actions Failure

kravets-levko reviewed Apr 5, 2024

benc-db marked this pull request as draft April 5, 2024 21:31

benc-db closed this Apr 5, 2024
