`connect_retries` and `connect_timeout` parameters don't have an effect #778

henlue · 2024-08-27T16:30:50Z

Describe the bug

The connect_retries and connect_timeout parameters in profiles.yml don't have the effect described in the docs.

The retry functionality seems to be implemented, but the list of exceptions that trigger a retry is empty by default (here and here). Setting retry_all: true configures the connector to retry on all exceptions, which gives connect_retries and connect_timeout their documented effects, but the retry_all parameter is not documented. I only found it while reading the code.
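A minimal sketch of the behavior described above, not the adapter's actual code. The function `connect_with_retries`, its signature, and the helper `count_attempts` are hypothetical; only the parameter names connect_retries, connect_timeout, retryable_exceptions and retry_all come from this report. It shows why an empty exception list means the first failure is never retried, however high connect_retries is set.

```python
import time


def connect_with_retries(connect, connect_retries=1, connect_timeout=0.0,
                         retryable_exceptions=(), retry_all=False):
    """Hypothetical retry wrapper, written only to illustrate the report."""
    if retry_all:
        # Assumption: retry_all widens the retryable set to every exception.
        retryable_exceptions = (Exception,)
    attempts = 0
    while True:
        attempts += 1
        try:
            return connect()
        except tuple(retryable_exceptions):
            # With an empty tuple this clause matches nothing, so the first
            # failure propagates and connect_retries is never consulted.
            if attempts > connect_retries:
                raise
            time.sleep(connect_timeout)


def count_attempts(**kwargs):
    """Count how often a connection that always fails is attempted."""
    calls = []

    def always_fails():
        calls.append(1)
        raise ConnectionError("warehouse not reachable")

    try:
        connect_with_retries(always_fails, **kwargs)
    except ConnectionError:
        pass
    return len(calls)


default_attempts = count_attempts(connect_retries=3)
retry_all_attempts = count_attempts(connect_retries=3, retry_all=True)
```

In this sketch the default configuration makes a single attempt, while retry_all=True makes the initial attempt plus three retries.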

Depending on the desired behavior I see various ways to fix this:

1. Keep the existing behavior and update the documentation, for example by documenting the retry_all parameter.
2. Change the existing behavior to match the documentation. For example:
   a) Add transient exceptions to the retryable_exceptions list
   b) Set retry_all to true by default (not sure about the side effects though)
   c) Forward the connect_retries and connect_timeout parameters to the databricks sql connector, if this is possible.
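The difference between options (a) and (b) can be sketched as follows. This is an illustration under assumptions, not the adapter's code: `should_retry` and `TRANSIENT_EXCEPTIONS` are hypothetical names, and the built-in exception types stand in for whatever the adapter and the connector actually raise.

```python
# Hypothetical stand-ins for the exceptions a real connection attempt raises.
TRANSIENT_EXCEPTIONS = (ConnectionError, TimeoutError)


def should_retry(exc, retryable_exceptions=(), retry_all=False):
    """Return True if a failed connection attempt would be retried."""
    if retry_all:
        # Option (b): every failure is retried, transient or not.
        return True
    # Option (a): only listed exception types are retried.
    # An empty tuple never matches, which models the current default.
    return isinstance(exc, tuple(retryable_exceptions))


timeout = TimeoutError("warehouse still starting")
bad_config = ValueError("invalid token")  # assumed non-transient failure

current_default = should_retry(timeout)
option_a_transient = should_retry(timeout, TRANSIENT_EXCEPTIONS)
option_a_config = should_retry(bad_config, TRANSIENT_EXCEPTIONS)
option_b_config = should_retry(bad_config, retry_all=True)
```

Under these assumptions, option (b) would also retry failures that cannot succeed on a later attempt, which is one possible side effect of enabling retry_all by default.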

I would be willing to implement a fix or to take a deeper look into the implications of the various fixes I've described.

Steps To Reproduce

I've created a profiles.yml with invalid connection parameters and a high number of connect_retries:

databricks:
  outputs:
    test:
      type: databricks
      host: invalid
      http_path: invalid
      token: invalid
      schema: schema
      connect_retries: 1000

I then executed dbt run.

Expected behavior

I expect 1000 retries. Instead, dbt tries to establish the connection for 15 minutes and then fails, just as it does when connect_retries is set to 1.

System information

The output of dbt --version:

Core:
  - installed: 1.8.4
  - latest:    1.8.5 - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - spark:      1.8.0 - Up to date!
  - databricks: 1.8.3 - Update available!

The operating system you're using:
Ubuntu 22.04
The output of python --version:
Python 3.10.12

Additional context

We use a classic warehouse on Azure for our daily jobs. By default, dbt databricks tries for 15 minutes to establish a connection to the warehouse, but sometimes this is not long enough for the warehouse to start.


caineblood · 2024-08-27T16:34:53Z

The above comment asking you to download a file is malware to steal your account; do not under any circumstances download or run it. The post needs to be removed. If you have attempted to run it please have your system cleaned and your account secured immediately.

benc-db · 2024-09-12T15:37:56Z

Thanks for the report. Quite a few users have been asking about retries lately, so I think I'll need to look into it.

henlue added the bug label on Aug 27, 2024
