Move the database configuration to a new section #22284

Merged on Apr 11, 2022 (23 commits)

Commits
- b05a841 move the database configuration to a new section (gitstart, Mar 27, 2022)
- 1b78621 Merge branch 'main' into airflow-15930 (gitstart, Mar 28, 2022)
- 9d9b80c merge maian (gitstart, Mar 29, 2022)
- 3c7bf73 Merge branch 'main' into airflow-15930 (kelechi2020, Mar 29, 2022)
- 705edee Merge commit '56235d939a9eda89241da7cde7e45985eb5b0786' into airflow-… (gitstart, Mar 30, 2022)
- 45a352e update section key pair (gitstart, Mar 30, 2022)
- a645f4a fix backward compatibility issues (gitstart, Mar 31, 2022)
- a4ed524 Merge branch 'apache:main' into airflow-15930 (gitstart, Mar 31, 2022)
- e25055a fix static check failure (gitstart, Apr 1, 2022)
- 04ad107 change section order in config.yml to fix static check (gitstart, Apr 4, 2022)
- 0c264c9 Merge branch 'apache:main' into airflow-15930 (gitstart, Apr 4, 2022)
- 8629898 fix indentation error on .yml file (gitstart, Apr 5, 2022)
- 1a65c7d fix indentation error on .yml file (gitstart, Apr 5, 2022)
- 6fa3499 Merge branch 'apache:main' into airflow-15930 (gitstart, Apr 5, 2022)
- c57a8b7 Merge commit '34154803ac73d62d3e969e480405df3073032622' into airflow-… (gitstart, Apr 6, 2022)
- 7257aca fix failing tests (gitstart, Apr 6, 2022)
- 22a24a6 fix test failure (gitstart, Apr 7, 2022)
- 1d1bb46 Merge branch 'apache:main' into airflow-15930 (gitstart, Apr 7, 2022)
- f2ccec3 Merge branch 'main' into airflow-15930 (gitstart, Apr 8, 2022)
- 8cb776d fix syntax for static checks (gitstart, Apr 8, 2022)
- dd44916 Merge branch 'main' into airflow-15930 (gitstart, Apr 11, 2022)
- cd5d673 Update breeze (gitstart, Apr 11, 2022)
- 2a06c21 Merge branch 'main' into airflow-15930 (gitstart, Apr 11, 2022)
2 changes: 1 addition & 1 deletion CHANGELOG.txt
@@ -2090,7 +2090,7 @@ Improvements
 - [AIRFLOW-5583] Extend the 'DAG Details' page to display the start_date / end_date (#6235)
 - [AIRFLOW-6250] Ensure on_failure_callback always has a populated context (#6812)
 - [AIRFLOW-6222] http hook logs response body for any failure (#6779)
-- [AIRFLOW-6260] Drive _cmd config option by env var (``AIRFLOW__CORE__SQL_ALCHEMY_CONN_CMD`` for example) (#6801)
+- [AIRFLOW-6260] Drive _cmd config option by env var (``AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_CMD`` for example) (#6801)
 - [AIRFLOW-6168] Allow proxy_fix middleware of webserver to be configurable (#6723)
 - [AIRFLOW-5931] Use os.fork when appropriate to speed up task execution. (#6627)
 - [AIRFLOW-4145] Allow RBAC roles permissions, ViewMenu to be over-rideable (#4960)
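
The changed changelog line shows the environment-variable form of a config option. As a rough sketch only (the helper `env_var_name` is made up for illustration and is not part of Airflow), the `AIRFLOW__{SECTION}__{KEY}` pattern visible in that line, with its optional `_CMD` suffix, could be reproduced like this:

```python
def env_var_name(section: str, key: str, cmd: bool = False) -> str:
    """Build an Airflow-style environment variable name (hypothetical helper)."""
    name = f"AIRFLOW__{section.upper()}__{key.upper()}"
    # The changelog entry shows a _CMD variant for options driven by a command.
    return f"{name}_CMD" if cmd else name


# Name used before this PR and the name that replaces it.
old_name = env_var_name("core", "sql_alchemy_conn", cmd=True)
new_name = env_var_name("database", "sql_alchemy_conn", cmd=True)
print(old_name)
print(new_name)
```

Only the section part of the name changes; the key and the `_CMD` suffix stay the same.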
17 changes: 17 additions & 0 deletions UPDATING.md
@@ -243,6 +243,23 @@ Smart sensors, an "early access" feature added in Airflow 2, are now deprecated

 See [Migrating to Deferrable Operators](https://airflow.apache.org/docs/apache-airflow/2.2.4/concepts/smart-sensors.html#migrating-to-deferrable-operators) for details on how to migrate.

+### Database configuration moved to new section
+
+The following configuration options have been moved from `[core]` to the new `[database]` section. When a new option is read, the old `[core]` option is also checked; if it exists, a DeprecationWarning is issued and the old option is used instead.
+
Review comment (Member):
Information that they can still be accessed via the "core" section, but will raise a deprecation warning, would be useful to add here.

Review comment (Member):
This is a NIT, and it is good without it. Feel free to add it if you want, though.

+- sql_alchemy_conn
+- sql_engine_encoding
+- sql_engine_collation_for_ids
+- sql_alchemy_pool_enabled
+- sql_alchemy_pool_size
+- sql_alchemy_max_overflow
+- sql_alchemy_pool_recycle
+- sql_alchemy_pool_pre_ping
+- sql_alchemy_schema
+- sql_alchemy_connect_args
+- load_default_connections
+- max_db_retries
+
 ## Airflow 2.2.3

 No breaking changes.
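
To make the fallback described in the UPDATING note concrete, here is a minimal sketch using the standard-library `configparser`. It follows the wording of the note (a leftover `[core]` value is used and a `DeprecationWarning` is raised) and is not Airflow's actual implementation; `MOVED_OPTIONS`, `get_database_option`, and the sample values are assumptions for illustration.

```python
import configparser
import warnings

# A few of the options listed as moved from [core] to [database].
MOVED_OPTIONS = {"sql_alchemy_conn", "sql_alchemy_pool_size", "max_db_retries"}


def get_database_option(parser: configparser.ConfigParser, key: str) -> str:
    """Read a [database] option, honouring a leftover [core] value (hypothetical helper)."""
    if key in MOVED_OPTIONS and parser.has_option("core", key):
        warnings.warn(
            f"[core] {key} is deprecated; move it to [database]",
            DeprecationWarning,
            stacklevel=2,
        )
        return parser.get("core", key)
    return parser.get("database", key)


parser = configparser.ConfigParser()
parser.read_string(
    """
[core]
sql_alchemy_pool_size = 20

[database]
sql_alchemy_conn = sqlite:////tmp/airflow.db
sql_alchemy_pool_size = 5
"""
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Still present under [core]: the old value is returned, with a warning.
    pool_size = get_database_option(parser, "sql_alchemy_pool_size")
    # Only present under [database]: read directly, no warning.
    conn = get_database_option(parser, "sql_alchemy_conn")
```

In this sketch, moving the remaining `[core]` entry into `[database]` is what silences the warning.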
2 changes: 1 addition & 1 deletion airflow/cli/commands/info_command.py
@@ -221,7 +221,7 @@ def get_fullname(o):
     def _airflow_info(self):
         executor = configuration.conf.get("core", "executor")
         sql_alchemy_conn = self.anonymizer.process_url(
-            configuration.conf.get("core", "SQL_ALCHEMY_CONN", fallback="NOT AVAILABLE")
+            configuration.conf.get("database", "SQL_ALCHEMY_CONN", fallback="NOT AVAILABLE")
         )
         dags_folder = self.anonymizer.process_path(
             configuration.conf.get("core", "dags_folder", fallback="NOT AVAILABLE")
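
The call changed in this hunk passes an upper-case key together with a `fallback`. Airflow's own `conf` object is not reproduced here; as a hedged illustration with the standard-library `configparser` (assuming Airflow's config object behaves similarly for these two features), option names are matched case-insensitively and the `fallback` is returned when the option cannot be found. The connection string is a made-up sample value.

```python
import configparser

parser = configparser.ConfigParser()
parser.read_string("[database]\nsql_alchemy_conn = sqlite:////tmp/airflow.db\n")

# Option names are lower-cased on lookup by default, so the upper-case key matches.
found = parser.get("database", "SQL_ALCHEMY_CONN", fallback="NOT AVAILABLE")

# A missing option (here even the section is missing) yields the fallback
# instead of raising an exception.
missing = parser.get("core", "SQL_ALCHEMY_CONN", fallback="NOT AVAILABLE")

print(found)
print(missing)
```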
2 changes: 1 addition & 1 deletion airflow/cli/commands/standalone_command.py
@@ -157,7 +157,7 @@ def calculate_env(self):
             executor_constants.LOCAL_EXECUTOR,
             executor_constants.SEQUENTIAL_EXECUTOR,
         ]:
-            if "sqlite" in conf.get("core", "sql_alchemy_conn"):
+            if "sqlite" in conf.get("database", "sql_alchemy_conn"):
                 self.print_output("standalone", "Forcing executor to SequentialExecutor")
                 env["AIRFLOW__CORE__EXECUTOR"] = executor_constants.SEQUENTIAL_EXECUTOR
             else:
250 changes: 127 additions & 123 deletions airflow/config_templates/config.yml
@@ -60,111 +60,6 @@
       type: string
       example: ~
       default: "SequentialExecutor"
-    - name: sql_alchemy_conn
-      description: |
-        The SqlAlchemy connection string to the metadata database.
-        SqlAlchemy supports many different database engines.
-        More information here:
-        http://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html#database-uri
-      version_added: ~
-      type: string
-      sensitive: true
-      example: ~
-      default: "sqlite:///{AIRFLOW_HOME}/airflow.db"
-    - name: sql_alchemy_engine_args
-      description: |
-        Extra engine specific keyword args passed to SQLAlchemy's create_engine, as a JSON-encoded value
-      version_added: 2.3.0
-      type: string
-      sensitive: true
-      example: '{"arg1": True}'
-      default: ~
-    - name: sql_engine_encoding
-      description: |
-        The encoding for the databases
-      version_added: 1.10.1
-      type: string
-      example: ~
-      default: "utf-8"
-    - name: sql_engine_collation_for_ids
-      description: |
-        Collation for ``dag_id``, ``task_id``, ``key`` columns in case they have different encoding.
-        By default this collation is the same as the database collation, however for ``mysql`` and ``mariadb``
-        the default is ``utf8mb3_bin`` so that the index sizes of our index keys will not exceed
-        the maximum size of allowed index when collation is set to ``utf8mb4`` variant
-        (see https://github.com/apache/airflow/pull/17603#issuecomment-901121618).
-      version_added: 2.0.0
-      type: string
-      example: ~
-      default: ~
-    - name: sql_alchemy_pool_enabled
-      description: |
-        If SqlAlchemy should pool database connections.
-      version_added: ~
-      type: string
-      example: ~
-      default: "True"
-    - name: sql_alchemy_pool_size
-      description: |
-        The SqlAlchemy pool size is the maximum number of database connections
-        in the pool. 0 indicates no limit.
-      version_added: ~
-      type: string
-      example: ~
-      default: "5"
-    - name: sql_alchemy_max_overflow
-      description: |
-        The maximum overflow size of the pool.
-        When the number of checked-out connections reaches the size set in pool_size,
-        additional connections will be returned up to this limit.
-        When those additional connections are returned to the pool, they are disconnected and discarded.
-        It follows then that the total number of simultaneous connections the pool will allow
-        is pool_size + max_overflow,
-        and the total number of "sleeping" connections the pool will allow is pool_size.
-        max_overflow can be set to ``-1`` to indicate no overflow limit;
-        no limit will be placed on the total number of concurrent connections. Defaults to ``10``.
-      version_added: 1.10.4
-      type: string
-      example: ~
-      default: "10"
-    - name: sql_alchemy_pool_recycle
-      description: |
-        The SqlAlchemy pool recycle is the number of seconds a connection
-        can be idle in the pool before it is invalidated. This config does
-        not apply to sqlite. If the number of DB connections is ever exceeded,
-        a lower config value will allow the system to recover faster.
-      version_added: ~
-      type: string
-      example: ~
-      default: "1800"
-    - name: sql_alchemy_pool_pre_ping
-      description: |
-        Check connection at the start of each connection pool checkout.
-        Typically, this is a simple statement like "SELECT 1".
-        More information here:
-        https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic
-      version_added: 1.10.6
-      type: string
-      example: ~
-      default: "True"
-    - name: sql_alchemy_schema
-      description: |
-        The schema to use for the metadata database.
-        SqlAlchemy supports databases with the concept of multiple schemas.
-      version_added: 1.10.3
-      type: string
-      example: ~
-      default: ""
-    - name: sql_alchemy_connect_args
-      description: |
-        Import path for connect args in SqlAlchemy. Defaults to an empty dict.
-        This is useful when you want to configure db engine args that SqlAlchemy won't parse
-        in connection string.
-        See https://docs.sqlalchemy.org/en/13/core/engines.html#sqlalchemy.create_engine.params.connect_args
-      version_added: 1.10.11
-      type: string
-      example: ~
-      default: ~
     - name: parallelism
       description: |
         This defines the maximum number of task instances that can run concurrently in Airflow
@@ -212,15 +107,6 @@
       type: string
       example: ~
       default: "True"
-    - name: load_default_connections
-      description: |
-        Whether to load the default connections that ship with Airflow. It's good to
-        get started, but you probably want to set this to ``False`` in a production
-        environment
-      version_added: 1.10.10
-      type: string
-      example: ~
-      default: "True"
     - name: plugins_folder
       description: |
         Path to the folder containing Airflow plugins
@@ -435,15 +321,6 @@
       type: boolean
       example: ~
       default: "True"
-    - name: max_db_retries
-      description: |
-        Number of times the code should be retried in case of DB Operational Errors.
-        Not all transactions will be retried as it can cause undesired state.
-        Currently it is only used in ``DagFileProcessor.process_file`` to retry ``dagbag.sync_to_db``.
-      version_added: 2.0.0
-      type: integer
-      example: ~
-      default: "3"
     - name: hide_sensitive_var_conn_fields
       description: |
         Hide sensitive Variables or Connection extra json keys from UI and task logs when set to True
@@ -480,6 +357,133 @@
       example: ~
       default: "1024"

+- name: database
+  description: ~
+  options:
+    - name: sql_alchemy_conn
+      description: |
+        The SqlAlchemy connection string to the metadata database.
+        SqlAlchemy supports many different database engines.
+        More information here:
+        http://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html#database-uri
+      version_added: 2.3.0
+      type: string
+      sensitive: true
+      example: ~
+      default: "sqlite:///{AIRFLOW_HOME}/airflow.db"
+    - name: sql_alchemy_engine_args
+      description: |
+        Extra engine specific keyword args passed to SQLAlchemy's create_engine, as a JSON-encoded value
+      version_added: 2.3.0
+      type: string
+      sensitive: true
+      example: '{"arg1": True}'
+      default: ~
+    - name: sql_engine_encoding
+      description: |
+        The encoding for the databases
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: "utf-8"
+    - name: sql_engine_collation_for_ids
+      description: |
+        Collation for ``dag_id``, ``task_id``, ``key`` columns in case they have different encoding.
+        By default this collation is the same as the database collation, however for ``mysql`` and ``mariadb``
+        the default is ``utf8mb3_bin`` so that the index sizes of our index keys will not exceed
+        the maximum size of allowed index when collation is set to ``utf8mb4`` variant
+        (see https://github.com/apache/airflow/pull/17603#issuecomment-901121618).
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: ~
+    - name: sql_alchemy_pool_enabled
+      description: |
+        If SqlAlchemy should pool database connections.
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: "True"
+    - name: sql_alchemy_pool_size
+      description: |
+        The SqlAlchemy pool size is the maximum number of database connections
+        in the pool. 0 indicates no limit.
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: "5"
+    - name: sql_alchemy_max_overflow
+      description: |
+        The maximum overflow size of the pool.
+        When the number of checked-out connections reaches the size set in pool_size,
+        additional connections will be returned up to this limit.
+        When those additional connections are returned to the pool, they are disconnected and discarded.
+        It follows then that the total number of simultaneous connections the pool will allow
+        is pool_size + max_overflow,
+        and the total number of "sleeping" connections the pool will allow is pool_size.
+        max_overflow can be set to ``-1`` to indicate no overflow limit;
+        no limit will be placed on the total number of concurrent connections. Defaults to ``10``.
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: "10"
+    - name: sql_alchemy_pool_recycle
+      description: |
+        The SqlAlchemy pool recycle is the number of seconds a connection
+        can be idle in the pool before it is invalidated. This config does
+        not apply to sqlite. If the number of DB connections is ever exceeded,
+        a lower config value will allow the system to recover faster.
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: "1800"
+    - name: sql_alchemy_pool_pre_ping
+      description: |
+        Check connection at the start of each connection pool checkout.
+        Typically, this is a simple statement like "SELECT 1".
+        More information here:
+        https://docs.sqlalchemy.org/en/13/core/pooling.html#disconnect-handling-pessimistic
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: "True"
+    - name: sql_alchemy_schema
+      description: |
+        The schema to use for the metadata database.
+        SqlAlchemy supports databases with the concept of multiple schemas.
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: ""
+    - name: sql_alchemy_connect_args
+      description: |
+        Import path for connect args in SqlAlchemy. Defaults to an empty dict.
+        This is useful when you want to configure db engine args that SqlAlchemy won't parse
+        in connection string.
+        See https://docs.sqlalchemy.org/en/13/core/engines.html#sqlalchemy.create_engine.params.connect_args
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: ~
+    - name: load_default_connections
+      description: |
+        Whether to load the default connections that ship with Airflow. It's good to
+        get started, but you probably want to set this to ``False`` in a production
+        environment
+      version_added: 2.3.0
+      type: string
+      example: ~
+      default: "True"
+    - name: max_db_retries
+      description: |
+        Number of times the code should be retried in case of DB Operational Errors.
+        Not all transactions will be retried as it can cause undesired state.
+        Currently it is only used in ``DagFileProcessor.process_file`` to retry ``dagbag.sync_to_db``.
+      version_added: 2.3.0
+      type: integer
+      example: ~
+      default: "3"
+
 - name: logging
   description: ~
   options:
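
The `sql_alchemy_pool_size` and `sql_alchemy_max_overflow` descriptions in this file imply a simple bound on concurrent connections. The following is a small worked sketch of that arithmetic; the function name is hypothetical, and the rules are taken from the option descriptions in the diff rather than from SQLAlchemy's source.

```python
from typing import Optional


def max_simultaneous_connections(pool_size: int, max_overflow: int) -> Optional[int]:
    """Return the connection cap implied by the descriptions, or None for 'no limit'."""
    # A pool_size of 0 means no limit on pooled connections;
    # a max_overflow of -1 means no limit on overflow connections.
    if pool_size == 0 or max_overflow == -1:
        return None
    return pool_size + max_overflow


# Defaults shown in the diff: pool size 5, max overflow 10.
default_cap = max_simultaneous_connections(5, 10)
# Same pool size, but overflow set to -1 (no overflow limit).
unbounded = max_simultaneous_connections(5, -1)

print(default_cap)
print(unbounded)
```

With the defaults from the diff, the cap works out to 5 + 10 = 15 simultaneous connections, of which at most 5 remain idle in the pool.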