feat: add files for dbt materialized views
1 parent d9fe3a2, commit 50e809c
Showing 23 changed files with 741 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
target/ | ||
dbt_modules/ | ||
logs/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
## dbt_labs_materialized_views | ||
|
||
`dbt_labs_materialized_views` is a dbt project containing materializations, helper macros, and some builtin macro overrides that enable use of materialized views in your dbt project. It takes a conceptual approach similar to that of the existing `incremental` materialization: | ||
- In a "full refresh" run, drop and recreate the MV from scratch. | ||
- Otherwise, "refresh" the MV as appropriate. Depending on the database, that could require DML (`refresh`) or no action. | ||
|
||
At any point, if the database object corresponding to a MV model exists instead as a table or standard view, dbt will attempt to drop it and recreate the model from scratch as a materialized view. | ||
|
||
Materialized views vary significantly across databases, as do their current limitations. Be sure to read the documentation for your adapter. | ||
|
||
If you're here, you may also like the [dbt-materialize](https://github.com/MaterializeInc/materialize/tree/main/misc/dbt-materialize) plugin, which enables dbt to materialize models as materialized views in [Materialize](https://materialize.io/). | ||
|
||
## Setup | ||
|
||
### General installation: | ||
|
||
You can install the materialized-view functionality using one of the following methods. | ||
|
||
- Install this project as a package ([package-management docs](https://docs.getdbt.com/docs/building-a-dbt-project/package-management)) | ||
- [Local package](https://docs.getdbt.com/docs/building-a-dbt-project/package-management#local-packages): by referencing the [`materialized-views`](https://github.com/dbt-labs/dbt-labs-experimental-features/tree/master/materialized-views) folder. | ||
- [Git package](https://docs.getdbt.com/docs/building-a-dbt-project/package-management#git-packages) using [project subdirectories](https://docs.getdbt.com/docs/building-a-dbt-project/package-management#git-packages): again by referencing the [`materialized-views`](https://github.com/dbt-labs/dbt-labs-experimental-features/tree/master/materialized-views) folder. | ||
- Copy-paste the files from `macros/` (specifically `default` and your adapter) into your own project. | ||
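As a rough sketch of the two package-based options above, a `packages.yml` could look like the following. The relative path and the `revision` value are assumptions for illustration only; adjust them to your own checkout and pin to a ref you trust. Use one entry, not both.

```yaml
# packages.yml (illustrative sketch; use ONE of the two entries)
packages:
  # Option 1: local package. The relative path is hypothetical and depends on
  # where the dbt-labs-experimental-features repository lives on disk.
  - local: ../dbt-labs-experimental-features/materialized-views

  # Option 2: git package installed from a subdirectory of the repository.
  # - git: "https://github.com/dbt-labs/dbt-labs-experimental-features.git"
  #   subdirectory: "materialized-views"
  #   revision: master  # assumption: replace with a branch, tag, or commit you trust
```

After editing `packages.yml`, run `dbt deps` to install the package.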
|
||
### Extra installation steps for Postgres and Redshift | ||
|
||
The Postgres and Redshift implementations both require overriding the builtin versions of some adapter macros. If you've installed `dbt_labs_materialized_views` as a local package, you can achieve this override by creating a file `macros/*.sql` in your project with the following contents: | ||
|
||
```sql | ||
{# postgres and redshift #} | ||
|
||
{% macro drop_relation(relation) -%} | ||
{{ return(dbt_labs_materialized_views.drop_relation(relation)) }} | ||
{% endmacro %} | ||
|
||
{% macro postgres__list_relations_without_caching(schema_relation) %} | ||
{{ return(dbt_labs_materialized_views.postgres__list_relations_without_caching(schema_relation)) }} | ||
{% endmacro %} | ||
|
||
{% macro postgres_get_relations() %} | ||
{{ return(dbt_labs_materialized_views.postgres_get_relations()) }} | ||
{% endmacro %} | ||
|
||
{# redshift only #} | ||
|
||
{% macro redshift__list_relations_without_caching(schema_relation) %} | ||
{{ return(dbt_labs_materialized_views.redshift__list_relations_without_caching(schema_relation)) }} | ||
{% endmacro %} | ||
|
||
{% macro load_relation(relation) %} | ||
{{ return(dbt_labs_materialized_views.redshift_load_relation_or_mv(relation)) }} | ||
{% endmacro %} | ||
``` | ||
|
||
## Postgres | ||
|
||
- Supported model configs: none | ||
- [docs](https://www.postgresql.org/docs/9.3/rules-materializedviews.html) | ||
|
||
## Redshift | ||
|
||
- Supported model configs: `sort`, `dist`, `auto_refresh` | ||
- [docs](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-overview.html) | ||
- Anecdotally, `refresh materialized view ...` is very slow to run. By contrast, `auto_refresh` runs in the background, with minimal disruption to other workloads, at the risk of some small potential latency. | ||
- ❗ MVs do not support late binding, so if an underlying table is cascade-dropped, the MV will be dropped as well. This would be fine, except that MVs don't include their "true" dependencies in `pg_depend`. Instead, a materialized view claims to depend on a table relation called `mv_tbl__[MV_name]__0`, in place of the name of the true underlying table (https://github.com/awslabs/amazon-redshift-utils/issues/499). As such, dbt's runtime cache is unable to reliably know if a MV has been dropped when it cascade-drops the underlying table. This package requires an override of `load_relation()` to perform a "hard" check (database query of `stv_mv_info`) every time dbt's cache thinks a `materializedview` relation may already exist. | ||
- ❗ MVs do appear in `pg_views`, but the only way to tell that they are materialized views is that the `create materialized view` DDL appears in their `definition`, whereas standard views contain just the SQL without DDL. There's another Redshift system table, `stv_mv_info`, but it can't effectively be joined with `pg_views` because they're different types of system tables. | ||
- ❗ If a column in the underlying table is renamed, or removed and re-added (e.g. varchar widening), the materialized view cannot be refreshed: | ||
``` | ||
Database Error in model test_mv (models/test_mv.sql) | ||
Materialized view test_mv is unrefreshable as a column was renamed for a base table. | ||
compiled SQL at target/run/dbt_labs_experimental_features_integration_tests/test_mv.sql | ||
``` | ||
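For illustration, the Redshift configs listed above could be set in `dbt_project.yml` roughly as follows. The project name, folder name, and column choices are hypothetical; only `materialized_view`, `sort`, `dist`, and `auto_refresh` come from this package's README and test models.

```yaml
# dbt_project.yml fragment (sketch, not a tested configuration)
models:
  my_project:              # hypothetical project name
    redshift_mvs:          # hypothetical folder under models/
      +materialized: materialized_view
      +auto_refresh: true  # let Redshift refresh in the background
      +sort: num           # hypothetical sort column
      +dist: gender        # hypothetical distribution column
```

Given the note above that a manual `refresh materialized view ...` can be slow, `auto_refresh: true` is the option to try first if some latency is acceptable.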
|
||
## BigQuery | ||
|
||
- Supported model configs: `auto_refresh`, `refresh_interval_minutes` | ||
- [docs](https://cloud.google.com/bigquery/docs/materialized-views-intro) | ||
- ❗ Although BQ does not have `drop ... cascade`, if the base table of a MV is dropped and recreated, the MV also needs to be dropped and recreated: | ||
``` | ||
Materialized view dbt-dev-168022:dbt_jcohen.test_mv references table dbt-dev-168022:dbt_jcohen.base_tbl which was deleted and recreated. The view must be deleted and recreated as well. | ||
``` | ||
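A minimal sketch of the two BigQuery configs, assuming they are set at the folder level in `dbt_project.yml`. The project name, folder name, and the 60-minute interval are placeholders, not recommendations.

```yaml
# dbt_project.yml fragment (sketch, not a tested configuration)
models:
  my_project:                         # hypothetical project name
    bigquery_mvs:                     # hypothetical folder under models/
      +materialized: materialized_view
      +auto_refresh: true
      +refresh_interval_minutes: 60   # illustrative value only
```

With `auto_refresh: false`, this package's refresh macro issues a manual refresh call on each non-full-refresh run; otherwise it logs a skip message and takes no action.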
|
||
## Snowflake | ||
|
||
- Supported model configs: `secure`, `cluster_by`, `automatic_clustering`, `persist_docs` (relation only) | ||
- [docs](https://docs.snowflake.com/en/user-guide/views-materialized.html) | ||
- ❗ Note: Snowflake MVs are only enabled on enterprise accounts | ||
- ❗ Although Snowflake does not have `drop ... cascade`, if the base table of a MV is dropped and recreated, the MV also needs to be dropped and recreated; otherwise the following error will appear: | ||
``` | ||
Failure during expansion of view 'TEST_MV': SQL compilation error: Materialized View TEST_MV is invalid. | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
name: 'dbt_labs_materialized_views' | ||
version: '0.2.0' | ||
config-version: 2 | ||
require-dbt-version: ">=1.0.0" | ||
|
||
model-paths: ["models"] | ||
analysis-paths: ["analysis"] | ||
test-paths: ["tests"] | ||
seed-paths: ["seed"] | ||
macro-paths: ["macros"] | ||
snapshot-paths: ["snapshots"] | ||
|
||
target-path: "target" | ||
clean-targets: | ||
- "target" | ||
- "dbt_modules" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
|
||
target/ | ||
dbt_modules/ | ||
logs/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
test-postgres: | ||
	dbt deps | ||
	dbt seed --target postgres --full-refresh | ||
	dbt run --target postgres --full-refresh --vars 'update: false' | ||
	dbt run --target postgres --vars 'update: true' | ||
	dbt test --target postgres | ||
|
||
test-redshift: | ||
	dbt deps | ||
	dbt seed --target redshift --full-refresh | ||
	dbt run --target redshift --full-refresh --vars 'update: false' | ||
	dbt run --target redshift --vars 'update: true' | ||
	sleep 10 # wait for auto refresh | ||
	dbt test --target redshift | ||
|
||
test-snowflake: | ||
	dbt deps | ||
	dbt seed --profile garage-snowflake --full-refresh | ||
	dbt run --profile garage-snowflake --full-refresh --vars 'update: false' | ||
	dbt run --profile garage-snowflake --vars 'update: true' | ||
	dbt test --profile garage-snowflake | ||
|
||
test-bigquery: | ||
	dbt deps | ||
	dbt seed --target bigquery --full-refresh | ||
	dbt run --target bigquery --full-refresh --vars 'update: false' | ||
	dbt run --target bigquery --vars 'update: true' | ||
	dbt test --target bigquery | ||
|
||
test-all: test-postgres test-redshift test-snowflake test-bigquery | ||
	echo "Completed successfully" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
|
||
name: 'dbt_labs_materialized_views_integration_tests' | ||
version: '0.2.0' | ||
config-version: 2 | ||
|
||
profile: 'integration_tests' | ||
|
||
model-paths: ["models"] | ||
analysis-paths: ["analysis"] | ||
test-paths: ["tests"] | ||
seed-paths: ["seed"] | ||
macro-paths: ["macros"] | ||
|
||
target-path: "target" | ||
clean-targets: | ||
- "target" | ||
- "dbt_modules" | ||
|
||
quoting: | ||
  identifier: false | ||
  schema: false | ||
|
||
seeds: | ||
  quote_columns: false |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
{# postgres + redshift #} | ||
|
||
{% macro drop_relation(relation) -%} | ||
{{ return(dbt_labs_materialized_views.drop_relation(relation)) }} | ||
{% endmacro %} | ||
|
||
{% macro postgres__list_relations_without_caching(schema_relation) %} | ||
{{ return(dbt_labs_materialized_views.postgres__list_relations_without_caching(schema_relation)) }} | ||
{% endmacro %} | ||
|
||
{% macro postgres_get_relations() %} | ||
{{ return(dbt_labs_materialized_views.postgres_get_relations()) }} | ||
{% endmacro %} | ||
|
||
{# redshift only #} | ||
|
||
{% macro redshift__list_relations_without_caching(schema_relation) %} | ||
{{ return(dbt_labs_materialized_views.redshift__list_relations_without_caching(schema_relation)) }} | ||
{% endmacro %} | ||
|
||
{% macro load_relation(relation) %} | ||
{% if adapter.type() == 'redshift' %} | ||
{{ return(dbt_labs_materialized_views.redshift_load_relation_or_mv(relation)) }} | ||
{% else %} | ||
{{ return(dbt.load_relation(relation)) }} | ||
{% endif %} | ||
{% endmacro %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
{{config( | ||
materialized = 'incremental', | ||
unique_key = 'id' | ||
)}} | ||
|
||
-- depends on: {{ref('seed_update')}} | ||
-- depends on: {{ref('seed')}} | ||
|
||
{% if is_incremental() %} | ||
|
||
select * from {{ref('seed_update')}} | ||
|
||
{% else %} | ||
|
||
select * from {{ref('seed')}} | ||
|
||
{% endif %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
version: 2 | ||
|
||
models: | ||
  - name: test_mv_manual | ||
    tests: | ||
      - dbt_utils.equality: | ||
          compare_model: ref('expected') | ||
  - name: test_mv_auto | ||
    tests: | ||
      - dbt_utils.equality: | ||
          compare_model: ref('expected') |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{{config( | ||
materialized = 'materialized_view', | ||
auto_refresh = true | ||
)}} | ||
|
||
select | ||
|
||
gender, | ||
count(*) as num | ||
|
||
from {{ref('base_tbl')}} | ||
group by 1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
{{config( | ||
materialized = 'materialized_view', | ||
auto_refresh = false | ||
)}} | ||
|
||
select | ||
|
||
gender, | ||
count(*) as num | ||
|
||
from {{ref('base_tbl')}} | ||
group by 1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
packages: | ||
  - local: ../ | ||
  - package: fishtown-analytics/dbt_utils | ||
    version: 0.6.4 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
gender,num | ||
Female,6 | ||
Male,4 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
id,first_name,last_name,email,gender,ip_address | ||
1,Jacqueline,Hunter,jhunter0@pbs.org,Male,59.80.20.168 | ||
2,Kathryn,Walker,kwalker1@ezinearticles.com,Female,194.121.179.35 | ||
3,Gerald,Ryan,gryan2@com.com,Male,11.3.212.243 | ||
4,Bonnie,Spencer,bspencer3@ameblo.jp,Female,216.32.196.175 | ||
5,Harold,Taylor,htaylor4@people.com.cn,Male,253.10.246.136 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
id,first_name,last_name,email,gender,ip_address | ||
1,Jacqueline,Hunter,jhunter0@pbs.org,Male,59.80.20.168 | ||
2,Kathryn,Walker,kwalker1@ezinearticles.com,Female,194.121.179.35 | ||
3,Gerald,Ryan,gryan2@com.com,Female,11.3.212.243 | ||
4,Bonnie,Spencer,bspencer3@ameblo.jp,Female,216.32.196.175 | ||
5,Harold,Taylor,htaylor4@people.com.cn,Male,253.10.246.136 | ||
6,Jack,Griffin,jgriffin5@t.co,Female,16.13.192.220 | ||
7,Wanda,Arnold,warnold6@google.nl,Female,232.116.150.64 | ||
8,Craig,Ortiz,cortiz7@sciencedaily.com,Male,199.126.106.13 | ||
9,Gary,Day,gday8@nih.gov,Male,35.81.68.186 | ||
10,Rose,Wright,rwright9@yahoo.co.jp,Female,236.82.178.100 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
{% macro bigquery_options() %} | ||
{%- set opts = kwargs -%} | ||
{%- set options -%} | ||
OPTIONS({% for opt_key, opt_val in kwargs.items() if opt_val is not none %} | ||
{{ opt_key }}={{ opt_val }}{{ "," if not loop.last }} | ||
{%- endfor -%}) | ||
{%- endset %} | ||
{%- do return(options) -%} | ||
{%- endmacro -%} | ||
|
||
{% macro bigquery__create_materialized_view_as(relation, sql, config) -%} | ||
|
||
{%- set enable_refresh = config.get('auto_refresh', none) -%} | ||
{%- set refresh_interval_minutes = config.get('refresh_interval_minutes', none) -%} | ||
{%- set sql_header = config.get('sql_header', none) -%} | ||
|
||
{{ sql_header if sql_header is not none }} | ||
|
||
create materialized view {{relation}} | ||
{{ dbt_labs_materialized_views.bigquery_options( | ||
enable_refresh=enable_refresh, | ||
refresh_interval_minutes=refresh_interval_minutes | ||
) }} | ||
as ( | ||
{{sql}} | ||
) | ||
|
||
{% endmacro %} | ||
|
||
|
||
{% macro bigquery__refresh_materialized_view(relation, config) -%} | ||
|
||
{%- set is_auto_refresh = config.get('auto_refresh', true) %} | ||
|
||
{%- if is_auto_refresh == false -%} {# manual refresh #} | ||
|
||
{% set refresh_command %} | ||
call bq.refresh_materialized_view('{{relation|replace("`","")}}') | ||
{% endset %} | ||
|
||
{%- do return(refresh_command) -%} | ||
|
||
{%- else -%} {# automatic refresh #} | ||
|
||
{%- do log("Skipping materialized view " ~ relation ~ " because it is set | ||
to refresh automatically") -%} | ||
|
||
{%- do return(none) -%} | ||
|
||
{%- endif -%} | ||
|
||
{% endmacro %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
{% materialization materialized_view, adapter='bigquery' -%} | ||
|
||
{% set full_refresh_mode = (should_full_refresh()) %} | ||
|
||
{% set target_relation = this %} | ||
{% set existing_relation = load_relation(this) %} | ||
{% set tmp_relation = make_temp_relation(this) %} | ||
|
||
{{ run_hooks(pre_hooks) }} | ||
|
||
{% if existing_relation is none %} | ||
{% set build_sql = dbt_labs_materialized_views.create_materialized_view_as(target_relation, sql, config) %} | ||
{% elif existing_relation.is_view or existing_relation.is_table %} | ||
{#-- Can't replace a table or standard view with a materialized view in place - we must drop it first --#} | ||
{{ log("Dropping relation " ~ target_relation ~ " because it is a " ~ existing_relation.type ~ " and this model is a materialized view.") }} | ||
{% do adapter.drop_relation(existing_relation) %} | ||
{% set build_sql = dbt_labs_materialized_views.create_materialized_view_as(target_relation, sql, config) %} | ||
{% elif full_refresh_mode %} | ||
{#-- create or replace not yet supported for materialized views --#} | ||
{{ log("Dropping relation " ~ target_relation ~ " because replacing an existing materialized view is not supported.") }} | ||
{% do adapter.drop_relation(existing_relation) %} | ||
{% set build_sql = dbt_labs_materialized_views.create_materialized_view_as(target_relation, sql, config) %} | ||
{% else %} | ||
{% set build_sql = dbt_labs_materialized_views.refresh_materialized_view(target_relation, config) %} | ||
{% endif %} | ||
|
||
{% if build_sql %} | ||
{% call statement("main") %} | ||
{{ build_sql }} | ||
{% endcall %} | ||
{% else %} | ||
{{ store_result('main', 'SKIP') }} | ||
{% endif %} | ||
|
||
{{ run_hooks(post_hooks) }} | ||
|
||
{% do persist_docs(target_relation, model) %} | ||
|
||
{{ return({'relations': [target_relation]}) }} | ||
|
||
{%- endmaterialization %} |