Connect DuckDB to dbt Core
Some functionality may be limited. If you're interested in contributing, check out the source code for each repository listed below.
- Maintained by: Community
- Authors: Josh Wills (https://github.com/jwills)
- GitHub repo: duckdb/dbt-duckdb
- PyPI package:
dbt-duckdb - Slack channel: #db-duckdb
- Supported dbt Core version: v1.0.1 and newer
- dbt support: Not Supported
- Minimum data platform version: DuckDB 0.3.2
Installing dbt-duckdb
Use pip to install the adapter. Before 1.8, installing the adapter would automatically install dbt-core and any additional dependencies. Beginning in 1.8, installing an adapter does not automatically install dbt-core. This is because adapters and dbt Core versions have been decoupled from each other so we no longer want to overwrite existing dbt-core installations.
Use the following command for installation:
python -m pip install dbt-core dbt-duckdb
Configuring dbt-duckdb
For Duck DB-specific configuration, please refer to Duck DB configs.
Connecting to DuckDB with dbt-duckdb
DuckDB is an embedded database, similar to SQLite, but designed for OLAP-style analytics instead of OLTP. The only configuration parameter that is required in your profile (in addition to type: duckdb) is the path field, which should refer to a path on your local filesystem where you would like the DuckDB database file (and it's associated write-ahead log) to be written. You can also specify the schema parameter if you would like to use a schema besides the default (which is called main).
There is also a database field defined in the DuckDBCredentials class for consistency with the parent Credentials class, but it defaults to main and setting it to be something else will likely cause strange things to happen that cannot be fully predicted, so please avoid changing it.
As of version 1.2.3, you can load any supported DuckDB extensions by listing them in the extensions field in your profile. You can also set any additional DuckDB configuration options via the settings field, including options that are supported in any loaded extensions.
For example, to be able to connect to s3 and read/write parquet files using an AWS access key and secret, your profile would look something like this:
your_profile_name:
target: dev
outputs:
dev:
type: duckdb
path: 'file_path/database_name.duckdb'
extensions:
- httpfs
- parquet
settings:
s3_region: my-aws-region
s3_access_key_id: "{{ env_var('S3_ACCESS_KEY_ID') }}"
s3_secret_access_key: "{{ env_var('S3_SECRET_ACCESS_KEY') }}"
Local storage
When using dbt with DuckDB, the path field in your profiles.yml determines where the DuckDB database file is stored on your local filesystem. You can provide an absolute path or a relative path (which will be resolved relative to your dbt project root).
If you provide a filesystem path ending in .duckdb, DuckDB will:
- Create the file automatically if it does not exist
- Write tables and views into that file when you execute
dbt run - Persist data between runs
The following is an example of a profile configured to create and connect to a local DuckDB database using a relative path:
duckdb_local_storage_test:
target: dev
outputs:
dev:
type: duckdb
path: "./local.duckdb"
threads: 4
This configuration tells DuckDB to create (or reuse) a file named local.duckdb in the directory where you run dbt.
From your project directory, run these commands:
dbt debug
dbt run
ls -lah local.duckdb
After executing dbt run, you should see a file named local.duckdb in your project directory. This file contains the tables and views built by dbt.
If you delete the file:
rm local.duckdb
Running dbt run again will recreate it.
If you use a relative path (for example, ./local.duckdb), the database file is created relative to the directory where you execute dbt. You can also use an absolute path (for example, /Users/yourname/project/local.duckdb) to ensure the database file is always written to a specific location.
Was this page helpful?
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.