Transformations for dbt Core (Beta)
Orchestrate custom data transformations in your destination with Transformations for dbt Core*.
TIP: You can also leverage our Transformations for dbt Core API to fulfill your analytical needs.
Overview
Fivetran integrates with dbt Core to power our transformations. dbt Core, by dbt Labs, is an open-source transformation tool that enables you to perform sophisticated data transformations in your destination using simple SQL statements. With dbt Core, you can:
- Write and test SQL transformations
- Use version control with your transformations
- Create and share documentation about your transformations
Once you have set up dbt Core, you write SQL SELECT statements (a.k.a. "dbt models") in a Git repository to transform your data. dbt Core runs these SQL statements in your destination to build tables and views. dbt Core honors dependencies between your dbt models, so everything is built in the correct order.
To work with dbt Core, you can either use the dbt CLI, a free and open-source command line interface, or dbt Cloud, dbt Labs' hosted service. Fivetran can run dbt projects created with either dbt Cloud or dbt CLI.
There are two types of Transformations for dbt Core:
- Scheduled in Fivetran (recommended): We run your dbt models in your destination according to the schedule that you set in the Fivetran dashboard.
- Scheduled in Code: We run your dbt models in your destination according to the schedule that you set in your dbt project.
Learn how to manage transformations in your Fivetran dashboard.
Scheduled in Fivetran
Fivetran connects to your Git provider and runs your dbt models in your destination according to the schedule that you choose in the Fivetran dashboard. We sync your dbt models from your Git provider every few minutes to ensure that we are up to date.
You create a transformation in the Fivetran dashboard for each dbt model that you want Fivetran to run. Each transformation consists of the following elements:
- Output model: A dbt model that transforms your data so it’s ready for analytics.
- Output model lineage: All upstream models that are needed to produce the output model, starting from your source table references in dbt Core.
- Schedule: A customizable schedule that determines how often Fivetran runs your transformation.
IMPORTANT: Each transformation references a single output model but executes all upstream models during each run.
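To make the relationship between an output model and its lineage concrete, here is a minimal Python sketch. It is not how Fivetran or dbt Core is implemented; the model names and the DEPENDS_ON map are hypothetical, and the ordering uses Python's standard graphlib module as a stand-in for dbt Core's own dependency resolution.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each model -> the models it selects from.
DEPENDS_ON = {
    "stg_orders": set(),
    "stg_customers": set(),
    "stg_tickets": set(),
    "orders_enriched": {"stg_orders", "stg_customers"},
    "customers": {"orders_enriched"},
    "support_summary": {"stg_tickets"},
}


def lineage(output_model, depends_on):
    """Return the output model plus every upstream model it needs."""
    needed, stack = set(), [output_model]
    while stack:
        model = stack.pop()
        if model not in needed:
            needed.add(model)
            stack.extend(depends_on.get(model, ()))
    return needed


def run_order(output_model, depends_on):
    """Order the lineage so each model runs after its upstream models."""
    needed = lineage(output_model, depends_on)
    graph = {m: depends_on.get(m, set()) & needed for m in needed}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    # Models outside the lineage (here, the support models) are not run.
    print(run_order("customers", DEPENDS_ON))
```

In this sketch, a transformation for customers runs four models and always ends with customers itself, while support_summary and stg_tickets are left out because they are not upstream of the output model.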
By default, new transformations have the same schedule as their associated connectors, also known as an integrated schedule. Learn more in the integrated scheduling section.
TIP: If you want to customize your transformation schedule, we recommend that you schedule transformation runs in your Fivetran dashboard. However, you can use a configuration file in your Git repository instead if you prefer.
Integrated scheduling
Fivetran automatically runs your transformation as soon as we update any relevant destination data. Integrated schedules reduce data latency and ensure that your analytics tools reflect new data as quickly as possible. They can also reduce compute costs, since downstream transformations do not run if their associated connector fails to sync.
To run a transformation with integrated scheduling, Fivetran performs the following steps:
- Matches source table references in the dbt models to the source table names written by Fivetran connectors.
- Unifies your pipelines into end-to-end directed graphs.
- Executes the pipelines in order, which minimizes latency on the analytics-ready tables in your destination.
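The matching step above can be sketched roughly as follows. This is an illustration only: the documentation does not describe the matching algorithm, so the lookup by (schema, table) pair, the table names, and the helper upstream_connectors are all assumptions.

```python
# Hypothetical tables written by connectors, keyed by (schema, table).
CONNECTOR_TABLES = {
    ("oracle", "orders"): "oracle",
    ("oracle", "customers"): "oracle",
    ("salesforce", "account"): "salesforce",
}

# Hypothetical source table references declared by each dbt model.
MODEL_SOURCES = {
    "customers": [("oracle", "customers"), ("salesforce", "account")],
    "orders_daily": [("oracle", "orders")],
    "manual_lookup": [("seed_data", "regions")],
}


def upstream_connectors(model):
    """Connectors whose tables the model reads; unmatched references are skipped."""
    refs = MODEL_SOURCES[model]
    return sorted({CONNECTOR_TABLES[ref] for ref in refs if ref in CONNECTOR_TABLES})


if __name__ == "__main__":
    for name in MODEL_SOURCES:
        print(name, upstream_connectors(name))
```

A model that matches more than one connector, such as customers here, is the case where a pipeline needs to wait for several syncs before running the transformation.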
Fivetran transformation pipelines use the following elements:
- The start is the connector sync interval that initiates the pipeline.
- A connector updates source tables in the destination.
- A junction waits for multiple connectors to finish syncing before it triggers a transformation.
- A transformation is a model or a collection of models that updates downstream tables in the destination.
- An output model generates an analytics-ready table. It is usually the final element in a pipeline.
- A test is an assertion that you make about the models in your dbt project. A test may succeed or fail independently of model execution.
Each start node defines its own data pipeline. In the following example, the start node is the connector sync frequency. The oracle connector runs every 15 minutes. When it successfully finishes syncing, that initiates downstream transformations to produce the customers output model.
Fivetran comes with a fixed set of start nodes corresponding to different sync frequencies. When you select a frequency in the dashboard, the pipelines that activate those syncs are aware of overlaps and automatically adjust to them. In the example below, the oracle connector is on a 15-minute schedule, the netsuite connector is on an hourly schedule, and the salesforce connector is on a 24-hour schedule.
- The 15-minute node activates every 15 minutes, except when the 1-hour or 24-hour node activates.
- The 1-hour node activates every hour, except when the 24-hour node activates.
- The 24-hour node activates all three connector syncs.
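The overlap rules above can be written as a small Python sketch. It assumes that all three schedules are aligned to minute 0 of the day and that the 1-hour node triggers both the 15-minute and the hourly connector; neither assumption is stated in the documentation, and active_node and NODE_CONNECTORS are hypothetical names.

```python
def active_node(minutes_since_midnight):
    """Return the start node that fires at this minute, or None if none does."""
    if minutes_since_midnight % (24 * 60) == 0:
        return "24-hour"
    if minutes_since_midnight % 60 == 0:
        return "1-hour"
    if minutes_since_midnight % 15 == 0:
        return "15-minute"
    return None


# Connector syncs triggered by each node, using the connector names from the
# example. The 24-hour entry follows the text; the 1-hour entry is an assumption.
NODE_CONNECTORS = {
    "24-hour": ["oracle", "netsuite", "salesforce"],
    "1-hour": ["oracle", "netsuite"],
    "15-minute": ["oracle"],
}


if __name__ == "__main__":
    for minute in (0, 15, 60, 75, 1440):
        node = active_node(minute)
        print(minute, node, NODE_CONNECTORS.get(node, []))
```

Checking the longest interval first is what makes the shorter nodes stand down whenever a longer one fires at the same moment.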
Custom scheduling
Fivetran runs your transformation according to the schedule you set. If you have enabled the Smart syncing option, when your transformation's schedule overlaps with the associated connector's schedule, we wait until the connector finishes syncing and only then run the transformation.
With custom scheduling, you can:
- Save on warehouse costs by running your transformation less frequently than your connector.
- Ensure that all upstream connectors have synced when you run your transformation by matching its schedule to your slowest upstream connector.
To run a transformation with custom scheduling, Fivetran performs the following steps:
- Matches source table references in the dbt models to the source table names written by Fivetran connectors.
- Unifies your pipelines into end-to-end directed graphs if your upstream connector’s and transformation’s schedules overlap. Otherwise, we create two separate pipelines.
- Runs your transformation according to the schedule you chose.
- If you enabled the Smart syncing option, when the transformation’s schedule overlaps with the upstream connector's schedule, Fivetran waits until the connector finishes syncing and only then runs the transformation.
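The Smart syncing behavior in the last step can be sketched as a single decision. This is a simplified model under stated assumptions, not Fivetran's scheduler: the function transformation_start and its parameters are hypothetical, and an overlap is modeled as a connector sync that is still running at the transformation's scheduled time.

```python
from datetime import datetime


def transformation_start(scheduled_at, connector_sync_end, smart_syncing):
    """Return when the transformation starts.

    connector_sync_end is the time the overlapping connector sync finishes,
    or None if no sync is in progress at the scheduled time.
    """
    overlapping = connector_sync_end is not None and connector_sync_end > scheduled_at
    if smart_syncing and overlapping:
        # Wait for the connector to finish, then run the transformation.
        return connector_sync_end
    return scheduled_at


if __name__ == "__main__":
    scheduled = datetime(2024, 1, 1, 0, 0)
    sync_end = datetime(2024, 1, 1, 0, 20)
    print(transformation_start(scheduled, sync_end, smart_syncing=True))
    print(transformation_start(scheduled, sync_end, smart_syncing=False))
```

With Smart syncing off, the sketch starts the transformation at its scheduled time regardless of any sync in progress, which matches the text only as far as the text goes; what happens to data freshness in that case is not covered here.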
For example, you have an output model, churn, that is expensive to run. You choose to run it every 24 hours, but you don't want to run it until your destination data has been updated. You create a custom schedule so that the transformation runs every 24 hours once all of its associated connectors successfully sync:
NOTE: Fivetran executes pipelines one by one. In the simplest case, a pipeline involving a transformation comprises a connector and an output model. For both Integrated and Custom scheduling, if the connector sync and transformation run are combined within one pipeline, we won't start the next run for the pipeline until the current pipeline run has ended. This can cause perceived delays in connector syncs. For example, if a connector has a 1-hour schedule, but the downstream integrated transformation run takes 2 hours, Fivetran runs the next sync only after the transformation has finished.
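The delay described in this note can be illustrated with a short simulation. It is a sketch under assumptions: times are in minutes, pipeline_min stands for the combined duration of the sync and the transformation, and a run that misses its slot is assumed to start as soon as the previous run ends rather than being queued. The function sync_start_times is hypothetical.

```python
def sync_start_times(interval_min, pipeline_min, horizon_min):
    """Start times (in minutes) of pipeline runs that are never allowed to overlap."""
    starts = []
    next_scheduled = 0  # next slot on the connector's schedule
    free_at = 0         # time the previous pipeline run ends
    while True:
        start = max(next_scheduled, free_at)
        if start >= horizon_min:
            break
        starts.append(start)
        free_at = start + pipeline_min
        # First scheduled slot strictly after this start time.
        next_scheduled = (start // interval_min + 1) * interval_min
    return starts


if __name__ == "__main__":
    # 1-hour schedule, but each pipeline run takes 2 hours in total.
    print(sync_start_times(interval_min=60, pipeline_min=120, horizon_min=360))
    # Same schedule with a 30-minute pipeline run.
    print(sync_start_times(interval_min=60, pipeline_min=30, horizon_min=180))
```

In the first call the connector effectively syncs every two hours despite its 1-hour schedule, which is the perceived delay the note warns about.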
Known limitations
You cannot connect a dbt project that features environment variables.
Scheduled in Code
Fivetran connects to your Git provider and runs your dbt models in your destination according to the schedule that you set in your dbt project's deployment.yml file. We sync your dbt models from your Git provider every few minutes to ensure that we are up to date.
To run a transformation with Scheduled in Code, Fivetran performs the following steps:
1. Verifies the deployment.yml file in your dbt project.
2. Parses the deployment.yml file to create new jobs or update existing jobs.
3. Runs the jobs according to the schedule you specified in your deployment.yml file. For each job run, we do the following:
   i. Prepare an environment with the corresponding dbt CLI version installed, the clean project working directory, and the profile.yml file.
   ii. Execute the dbt deps service command to install the required project packages.
   iii. Execute the steps in each job one-by-one until all scheduled jobs have run.
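The order of commands within a single job run can be sketched as follows. The job dictionary and its field names are hypothetical and are not the deployment.yml schema; the sketch only builds the command list and does not run dbt or prepare the environment.

```python
def plan_job_run(job):
    """Commands for one job run: install packages first, then the job's steps in order."""
    return ["dbt deps"] + [step["command"] for step in job["steps"]]


# Hypothetical job definition, as it might look after the file has been parsed.
nightly = {
    "name": "nightly",
    "steps": [
        {"command": "dbt run"},
        {"command": "dbt test"},
    ],
}


if __name__ == "__main__":
    for command in plan_job_run(nightly):
        print(command)
```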
Known limitations
- You cannot connect a dbt project that features environment variables.
- You cannot integrate a transformation's schedule with a connector schedule.
- You cannot add, edit, or delete a transformation or its schedule on the Fivetran dashboard. You must do so in your dbt project.
- You cannot see links between the dbt test jobs and connectors in the transformation list.
Supported destinations
Fivetran supports Transformations for dbt Core for the following destinations:
- Azure Synapse
- BigQuery
- Databricks
- MySQL - see the limitations
- PostgreSQL
- Redshift
- Snowflake
- SQL Server - see the limitations
Hybrid Deployment limitations
- dbt transformations are not currently supported for Hybrid Deployment destinations.
Databricks on Azure limitations
Databricks on Azure destinations with OAuth authentication don't support Transformations for dbt Core.
MySQL limitations
For MySQL destinations, Fivetran supports only dbt Core v1.0.x.
For Transformations for dbt Core, Fivetran does not support MySQL 5.7, even though Fivetran supports MySQL 5.7 as a destination.
SQL Server limitations
The dbt Core SQL Server adapter only supports SQL Server 2016+, even though Fivetran supports SQL Server 2012+ as a destination.
Fivetran data models
IMPORTANT: To use Fivetran's data models, you must have a BigQuery, Redshift, or Snowflake destination. Some data models also support Databricks. See our documentation for the relevant data model to check if it supports Databricks as a destination.
Fivetran offers pre-built, dbt Core-compatible data models for our top connectors (formerly known as "Fivetran dbt packages"). Learn more about Fivetran Data Models.
Setup guide
To learn how to use Transformations for dbt Core, follow the setup guide that applies to you:
- To schedule transformations in the Fivetran dashboard, follow the Scheduled in Fivetran setup guide.
- To schedule transformations in your dbt code, follow the Scheduled in Code setup guide.
Use cases
For common use cases for Transformations for dbt Core - Scheduled in Fivetran, see our Transformations Scheduling documentation.
Notifications
You can enable email notifications for each of your transformations on your Fivetran dashboard's Notifications tab. To learn more about email notifications, see our Notifications documentation.
* dbt Core is a trademark of dbt Labs, Inc. All rights therein are reserved to dbt Labs, Inc. Fivetran Transformations is not a product or service of or endorsed by dbt Labs, Inc.