Boost Data Visibility and Trust with dbt’s New Auto-Exposures Feature
At Coalesce 2024, dbt Labs announced auto-exposures, a new dbt Cloud feature. Auto-exposures builds on the existing exposures functionality in dbt Core by giving you better visibility into how your models are used downstream, outside your dbt project. As this enterprise feature matures and covers a wider array of BI tools, it could remove much of the manual work of tracking downstream usage for teams on dbt Cloud. In this post, we’ll explore the benefits of well-documented dbt exposures, why the feature isn’t used more often, and how dbt Labs is extending it with auto-exposures.
The Benefits of Well-Documented dbt Exposures
Exposures, a dbt Core feature, let you declare a downstream data asset: a dashboard, a report, a reverse ETL workflow, an ML model, and so on. In that declaration, you list the dbt models the asset queries, a description of the asset, and who owns it. Well-documented exposures can improve a typical workflow from both a technical and a documentation perspective. Here’s how:
- Analytics engineers can rebuild the dbt models upstream of an asset with a single selector defined by the exposure (for example, dbt build --select +exposure:weekly_sales_dashboard, where weekly_sales_dashboard is a placeholder exposure name). This saves time and cost and makes those assets easier to debug, because you build only the relevant models and nothing more.
- Analytics engineers get better visibility into pipeline health when communicating with colleagues using these assets. If an asset has data hiccups, exposures reduce the time an engineer must dig through the dbt project or the asset to find the root cause. And when certain dbt models fail, well-documented exposures let engineers know which downstream assets will be impacted and who to notify with this information. This speed and transparency build trust with stakeholders.
- Analytics engineers gain a practical way to build an inventory of dbt models for needs such as data asset mapping and roadmap planning. Here’s a real-world example: A client had a large dbt project with several dozen declared exposures listed in a single 2,000-line YAML file. These exposures referenced downstream Looker reports, among other assets. The client needed to migrate many models to a newer dbt project but was unsure which models to prioritize and worried about breaking those downstream Looker reports. Using a few lines of Python code, we parsed the exposure YAML and identified the models used across the client’s Looker instance. That gave us a starting point for planning the migration, so having a library of well-defined exposures was critical.
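The inventory and impact-analysis ideas above can be sketched in a few lines of Python. This is a minimal illustration, not the code used on the client project: the exposure names, owners, and URL are hypothetical, and the dictionary stands in for what you would get after loading an exposures YAML file (for example with PyYAML's yaml.safe_load). It assumes each depends_on entry is a string such as ref('model_name') and it only sees direct dependencies, not models further upstream.

```python
import re
from collections import defaultdict

# Hypothetical exposure entries, shaped like a loaded exposures YAML file.
EXPOSURES = {
    "exposures": [
        {
            "name": "weekly_sales_dashboard",
            "type": "dashboard",
            "url": "https://example.com/dashboards/1",
            "owner": {"name": "Sales Analytics", "email": "sales-analytics@example.com"},
            "depends_on": ["ref('fct_orders')", "ref('dim_customers')"],
        },
        {
            "name": "churn_model",
            "type": "ml",
            "owner": {"name": "Data Science", "email": "data-science@example.com"},
            "depends_on": ["ref('dim_customers')", "source('crm', 'accounts')"],
        },
    ]
}

# Matches single-argument refs such as ref('model') or ref("model").
# Two-argument and versioned refs are deliberately out of scope for this sketch.
REF_PATTERN = re.compile(r"ref\(\s*['\"]([^'\"]+)['\"]\s*\)")


def exposures_by_model(doc):
    """Map each directly referenced model name to the exposures that use it."""
    index = defaultdict(list)
    for exposure in doc.get("exposures", []):
        for dependency in exposure.get("depends_on", []):
            match = REF_PATTERN.fullmatch(dependency.strip())
            if match:  # source(...) entries are skipped
                index[match.group(1)].append(exposure)
    return dict(index)


def owners_to_notify(index, failed_model):
    """Return (exposure name, owner email) pairs that depend on a failed model."""
    return [
        (exposure["name"], exposure.get("owner", {}).get("email"))
        for exposure in index.get(failed_model, [])
    ]


index = exposures_by_model(EXPOSURES)

# Models ranked by how many exposures use them: a rough migration priority list.
usage = sorted(
    ((model, len(exposures)) for model, exposures in index.items()),
    key=lambda item: (-item[1], item[0]),
)

for model, count in usage:
    print(f"{model}: used by {count} exposure(s)")
print(owners_to_notify(index, "dim_customers"))
```

Ranking models by the number of exposures that reference them gives a first cut at migration priority, and the same index answers the question "who do we notify if this model fails?" for direct dependencies.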
How to Get Your Team to Use Exposures More Frequently in dbt Projects
Despite their numerous benefits, we’ve found that exposures still aren’t widely adopted in dbt projects. There are a few common reasons, and for each one we have practical tips to help your team get past the blocker and start declaring exposures more often.
- The data team doesn’t always know which dbt models are used downstream of the dbt project or how they’re used. This is especially true if other business teams are querying dbt models: you’ve democratized the data, but at the cost of not knowing where and how it is used. Setting up a regular line of communication with those teams to collect model usage information will increase visibility. Remember, the benefit is mutual. With added visibility, you can better manage model updates, and they can better flag and handle errors with you.
- Data teams must manually input and update the exposure YAML. Once an exposure is declared, it can become outdated quickly because there are two moving parts: the referenced dbt models and the referenced downstream asset. For example, upstream models can change, the downstream asset could join in more models, or the URL of the downstream asset could change. If keeping up with detailed exposure documentation is prohibitive to your team’s productivity, establishing guidelines on which important assets are worth tracking is a good middle ground.
- Data teams feel there’s little use for exposures after they’re set up. This is the feedback we’ve heard from many data teams. Ideally, the ongoing benefits already listed in this article will change some people’s minds, and maybe yours!
How dbt Labs is Expanding the Exposures Feature with Auto-Exposures
As your data teams scale, data governance and documentation become a larger part of streamlining your developer workflow, increasing cross-team collaboration, and building trust with your business stakeholders. At Coalesce 2024, dbt Labs announced auto-exposures and data health tiles for dbt Cloud, which aim to solve the challenges that come with declaring and maintaining exposures:
- With auto-exposures, dbt Cloud will pull information from the downstream tool to surface model and asset relationships and metadata within dbt Explorer. Essentially, the exposure is auto-declared and auto-updated as models and assets change. Auto-exposures are currently available only for Tableau, with Power BI support coming soon.
- With data health tiles and some basic configuration, teams can embed a visual data quality indicator inside the asset via an iframe or URL embed. The indicator will alert users if data quality or freshness checks aren’t passing, giving them greater confidence that the data they’re using is, in fact, fresh and of high quality. This feature will work with both manual exposures and auto-exposures.
dbt’s Exposures Feature: A Win-Win for Your Teams
Well-documented exposures hold significant potential for improving data management practices within your organization. By enhancing visibility, fostering collaboration, and streamlining workflows, exposures help your data teams work more efficiently and effectively with other business units. As dbt Labs continues to innovate and expand exposure capabilities, Brooklyn Data anticipates greater adoption and use of this feature across projects, which is a win-win for everyone involved in the data ecosystem.
Want to learn more about how to leverage dbt Cloud’s features? Contact us. Our data experts would be happy to help you use them to increase data visibility and collaboration across your organization.