diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 8aaf0375007..1983b0201d9 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -4,14 +4,14 @@ * @dbt-labs/product-docs # Adapter & Package Development Docs -/website/docs/docs/supported-data-platforms.md @dbt-labs/product-docs @dataders -/website/docs/reference/warehouse-setups @dbt-labs/product-docs @dataders +/website/docs/docs/supported-data-platforms.md @dbt-labs/product-docs @amychen1776 +/website/docs/reference/warehouse-setups @dbt-labs/product-docs @amychen1776 # `resource-configs` contains more than just warehouse setups -/website/docs/reference/resource-configs/*-configs.md @dbt-labs/product-docs @dataders -/website/docs/guides/advanced/adapter-development @dbt-labs/product-docs @dataders @dbeatty10 +/website/docs/reference/resource-configs/*-configs.md @dbt-labs/product-docs @amychen1776 +/website/docs/guides/advanced/adapter-development @dbt-labs/product-docs @amychen1776 -/website/docs/guides/building-packages @dbt-labs/product-docs @amychen1776 @dataders @dbeatty10 -/website/docs/guides/creating-new-materializations @dbt-labs/product-docs @dataders @dbeatty10 +/website/docs/guides/building-packages @dbt-labs/product-docs @amychen1776 +/website/docs/guides/creating-new-materializations @dbt-labs/product-docs # Require approval from the Multicell team when making # changes to the public facing migration documentation. diff --git a/website/blog/2021-11-23-how-to-upgrade-dbt-versions.md b/website/blog/2021-11-23-how-to-upgrade-dbt-versions.md index f7e5786bc70..0b1f1fe26bd 100644 --- a/website/blog/2021-11-23-how-to-upgrade-dbt-versions.md +++ b/website/blog/2021-11-23-how-to-upgrade-dbt-versions.md @@ -17,8 +17,9 @@ is_featured: true It's been a few years since dbt-core turned 1.0! Since then, we've committed to releasing zero breaking changes whenever possible and it's become much easier to upgrade dbt Core versions. In 2024, we're taking this promise further by: + - Stabilizing interfaces for everyone — adapter maintainers, metadata consumers, and (of course) people writing dbt code everywhere — as discussed in [our November 2023 roadmap update](https://github.com/dbt-labs/dbt-core/blob/main/docs/roadmap/2023-11-dbt-tng.md). -- Introducing **Versionless** in dbt Cloud. No more manual upgrades and no more need for _a second sandbox project_ just to try out new features in development. For more details, refer to [Upgrade Core version in Cloud](/docs/dbt-versions/upgrade-dbt-version-in-cloud). +- Introducing [Release tracks](/docs/dbt-versions/cloud-release-tracks) (formerly known as Versionless) to dbt Cloud. No more manual upgrades and no need for _a second sandbox project_ just to try out new features in development. For more details, refer to [Upgrade Core version in Cloud](/docs/dbt-versions/upgrade-dbt-version-in-cloud). We're leaving the rest of this post as is, so we can all remember how it used to be. Enjoy a stroll down memory lane. diff --git a/website/blog/2024-04-22-extended-attributes.md b/website/blog/2024-04-22-extended-attributes.md index 18d4ff0b64c..57636cc8f6b 100644 --- a/website/blog/2024-04-22-extended-attributes.md +++ b/website/blog/2024-04-22-extended-attributes.md @@ -80,7 +80,7 @@ All you need to do is configure an environment as staging and enable the **Defer ## Upgrading on a curve -Lastly, let’s consider a more specialized use case. 
Imagine we have a "tiger team" (consisting of a lone analytics engineer named Dave) tasked with upgrading from dbt version 1.6 to the new **Versionless** setting, to take advantage of added stability and feature access. We want to keep the rest of the data team being productive in dbt 1.6 for the time being, while enabling Dave to upgrade and do his work in the new versionless mode.
+Lastly, let’s consider a more specialized use case. Imagine we have a "tiger team" (consisting of a lone analytics engineer named Dave) tasked with upgrading from dbt version 1.6 to the new **[Latest release track](/docs/dbt-versions/cloud-release-tracks)**, to take advantage of new features and performance improvements. We want to keep the rest of the data team productive in dbt 1.6 for the time being, while enabling Dave to upgrade and do his work with Latest (and greatest) dbt.

### Development environment
diff --git a/website/blog/2024-05-22-latest-dbt-stability-improvement-innovation.md b/website/blog/2024-05-22-latest-dbt-stability-improvement-innovation.md
index 078dab198fa..f2c25f3da8c 100644
--- a/website/blog/2024-05-22-latest-dbt-stability-improvement-innovation.md
+++ b/website/blog/2024-05-22-latest-dbt-stability-improvement-innovation.md
@@ -1,5 +1,5 @@
---
-title: "How we're making sure you can confidently go \"Versionless\" in dbt Cloud"
+title: "How we're making sure you can confidently switch to the \"Latest\" release track in dbt Cloud"
description: "Over the past 6 months, we've laid a stable foundation for continuously improving dbt."
slug: latest-dbt-stability
@@ -12,23 +12,27 @@ date: 2024-05-02
is_featured: true
---

+import Latest from '/snippets/_release-stages-from-versionless.md'
+
+<Latest/>
+
As long as dbt Cloud has existed, it has required users to select a version of dbt Core to use under the hood in their jobs and environments. This made sense in the earliest days, when dbt Core minor versions often included breaking changes. It provided a clear way for everyone to know which version of the underlying runtime they were getting.

However, this came at a cost. While bumping a project's dbt version *appeared* as simple as selecting from a dropdown, there was real effort required to test the compatibility of the new version against existing projects, package dependencies, and adapters. On the other hand, putting this off meant foregoing access to new features and bug fixes in dbt.

-But no more. Today, we're ready to announce the general availability of a new option in dbt Cloud: [**"Versionless."**](https://docs.getdbt.com/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless)
+But no more. Today, we're ready to announce the general availability of a new option in dbt Cloud: [**the "Latest" release track.**](/docs/dbt-versions/cloud-release-tracks)

<!--truncate-->

For customers, this means less maintenance overhead, faster access to bug fixes and features, and more time to focus on what matters most: building trusted data products. This will be our stable foundation for improvement and innovation in dbt Cloud.

-But we wanted to go a step beyond just making this option available to you. In this blog post, we aim to shed a little light on the extensive work we've done to ensure that using "Versionless" is a stable, reliable experience for the thousands of customers who rely daily on dbt Cloud.
+But we wanted to go a step beyond just making this option available to you.
In this blog post, we aim to shed a little light on the extensive work we've done to ensure that using the "Latest" release track is a stable and reliable experience for the thousands of customers who rely daily on dbt Cloud. ## How we safely deploy dbt upgrades to Cloud We've put in place a rigorous, best-in-class suite of tests and control mechanisms to ensure that all changes to dbt under the hood are fully vetted before they're deployed to customers of dbt Cloud. -This pipeline has in fact been in place since January! It's how we've already been shipping continuous changes to the hundreds of customers who've selected "Versionless" while it's been in Beta and Preview. In that time, this process has enabled us to prevent multiple regressions before they were rolled out to any customers. +This pipeline has in fact been in place since January! It's how we've already been shipping continuous changes to the hundreds of customers who've selected the "Latest" release track while it's been in Beta and Preview. In that time, this process has enabled us to prevent multiple regressions before they were rolled out to any customers. We're very confident in the robustness of this process**. We also know that we'll need to continue building trust with time.** We're sharing details about this work in the spirit of transparency and to build that trust. @@ -82,9 +86,9 @@ All incidents are retrospected to make sure we not only identify and fix the roo ::: -The outcome of this process is that, when you select "Versionless" in dbt Cloud, the time between an improvement being made to dbt Core and you *safely* getting access to it in your projects is a matter of days — rather than months of waiting for the next dbt Core release, on top of any additional time it may have taken to actually carry out the upgrade. +The outcome of this process is that, when you select the "Latest" release track in dbt Cloud, the time between an improvement being made to dbt Core and you *safely* getting access to it in your projects is a matter of days — rather than months of waiting for the next dbt Core release, on top of any additional time it may have taken to actually carry out the upgrade. -We’re pleased to say that since the beta launch of “Versionless” in dbt Cloud in March, **we have not had any functional regressions reach customers**, while we’ve also been shipping multiple improvements to dbt functionality every day. This is a foundation that we aim to build on for the foreseeable future. +We’re pleased to say that, at the time of writing (May 2, 2024), since the beta launch of the "Latest" release track in dbt Cloud in March, **we have not had any functional regressions reach customers**, while we’ve also been shipping multiple improvements to dbt functionality every day. This is a foundation that we aim to build on for the foreseeable future. ## Stability as a feature @@ -98,7 +102,7 @@ The adapter interface — i.e. how dbt Core actually connects to a third-party d To solve that, we've released a new set of interfaces that are entirely independent of the `dbt-core` library: [`dbt-adapters==1.0.0`](https://github.com/dbt-labs/dbt-adapters). From now on, any changes to `dbt-adapters` will be backward and forward-compatible. This also decouples adapter maintenance from the regular release cadence of dbt Core — meaning maintainers get full control over when they ship implementations of new adapter-powered features. 
-Note that adapters running in dbt Cloud **must** be [migrated to the new decoupled architecture](https://github.com/dbt-labs/dbt-adapters/discussions/87) as a baseline in order to support the new "Versionless" option. +Note that adapters running in dbt Cloud **must** be [migrated to the new decoupled architecture](https://github.com/dbt-labs/dbt-adapters/discussions/87) as a baseline in order to support the new "Latest" release track. ### Managing behavior changes: stability as a feature @@ -118,7 +122,7 @@ We’ve now [formalized our development best practices](https://github.com/dbt-l In conclusion, we’re putting a lot of new muscle behind our commitments to dbt Cloud customers, the dbt Community, and the broader ecosystem: -- **Continuous updates**: "Versionless" dbt Cloud simplifies the update process, ensuring you always have the latest features and bug fixes without the maintenance overhead. +- **Continuous updates**: The "Latest" release track in dbt Cloud simplifies the update process, ensuring you always have the latest features and bug fixes without the maintenance overhead. - **A rigorous new testing and deployment process**: Our new testing pipeline ensures that every update is carefully vetted against documented interfaces, Cloud-supported adapters, and popular packages before it reaches you. This process minimizes the risk of regressions — and has now been successful at entirely preventing them for hundreds of customers over multiple months. - **A commitment to stability**: We’ve reworked our approaches to adapter interfaces, behaviour change management, and metadata artifacts to give you more stability and control. diff --git a/website/blog/2024-06-12-putting-your-dag-on-the-internet.md b/website/blog/2024-06-12-putting-your-dag-on-the-internet.md index 535cfc34d6e..54864916d0e 100644 --- a/website/blog/2024-06-12-putting-your-dag-on-the-internet.md +++ b/website/blog/2024-06-12-putting-your-dag-on-the-internet.md @@ -12,7 +12,7 @@ date: 2024-06-14 is_featured: true --- -**New in dbt: allow Snowflake Python models to access the internet** +## New in dbt: allow Snowflake Python models to access the internet With dbt 1.8, dbt released support for Snowflake’s [external access integrations](https://docs.snowflake.com/en/developer-guide/external-network-access/external-network-access-overview) further enabling the use of dbt + AI to enrich your data. This allows querying of external APIs within dbt Python models, a functionality that was required for dbt Cloud customer, [EQT AB](https://eqtgroup.com/). Learn about why they needed it and how they helped build the feature and get it shipped! 
@@ -45,7 +45,7 @@ This API is open and if it requires an API key, handle it similarly to managing For simplicity’s sake, we will show how to create them using [pre-hooks](/reference/resource-configs/pre-hook-post-hook) in a model configuration yml file: -``` +```yml models: - name: external_access_sample config: @@ -57,7 +57,7 @@ models: Then we can simply use the new external_access_integrations configuration parameter to use our network rule within a Python model (called external_access_sample.py): -``` +```python import snowflake.snowpark as snowpark def model(dbt, session: snowpark.Session): dbt.config( @@ -75,7 +75,7 @@ def model(dbt, session: snowpark.Session): The result is a model with some json I can parse, for example, in a SQL model to extract some information: -``` +```sql {{ config( materialized='incremental', @@ -108,12 +108,12 @@ The result is a model that will keep track of dbt invocations, and the current U This is a very new area to Snowflake and dbt -- something special about SQL and dbt is that it’s very resistant to external entropy. The second we rely on API calls, Python packages and other external dependencies, we open up to a lot more external entropy. APIs will change, break, and your models could fail. -Traditionally dbt is the T in ELT (dbt overview [here](https://docs.getdbt.com/terms/elt)), and this functionality unlocks brand new EL capabilities for which best practices do not yet exist. What’s clear is that EL workloads should be separated from T workloads, perhaps in a different modeling layer. Note also that unless using incremental models, your historical data can easily be deleted. dbt has seen a lot of use cases for this, including this AI example as outlined in this external [engineering blog post](https://klimmy.hashnode.dev/enhancing-your-dbt-project-with-large-language-models). +Traditionally dbt is the T in ELT (dbt overview [here](https://docs.getdbt.com/terms/elt)), and this functionality unlocks brand new EL capabilities for which best practices do not yet exist. What’s clear is that EL workloads should be separated from T workloads, perhaps in a different modeling layer. Note also that unless using incremental models, your historical data can easily be deleted. dbt has seen a lot of use cases for this, including this AI example as outlined in this external [engineering blog post](https://klimmy.hashnode.dev/enhancing-your-dbt-project-with-large-language-models). -**A few words about the power of Commercial Open Source Software** +## A few words about the power of Commercial Open Source Software In order to get this functionality shipped quickly, EQT opened a pull request, Snowflake helped with some problems we had with CI and a member of dbt Labs helped write the tests and merge the code in! -dbt now features this functionality in dbt 1.8+ or the “Versionless” option of dbt Cloud (dbt overview [here](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless)). +dbt now features this functionality in dbt 1.8+ and all [Release tracks](/docs/dbt-versions/cloud-release-tracks) in dbt Cloud. dbt Labs staff and community members would love to chat more about it in the [#db-snowflake](https://getdbt.slack.com/archives/CJN7XRF1B) slack channel. 
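+As a closing footnote on the pre-hook approach described above: the actual DDL is elided in this excerpt, so here is a minimal, hedged sketch of the kind of Snowflake objects such a pre-hook (or a platform admin) would need to create before a Python model can reach an external API. The object names and host below are placeholders for illustration, not the ones used in the original post.
+
+```sql
+-- Hypothetical example only: object names and host are placeholders.
+-- A network rule that allows egress traffic to a specific external host.
+create or replace network rule external_api_network_rule
+  mode = egress
+  type = host_port
+  value_list = ('api.example.com');
+
+-- An external access integration that the Python model can then reference
+-- via the external_access_integrations configuration parameter.
+create or replace external access integration external_api_access_integration
+  allowed_network_rules = (external_api_network_rule)
+  enabled = true;
+```
+
+Depending on your Snowflake setup, creating the integration may require elevated privileges, so some teams create these objects once outside of dbt rather than in a pre-hook.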
diff --git a/website/blog/2024-11-27-test-smarter-part-2.md b/website/blog/2024-11-27-test-smarter-part-2.md
new file mode 100644
index 00000000000..4fabe066011
--- /dev/null
+++ b/website/blog/2024-11-27-test-smarter-part-2.md
@@ -0,0 +1,125 @@
+---
+title: "Test smarter not harder: Where should tests go in your pipeline?"
+description: "Testing your data should drive action, not accumulate alerts. We take our testing framework developed in our last post and make recommendations for where tests ought to go at each transformation stage."
+slug: test-smarter-where-tests-should-go
+
+authors: [faith_mckenna, jerrie_kumalah_kenney]
+
+tags: [analytics craft]
+hide_table_of_contents: false
+
+date: 2024-12-09
+is_featured: true
+---
+
+👋 Greetings, dbt’ers! It’s Faith & Jerrie, back again to offer tactical advice on *where* to put tests in your pipeline.
+
+In [our first post](/blog/test-smarter-not-harder) on refining testing best practices, we developed a prioritized list of data quality concerns. We also documented first steps for debugging each concern. This post will guide you on where specific tests should go in your data pipeline.
+
+*Note that we are constructing this guidance based on how we [structure data at dbt Labs.](/best-practices/how-we-structure/1-guide-overview#guide-structure-overview)* You may use a different modeling approach—that’s okay! Translate our guidance to your data’s shape, and let us know in the comments section what modifications you made.
+
+First, here are our opinions on where specific tests should go:
+
+- Source tests should be fixable data quality concerns. See the [callout box below](#sources) for what we mean by “fixable”.
+- Staging tests should be business-focused anomalies specific to individual tables, such as accepted ranges or ensuring sequential values. In addition to these tests, your staging layer should clean up any nulls, duplicates, or outliers that you can’t fix in your source system. You generally don’t need to test your cleanup efforts.
+- Intermediate and marts layer tests should be business-focused anomalies resulting specifically from joins or calculations. You also may consider adding additional primary key and not null tests on columns where it’s especially important to protect the grain.
+
+<!--truncate-->
+
+## Where should tests go in your pipeline?
+
+![A horizontal, multicolored diagram that shows examples of where tests ought to be placed in a data pipeline.](/img/blog/2024-11-27-test-smarter-part-2/testing_pipeline.png)
+
+The diagram above outlines where you might put specific data tests in your pipeline. Let’s expand on it and discuss where each type of data quality issue should be tested.
+
+### Sources
+
+Tests applied to your sources should indicate *fixable-at-the-source-system* issues. If your source tests flag source system issues that aren’t fixable, remove the test and mitigate the problem in your staging layer instead.
+
+:::tip[What does fixable mean?]
+We consider a "fixable-at-the-source-system" issue to be something that:
+
+- You yourself can fix in the source system.
+- You know the right person to fix it and have a good enough relationship with them that you know you can *get it fixed.*
+
+You may have issues that can *technically* get fixed at the source, but it won't happen till the next planning cycle, or you need to develop better relationships to get the issue fixed, or something similar. This demands a more nuanced approach than we'll cover in this post.
If you have thoughts on this type of situation, let us know!
+
+:::
+
+Here’s our recommendation for what tests belong on your sources.
+
+- Source freshness: testing data freshness for sources that are critical to your pipelines.
+  - If any sources feed into any of the “top 3” [priority categories](https://docs.getdbt.com/blog/test-smarter-not-harder#how-to-prioritize-data-quality-concerns-in-your-pipeline) in our last post, use [`dbt source freshness`](https://docs.getdbt.com/docs/deploy/source-freshness) in your job execution commands and set the severity to `error`. That way, if source freshness fails, so does your job.
+  - If none of your sources feed into high priority categories, set your source freshness severity to `warn` and add source freshness to your job execution commands. That way, you still get source freshness information but stale data won't fail your pipeline.
+- Data hygiene: tests that are *fixable* in the source system (see our note above on “fixability”).
+  - Examples:
+    - Duplicate customer records that can be deleted in the source system
+    - Null records, such as a customer name or email address, that can be entered into the source system
+    - Primary key testing where duplicates are removable in the source system
+
+### Staging
+
+In the staging layer, your models should be cleaning up or mitigating data issues that can't be fixed at the source. Your tests should be focused on business anomaly detection.
+
+- Data cleanup and issue mitigation: Use our [best practices around staging layers](https://docs.getdbt.com/best-practices/how-we-structure/2-staging) to clean things up. Don’t add tests to your cleanup efforts. If you’re filtering out nulls in a column, adding a not_null test is repetitive! 🌶️
+- Business-focused anomaly examples: these are data quality issues you *should* test for in your staging layer, because they fall outside of your business’s defined norms. These might be:
+  - Values inside a single column that fall outside of an acceptable range. For example, a store selling a greater quantity of limited-edition items than they received in their stock delivery.
+  - Values that should always be positive are, in fact, positive. This might look like a negative transaction amount that isn’t classified as a return. This failing test would then spur further investigation into the offending transaction.
+  - An unexpected uptick in volume of a quantity column beyond a pre-defined percentage. This might look like a store’s customer volume spiking unexpectedly and outside of expected seasonal norms. This is an anomaly that could indicate a bug or modeling issue.
+
+### Intermediate (if applicable)
+
+In your intermediate layer, focus on data hygiene and anomaly tests for new columns. Don’t re-test passthrough columns from sources or staging. Here are some examples of tests you might put in your intermediate layer based on the use cases of intermediate models we [outline in this guide](/best-practices/how-we-structure/3-intermediate#intermediate-models).
+
+- Intermediate models often re-grain models to prepare them for marts.
+  - Add a primary key test to any re-grained models.
+  - Additionally, consider adding a primary key test to models where the grain *has remained the same* but has been *enriched.* This helps future-proof your enriched models against future developers who may not be able to glean your intention from SQL alone.
+- Intermediate models may perform a first set of joins or aggregations to reduce complexity in a final mart.
  - Add simple anomaly tests to verify the behavior of your sets of joins and aggregations. This may look like:
+    - An [accepted_values](/reference/resource-properties/data-tests#accepted_values) test on a newly calculated categorical column.
+    - A [mutually_exclusive_ranges](https://github.com/dbt-labs/dbt-utils#mutually_exclusive_ranges-source) test on two columns whose values behave in relation to one another (ex: asserting age ranges do not overlap).
+    - A [not_constant](https://github.com/dbt-labs/dbt-utils#not_constant-source) test on a column whose value should be continually changing (ex: page view counts on website analytics).
+- Intermediate models may isolate complex operations.
+  - The anomaly tests we list above may suffice here.
+  - You might also consider [unit testing](/docs/build/unit-tests) any particularly complex pieces of SQL logic.
+
+### Marts
+
+Marts layer testing will follow the same hygiene-or-anomaly pattern as staging and intermediate. Similar to your intermediate layer, you should focus your testing on net-new columns in your marts layer. This might look like:
+
+- Unit tests: validate especially complex transformation logic. For example:
+  - Calculating dates in a way that feeds into forecasting.
+  - Customer segmentation logic, especially logic that has a lot of CASE-WHEN statements.
+- Primary key tests: focus on where your mart's granularity has changed from its staging/intermediate inputs.
+  - Similar to the intermediate models above, you may also want to add primary key tests to models whose grain hasn’t changed, but have been enriched with other data. Primary key tests here communicate your intent.
+- Business-focused anomaly tests: focus on *new* calculated fields, such as:
+  - Singular tests on high-priority, high-impact tables where you have a specific problem you want forewarning about.
+    - This might be something like fuzzy matching logic to detect when the same person is using multiple email addresses to extend a free trial beyond its acceptable end date.
+  - A test for calculated numerical fields that shouldn’t vary by more than a certain percentage in a week.
+  - A calculated ledger table that follows certain business rules, for example, today’s running total of spend must always be greater than yesterday’s.
+
+### CI/CD
+
+All of the testing you’ve applied in your different layers is the manual work of constructing your framework. CI/CD is where it gets automated.
+
+You should run a [slim CI](/best-practices/best-practice-workflows#run-only-modified-models-to-test-changes-slim-ci) to optimize your resource consumption.
+
+With CI/CD and your regular production runs, your testing framework can be on autopilot. 😎
+
+If and when you encounter failures, consult your trusty testing framework doc you built in our [earlier post](/blog/test-smarter-not-harder).
+
+### Advanced CI
+
+In the early stages of your smarter testing journey, start with dbt Cloud’s built-in flags for [advanced CI](/docs/deploy/advanced-ci). In PRs with advanced CI enabled, dbt Cloud will flag what has been modified, added, or removed in the “compare changes” section. These three flags offer confidence and evidence that your changes are what you expect. Then, hand them off for peer review. Advanced CI helps jump start your colleague’s review of your work by bringing all of the implications of the change into one place.
+
+We consider usage of Advanced CI beyond the modified, added, or changed gut checks to be an advanced (heh) testing strategy, and look forward to hearing how you use it.
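+
+Before wrapping up, here is a minimal sketch of the kind of singular test described in the Marts section above, for the running-total ledger rule. It is a hedged example: it assumes a hypothetical `fct_spend_ledger` model with `ledger_date` and `running_total_spend` columns, so adapt the names and the rule to your own models.
+
+```sql
+-- tests/assert_running_total_never_decreases.sql
+-- Hypothetical singular test: returns (and therefore fails on) any day whose
+-- running total of spend is lower than the previous day's.
+with daily as (
+    select
+        ledger_date,
+        running_total_spend,
+        lag(running_total_spend) over (order by ledger_date) as previous_running_total
+    from {{ ref('fct_spend_ledger') }}
+)
+
+select *
+from daily
+where previous_running_total is not null
+  and running_total_spend < previous_running_total
+```
+
+Because singular tests only fail when they return rows, a test like this stays quiet until the business rule is actually violated, which keeps it aligned with the test-smarter goal of alerts that drive action.
+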
+ +## Wrapping it all up + +Judicious data testing is like training for a marathon. It’s not productive to go run 20 miles a day and hope that you’ll be marathon-ready and uninjured. Similarly, throwing data tests randomly at your data pipeline without careful thought is not going to tell you much about your data quality. + +Runners go into marathons with training plans. Analytics engineers who care about data quality approach the issue with a plan, too. + +As you try out some of the guidance above here, remember that your testing needs are going to evolve over time. Don’t be afraid to revise your original testing strategy. + +Let us know your thoughts on these strategies in the comments section. Try them out, and share your thoughts to help us refine them. diff --git a/website/dbt-versions.js b/website/dbt-versions.js index 825af8ac6ee..3e59b926b80 100644 --- a/website/dbt-versions.js +++ b/website/dbt-versions.js @@ -16,11 +16,11 @@ exports.versions = [ { version: "1.10", - customDisplay: "Cloud (Versionless)", + customDisplay: "Cloud (Latest)", }, { version: "1.9", - isPrerelease: true, + EOLDate: "2025-12-08", }, { version: "1.8", diff --git a/website/docs/best-practices/how-we-style/2-how-we-style-our-sql.md b/website/docs/best-practices/how-we-style/2-how-we-style-our-sql.md index 8c61e63b888..35e025faf3f 100644 --- a/website/docs/best-practices/how-we-style/2-how-we-style-our-sql.md +++ b/website/docs/best-practices/how-we-style/2-how-we-style-our-sql.md @@ -8,8 +8,8 @@ id: 2-how-we-style-our-sql - ☁️ Use [SQLFluff](https://sqlfluff.com/) to maintain these style rules automatically. - Customize `.sqlfluff` configuration files to your needs. - Refer to our [SQLFluff config file](https://github.com/dbt-labs/jaffle-shop-template/blob/main/.sqlfluff) for the rules we use in our own projects. - - - Exclude files and directories by using a standard `.sqlfluffignore` file. Learn more about the syntax in the [.sqlfluffignore syntax docs](https://docs.sqlfluff.com/en/stable/configuration.html#id2). + - Exclude files and directories by using a standard `.sqlfluffignore` file. Learn more about the syntax in the [.sqlfluffignore syntax docs](https://docs.sqlfluff.com/en/stable/configuration/index.html). + - Excluding unnecessary folders and files (such as `target/`, `dbt_packages/`, and `macros/`) can speed up linting, improve run times, and help you avoid irrelevant logs. - 👻 Use Jinja comments (`{# #}`) for comments that should not be included in the compiled SQL. - ⏭️ Use trailing commas. - 4️⃣ Indents should be four spaces. diff --git a/website/docs/best-practices/how-we-style/5-how-we-style-our-yaml.md b/website/docs/best-practices/how-we-style/5-how-we-style-our-yaml.md index 8f817356334..e3b539e8b12 100644 --- a/website/docs/best-practices/how-we-style/5-how-we-style-our-yaml.md +++ b/website/docs/best-practices/how-we-style/5-how-we-style-our-yaml.md @@ -7,6 +7,7 @@ id: 5-how-we-style-our-yaml - 2️⃣ Indents should be two spaces - ➡️ List items should be indented +- 🔠 List items with a single entry can be a string. For example, `'select': 'other_user'`, but it's best practice to provide the argument as an explicit list. For example, `'select': ['other_user']` - 🆕 Use a new line to separate list items that are dictionaries where appropriate - 📏 Lines of YAML should be no longer than 80 characters. 
- 🛠️ Use the [dbt JSON schema](https://github.com/dbt-labs/dbt-jsonschema) with any compatible IDE and a YAML formatter (we recommend [Prettier](https://prettier.io/)) to validate your YAML files and format them automatically. diff --git a/website/docs/community/resources/oss-expectations.md b/website/docs/community/resources/oss-expectations.md index e6e5d959c96..7b518424e92 100644 --- a/website/docs/community/resources/oss-expectations.md +++ b/website/docs/community/resources/oss-expectations.md @@ -2,112 +2,122 @@ title: "Expectations for OSS contributors" --- -Whether it's a dbt package, a plugin, `dbt-core`, or this very documentation site, contributing to the open source code that supports the dbt ecosystem is a great way to level yourself up as a developer, and to give back to the community. The goal of this page is to help you understand what to expect when contributing to dbt open source software (OSS). While we can only speak for our own experience as open source maintainers, many of these guidelines apply when contributing to other open source projects, too. +Whether it's `dbt-core`, adapters, packages, or this very documentation site, contributing to the open source code that supports the dbt ecosystem is a great way to share your knowledge, level yourself up as a developer, and to give back to the community. The goal of this page is to help you understand what to expect when contributing to dbt open source software (OSS). -Have you seen things in other OSS projects that you quite like, and think we could learn from? [Open a discussion on the dbt Community Forum](https://discourse.getdbt.com), or start a conversation in the dbt Community Slack (for example: `#community-strategy`, `#dbt-core-development`, `#package-ecosystem`, `#adapter-ecosystem`). We always appreciate hearing from you! +Have you seen things in other OSS projects that you quite like, and think we could learn from? [Open a discussion on the dbt Community Forum](https://discourse.getdbt.com), or start a conversation in the [dbt Community Slack](https://www.getdbt.com/community/join-the-community) (for example: `#community-strategy`, `#dbt-core-development`, `#package-ecosystem`, `#adapter-ecosystem`). We always appreciate hearing from you! ## Principles ### Open source is participatory -Why take time out of your day to write code you don’t _have_ to? We all build dbt together. By using dbt, you’re invested in the future of the tool, and an agent in pushing forward the practice of analytics engineering. You’ve already benefited from using code contributed by community members, and documentation written by community members. Contributing to dbt OSS is your way to pay it forward, as an active participant in the thing we’re all creating together. +We all build dbt together -- whether you write code or contribute your ideas. By using dbt, you're invested in the future of the tool, and have an active role in pushing forward the standard of analytics engineering. You already benefit from using code and documentation contributed by community members. Contributing to the dbt community is your way to be an active participant in the thing we're all creating together. -There’s a very practical reason, too: OSS prioritizes our collective knowledge and experience over any one person’s. We don’t have experience using every database, operating system, security environment, ... We rely on the community of OSS users to hone our product capabilities and documentation to the wide variety of contexts in which it operates. 
In this way, dbt gets to be the handiwork of thousands, rather than a few dozen. +There's a very practical reason, too: OSS prioritizes our collective knowledge and experience over any one person's. We don't have experience using every database, operating system, security environment, ... We rely on the community of OSS users to hone our product capabilities and documentation to the wide variety of contexts in which it operates. In this way, dbt gets to be the handiwork of thousands, rather than a few dozen. -### We take seriously our role as maintainers +### We take seriously our role as maintainers of a standard -In that capacity, we cannot and will not fix every bug ourselves, or code up every feature worth doing. Instead, we’ll do our best to respond to new issues with context (including links to related issues), feedback, alternatives/workarounds, and (whenever possible) pointers to code that would aid a community contributor. If a change is so tricky or involved that the initiative rests solely with us, we’ll do our best to explain the complexity, and when / why we could foresee prioritizing it. Our role also includes maintenance of the backlog of issues, such as closing duplicates, proposals we don’t intend to support, or stale issues (no activity for 180 days). +As a standard, dbt must be reliable and consistent. Our first priority is ensuring the continued high quality of existing dbt capabilities before we introduce net-new capabilities. -### Initiative is everything +We also believe dbt as a framework should be extensible enough to ["make the easy things easy, and the hard things possible"](https://en.wikipedia.org/wiki/Perl#Philosophy). To that end, we _don't_ believe it's appropriate for dbt to have an out-of-the-box solution for every niche problem. Users have the flexibility to achieve many custom behaviors by defining their own macros, materializations, hooks, and more. We view it as our responsibility as maintainers to decide when something should be "possible" — via macros, packages, etc. — and when something should be "easy" — built into the dbt Core standard. -Given that we, as maintainers, will not be able to resolve every bug or flesh out every feature request, we empower you, as a community member, to initiate a change. +So when will we say "yes" to new capabilities for dbt Core? The signals we look for include: +- Upvotes on issues in our GitHub repos +- Open source dbt packages trying to close a gap +- Technical advancements in the ecosystem -- If you open the bug report, it’s more likely to be identified. -- If you open the feature request, it’s more likely to be discussed. -- If you comment on the issue, engaging with ideas and relating it to your own experience, it’s more likely to be prioritized. -- If you open a PR to fix an identified bug, it’s more likely to be fixed. -- If you contribute the code for a well-understood feature, that feature is more likely to be in the next version. -- If you review an existing PR, to confirm it solves a concrete problem for you, it’s more likely to be merged. +In the meantime — we'll do our best to respond to new issues with: +- Clarity about whether the proposed feature falls into the intended scope of dbt Core +- Context (including links to related issues) +- Alternatives and workarounds +- When possible, pointers to code that would aid a community contributor -Sometimes, this can feel like shouting into the void, especially if you aren’t met with an immediate response. 
We promise that there are dozens (if not hundreds) of folks who will read your comment, maintainers included. It all adds up to a real difference. +### Initiative is everything -# Practicalities +Given that we, as maintainers, will not be able to resolve every bug or flesh out every feature request, we empower you, as a community member, to initiate a change. -As dbt OSS is growing in popularity, and dbt Labs has been growing in size, we’re working to involve new people in the responsibilities of OSS maintenance. We really appreciate your patience as our newest maintainers are learning and developing habits. +- If you open the bug report, it's more likely to be identified. +- If you open the feature request, it's more likely to be discussed. +- If you comment on the issue, engaging with ideas and relating it to your own experience, it's more likely to be prioritized. +- If you open a PR to fix an identified bug, it's more likely to be fixed. +- If you comment on an existing PR, to confirm it solves the concrete problem for your team in practice, it's more likely to be merged. -## Discussions +Sometimes, this can feel like shouting into the void, especially if you aren't met with an immediate response. We promise that there are dozens (if not hundreds) of folks who will read your comment, including us as maintainers. It all adds up to a real difference. -Discussions are a relatively new GitHub feature, and we really like them! +## Practicalities -A discussion is best suited to propose a Big Idea, such as brand-new capability in dbt Core, or a new section of the product docs. Anyone can open a discussion, add a comment to an existing one, or reply in a thread. +### Discussions -What can you expect from a new Discussion? Hopefully, comments from other members of the community, who like your idea or have their own ideas for how it could be improved. The most helpful comments are ones that describe the kinds of experiences users and readers should have. Unlike an **issue**, there is no specific code change that would “resolve” a Discussion. +A discussion is best suited to propose a Big Idea, such as brand-new capability in dbt Core or an adapter. Anyone can open a discussion, comment on an existing one, or reply in a thread. -If, over the course of a discussion, we do manage to reach consensus on a way forward, we’ll open a new issue that references the discussion for context. That issue will connect desired outcomes to specific implementation details, as well as perceived limitations and open questions. It will serve as a formal proposal and request for comment. +When you open a new discussion, you might be looking for validation from other members of the community — folks who identify with your problem statement, who like your proposed idea, and who may have their own ideas for how it could be improved. The most helpful comments propose nuances or desirable user experiences to be considered in design and refinement. Unlike an **issue**, there is no specific code change that would “resolve” a discussion. -## Issues +If, over the course of a discussion, we reach a consensus on specific elements of a proposed design, we can open new implementation issues that reference the discussion for context. Those issues will connect desired user outcomes to specific implementation details, acceptance testing, and remaining questions that need answering. -An issue could be a bug you’ve identified while using the product or reading the documentation. 
It could also be a specific idea you’ve had for how it could be better. +### Issues -### Best practices for issues +An issue could be a bug you've identified while using the product or reading the documentation. It could also be a specific idea you've had for a narrow extension of existing functionality. + +#### Best practices for issues - Issues are **not** for support / troubleshooting / debugging help. Please see [dbt support](/docs/dbt-support) for more details and suggestions on how to get help. - Always search existing issues first, to see if someone else had the same idea / found the same bug you did. -- Many repositories offer templates for creating issues, such as when reporting a bug or requesting a new feature. If available, please select the relevant template and fill it out to the best of your ability. This will help other people understand your issue and respond. +- Many dbt repositories offer templates for creating issues, such as reporting a bug or requesting a new feature. If available, please select the relevant template and fill it out to the best of your ability. This information helps us (and others) understand your issue. -### You’ve found an existing issue that interests you. What should you do? +##### You've found an existing issue that interests you. What should you do? -Comment on it! Explain that you’ve run into the same bug, or had a similar idea for a new feature. If the issue includes a detailed proposal for a change, say which parts of the proposal you find most compelling, and which parts give you pause. +Comment on it! Explain that you've run into the same bug, or had a similar idea for a new feature. If the issue includes a detailed proposal for a change, say which parts of the proposal you find most compelling, and which parts give you pause. -### You’ve opened a new issue. What can you expect to happen? +##### You've opened a new issue. What can you expect to happen? -In our most critical repositories (such as `dbt-core`), **our goal is to respond to new issues within 2 standard work days.** While this initial response might be quite lengthy (context, feedback, and pointers that we can offer as maintainers), more often it will be a short acknowledgement that the maintainers are aware of it and don't believe it's in urgent need of resolution. Depending on the nature of your issue, it might be well suited to an external contribution, from you or another community member. +In our most critical repositories (such as `dbt-core`), our goal is to respond to new issues as soon as possible. This initial response will often be a short acknowledgement that the maintainers are aware of the issue, signalling our perception of its urgency. Depending on the nature of your issue, it might be well suited to an external contribution, from you or another community member. -**What does “triage” mean?** In some repositories, we use a `triage` label to keep track of issues that need an initial response from a maintainer. 
+**What if you're opening an issue in a different repository?** We have engineering teams dedicated to active maintenance of [`dbt-core`](https://github.com/dbt-labs/dbt-core) and its component libraries ([`dbt-common`](https://github.com/dbt-labs/dbt-common) + [`dbt-adapters`](https://github.com/dbt-labs/dbt-adapters)), as well as several platform-specific adapters ([`dbt-snowflake`](https://github.com/dbt-labs/dbt-snowflake), [`dbt-bigquery`](https://github.com/dbt-labs/dbt-bigquery), [`dbt-redshift`](https://github.com/dbt-labs/dbt-redshift), [`dbt-postgres`](https://github.com/dbt-labs/dbt-postgres)). We've open-sourced a number of other software projects over the years, and the majority of them do not have the same activity or maintenance guarantees. Check to see if other recent issues have responses, or when the last commit was added to the `main` branch. -**What if I’m opening an issue in a different repository?** **What if I’m opening an issue in a different repository?** We have engineering teams dedicated to active maintainence of [`dbt-core`](https://github.com/dbt-labs/dbt-core) and its component libraries ([`dbt-common`](https://github.com/dbt-labs/dbt-common) + [`dbt-adapters`](https://github.com/dbt-labs/dbt-adapters)), as well as several platform-specific adapters ([`dbt-snowflake`](https://github.com/dbt-labs/dbt-snowflake), [`dbt-bigquery`](https://github.com/dbt-labs/dbt-bigquery), [`dbt-redshift`](https://github.com/dbt-labs/dbt-redshift), [`dbt-postgres`](https://github.com/dbt-labs/dbt-postgres)). We’ve open sourced a number of other software projects over the years, and the majority of them do not have the same activity or maintenance guarantees. Check to see if other recent issues have responses, or when the last commit was added to the `main` branch. +**You're not sure about the status of your issue.** If your issue is in an actively maintained repo and has a `triage` label attached, we're aware it's something that needs a response. If the issue has been triaged, but not prioritized, this could mean: +- The intended scope or user experience of a proposed feature requires further refinement from a maintainer +- We believe the required code change is too tricky for an external contributor -**If my issue is lingering...** Sorry for the delay! If your issue is in an actively maintained repo and has a `triage` label attached, we’re aware it's something that needs a response. +We'll do our best to explain the open questions or complexity, and when / why we could foresee prioritizing it. -**Automation that can help us:** In many repositories, we use a bot that marks issues as stale if they haven’t had any activity for 180 days. This helps us keep our backlog organized and up-to-date. We encourage you to comment on older open issues that you’re interested in, to keep them from being marked stale. You’re also always welcome to comment on closed issues to say that you’re still interested in the proposal. +**Automation that can help us:** In many repositories, we use a bot that marks issues as stale if they haven't had any activity for 180 days. This helps us keep our backlog organized and up-to-date. We encourage you to comment on older open issues that you're interested in, to keep them from being marked stale. You're also always welcome to comment on closed issues to say that you're still interested in the proposal. -### Issue labels +#### Issue labels In all likelihood, the maintainer who responds will also add a number of labels. 
Not all of these labels are used in every repository. -In some cases, the right resolution to an open issue might be tangential to the codebase. The right path forward might be in another codebase (we'll transfer it), a documentation update, or a change that can be made in user-space code. In other cases, the issue might describe functionality that the maintainers are unwilling or unable to incorporate into the main codebase. In these cases, a maintainer will close the issue (perhaps using a `wontfix` label) and explain why. +In some cases, the right resolution to an open issue might be tangential to the codebase. The right path forward might be in another codebase (we'll transfer it), a documentation update, or a change that you can make yourself in user-space code. In other cases, the issue might describe functionality that the maintainers are unwilling or unable to incorporate into the main codebase. In these cases, a maintainer will close the issue (perhaps using a `wontfix` label) and explain why. + +Some of the most common labels are explained below: | tag | description | | ------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `triage` | This is a new issue which has not yet been reviewed by a maintainer. This label is removed when a maintainer reviews and responds to the issue. | -| `bug` | This issue represents a defect or regression from the behavior that's documented, or that you reasonably expect | -| `enhancement` | This issue represents net-new functionality, including an extension of an existing capability | -| `good_first_issue` | This issue does not require deep knowledge of the codebase to implement. This issue is appropriate for a first-time contributor. | +| `bug` | This issue represents a defect or regression from the behavior that's documented | +| `enhancement` | This issue represents a narrow extension of an existing capability | +| `good_first_issue` | This issue does not require deep knowledge of the codebase to implement, and it is appropriate for a first-time contributor. | | `help_wanted` | This issue is trickier than a "good first issue." The required changes are scattered across the codebase, or more difficult to test. The maintainers are happy to help an experienced community contributor; they aren't planning to prioritize this issue themselves. | | `duplicate` | This issue is functionally identical to another open issue. The maintainers will close this issue and encourage community members to focus conversation on the other one. | | `stale` | This is an old issue which has not recently been updated. In repositories with a lot of activity, stale issues will periodically be closed. | | `wontfix` | This issue does not require a code change in the repository, or the maintainers are unwilling to merge a change which implements the proposed behavior. | -## Pull requests - -PRs are your surest way to make the change you want to see in dbt / packages / docs, especially when the change is straightforward. +### Pull requests -**Every PR should be associated with an issue.** Why? Before you spend a lot of time working on a contribution, we want to make sure that your proposal will be accepted. You should open an issue first, describing your desired outcome and outlining your planned change. 
If you've found an older issue that's already open, comment on it with an outline for your planned implementation. Exception to this rule: If you're just opening a PR for a cosmetic fix, such as a typo in documentation, an issue isn't needed. +**Every PR should be associated with an issue.** Why? Before you spend a lot of time working on a contribution, we want to make sure that your proposal will be accepted. You should open an issue first, describing your desired outcome and outlining your planned change. If you've found an older issue that's already open, comment on it with an outline for your planned implementation _before_ putting in the work to open a pull request. -**PRs must include robust testing.** Comprehensive testing within pull requests is crucial for the stability of our project. By prioritizing robust testing, we ensure the reliability of our codebase, minimize unforeseen issues, and safeguard against potential regressions. We cannot merge changes that risk the backward incompatibility of existing documented behaviors. We understand that creating thorough tests often requires significant effort, and your dedication to this process greatly contributes to the project's overall reliability. Thank you for your commitment to maintaining the integrity of our codebase and the experience of everyone using dbt! +**PRs must include robust testing.** Comprehensive testing within pull requests is crucial for the stability of dbt. By prioritizing robust testing, we ensure the reliability of our codebase, minimize unforeseen issues, and safeguard against potential regressions. **We cannot merge changes that risk the backward incompatibility of existing documented behaviors.** We understand that creating thorough tests often requires significant effort, and your dedication to this process greatly contributes to the project's overall reliability. Thank you for your commitment to maintaining the integrity of our codebase and the experience of everyone using dbt! -**PRs go through two review steps.** First, we aim to respond with feedback on whether we think the implementation is appropriate from a product & usability standpoint. At this point, we will close PRs that we believe fall outside the scope of dbt Core, or which might lead to an inconsistent user experience. This is an important part of our role as maintainers; we're always open to hearing disagreement. If a PR passes this first review, we will queue it up for code review, at which point we aim to test it ourselves and provide thorough feedback within the next month. +**PRs go through two review steps.** First, we aim to respond with feedback on whether we think the implementation is appropriate from a product & usability standpoint. At this point, we will close PRs that we believe fall outside the scope of dbt Core, or which might lead to an inconsistent user experience. This is an important part of our role as maintainers; we're always open to hearing disagreement. If a PR passes this first review, we will queue it up for code review, at which point we aim to test it ourselves and provide thorough feedback. -**We receive more PRs than we can thoroughly review, test, and merge.** Our teams have finite capacity, and our top priority is maintaining a well-scoped, high-quality framework for the tens of thousands of people who use it every week. To that end, we must prioritize overall stability and planned improvements over a long tail of niche potential features. 
For best results, say what in particular you’d like feedback on, and explain what would it mean to you, your team, and other community members to have the proposed change merged. Smaller PRs tackling well-scoped issues tend to be easier and faster for review. Two recent examples of community-contributed PRs:
+**We receive more PRs than we can thoroughly review, test, and merge.** Our teams have finite capacity, and our top priority is maintaining a well-scoped, high-quality framework for the tens of thousands of people who use it every week. To that end, we must prioritize overall stability and planned improvements over a long tail of niche potential features. For best results, say what in particular you'd like feedback on, and explain what it would mean to you, your team, and other community members to have the proposed change merged. Smaller PRs tackling well-scoped issues tend to be easier and faster for review. Two examples of community-contributed PRs:

- [(dbt-core#9347) Fix configuration of turning test warnings into failures](https://github.com/dbt-labs/dbt-core/pull/9347)
- [(dbt-core#9863) Better error message when trying to select a disabled model](https://github.com/dbt-labs/dbt-core/pull/9863)

-**Automation that can help us:** Many repositories have a template for pull request descriptions, which will include a checklist that must be completed before the PR can be merged. You don’t have to do all of these things to get an initial PR, but they definitely help. Those many include things like:
+**Automation that can help us:** Many repositories have a template for pull request descriptions, which will include a checklist that must be completed before the PR can be merged. You don't have to do all of these things to get an initial PR, but skipping them will delay our review process. Those include:

-- **Tests!** When you open a PR, some tests and code checks will run. (For security reasons, some may need to be approved by a maintainer.) We will not merge any PRs with failing tests. If you’re not sure why a test is failing, please say so, and we’ll do our best to get to the bottom of it together.
+- **Tests, tests, tests.** When you open a PR, some tests and code checks will run. (For security reasons, some may need to be approved by a maintainer.) We will not merge any PRs with failing tests. If you're not sure why a test is failing, please say so, and we'll do our best to get to the bottom of it together.
- **Contributor License Agreement** (CLA): This ensures that we can merge your code, without worrying about unexpected implications for the copyright or license of open source dbt software. For more details, read: ["Contributor License Agreements"](../resources/contributor-license-agreements.md)
- **Changelog:** In projects that include a number of changes in each release, we need a reliable way to signal what's been included. The mechanism for this will vary by repository, so keep an eye out for notes about how to update the changelog.

#### Inclusion in release versions

-Both bug fixes and backwards-compatible new features will be included in the [next minor release](/docs/dbt-versions/core#how-dbt-core-uses-semantic-versioning). Fixes for regressions and net-new bugs that were present in the minor version's original release will be backported to versions with [active support](/docs/dbt-versions/core). Other bug fixes may be backported when we have high confidence that they're narrowly scoped and won't cause unintended side effects.
+Both bug fixes and backwards-compatible new features will be included in the [next minor release of dbt Core](/docs/dbt-versions/core#how-dbt-core-uses-semantic-versioning). Fixes for regressions and net-new bugs that were present in the minor version's original release will be backported to versions with [active support](/docs/dbt-versions/core). Other bug fixes may be backported when we have high confidence that they're narrowly scoped and won't cause unintended side effects. diff --git a/website/docs/docs/build/conversion-metrics.md b/website/docs/docs/build/conversion-metrics.md index 2ef2c3910b9..2d227f4a703 100644 --- a/website/docs/docs/build/conversion-metrics.md +++ b/website/docs/docs/build/conversion-metrics.md @@ -20,28 +20,29 @@ The specification for conversion metrics is as follows: Note that we use the double colon (::) to indicate whether a parameter is nested within another parameter. So for example, `query_params::metrics` means the `metrics` parameter is nested under `query_params`. ::: -| Parameter | Description | Type | -| --- | --- | --- | -| `name` | The name of the metric. | Required | -| `description` | The description of the metric. | Optional | -| `type` | The type of metric (such as derived, ratio, and so on.). In this case, set as 'conversion' | Required | -| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | -| `type_params` | Specific configurations for each metric type. | Required | -| `conversion_type_params` | Additional configuration specific to conversion metrics. | Required | -| `entity` | The entity for each conversion event. | Required | -| `calculation` | Method of calculation. Either `conversion_rate` or `conversions`. Defaults to `conversion_rate`. | Optional | -| `base_measure` | A list of base measure inputs | Required | -| `base_measure:name` | The base conversion event measure. | Required | -| `base_measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | -| `base_measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional | -| `conversion_measure` | A list of conversion measure inputs. | Required | -| `conversion_measure:name` | The base conversion event measure.| Required | -| `conversion_measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | -| `conversion_measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional | -| `window` | The time window for the conversion event, such as 7 days, 1 week, 3 months. Defaults to infinity. | Optional | -| `constant_properties` | List of constant properties. | Optional | -| `base_property` | The property from the base semantic model that you want to hold constant. | Optional | -| `conversion_property` | The property from the conversion semantic model that you want to hold constant. | Optional | +| Parameter | Description | Required | Type | +| --- | --- | --- | --- | +| `name` | The name of the metric. | Required | String | +| `description` | The description of the metric. | Optional | String | +| `type` | The type of metric (such as derived, ratio, and so on.). In this case, set as 'conversion'. 
| Required | String | +| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | String | +| `type_params` | Specific configurations for each metric type. | Required | Dict | +| `conversion_type_params` | Additional configuration specific to conversion metrics. | Required | Dict | +| `entity` | The entity for each conversion event. | Required | String | +| `calculation` | Method of calculation. Either `conversion_rate` or `conversions`. Defaults to `conversion_rate`. | Optional | String | +| `base_measure` | A list of base measure inputs. | Required | Dict | +| `base_measure:name` | The base conversion event measure. | Required | String | +| `base_measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | String | +| `base_measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional | Boolean | +| `base_measure:filter` | Optional `filter` used to apply to the base measure. | Optional | String | +| `conversion_measure` | A list of conversion measure inputs. | Required | Dict | +| `conversion_measure:name` | The base conversion event measure.| Required | String | +| `conversion_measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | String | +| `conversion_measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional | Boolean | +| `window` | The time window for the conversion event, such as 7 days, 1 week, 3 months. Defaults to infinity. | Optional | String | +| `constant_properties` | List of constant properties. | Optional | List | +| `base_property` | The property from the base semantic model that you want to hold constant. | Optional | String | +| `conversion_property` | The property from the conversion semantic model that you want to hold constant. | Optional | String | Refer to [additional settings](#additional-settings) to learn how to customize conversion metrics with settings for null values, calculation type, and constant properties. @@ -61,6 +62,7 @@ metrics: name: The name of the measure # Required fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional join_to_timespine: true/false # Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. # Optional + filter: The filter used to apply to the base measure. 
# Optional conversion_measure: name: The name of the measure # Required fill_nulls_with: Set the value in your metric definition instead of null (such as zero) # Optional @@ -105,13 +107,14 @@ Next, define a conversion metric as follows: - name: visit_to_buy_conversion_rate_7d description: "Conversion rate from visiting to transaction in 7 days" type: conversion - label: Visit to Buy Conversion Rate (7-day window) + label: Visit to buy conversion rate (7-day window) type_params: conversion_type_params: base_measure: name: visits fill_nulls_with: 0 - conversion_measure: sellers + filter: {{ Dimension('visits__referrer_id') }} = 'facebook' + conversion_measure: name: sellers entity: user window: 7 days diff --git a/website/docs/docs/build/cumulative-metrics.md b/website/docs/docs/build/cumulative-metrics.md index b44918d2fbd..24596be8b3d 100644 --- a/website/docs/docs/build/cumulative-metrics.md +++ b/website/docs/docs/build/cumulative-metrics.md @@ -18,21 +18,21 @@ Note that we use the double colon (::) to indicate whether a parameter is nested <VersionBlock firstVersion="1.9"> -| Parameter | <div style={{width:'350px'}}>Description</div> | Type | -|-------------|---------------------------------------------------|-----------| -| `name` | The name of the metric. | Required | -| `description` | The description of the metric. | Optional | -| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required | -| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | -| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required | -| `type_params::measure` | The measure associated with the metric. Supports both shorthand (string) and object syntax. The shorthand is used if only the name is needed, while the object syntax allows specifying additional attributes. | Required | -| `measure::name` | The name of the measure being referenced. Required if using object syntax for `type_params::measure`. | Optional | -| `measure::fill_nulls_with` | Sets a value (for example, 0) to replace nulls in the metric definition. | Optional | -| `measure::join_to_timespine` | Boolean indicating if the aggregated measure should be joined to the time spine table to fill in missing dates. Default is `false`. | Optional | -| `type_params::cumulative_type_params` | Configures the attributes like `window`, `period_agg`, and `grain_to_date` for cumulative metrics. | Optional | -| `cumulative_type_params::window` | Specifies the accumulation window, such as `1 month`, `7 days`, or `1 year`. Cannot be used with `grain_to_date`. | Optional | -| `cumulative_type_params::grain_to_date` | Sets the accumulation grain, such as `month`, restarting accumulation at the beginning of each specified grain period. Cannot be used with `window`. | Optional | -| `cumulative_type_params::period_agg` | Defines how to aggregate the cumulative metric when summarizing data to a different granularity: `first`, `last`, or `average`. Defaults to `first` if `window` is not specified. | Optional | +| Parameter | <div style={{width:'350px'}}>Description</div> | Required | Type | +|-------------|---------------------------------------------------|----------|-----------| +| `name` | The name of the metric. | Required | String | +| `description` | The description of the metric. 
| Optional | String | +| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required | String | +| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | String | +| `type_params` | The type parameters of the metric. Supports nested parameters indicated by the double colon, such as `type_params::measure`. | Required | Dict | +| `type_params::measure` | The measure associated with the metric. Supports both shorthand (string) and object syntax. The shorthand is used if only the name is needed, while the object syntax allows specifying additional attributes. | Required | Dict | +| `measure::name` | The name of the measure being referenced. Required if using object syntax for `type_params::measure`. | Optional | String | +| `measure::fill_nulls_with` | Sets a value (for example, 0) to replace nulls in the metric definition. | Optional | Integer or string | +| `measure::join_to_timespine` | Boolean indicating if the aggregated measure should be joined to the time spine table to fill in missing dates. Default is `false`. | Optional | Boolean | +| `type_params::cumulative_type_params` | Configures the attributes like `window`, `period_agg`, and `grain_to_date` for cumulative metrics. | Optional | Dict | +| `cumulative_type_params::window` | Specifies the accumulation window, such as `1 month`, `7 days`, or `1 year`. Cannot be used with `grain_to_date`. | Optional | String | +| `cumulative_type_params::grain_to_date` | Sets the accumulation grain, such as `month`, restarting accumulation at the beginning of each specified grain period. Cannot be used with `window`. | Optional | String | +| `cumulative_type_params::period_agg` | Defines how to aggregate the cumulative metric when summarizing data to a different granularity: `first`, `last`, or `average`. Defaults to `first` if `window` is not specified. | Optional | String | </VersionBlock> diff --git a/website/docs/docs/build/derived-metrics.md b/website/docs/docs/build/derived-metrics.md index d5f2221907e..b6184aaeebf 100644 --- a/website/docs/docs/build/derived-metrics.md +++ b/website/docs/docs/build/derived-metrics.md @@ -10,18 +10,18 @@ In MetricFlow, derived metrics are metrics created by defining an expression usi The parameters, description, and type for derived metrics are: -| Parameter | Description | Type | -| --------- | ----------- | ---- | -| `name` | The name of the metric. | Required | -| `description` | The description of the metric. | Optional | -| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required | -| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | -| `type_params` | The type parameters of the metric. | Required | -| `expr` | The derived expression. You see validation warnings when the derived metric is missing an `expr` or the `expr` does not use all the input metrics. | Required | -| `metrics` | The list of metrics used in the derived metrics. | Required | -| `alias` | Optional alias for the metric that you can use in the expr. | Optional | -| `filter` | Optional filter to apply to the metric. | Optional | -| `offset_window` | Set the period for the offset window, such as 1 month. This will return the value of the metric one month from the metric time. 
| Optional | +| Parameter | Description | Required | Type | +| --------- | ----------- | ---- | ---- | +| `name` | The name of the metric. | Required | String | +| `description` | The description of the metric. | Optional | String | +| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required | String | +| `label` | Defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | String | +| `type_params` | The type parameters of the metric. | Required | Dict | +| `expr` | The derived expression. You'll see validation warnings when the derived metric is missing an `expr` or the `expr` does not use all the input metrics. | Required | String | +| `metrics` | The list of metrics used in the derived metrics. Each entry can include optional fields like `alias`, `filter`, or `offset_window`. | Required | List | +| `alias` | Optional alias for the metric that you can use in the `expr`. | Optional | String | +| `filter` | Optional filter to apply to the metric. | Optional | String | +| `offset_window` | Set the period for the offset window, such as 1 month. This will return the value of the metric one month from the metric time. | Optional | String | The following displays the complete specification for derived metrics, along with an example. diff --git a/website/docs/docs/build/dimensions.md b/website/docs/docs/build/dimensions.md index 5026f4c45cd..975ae4d3160 100644 --- a/website/docs/docs/build/dimensions.md +++ b/website/docs/docs/build/dimensions.md @@ -14,14 +14,14 @@ Groups are defined within semantic models, alongside entities and measures, and All dimensions require a `name`, `type`, and can optionally include an `expr` parameter. The `name` for your Dimension must be unique within the same semantic model. -| Parameter | Description | Type | -| --------- | ----------- | ---- | -| `name` | Refers to the name of the group that will be visible to the user in downstream tools. It can also serve as an alias if the column name or SQL query reference is different and provided in the `expr` parameter. <br /><br /> Dimension names should be unique within a semantic model, but they can be non-unique across different models as MetricFlow uses [joins](/docs/build/join-logic) to identify the right dimension. | Required | -| `type` | Specifies the type of group created in the semantic model. There are two types:<br /><br />- **Categorical**: Describe attributes or features like geography or sales region. <br />- **Time**: Time-based dimensions like timestamps or dates. | Required | -| `type_params` | Specific type params such as if the time is primary or used as a partition | Required | -| `description` | A clear description of the dimension | Optional | -| `expr` | Defines the underlying column or SQL query for a dimension. If no `expr` is specified, MetricFlow will use the column with the same name as the group. You can use the column name itself to input a SQL expression. | Optional | -| `label` | A recommended string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Optional | +| Parameter | Description | Required | Type | +| --------- | ----------- | ---- | ---- | +| `name` | Refers to the name of the group that will be visible to the user in downstream tools. It can also serve as an alias if the column name or SQL query reference is different and provided in the `expr` parameter. 
<br /><br /> Dimension names should be unique within a semantic model, but they can be non-unique across different models as MetricFlow uses [joins](/docs/build/join-logic) to identify the right dimension. | Required | String | +| `type` | Specifies the type of group created in the semantic model. There are two types:<br /><br />- **Categorical**: Describe attributes or features like geography or sales region. <br />- **Time**: Time-based dimensions like timestamps or dates. | Required | String | +| `type_params` | Specific type params such as if the time is primary or used as a partition. | Required | Dict | +| `description` | A clear description of the dimension. | Optional | String | +| `expr` | Defines the underlying column or SQL query for a dimension. If no `expr` is specified, MetricFlow will use the column with the same name as the group. You can use the column name itself to input a SQL expression. | Optional | String | +| `label` | Defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Optional | String | Refer to the following for the complete specification for dimensions: diff --git a/website/docs/docs/build/environment-variables.md b/website/docs/docs/build/environment-variables.md index 99129cea8c9..95242069ed9 100644 --- a/website/docs/docs/build/environment-variables.md +++ b/website/docs/docs/build/environment-variables.md @@ -83,7 +83,7 @@ If you change the value of an environment variable mid-session while using the I To refresh the IDE mid-development, click on either the green 'ready' signal or the red 'compilation error' message at the bottom right corner of the IDE. A new modal will pop up, and you should select the Refresh IDE button. This will load your environment variables values into your development environment. -<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/refresh-ide.gif" title="Refreshing IDE mid-session"/> +<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/refresh-ide.png" title="Refreshing IDE mid-session"/> There are some known issues with partial parsing of a project and changing environment variables mid-session in the IDE. If you find that your dbt project is not compiling to the values you've set, try deleting the `target/partial_parse.msgpack` file in your dbt project which will force dbt to re-compile your whole project. diff --git a/website/docs/docs/build/exposures.md b/website/docs/docs/build/exposures.md index 1a85d5fb415..16dfd0e5f73 100644 --- a/website/docs/docs/build/exposures.md +++ b/website/docs/docs/build/exposures.md @@ -69,7 +69,7 @@ dbt test -s +exposure:weekly_jaffle_report ``` -When we generate the dbt Explorer site, you'll see the exposure appear: +When we generate the [dbt Explorer site](/docs/collaborate/explore-projects), you'll see the exposure appear: <Lightbox src="/img/docs/building-a-dbt-project/dbt-explorer-exposures.jpg" title="Exposures has a dedicated section, under the 'Resources' tab in dbt Explorer, which lists each exposure in your project."/> <Lightbox src="/img/docs/building-a-dbt-project/dag-exposures.png" title="Exposures appear as nodes in the dbt Explorer DAG. It displays an orange 'EXP' indicator within the node. 
"/> diff --git a/website/docs/docs/build/hooks-operations.md b/website/docs/docs/build/hooks-operations.md index 6cec2a673c0..842d3fb99a3 100644 --- a/website/docs/docs/build/hooks-operations.md +++ b/website/docs/docs/build/hooks-operations.md @@ -40,8 +40,6 @@ Hooks are snippets of SQL that are executed at different times: Hooks are a more-advanced capability that enable you to run custom SQL, and leverage database-specific actions, beyond what dbt makes available out-of-the-box with standard materializations and configurations. -<Snippet path="hooks-to-grants" /> - If (and only if) you can't leverage the [`grants` resource-config](/reference/resource-configs/grants), you can use `post-hook` to perform more advanced workflows: * Need to apply `grants` in a more complex way, which the dbt Core `grants` config doesn't (yet) support. diff --git a/website/docs/docs/build/incremental-microbatch.md b/website/docs/docs/build/incremental-microbatch.md index 46dfa795f1d..4aff8b5839c 100644 --- a/website/docs/docs/build/incremental-microbatch.md +++ b/website/docs/docs/build/incremental-microbatch.md @@ -8,7 +8,9 @@ id: "incremental-microbatch" :::info Microbatch -The `microbatch` strategy is available in beta for [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) and dbt Core v1.9. We have been developing it behind a flag to prevent unintended interactions with existing custom incremental strategies. To enable this feature, [set the environment variable](/docs/build/environment-variables#setting-and-overriding-environment-variables) `DBT_EXPERIMENTAL_MICROBATCH` to `True` in your dbt Cloud environments or wherever you're running dbt Core. +The new `microbatch` strategy is available in beta for [dbt Cloud "Latest"](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9. + +If you use a custom microbatch macro, set a [distinct behavior flag](/reference/global-configs/behavior-changes#custom-microbatch-strategy) in your `dbt_project.yml` to enable batched execution. If you don't have a custom microbatch macro, you don't need to set this flag as dbt will handle microbatching automatically for any model using the [microbatch strategy](#how-microbatch-compares-to-other-incremental-strategies). Read and participate in the discussion: [dbt-core#10672](https://github.com/dbt-labs/dbt-core/discussions/10672) @@ -20,17 +22,35 @@ Refer to [Supported incremental strategies by adapter](/docs/build/incremental-s Incremental models in dbt are a [materialization](/docs/build/materializations) designed to efficiently update your data warehouse tables by only transforming and loading _new or changed data_ since the last run. Instead of reprocessing an entire dataset every time, incremental models process a smaller number of rows, and then append, update, or replace those rows in the existing table. This can significantly reduce the time and resources required for your data transformations. -Microbatch incremental models make it possible to process transformations on very large time-series datasets with efficiency and resiliency. When dbt runs a microbatch model — whether for the first time, during incremental runs, or in specified backfills — it will split the processing into multiple queries (or "batches"), based on the [`event_time`](/reference/resource-configs/event-time) and `batch_size` you configure. 
+Microbatch is an incremental strategy designed for large time-series datasets: +- It relies solely on a time column ([`event_time`](/reference/resource-configs/event-time)) to define time-based ranges for filtering. Set the `event_time` column for your microbatch model and its direct parents (upstream models). Note, this is different to `partition_by`, which groups rows into partitions. +- It complements, rather than replaces, existing incremental strategies by focusing on efficiency and simplicity in batch processing. +- Unlike traditional incremental strategies, microbatch enables you to [reprocess failed batches](/docs/build/incremental-microbatch#retry), auto-detect [parallel batch execution](#parallel-batch-execution), and eliminate the need to implement complex conditional logic for [backfilling](#backfills). + +- Note, microbatch might not be the best strategy for all use cases. Consider other strategies for use cases such as not having a reliable `event_time` column or if you want more control over the incremental logic. Read more in [How `microbatch` compares to other incremental strategies](#how-microbatch-compares-to-other-incremental-strategies). + +### How microbatch works + +When dbt runs a microbatch model — whether for the first time, during incremental runs, or in specified backfills — it will split the processing into multiple queries (or "batches"), based on the `event_time` and `batch_size` you configure. + +Each "batch" corresponds to a single bounded time period (by default, a single day of data). Where other incremental strategies operate only on "old" and "new" data, microbatch models treat every batch as an atomic unit that can be built or replaced on its own. Each batch is independent and <Term id="idempotent" />. + +This is a powerful abstraction that makes it possible for dbt to run batches [separately](#backfills), concurrently, and [retry](#retry) them independently. -Each "batch" corresponds to a single bounded time period (by default, a single day of data). Where other incremental strategies operate only on "old" and "new" data, microbatch models treat every batch as an atomic unit that can be built or replaced on its own. Each batch is independent and <Term id="idempotent" />. This is a powerful abstraction that makes it possible for dbt to run batches separately — in the future, concurrently — and to retry them independently. +## Example -### Example +A `sessions` model aggregates and enriches data that comes from two other models: +- `page_views` is a large, time-series table. It contains many rows, new records almost always arrive after existing ones, and existing records rarely update. It uses the `page_view_start` column as its `event_time`. +- `customers` is a relatively small dimensional table. Customer attributes update often, and not in a time-based manner — that is, older customers are just as likely to change column values as newer customers. The customers model doesn't configure an `event_time` column. -A `sessions` model aggregates and enriches data that comes from two other models. -- `page_views` is a large, time-series table. It contains many rows, new records almost always arrive after existing ones, and existing records rarely update. -- `customers` is a relatively small dimensional table. Customer attributes update often, and not in a time-based manner — that is, older customers are just as likely to change column values as newer customers. 
+As a result: -The `page_view_start` column in `page_views` is configured as that model's `event_time`. The `customers` model does not configure an `event_time`. Therefore, each batch of `sessions` will filter `page_views` to the equivalent time-bounded batch, and it will not filter `customers` (a full scan for every batch). +- Each batch of `sessions` will filter `page_views` to the equivalent time-bounded batch. +- The `customers` table isn't filtered, resulting in a full scan for every batch. + +:::tip +In addition to configuring `event_time` for the target table, you should also specify it for any upstream models that you want to filter, even if they have different time columns. +::: <File name="models/staging/page_views.yml"> @@ -42,7 +62,7 @@ models: ``` </File> -We run the `sessions` model on October 1, 2024, and then again on October 2. It produces the following queries: +We run the `sessions` model for October 1, 2024, and then again for October 2. It produces the following queries: <Tabs> @@ -156,22 +176,65 @@ It does not matter whether the table already contains data for that day. Given t <Lightbox src="/img/docs/building-a-dbt-project/microbatch/microbatch_filters.png" title="Each batch of sessions filters page_views to the matching time-bound batch, but doesn't filter sessions, performing a full scan for each batch."/> -### Relevant configs +## Relevant configs Several configurations are relevant to microbatch models, and some are required: -| Config | Type | Description | Default | -|----------|------|---------------|---------| -| [`event_time`](/reference/resource-configs/event-time) | Column (required) | The column indicating "at what time did the row occur." Required for your microbatch model and any direct parents that should be filtered. | N/A | -| `begin` | Date (required) | The "beginning of time" for the microbatch model. This is the starting point for any initial or full-refresh builds. For example, a daily-grain microbatch model run on `2024-10-01` with `begin = '2023-10-01` will process 366 batches (it's a leap year!) plus the batch for "today." | N/A | -| `batch_size` | String (required) | The granularity of your batches. Supported values are `hour`, `day`, `month`, and `year` | N/A | -| `lookback` | Integer (optional) | Process X batches prior to the latest bookmark to capture late-arriving records. | `1` | + +| Config | Description | Default | Type | Required | +|----------|---------------|---------|------|---------| +| [`event_time`](/reference/resource-configs/event-time) | The column indicating "at what time did the row occur." Required for your microbatch model and any direct parents that should be filtered. | N/A | Column | Required | +| [`begin`](/reference/resource-configs/begin) | The "beginning of time" for the microbatch model. This is the starting point for any initial or full-refresh builds. For example, a daily-grain microbatch model run on `2024-10-01` with `begin = '2023-10-01` will process 366 batches (it's a leap year!) plus the batch for "today." | N/A | Date | Required | +| [`batch_size`](/reference/resource-configs/batch-size) | The granularity of your batches. Supported values are `hour`, `day`, `month`, and `year` | N/A | String | Required | +| [`lookback`](/reference/resource-configs/lookback) | Process X batches prior to the latest bookmark to capture late-arriving records. 
| `1` | Integer | Optional | +| [`concurrent_batches`](/reference/resource-properties/concurrent_batches) | Overrides dbt's auto detect for running batches concurrently (at the same time). Read more about [configuring concurrent batches](/docs/build/incremental-microbatch#configure-concurrent_batches). Setting to <br />* `true` runs batches concurrently (in parallel). <br />* `false` runs batches sequentially (one after the other). | `None` | Boolean | Optional | <Lightbox src="/img/docs/building-a-dbt-project/microbatch/event_time.png" title="The event_time column configures the real-world time of this record"/> +### Required configs for specific adapters +Some adapters require additional configurations for the microbatch strategy. This is because each adapter implements the microbatch strategy differently. + +The following table lists the required configurations for the specific adapters, in addition to the standard microbatch configs: + +| Adapter | `unique_key` config | `partition_by` config | +|----------|------------------|--------------------| +| [`dbt-postgres`](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | ✅ Required | N/A | +| [`dbt-spark`](/reference/resource-configs/spark-configs#incremental-models) | N/A | ✅ Required | +| [`dbt-bigquery`](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | N/A | ✅ Required | + +For example, if you're using `dbt-postgres`, configure `unique_key` as follows: + +<File name="models/sessions.sql"> + +```sql +{{ config( + materialized='incremental', + incremental_strategy='microbatch', + unique_key='sales_id', ## required for dbt-postgres + event_time='transaction_date', + begin='2023-01-01', + batch_size='day' +) }} + +select + sales_id, + transaction_date, + customer_id, + product_id, + total_amount +from {{ source('sales', 'transactions') }} + +``` + + In this example, `unique_key` is required because `dbt-postgres` microbatch uses the `merge` strategy, which needs a `unique_key` to identify which rows in the data warehouse need to get merged. Without a `unique_key`, dbt won't be able to match rows between the incoming batch and the existing table. + +</File> + +### Full refresh + As a best practice, we recommend configuring `full_refresh: False` on microbatch models so that they ignore invocations with the `--full-refresh` flag. If you need to reprocess historical data, do so with a targeted backfill that specifies explicit start and end dates. -### Usage +## Usage **You must write your model query to process (read and return) exactly one "batch" of data**. This is a simplifying assumption and a powerful one: - You don’t need to think about `is_incremental` filtering @@ -188,7 +251,7 @@ During standard incremental runs, dbt will process batches according to the curr **Note:** If there’s an upstream model that configures `event_time`, but you *don’t* want the reference to it to be filtered, you can specify `ref('upstream_model').render()` to opt-out of auto-filtering. This isn't generally recommended — most models that configure `event_time` are fairly large, and if the reference is not filtered, each batch will perform a full scan of this input table. -### Backfills +## Backfills Whether to fix erroneous source data or retroactively apply a change in business logic, you may need to reprocess a large amount of historical data. 
@@ -203,13 +266,13 @@ dbt run --event-time-start "2024-09-01" --event-time-end "2024-09-04" <Lightbox src="/img/docs/building-a-dbt-project/microbatch/microbatch_backfill.png" title="Configure a lookback to reprocess additional batches during standard incremental runs"/> -### Retry +## Retry If one or more of your batches fail, you can use `dbt retry` to reprocess _only_ the failed batches. ![Partial retry](https://github.com/user-attachments/assets/f94c4797-dcc7-4875-9623-639f70c97b8f) -### Timezones +## Timezones For now, dbt assumes that all values supplied are in UTC: @@ -220,7 +283,127 @@ For now, dbt assumes that all values supplied are in UTC: While we may consider adding support for custom time zones in the future, we also believe that defining these values in UTC makes everyone's lives easier. -## How `microbatch` compares to other incremental strategies? +## Parallel batch execution + +The microbatch strategy offers the benefit of updating a model in smaller, more manageable batches. Depending on your use case, configuring your microbatch models to run in parallel offers faster processing, in comparison to running batches sequentially. + +Parallel batch execution means that multiple batches are processed at the same time, instead of one after the other (sequentially) for faster processing of your microbatch models. + +dbt automatically detects whether a batch can be run in parallel in most cases, which means you don’t need to configure this setting. However, the [`concurrent_batches` config](/reference/resource-properties/concurrent_batches) is available as an override (not a gate), allowing you to specify whether batches should or shouldn’t be run in parallel in specific cases. + +For example, if you have a microbatch model with 12 batches, you can execute those batches in parallel. Specifically, they'll run in parallel, limited by the number of [available threads](/docs/running-a-dbt-project/using-threads). + +### Prerequisites + +To enable parallel execution, you must: + +- Use a supported adapter: + - Snowflake + - Databricks + - More adapters coming soon! + - We'll be continuing to test and add concurrency support for adapters. This means that some adapters might get concurrency support _after_ the 1.9 initial release. + +- Meet [additional conditions](#how-parallel-batch-execution-works) described in the following section. + +### How parallel batch execution works + +A batch can only run in parallel if all of these conditions are met: + +| Condition | Parallel execution | Sequential execution | +| ---------------| :------------------: | :----------: | +| **Not** the first batch | ✅ | - | +| **Not** the last batch | ✅ | - | +| [Adapter supports](#prerequisites) parallel batches | ✅ | - | + + +After checking for the conditions in the previous table, and if the `concurrent_batches` value isn't set, dbt will intelligently auto-detect if the model invokes the [`{{ this }}`](/reference/dbt-jinja-functions/this) Jinja function. If it references `{{ this }}`, the batches will run sequentially since `{{ this }}` represents the current model's own database relation, and referencing the same relation causes a conflict. + +Otherwise, if `{{ this }}` isn't detected (and other conditions are met), the batches will run in parallel, which can be overridden when you [set a value for `concurrent_batches`](/reference/resource-properties/concurrent_batches).
+ +### Parallel or sequential execution + +Choosing between parallel batch execution and sequential processing depends on the specific requirements of your use case. + +- Parallel batch execution is faster but requires logic independent of batch execution order. For example, if you're developing a data pipeline for a system that processes user transactions in batches, each batch is executed in parallel for better performance. However, the logic used to process each transaction shouldn't depend on the order of how batches are executed or completed. +- Sequential processing is slower but essential for calculations like [cumulative metrics](/docs/build/cumulative) in microbatch models. It processes data in the correct order, allowing each step to build on the previous one. + +<!-- You can override the check for `this` by setting `concurrent_batches` to either `True` or `False`. If set to `False`, the batch will be run sequentially. If set to `True` the batch will be run in parallel (assuming [1], [2], and [3]) +To override the `this` check, use the `concurrent_batches` configuration: + + +<File name='dbt_project.yml'> + +```yaml +models: + +concurrent_batches: True +``` + +</File> + +or: + +<File name='models/my_model.sql'> + +```sql +{{ + config( + materialized='incremental', + concurrent_batches=True, + incremental_strategy='microbatch' + + ... + ) +}} + +select ... +``` + +</File> +--> + +### Configure `concurrent_batches` + +By default, dbt auto-detects whether batches can run in parallel for microbatch models, and this works correctly in most cases. However, you can override dbt's detection by setting the [`concurrent_batches` config](/reference/resource-properties/concurrent_batches) in your `dbt_project.yml` or model `.sql` file to specify parallel or sequential execution, given you meet all the [conditions](#prerequisites): + +<Tabs> +<TabItem value="yaml" label="dbt_project.yml"> + +<File name='dbt_project.yml'> + +```yaml +models: + +concurrent_batches: true # value set to true to run batches in parallel +``` + +</File> +</TabItem> + +<TabItem value="sql" label="my_model.sql"> + +<File name='models/my_model.sql'> + +```sql +{{ + config( + materialized='incremental', + incremental_strategy='microbatch', + event_time='session_start', + begin='2020-01-01', + batch_size='day', + concurrent_batches=true, # value set to true to run batches in parallel + ... + ) +}} + +select ... +``` +</File> +</TabItem> +</Tabs> + +## How microbatch compares to other incremental strategies + +As data warehouses roll out new operations for concurrently replacing/upserting data partitions, we may find that the new operation for the data warehouse is more efficient than what the adapter uses for microbatch. In such instances, we reserve the right to update the default operation for microbatch, so long as it works as intended/documented for models that fit the microbatch paradigm. Most incremental models rely on the end user (you) to explicitly tell dbt what "new" means, in the context of each model, by writing a filter in an `{% if is_incremental() %}` conditional block. You are responsible for crafting this SQL in a way that queries [`{{ this }}`](/reference/dbt-jinja-functions/this) to check when the most recent record was last loaded, with an optional look-back window for late-arriving records.
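To make that contrast concrete, here's a minimal sketch of the kind of hand-written `is_incremental()` filter described in the previous paragraph. The model name (`stg_events`), the column name (`event_occurred_at`), and the Snowflake-style `dateadd` function are illustrative assumptions rather than anything prescribed by the docs; adjust them for your own project and warehouse:

```sql
{{ config(materialized='incremental') }}

select * from {{ ref('stg_events') }}

{% if is_incremental() %}
-- only process rows newer than what's already in the target table,
-- with a 3-day look-back window for late-arriving records
where event_occurred_at > (
    select dateadd('day', -3, max(event_occurred_at)) from {{ this }}
)
{% endif %}
```

By contrast, a microbatch model skips this hand-maintained filter entirely: dbt derives a time-bounded predicate for each batch from the `event_time`, `begin`, and `batch_size` configs.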
diff --git a/website/docs/docs/build/incremental-models.md b/website/docs/docs/build/incremental-models.md index a56246addf3..0560797c9bc 100644 --- a/website/docs/docs/build/incremental-models.md +++ b/website/docs/docs/build/incremental-models.md @@ -114,7 +114,7 @@ When you define a `unique_key`, you'll see this behavior for each row of "new" d Please note that if there's a unique_key with more than one row in either the existing target table or the new incremental rows, the incremental model may fail depending on your database and [incremental strategy](/docs/build/incremental-strategy). If you're having issues running an incremental model, it's a good idea to double check that the unique key is truly unique in both your existing database table and your new incremental rows. You can [learn more about surrogate keys here](https://www.getdbt.com/blog/guide-to-surrogate-key). :::info -While common incremental strategies, such as`delete+insert` + `merge`, might use `unique_key`, others don't. For example, the `insert_overwrite` strategy does not use `unique_key`, because it operates on partitions of data rather than individual rows. For more information, see [About incremental_strategy](/docs/build/incremental-strategy). +While common incremental strategies, such as `delete+insert` + `merge`, might use `unique_key`, others don't. For example, the `insert_overwrite` strategy does not use `unique_key`, because it operates on partitions of data rather than individual rows. For more information, see [About incremental_strategy](/docs/build/incremental-strategy). ::: #### `unique_key` example @@ -156,15 +156,17 @@ Building this model incrementally without the `unique_key` parameter would resul ## How do I rebuild an incremental model? If your incremental model logic has changed, the transformations on your new rows of data may diverge from the historical transformations, which are stored in your target table. In this case, you should rebuild your incremental model. -To force dbt to rebuild the entire incremental model from scratch, use the `--full-refresh` flag on the command line. This flag will cause dbt to drop the existing target table in the database before rebuilding it for all-time. +To force dbt to rebuild the entire incremental model from scratch, use the `--full-refresh` flag on the command line. This flag will cause dbt to drop the existing target table in the database before rebuilding it for all-time. ```bash $ dbt run --full-refresh --select my_incremental_model+ ``` + It's also advisable to rebuild any downstream models, as indicated by the trailing `+`. -For detailed usage instructions, check out the [dbt run](/reference/commands/run) documentation. +You can optionally use the [`full_refresh config`](/reference/resource-configs/full_refresh) to set a resource to always or never full-refresh at the project or resource level. If specified as true or false, the `full_refresh` config will take precedence over the presence or absence of the `--full-refresh` flag. +For detailed usage instructions, check out the [dbt run](/reference/commands/run) documentation. ## What if the columns of my incremental model change? 
diff --git a/website/docs/docs/build/incremental-strategy.md b/website/docs/docs/build/incremental-strategy.md index 86b6a89edc6..9176e962a3a 100644 --- a/website/docs/docs/build/incremental-strategy.md +++ b/website/docs/docs/build/incremental-strategy.md @@ -30,10 +30,10 @@ Click the name of the adapter in the below table for more information about supp | [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ | | ✅ | | [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | | ✅ | | ✅ | ✅ | | [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | ✅ | ✅ | | ✅ | ✅ | -| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | ✅ | ✅ | | ✅ | | +| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | ✅ | ✅ | | ✅ | ✅ | | [dbt-snowflake](/reference/resource-configs/snowflake-configs#merge-behavior-incremental-models) | ✅ | ✅ | ✅ | | ✅ | | [dbt-trino](/reference/resource-configs/trino-configs#incremental) | ✅ | ✅ | ✅ | | | -| [dbt-fabric](/reference/resource-configs/fabric-configs#incremental) | ✅ | ✅ | ✅ | | | +| [dbt-fabric](/reference/resource-configs/fabric-configs#incremental) | ✅ | | ✅ | | | | [dbt-athena](/reference/resource-configs/athena-configs#incremental-models) | ✅ | ✅ | | ✅ | | ### Configuring incremental strategy @@ -295,6 +295,8 @@ For example, a user-defined strategy named `insert_only` can be defined and used </File> +If you use a custom microbatch macro, set a [`require_batched_execution_for_custom_microbatch_strategy` behavior flag](/reference/global-configs/behavior-changes#custom-microbatch-strategy) in your `dbt_project.yml` to enable batched execution of your custom strategy. + ### Custom strategies from a package To use the `merge_null_safe` custom incremental strategy from the `example` package: diff --git a/website/docs/docs/build/metricflow-time-spine.md b/website/docs/docs/build/metricflow-time-spine.md index 5f16af38023..5499c61a8e4 100644 --- a/website/docs/docs/build/metricflow-time-spine.md +++ b/website/docs/docs/build/metricflow-time-spine.md @@ -7,7 +7,7 @@ tags: [Metrics, Semantic Layer] --- <VersionBlock firstVersion="1.9"> -<!-- this whole section is for 1.9 and higher + Versionless --> +<!-- this whole section is for 1.9 and higher + Release Tracks --> It's common in analytics engineering to have a date dimension or "time spine" table as a base table for different types of time-based joins and aggregations. The structure of this table is typically a base column of daily or hourly dates, with additional columns for other time grains, like fiscal quarters, defined based on the base column. You can join other tables to the time spine on the base column to calculate metrics like revenue at a point in time, or to aggregate to a specific time grain. @@ -108,7 +108,7 @@ models: - It needs to reference a column defined under the `columns` key, in this case, `date_hour` and `date_day`, respectively. - It sets the granularity at the column-level using the `granularity` key, in this case, `hour` and `day`, respectively. - MetricFlow will use the `standard_granularity_column` as the join key when joining the time spine table to another source table. -- [The `custom_granularities` field](#custom-calendar), (available in Versionless and dbt v1.9 and higher) lets you specify non-standard time periods like `fiscal_year` or `retail_month` that your organization may use. 
+- [The `custom_granularities` field](#custom-calendar), (available in dbt Cloud Latest and dbt Core v1.9 and higher) lets you specify non-standard time periods like `fiscal_year` or `retail_month` that your organization may use. For an example project, refer to our [Jaffle shop](https://github.com/dbt-labs/jaffle-sl-template/blob/main/models/marts/_models.yml) example. @@ -179,8 +179,8 @@ final as ( select * from final -- filter the time spine to a specific range -where date_day > dateadd(year, -4, current_timestamp()) -and date_day < dateadd(day, 30, current_timestamp()) +where date_day > date_add(DATE(current_timestamp()), INTERVAL -4 YEAR) +and date_day < date_add(DATE(current_timestamp()), INTERVAL 30 DAY) ``` </File> @@ -310,9 +310,7 @@ You only need to include the `date_day` column in the table. MetricFlow can hand <VersionBlock lastVersion="1.8"> -The ability to configure custom calendars, such as a fiscal calendar, is available in [dbt Cloud Versionless](/docs/dbt-versions/versionless-cloud) or dbt Core [v1.9 and higher](/docs/dbt-versions/core). - -To access this feature, [upgrade to Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) or your dbt Core version to v1.9 or higher. +The ability to configure custom calendars, such as a fiscal calendar, is available now in [the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks), and it will be available in [dbt Core v1.9+](/docs/dbt-versions/core-upgrade/upgrading-to-v1.9). </VersionBlock> diff --git a/website/docs/docs/build/metrics-overview.md b/website/docs/docs/build/metrics-overview.md index 7021a6d7330..57cdd929acb 100644 --- a/website/docs/docs/build/metrics-overview.md +++ b/website/docs/docs/build/metrics-overview.md @@ -15,15 +15,15 @@ This article explains the different supported metric types you can add to your d <VersionBlock firstVersion="1.8"> -| Parameter | Description | Type | -| --------- | ----------- | ---- | -| `name` | Provide the reference name for the metric. This name must be a unique metric name and can consist of lowercase letters, numbers, and underscores. | Required | -| `description` | Describe your metric. | Optional | -| `type` | Define the type of metric, which can be `conversion`, `cumulative`, `derived`, `ratio`, or `simple`. | Required | -| `type_params` | Additional parameters used to configure metrics. `type_params` are different for each metric type. | Required | -| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | -| `config` | Use the [`config`](/reference/resource-properties/config) property to specify configurations for your metric. Supports [`meta`](/reference/resource-configs/meta), [`group`](/reference/resource-configs/group), and [`enabled`](/reference/resource-configs/enabled) configurations. | Optional | -| `filter` | You can optionally add a [filter](#filters) string to any metric type, applying filters to dimensions, entities, time dimensions, or other metrics during metric computation. Consider it as your WHERE clause. | Optional | +| Parameter | Description | Required | Type | +| --------- | ----------- | ---- | ---- | +| `name` | Provide the reference name for the metric. This name must be a unique metric name and can consist of lowercase letters, numbers, and underscores. | Required | String | +| `description` | Describe your metric. 
| Optional | String | +| `type` | Define the type of metric, which can be `conversion`, `cumulative`, `derived`, `ratio`, or `simple`. | Required | String | +| `type_params` | Additional parameters used to configure metrics. `type_params` are different for each metric type. | Required | Dict | +| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | String | +| `config` | Use the [`config`](/reference/resource-properties/config) property to specify configurations for your metric. Supports [`meta`](/reference/resource-configs/meta), [`group`](/reference/resource-configs/group), and [`enabled`](/reference/resource-configs/enabled) configurations. | Optional | Dict | +| `filter` | You can optionally add a [filter](#filters) string to any metric type, applying filters to dimensions, entities, time dimensions, or other metrics during metric computation. Consider it as your WHERE clause. | Optional | String | Here's a complete example of the metrics spec configuration: @@ -52,16 +52,16 @@ metrics: <VersionBlock lastVersion="1.7"> -| Parameter | Description | Type | -| --------- | ----------- | ---- | -| `name` | Provide the reference name for the metric. This name must be unique amongst all metrics. | Required | -| `description` | Describe your metric. | Optional | -| `type` | Define the type of metric, which can be `simple`, `ratio`, `cumulative`, or `derived`. | Required | -| `type_params` | Additional parameters used to configure metrics. `type_params` are different for each metric type. | Required | -| `config` | Provide the specific configurations for your metric. | Optional | -| `meta` | Use the [`meta` config](/reference/resource-configs/meta) to set metadata for a resource. | Optional | -| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | -| `filter` | You can optionally add a filter string to any metric type, applying filters to dimensions, entities, or time dimensions during metric computation. Consider it as your WHERE clause. | Optional | +| Parameter | Description | Required | Type | +| --------- | ----------- | ---- | ---- | +| `name` | Provide the reference name for the metric. This name must be unique amongst all metrics. | Required | String | +| `description` | Describe your metric. | Optional | String | +| `type` | Define the type of metric, which can be `simple`, `ratio`, `cumulative`, or `derived`. | Required | String | +| `type_params` | Additional parameters used to configure metrics. `type_params` are different for each metric type. | Required | Dict | +| `config` | Provide the specific configurations for your metric. | Optional | Dict | +| `meta` | Use the [`meta` config](/reference/resource-configs/meta) to set metadata for a resource. | Optional | String | +| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | String | +| `filter` | You can optionally add a filter string to any metric type, applying filters to dimensions, entities, or time dimensions during metric computation. Consider it as your WHERE clause. 
| Optional | String | Here's a complete example of the metrics spec configuration: @@ -95,7 +95,8 @@ import SLCourses from '/snippets/_sl-course.md'; <VersionBlock lastVersion="1.8"> Default time granularity for metrics is useful if your time dimension has a very fine grain, like second or hour, but you typically query metrics rolled up at a coarser grain. -To set the default time granularity for metrics, you need to be on dbt Cloud Versionless or dbt v1.9 and higher. +Default time granularity for metrics is available now in [the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks), and it will be available in [dbt Core v1.9+](/docs/dbt-versions/core-upgrade/upgrading-to-v1.9). + </VersionBlock> diff --git a/website/docs/docs/build/packages.md b/website/docs/docs/build/packages.md index 49cd7e00b1c..9ba4ceeaff5 100644 --- a/website/docs/docs/build/packages.md +++ b/website/docs/docs/build/packages.md @@ -162,7 +162,7 @@ Where `name: 'dbt_utils'` specifies the subfolder of `dbt_packages` that's creat #### SSH Key Method (Command Line only) If you're using the Command Line, private packages can be cloned via SSH and an SSH key. -When you use SSH keys to authenticate to your git remote server, you don’t need to supply your username and password each time. Read more about SSH keys, how to generate them, and how to add them to your git provider here: [Github](https://docs.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh) and [GitLab](https://docs.gitlab.com/ee/ssh/). +When you use SSH keys to authenticate to your git remote server, you don’t need to supply your username and password each time. Read more about SSH keys, how to generate them, and how to add them to your git provider here: [Github](https://docs.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh) and [GitLab](https://docs.gitlab.com/ee/user/ssh.html). <File name='packages.yml'> diff --git a/website/docs/docs/build/python-models.md b/website/docs/docs/build/python-models.md index 28136f91e9c..eac477b03fd 100644 --- a/website/docs/docs/build/python-models.md +++ b/website/docs/docs/build/python-models.md @@ -598,6 +598,34 @@ Python models have capabilities that SQL models do not. They also have some draw - **These capabilities are very new.** As data warehouses develop new features, we expect them to offer cheaper, faster, and more intuitive mechanisms for deploying Python transformations. **We reserve the right to change the underlying implementation for executing Python models in future releases.** Our commitment to you is around the code in your model `.py` files, following the documented capabilities and guidance we're providing here. - **Lack of `print()` support.** The data platform runs and compiles your Python model without dbt's oversight. This means it doesn't display the output of commands such as Python's built-in [`print()`](https://docs.python.org/3/library/functions.html#print) function in dbt's logs. +- <Expandable alt_header="Alternatives to using print() in Python models"> + + The following explains other methods you can use for debugging, such as writing messages to a dataframe column: + + - Using platform logs: Use your data platform's logs to debug your Python models. + - Return logs as a dataframe: Create a dataframe containing your logs and build it into the warehouse. + - Develop locally with DuckDB: Test and debug your models locally using DuckDB before deploying them. 
+ + Here's an example of debugging in a Python model: + + ```python + def model(dbt, session): + dbt.config( + materialized = "table" + ) + + df = dbt.ref("my_source_table").df() + + # One option for debugging: write messages to temporary table column + # Pros: visibility + # Cons: won't work if table isn't building for some reason + msg = "something" + df["debugging"] = f"My debug message here: {msg}" + + return df + ``` + </Expandable> + As a general rule, if there's a transformation you could write equally well in SQL or Python, we believe that well-written SQL is preferable: it's more accessible to a greater number of colleagues, and it's easier to write code that's performant at scale. If there's a transformation you _can't_ write in SQL, or where ten lines of elegant and well-annotated Python could save you 1000 lines of hard-to-read Jinja-SQL, Python is the way to go. ## Specific data platforms {#specific-data-platforms} @@ -613,7 +641,8 @@ In their initial launch, Python models are supported on three of the most popula **Installing packages:** Snowpark supports several popular packages via Anaconda. Refer to the [complete list](https://repo.anaconda.com/pkgs/snowflake/) for more details. Packages are installed when your model is run. Different models can have different package dependencies. If you use third-party packages, Snowflake recommends using a dedicated virtual warehouse for best performance rather than one with many concurrent users. **Python version:** To specify a different python version, use the following configuration: -``` + +```python def model(dbt, session): dbt.config( materialized = "table", @@ -625,7 +654,7 @@ def model(dbt, session): **External access integrations and secrets**: To query external APIs within dbt Python models, use Snowflake’s [external access](https://docs.snowflake.com/en/developer-guide/external-network-access/external-network-access-overview) together with [secrets](https://docs.snowflake.com/en/developer-guide/external-network-access/secret-api-reference). Here are some additional configurations you can use: -``` +```python import pandas import snowflake.snowpark as snowpark @@ -645,18 +674,7 @@ def model(dbt, session: snowpark.Session): </VersionBlock> -**About "sprocs":** dbt submits Python models to run as _stored procedures_, which some people call _sprocs_ for short. By default, dbt will create a named sproc containing your model's compiled Python code, and then _call_ it to execute. Snowpark has an Open Preview feature for _temporary_ or _anonymous_ stored procedures ([docs](https://docs.snowflake.com/en/sql-reference/sql/call-with.html)), which are faster and leave a cleaner query history. You can switch this feature on for your models by configuring `use_anonymous_sproc: True`. We plan to switch this on for all dbt + Snowpark Python models starting with the release of dbt Core version 1.4. - -<File name='dbt_project.yml'> - -```yml -# I asked Snowflake Support to enable this Private Preview feature, -# and now my dbt-py models run even faster! -models: - use_anonymous_sproc: True -``` - -</File> +**About "sprocs":** dbt submits Python models to run as _stored procedures_, which some people call _sprocs_ for short. By default, dbt will use Snowpark's _temporary_ or _anonymous_ stored procedures ([docs](https://docs.snowflake.com/en/sql-reference/sql/call-with.html)), which are faster and keep query history cleaner than named sprocs containing your model's compiled Python code. 
To disable this feature, set `use_anonymous_sproc: False` in your model configuration. **Docs:** ["Developer Guide: Snowpark Python"](https://docs.snowflake.com/en/developer-guide/snowpark/python/index.html) diff --git a/website/docs/docs/build/ratio-metrics.md b/website/docs/docs/build/ratio-metrics.md index fdaeb878450..a34dec29d71 100644 --- a/website/docs/docs/build/ratio-metrics.md +++ b/website/docs/docs/build/ratio-metrics.md @@ -10,17 +10,17 @@ Ratio allows you to create a ratio between two metrics. You simply specify a num The parameters, description, and type for ratio metrics are: -| Parameter | Description | Type | -| --------- | ----------- | ---- | -| `name` | The name of the metric. | Required | -| `description` | The description of the metric. | Optional | -| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required | -| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | -| `type_params` | The type parameters of the metric. | Required | -| `numerator` | The name of the metric used for the numerator, or structure of properties. | Required | -| `denominator` | The name of the metric used for the denominator, or structure of properties. | Required | -| `filter` | Optional filter for the numerator or denominator. | Optional | -| `alias` | Optional alias for the numerator or denominator. | Optional | +| Parameter | Description | Required | Type | +| --------- | ----------- | ---- | ---- | +| `name` | The name of the metric. | Required | String | +| `description` | The description of the metric. | Optional | String | +| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required | String | +| `label` | Defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | String | +| `type_params` | The type parameters of the metric. | Required | Dict | +| `numerator` | The name of the metric used for the numerator, or structure of properties. | Required | String or dict | +| `denominator` | The name of the metric used for the denominator, or structure of properties. | Required | String or dict | +| `filter` | Optional filter for the numerator or denominator. | Optional | String | +| `alias` | Optional alias for the numerator or denominator. | Optional | String | The following displays the complete specification for ratio metrics, along with an example. diff --git a/website/docs/docs/build/semantic-models.md b/website/docs/docs/build/semantic-models.md index 609d7f1ff8d..5ff363dd44c 100644 --- a/website/docs/docs/build/semantic-models.md +++ b/website/docs/docs/build/semantic-models.md @@ -26,18 +26,18 @@ import SLCourses from '/snippets/\_sl-course.md'; Here we describe the Semantic model components with examples: -| Component | Description | Type | -| --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | -| [Name](#name) | Choose a unique name for the semantic model. Avoid using double underscores (\_\_) in the name as they're not supported. 
| Required | -| [Description](#description) | Includes important details in the description | Optional | -| [Model](#model) | Specifies the dbt model for the semantic model using the `ref` function | Required | -| [Defaults](#defaults) | The defaults for the model, currently only `agg_time_dimension` is supported. | Required | -| [Entities](#entities) | Uses the columns from entities as join keys and indicate their type as primary, foreign, or unique keys with the `type` parameter | Required | -| [Primary Entity](#primary-entity) | If a primary entity exists, this component is Optional. If the semantic model has no primary entity, then this property is required. | Optional | -| [Dimensions](#dimensions) | Different ways to group or slice data for a metric, they can be `time` or `categorical` | Required | -| [Measures](#measures) | Aggregations applied to columns in your data model. They can be the final metric or used as building blocks for more complex metrics | Optional | -| Label | The display name for your semantic model `node`, `dimension`, `entity`, and/or `measures` | Optional | -| `config` | Use the [`config`](/reference/resource-properties/config) property to specify configurations for your metric. Supports [`meta`](/reference/resource-configs/meta), [`group`](/reference/resource-configs/group), and [`enabled`](/reference/resource-configs/enabled) configs. | Optional | +| Component | Description | Required | Type | +| ------------ | ---------------- | -------- | -------- | +| [Name](#name) | Choose a unique name for the semantic model. Avoid using double underscores (\_\_) in the name as they're not supported. | Required | String | +| [Description](#description) | Includes important details in the description. | Optional | String | +| [Model](#model) | Specifies the dbt model for the semantic model using the `ref` function. | Required | String | +| [Defaults](#defaults) | The defaults for the model, currently only `agg_time_dimension` is supported. | Required | Dict | +| [Entities](#entities) | Uses the columns from entities as join keys and indicate their type as primary, foreign, or unique keys with the `type` parameter. | Required | List | +| [Primary Entity](#primary-entity) | If a primary entity exists, this component is Optional. If the semantic model has no primary entity, then this property is required. | Optional | String | +| [Dimensions](#dimensions) | Different ways to group or slice data for a metric, they can be `time` or `categorical`. | Required | List | +| [Measures](#measures) | Aggregations applied to columns in your data model. They can be the final metric or used as building blocks for more complex metrics. | Optional | List | +| [Label](#label) | The display name for your semantic model `node`, `dimension`, `entity`, and/or `measures`. | Optional | String | +| `config` | Use the [`config`](/reference/resource-properties/config) property to specify configurations for your metric. Supports [`meta`](/reference/resource-configs/meta), [`group`](/reference/resource-configs/group), and [`enabled`](/reference/resource-configs/enabled) configs. 
| Optional | Dict | ## Semantic models components diff --git a/website/docs/docs/build/simple.md b/website/docs/docs/build/simple.md index f57d498d290..2deb718d780 100644 --- a/website/docs/docs/build/simple.md +++ b/website/docs/docs/build/simple.md @@ -15,17 +15,19 @@ Simple metrics are metrics that directly reference a single measure, without any Note that we use the double colon (::) to indicate whether a parameter is nested within another parameter. So for example, `query_params::metrics` means the `metrics` parameter is nested under `query_params`. ::: -| Parameter | Description | Type | -| --------- | ----------- | ---- | -| `name` | The name of the metric. | Required | -| `description` | The description of the metric. | Optional | -| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required | -| `label` | Required string that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | -| `type_params` | The type parameters of the metric. | Required | -| `measure` | A list of measure inputs | Required | -| `measure:name` | The measure you're referencing. | Required | -| `measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | -| `measure:join_to_timespine` | Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional | +| Parameter | Description | Required | Type | +| --------- | ----------- | ---- | ---- | +| `name` | The name of the metric. | Required | String | +| `description` | The description of the metric. | Optional | String | +| `type` | The type of the metric (cumulative, derived, ratio, or simple). | Required | String | +| `label` | Defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). | Required | String | +| `type_params` | The type parameters of the metric. | Required | Dict | +| `measure` | A list of measure inputs. | Required | List | +| `measure:name` | The measure you're referencing. | Required | String | +| `measure:alias` | Optional [`alias`](/reference/resource-configs/alias) to rename the measure. | Optional | String | +| `measure:filter` | Optional `filter` applied to the measure. | Optional | String | +| `measure:fill_nulls_with` | Set the value in your metric definition instead of null (such as zero). | Optional | String | +| `measure:join_to_timespine` | Indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. Default `false`. | Optional | Boolean | The following displays the complete specification for simple metrics, along with an example. @@ -38,6 +40,8 @@ metrics: type_params: # Required measure: name: The name of your measure # Required + alias: The alias applied to the measure. # Optional + filter: The filter applied to the measure. # Optional fill_nulls_with: Set value instead of null (such as zero) # Optional join_to_timespine: true/false # Boolean that indicates if the aggregated measure should be joined to the time spine table to fill in missing dates. # Optional @@ -65,9 +69,11 @@ If you've already defined the measure using the `create_metric: true` parameter, name: customers # The measure you are creating a proxy of. 
fill_nulls_with: 0 join_to_timespine: true + alias: customer_count + filter: {{ Dimension('customer__customer_total') }} >= 20 - name: large_orders description: "Order with order values over 20." - type: SIMPLE + type: simple label: Large orders type_params: measure: diff --git a/website/docs/docs/build/snapshots.md b/website/docs/docs/build/snapshots.md index 3b21549a3c7..f72f1eb75de 100644 --- a/website/docs/docs/build/snapshots.md +++ b/website/docs/docs/build/snapshots.md @@ -10,8 +10,7 @@ id: "snapshots" * [Snapshot properties](/reference/snapshot-properties) * [`snapshot` command](/reference/commands/snapshot) - -### What are snapshots? +## What are snapshots? Analysts often need to "look back in time" at previous data states in their mutable tables. While some source data systems are built in a way that makes accessing historical data possible, this is not always the case. dbt provides a mechanism, **snapshots**, which records changes to a mutable <Term id="table" /> over time. Snapshots implement [type-2 Slowly Changing Dimensions](https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row) over mutable source tables. These Slowly Changing Dimensions (or SCDs) identify how a row in a table changes over time. Imagine you have an `orders` table where the `status` field can be overwritten as the order is processed. @@ -39,7 +38,8 @@ This order is now in the "shipped" state, but we've lost the information about w <VersionBlock lastVersion="1.8" > - To configure snapshots in versions 1.8 and earlier, refer to [Configure snapshots in versions 1.8 and earlier](#configure-snapshots-in-versions-18-and-earlier). These versions use an older syntax where snapshots are defined within a snapshot block in a `.sql` file, typically located in your `snapshots` directory. -- Note that defining multiple resources in a single file can significantly slow down parsing and compilation. For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). +- Note that defining multiple resources in a single file can significantly slow down parsing and compilation. For faster and more efficient management, consider the updated snapshot YAML syntax, [available now in the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks) or [dbt Core v1.9 and later](/docs/dbt-versions/core). + - For more information on how to migrate from the legacy snapshot configurations to the updated snapshot YAML syntax, refer to [Snapshot configuration migration](/reference/snapshot-configs#snapshot-configuration-migration). 
</VersionBlock> @@ -63,9 +63,9 @@ snapshots: [unique_key](/reference/resource-configs/unique_key): column_name_or_expression [check_cols](/reference/resource-configs/check_cols): [column_name] | all [updated_at](/reference/resource-configs/updated_at): column_name - [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes): true | false [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): dictionary [dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current): string + [hard_deletes](/reference/resource-configs/hard-deletes): ignore | invalidate | new_record ``` </File> @@ -81,9 +81,9 @@ The following table outlines the configurations available for snapshots: | [unique_key](/reference/resource-configs/unique_key) | A <Term id="primary-key" /> column(s) (string or array) or expression for the record | Yes | `id` or `[order_id, product_id]` | | [check_cols](/reference/resource-configs/check_cols) | If using the `check` strategy, then the columns to check | Only if using the `check` strategy | ["status"] | | [updated_at](/reference/resource-configs/updated_at) | If using the `timestamp` strategy, the timestamp column to compare | Only if using the `timestamp` strategy | updated_at | -| [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) | Find hard deleted records in source and set `dbt_valid_to` to current time if the record no longer exists | No | True | | [dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current) | Set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.| No | string | | [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names) | Customize the names of the snapshot meta fields | No | dictionary | +| [hard_deletes](/reference/resource-configs/hard-deletes) | Specify how to handle deleted rows from the source. Supported options are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`.| No | string | - In versions prior to v1.9, the `target_schema` (required) and `target_database` (optional) configurations defined a single schema or database to build a snapshot across users and environment. This created problems when testing or developing a snapshot, as there was no clear separation between development and production environments. In v1.9, `target_schema` became optional, allowing snapshots to be environment-aware. By default, without `target_schema` or `target_database` defined, snapshots now use the `generate_schema_name` or `generate_database_name` macros to determine where to build. Developers can still set a custom location with [`schema`](/reference/resource-configs/schema) and [`database`](/reference/resource-configs/database) configs, consistent with other resource types. @@ -172,7 +172,7 @@ This strategy handles column additions and deletions better than the `check` str <Expandable alt_header="Use dbt_valid_to_current for easier date range queries"> -By default, `dbt_valid_to` is `NULL` for current records. However, if you set the [`dbt_valid_to_current` configuration](/reference/resource-configs/dbt_valid_to_current) (available in Versionless and 1.9 and higher), `dbt_valid_to` will be set to your specified value (such as `9999-12-31`) for current records. 
+By default, `dbt_valid_to` is `NULL` for current records. However, if you set the [`dbt_valid_to_current` configuration](/reference/resource-configs/dbt_valid_to_current) (available in dbt Core v1.9+), `dbt_valid_to` will be set to your specified value (such as `9999-12-31`) for current records.

This allows for straightforward date range filtering.

@@ -210,15 +210,19 @@ Snapshots can't be rebuilt. Because of this, it's a good idea to put snapshots i
### How snapshots work

When you run the [`dbt snapshot` command](/reference/commands/snapshot):
-* **On the first run:** dbt will create the initial snapshot table — this will be the result set of your `select` statement, with additional columns including `dbt_valid_from` and `dbt_valid_to`. All records will have a `dbt_valid_to = null` or the value specified in [`dbt_valid_to_current`](/reference/resource-configs/dbt_valid_to_current) (available in Versionless and 1.9 and higher) if configured.
+* **On the first run:** dbt will create the initial snapshot table — this will be the result set of your `select` statement, with additional columns including `dbt_valid_from` and `dbt_valid_to`. All records will have a `dbt_valid_to = null` or the value specified in [`dbt_valid_to_current`](/reference/resource-configs/dbt_valid_to_current) (available in dbt Core v1.9+) if configured.
* **On subsequent runs:** dbt will check which records have changed or if any new records have been created:
  - The `dbt_valid_to` column will be updated for any existing records that have changed.
-  - The updated record and any new records will be inserted into the snapshot table. These records will now have `dbt_valid_to = null` or the value configured in `dbt_valid_to_current` (available in Versionless and 1.9 and higher).
+  - The updated record and any new records will be inserted into the snapshot table. These records will now have `dbt_valid_to = null` or the value configured in `dbt_valid_to_current` (available in dbt Core v1.9+).
+
+<VersionBlock firstVersion="1.9">

#### Note
- These column names can be customized to your team or organizational conventions using the [snapshot_meta_column_names](#snapshot-meta-fields) config.
- Use the `dbt_valid_to_current` config to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.
-
+- Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) config to track hard deletes by adding a new record when rows become "deleted" in the source. Supported options are `ignore`, `invalidate`, and `new_record`.
+</VersionBlock>
+
Snapshots can be referenced in downstream models the same way as referencing models — by using the [ref](/reference/dbt-jinja-functions/ref) function.

## Detecting row changes
@@ -294,7 +298,7 @@ The `check` snapshot strategy can be configured to track changes to _all_ column

:::

-**Example Usage**
+**Example usage**

<VersionBlock lastVersion="1.8">

@@ -344,15 +348,64 @@ snapshots:

### Hard deletes (opt-in)

+<VersionBlock firstVersion="1.9">
+
+In dbt v1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config to give you more control over how to handle deleted rows from the source. The `hard_deletes` config is not a separate strategy but an additional opt-in feature that can be used with any snapshot strategy.
+ +The `hard_deletes` config has three options/fields: +| Field | Description | +| --------- | ----------- | +| `ignore` (default) | No action for deleted records. | +| `invalidate` | Behaves the same as the existing `invalidate_hard_deletes=true`, where deleted records are invalidated by setting `dbt_valid_to`. | +| `new_record` | Tracks deleted records as new rows using the `dbt_is_deleted` [meta field](#snapshot-meta-fields) when records are deleted.| + +import HardDeletes from '/snippets/_hard-deletes.md'; + +<HardDeletes /> + +#### Example usage + +<File name='snapshots/orders_snapshot.yml'> + +```yaml +snapshots: + - name: orders_snapshot_hard_delete + relation: source('jaffle_shop', 'orders') + config: + schema: snapshots + unique_key: id + strategy: timestamp + updated_at: updated_at + hard_deletes: new_record # options are: 'ignore', 'invalidate', or 'new_record' +``` + +</File> + +In this example, the `hard_deletes: new_record` config will add a new row for deleted records with the `dbt_is_deleted` column set to `True`. +Any restored records are added as new rows with the `dbt_is_deleted` field set to `False`. + +The resulting table will look like this: + +| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_is_deleted | +| -- | ------ | ---------- | -------------- | ------------ | -------------- | +| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | False | +| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | 2024-01-01 11:20 | False | +| 1 | deleted | 2024-01-01 11:20 | 2024-01-01 11:20 | 2024-01-01 12:00 | True | +| 1 | restored | 2024-01-01 12:00 | 2024-01-01 12:00 | | False | + +</VersionBlock> + +<VersionBlock lastVersion="1.8"> + Rows that are deleted from the source query are not invalidated by default. With the config option `invalidate_hard_deletes`, dbt can track rows that no longer exist. This is done by left joining the snapshot table with the source table, and filtering the rows that are still valid at that point, but no longer can be found in the source table. `dbt_valid_to` will be set to the current snapshot time. This configuration is not a different strategy as described above, but is an additional opt-in feature. It is not enabled by default since it alters the previous behavior. For this configuration to work with the `timestamp` strategy, the configured `updated_at` column must be of timestamp type. Otherwise, queries will fail due to mixing data types. -**Example Usage** +Note, in v1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config for better control over how to handle deleted rows from the source. -<VersionBlock lastVersion="1.8"> +#### Example usage <File name='snapshots/orders_snapshot_hard_delete.sql'> @@ -378,33 +431,16 @@ For this configuration to work with the `timestamp` strategy, the configured `up </VersionBlock> -<VersionBlock firstVersion="1.9"> - -<File name='snapshots/orders_snapshot.yml'> - -```yaml -snapshots: - - name: orders_snapshot_hard_delete - relation: source('jaffle_shop', 'orders') - config: - schema: snapshots - unique_key: id - strategy: timestamp - updated_at: updated_at - invalidate_hard_deletes: true -``` - -</File> - -</VersionBlock> - ## Snapshot meta-fields Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*. 
-Starting in 1.9 or with [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless):
-- These column names can be customized to your team or organizational conventions using the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config.
+In dbt Core v1.9+ (or available sooner in [the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks)):
+- These column names can be customized to your team or organizational conventions using the [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names) config.
- Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date such as `9999-12-31`). By default, this value is `NULL`. When set, dbt will use this specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table.
+- Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) config to track deleted records as new rows with the `dbt_is_deleted` meta field when using the `hard_deletes='new_record'` option.
+

| Field | Meaning | Usage |
| -------------- | ------- | ----- |
@@ -412,6 +448,7 @@ Starting in 1.9 or with [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-v
| dbt_valid_to | The timestamp when this row became invalidated. <br /> For current records, this is `NULL` by default <VersionBlock firstVersion="1.9"> or the value specified in `dbt_valid_to_current`.</VersionBlock> | The most recent snapshot record will have `dbt_valid_to` set to `NULL` <VersionBlock firstVersion="1.9"> or the specified value. </VersionBlock> |
| dbt_scd_id | A unique key generated for each snapshotted record. | This is used internally by dbt |
| dbt_updated_at | The updated_at timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt |
+| dbt_is_deleted | A boolean value indicating if the record has been deleted. `True` if deleted, `False` otherwise. | Added when `hard_deletes='new_record'` is configured.
This is used internally by dbt | *The timestamps used for each column are subtly different depending on the strategy you use: @@ -445,6 +482,15 @@ Snapshot results (note that `11:30` is not used anywhere): | 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | | 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | | 2024-01-01 11:05 | +Snapshot results with `hard_deletes='new_record'`: + +| id | status | updated_at | dbt_valid_from | dbt_valid_to | dbt_updated_at | dbt_is_deleted | +|----|---------|------------------|------------------|------------------|------------------|----------------| +| 1 | pending | 2024-01-01 10:47 | 2024-01-01 10:47 | 2024-01-01 11:05 | 2024-01-01 10:47 | False | +| 1 | shipped | 2024-01-01 11:05 | 2024-01-01 11:05 | 2024-01-01 11:20 | 2024-01-01 11:05 | False | +| 1 | deleted | 2024-01-01 11:20 | 2024-01-01 11:20 | | 2024-01-01 11:20 | True | + + </details> <br/> @@ -479,6 +525,14 @@ Snapshot results: | 1 | pending | 2024-01-01 11:00 | 2024-01-01 11:30 | 2024-01-01 11:00 | | 1 | shipped | 2024-01-01 11:30 | | 2024-01-01 11:30 | +Snapshot results with `hard_deletes='new_record'`: + +| id | status | dbt_valid_from | dbt_valid_to | dbt_updated_at | dbt_is_deleted | +|----|---------|------------------|------------------|------------------|----------------| +| 1 | pending | 2024-01-01 11:00 | 2024-01-01 11:30 | 2024-01-01 11:00 | False | +| 1 | shipped | 2024-01-01 11:30 | 2024-01-01 11:40 | 2024-01-01 11:30 | False | +| 1 | deleted | 2024-01-01 11:40 | | 2024-01-01 11:40 | True | + </details> ## Configure snapshots in versions 1.8 and earlier @@ -495,7 +549,8 @@ To configure snapshots in versions 1.9 and later, refer to [Configuring snapshot - In dbt versions 1.8 and earlier, snapshots are `select` statements, defined within a snapshot block in a `.sql` file (typically in your `snapshots` directory). You'll also need to configure your snapshot to tell dbt how to detect record changes. - The earlier dbt versions use an older syntax that allows for defining multiple resources in a single file. This syntax can significantly slow down parsing and compilation. -- For faster and more efficient management, consider[ upgrading to Versionless](/docs/dbt-versions/versionless-cloud) or the [latest version of dbt Core](/docs/dbt-versions/core), which introduces an updated snapshot configuration syntax that optimizes performance. +- For faster and more efficient management, consider [choosing the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks) or the [latest version of dbt Core](/docs/dbt-versions/core), which introduces an updated snapshot configuration syntax that optimizes performance. + - For more information on how to migrate from the legacy snapshot configurations to the updated snapshot YAML syntax, refer to [Snapshot configuration migration](/reference/snapshot-configs#snapshot-configuration-migration). The following example shows how to configure a snapshot: diff --git a/website/docs/docs/build/unit-tests.md b/website/docs/docs/build/unit-tests.md index 1d7143d7476..fc4cf02b34f 100644 --- a/website/docs/docs/build/unit-tests.md +++ b/website/docs/docs/build/unit-tests.md @@ -10,13 +10,13 @@ keywords: :::note -This functionality is only supported in dbt Core v1.8+ or accounts that have opted for a ["Versionless"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) dbt Cloud experience. 
+Unit testing functionality is available in dbt Cloud on any [release track](/docs/dbt-versions/cloud-release-tracks) and in dbt Core v1.8+.

:::

Historically, dbt's test coverage was confined to [“data” tests](/docs/build/data-tests), assessing the quality of input data or resulting datasets' structure. However, these tests could only be executed _after_ building a model.

-With dbt Core v1.8 and dbt Cloud environments that have gone versionless by selecting the **Versionless** option, we have introduced an additional type of test to dbt - unit tests. In software programming, unit tests validate small portions of your functional code, and they work much the same way here. Unit tests allow you to validate your SQL modeling logic on a small set of static inputs _before_ you materialize your full model in production. Unit tests enable test-driven development, benefiting developer efficiency and code reliability.
+Starting in dbt Core v1.8, we have introduced an additional type of test to dbt: unit tests. In software programming, unit tests validate small portions of your functional code, and they work much the same way here. Unit tests allow you to validate your SQL modeling logic on a small set of static inputs _before_ you materialize your full model in production. Unit tests enable test-driven development, benefiting developer efficiency and code reliability.

## Before you begin

@@ -24,11 +24,15 @@ With dbt Core v1.8 and dbt Cloud environments that have gone versionless by sele
- We currently only support adding unit tests to models in your _current_ project.
- We currently _don't_ support unit testing models that use the [`materialized view`](/docs/build/materializations#materialized-view) materialization.
- We currently _don't_ support unit testing models that use recursive SQL.
-- You must specify all fields in a BigQuery STRUCT in a unit test. You cannot use only a subset of fields in a STRUCT.
+- We currently _don't_ support unit testing models that use introspective queries.
- If your model has multiple versions, by default the unit test will run on *all* versions of your model. Read [unit testing versioned models](/reference/resource-properties/unit-testing-versions) for more information.
-- Unit tests must be defined in a YML file in your `models/` directory.
-- Table names must be [aliased](/docs/build/custom-aliases) in order to unit test `join` logic.
-- Redshift customers need to be aware of a [limitation when building unit tests](/reference/resource-configs/redshift-configs#unit-test-limitations) that requires a workaround.
+- Unit tests must be defined in a YML file in your [`models/` directory](/reference/project-configs/model-paths).
+- Table names must be aliased in order to unit test `join` logic.
+- Include all [`ref`](/reference/dbt-jinja-functions/ref) or [`source`](/reference/dbt-jinja-functions/source) model references in the unit test configuration as `input`s to avoid "node not found" errors during compilation, as shown in the example below.
+
+#### Adapter-specific caveats
+- You must specify all fields in a BigQuery `STRUCT` in a unit test. You cannot use only a subset of fields in a `STRUCT`.
+- Redshift customers need to be aware of a [limitation when building unit tests](/reference/resource-configs/redshift-configs#unit-test-limitations) that requires a workaround.

Read the [reference doc](/reference/resource-properties/unit-tests) for more details about formatting your unit tests.
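
+For illustration, here's a minimal sketch of a unit test definition that follows the caveats above. The model, column, and `ref` names are hypothetical; replace them with resources from your own project:
+
+```yaml
+unit_tests:
+  - name: test_order_totals_are_preserved
+    model: stg_orders
+    given:
+      - input: ref('raw_orders')
+        rows:
+          - {order_id: 1, order_total: 10}
+          - {order_id: 2, order_total: 20}
+    expect:
+      rows:
+        - {order_id: 1, order_total: 10}
+        - {order_id: 2, order_total: 20}
+```
+
+The `given` block declares every `ref` the model depends on as an `input`, which is what avoids the "node not found" compilation errors mentioned above.
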
diff --git a/website/docs/docs/cloud-integrations/configure-auto-exposures.md b/website/docs/docs/cloud-integrations/configure-auto-exposures.md index 42e36e572b3..2bb09573221 100644 --- a/website/docs/docs/cloud-integrations/configure-auto-exposures.md +++ b/website/docs/docs/cloud-integrations/configure-auto-exposures.md @@ -20,11 +20,13 @@ Auto-exposures help data teams optimize their efficiency and ensure data quality To access the features, you should meet the following: -1. Your environment and jobs are on [Versionless](/docs/dbt-versions/versionless-cloud) dbt. +1. Your environment and jobs are on a supported [release track](/docs/dbt-versions/cloud-release-tracks) dbt. 2. You have a dbt Cloud account on the [Enterprise plan](https://www.getdbt.com/pricing/). 3. You have set up a [production](/docs/deploy/deploy-environments#set-as-production-environment) deployment environment for each project you want to explore, with at least one successful job run. 4. You have [admin permissions](/docs/cloud/manage-access/enterprise-permissions) in dbt Cloud to edit project settings or production environment settings. 5. Use Tableau as your BI tool and enable metadata permissions or work with an admin to do so. Compatible with Tableau Cloud or Tableau Server with the Metadata API enabled. + - If you're using Tableau Server, you need to [allowlist dbt Cloud's IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses) for your dbt Cloud region. + - Currently, you can only connect to a single Tableau site on the same server. ## Set up in Tableau @@ -59,8 +61,14 @@ To set up [personal access tokens (PATs)](https://help.tableau.com/current/serve <Lightbox src="/img/docs/cloud-integrations/auto-exposures/cloud-integration-details.jpg" title="Enter the details for the exposure connection."/> 4. Select the collections you want to include for auto exposures. - dbt Cloud automatically imports and syncs any workbook within the selected collections. New additions to the collections will be added to the lineage in dbt Cloud during the next automatic sync (usually once per day). <Lightbox src="/img/docs/cloud-integrations/auto-exposures/cloud-select-collections.jpg" title="Select the collections you want to include for auto exposures."/> + + :::info + dbt Cloud automatically imports and syncs any workbook within the selected collections. New additions to the collections will be added to the lineage in dbt Cloud during the next sync (automatically once per day). + + dbt Cloud immediately starts a sync when you update the selected collections list, capturing new workbooks and removing irrelevant ones. + ::: + 5. Click **Save**. dbt Cloud imports everything in the collection(s) and you can continue to view them in Explorer. For more information on how to view and use auto-exposures, refer to [View auto-exposures from dbt Explorer](/docs/collaborate/auto-exposures) page. diff --git a/website/docs/docs/cloud/about-cloud/about-dbt-cloud.md b/website/docs/docs/cloud/about-cloud/about-dbt-cloud.md index 08bbcb94c3b..1a7e59dd5c2 100644 --- a/website/docs/docs/cloud/about-cloud/about-dbt-cloud.md +++ b/website/docs/docs/cloud/about-cloud/about-dbt-cloud.md @@ -24,7 +24,7 @@ dbt Cloud's [flexible plans](https://www.getdbt.com/pricing/) and features make <Card title="dbt Cloud IDE" - body="The IDE is the easiest and most efficient way to develop dbt models, allowing you to build, test, run, and version control your dbt projects directly from your browser. 
Use dbt Copilot, a powerful AI engine that automatically generates documentation, tests, and semantic models."
+    body="The IDE is the easiest and most efficient way to develop dbt models, allowing you to build, test, run, and version control your dbt projects directly from your browser. Use dbt Copilot, a powerful AI engine that automatically generates code, documentation, tests, and semantic models."
    link="/docs/cloud/dbt-cloud-ide/develop-in-the-cloud"
    icon="dbt-bit"/>

diff --git a/website/docs/docs/cloud/account-integrations.md b/website/docs/docs/cloud/account-integrations.md
new file mode 100644
index 00000000000..e5ff42cb900
--- /dev/null
+++ b/website/docs/docs/cloud/account-integrations.md
@@ -0,0 +1,103 @@
+---
+title: "Account integrations in dbt Cloud"
+sidebar_label: "Account integrations"
+description: "Learn how to configure account integrations for your dbt Cloud account."
+---
+
+The following sections describe the different **Account integrations** available from your dbt Cloud account under the account **Settings** section.
+
+<Lightbox src="/img/docs/dbt-cloud/account-integrations.jpg" title="Example of Account integrations from the sidebar" />
+
+## Git integrations
+
+Connect your dbt Cloud account to your Git provider to enable dbt Cloud users to authenticate with their personal accounts. dbt Cloud will perform Git actions on behalf of your authenticated self, against repositories to which you have access according to your Git provider permissions.
+
+To configure a Git account integration:
+1. Navigate to **Account settings** in the side menu.
+2. Under the **Settings** section, click on **Integrations**.
+3. Click on the Git provider from the list and select the **Pencil** icon to the right of the provider.
+4. dbt Cloud [natively connects](/docs/cloud/git/git-configuration-in-dbt-cloud) to the following Git providers:
+
+   - [GitHub](/docs/cloud/git/connect-github)
+   - [GitLab](/docs/cloud/git/connect-gitlab)
+   - [Azure DevOps](/docs/cloud/git/connect-azure-devops) <Lifecycle status="enterprise" />
+
+You can connect your dbt Cloud account to additional Git providers by importing a git repository from any valid git URL. Refer to [Import a git repository](/docs/cloud/git/import-a-project-by-git-url) for more information.
+
+<Lightbox src="/img/docs/dbt-cloud/account-integration-git.jpg" width="85%" title="Example of the Git integration page" />
+
+## OAuth integrations
+
+Connect your dbt Cloud account to OAuth providers that are integrated with dbt Cloud.
+
+To configure an OAuth account integration:
+1. Navigate to **Account settings** in the side menu.
+2. Under the **Settings** section, click on **Integrations**.
+3. Under **OAuth**, click **Link** to connect your Slack account.
+4. For custom OAuth providers, under **Custom OAuth integrations**, click on **Add integration** and select the OAuth provider from the list. Fill in the required fields and click **Save**.
+
+<Lightbox src="/img/docs/dbt-cloud/account-integration-oauth.jpg" width="85%" title="Example of the OAuth integration page" />
+
+## AI integrations
+
+Once AI features have been [enabled](/docs/cloud/enable-dbt-copilot#enable-dbt-copilot), you can use dbt Labs' AI integration or bring-your-own provider to support AI-powered dbt Cloud features like [dbt Copilot](/docs/cloud/dbt-copilot) and [Ask dbt](/docs/cloud-integrations/snowflake-native-app) (both available on [dbt Cloud Enterprise plans](https://www.getdbt.com/pricing)).
+
+dbt Cloud supports AI integrations for dbt Labs-managed OpenAI keys, self-managed OpenAI keys, or self-managed Azure OpenAI keys <Lifecycle status="beta" />.
+
+Note that if you bring your own provider, you will incur API calls and associated charges for features used in dbt Cloud.
+
+:::info
+dbt Cloud's AI is optimized for OpenAI's gpt-4o. Using other models can affect performance and accuracy, and functionality with other models isn't guaranteed.
+:::
+
+To configure the AI integration in your dbt Cloud account, a dbt Cloud admin can perform the following steps:
+1. Navigate to **Account settings** in the side menu.
+2. Select **Integrations** and scroll to the **AI** section.
+3. Click on the **Pencil** icon to the right of **OpenAI** to configure the AI integration.
+   <Lightbox src="/img/docs/dbt-cloud/account-integration-ai.jpg" width="85%" title="Example of the AI integration page" />
+4. Configure the AI integration for either **dbt Labs OpenAI**, **OpenAI**, or **Azure OpenAI**.
+
+   <Tabs queryString="ai-integration">
+   <TabItem value="dbtlabs" label="dbt Labs OpenAI">
+
+   1. Select the toggle for **dbt Labs** to use dbt Labs' managed OpenAI key.
+   2. Click **Save**.
+
+   <Lightbox src="/img/docs/dbt-cloud/account-integration-dbtlabs.jpg" width="85%" title="Example of the dbt Labs integration page" />
+   </TabItem>
+
+   <TabItem value="openai" label="OpenAI">
+
+   1. Select the toggle for **OpenAI** to use your own OpenAI key.
+   2. Enter the API key.
+   3. Click **Save**.
+   <Lightbox src="/img/docs/dbt-cloud/account-integration-openai.jpg" width="85%" title="Example of the OpenAI integration page" />
+
+   </TabItem>
+
+   <TabItem value="azure" label="Azure OpenAI (beta)">
+   To learn about deploying your own OpenAI model on Azure, refer to [Deploy models on Azure OpenAI](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-openai). Configure credentials for your Azure OpenAI deployment in dbt Cloud in the following two ways:
+   - [From a Target URI](#from-a-target-uri)
+   - [Manually providing the credentials](#manually-providing-the-credentials)
+
+   #### From a Target URI
+
+   1. Locate your Azure OpenAI deployment URI in your Azure Deployment details page.
+   2. In the dbt Cloud **Azure OpenAI** section, select the tab **From Target URI**.
+   3. Paste the URI into the **Target URI** field.
+   4. Enter your Azure OpenAI API key.
+   5. Verify the **Endpoint**, **API Version**, and **Deployment Name** are correct.
+   6. Click **Save**.
+   <Lightbox src="/img/docs/dbt-cloud/account-integration-azure-target.jpg" width="85%" title="Example of Azure OpenAI integration section" />
+
+   #### Manually providing the credentials
+
+   1. Locate your Azure OpenAI configuration in your Azure Deployment details page.
+   2. In the dbt Cloud **Azure OpenAI** section, select the tab **Manual Input**.
+   3. Enter your Azure OpenAI API key.
+   4. Enter the **Endpoint**, **API Version**, and **Deployment Name**.
+   5. Click **Save**.
+ <Lightbox src="/img/docs/dbt-cloud/account-integration-azure-manual.jpg" width="85%" title="Example of Azure OpenAI integration section" /> + + </TabItem> + </Tabs> diff --git a/website/docs/docs/cloud/cloud-cli-installation.md b/website/docs/docs/cloud/cloud-cli-installation.md index 8a058cbb90f..a80f1a587e0 100644 --- a/website/docs/docs/cloud/cloud-cli-installation.md +++ b/website/docs/docs/cloud/cloud-cli-installation.md @@ -21,8 +21,6 @@ dbt commands are run against dbt Cloud's infrastructure and benefit from: ## Prerequisites The dbt Cloud CLI is available in all [deployment regions](/docs/cloud/about-cloud/access-regions-ip-addresses) and for both multi-tenant and single-tenant accounts. -- You are on dbt version 1.5 or higher. Alternatively, set it to [**Versionless**](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) to automatically stay up to date. - ## Install dbt Cloud CLI You can install the dbt Cloud CLI on the command line by using one of these methods. @@ -321,3 +319,10 @@ This alias will allow you to use the <code>dbt-cloud</code> command to invoke th If you've ran a dbt command and receive a <code>Session occupied</code> error, you can reattach to your existing session with <code>dbt reattach</code> and then press <code>Control-C</code> and choose to cancel the invocation. </DetailsToggle> + +<DetailsToggle alt_header="Why am I receiving a `Stuck session` error when trying to run a new command?"> + + +The Cloud CLI allows only one command that writes to the data warehouse at a time. If you attempt to run multiple write commands simultaneously (for example, `dbt run` and `dbt build`), you will encounter a `stuck session` error. To resolve this, cancel the specific invocation by passing its ID to the cancel command. For more information, refer to [parallel execution](/reference/dbt-commands#parallel-execution). + +</DetailsToggle> \ No newline at end of file diff --git a/website/docs/docs/cloud/connect-data-platform/connect-amazon-athena.md b/website/docs/docs/cloud/connect-data-platform/connect-amazon-athena.md index f1009f61274..e3645500b9e 100644 --- a/website/docs/docs/cloud/connect-data-platform/connect-amazon-athena.md +++ b/website/docs/docs/cloud/connect-data-platform/connect-amazon-athena.md @@ -7,7 +7,7 @@ sidebar_label: "Connect Amazon Athena" # Connect Amazon Athena -Your environment(s) must be on ["Versionless"](/docs/dbt-versions/versionless-cloud) to use the Amazon Athena connection. +Your environment(s) must be on a supported [release track](/docs/dbt-versions/cloud-release-tracks) to use the Amazon Athena connection. Connect dbt Cloud to Amazon's Athena interactive query service to build your dbt project. The following are the required and optional fields for configuring the Athena connection: diff --git a/website/docs/docs/cloud/connect-data-platform/connect-snowflake.md b/website/docs/docs/cloud/connect-data-platform/connect-snowflake.md index 7e4bc7a9288..6b749ced186 100644 --- a/website/docs/docs/cloud/connect-data-platform/connect-snowflake.md +++ b/website/docs/docs/cloud/connect-data-platform/connect-snowflake.md @@ -5,6 +5,14 @@ description: "Configure Snowflake connection." sidebar_label: "Connect Snowflake" --- +:::note + +dbt Cloud connections and credentials inherit the permissions of the accounts configured. You can customize roles and associated permissions in Snowflake to fit your company's requirements and fine-tune access to database objects in your account. 
+
+Refer to [Snowflake permissions](/reference/database-permissions/snowflake-permissions) for more information about customizing roles in Snowflake.
+
+:::
+
The following fields are required when creating a Snowflake connection

| Field | Description | Examples |
@@ -14,9 +22,6 @@ The following fields are required when creating a Snowflake connection
| Database | The logical database to connect to and run queries against. | `analytics` |
| Warehouse | The virtual warehouse to use for running queries. | `transforming` |
-
-**Note:** A crucial part of working with dbt atop Snowflake is ensuring that users (in development environments) and/or service accounts (in deployment to production environments) have the correct permissions to take actions on Snowflake! Here is documentation of some [example permissions to configure Snowflake access](/reference/database-permissions/snowflake-permissions).
-
## Authentication methods

This section describes the different authentication methods for connecting dbt Cloud to Snowflake. Configure Deployment environment (Production, Staging, General) credentials globally in the [**Connections**](/docs/deploy/deploy-environments#deployment-connection) area of **Account settings**. Individual users configure their development credentials in the [**Credentials**](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud#get-started-with-the-cloud-ide) area of their user profile.

diff --git a/website/docs/docs/cloud/connect-data-platform/connect-starburst-trino.md b/website/docs/docs/cloud/connect-data-platform/connect-starburst-trino.md
index db0d3f61728..4c460f0d705 100644
--- a/website/docs/docs/cloud/connect-data-platform/connect-starburst-trino.md
+++ b/website/docs/docs/cloud/connect-data-platform/connect-starburst-trino.md
@@ -11,7 +11,7 @@ The following are the required fields for setting up a connection with a [Starbu
| **Host** | The hostname of your cluster. Don't include the HTTP protocol prefix. | `mycluster.mydomain.com` |
| **Port** | The port to connect to your cluster. By default, it's 443 for TLS enabled clusters. | `443` |
| **User** | The username (of the account) to log in to your cluster. When connecting to Starburst Galaxy clusters, you must include the role of the user as a suffix to the username.<br/><br/> | Format for Starburst Enterprise or Trino depends on your configured authentication method. <br/>Format for Starburst Galaxy:<br/> <ul><li>`user.name@mydomain.com/role`</li></ul> |
-| **Password** | The user's password. | |
+| **Password** | The user's password. | - |
| **Database** | The name of a catalog in your cluster. | `example_catalog` |
| **Schema** | The name of a schema that exists within the specified catalog. | `example_schema` |

diff --git a/website/docs/docs/cloud/connect-data-platform/connect-teradata.md b/website/docs/docs/cloud/connect-data-platform/connect-teradata.md
index cf41814078b..8663a181645 100644
--- a/website/docs/docs/cloud/connect-data-platform/connect-teradata.md
+++ b/website/docs/docs/cloud/connect-data-platform/connect-teradata.md
@@ -7,7 +7,7 @@ sidebar_label: "Connect Teradata"
# Connect Teradata <Lifecycle status="preview" />

-Your environment(s) must be on ["Versionless"](/docs/dbt-versions/versionless-cloud) to use the Teradata connection.
+Your environment(s) must be on a supported [release track](/docs/dbt-versions/cloud-release-tracks) to use the Teradata connection.

| Field | Description | Type | Required? | Example | | ----------------------------- | --------------------------------------------------------------------------------------------- | -------------- | --------- | ------- | diff --git a/website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md b/website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md index 1ce9712ab91..ffe7e468bd2 100644 --- a/website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md +++ b/website/docs/docs/cloud/connect-data-platform/connnect-bigquery.md @@ -11,7 +11,12 @@ sidebar_label: "Connect BigQuery" :::info Uploading a service account JSON keyfile -While the fields in a BigQuery connection can be specified manually, we recommend uploading a service account <Term id="json" /> keyfile to quickly and accurately configure a connection to BigQuery. +While the fields in a BigQuery connection can be specified manually, we recommend uploading a service account <Term id="json" /> keyfile to quickly and accurately configure a connection to BigQuery. + +You can provide the JSON keyfile in one of two formats: + +- JSON keyfile upload — Upload the keyfile directly in its normal JSON format. +- Base64-encoded string — Provide the keyfile as a base64-encoded string. When you provide a base64-encoded string, dbt decodes it automatically and populates the necessary fields. ::: diff --git a/website/docs/docs/cloud/dbt-cloud-ide/develop-in-the-cloud.md b/website/docs/docs/cloud/dbt-cloud-ide/develop-in-the-cloud.md index c9d2cbbad30..de44de67b33 100644 --- a/website/docs/docs/cloud/dbt-cloud-ide/develop-in-the-cloud.md +++ b/website/docs/docs/cloud/dbt-cloud-ide/develop-in-the-cloud.md @@ -13,7 +13,7 @@ The dbt Cloud integrated development environment (IDE) is a single web-based int The dbt Cloud IDE offers several [keyboard shortcuts](/docs/cloud/dbt-cloud-ide/keyboard-shortcuts) and [editing features](/docs/cloud/dbt-cloud-ide/ide-user-interface#editing-features) for faster and efficient development and governance: - Syntax highlighting for SQL — Makes it easy to distinguish different parts of your code, reducing syntax errors and enhancing readability. -- AI copilot — Use [dbt Copilot](/docs/cloud/dbt-copilot), a powerful AI engine that can generate documentation, tests, and semantic models for your dbt SQL models. +- AI copilot — Use [dbt Copilot](/docs/cloud/dbt-copilot), a powerful AI engine that can [generate code](/docs/cloud/use-dbt-copilot#generate-and-edit-code) using natural language, and [generate documentation](/docs/build/documentation), [tests](/docs/build/data-tests), and [semantic models](/docs/build/semantic-models) for you with the click of a button. - Auto-completion — Suggests table names, arguments, and column names as you type, saving time and reducing typos. - Code [formatting and linting](/docs/cloud/dbt-cloud-ide/lint-format) — Helps standardize and fix your SQL code effortlessly. - Navigation tools — Easily move around your code, jump to specific lines, find and replace text, and navigate between project files. 
diff --git a/website/docs/docs/cloud/dbt-copilot.md b/website/docs/docs/cloud/dbt-copilot.md index 403df86a089..bd2573e0ff8 100644 --- a/website/docs/docs/cloud/dbt-copilot.md +++ b/website/docs/docs/cloud/dbt-copilot.md @@ -8,10 +8,12 @@ pagination_prev: null # About dbt Copilot <Lifecycle status='beta'/> -dbt Copilot is a powerful artificial intelligence (AI) engine that's fully integrated into your dbt Cloud experience and designed to accelerate your analytics workflows. dbt Copilot embeds AI-driven assistance across every stage of the analytics development life cycle (ADLC), empowering data practitioners to deliver data products faster, improve data quality, and enhance data accessibility. With automatic code generation, you can let the AI engine generate the [documentation](/docs/build/documentation), [tests](/docs/build/data-tests), and [semantic models](/docs/build/semantic-models) for you. +dbt Copilot is a powerful artificial intelligence (AI) engine that's fully integrated into your dbt Cloud experience and designed to accelerate your analytics workflows. dbt Copilot embeds AI-driven assistance across every stage of the analytics development life cycle (ADLC), empowering data practitioners to deliver data products faster, improve data quality, and enhance data accessibility. + +With automatic code generation, let dbt Copilot [generate code](/docs/cloud/use-dbt-copilot#generate-and-edit-code) using natural language, and [generate documentation](/docs/build/documentation), [tests](/docs/build/data-tests), and [semantic models](/docs/build/semantic-models) for you with the click of a button. :::tip Beta feature -dbt Copilot is designed to _help_ developers generate documentation, tests, and semantic models in dbt Cloud. It's available in beta, in the dbt Cloud IDE only. +dbt Copilot is designed to _help_ developers generate documentation, tests, and semantic models, as well as [code](/docs/cloud/use-dbt-copilot#generate-and-edit-code) using natural language, in dbt Cloud. It's available in beta, in the dbt Cloud IDE only. To use dbt Copilot, you must have an active [dbt Cloud Enterprise account](https://www.getdbt.com/pricing) and either agree to use dbt Labs' OpenAI key or provide your own Open AI API key. [Register here](https://docs.google.com/forms/d/e/1FAIpQLScPjRGyrtgfmdY919Pf3kgqI5E95xxPXz-8JoVruw-L9jVtxg/viewform) or reach out to the Account Team if you're interested in joining the private beta. ::: diff --git a/website/docs/docs/cloud/enable-dbt-copilot.md b/website/docs/docs/cloud/enable-dbt-copilot.md index 67a11fed3fc..2b954d1db5d 100644 --- a/website/docs/docs/cloud/enable-dbt-copilot.md +++ b/website/docs/docs/cloud/enable-dbt-copilot.md @@ -12,7 +12,7 @@ This page explains how to enable the dbt Copilot engine in dbt Cloud, leveraging - Available in the dbt Cloud IDE only. - Must have an active [dbt Cloud Enterprise account](https://www.getdbt.com/pricing). -- Development environment has been upgraded to ["Versionless"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless). +- Development environment is on a supported [release track](/docs/dbt-versions/cloud-release-tracks) to receive ongoing updates. - By default, dbt Copilot deployments use a central OpenAI API key managed by dbt Labs. Alternatively, you can [provide your own OpenAI API key](#bringing-your-own-openai-api-key-byok). - Accept and sign legal agreements. Reach out to your Account team to begin this process. 
@@ -34,18 +34,13 @@ Note: To disable (only after enabled), repeat steps 1 to 3, toggle off in step 4 <Lightbox src="/img/docs/deploy/example-account-settings.png" width="90%" title="Example of the 'Enable account access to AI-powered feature' option in Account settings" /> -### Bringing your own OpenAI API key (BYOK) +## Bringing your own OpenAI API key (BYOK) Once AI features have been enabled, you can provide your organization's OpenAI API key. dbt Cloud will then leverage your OpenAI account and terms to power dbt Copilot. This will incur billing charges to your organization from OpenAI for requests made by dbt Copilot. -Note that Azure OpenAI is not currently supported, but will be in the future. +Configure AI keys using: +- [dbt Labs-managed OpenAI API key](/docs/cloud/account-integrations?ai-integration=dbtlabs#ai-integrations) +- Your own [OpenAI API key](/docs/cloud/account-integrations?ai-integration=openai#ai-integrations) +- [Azure OpenAI](/docs/cloud/account-integrations?ai-integration=azure#ai-integrations) <Lifecycle status="beta" /> -A dbt Cloud admin can provide their API key by following these steps: - -1. Navigate to **Account settings** in the side menu. - -2. Find the **Settings** section and click on **Integrations**. - -3. Scroll to **AI** and select the toggle for **OpenAI** - -4. Enter your API key and click **Save**. +For configuration details, see [Account integrations](/docs/cloud/account-integrations#ai-integrations). diff --git a/website/docs/docs/cloud/git/connect-azure-devops.md b/website/docs/docs/cloud/git/connect-azure-devops.md index f6c0ee634fc..f3bb07a12d0 100644 --- a/website/docs/docs/cloud/git/connect-azure-devops.md +++ b/website/docs/docs/cloud/git/connect-azure-devops.md @@ -4,6 +4,8 @@ id: "connect-azure-devops" pagination_next: "docs/cloud/git/setup-azure" --- +# Connect to Azure DevOps <Lifecycle status="enterprise" /> + <Snippet path="available-enterprise-tier-only" /> diff --git a/website/docs/docs/cloud/git/connect-gitlab.md b/website/docs/docs/cloud/git/connect-gitlab.md index 648a4543932..d16cdb15b8e 100644 --- a/website/docs/docs/cloud/git/connect-gitlab.md +++ b/website/docs/docs/cloud/git/connect-gitlab.md @@ -10,6 +10,7 @@ Connecting your GitLab account to dbt Cloud provides convenience and another lay - Clone repos using HTTPS rather than SSH. - Carry GitLab user permissions through to dbt Cloud or dbt Cloud CLI's git actions. - Trigger [Continuous integration](/docs/deploy/continuous-integration) builds when merge requests are opened in GitLab. + - GitLab automatically registers a webhook in your GitLab repository to enable seamless integration with dbt Cloud. The steps to integrate GitLab in dbt Cloud depend on your plan. If you are on: - the Developer or Team plan, read these [instructions](#for-dbt-cloud-developer-and-team-tiers). @@ -61,8 +62,8 @@ In GitLab, when creating your Group Application, input the following: | ------ | ----- | | **Name** | dbt Cloud | | **Redirect URI** | `https://YOUR_ACCESS_URL/complete/gitlab` | -| **Confidential** | ✔️ | -| **Scopes** | ✔️ api | +| **Confidential** | ✅ | +| **Scopes** | ✅ api | Replace `YOUR_ACCESS_URL` with the [appropriate Access URL](/docs/cloud/about-cloud/access-regions-ip-addresses) for your region and plan. @@ -114,20 +115,10 @@ If your GitLab account is not connected, you’ll see "No connected account". Se Once you approve authorization, you will be redirected to dbt Cloud, and you should see your connected account. 
You're now ready to start developing in the dbt Cloud IDE or dbt Cloud CLI. - ## Troubleshooting -### Errors when importing a repository on dbt Cloud project set up -If you do not see your repository listed, double-check that: -- Your repository is in a Gitlab group you have access to. dbt Cloud will not read repos associated with a user. - -If you do see your repository listed, but are unable to import the repository successfully, double-check that: -- You are a maintainer of that repository. Only users with maintainer permissions can set up repository connections. - -If you imported a repository using the dbt Cloud native integration with GitLab, you should be able to see the clone strategy is using a `deploy_token`. If it's relying on an SSH key, this means the repository was not set up using the native GitLab integration, but rather using the generic git clone option. The repository must be reconnected in order to get the benefits described above. - -## FAQs - +<FAQ path="Troubleshooting/gitlab-webhook"/> +<FAQ path="Troubleshooting/error-importing-repo"/> <FAQ path="Git/gitignore"/> <FAQ path="Git/gitlab-authentication"/> <FAQ path="Git/gitlab-selfhosted"/> diff --git a/website/docs/docs/cloud/git/import-a-project-by-git-url.md b/website/docs/docs/cloud/git/import-a-project-by-git-url.md index 5cd3553b07f..2b499b39cb7 100644 --- a/website/docs/docs/cloud/git/import-a-project-by-git-url.md +++ b/website/docs/docs/cloud/git/import-a-project-by-git-url.md @@ -49,7 +49,7 @@ If you use GitLab, you can import your repo directly using [dbt Cloud's GitLab A - To add a deploy key to a GitLab account, navigate to the [SSH keys](https://gitlab.com/profile/keys) tab in the User Settings page of your GitLab account. - Next, paste in the deploy key generated by dbt Cloud for your repository. - After saving this SSH key, dbt Cloud will be able to read and write files in your GitLab repository. -- Refer to [Adding a read only deploy key in GitLab](https://docs.gitlab.com/ee/ssh/#per-repository-deploy-keys) +- Refer to [Adding a read only deploy key in GitLab](https://docs.gitlab.com/ee/user/project/deploy_keys/) <Lightbox src="/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/f3ea88d-Screen_Shot_2019-10-16_at_4.45.50_PM.png" title="Configuring a GitLab SSH Key"/> diff --git a/website/docs/docs/cloud/manage-access/auth0-migration.md b/website/docs/docs/cloud/manage-access/auth0-migration.md index b7bab836810..2f45ad7dcc8 100644 --- a/website/docs/docs/cloud/manage-access/auth0-migration.md +++ b/website/docs/docs/cloud/manage-access/auth0-migration.md @@ -5,22 +5,10 @@ sidebar: "SSO Auth0 Migration" description: "Required actions for migrating to Auth0 for SSO services on dbt Cloud." --- -:::note - -This migration is a feature of the dbt Cloud Enterprise plan. To learn more about an Enterprise plan, contact us at [sales@getdbt.com](mailto::sales@getdbt.com). - -For single-tenant Virtual Private Cloud, you should [email dbt Cloud Support](mailto::support@getdbt.com) to set up or update your SSO configuration. - -::: - dbt Labs is partnering with Auth0 to bring enhanced features to dbt Cloud's single sign-on (SSO) capabilities. Auth0 is an identity and access management (IAM) platform with advanced security features, and it will be leveraged by dbt Cloud. These changes will require some action from customers with SSO configured in dbt Cloud today, and this guide will outline the necessary changes for each environment. 
If you have not yet configured SSO in dbt Cloud, refer instead to our setup guides for [SAML](/docs/cloud/manage-access/set-up-sso-saml-2.0), [Okta](/docs/cloud/manage-access/set-up-sso-okta), [Google Workspace](/docs/cloud/manage-access/set-up-sso-google-workspace), or [Microsoft Entra ID (formerly Azure AD)](/docs/cloud/manage-access/set-up-sso-microsoft-entra-id) single sign-on services. -## Auth0 Multi-tenant URIs - -<Snippet path="auth0-uri" /> - ## Start the migration The Auth0 migration feature is being rolled out incrementally to customers who have SSO features already enabled. When the migration option has been enabled on your account, you will see **SSO Updates Available** on the right side of the menu bar, near the settings icon. diff --git a/website/docs/docs/cloud/manage-access/environment-permissions.md b/website/docs/docs/cloud/manage-access/environment-permissions.md index b99da64609c..20acfae51f7 100644 --- a/website/docs/docs/cloud/manage-access/environment-permissions.md +++ b/website/docs/docs/cloud/manage-access/environment-permissions.md @@ -17,8 +17,8 @@ Environment-level permissions give dbt Cloud admins more flexibility to protect - Environment-level permissions do not allow you to create custom roles and permissions for each resource type in dbt Cloud. - You can only select environment types, and can’t specify a particular environment within a project. -- You can't select specific resources within environments. dbt Cloud jobs, runs, and environment variables are all environment resources. - - For example, you can't specify that a user only has access to jobs but not environment variables. Access to a given environment gives the user access to everything within that environment. +- You can't select specific resources within environments. dbt Cloud jobs and runs are environment resources. + - For example, you can't specify that a user only has access to jobs but not runs. Access to a given environment gives the user access to everything within that environment. ## Environments and roles diff --git a/website/docs/docs/cloud/manage-access/self-service-permissions.md b/website/docs/docs/cloud/manage-access/self-service-permissions.md index a5bdba825c2..6b326645d44 100644 --- a/website/docs/docs/cloud/manage-access/self-service-permissions.md +++ b/website/docs/docs/cloud/manage-access/self-service-permissions.md @@ -52,33 +52,33 @@ The following tables outline the access that users have if they are assigned a D | Account-level permission| Owner | Member | Read-only license| IT license | |:------------------------|:-----:|:------:|:----------------:|:------------:| -| Account settings | W | W | | W | -| Billing | W | | | W | -| Invitations | W | W | | W | -| Licenses | W | R | | W | -| Users | W | R | | W | -| Project (create) | W | W | | W | -| Connections | W | W | | W | -| Service tokens | W | | | W | -| Webhooks | W | W | | | +| Account settings | W | W | - | W | +| Billing | W | - | - | W | +| Invitations | W | W | - | W | +| Licenses | W | R | - | W | +| Users | W | R | - | W | +| Project (create) | W | W | - | W | +| Connections | W | W | - | W | +| Service tokens | W | - | - | W | +| Webhooks | W | W | - | - | #### Project permissions for account roles |Project-level permission | Owner | Member | Read-only | IT license | |:------------------------|:-----:|:-------:|:---------:|:----------:| -| Adapters | W | W | R | | -| Connections | W | W | R | | -| Credentials | W | W | R | | -| Custom env. 
variables | W | W | R | | -| Develop (IDE or dbt Cloud CLI)| W | W | | | -| Environments | W | W | R | | -| Jobs | W | W | R | | -| dbt Explorer | W | W | R | | -| Permissions | W | R | | | -| Profile | W | W | R | | -| Projects | W | W | R | | -| Repositories | W | W | R | | -| Runs | W | W | R | | -| Semantic Layer Config | W | W | R | | +| Adapters | W | W | R | - | +| Connections | W | W | R | - | +| Credentials | W | W | R | - | +| Custom env. variables | W | W | R | - | +| Develop (IDE or dbt Cloud CLI)| W | W | - | - | +| Environments | W | W | R | - | +| Jobs | W | W | R | - | +| dbt Explorer | W | W | R | - | +| Permissions | W | R | - | - | +| Profile | W | W | R | - | +| Projects | W | W | R | - | +| Repositories | W | W | R | - | +| Runs | W | W | R | - | +| Semantic Layer Config | W | W | R | - | diff --git a/website/docs/docs/cloud/manage-access/set-up-sso-microsoft-entra-id.md b/website/docs/docs/cloud/manage-access/set-up-sso-microsoft-entra-id.md index de935627765..81463cf9ee5 100644 --- a/website/docs/docs/cloud/manage-access/set-up-sso-microsoft-entra-id.md +++ b/website/docs/docs/cloud/manage-access/set-up-sso-microsoft-entra-id.md @@ -61,6 +61,13 @@ Depending on your Microsoft Entra ID settings, your App Registration page might ### Azure <-> dbt Cloud User and Group mapping +:::important + +There is a [limitation](https://learn.microsoft.com/en-us/entra/identity/hybrid/connect/how-to-connect-fed-group-claims#important-caveats-for-this-functionality) on the number of groups Azure will emit (capped at 150) via the SSO token, meaning if a user belongs to more than 150 groups, it will appear as though they belong to none. To prevent this, configure [group assignments](https://learn.microsoft.com/en-us/entra/identity/enterprise-apps/assign-user-or-group-access-portal?pivots=portal) with the dbt Cloud app in Azure and set a [group claim](https://learn.microsoft.com/en-us/entra/identity/hybrid/connect/how-to-connect-fed-group-claims#add-group-claims-to-tokens-for-saml-applications-using-sso-configuration) so Azure emits only the relevant groups. + +::: + + The Azure users and groups you will create in the following steps are mapped to groups created in dbt Cloud based on the group name. Reference the docs on [enterprise permissions](enterprise-permissions) for additional information on how users, groups, and permission sets are configured in dbt Cloud. ### Adding users to an Enterprise application diff --git a/website/docs/docs/cloud/manage-access/set-up-sso-saml-2.0.md b/website/docs/docs/cloud/manage-access/set-up-sso-saml-2.0.md index 7083e7ac5f8..34c1a91fbee 100644 --- a/website/docs/docs/cloud/manage-access/set-up-sso-saml-2.0.md +++ b/website/docs/docs/cloud/manage-access/set-up-sso-saml-2.0.md @@ -16,7 +16,7 @@ Currently supported features include: This document details the steps to integrate dbt Cloud with an identity provider in order to configure Single Sign On and [role-based access control](/docs/cloud/manage-access/about-user-access#role-based-access-control). -## Auth0 Multi-tenant URIs +## Auth0 URIs <Snippet path="auth0-uri" /> diff --git a/website/docs/docs/cloud/manage-access/sso-overview.md b/website/docs/docs/cloud/manage-access/sso-overview.md index 6b6527df753..e922a073fc8 100644 --- a/website/docs/docs/cloud/manage-access/sso-overview.md +++ b/website/docs/docs/cloud/manage-access/sso-overview.md @@ -12,7 +12,7 @@ dbt Cloud supports JIT (Just-in-Time) provisioning and IdP-initiated login. 
You - You have a dbt Cloud account enrolled in the Enterprise plan. [Contact us](mailto:sales@getdbt.com) to learn more and enroll. -## Auth0 Multi-tenant URIs +## Auth0 URIs <Snippet path="auth0-uri" /> diff --git a/website/docs/docs/cloud/secure/databricks-privatelink.md b/website/docs/docs/cloud/secure/databricks-privatelink.md index d754f2b76c4..aaa6e0c6eb7 100644 --- a/website/docs/docs/cloud/secure/databricks-privatelink.md +++ b/website/docs/docs/cloud/secure/databricks-privatelink.md @@ -34,7 +34,7 @@ The following steps will walk you through the setup of a Databricks AWS PrivateL 1. Once dbt Cloud support has notified you that setup is complete, [register the VPC endpoint in Databricks](https://docs.databricks.com/administration-guide/cloud-configurations/aws/privatelink.html#step-3-register-privatelink-objects-and-attach-them-to-a-workspace) and attach it to the workspace: - [Register your VPC endpoint](https://docs.databricks.com/en/security/network/classic/vpc-endpoints.html) — Register the VPC endpoint using the VPC endpoint ID provided by dbt Support. - [Create a Private Access Settings object](https://docs.databricks.com/en/security/network/classic/private-access-settings.html) — Create a Private Access Settings (PAS) object with your desired public access settings, and setting Private Access Level to **Endpoint**. Choose the registered endpoint created in the previous step. - - [Create or update your workspace](https://docs.databricks.com/en/security/network/classic/privatelink.html#step-3d-create-or-update-the-workspace-front-end-back-end-or-both) — Create a workspace, or update your an existing workspace. Under **Advanced configurations → Private Link** choose the private access settings object created in the previous step. + - [Create or update your workspace](https://docs.databricks.com/en/security/network/classic/privatelink.html#step-3d-create-or-update-the-workspace-front-end-back-end-or-both) — Create a workspace, or update an existing workspace. Under **Advanced configurations → Private Link** choose the private access settings object created in the previous step. :::warning If using an existing Databricks workspace, all workloads running in the workspace need to be stopped to enable Private Link. Workloads also can't be started for another 20 minutes after making changes. From the [Databricks documentation](https://docs.databricks.com/en/security/network/classic/privatelink.html#step-3d-create-or-update-the-workspace-front-end-back-end-or-both): diff --git a/website/docs/docs/cloud/use-dbt-copilot.md b/website/docs/docs/cloud/use-dbt-copilot.md index 30def967f96..48e5ffa6fa7 100644 --- a/website/docs/docs/cloud/use-dbt-copilot.md +++ b/website/docs/docs/cloud/use-dbt-copilot.md @@ -1,22 +1,73 @@ --- title: "Use dbt Copilot" sidebar_label: "Use dbt Copilot" -description: "Use the dbt Copilot AI engine to generate documentation, tests, and semantic models from scratch, giving you the flexibility to modify or fix generated code." +description: "Use dbt Copilot to generate documentation, tests, semantic models, and sql code from scratch, giving you the flexibility to modify or fix generated code." --- # Use dbt Copilot <Lifecycle status='beta'/> -Use dbt Copilot to generate documentation, tests, and semantic models from scratch, giving you the flexibility to modify or fix generated code. 
To access and use this AI engine: +Use dbt Copilot to generate documentation, tests, semantic models, and code from scratch, giving you the flexibility to modify or fix generated code. -1. Navigate to the dbt Cloud IDE and select a SQL model file under the **File Explorer**. +This page explains how to use dbt Copilot to: -2. In the **Console** section (under the **File Editor**), click **dbt Copilot** to view the available AI options. +- [Generate resources](#generate-resources) — Save time by using dbt Copilot’s generation button to generate documentation, tests, and semantic model files during your development. +- [Generate and edit code](#generate-and-edit-code) — Use natural language prompts to generate SQL code from scratch or to edit an existing SQL file by using keyboard shortcuts or highlighting code. + +## Generate resources +Generate documentation, tests, and semantic model resources with the click of a button using dbt Copilot, saving you time. To access and use this AI feature: + +1. Navigate to the dbt Cloud IDE and select a SQL model file under the **File Explorer**. +2. In the **Console** section (under the **File Editor**), click **dbt Copilot** to view the available AI options. 3. Select the available options to generate the YAML config: **Generate Documentation**, **Generate Tests**, or **Generate Semantic Model**. - To generate multiple YAML configs for the same model, click each option separately. dbt Copilot intelligently saves the YAML config in the same file. - 4. Verify the AI-generated code. You can update or fix the code as needed. - 5. Click **Save As**. You should see the file changes under the **Version control** section. <Lightbox src="/img/docs/dbt-cloud/cloud-ide/dbt-copilot-doc.gif" width="100%" title="Example of using dbt Copilot to generate documentation in the IDE" /> + +## Generate and edit code <Lifecycle status='beta'/> + +dbt Copilot also allows you to generate SQL code directly within the SQL file in the dbt Cloud IDE, using natural language prompts. This means you can rewrite or add specific portions of the SQL file without needing to edit the entire file. + +This intelligent AI tool streamlines SQL development by reducing errors, scaling effortlessly with complexity, and saving valuable time. dbt Copilot's [prompt window](#use-the-prompt-window), accessible by keyboard shortcut, handles repetitive or complex SQL generation effortlessly so you can focus on high-level tasks. + +Use Copilot's prompt window for use cases like: + +- Writing advanced transformations +- Performing bulk edits efficiently +- Crafting complex patterns like regex + +### Use the prompt window + +Access dbt Copilot's AI prompt window using the keyboard shortcut Cmd+B (Mac) or Ctrl+B (Windows) to: + +#### 1. Generate SQL from scratch +- Use the keyboard shortcuts Cmd+B (Mac) or Ctrl+B (Windows) to generate SQL from scratch. +- Enter your instructions to generate SQL code tailored to your needs using natural language. +- Ask dbt Copilot to fix the code or add a specific portion of the SQL file. + +<Lightbox src="/img/docs/dbt-cloud/cloud-ide/copilot-sql-generation-prompt.jpg" width="90%" title="dbt Copilot's prompt window accessible by keyboard shortcut Cmd+B (Mac) or Ctrl+B (Windows)" /> + +#### 2. Edit existing SQL code +- Highlight a section of SQL code and press Cmd+B (Mac) or Ctrl+B (Windows) to open the prompt window for editing. +- Use this to refine or modify specific code snippets based on your needs. 
+- Ask dbt Copilot to fix the code or add a specific portion of the SQL file. + +#### 3. Review changes with the diff view to quickly assess the impact of the changes before accepting them +- When a suggestion is generated, Copilot displays a visual "diff" view to help you compare the proposed changes with your existing code: + - **Green**: Highlights new code that will be added if you accept the suggestion. + - **Red**: Highlights existing code that will be removed or replaced by the suggested changes. + +#### 4. Accept or reject suggestions +- **Accept**: If the generated SQL meets your requirements, click the **Accept** button to apply the changes directly to your `.sql` file in the IDE. +- **Reject**: If the suggestion doesn’t align with your request/prompt, click **Reject** to discard the generated SQL without making changes and start again. + +#### 5. Regenerate code +- To regenerate, press the **Escape** button on your keyboard (or click the Reject button in the popup). This will remove the generated code and put your cursor back into the prompt text area. +- Update your prompt and press **Enter** to try another generation. Press **Escape** again to close the popover entirely. + +Once you've accepted a suggestion, you can continue to use the prompt window to generate additional SQL code and commit your changes to the branch. + +<Lightbox src="/img/docs/dbt-cloud/cloud-ide/copilot-sql-generation.gif" width="100%" title="Edit existing SQL code using dbt Copilot's prompt window accessible by keyboard shortcut Cmd+B (Mac) or Ctrl+B (Windows)" /> + diff --git a/website/docs/docs/cloud/use-visual-editor.md b/website/docs/docs/cloud/use-visual-editor.md index b390432b227..2ab6a5b82d1 100644 --- a/website/docs/docs/cloud/use-visual-editor.md +++ b/website/docs/docs/cloud/use-visual-editor.md @@ -22,8 +22,7 @@ To join the private beta, [register your interest](https://docs.google.com/forms/d/e/1FAIpQLScPjRGyrtgfmdY919Pf3kgqI5E95xxPXz-8JoVruw-L9jVtxg/viewform) - You have a [dbt Cloud Enterprise](https://www.getdbt.com/pricing) account - You have a [developer license](/docs/cloud/manage-access/seats-and-users) with developer credentials set up - You have an existing dbt Cloud project already created -- You are [Keep on latest](/docs/dbt-versions/upgrade-dbt-version-in-cloud#keep-on-latest-version) for a versionless experience -- Successful job run on Production or Staging [environment](/docs/dbt-cloud-environments) +- Your Development environment is on a supported [release track](/docs/dbt-versions/cloud-release-tracks) to receive ongoing updates. - Have AI-powered features toggle enabled ## Access visual editor diff --git a/website/docs/docs/collaborate/auto-exposures.md b/website/docs/docs/collaborate/auto-exposures.md index 9b25a2fb305..495906cee75 100644 --- a/website/docs/docs/collaborate/auto-exposures.md +++ b/website/docs/docs/collaborate/auto-exposures.md @@ -9,11 +9,16 @@ image: /img/docs/cloud-integrations/auto-exposures/explorer-lineage.jpg # Auto-exposures <Lifecycle status="preview,enterprise" /> -As a data team, it’s critical that you have context into the downstream use cases and users of your data products. Auto-exposures integrates natively with Tableau (Power BI coming soon) and auto-generates downstream lineage in dbt Explorer for a richer experience. +As a data team, it’s critical that you have context into the downstream use cases and users of your data products. Auto-exposures integrate natively with Tableau (Power BI coming soon) and auto-generate downstream lineage in dbt Explorer for a richer experience. 
-Auto-exposures helps users understand how their models are used in downstream analytics tools to inform investments and reduce incidents — ultimately building trust and confidence in data products. It imports and auto-generates exposures based on Tableau dashboards, with user-defined curation. +Auto-exposures help users understand how their models are used in downstream analytics tools to inform investments and reduce incidents — ultimately building trust and confidence in data products. They import and auto-generate exposures based on Tableau dashboards, with user-defined curation. -Auto-exposures is available on [Versionless](/docs/dbt-versions/versionless-cloud) and on [dbt Cloud Enterprise](https://www.getdbt.com/pricing/) plans. +## Supported plans +Auto-exposures is available on the [dbt Cloud Enterprise](https://www.getdbt.com/pricing/) plan. Currently, you can only connect to a single Tableau site on the same server. + +:::info Tableau Server +If you're using Tableau Server, you need to [allowlist dbt Cloud's IP addresses](/docs/cloud/about-cloud/access-regions-ip-addresses) for your dbt Cloud region. +::: For more information on how to set up auto-exposures, prerequisites, and more — refer to [configure auto-exposures in Tableau and dbt Cloud](/docs/cloud-integrations/configure-auto-exposures). diff --git a/website/docs/docs/collaborate/explore-projects.md b/website/docs/docs/collaborate/explore-projects.md index a4388a8696e..3780d100932 100644 --- a/website/docs/docs/collaborate/explore-projects.md +++ b/website/docs/docs/collaborate/explore-projects.md @@ -164,12 +164,12 @@ Under the the **Models** option, you can filter on model properties (access or m <Expandable alt_header="Trust signals for resources" lifecycle="preview"> -Trust signal icons offer a quick, at-a-glance view of data health when browsing your models in dbt Explorer. These icons keep you informed on the status of your model's health using the indicators **Healthy**, **Caution**, **Degraded**, and **Unknown**. For accurate health data, ensure the resource is up-to-date and has had a recent job run. +Trust signal icons offer a quick, at-a-glance view of data health when browsing your resources in dbt Explorer. These icons keep you informed on the status of your resource's health using the indicators **Healthy**, **Caution**, **Degraded**, and **Unknown**. For accurate health data, ensure the resource is up-to-date and has had a recent job run. Supported resources are models, sources, and exposures. Each trust signal icon reflects key data health components, such as test success status, missing resource descriptions, absence of builds in 30-day windows, and more. To access trust signals: -- Use the search function or click on **Models** or **Sources** under the **Resource** tab. +- Use the search function or click on **Models**, **Sources**, or **Exposures** under the **Resource** tab. - View the icons under the **Health** column. - Hover over or click the trust signal to see detailed information. - For sources, the trust signal also indicates the source freshness status. 
diff --git a/website/docs/docs/collaborate/govern/model-contracts.md b/website/docs/docs/collaborate/govern/model-contracts.md index d30024157c8..9b75e518719 100644 --- a/website/docs/docs/collaborate/govern/model-contracts.md +++ b/website/docs/docs/collaborate/govern/model-contracts.md @@ -205,13 +205,11 @@ At the same time, for models with many columns, we understand that this can mean When comparing to a previous project state, dbt will look for breaking changes that could impact downstream consumers. If breaking changes are detected, dbt will present a contract error. -Breaking changes include: -- Removing an existing column. -- Changing the `data_type` of an existing column. -- Removing or modifying one of the `constraints` on an existing column (dbt v1.6 or higher). -- Removing a contracted model by deleting, renaming, or disabling it (dbt v1.9 or higher). - - versioned models will raise an error. - - unversioned models will raise a warning. +import BreakingChanges from '/snippets/_versions-contracts.md'; -More details are available in the [contract reference](/reference/resource-configs/contract#detecting-breaking-changes). +<BreakingChanges +value="Removing a contracted model by deleting, renaming, or disabling it (dbt v1.9 or higher)." +value2="versioned models will raise an error. unversioned models will raise a warning." +/> +More details are available in the [contract reference](/reference/resource-configs/contract#detecting-breaking-changes). diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index 7813e25efcb..bbda99960cd 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -18,7 +18,6 @@ This year, dbt Labs is introducing an expanded notion of `dependencies` across m ## Prerequisites - Available in [dbt Cloud Enterprise](https://www.getdbt.com/pricing). If you have an Enterprise account, you can unlock these features by designating a [public model](/docs/collaborate/govern/model-access) and adding a [cross-project ref](#how-to-write-cross-project-ref). <Lifecycle status="enterprise"/> -- Use a supported version of dbt (v1.6 or newer or go versionless with "[Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless)") for both the upstream ("producer") project and the downstream ("consumer") project. - Define models in an upstream ("producer") project that are configured with [`access: public`](/reference/resource-configs/access). You need at least one successful job run after defining their `access`. - Define a deployment environment in the upstream ("producer") project [that is set to be your Production environment](/docs/deploy/deploy-environments#set-as-production-environment), and ensure it has at least one successful job run in that environment. - If the upstream project has a Staging environment, run a job in that Staging environment to ensure the downstream cross-project ref resolves. 
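For reference, here is a minimal sketch of how a downstream ("consumer") project can declare its upstream ("producer") project in `dependencies.yml`; the project name shown is hypothetical:

```yaml
# dependencies.yml in the downstream ("consumer") project
projects:
  - name: jaffle_finance   # hypothetical upstream ("producer") project name
```

A model in the consumer project can then reference a public model from the producer project with a two-argument ref, for example `{{ ref('jaffle_finance', 'monthly_revenue') }}`.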
diff --git a/website/docs/docs/community-adapters.md b/website/docs/docs/community-adapters.md index 3af4e15b32b..895e47a8fa3 100644 --- a/website/docs/docs/community-adapters.md +++ b/website/docs/docs/community-adapters.md @@ -7,7 +7,8 @@ Community adapters are adapter plugins contributed and maintained by members of | Data platforms (click to view setup guide) ||| | ------------------------------------------ | -------------------------------- | ------------------------------------- | -| [Clickhouse](/docs/core/connect-data-platform/clickhouse-setup) | [Databend Cloud](/docs/core/connect-data-platform/databend-setup) | [Doris & SelectDB](/docs/core/connect-data-platform/doris-setup) | +| [Clickhouse](/docs/core/connect-data-platform/clickhouse-setup) | [CrateDB](/docs/core/connect-data-platform/cratedb-setup) +| [Databend Cloud](/docs/core/connect-data-platform/databend-setup) | [Doris & SelectDB](/docs/core/connect-data-platform/doris-setup) | | [DuckDB](/docs/core/connect-data-platform/duckdb-setup) | [Exasol Analytics](/docs/core/connect-data-platform/exasol-setup) | [Extrica](/docs/core/connect-data-platform/extrica-setup) | | [Hive](/docs/core/connect-data-platform/hive-setup) | [IBM DB2](/docs/core/connect-data-platform/ibmdb2-setup) | [Impala](/docs/core/connect-data-platform/impala-setup) | | [Infer](/docs/core/connect-data-platform/infer-setup) | [iomete](/docs/core/connect-data-platform/iomete-setup) | [MindsDB](/docs/core/connect-data-platform/mindsdb-setup) | diff --git a/website/docs/docs/core/connect-data-platform/cratedb-setup.md b/website/docs/docs/core/connect-data-platform/cratedb-setup.md new file mode 100644 index 00000000000..fa1b9833e59 --- /dev/null +++ b/website/docs/docs/core/connect-data-platform/cratedb-setup.md @@ -0,0 +1,62 @@ +--- +title: "CrateDB setup" +description: "Read this guide to learn about the CrateDB data platform setup in dbt." +id: "cratedb-setup" +meta: + maintained_by: Crate.io, Inc. + authors: 'CrateDB maintainers' + github_repo: 'crate/dbt-cratedb2' + pypi_package: 'dbt-cratedb2' + min_core_version: 'v1.0.0' + cloud_support: Not Supported + min_supported_version: 'n/a' + slack_channel_name: 'Community Forum' + slack_channel_link: 'https://community.cratedb.com/' + platform_name: 'CrateDB' + config_page: '/reference/resource-configs/no-configs' +--- + +import SetUpPages from '/snippets/_setup-pages-intro.md'; + +<SetUpPages meta={frontMatter.meta}/> + + +[CrateDB] is compatible with PostgreSQL, so its dbt adapter strongly depends on +dbt-postgres, documented at [PostgreSQL profile setup]. + +CrateDB targets are configured exactly the same way, see also [PostgreSQL +configuration], with just a few things to consider which are special to +CrateDB. Relevant details are outlined at [using dbt with CrateDB], +which also includes up-to-date information. + + +## Profile configuration + +CrateDB targets should be set up using a configuration like this minimal sample +of settings in your [`profiles.yml`] file. + +<File name='~/.dbt/profiles.yml'> + +```yaml +cratedb_analytics: + target: dev + outputs: + dev: + type: cratedb + host: [clustername].aks1.westeurope.azure.cratedb.net + port: 5432 + user: [username] + pass: [password] + dbname: crate # Do not change this value. CrateDB's only catalog is `crate`. + schema: doc # Define the schema name. CrateDB's default schema is `doc`. 
+``` + +</File> + + + +[CrateDB]: https://cratedb.com/database +[PostgreSQL configuration]: https://docs.getdbt.com/reference/resource-configs/postgres-configs +[PostgreSQL profile setup]: https://docs.getdbt.com/docs/core/connect-data-platform/postgres-setup +[`profiles.yml`]: https://docs.getdbt.com/docs/core/connect-data-platform/profiles.yml +[using dbt with CrateDB]: https://cratedb.com/docs/guide/integrate/dbt/ diff --git a/website/docs/docs/core/connect-data-platform/dremio-setup.md b/website/docs/docs/core/connect-data-platform/dremio-setup.md index 21d0ee2956b..69f2b14fc4f 100644 --- a/website/docs/docs/core/connect-data-platform/dremio-setup.md +++ b/website/docs/docs/core/connect-data-platform/dremio-setup.md @@ -60,10 +60,6 @@ Next, configure the profile for your project. When you initialize a project, you create one of these three profiles. You must configure it before trying to connect to Dremio Cloud or Dremio Software. -## Profiles - -When you initialize a project, you create one of these three profiles. You must configure it before trying to connect to Dremio Cloud or Dremio Software. - * Profile for Dremio Cloud * Profile for Dremio Software with Username/Password Authentication * Profile for Dremio Software with Authentication Through a Personal Access Token @@ -149,9 +145,7 @@ For descriptions of the configurations in these profiles, see [Configurations](# </TabItem> </Tabs> -## Configurations - -### Configurations Common to Profiles for Dremio Cloud and Dremio Software +## Configurations Common to Profiles for Dremio Cloud and Dremio Software | Configuration | Required? | Default Value | Description | diff --git a/website/docs/docs/core/connect-data-platform/mssql-setup.md b/website/docs/docs/core/connect-data-platform/mssql-setup.md index f2b17278df3..31fa93874cf 100644 --- a/website/docs/docs/core/connect-data-platform/mssql-setup.md +++ b/website/docs/docs/core/connect-data-platform/mssql-setup.md @@ -4,7 +4,7 @@ description: "Read this guide to learn about the Microsoft SQL Server warehouse id: "mssql-setup" meta: maintained_by: Community - authors: 'dbt-msft community (https://github.com/dbt-msft)' + authors: 'Mikael Ene & dbt-msft community (https://github.com/dbt-msft)' github_repo: 'dbt-msft/dbt-sqlserver' pypi_package: 'dbt-sqlserver' min_core_version: 'v0.14.0' diff --git a/website/docs/docs/core/connect-data-platform/redshift-setup.md b/website/docs/docs/core/connect-data-platform/redshift-setup.md index ce3e8658045..4c00558d782 100644 --- a/website/docs/docs/core/connect-data-platform/redshift-setup.md +++ b/website/docs/docs/core/connect-data-platform/redshift-setup.md @@ -31,7 +31,7 @@ import SetUpPages from '/snippets/_setup-pages-intro.md'; | `port` | 5439 | | | `dbname` | my_db | Database name| | `schema` | my_schema | Schema name| -| `connect_timeout` | `None` or 30 | Number of seconds before connection times out| +| `connect_timeout` | 30 | Number of seconds before connection times out. Default is `None`| | `sslmode` | prefer | optional, set the sslmode to connect to the database. Default prefer, which will use 'verify-ca' to connect. For more information on `sslmode`, see Redshift note below| | `role` | None | Optional, user identifier of the current session| | `autocreate` | false | Optional, default false. 
Creates user if they do not exist | diff --git a/website/docs/docs/core/connect-data-platform/trino-setup.md b/website/docs/docs/core/connect-data-platform/trino-setup.md index 4caa56dcb00..06c94d7e7ff 100644 --- a/website/docs/docs/core/connect-data-platform/trino-setup.md +++ b/website/docs/docs/core/connect-data-platform/trino-setup.md @@ -34,7 +34,7 @@ The following profile fields are always required except for `user`, which is als | Field | Example | Description | | --------- | ------- | ----------- | -| `host` | `mycluster.mydomain.com` | The hostname of your cluster.<br/><br/>Don't include the `http://` or `https://` prefix. | +| `host` | `mycluster.mydomain.com`<br/><br/>Format for Starburst Galaxy:<br/><ul><li>`mygalaxyaccountname-myclustername.trino.galaxy.starburst.io`</li></ul> | The hostname of your cluster.<br/><br/>Don't include the `http://` or `https://` prefix. | | `database` | `my_postgres_catalog` | The name of a catalog in your cluster. | | `schema` | `my_schema` | The name of a schema within your cluster's catalog. <br/><br/>It's _not recommended_ to use schema names that have upper case or mixed case letters. | | `port` | `443` | The port to connect to your cluster. By default, it's 443 for TLS enabled clusters. | diff --git a/website/docs/docs/dbt-cloud-apis/user-tokens.md b/website/docs/docs/dbt-cloud-apis/user-tokens.md index 02a81d80139..b7bf4fdce28 100644 --- a/website/docs/docs/dbt-cloud-apis/user-tokens.md +++ b/website/docs/docs/dbt-cloud-apis/user-tokens.md @@ -8,7 +8,7 @@ pagination_next: "docs/dbt-cloud-apis/service-tokens" :::Warning -User API tokens have been deprecated and will no longer work. [Migrate](#migrate-from-user-api-keys-to-personal-access-tokens) to personal access tokens to resume services. +User API tokens have been deprecated and will no longer work. [Migrate](#migrate-deprecated-user-api-keys-to-personal-access-tokens) to personal access tokens to resume services. ::: diff --git a/website/docs/docs/dbt-versions/versionless-cloud.md b/website/docs/docs/dbt-versions/cloud-release-tracks.md similarity index 55% rename from website/docs/docs/dbt-versions/versionless-cloud.md rename to website/docs/docs/dbt-versions/cloud-release-tracks.md index 34ffc34f68a..290078da572 100644 --- a/website/docs/docs/dbt-versions/versionless-cloud.md +++ b/website/docs/docs/dbt-versions/cloud-release-tracks.md @@ -1,18 +1,61 @@ --- -title: "Upgrade to \"Versionless\" in dbt Cloud" -sidebar_label: "Upgrade to \"Versionless\" " -description: "Learn how to go versionless in dbt Cloud. You never have to perform an upgrade again. Plus, you'll be able to access new features and enhancements as soon as they become available. " +title: "Release tracks in dbt Cloud" +sidebar_label: "dbt Cloud Release Tracks" +description: "Learn how to get automatic upgrades to dbt in dbt Cloud. Access new features and enhancements as soon as they become available." --- -Since May 2024, new capabilities in dbt are delivered continuously to dbt Cloud. We call this "versionless dbt," because your projects and environments are upgraded automatically. +Since May 2024, new capabilities in the dbt framework are delivered continuously to dbt Cloud. Your projects and environments are upgraded automatically on a cadence that you choose, depending on your dbt Cloud plan. + +Previously, customers would pin to a minor version of dbt Core, and receive only patch updates during that specific version's active support period. 
Release tracks ensure that your project stays up-to-date with the modern capabilities of dbt Cloud and recent versions of dbt Core. This will require you to make one final update to your current jobs and environments. When that's done, you'll never have to think about managing, coordinating, or upgrading dbt versions again. -By moving your environments and jobs to "Versionless," you can get all the functionality in the latest features before they're in dbt Core — and more! — along with access to the new features and fixes as soon as they’re released. +By moving your environments and jobs to release tracks you can get all the functionality in dbt Cloud as soon as it's ready. On the "Latest" release track, this includes access to features _before_ they're available in final releases of dbt Core OSS. + +## Which release tracks are available? + +- **"Latest"** (available to all plans, formerly called "Versionless"): Provides a continuous release of the latest functionality in dbt Cloud. Includes early access to new features of the dbt framework before they're available in open source releases of dbt Core. +- <Lifecycle status="coming soon"/> **"Compatible"** (available to Team + Enterprise): Provides a monthly release aligned with the most recent open source versions of dbt Core and adapters, plus functionality exclusively available in dbt Cloud. +- <Lifecycle status="coming soon"/> **"Extended"** (available to Enterprise): Provides a delayed release of the previous month's "Compatible" release. + +The first "Compatible" release will be in December 2024, after the final release of dbt Core v1.9.0. For December 2024 only, the "Extended" release is the same as "Compatible." Starting in January 2025, "Extended" will be one month behind "Compatible." + +To configure an environment in the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api) or [Terraform](https://registry.terraform.io/providers/dbt-labs/dbtcloud/latest) to use a release track, set `dbt_version` to the release track name: +- `latest` (formerly called `versionless`; the old name is still supported) +- `compatible` (available to Team + Enterprise) +- `extended` (available to Enterprise) + +## Which release track should I choose? + +Choose the "Latest" release track to continuously receive new features, fixes, performance improvements — latest & greatest dbt. This is the default for all customers on dbt Cloud. + +Choose the "Compatible" and "Extended" release tracks if you need a less-frequent release cadence, the ability to test new dbt releases before they go live in production, and/or ongoing compatibility with the latest open source releases of dbt Core. -## Tips for upgrading {#upgrade-tips} +### Common architectures -If you regularly develop your dbt project in dbt Cloud and this is your first time trying “Versionless,” dbt Labs recommends that you try upgrading your project in a development environment. [Override your dbt version in development](/docs/dbt-versions/upgrade-dbt-version-in-cloud#override-dbt-version). Then, launch the IDE or Cloud CLI and do your development work as usual. Everything should work as you expect. 
+**Default** - majority of customers on all plans +- Prioritize immediate access to fixes and features +- Leave all environments on the "Latest" release track (default configuration) + +**Hybrid** - Team, Enterprise +- Prioritize ongoing compatibility between dbt Cloud and dbt Core for development & deployment using both products in the same dbt projects +- Configure all environments to use the "Compatible" release track +- Understand that new features will not be available until they are first released in dbt Core OSS (several months after the "Latest" release track) + +**Cautious** - Enterprise, Business Critical +- Prioritize "bake in" time for new features & fixes +- Configure development & test environments to use the "Compatible" release track +- Configure pre-production & production environments to use the "Extended" release track +- Understand that new features will not be available until they are first released in dbt Core OSS + Compatible track + +**Virtual Private dbt or Single Tenant** +- Changes to all release tracks roll out as part of dbt Cloud instance upgrades once per week + +## Upgrading from older versions + +### How to upgrade {#upgrade-tips} + +If you regularly develop your dbt project in dbt Cloud, and you're still running on a legacy version of dbt Core, dbt Labs recommends that you try upgrading your project in a development environment. [Override your dbt version in development](/docs/dbt-versions/upgrade-dbt-version-in-cloud#override-dbt-version). Then, launch the IDE or Cloud CLI and do your development work as usual. Everything should work as you expect. If you do see something unexpected or surprising, revert back to the previous version and record the differences you observed. [Contact dbt Cloud support](/docs/dbt-support#dbt-cloud-support) with your findings for a more detailed investigation. @@ -20,25 +63,23 @@ Next, we recommend that you try upgrading your project’s [deployment environme If your organization has multiple dbt projects, we recommend starting your upgrade with projects that are smaller, newer, or more familiar for your team. That way, if you do encounter any issues, it'll be easier and faster to troubleshoot those before proceeding to upgrade larger or more complex projects. -## Considerations - -The following is our guidance on some important considerations regarding dbt projects as part of the upgrade. +### Considerations -To learn more about how dbt Labs deploys stable dbt upgrades in a safe manner to dbt Cloud, we recommend that you read our blog post [How we're making sure you can confidently go "Versionless" in dbt Cloud](https://docs.getdbt.com/blog/latest-dbt-stability) for details. +To learn more about how dbt Labs deploys stable dbt upgrades in a safe manner to dbt Cloud, we recommend that you read our blog post: [How we're making sure you can confidently switch to the \"Latest\" release track in dbt Cloud](https://docs.getdbt.com/blog/latest-dbt-stability). If you're running dbt version 1.6 or older, please know that your version of dbt Core has reached [end-of-life (EOL)](/docs/dbt-versions/core#eol-version-support) and is no longer supported. We strongly recommend that you update to a newer version as soon as reasonably possible. -dbt Labs has extended the critical support period of dbt Core v1.7 for dbt Cloud Enterprise customers. +dbt Labs has extended the critical support period of dbt Core v1.7 for dbt Cloud Enterprise customers to January 31, 2025. 
At that point, we will be asking all customers to select a Release Track for receiving ongoing updates to dbt in dbt Cloud. <Expandable alt_header="I'm using an older version of dbt in dbt Cloud. What should I do? What happens if I do nothing?" > If you're running dbt version v1.6 or older, please know that your version of dbt Core has reached [end-of-life (EOL)](/docs/dbt-versions/core#eol-version-support) and is no longer supported. We strongly recommend that you update to a newer version as soon as reasonably possible. -dbt Labs has extended the "Critical Support" period of dbt Core v1.7 for dbt Cloud Enterprise customers while we work through the migration with those customers to automatic upgrades. In the meantime, this means that v1.7 will continue to be accessible in dbt Cloud for Enteprise customers, jobs and environments on v1.7 for those customers will not be automatically migrated to "Versionless," and dbt Labs will continue to fix critical bugs and security issues. +dbt Labs has extended the "Critical Support" period of dbt Core v1.7 for dbt Cloud Enterprise customers while we work through the migration with those customers to Release Tracks. In the meantime, this means that v1.7 will continue to be accessible in dbt Cloud for Enterprise customers, jobs and environments on v1.7 for those customers will not be automatically migrated to "Latest," and dbt Labs will continue to fix critical bugs and security issues. -dbt Cloud accounts on the Developer and Team plans will be migrated to "Versionless" dbt after November 1, 2024. If you know that your project will not be compatible with the upgrade, for one of the reasons described here, or a different reason in your own testing, you should [contact dbt Cloud support](https://docs.getdbt.com/docs/dbt-support#dbt-cloud-support) to request an extension. +dbt Cloud accounts on the Developer and Team plans will be migrated to the "Latest" release track after November 1, 2024. If you know that your project will not be compatible with the upgrade, for one of the reasons described here, or a different reason in your own testing, you should [contact dbt Cloud support](https://docs.getdbt.com/docs/dbt-support#dbt-cloud-support) to request an extension. -If your account has been migrated to "Versionless," and you are seeing net-new failures in your scheduled dbt jobs, you should also [contact dbt Cloud support](https://docs.getdbt.com/docs/dbt-support#dbt-cloud-support) to request an extension. +If your account has been migrated to the "Latest" release track, and you are seeing net-new failures in your scheduled dbt jobs, you should also [contact dbt Cloud support](https://docs.getdbt.com/docs/dbt-support#dbt-cloud-support) to request an extension. </Expandable> @@ -61,7 +102,7 @@ You should [contact dbt Cloud support](https://docs.getdbt.com/docs/dbt-support# </Expandable> -<Expandable alt_header="I see that my account was migrated to Versionless. What should I do?" > +<Expandable alt_header="I see that my account was migrated to Latest. What should I do?" > For the vast majority of customers, there is no further action needed. @@ -75,9 +116,9 @@ When we talk about _latest version_, we’re referring to the underlying runtime If a new version of a dbt package includes a breaking change (for example, a change to one of the macros in `dbt_utils`), you don’t have to immediately use the new version. 
In your `packages` configuration (in `dependencies.yml` or `packages.yml`), you can still specify which versions or version ranges of packages you want dbt to install. If you're not already doing so, we strongly recommend [checking `package-lock.yml` into version control](/reference/commands/deps#predictable-package-installs) for predictable package installs in deployment environments and a clear change history whenever you install upgrades. -If you upgrade to “Versionless” and immediately see something that breaks, please [contact support](/docs/dbt-support#dbt-cloud-support) and, in the meantime, downgrade back to v1.7. +If you upgrade to the "Latest" release track, and immediately see something that breaks, please [contact support](/docs/dbt-support#dbt-cloud-support) and, in the meantime, downgrade back to v1.7. -If you’re already on “Versionless” and you observe a breaking change (like something worked yesterday, but today it isn't working, or works in a surprising/different way), please [contact support](/docs/dbt-support#dbt-cloud-support) immediately. Depending on your contracted support agreement, the dbt Labs team will respond within our SLA time and we would seek to roll back the change and/or roll out a fix (just as we would for any other part of dbt Cloud). This is the same whether or not the root cause of the breaking change is in the project code or in the code of a package. +If you’re already on the "Latest" release track, and you observe a breaking change (like something worked yesterday, but today it isn't working, or works in a surprising/different way), please [contact support](/docs/dbt-support#dbt-cloud-support) immediately. Depending on your contracted support agreement, the dbt Labs team will respond within our SLA time and we would seek to roll back the change and/or roll out a fix (just as we would for any other part of dbt Cloud). This is the same whether or not the root cause of the breaking change is in the project code or in the code of a package. If the package you’ve installed relies on _undocumented_ functionality of dbt, it doesn't have the same guarantees as functionality that we’ve documented and tested. However, we will still do our best to avoid breaking them. diff --git a/website/docs/docs/dbt-versions/compatible-track-changelog.md b/website/docs/docs/dbt-versions/compatible-track-changelog.md new file mode 100644 index 00000000000..8f31775e3f1 --- /dev/null +++ b/website/docs/docs/dbt-versions/compatible-track-changelog.md @@ -0,0 +1,27 @@ +--- +title: "dbt Cloud Compatible Track - Changelog" +sidebar_label: "Compatible Track Changelog" +description: "The Compatible release track updates once per month, and it includes up-to-date open source versions as of the monthly release." +--- + +:::info Coming soon + +The "Compatible" and "Extended" release tracks will be available in Preview to eligible dbt Cloud accounts in December 2024. + +::: + +Select the "Compatible" and "Extended" release tracks if you need a less-frequent release cadence, the ability to test new dbt releases before they go live in production, and/or ongoing compatibility with the latest open source releases of dbt Core. + +Each monthly "Compatible" release includes functionality matching up-to-date open source versions of dbt Core and adapters at the time of release. + +Starting in January 2025, each monthly "Extended" release will match the previous month's "Compatible" release. + +For more information, see [release tracks](/docs/dbt-versions/cloud-release-tracks). 
+ +## December 2024 + +Planned release: December 11-13 + +This release will include functionality from `dbt-core==1.9.0` and the most recent versions of all adapters supported in dbt Cloud. After the Compatible release is cut, we will update with: +- exact versions of open source dbt packages +- changelog notes concerning functionality specific to dbt Cloud diff --git a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md index 7ac5a743995..9a4712af528 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md +++ b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9.md @@ -1,5 +1,5 @@ --- -title: "Upgrading to v1.9 (beta)" +title: "Upgrading to v1.9" id: upgrading-to-v1.9 description: New features and changes in dbt Core v1.9 displayed_sidebar: "docs" @@ -9,14 +9,15 @@ displayed_sidebar: "docs" - [dbt Core 1.9 changelog](https://github.com/dbt-labs/dbt-core/blob/1.9.latest/CHANGELOG.md) - [dbt Core CLI Installation guide](/docs/core/installation-overview) -- [Cloud upgrade guide](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) +- [Cloud upgrade guide](/docs/dbt-versions/upgrade-dbt-version-in-cloud#release-tracks) ## What to know before upgrading dbt Labs is committed to providing backward compatibility for all versions 1.x. Any behavior changes will be accompanied by a [behavior change flag](/reference/global-configs/behavior-changes#behavior-change-flags) to provide a migration window for existing projects. If you encounter an error upon upgrading, please let us know by [opening an issue](https://github.com/dbt-labs/dbt-core/issues/new). -dbt Cloud is now [versionless](/docs/dbt-versions/versionless-cloud). If you have selected "Versionless" in dbt Cloud, you already have access to all the features, fixes, and other functionality that is included in dbt Core v1.9. -For users of dbt Core, since v1.8 we recommend explicitly installing both `dbt-core` and `dbt-<youradapter>`. This may become required for a future version of dbt. For example: +Starting in 2024, dbt Cloud provides the functionality from new versions of dbt Core via [release tracks](/docs/dbt-versions/cloud-release-tracks) with automatic upgrades. If you have selected the "Latest" release track in dbt Cloud, you already have access to all the features, fixes, and other functionality that is included in dbt Core v1.9! If you have selected the "Compatible" release track, you will have access in the next monthly "Compatible" release after the dbt Core v1.9 final release. + +For users of dbt Core, since v1.8, we recommend explicitly installing both `dbt-core` and `dbt-<youradapter>`. This may become required for a future version of dbt. For example: ```sql python3 -m pip install dbt-core dbt-snowflake @@ -29,7 +30,8 @@ Features and functionality new in dbt v1.9. ### Microbatch `incremental_strategy` :::info -While microbatch is in "beta", this functionality is still gated behind an env var, which will change to a behavior flag when 1.9 is GA. To use microbatch, set `DBT_EXPERIMENTAL_MICROBATCH` to `true` wherever you're running dbt Core. + +If you use a custom microbatch macro, set the [`require_batched_execution_for_custom_microbatch_strategy`](/reference/global-configs/behavior-changes#custom-microbatch-strategy) behavior flag in your `dbt_project.yml` to enable batched execution. 
If you don't have a custom microbatch macro, you don't need to set this flag as dbt will handle microbatching automatically for any model using the microbatch strategy. ::: Incremental models are, and have always been, a *performance optimization* — for datasets that are too large to be dropped and recreated from scratch every time you do a `dbt run`. Learn more about [incremental models](/docs/build/incremental-models-overview). @@ -47,12 +49,16 @@ Starting in Core 1.9, you can use the new [microbatch strategy](/docs/build/incr - Simplified query design: Write your model query for a single batch of data. dbt will use your `event_time`, `lookback`, and `batch_size` configurations to automatically generate the necessary filters for you, making the process more streamlined and reducing the need for you to manage these details. - Independent batch processing: dbt automatically breaks down the data to load into smaller batches based on the specified `batch_size` and processes each batch independently, improving efficiency and reducing the risk of query timeouts. If some of your batches fail, you can use `dbt retry` to load only the failed batches. - Targeted reprocessing: To load a *specific* batch or batches, you can use the CLI arguments `--event-time-start` and `--event-time-end`. +- [Automatic parallel batch execution](/docs/build/incremental-microbatch#parallel-batch-execution): Process multiple batches at the same time, instead of one after the other (sequentially), for faster processing of your microbatch models. dbt intelligently auto-detects if your batches can run in parallel, while also allowing you to manually override parallel execution with the [`concurrent_batches` config](/reference/resource-properties/concurrent_batches). + Currently microbatch is supported on these adapters with more to come: * postgres + * redshift * snowflake * bigquery * spark + * databricks ### Snapshots improvements @@ -64,9 +70,12 @@ Beginning in dbt Core 1.9, we've streamlined snapshot configuration and added a - Standard `schema` and `database` configs supported: Snapshots will now be consistent with other dbt resource types. You can specify where environment-aware snapshots should be stored. - Warning for incorrect `updated_at` data type: To ensure data integrity, you'll see a warning if the `updated_at` field specified in the snapshot configuration is not the proper data type or timestamp. - Set a custom current indicator for the value of `dbt_valid_to`: Use the [`dbt_valid_to_current` config](/reference/resource-configs/dbt_valid_to_current) to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table. +- Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) configuration to get more control over how to handle deleted rows from the source. Supported methods are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`. Setting `hard_deletes='new_record'` allows you to track hard deletes by adding a new record when a row becomes "deleted" in the source. Read more about [Snapshots meta fields](/docs/build/snapshots#snapshot-meta-fields). +To learn how to safely migrate existing snapshots, refer to [Snapshot configuration migration](/reference/snapshot-configs#snapshot-configuration-migration) for more information.
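The new snapshot options above can be combined in a single YAML definition. Here's a minimal sketch, assuming a YAML-defined snapshot on a hypothetical `jaffle_shop.orders` source (the names and the far-future date are purely illustrative):

```yml
snapshots:
  - name: orders_snapshot
    relation: source('jaffle_shop', 'orders')
    config:
      schema: snapshots
      unique_key: id
      strategy: timestamp
      updated_at: updated_at
      # Use a far-future date instead of NULL for current records
      dbt_valid_to_current: "to_date('9999-12-31')"
      # Track rows deleted upstream by inserting a new snapshot record
      hard_deletes: new_record
```

With `hard_deletes: new_record`, a row that disappears from the source shows up as an additional snapshot record flagged in a new metadata column, rather than being silently ignored.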
+ ### `state:modified` improvements We’ve made improvements to `state:modified` behaviors to help reduce the risk of false positives and negatives. Read more about [the `state:modified` behavior flag](#managing-changes-to-legacy-behaviors) that unlocks this improvement: @@ -83,6 +92,8 @@ You can read more about each of these behavior changes in the following links: - (Introduced, disabled by default) [`skip_nodes_if_on_run_start_fails` project config flag](/reference/global-configs/behavior-changes#behavior-change-flags). If the flag is set and **any** `on-run-start` hook fails, mark all selected nodes as skipped. - `on-run-start/end` hooks are **always** run, regardless of whether they passed or failed last time. - (Introduced, disabled by default) [[Redshift] `restrict_direct_pg_catalog_access`](/reference/global-configs/behavior-changes#redshift-restrict_direct_pg_catalog_access). If the flag is set the adapter will use the Redshift API (through the Python client) if available, or query Redshift's `information_schema` tables instead of using `pg_` tables. +- (Introduced, disabled by default) [`require_nested_cumulative_type_params`](/reference/global-configs/behavior-changes#cumulative-metrics). If the flag is set to `True`, users will receive an error instead of a warning if they're not properly formatting cumulative metrics using the new [`cumulative_type_params`](/docs/build/cumulative#parameters) nesting. +- (Introduced, disabled by default) [`require_batched_execution_for_custom_microbatch_strategy`](/reference/global-configs/behavior-changes#custom-microbatch-strategy). If you use a custom microbatch macro, set this flag to `True` to enable batched execution. If you don't have a custom microbatch macro, you don't need to set this flag as dbt will handle microbatching automatically for any model using the microbatch strategy. ## Adapter specific features and functionalities @@ -92,7 +103,7 @@ You can read more about each of these behavior changes in the following links: ### Snowflake -- Iceberg Table Format support will be available on three out of the box materializations: table, incremental, dynamic tables. +- Iceberg Table Format support will be available on three out-of-the-box materializations: table, incremental, and dynamic tables. ### Bigquery @@ -107,7 +118,7 @@ You can read more about each of these behavior changes in the following links: We also made some quality-of-life improvements in Core 1.9, enabling you to: -- Maintain data quality now that dbt returns an an error (versioned models) or warning (unversioned models) when someone [removes a contracted model by deleting, renaming, or disabling](/docs/collaborate/govern/model-contracts#how-are-breaking-changes-handled) it. +- Maintain data quality now that dbt returns an error (versioned models) or warning (unversioned models) when someone [removes a contracted model by deleting, renaming, or disabling](/docs/collaborate/govern/model-contracts#how-are-breaking-changes-handled) it. - Document [data tests](/reference/resource-properties/description). - Use `ref` and `source` in [foreign key constraints](/reference/resource-properties/constraints). - Use `dbt test` with the `--resource-type` / `--exclude-resource-type` flag, making it possible to include or exclude data tests (`test`) or unit tests (`unit_test`).
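As a rough sketch of how a project might opt into the behavior changes listed above once it's ready, behavior flags are set under the top-level `flags:` key in `dbt_project.yml`; enable only the flags that apply to your project:

```yml
# dbt_project.yml (sketch; enable flags selectively)
flags:
  # Skip all selected nodes if any on-run-start hook fails
  skip_nodes_if_on_run_start_fails: true
  # Error instead of warn when cumulative metrics aren't nested under cumulative_type_params
  require_nested_cumulative_type_params: true
  # Only relevant if you use a custom microbatch macro
  require_batched_execution_for_custom_microbatch_strategy: true
```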
diff --git a/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.8.md b/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.8.md index 9163047e7e0..026fb1a2a11 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.8.md +++ b/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.8.md @@ -1,5 +1,5 @@ --- -title: "Upgrading to v1.8 (latest)" +title: "Upgrading to v1.8" id: upgrading-to-v1.8 description: New features and changes in dbt Core v1.8 displayed_sidebar: "docs" @@ -15,13 +15,9 @@ displayed_sidebar: "docs" dbt Labs is committed to providing backward compatibility for all versions 1.x, except for any changes explicitly mentioned on this page. If you encounter an error upon upgrading, please let us know by [opening an issue](https://github.com/dbt-labs/dbt-core/issues/new). -## Versionless +## Release tracks -dbt Cloud is going "versionless." This means you'll automatically get early access to new features and functionality before they're available in final releases of dbt Core. - -Select [**Versionless**](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) in your development, staging, and production [environments](/docs/deploy/deploy-environments) to access to everything in dbt Core v1.8+ and more. - -To upgrade an environment in the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api) or [Terraform](https://registry.terraform.io/providers/dbt-labs/dbtcloud/latest), set `dbt_version` to the string `versionless`. +Starting in 2024, dbt Cloud provides the functionality from new versions of dbt Core via [release tracks](/docs/dbt-versions/cloud-release-tracks) with automatic upgrades. Select a release track in your development, staging, and production [environments](/docs/deploy/deploy-environments) to access everything in dbt Core v1.8+ and more. To upgrade an environment in the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api) or [Terraform](https://registry.terraform.io/providers/dbt-labs/dbtcloud/latest), set `dbt_version` to the string `latest`. ## New and changed features and functionality diff --git a/website/docs/docs/dbt-versions/core-versions.md b/website/docs/docs/dbt-versions/core-versions.md index 4a490f96bd5..2f3cec44191 100644 --- a/website/docs/docs/dbt-versions/core-versions.md +++ b/website/docs/docs/dbt-versions/core-versions.md @@ -8,11 +8,11 @@ pagination_prev: null dbt Core releases follow [semantic versioning](https://semver.org/) guidelines. For more on how we use semantic versions, see [How dbt Core uses semantic versioning](#how-dbt-core-uses-semantic-versioning). -:::tip Go versionless and stay up to date, always +:::tip Release Tracks keep you up to date, always _Did you know that you can always be working with the latest features and functionality?_ -With dbt Cloud, you can get early access to new functionality before it becomes available in dbt Core and without the need of managing your own version upgrades. Refer to the [Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) setting for details. +With dbt Cloud, you can get early access to new functionality before it becomes available in dbt Core and without the need of managing your own version upgrades. Refer to the ["Latest" Release Track](/docs/dbt-versions/cloud-release-tracks) setting for details. 
::: diff --git a/website/docs/docs/dbt-versions/product-lifecycles.md b/website/docs/docs/dbt-versions/product-lifecycles.md index e8711c825c4..01a8628d3ca 100644 --- a/website/docs/docs/dbt-versions/product-lifecycles.md +++ b/website/docs/docs/dbt-versions/product-lifecycles.md @@ -17,7 +17,7 @@ dbt Cloud features all fall into one of the following categories: - **Beta:** Beta features are still in development and are only available to select customers. To join a beta, there might be a signup form or dbt Labs may contact specific customers about testing. Some features can be activated by enabling [experimental features](/docs/dbt-versions/experimental-features) in your account. Beta features are incomplete and might not be entirely stable; they should be used at the customer’s risk, as breaking changes could occur. Beta features might not be fully documented, technical support is limited, and service level objectives (SLOs) might not be provided. Download the [Beta Features Terms and Conditions](/assets/beta-tc.pdf) for more details. - **Preview:** Preview features are stable and considered functionally ready for production deployments. Some planned additions and modifications to feature behaviors could occur before they become generally available. New functionality that is not backward compatible could also be introduced. Preview features include documentation, technical support, and service level objectives (SLOs). Features in preview are provided at no extra cost, although they might become paid features when they become generally available. -- **Generally available (GA):** Generally available features provide stable features introduced to all qualified dbt Cloud accounts. Service level agreements (SLAs) apply to GA features, including documentation and technical support. Certain GA feature availability is determined by the dbt version of the environment. To always receive the latest GA features, ensure your dbt Cloud [environments](/docs/dbt-cloud-environments) are set to ["Versionless"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless). +- **Generally available (GA):** Generally available features provide stable features introduced to all qualified dbt Cloud accounts. Service level agreements (SLAs) apply to GA features, including documentation and technical support. Certain GA feature availability is determined by the dbt version of the environment. To always receive the latest GA features, ensure your dbt Cloud [environments](/docs/dbt-cloud-environments) are on a supported [Release Track](/docs/dbt-versions/cloud-release-tracks). - **Deprecated:** Features in this state are no longer being developed or enhanced by dbt Labs. They will continue functioning as-is, and their documentation will persist until their removal date. However, they are no longer subject to technical support. - **Removed:** Removed features are no longer available on the platform in any capacity. diff --git a/website/docs/docs/dbt-versions/release-notes.md b/website/docs/docs/dbt-versions/release-notes.md index 3cce35d9556..e9245399e65 100644 --- a/website/docs/docs/dbt-versions/release-notes.md +++ b/website/docs/docs/dbt-versions/release-notes.md @@ -18,12 +18,23 @@ Release notes are grouped by month for both multi-tenant and virtual private clo \* The official release date for this new format of release notes is May 15th, 2024. Historical release notes for prior dates may not reflect all available features released earlier this year or their tenancy availability. 
-## November 2024 +## December 2024 + +- **Fix**: [The dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) now respects the BigQuery [`execution_project` attribute](/docs/core/connect-data-platform/bigquery-setup#execution-project), including for exports. +- **New**: [Model notifications](/docs/deploy/model-notifications) are now generally available in dbt Cloud. These notifications alert model owners through email about any issues encountered by models and tests as soon as they occur while running a job. +- **New**: You can now use your [Azure OpenAI key](/docs/cloud/account-integrations?ai-integration=azure#ai-integrations) (available in beta) to use dbt Cloud features like [dbt Copilot](/docs/cloud/dbt-copilot) and [Ask dbt](/docs/cloud-integrations/snowflake-native-app). Additionally, you can use your own [OpenAI API key](/docs/cloud/account-integrations?ai-integration=openai#ai-integrations) or use a [dbt Labs-managed OpenAI](/docs/cloud/account-integrations?ai-integration=dbtlabs#ai-integrations) key. Refer to [AI integrations](/docs/cloud/account-integrations#ai-integrations) for more information. +- **New**: The [`hard_deletes`](/reference/resource-configs/hard-deletes) config gives you more control over how to handle deleted rows from the source. Supported options are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`. Note that `new_record` will create a new metadata column in the snapshot table. + +## November 2024 +- **Enhancement**: Trust signal icons in dbt Explorer are now available for Exposures, providing a quick view of data health while browsing resources. To view trust signal icons, go to dbt Explorer and click **Exposures** under the **Resource** tab. Refer to [Trust signal for resources](/docs/collaborate/explore-projects#trust-signals-for-resources) for more info. +- **Bug**: Identified and fixed an error with Semantic Layer queries that take longer than 10 minutes to complete. +- **Fix**: Job environment variable overrides in credentials are now respected for Exports. Previously, they were ignored. +- **Behavior change**: If you use a custom microbatch macro, set a [`require_batched_execution_for_custom_microbatch_strategy` behavior flag](/reference/global-configs/behavior-changes#custom-microbatch-strategy) in your `dbt_project.yml` to enable batched execution. If you don't have a custom microbatch macro, you don't need to set this flag as dbt will handle microbatching automatically for any model using the [microbatch strategy](/docs/build/incremental-microbatch#how-microbatch-compares-to-other-incremental-strategies). - **Enhancement**: For users that have Advanced CI's [compare changes](/docs/deploy/advanced-ci#compare-changes) feature enabled, you can optimize performance when running comparisons by using custom dbt syntax to customize deferral usage, exclude specific large models (or groups of models with tags), and more. Refer to [Compare changes custom commands](/docs/deploy/job-commands#compare-changes-custom-commands) for examples of how to customize the comparison command. -- **New**: SQL linting in CI jobs is now generally available in dbt Cloud. You can enable SQL linting in your CI jobs, using [SQLFluff](https://sqlfluff.com/), to automatically lint all SQL files in your project as a run step before your CI job builds. SQLFluff linting is available on [dbt Cloud Versionless](/docs/dbt-versions/versionless-cloud) and to dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) accounts.
Refer to [SQL linting](/docs/deploy/continuous-integration#sql-linting) for more information. -- **New**: Use the [`dbt_valid_to_current`](/reference/resource-configs/dbt_valid_to_current) config to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table. This feature is available in dbt Cloud Versionless and dbt Core v1.9 and later. -- **New**: Use the [`event_time`](/reference/resource-configs/event-time) configuration to specify "at what time did the row occur." This configuration is required for [Incremental microbatch](/docs/build/incremental-microbatch) and can be added to ensure you're comparing overlapping times in [Advanced CI's compare changes](/docs/deploy/advanced-ci). Available in dbt Cloud Versionless and dbt Core v1.9 and higher. +- **New**: SQL linting in CI jobs is now generally available in dbt Cloud. You can enable SQL linting in your CI jobs, using [SQLFluff](https://sqlfluff.com/), to automatically lint all SQL files in your project as a run step before your CI job builds. SQLFluff linting is available on [dbt Cloud release tracks](/docs/dbt-versions/cloud-release-tracks) and to dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) accounts. Refer to [SQL linting](/docs/deploy/continuous-integration#sql-linting) for more information. +- **New**: Use the [`dbt_valid_to_current`](/reference/resource-configs/dbt_valid_to_current) config to set a custom indicator for the value of `dbt_valid_to` in current snapshot records (like a future date). By default, this value is `NULL`. When configured, dbt will use the specified value instead of `NULL` for `dbt_valid_to` for current records in the snapshot table. This feature is available in [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) (formerly called `Versionless`) and dbt Core v1.9 and later. +- **New**: Use the [`event_time`](/reference/resource-configs/event-time) configuration to specify "at what time did the row occur." This configuration is required for [Incremental microbatch](/docs/build/incremental-microbatch) and can be added to ensure you're comparing overlapping times in [Advanced CI's compare changes](/docs/deploy/advanced-ci). Available in [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) (formerly called `Versionless`) and dbt Core v1.9 and higher. - **Fix**: This update improves [dbt Semantic Layer Tableau integration](/docs/cloud-integrations/semantic-layer/tableau) making query parsing more reliable. Some key fixes include: - Error messages for unsupported joins between saved queries and ALL tables. - Improved handling of queries when multiple tables are selected in a data source. 
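As a quick illustration of the `event_time` configuration called out above, a microbatch model might carry a config like the following sketch (the model name, column name, and dates are placeholders; the same settings can also live in `dbt_project.yml` or an in-model `config()` block):

```yml
models:
  - name: fct_orders
    config:
      materialized: incremental
      incremental_strategy: microbatch
      # Column that answers "at what time did the row occur"
      event_time: ordered_at
      batch_size: day
      lookback: 3
      begin: '2024-01-01'
```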
@@ -40,7 +51,7 @@ Release notes are grouped by month for both multi-tenant and virtual private clo - Iceberg table support for [Snowflake](https://docs.getdbt.com/reference/resource-configs/snowflake-configs#iceberg-table-format) - [Athena](https://docs.getdbt.com/reference/resource-configs/athena-configs) and [Teradata](https://docs.getdbt.com/reference/resource-configs/teradata-configs) adapter support in dbt Cloud - dbt Cloud now hosted on [Azure](https://docs.getdbt.com/docs/cloud/about-cloud/access-regions-ip-addresses) - - Get comfortable with [Versionless dbt Cloud](https://docs.getdbt.com/docs/dbt-versions/versionless-cloud) + - Get comfortable with [dbt Cloud Release Tracks](https://docs.getdbt.com/docs/dbt-versions/cloud-release-tracks) that keep your project up-to-date, automatically — on a cadence appropriate for your team - Scalable [microbatch incremental models](https://docs.getdbt.com/docs/build/incremental-microbatch) - Advanced CI [features](https://docs.getdbt.com/docs/deploy/advanced-ci) - [Linting with CI jobs](https://docs.getdbt.com/docs/deploy/continuous-integration#sql-linting) @@ -68,17 +79,17 @@ Release notes are grouped by month for both multi-tenant and virtual private clo - **New**: The dbt Cloud IDE supports signed commits for Git, available for Enterprise plans. You can sign your Git commits when pushing them to the repository to prevent impersonation and enhance security. Supported Git providers are GitHub and GitLab. Refer to [Git commit signing](/docs/cloud/dbt-cloud-ide/git-commit-signing.md) for more information. - **New:** With dbt Mesh, you can now enable bidirectional dependencies across your projects. Previously, dbt enforced dependencies to only go in one direction. dbt checks for cycles across projects and raises errors if any are detected. For details, refer to [Cycle detection](/docs/collaborate/govern/project-dependencies#cycle-detection). There's also the [Intro to dbt Mesh](/best-practices/how-we-mesh/mesh-1-intro) guide to help you learn more best practices. - **New**: The [dbt Semantic Layer Python software development kit](/docs/dbt-cloud-apis/sl-python) is now [generally available](/docs/dbt-versions/product-lifecycles). It provides users with easy access to the dbt Semantic Layer with Python and enables developers to interact with the dbt Semantic Layer APIs to query metrics/dimensions in downstream tools. -- **Enhancement**: You can now add a description to a singular data test in dbt Cloud Versionless. Use the [`description` property](/reference/resource-properties/description) to document [singular data tests](/docs/build/data-tests#singular-data-tests). You can also use [docs block](/docs/build/documentation#using-docs-blocks) to capture your test description. The enhancement will be included in upcoming dbt Core 1.9 release. -- **New**: Introducing the [microbatch incremental model strategy](/docs/build/incremental-microbatch) (beta), available in dbt Cloud Versionless and will soon be supported in dbt Core 1.9. The microbatch strategy allows for efficient, batch-based processing of large time-series datasets for improved performance and resiliency, especially when you're working with data that changes over time (like new records being added daily). To enable this feature in dbt Cloud, set the `DBT_EXPERIMENTAL_MICROBATCH` environment variable to `true` in your project. +- **Enhancement**: You can now add a description to a singular data test. 
Use the [`description` property](/reference/resource-properties/description) to document [singular data tests](/docs/build/data-tests#singular-data-tests). You can also use [docs block](/docs/build/documentation#using-docs-blocks) to capture your test description. The enhancement is available now in [the "Latest" release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks), and it will be included in dbt Core v1.9. +- **New**: Introducing the [microbatch incremental model strategy](/docs/build/incremental-microbatch) (beta), available now in [dbt Cloud Latest](/docs/dbt-versions/cloud-release-tracks) and will soon be supported in dbt Core v1.9. The microbatch strategy allows for efficient, batch-based processing of large time-series datasets for improved performance and resiliency, especially when you're working with data that changes over time (like new records being added daily). To enable this feature in dbt Cloud, set the `DBT_EXPERIMENTAL_MICROBATCH` environment variable to `true` in your project. - **New**: The dbt Semantic Layer supports custom calendar configurations in MetricFlow, available in [Preview](/docs/dbt-versions/product-lifecycles#dbt-cloud). Custom calendar configurations allow you to query data using non-standard time periods like `fiscal_year` or `retail_month`. Refer to [custom calendar](/docs/build/metricflow-time-spine#custom-calendar) to learn how to define these custom granularities in your MetricFlow timespine YAML configuration. -- **New**: In dbt Cloud Versionless, [Snapshots](/docs/build/snapshots) have been updated to use YAML configuration files instead of SQL snapshot blocks. This new feature simplifies snapshot management and improves performance, and will soon be released in dbt Core 1.9. - - Who does this affect? New user on Versionless can define snapshots using the new YAML specification. Users upgrading to Versionless who use snapshots can keep their existing configuration or can choose to migrate their snapshot definitions to YAML. - - Users on dbt 1.8 and earlier: No action is needed; existing snapshots will continue to work as before. However, we recommend upgrading to Versionless to take advantage of the new snapshot features. +- **New**: In the "Latest" release track in dbt Cloud, [Snapshots](/docs/build/snapshots) have been updated to use YAML configuration files instead of SQL snapshot blocks. This new feature simplifies snapshot management and improves performance, and will soon be released in dbt Core 1.9. + - Who does this affect? Users of the "Latest" release track in dbt Cloud can define snapshots using the new YAML specification. Users upgrading to "Latest" who have existing snapshot definitions can keep their existing configurations, or they can choose to migrate their snapshot definitions to YAML. + - Users on older versions: No action is needed; existing snapshots will continue to work as before. However, we recommend upgrading to the "Latest" release track to take advantage of the new snapshot features. - **Behavior change:** Set [`state_modified_compare_more_unrendered_values`](/reference/global-configs/behavior-changes#source-definitions-for-state) to true to reduce false positives for `state:modified` when configs differ between `dev` and `prod` environments. - **Behavior change:** Set the [`skip_nodes_if_on_run_start_fails`](/reference/global-configs/behavior-changes#failures-in-on-run-start-hooks) flag to `True` to skip all selected resources from running if there is a failure on an `on-run-start` hook. 
-- **Enhancement**: In dbt Cloud Versionless, snapshots defined in SQL files can now use `config` defined in `schema.yml` YAML files. This update resolves the previous limitation that required snapshot properties to be defined exclusively in `dbt_project.yml` and/or a `config()` block within the SQL file. This will also be released in dbt Core 1.9. -- **New**: In dbt Cloud Versionless, the `snapshot_meta_column_names` config allows for customizing the snapshot metadata columns. This feature allows an organization to align these automatically-generated column names with their conventions, and will be included in the upcoming dbt Core 1.9 release. -- **Enhancement**: dbt Cloud versionless began inferring a model's `primary_key` based on configured data tests and/or constraints within `manifest.json`. The inferred `primary_key` is visible in dbt Explorer and utilized by the dbt Cloud [compare changes](/docs/deploy/run-visibility#compare-tab) feature. This will also be released in dbt Core 1.9. Read about the [order dbt infers columns can be used as primary key of a model](https://github.com/dbt-labs/dbt-core/blob/7940ad5c7858ff11ef100260a372f2f06a86e71f/core/dbt/contracts/graph/nodes.py#L534-L541). +- **Enhancement**: In the "Latest" release track in dbt Cloud, snapshots defined in SQL files can now use `config` defined in `schema.yml` YAML files. This update resolves the previous limitation that required snapshot properties to be defined exclusively in `dbt_project.yml` and/or a `config()` block within the SQL file. This will also be released in dbt Core 1.9. +- **New**: In the "Latest" release track in dbt Cloud, the `snapshot_meta_column_names` config allows for customizing the snapshot metadata columns. This feature allows an organization to align these automatically-generated column names with their conventions, and will be included in the upcoming dbt Core 1.9 release. +- **Enhancement**: the "Latest" release track in dbt Cloud infers a model's `primary_key` based on configured data tests and/or constraints within `manifest.json`. The inferred `primary_key` is visible in dbt Explorer and utilized by the dbt Cloud [compare changes](/docs/deploy/run-visibility#compare-tab) feature. This will also be released in dbt Core 1.9. Read about the [order dbt infers columns can be used as primary key of a model](https://github.com/dbt-labs/dbt-core/blob/7940ad5c7858ff11ef100260a372f2f06a86e71f/core/dbt/contracts/graph/nodes.py#L534-L541). - **New:** dbt Explorer now includes trust signal icons, which is currently available as a [Preview](/docs/dbt-versions/product-lifecycles#dbt-cloud). Trust signals offer a quick, at-a-glance view of data health when browsing your dbt models in Explorer. These icons indicate whether a model is **Healthy**, **Caution**, **Degraded**, or **Unknown**. For accurate health data, ensure the resource is up-to-date and has had a recent job run. Refer to [Trust signals](/docs/collaborate/explore-projects#trust-signals-for-resources) for more information. - **New:** Auto exposures are now available in Preview in dbt Cloud. Auto-exposures helps users understand how their models are used in downstream analytics tools to inform investments and reduce incidents. It imports and auto-generates exposures based on Tableau dashboards, with user-defined curation. To learn more, refer to [Auto exposures](/docs/collaborate/auto-exposures). 
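For a sense of what the `snapshot_meta_column_names` customization mentioned above can look like, here's a minimal, hypothetical sketch in `dbt_project.yml` (the project name and column aliases are illustrative):

```yml
# dbt_project.yml
snapshots:
  my_project:
    +snapshot_meta_column_names:
      dbt_valid_from: valid_from
      dbt_valid_to: valid_to
      dbt_scd_id: scd_id
      dbt_updated_at: updated_at
```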
@@ -88,14 +99,14 @@ Release notes are grouped by month for both multi-tenant and virtual private clo - **Fix**: MetricFlow updated `get_and_expire` to replace the unsupported `GETEX` command with a `GET` and conditional expiration, ensuring compatibility with Azure Redis 6.0. - **Enhancement**: The [dbt Semantic Layer Python SDK](/docs/dbt-cloud-apis/sl-python) now supports `TimeGranularity` custom grain for metrics. This feature allows you to define custom time granularities for metrics, such as `fiscal_year` or `retail_month`, to query data using non-standard time periods. - **New**: Use the dbt Copilot AI engine to generate semantic model for your models, now available in beta. dbt Copilot automatically generates documentation, tests, and now semantic models based on the data in your model, . To learn more, refer to [dbt Copilot](/docs/cloud/dbt-copilot). -- **New**: Use the new recommended syntax for [defining `foreign_key` constraints](/reference/resource-properties/constraints) using `refs`, available in dbt Cloud Versionless. This will soon be released in dbt Core v1.9. This new syntax will capture dependencies and works across different environments. +- **New**: Use the new recommended syntax for [defining `foreign_key` constraints](/reference/resource-properties/constraints) using `refs`, available in the "Latest" release track in dbt Cloud. This will soon be released in dbt Core v1.9. This new syntax will capture dependencies and works across different environments. - **Enhancement**: You can now run [Semantic Layer commands](/docs/build/metricflow-commands) commands in the [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud). The supported commands are `dbt sl list`, `dbt sl list metrics`, `dbt sl list dimension-values`, `dbt sl list saved-queries`, `dbt sl query`, `dbt sl list dimensions`, `dbt sl list entities`, and `dbt sl validate`. - **New**: Microsoft Excel, a dbt Semantic Layer integration, is now generally available. The integration allows you to connect to Microsoft Excel to query metrics and collaborate with your team. Available for [Excel Desktop](https://pages.store.office.com/addinsinstallpage.aspx?assetid=WA200007100&rs=en-US&correlationId=4132ecd1-425d-982d-efb4-de94ebc83f26) or [Excel Online](https://pages.store.office.com/addinsinstallpage.aspx?assetid=WA200007100&rs=en-US&correlationid=4132ecd1-425d-982d-efb4-de94ebc83f26&isWac=True). For more information, refer to [Microsoft Excel](/docs/cloud-integrations/semantic-layer/excel). - **New**: [Data health tile](/docs/collaborate/data-tile) is now generally available in dbt Explorer. Data health tiles provide a quick at-a-glance view of your data quality, highlighting potential issues in your data. You can embed these tiles in your dashboards to quickly identify and address data quality issues in your dbt project. - **New**: dbt Explorer's Model query history feature is now in Preview for dbt Cloud Enterprise customers. Model query history allows you to view the count of consumption queries for a model based on the data warehouse's query logs. This feature provides data teams insight, so they can focus their time and infrastructure spend on the worthwhile used data products. To learn more, refer to [Model query history](/docs/collaborate/model-query-history). - **Enhancement**: You can now use [Extended Attributes](/docs/dbt-cloud-environments#extended-attributes) and [Environment Variables](/docs/build/environment-variables) when connecting to the Semantic Layer. 
If you set a value directly in the Semantic Layer Credentials, it will have a higher priority than Extended Attributes. When using environment variables, the default value for the environment will be used. If you're using exports, job environment variable overrides aren't supported yet, but they will be soon. - **New:** There are two new [environment variable defaults](/docs/build/environment-variables#dbt-cloud-context) — `DBT_CLOUD_ENVIRONMENT_NAME` and `DBT_CLOUD_ENVIRONMENT_TYPE`. -- **New:** The [Amazon Athena warehouse connection](/docs/cloud/connect-data-platform/connect-amazon-athena) is available as a public preview for dbt Cloud accounts that have upgraded to [`versionless`](/docs/dbt-versions/versionless-cloud). +- **New:** The [Amazon Athena warehouse connection](/docs/cloud/connect-data-platform/connect-amazon-athena) is available as a public preview for dbt Cloud accounts that have upgraded to [the "Latest" release track](/docs/dbt-versions/cloud-release-tracks). ## August 2024 @@ -221,15 +232,15 @@ The following features are new or enhanced as part of our [dbt Cloud Launch Show - **New**: The [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) introduces [declarative caching](/docs/use-dbt-semantic-layer/sl-cache), allowing you to cache common queries to speed up performance and reduce query compute costs. Available for dbt Cloud Team or Enterprise accounts. -- <Expandable alt_header="New: Versionless" > +- <Expandable alt_header="New: Latest Release Track" > - The **Versionless** setting is now Generally Available (previously Public Preview). + The **Latest** Release Track is now Generally Available (previously Public Preview). - When the new **Versionless** setting is enabled, you get a versionless experience and always get the latest features and early access to new functionality for your dbt project. dbt Labs will handle upgrades behind-the-scenes, as part of testing and redeploying the dbt Cloud application — just like other dbt Cloud capabilities and other SaaS tools that you're using. No more manual upgrades and no more need for _a second sandbox project_ just to try out new features in development. + On this release track, you get automatic upgrades of dbt, including early access to the latest features, fixes, and performance improvements for your dbt project. dbt Labs will handle upgrades behind-the-scenes, as part of testing and redeploying the dbt Cloud application — just like other dbt Cloud capabilities and other SaaS tools that you're using. No more manual upgrades and no more need for _a second sandbox project_ just to try out new features in development. - To learn more about the new setting, refer to [Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) for details. + To learn more about the new setting, refer to [Release Tracks](/docs/dbt-versions/cloud-release-tracks) for details. - <Lightbox src="/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png" width="90%" title="Example of the Versionless setting"/> + <Lightbox src="/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png" width="90%" title="Example of the Latest setting"/> </Expandable> @@ -245,7 +256,7 @@ The following features are new or enhanced as part of our [dbt Cloud Launch Show </Expandable> -- **Behavior change:** Introduced the `require_explicit_package_overrides_for_builtin_materializations` flag, opt-in and disabled by default. 
If set to `True`, dbt will only use built-in materializations defined in the root project or within dbt, rather than implementations in packages. This will become the default in May 2024 (dbt Core v1.8 and "Versionless" dbt Cloud). Read [Package override for built-in materialization](/reference/global-configs/behavior-changes#package-override-for-built-in-materialization) for more information. +- **Behavior change:** Introduced the `require_explicit_package_overrides_for_builtin_materializations` flag, opt-in and disabled by default. If set to `True`, dbt will only use built-in materializations defined in the root project or within dbt, rather than implementations in packages. This will become the default in May 2024 (dbt Core v1.8 and dbt Cloud release tracks). Read [Package override for built-in materialization](/reference/global-configs/behavior-changes#package-override-for-built-in-materialization) for more information. **dbt Semantic Layer** - **New**: Use Saved selections to [save your query selections](/docs/cloud-integrations/semantic-layer/gsheets#using-saved-selections) within the [Google Sheets application](/docs/cloud-integrations/semantic-layer/gsheets). They can be made private or public and refresh upon loading. @@ -300,15 +311,15 @@ The following features are new or enhanced as part of our [dbt Cloud Launch Show </Expandable> -- <Expandable alt_header="New: Versionless " lifecycle="beta"> +- <Expandable alt_header="New: Latest Release Track" lifecycle="beta"> _Now available in the dbt version dropdown in dbt Cloud — starting with select customers, rolling out to wider availability through February and March._ - When the new **Versionless** setting is enabled, you always get the latest fixes and early access to new functionality for your dbt project. dbt Labs will handle upgrades behind-the-scenes, as part of testing and redeploying the dbt Cloud application — just like other dbt Cloud capabilities and other SaaS tools that you're using. No more manual upgrades and no more need for _a second sandbox project_ just to try out new features in development. + On this release track, you get automatic upgrades of dbt, including early access to the latest features, fixes, and performance improvements for your dbt project. dbt Labs will handle upgrades behind-the-scenes, as part of testing and redeploying the dbt Cloud application — just like other dbt Cloud capabilities and other SaaS tools that you're using. No more manual upgrades and no more need for _a second sandbox project_ just to try out new features in development. - To learn more about the new setting, refer to [Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) for details. + To learn more about the new setting, refer to [Release Tracks](/docs/dbt-versions/cloud-release-tracks) for details. 
- <Lightbox src="/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png" width="90%" title="Example of the Versionless setting"/> + <Lightbox src="/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png" width="90%" title="Example of the Latest setting"/> </Expandable> diff --git a/website/docs/docs/dbt-versions/upgrade-dbt-version-in-cloud.md b/website/docs/docs/dbt-versions/upgrade-dbt-version-in-cloud.md index cfe27d5e9d7..52faa9385fa 100644 --- a/website/docs/docs/dbt-versions/upgrade-dbt-version-in-cloud.md +++ b/website/docs/docs/dbt-versions/upgrade-dbt-version-in-cloud.md @@ -7,17 +7,22 @@ In dbt Cloud, both [jobs](/docs/deploy/jobs) and [environments](/docs/dbt-cloud- ## Environments -Navigate to the settings page of an environment, then click **Edit**. Click the **dbt version** dropdown bar and make your selection. You can select a previous release of dbt Core or go [**Versionless**](#versionless) (recommended). Be sure to save your changes before navigating away. +Navigate to the settings page of an environment, then click **Edit**. Click the **dbt version** dropdown bar and make your selection. You can select a [release track](#release-tracks) to receive ongoing updates (recommended), or a legacy version of dbt Core. Be sure to save your changes before navigating away. <Lightbox src="/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png" width="90%" title="Example environment settings in dbt Cloud"/> -### Versionless +### Release Tracks -By choosing to go **Versionless**, you opt for an experience that provides the latest features and early access to new functionality for your dbt project. dbt Labs will handle upgrades for you, as part of testing and redeploying the dbt Cloud SaaS application. Versionless always includes the most recent features before they're in dbt Core, and more. +Starting in 2024, your project will be upgraded automatically on a cadence that you choose. -You can upgrade to the **Versionless** experience no matter which version of dbt you currently have selected. As a best practice, dbt Labs recommends that you test the upgrade in development first; use the [Override dbt version](#override-dbt-version) setting to test _your_ project on the latest dbt version before upgrading your deployment environments and the default development environment for all your colleagues. +The **Latest** track ensures you have up-to-date dbt Cloud functionality and early access to new features of the dbt framework. The **Compatible** and **Extended** tracks are designed for customers who need a less-frequent release cadence, the ability to test new dbt releases before they go live in production, and/or ongoing compatibility with the latest open source releases of dbt Core. -To upgrade an environment in the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api) or [Terraform](https://registry.terraform.io/providers/dbt-labs/dbtcloud/latest), set `dbt_version` to the string `versionless`. +As a best practice, dbt Labs recommends that you test the upgrade in development first; use the [Override dbt version](#override-dbt-version) setting to test _your_ project on the latest dbt version before upgrading your deployment environments and the default development environment for all your colleagues.
+ +To upgrade an environment in the [dbt Cloud Admin API](/docs/dbt-cloud-apis/admin-cloud-api) or [Terraform](https://registry.terraform.io/providers/dbt-labs/dbtcloud/latest), set `dbt_version` to the name of your release track: +- `latest` (formerly called `versionless`; the old name is still supported) +- `compatible` (available to Team + Enterprise) +- `extended` (available to Enterprise) ### Override dbt version diff --git a/website/docs/docs/deploy/ci-jobs.md b/website/docs/docs/deploy/ci-jobs.md index 3da04ff6948..0f9b6ba377a 100644 --- a/website/docs/docs/deploy/ci-jobs.md +++ b/website/docs/docs/deploy/ci-jobs.md @@ -10,7 +10,7 @@ You can set up [continuous integration](/docs/deploy/continuous-integration) (CI - You have a dbt Cloud account. - CI features: - For both the [concurrent CI checks](/docs/deploy/continuous-integration#concurrent-ci-checks) and [smart cancellation of stale builds](/docs/deploy/continuous-integration#smart-cancellation) features, your dbt Cloud account must be on the [Team or Enterprise plan](https://www.getdbt.com/pricing/). - - [SQL linting](/docs/deploy/continuous-integration#sql-linting) is available on [dbt Cloud Versionless](/docs/dbt-versions/versionless-cloud) and to dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) accounts. You should have [SQLFluff configured](/docs/deploy/continuous-integration#to-configure-sqlfluff-linting) in your project. + - [SQL linting](/docs/deploy/continuous-integration#sql-linting) is available on [dbt Cloud release tracks](/docs/dbt-versions/cloud-release-tracks) and to dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) accounts. You should have [SQLFluff configured](/docs/deploy/continuous-integration#to-configure-sqlfluff-linting) in your project. - [Advanced CI](/docs/deploy/advanced-ci) features: - For the [compare changes](/docs/deploy/advanced-ci#compare-changes) feature, your dbt Cloud account must be on the [Enterprise plan](https://www.getdbt.com/pricing/) and have enabled Advanced CI features. Please ask your [dbt Cloud administrator to enable](/docs/cloud/account-settings#account-access-to-advanced-ci-features) this feature for you. After enablement, the **dbt compare** option becomes available in the CI job settings. - Set up a [connection with your Git provider](/docs/cloud/git/git-configuration-in-dbt-cloud). This integration lets dbt Cloud run jobs on your behalf for job triggering. @@ -188,6 +188,8 @@ To validate _all_ semantic nodes in your project, add the following command to d ## Troubleshooting +<FAQ path="Troubleshooting/gitlab-webhook"/> + <DetailsToggle alt_header="Temporary schemas aren't dropping"> If your temporary schemas aren't dropping after a PR merges or closes, this typically indicates one of these issues: - You have overridden the <code>generate_schema_name</code> macro and it isn't using <code>dbt_cloud_pr_</code> as the prefix. @@ -201,6 +203,7 @@ A macro is creating a schema but there are no dbt models writing to that schema. </DetailsToggle> + <DetailsToggle alt_header="Error messages that refer to schemas from previous PRs"> If you receive a schema-related error message referencing a <i>previous</i> PR, this is usually an indicator that you are not using a production job for your deferral and are instead using <i>self</i>. If the prior PR has already been merged, the prior PR's schema may have been dropped by the time the CI job for the current PR is kicked off. 
diff --git a/website/docs/docs/deploy/continuous-integration.md b/website/docs/docs/deploy/continuous-integration.md index 38ce34678ce..c738e641a5b 100644 --- a/website/docs/docs/deploy/continuous-integration.md +++ b/website/docs/docs/deploy/continuous-integration.md @@ -58,7 +58,7 @@ CI runs don't consume run slots. This guarantees a CI check will never block a p ### SQL linting <Lifecycle status="team,enterprise" /> -Available for [dbt Cloud Versionless](/docs/dbt-versions/versionless-cloud) and dbt Cloud Team or Enterprise accounts. +Available on [dbt Cloud release tracks](/docs/dbt-versions/cloud-release-tracks) and dbt Cloud Team or Enterprise accounts. When [enabled for your CI job](/docs/deploy/ci-jobs#set-up-ci-jobs), dbt invokes [SQLFluff](https://sqlfluff.com/) which is a modular and configurable SQL linter that warns you of complex functions, syntax, formatting, and compilation errors. By default, it lints all the changed SQL files in your project (compared to the last deferred production state). diff --git a/website/docs/docs/deploy/deploy-environments.md b/website/docs/docs/deploy/deploy-environments.md index dd9d066d545..e8c7816979a 100644 --- a/website/docs/docs/deploy/deploy-environments.md +++ b/website/docs/docs/deploy/deploy-environments.md @@ -35,7 +35,7 @@ To create a new dbt Cloud deployment environment, navigate to **Deploy** -> **En In dbt Cloud, each project can have one designated deployment environment, which serves as its production environment. This production environment is _essential_ for using features like dbt Explorer and cross-project references. It acts as the source of truth for the project's production state in dbt Cloud. -<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/prod-settings.jpg" width="70%" title="Set your production environment as the default environment in your Environment Settings"/> +<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/prod-settings-1.png" width="100%" title="Set your production environment as the default environment in your Environment Settings"/> ### Semantic Layer diff --git a/website/docs/docs/deploy/merge-jobs.md b/website/docs/docs/deploy/merge-jobs.md index 8b2900661fa..a187e3992f8 100644 --- a/website/docs/docs/deploy/merge-jobs.md +++ b/website/docs/docs/deploy/merge-jobs.md @@ -5,7 +5,7 @@ description: "Learn how to trigger a dbt job run when a Git pull request merges. --- -You can set up a merge job to implement a continuous development (CD) workflow in dbt Cloud. The merge job triggers a dbt job to run when someone merges Git pull requests into production. This workflow creates a seamless development experience where changes made in code will automatically update production data. Also, you can use this workflow for running `dbt compile` to update your environment's manifest so subsequent CI job runs are more performant. +You can set up a merge job to implement a continuous deployment (CD) workflow in dbt Cloud. The merge job triggers a dbt job to run when someone merges Git pull requests into production. This workflow creates a seamless development experience where changes made in code will automatically update production data. Also, you can use this workflow for running `dbt compile` to update your environment's manifest so subsequent CI job runs are more performant. By using CD in dbt Cloud, you can take advantage of deferral to build only the edited model and any downstream changes. 
With merge jobs, state will be updated almost instantly, always giving the most up-to-date state information in [dbt Explorer](/docs/collaborate/explore-projects). @@ -62,4 +62,4 @@ The following is an example of creating a new **Code pushed** trigger in Azure D <Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/example-azuredevops-new-event.png" title="Example of creating a new trigger to push events in Azure Devops"/> -</Expandable> \ No newline at end of file +</Expandable> diff --git a/website/docs/docs/deploy/model-notifications.md b/website/docs/docs/deploy/model-notifications.md index a6d4c467f0b..24bbc2295c6 100644 --- a/website/docs/docs/deploy/model-notifications.md +++ b/website/docs/docs/deploy/model-notifications.md @@ -3,8 +3,6 @@ title: "Model notifications" description: "While a job is running, receive email notifications in real time about any issues with your models and tests. " --- -# Model notifications <Lifecycle status="beta" /> - Set up dbt to notify the appropriate model owners through email about issues as soon as they occur, while the job is still running. Model owners can specify which statuses to receive notifications about: - `Success` and `Fails` for models @@ -12,28 +10,26 @@ Set up dbt to notify the appropriate model owners through email about issues as With model-level notifications, model owners can be the first ones to know about issues before anyone else (like the stakeholders). -:::info Beta feature - -This feature is currently available in [beta](/docs/dbt-versions/product-lifecycles#dbt-cloud) to a limited group of users and is gradually being rolled out. If you're in the beta, please contact the Support team at support@getdbt.com for assistance or questions. - -::: - To be timely and keep the number of notifications to a reasonable amount when multiple models or tests trigger them, dbt observes the following guidelines when notifying the owners: - Send a notification to each unique owner/email during a job run about any models (with status of failure/success) or tests (with status of warning/failure/success). Each owner receives only one notification, the initial one. -- Don't send any notifications about subsequent models or tests while a dbt job is still running. -- At the end of a job run, each owner receives a notification, for each of the statuses they specified to be notified about, with a list of models and tests that have that status. +- No notifications sent about subsequent models or tests while a dbt job is still running. +- Each owner/user who subscribes to notifications for one or more statuses (like failure, success, warning) will receive only _one_ email notification at the end of the job run. +- The email includes a consolidated list of all models or tests that match the statuses the user subscribed to, instead of sending separate emails for each status. Create configuration YAML files in your project for dbt to send notifications about the status of your models and tests. ## Prerequisites - Your dbt Cloud administrator has [enabled the appropriate account setting](#enable-access-to-model-notifications) for you. -- Your environment(s) must be on ["Versionless"](/docs/dbt-versions/versionless-cloud). - +- Your environment(s) must be on a [release track](/docs/dbt-versions/cloud-release-tracks) instead of a legacy dbt Core version. ## Configure groups -Add your group configuration in either the `dbt_project.yml` or `groups.yml` file. 
For example: +Define your groups in any `.yml` file in your [models directory](/reference/project-configs/model-paths). Each group must have a single email address specified — multiple email fields or lists aren't supported. + +The following example shows how to define groups in a `groups.yml` file. + +<File name='models/groups.yml'> ```yml version: 2 @@ -42,22 +38,26 @@ groups: - name: finance description: "Models related to the finance department" owner: - # 'name' or 'email' is required + # Email is required to receive model-level notifications, additional properties are also allowed. name: "Finance Team" email: finance@dbtlabs.com - slack: finance-data + favorite_food: donuts - name: marketing description: "Models related to the marketing department" owner: name: "Marketing Team" email: marketing@dbtlabs.com - slack: marketing-data + favorite_food: jaffles ``` -## Set up models +</File> + +## Attach groups to models -Set up your model configuration in either the `dbt_project.yml` or `groups.yml` file; doing this automatically sets up notifications for tests, too. For example: +Attach groups to models as you would any other config, in either the `dbt_project.yml` or `whatever.yml` files. For example: + +<File name='models/marts.yml'> ```yml version: 2 @@ -74,6 +74,34 @@ models: group: marketing ``` +</File> + +By assigning groups in the `dbt_project.yml` file, you can capture all models in a subdirectory at once. + +In this example, model notifications related to staging models go to the data engineering group, `marts/sales` models to the finance team, and `marts/campaigns` models to the marketing team. + +<File name='dbt_project.yml'> + +```yml +config-version: 2 +name: "jaffle_shop" + +[...] + +models: + jaffle_shop: + staging: + +group: data_engineering + marts: + sales: + +group: finance + campaigns: + +group: marketing + +``` + +</File> +Attaching a group to a model also encompasses its tests, so you will also receive notifications for a model's test failures. ## Enable access to model notifications @@ -82,6 +110,6 @@ Provide dbt Cloud account members the ability to configure and receive alerts ab To use model-level notifications, your dbt Cloud account must have access to the feature. Ask your dbt Cloud administrator to enable this feature for account members by following these steps: 1. Navigate to **Notification settings** from your profile name in the sidebar (lower left-hand side). -1. From **Email notications**, enable the setting **Enable group/owner notifications on models** under the **Model notifications** section. Then, specify which statuses to receive notifications about (Success, Warning, and/or Fails). +1. From **Email notifications**, enable the setting **Enable group/owner notifications on models** under the **Model notifications** section. Then, specify which statuses to receive notifications about (Success, Warning, and/or Fails). <Lightbox src="/img/docs/dbt-cloud/example-enable-model-notifications.png" title="Example of the setting Enable group/owner notifications on models" /> diff --git a/website/docs/docs/deploy/monitor-jobs.md b/website/docs/docs/deploy/monitor-jobs.md index 1cbba23161e..40298f0cdbe 100644 --- a/website/docs/docs/deploy/monitor-jobs.md +++ b/website/docs/docs/deploy/monitor-jobs.md @@ -13,7 +13,7 @@ This portion of our documentation will go over dbt Cloud's various capabilities - [Run visibility](/docs/deploy/run-visibility) — View your run history to help identify where improvements can be made to scheduled jobs. 
- [Retry jobs](/docs/deploy/retry-jobs) — Rerun your errored jobs from start or the failure point. - [Job notifications](/docs/deploy/job-notifications) — Receive email or Slack notifications when a job run succeeds, encounters warnings, fails, or is canceled. -- [Model notifications](/docs/deploy/model-notifications)<Lifecycle status="beta"/> — Receive email notifications about any issues encountered by your models and tests as soon as they occur while running a job. +- [Model notifications](/docs/deploy/model-notifications) — Receive email notifications about any issues encountered by your models and tests as soon as they occur while running a job. - [Webhooks](/docs/deploy/webhooks) — Use webhooks to send events about your dbt jobs' statuses to other systems. - [Leverage artifacts](/docs/deploy/artifacts) — dbt Cloud generates and saves artifacts for your project, which it uses to power features like creating docs for your project and reporting freshness of your sources. - [Source freshness](/docs/deploy/source-freshness) — Monitor data governance by enabling snapshots to capture the freshness of your data sources. diff --git a/website/docs/docs/deploy/retry-jobs.md b/website/docs/docs/deploy/retry-jobs.md index f439351aec5..4e3ad0d429f 100644 --- a/website/docs/docs/deploy/retry-jobs.md +++ b/website/docs/docs/deploy/retry-jobs.md @@ -10,6 +10,7 @@ If your dbt job run completed with a status of **Error**, you can rerun it from - You have a [dbt Cloud account](https://www.getdbt.com/signup). - You must be using [dbt version](/docs/dbt-versions/upgrade-dbt-version-in-cloud) 1.6 or newer. +- dbt can successfully parse the project and generate a [manifest](/reference/artifacts/manifest-json) - The most recent run of the job hasn't completed successfully. The latest status of the run is **Error**. - The job command that failed in the run must be one that supports the [retry command](/reference/commands/retry). diff --git a/website/docs/docs/get-started-dbt.md b/website/docs/docs/get-started-dbt.md index 428253ec139..1920a9b3da2 100644 --- a/website/docs/docs/get-started-dbt.md +++ b/website/docs/docs/get-started-dbt.md @@ -6,7 +6,7 @@ pagination_next: null pagination_prev: null --- -Begin your dbt journey by trying one of our quickstarts, which provides a step-by-step guide to help you set up dbt Cloud or dbt Core with a [variety of data platforms](/docs/cloud/connect-data-platform/about-connections). +Begin your dbt journey by trying one of our quickstarts, which provides a step-by-step guide to help you set up [dbt Cloud](#dbt-cloud) or [dbt Core](#dbt-core) with a [variety of data platforms](/docs/cloud/connect-data-platform/about-connections). ## dbt Cloud @@ -76,13 +76,23 @@ Learn more about [dbt Cloud features](/docs/cloud/about-cloud/dbt-cloud-feature [dbt Core](/docs/core/about-core-setup) is a command-line [open-source tool](https://github.com/dbt-labs/dbt-core) that enables data practitioners to transform data using analytics engineering best practices. It suits individuals and small technical teams who prefer manual setup and customization, supports community adapters, and open-source standards. -Refer to the following quickstarts to get started with dbt Core: +<div className="grid--3-col"> + +<Card + title="dbt Core from a manual install" + body="Learn how to install dbt Core and set up a project." 
+ link="https://docs.getdbt.com/guides/manual-install" + icon="dbt-bit"/> -- [dbt Core from a manual install](/guides/manual-install) to learn how to install dbt Core and set up a project. -- [dbt Core using GitHub Codespace](/guides/codespace?step=1) to learn how to create a codespace and execute the `dbt build` command. +<Card + title="dbt Core using GitHub Codespace" + body="Learn how to create a codespace and execute the dbt build command." + link="https://docs.getdbt.com/guides/codespace?step=1" + icon="dbt-bit"/> +</div> ## Related docs -<!-- use as an op to link to other useful guides when the query params pr is merged --> + Expand your dbt knowledge and expertise with these additional resources: - [Join the bi-weekly demos](https://www.getdbt.com/resources/webinars/dbt-cloud-demos-with-experts) to see dbt Cloud in action and ask questions. diff --git a/website/docs/docs/use-dbt-semantic-layer/exports.md b/website/docs/docs/use-dbt-semantic-layer/exports.md index 5d6e4c0d996..1883212fb66 100644 --- a/website/docs/docs/use-dbt-semantic-layer/exports.md +++ b/website/docs/docs/use-dbt-semantic-layer/exports.md @@ -176,7 +176,7 @@ If exports aren't needed, you can set the value(s) to `FALSE` (`DBT_INCLUDE_SAVE </VersionBlock> -<!-- for Versionless --> +<!-- for Release Tracks --> <VersionBlock firstVersion="1.8"> 1. Click **Deploy** in the top navigation bar and choose **Environments**. diff --git a/website/docs/docs/use-dbt-semantic-layer/sl-cache.md b/website/docs/docs/use-dbt-semantic-layer/sl-cache.md index 0c6387959a3..27ffe97a951 100644 --- a/website/docs/docs/use-dbt-semantic-layer/sl-cache.md +++ b/website/docs/docs/use-dbt-semantic-layer/sl-cache.md @@ -22,7 +22,7 @@ While you can use caching to speed up your queries and reduce compute time, know ## Prerequisites - dbt Cloud [Team or Enterprise](https://www.getdbt.com/) plan. -- dbt Cloud environments that are ["Versionless"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless). +- dbt Cloud environments must be on [release tracks](/docs/dbt-versions/cloud-release-tracks) and not legacy dbt Core versions. - A successful job run and [production environment](/docs/deploy/deploy-environments#set-as-production-environment). - For declarative caching, you need to have [exports](/docs/use-dbt-semantic-layer/exports) defined in your [saved queries](/docs/build/saved-queries) YAML configuration file. diff --git a/website/docs/faqs/Troubleshooting/error-importing-repo.md b/website/docs/faqs/Troubleshooting/error-importing-repo.md new file mode 100644 index 00000000000..85c9ffb0745 --- /dev/null +++ b/website/docs/faqs/Troubleshooting/error-importing-repo.md @@ -0,0 +1,14 @@ +--- +title: Errors importing a repository on dbt Cloud project set up +description: "Errors importing a repository on dbt Cloud project set up" +sidebar_label: 'Errors importing a repository on dbt Cloud project set up' +id: error-importing-repo +--- + +If you don't see your repository listed, double-check that: +- Your repository is in a Gitlab group you have access to. dbt Cloud will not read repos associated with a user. + +If you do see your repository listed, but are unable to import the repository successfully, double-check that: +- You are a maintainer of that repository. Only users with maintainer permissions can set up repository connections. + +If you imported a repository using the dbt Cloud native integration with GitLab, you should be able to see if the clone strategy is using a `deploy_token`. 
If it's relying on an SSH key, this means the repository was not set up using the native GitLab integration, but rather using the generic git clone option. The repository must be reconnected in order to get the benefits described above. diff --git a/website/docs/faqs/Troubleshooting/gitlab-webhook.md b/website/docs/faqs/Troubleshooting/gitlab-webhook.md new file mode 100644 index 00000000000..450796db83e --- /dev/null +++ b/website/docs/faqs/Troubleshooting/gitlab-webhook.md @@ -0,0 +1,19 @@ +--- +title: Unable to trigger a CI job with GitLab +description: "Unable to trigger a CI job" +sidebar_label: 'Unable to trigger a CI job' +id: gitlab-webhook +--- + +When you connect dbt Cloud to a GitLab repository, GitLab automatically registers a webhook in the background, viewable under the repository settings. This webhook is also used to trigger [CI jobs](/docs/deploy/ci-jobs) when you push to the repository. + +If you're unable to trigger a CI job, this usually indicates that the webhook registration is missing or incorrect. + +To resolve this issue, navigate to the repository settings in GitLab and view the webhook registrations by navigating to GitLab --> **Settings** --> **Webhooks**. + +Some things to check: + +- The webhook registration is enabled in GitLab. +- The webhook registration is configured with the correct URL and secret. + +If you're still experiencing this issue, reach out to the Support team at support@getdbt.com and we'll be happy to help! diff --git a/website/docs/guides/adapter-creation.md b/website/docs/guides/adapter-creation.md index 1a69be98b29..37ef5ec0412 100644 --- a/website/docs/guides/adapter-creation.md +++ b/website/docs/guides/adapter-creation.md @@ -666,7 +666,7 @@ In order to enable the [`dbt init` command](/reference/commands/init) to prompt See examples: -- [dbt-postgres](https://github.com/dbt-labs/dbt-core/blob/main/plugins/postgres/dbt/include/postgres/profile_template.yml) +- [dbt-postgres](https://github.com/dbt-labs/dbt-postgres/blob/main/dbt/include/postgres/profile_template.yml) - [dbt-redshift](https://github.com/dbt-labs/dbt-redshift/blob/main/dbt/include/redshift/profile_template.yml) - [dbt-snowflake](https://github.com/dbt-labs/dbt-snowflake/blob/main/dbt/include/snowflake/profile_template.yml) - [dbt-bigquery](https://github.com/dbt-labs/dbt-bigquery/blob/main/dbt/include/bigquery/profile_template.yml) diff --git a/website/docs/guides/bigquery-qs.md b/website/docs/guides/bigquery-qs.md index 0820c23934d..194b73f25bf 100644 --- a/website/docs/guides/bigquery-qs.md +++ b/website/docs/guides/bigquery-qs.md @@ -85,13 +85,14 @@ In order to let dbt connect to your warehouse, you'll need to generate a keyfile 3. Create a service account key for your new project from the [Service accounts page](https://console.cloud.google.com/iam-admin/serviceaccounts?walkthrough_id=iam--create-service-account-keys&start_index=1#step_index=1). For more information, refer to [Create a service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating) in the Google Cloud docs. When downloading the JSON file, make sure to use a filename you can easily remember. For example, `dbt-user-creds.json`. For security reasons, dbt Labs recommends that you protect this JSON file like you would your identity credentials; for example, don't check the JSON file into your version control software. ## Connect dbt Cloud to BigQuery -1. Create a new project in [dbt Cloud](/docs/cloud/about-cloud/access-regions-ip-addresses). 
Navigate to **Account settings** (by clicking on your account name in the left side menu), and click **+ New Project**. +1. Create a new project in [dbt Cloud](/docs/cloud/about-cloud/access-regions-ip-addresses). Navigate to **Account settings** (by clicking on your account name in the left side menu), and click **+ New project**. 2. Enter a project name and click **Continue**. 3. For the warehouse, click **BigQuery** then **Next** to set up your connection. 4. Click **Upload a Service Account JSON File** in settings. 5. Select the JSON file you downloaded in [Generate BigQuery credentials](#generate-bigquery-credentials) and dbt Cloud will fill in all the necessary fields. -6. Click **Test Connection**. This verifies that dbt Cloud can access your BigQuery account. -7. Click **Next** if the test succeeded. If it failed, you might need to go back and regenerate your BigQuery credentials. +6. Optional — dbt Cloud Enterprise plans can configure developer OAuth with BigQuery, providing an additional layer of security. For more information, refer to [Set up BigQuery OAuth](/docs/cloud/manage-access/set-up-bigquery-oauth). +7. Click **Test Connection**. This verifies that dbt Cloud can access your BigQuery account. +8. Click **Next** if the test succeeded. If it failed, you might need to go back and regenerate your BigQuery credentials. ## Set up a dbt Cloud managed repository diff --git a/website/docs/guides/core-cloud-2.md b/website/docs/guides/core-cloud-2.md index cee1e8029c2..ddc0e883d84 100644 --- a/website/docs/guides/core-cloud-2.md +++ b/website/docs/guides/core-cloud-2.md @@ -155,7 +155,7 @@ After [setting the foundations of dbt Cloud](https://docs.getdbt.com/guides/core Once you’ve confirmed that dbt Cloud orchestration and CI/CD are working as expected, you should pause your current orchestration tool and stop or update your current CI/CD process. This is not relevant if you’re still using an external orchestrator (such as Airflow), and you’ve swapped out `dbt-core` execution for dbt Cloud execution (through the [API](/docs/dbt-cloud-apis/overview)). Familiarize your team with dbt Cloud's [features](/docs/cloud/about-cloud/dbt-cloud-features) and optimize development and deployment processes. Some key features to consider include: -- **Version management:** Manage [dbt versions](/docs/dbt-versions/upgrade-dbt-version-in-cloud) and ensure team collaboration with dbt Cloud's one-click feature, removing the hassle of manual updates and version discrepancies. You can go [**Versionless**](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) to always get the latest features and early access to new functionality for your dbt project. +- **Release tracks:** Choose a [release track](/docs/dbt-versions/cloud-release-tracks) for automatic dbt version upgrades, at the cadence appropriate for your team — removing the hassle of manual updates and the risk of version discrepancies. You can also get early access to new functionality, ahead of dbt Core. - **Development tools**: Use the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) or [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) to build, test, run, and version control your dbt projects. - **Documentation and Source freshness:** Automate storage of [documentation](/docs/build/documentation) and track [source freshness](/docs/deploy/source-freshness) in dbt Cloud, which streamlines project maintenance. 
- **Notifications and logs:** Receive immediate [notifications](/docs/deploy/monitor-jobs) for job failures, with direct links to the job details. Access comprehensive logs for all job runs to help with troubleshooting. diff --git a/website/docs/guides/core-to-cloud-1.md b/website/docs/guides/core-to-cloud-1.md index efed66c862a..3d6b119c178 100644 --- a/website/docs/guides/core-to-cloud-1.md +++ b/website/docs/guides/core-to-cloud-1.md @@ -58,8 +58,7 @@ This guide outlines the steps you need to take to move from dbt Core to dbt Clou ## Prerequisites -- You have an existing dbt Core project connected to a Git repository and data platform supported in [dbt Cloud](/docs/cloud/connect-data-platform/about-connections). -- A [supported version](/docs/dbt-versions/core) of dbt or select [**Versionless**](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) of dbt. +- You have an existing dbt Core project connected to a Git repository and data platform supported in [dbt Cloud](/docs/cloud/connect-data-platform/about-connections). - You have a dbt Cloud account. **[Don't have one? Start your free trial today](https://www.getdbt.com/signup)**! ## Account setup @@ -147,8 +146,8 @@ The most common data environments are production, staging, and development. The ### Initial setup steps 1. **Set up development environment** — Set up your [development](/docs/dbt-cloud-environments#create-a-development-environment) environment and [development credentials](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud#access-the-cloud-ide). You’ll need this to access your dbt project and start developing. -2. **dbt Core version** — In your dbt Cloud environment and credentials, use the same dbt Core version you use locally. You can run `dbt --version` in the command line to find out which version of dbt Core you’re using. - - When using dbt Core, you need to think about which version you’re using and manage your own upgrades. When using dbt Cloud, leverage ["Versionless"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) so you don’t have to. +2. **dbt Core version** — In your dbt Cloud environment, select a [release track](/docs/dbt-versions/cloud-release-tracks) for ongoing dbt version upgrades. If your team plans to use both dbt Core and dbt Cloud for developing or deploying your dbt project, You can run `dbt --version` in the command line to find out which version of dbt Core you’re using. + - When using dbt Core, you need to think about which version you’re using and manage your own upgrades. When using dbt Cloud, leverage [release tracks](/docs/dbt-versions/cloud-release-tracks) so you don’t have to. 3. **Connect to your data platform** — When using dbt Cloud, you can [connect to your data platform](/docs/cloud/connect-data-platform/about-connections) directly in the UI. - Each environment is roughly equivalent to an entry in your `profiles.yml` file. This means you don't need a `profiles.yml` file in your project. @@ -210,7 +209,7 @@ To use the [dbt Cloud's job scheduler](/docs/deploy/job-scheduler), set up one e ### Initial setup steps 1. **dbt Core version** — In your environment settings, configure dbt Cloud with the same dbt Core version. - - Once your full migration is complete, we recommend upgrading your environments to ["Versionless"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) to always get the latest features and more. You only need to do this once. 
+ - Once your full migration is complete, we recommend upgrading your environments to [release tracks](/docs/dbt-versions/cloud-release-tracks) to always get the latest features and more. You only need to do this once. 2. **Configure your jobs** — [Create jobs](/docs/deploy/deploy-jobs#create-and-schedule-jobs) for scheduled or event-driven dbt jobs. You can use cron execution, manual, pull requests, or trigger on the completion of another job. - Note that alongside [jobs in dbt Cloud](/docs/deploy/jobs), discover other ways to schedule and run your dbt jobs with the help of other tools. Refer to [Integrate with other tools](/docs/deploy/deployment-tools) for more information. diff --git a/website/docs/guides/core-to-cloud-3.md b/website/docs/guides/core-to-cloud-3.md index 7d482d54471..81222471345 100644 --- a/website/docs/guides/core-to-cloud-3.md +++ b/website/docs/guides/core-to-cloud-3.md @@ -36,7 +36,7 @@ You may have already started your move to dbt Cloud and are looking for tips to In dbt Cloud, you can natively connect to your data platform and test its [connection](/docs/connect-adapters) with a click of a button. This is especially useful for users who are new to dbt Cloud or are looking to streamline their connection setup. Here are some tips and caveats to consider: ### Tips -- Manage [dbt versions](/docs/dbt-versions/upgrade-dbt-version-in-cloud) and ensure team collaboration with dbt Cloud's one-click feature, eliminating the need for manual updates and version discrepancies. You can go [**Versionless**](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) to always get the latest features and early access to new functionality for your dbt project. +- Manage [dbt versions](/docs/dbt-versions/upgrade-dbt-version-in-cloud) and ensure team collaboration with dbt Cloud's one-click feature, eliminating the need for manual updates and version discrepancies. Select a [release track](/docs/dbt-versions/cloud-release-tracks) for ongoing updates, to always stay up to date with fixes and (optionally) get early access to new functionality for your dbt project. - dbt Cloud supports a whole host of [cloud providers](/docs/cloud/connect-data-platform/about-connections), including Snowflake, Databricks, BigQuery, Fabric, and Redshift (to name a few). - Use [Extended Attributes](/docs/deploy/deploy-environments#extended-attributes) to set a flexible [profiles.yml](/docs/core/connect-data-platform/profiles.yml) snippet in your dbt Cloud environment settings. It gives you more control over environments (both deployment and development) and extends how dbt Cloud connects to the data platform within a given environment. - For example, if you have a field in your `profiles.yml` that you’d like to add to the dbt Cloud adapter user interface, you can use Extended Attributes to set it. diff --git a/website/docs/guides/custom-cicd-pipelines.md b/website/docs/guides/custom-cicd-pipelines.md index be23524d096..668d3f6f1dd 100644 --- a/website/docs/guides/custom-cicd-pipelines.md +++ b/website/docs/guides/custom-cicd-pipelines.md @@ -506,7 +506,7 @@ Additionally, you’ll see the job in the run history of dbt Cloud. 
It should be </TabItem> <TabItem value="bitbucket"> -<Lightbox src="/img/guides/orchestration/custom-cicd-pipelines/dbt-run-on-merge-bitbucket.png)" title="dbt run on merge job in Bitbucket" width="80%" /> +<Lightbox src="/img/guides/orchestration/custom-cicd-pipelines/dbt-run-on-merge-bitbucket.png" title="dbt run on merge job in Bitbucket" width="80%" /> <Lightbox src="/img/guides/orchestration/custom-cicd-pipelines/dbt-cloud-job-bitbucket-triggered.png" title="dbt Cloud job showing it was triggered by Bitbucket" width="80%" /> diff --git a/website/docs/guides/mesh-qs.md b/website/docs/guides/mesh-qs.md index 47ece7b29ec..d81951c9669 100644 --- a/website/docs/guides/mesh-qs.md +++ b/website/docs/guides/mesh-qs.md @@ -40,7 +40,6 @@ To leverage dbt Mesh, you need the following: - You must have a [dbt Cloud Enterprise account](https://www.getdbt.com/get-started/enterprise-contact-pricing) <Lifecycle status="enterprise"/> - You have access to a cloud data platform, permissions to load the sample data tables, and dbt Cloud permissions to create new projects. -- Set your development and deployment [environments](/docs/dbt-cloud-environments) to use dbt [version](/docs/dbt-versions/core) 1.6 or later. You can also opt to go ["Versionless"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) to always get the most recent features and functionality. - This guide uses the Jaffle Shop sample data, including `customers`, `orders`, and `payments` tables. Follow the provided instructions to load this data into your respective data platform: - [Snowflake](https://docs.getdbt.com/guides/snowflake?step=3) - [Databricks](https://docs.getdbt.com/guides/databricks?step=3) @@ -95,7 +94,7 @@ To set a production environment: 6. Click **Test Connection** to confirm the deployment connection. 6. Click **Save** to create a production environment. -<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/prod-settings.jpg" width="70%" title="Set your production environment as the default environment in your Environment Settings"/> +<Lightbox src="/img/docs/dbt-cloud/using-dbt-cloud/prod-settings-1.png" width="100%" title="Set your production environment as the default environment in your Environment Settings"/> ## Set up a foundational project diff --git a/website/docs/guides/sl-snowflake-qs.md b/website/docs/guides/sl-snowflake-qs.md index d9de3f0e5fd..79038cd1dfc 100644 --- a/website/docs/guides/sl-snowflake-qs.md +++ b/website/docs/guides/sl-snowflake-qs.md @@ -106,7 +106,6 @@ Open a new tab and follow these quick steps for account setup and data loading i </DetailsToggle> -- Production and development environments must be on [dbt version 1.6 or higher](/docs/dbt-versions/upgrade-dbt-version-in-cloud). Alternatively, set your environment to [**Versionless**](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) to always get the latest updates. - Create a [trial Snowflake account](https://signup.snowflake.com/): - Select the Enterprise Snowflake edition with ACCOUNTADMIN access. Consider organizational questions when choosing a cloud provider, refer to Snowflake's [Introduction to Cloud Platforms](https://docs.snowflake.com/en/user-guide/intro-cloud-platforms). - Select a cloud provider and region. All cloud providers and regions will work so choose whichever you prefer. 
diff --git a/website/docs/guides/snowflake-qs.md b/website/docs/guides/snowflake-qs.md index 1eae3a13fb0..f1edd5ffc00 100644 --- a/website/docs/guides/snowflake-qs.md +++ b/website/docs/guides/snowflake-qs.md @@ -230,6 +230,26 @@ Now that you have a repository configured, you can initialize your project and s ``` - In the command line bar at the bottom, enter `dbt run` and click **Enter**. You should see a `dbt run succeeded` message. +:::info +If you receive an insufficient privileges error on Snowflake at this point, it may be because your Snowflake role doesn't have permission to access the raw source data, to build target tables and views, or both. + +To troubleshoot, use a role with sufficient privileges (like `ACCOUNTADMIN`) and run the following commands in Snowflake. + +**Note**: Replace `snowflake_role_name` with the role you intend to use. If you launched dbt Cloud with Snowflake Partner Connect, use `pc_dbt_role` as the role. + +``` +grant all on database raw to role snowflake_role_name; +grant all on database analytics to role snowflake_role_name; + +grant all on schema raw.jaffle_shop to role snowflake_role_name; +grant all on schema raw.stripe to role snowflake_role_name; + +grant all on all tables in database raw to role snowflake_role_name; +grant all on future tables in database raw to role snowflake_role_name; +``` + +::: + ## Build your first model You have two options for working with files in the dbt Cloud IDE: diff --git a/website/docs/reference/commands/init.md b/website/docs/reference/commands/init.md index 112fff63a38..7b71bf70f45 100644 --- a/website/docs/reference/commands/init.md +++ b/website/docs/reference/commands/init.md @@ -31,7 +31,7 @@ If you've just cloned or downloaded an existing dbt project, `dbt init` can stil `dbt init` knows how to prompt for connection information by looking for a file named `profile_template.yml`. It will look for this file in two places: -- **Adapter plugin:** What's the bare minumum Postgres profile? What's the type of each field, what are its defaults? This information is stored in a file called [`dbt/include/postgres/profile_template.yml`](https://github.com/dbt-labs/dbt-core/blob/main/plugins/postgres/dbt/include/postgres/profile_template.yml). If you're the maintainer of an adapter plugin, we highly recommend that you add a `profile_template.yml` to your plugin, too. Refer to the [Build, test, document, and promote adapters](/guides/adapter-creation) guide for more information. +- **Adapter plugin:** What's the bare minumum Postgres profile? What's the type of each field, what are its defaults? This information is stored in a file called [`dbt/include/postgres/profile_template.yml`](https://github.com/dbt-labs/dbt-postgres/blob/main/dbt/include/postgres/profile_template.yml). If you're the maintainer of an adapter plugin, we highly recommend that you add a `profile_template.yml` to your plugin, too. Refer to the [Build, test, document, and promote adapters](/guides/adapter-creation) guide for more information. - **Existing project:** If you're the maintainer of an existing project, and you want to help new users get connected to your database quickly and easily, you can include your own custom `profile_template.yml` in the root of your project, alongside `dbt_project.yml`. For common connection attributes, set the values in `fixed`; leave user-specific attributes in `prompts`, but with custom hints and defaults as you'd like. 
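For illustration, here is a minimal sketch of what a project-level `profile_template.yml` could look like for the "existing project" case described above. The connection details (Snowflake account, database, role, and warehouse names) are assumptions for this example only; shared values live under `fixed`, while user-specific values stay under `prompts` with hints and defaults:

```yml
# profile_template.yml (hypothetical), placed alongside dbt_project.yml
fixed:
  type: snowflake
  account: abc12345          # shared across the team, so users are never prompted for it
  database: analytics
  role: transformer
  warehouse: transforming
prompts:
  user:
    type: string
    hint: "yourname@example.com"
  schema:
    type: string
    hint: "usually dbt_<yourname>"
  threads:
    type: int
    default: 4
    hint: "1 or more"
```

With a file like this in the project root, `dbt init` only prompts new users for `user`, `schema`, and `threads`, then writes the combined values into their local `profiles.yml`.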
diff --git a/website/docs/reference/commands/run.md b/website/docs/reference/commands/run.md index 26db40cb7e4..58a876f98ef 100644 --- a/website/docs/reference/commands/run.md +++ b/website/docs/reference/commands/run.md @@ -83,4 +83,15 @@ See [global configs](/reference/global-configs/print-output#print-color) The `run` command supports the `--empty` flag for building schema-only dry runs. The `--empty` flag limits the refs and sources to zero rows. dbt will still execute the model SQL against the target data warehouse but will avoid expensive reads of input data. This validates dependencies and ensures your models will build properly. -</VersionBlock> \ No newline at end of file +</VersionBlock> + +## Status codes + +When calling the [list_runs api](/dbt-cloud/api-v2#/operations/List%20Runs), you will get a status code for each run returned. The available run status codes are as follows: + +- Starting = 1 +- Running = 3 +- Success = 10 +- Error = 20 +- Canceled = 30 +- Skipped = 40 diff --git a/website/docs/reference/commands/version.md b/website/docs/reference/commands/version.md index 3847b3cd593..4d5ce6524dd 100644 --- a/website/docs/reference/commands/version.md +++ b/website/docs/reference/commands/version.md @@ -13,7 +13,7 @@ The `--version` command-line flag returns information about the currently instal ## Versioning To learn more about release versioning for dbt Core, refer to [How dbt Core uses semantic versioning](/docs/dbt-versions/core#how-dbt-core-uses-semantic-versioning). -If using [versionless dbt Cloud](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless), then `dbt_version` uses the latest (continuous) release version. This also follows semantic versioning guidelines, using the `YYYY.MM.DD+<suffix>` format. The year, month, and day represent the date the version was built (for example, `2024.10.28+996c6a8`). The suffix provides an additional unique identification for each build. +If using a [dbt Cloud release track](/docs/dbt-versions/cloud-release-tracks), which provide ongoing updates to dbt, then `dbt_version` represents the release version of dbt in dbt Cloud. This also follows semantic versioning guidelines, using the `YYYY.MM.DD+<suffix>` format. The year, month, and day represent the date the version was built (for example, `2024.10.28+996c6a8`). The suffix provides an additional unique identification for each build. ## Example usages diff --git a/website/docs/reference/database-permissions/snowflake-permissions.md b/website/docs/reference/database-permissions/snowflake-permissions.md index 3f474242834..1ab35e46d26 100644 --- a/website/docs/reference/database-permissions/snowflake-permissions.md +++ b/website/docs/reference/database-permissions/snowflake-permissions.md @@ -83,6 +83,7 @@ grant role reporter to user looker_user; -- or mode_user, periscope_user ``` 5. Let loader load data + Give the role unilateral permission to operate on the raw database ``` use role sysadmin; @@ -90,6 +91,7 @@ grant all on database raw to role loader; ``` 6. Let transformer transform data + The transformer role needs to be able to read raw data. If you do this before you have any data loaded, you can run: @@ -110,6 +112,7 @@ transformer also needs to be able to create in the analytics database: grant all on database analytics to role transformer; ``` 7. Let reporter read the transformed data + A previous version of this article recommended this be implemented through hooks in dbt, but this way lets you get away with a one-off statement. 
``` grant usage on database analytics to role reporter; @@ -120,10 +123,11 @@ grant select on future views in database analytics to role reporter; Again, if you already have data in your analytics database, make sure you run: ``` grant usage on all schemas in database analytics to role reporter; -grant select on all tables in database analytics to role transformer; -grant select on all views in database analytics to role transformer; +grant select on all tables in database analytics to role reporter; +grant select on all views in database analytics to role reporter; ``` 8. Maintain + When new users are added, make sure you add them to the right role! Everything else should be inherited automatically thanks to those `future` grants. For more discussion and legacy information, refer to [this Discourse article](https://discourse.getdbt.com/t/setting-up-snowflake-the-exact-grant-statements-we-run/439). diff --git a/website/docs/reference/dbt-classes.md b/website/docs/reference/dbt-classes.md index 13f9263e545..a6a8c2d4fa6 100644 --- a/website/docs/reference/dbt-classes.md +++ b/website/docs/reference/dbt-classes.md @@ -98,9 +98,14 @@ col.numeric_type('numeric', 12, 4) # numeric(12,4) ### Properties -- **name**: Returns the name of the column +- **char_size**: Returns the maximum size for character varying columns +- **column**: Returns the name of the column +- **data_type**: Returns the data type of the column (with size/precision/scale included) +- **dtype**: Returns the data type of the column (without any size/precision/scale included) +- **name**: Returns the name of the column (identical to `column`, provided as an alias). +- **numeric_precision**: Returns the maximum precision for fixed decimal columns +- **numeric_scale**: Returns the maximum scale for fixed decimal columns - **quoted**: Returns the name of the column wrapped in quotes -- **data_type**: Returns the data type of the column ### Instance methods diff --git a/website/docs/reference/dbt-commands.md b/website/docs/reference/dbt-commands.md index ca9a7725eb2..9cbc5e5e38b 100644 --- a/website/docs/reference/dbt-commands.md +++ b/website/docs/reference/dbt-commands.md @@ -34,10 +34,10 @@ Commands with a ('❌') indicate write commands, commands with a ('✅') indicat | Command | Description | Parallel execution | <div style={{width:'250px'}}>Caveats</div> | |---------|-------------| :-----------------:| ------------------------------------------ | -| [build](/reference/commands/build) | Build and test all selected resources (models, seeds, snapshots, tests) | ❌ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | +| [build](/reference/commands/build) | Builds and tests all selected resources (models, seeds, snapshots, tests) | ❌ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | | cancel | Cancels the most recent invocation. 
| N/A | dbt Cloud CLI <br /> Requires [dbt v1.6 or higher](/docs/dbt-versions/core) | | [clean](/reference/commands/clean) | Deletes artifacts present in the dbt project | ✅ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | -| [clone](/reference/commands/clone) | Clone selected models from the specified state | ❌ | All tools <br /> Requires [dbt v1.6 or higher](/docs/dbt-versions/core) | +| [clone](/reference/commands/clone) | Clones selected models from the specified state | ❌ | All tools <br /> Requires [dbt v1.6 or higher](/docs/dbt-versions/core) | | [compile](/reference/commands/compile) | Compiles (but does not run) the models in a project | ✅ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | | [debug](/reference/commands/debug) | Debugs dbt connections and projects | ✅ | dbt Cloud IDE, dbt Core <br /> All [supported versions](/docs/dbt-versions/core) | | [deps](/reference/commands/deps) | Downloads dependencies for a project | ✅ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | @@ -50,9 +50,9 @@ Commands with a ('❌') indicate write commands, commands with a ('✅') indicat | reattach | Reattaches to the most recent invocation to retrieve logs and artifacts. | N/A | dbt Cloud CLI <br /> Requires [dbt v1.6 or higher](/docs/dbt-versions/core) | | [retry](/reference/commands/retry) | Retry the last run `dbt` command from the point of failure | ❌ | All tools <br /> Requires [dbt v1.6 or higher](/docs/dbt-versions/core) | | [run](/reference/commands/run) | Runs the models in a project | ❌ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | -| [run-operation](/reference/commands/run-operation) | Invoke a macro, including running arbitrary maintenance SQL against the database | ❌ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | +| [run-operation](/reference/commands/run-operation) | Invokes a macro, including running arbitrary maintenance SQL against the database | ❌ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | | [seed](/reference/commands/seed) | Loads CSV files into the database | ❌ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | -| [show](/reference/commands/show) | Preview table rows post-transformation | ✅ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | +| [show](/reference/commands/show) | Previews table rows post-transformation | ✅ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | | [snapshot](/reference/commands/snapshot) | Executes "snapshot" jobs defined in a project | ❌ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | | [source](/reference/commands/source) | Provides tools for working with source data (including validating that sources are "fresh") | ✅ | All tools<br /> All [supported versions](/docs/dbt-versions/core) | | [test](/reference/commands/test) | Executes tests defined in a project | ✅ | All tools <br /> All [supported versions](/docs/dbt-versions/core) | diff --git a/website/docs/reference/dbt-jinja-functions/config.md b/website/docs/reference/dbt-jinja-functions/config.md index 3903c82eef7..8083ea2a124 100644 --- a/website/docs/reference/dbt-jinja-functions/config.md +++ b/website/docs/reference/dbt-jinja-functions/config.md @@ -34,13 +34,21 @@ __Args__: The `config.get` function is used to get configurations for a model from the end-user. Configs defined in this way are optional, and a default value can be provided. +There are 3 cases: +1. 
The configuration variable exists, it is not `None` +1. The configuration variable exists, it is `None` +1. The configuration variable does not exist + Example usage: ```sql {% materialization incremental, default -%} -- Example w/ no default. unique_key will be None if the user does not provide this configuration {%- set unique_key = config.get('unique_key') -%} - -- Example w/ default value. Default to 'id' if 'unique_key' not provided + -- Example w/ alternate value. Use alternative of 'id' if 'unique_key' config is provided, but it is None + {%- set unique_key = config.get('unique_key') or 'id' -%} + + -- Example w/ default value. Default to 'id' if the 'unique_key' config does not exist {%- set unique_key = config.get('unique_key', default='id') -%} ... ``` diff --git a/website/docs/reference/dbt-jinja-functions/model.md b/website/docs/reference/dbt-jinja-functions/model.md index 516981e11e3..b0995ff958c 100644 --- a/website/docs/reference/dbt-jinja-functions/model.md +++ b/website/docs/reference/dbt-jinja-functions/model.md @@ -20,9 +20,9 @@ To view the contents of `model` for a given model: <Tabs> -<TabItem value="cli" label="CLI"> +<TabItem value="cli" label="Command line interface"> -If you're using the CLI, use [log()](/reference/dbt-jinja-functions/log) to print the full contents: +If you're using the command line interface (CLI), use [log()](/reference/dbt-jinja-functions/log) to print the full contents: ```jinja {{ log(model, info=True) }} @@ -42,6 +42,48 @@ If you're using the CLI, use [log()](/reference/dbt-jinja-functions/log) to prin </Tabs> +## Batch properties for microbatch models + +Starting in dbt Core v1.9, the model object includes a `batch` property (`model.batch`), which provides details about the current batch when executing an [incremental microbatch](/docs/build/incremental-microbatch) model. This property is only populated during the batch execution of a microbatch model. + +The following table describes the properties of the `batch` object. Note that dbt appends the property to the `model` and `batch` objects. + +| Property | Description | Example | +| -------- | ----------- | ------- | +| `id` | The unique identifier for the batch within the context of the microbatch model. | `model.batch.id` | +| `event_time_start` | The start time of the batch's [`event_time`](/reference/resource-configs/event-time) filter (inclusive). | `model.batch.event_time_start` | +| `event_time_end` | The end time of the batch's `event_time` filter (exclusive). | `model.batch.event_time_end` | + +### Usage notes + +`model.batch` is only available during the execution of a microbatch model batch. Outside of the microbatch execution, `model.batch` is `None`, and its sub-properties aren't accessible. + +#### Example of safeguarding access to batch properties + +We recommend to always check if `model.batch` is populated before accessing its properties. To do this, use an `if` statement for safe access to `batch` properties: + +```jinja +{% if model.batch %} + {{ log(model.batch.id) }} # Log the batch ID # + {{ log(model.batch.event_time_start) }} # Log the start time of the batch # + {{ log(model.batch.event_time_end) }} # Log the end time of the batch # +{% endif %} +``` + +In this example, the `if model.batch` statement makes sure that the code only runs during a batch execution. `log()` is used to print the `batch` properties for debugging. 
+ +#### Example of log batch details + +This is a practical example of how you might use `model.batch` in a microbatch model to log batch details for the `batch.id`: + +```jinja +{% if model.batch %} + {{ log("Processing batch with ID: " ~ model.batch.id, info=True) }} + {{ log("Batch event time range: " ~ model.batch.event_time_start ~ " to " ~ model.batch.event_time_end, info=True) }} +{% endif %} +``` +In this example, the `if model.batch` statement makes sure that the code only runs during a batch execution. `log()` is used to print the `batch` properties for debugging. + ## Model structure and JSON schema To view the structure of `models` and their definitions: diff --git a/website/docs/reference/dbt-jinja-functions/this.md b/website/docs/reference/dbt-jinja-functions/this.md index f9f2961b08f..7d358cb6299 100644 --- a/website/docs/reference/dbt-jinja-functions/this.md +++ b/website/docs/reference/dbt-jinja-functions/this.md @@ -20,8 +20,6 @@ meta: ## Examples -<Snippet path="hooks-to-grants" /> - ### Configuring incremental models <File name='models/stg_events.sql'> diff --git a/website/docs/reference/dbtignore.md b/website/docs/reference/dbtignore.md index 8733fc592cd..063b455f5cc 100644 --- a/website/docs/reference/dbtignore.md +++ b/website/docs/reference/dbtignore.md @@ -20,6 +20,13 @@ another-non-dbt-model.py # ignore all .py files with "codegen" in the filename *codegen*.py + +# ignore all folders in a directory +path/to/folders/** + +# ignore some folders in a directory +path/to/folders/subfolder/** + ``` </File> diff --git a/website/docs/reference/global-configs/behavior-changes.md b/website/docs/reference/global-configs/behavior-changes.md index 94afa7c9cae..bda4d2b361a 100644 --- a/website/docs/reference/global-configs/behavior-changes.md +++ b/website/docs/reference/global-configs/behavior-changes.md @@ -59,13 +59,14 @@ flags: source_freshness_run_project_hooks: False restrict_direct_pg_catalog_access: False require_yaml_configuration_for_mf_time_spines: False + require_batched_execution_for_custom_microbatch_strategy: False ``` </File> -When we use dbt Cloud in the following table, we're referring to accounts that have gone "[Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless)." This table outlines which version of dbt Core contains the behavior change or the date the behavior change was added to dbt Cloud. +This table outlines which month of the "Latest" release track in dbt Cloud and which version of dbt Core contains the behavior change's introduction (disabled by default) or maturity (enabled by default). 
-| Flag | dbt Cloud: Intro | dbt Cloud: Maturity | dbt Core: Intro | dbt Core: Maturity | +| Flag | dbt Cloud "Latest": Intro | dbt Cloud "Latest": Maturity | dbt Core: Intro | dbt Core: Maturity | |-----------------------------------------------------------------|------------------|---------------------|-----------------|--------------------| | [require_explicit_package_overrides_for_builtin_materializations](#package-override-for-built-in-materialization) | 2024.04 | 2024.06 | 1.6.14, 1.7.14 | 1.8.0 | | [require_resource_names_without_spaces](#no-spaces-in-resource-names) | 2024.05 | TBD* | 1.8.0 | 1.10.0 | @@ -74,6 +75,8 @@ When we use dbt Cloud in the following table, we're referring to accounts that h | [skip_nodes_if_on_run_start_fails](#failures-in-on-run-start-hooks) | 2024.10 | TBD* | 1.9.0 | TBD* | | [state_modified_compare_more_unrendered_values](#source-definitions-for-state) | 2024.10 | TBD* | 1.9.0 | TBD* | | [require_yaml_configuration_for_mf_time_spines](#metricflow-time-spine-yaml) | 2024.10 | TBD* | 1.9.0 | TBD* | +| [require_batched_execution_for_custom_microbatch_strategy](#custom-microbatch-strategy) | 2024.11 | TBD* | 1.9.0 | TBD* | +| [cumulative_type_params](#cumulative-metrics-parameter) | 2024.11 | TBD* | 1.9.0 | TBD* | When the dbt Cloud Maturity is "TBD," it means we have not yet determined the exact date when these flags' default values will change. Affected users will see deprecation warnings in the meantime, and they will receive emails providing advance warning ahead of the maturity date. In the meantime, if you are seeing a deprecation warning, you can either: - Migrate your project to support the new behavior, and then set the flag to `True` to stop seeing the warnings. @@ -164,3 +167,61 @@ In previous versions (dbt Core 1.8 and earlier), the MetricFlow time spine confi When the flag is set to `True`, dbt will continue to support the SQL file configuration. When the flag is set to `False`, dbt will raise a deprecation warning if it detects a MetricFlow time spine configured in a SQL file. The MetricFlow YAML file should have the `time_spine:` field. Refer to [MetricFlow timespine](/docs/build/metricflow-time-spine) for more details. + +### Custom microbatch strategy +The `require_batched_execution_for_custom_microbatch_strategy` flag is set to `False` by default and is only relevant if you already have a custom microbatch macro in your project. If you don't have a custom microbatch macro, you don't need to set this flag as dbt will handle microbatching automatically for any model using the [microbatch strategy](/docs/build/incremental-microbatch#how-microbatch-compares-to-other-incremental-strategies). + +Set the flag is set to `True` if you have a custom microbatch macro set up in your project. When the flag is set to `True`, dbt will execute the custom microbatch strategy in batches. + +If you have a custom microbatch macro and the flag is left as `False`, dbt will issue a deprecation warning. + +Previously, users needed to set the `DBT_EXPERIMENTAL_MICROBATCH` environment variable to `True` to prevent unintended interactions with existing custom incremental strategies. But this is no longer necessary, as setting `DBT_EXPERMINENTAL_MICROBATCH` will no longer have an effect on runtime functionality. + +### Cumulative metrics + +[Cumulative-type metrics](/docs/build/cumulative#parameters) are nested under the `cumulative_type_params` field in [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks), dbt Core v1.9 and newer. 
Currently, dbt will warn users if they have cumulative metrics improperly nested. To enforce the new format (resulting in an error instead of a warning), set the `require_nested_cumulative_type_params` to `True`. + +Use the following metric configured with the syntax before v1.9 as an example: + +```yaml + + type: cumulative + type_params: + measure: order_count + window: 7 days + +``` + +If you run `dbt parse` with that syntax on Core v1.9 or [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks), you will receive a warning like: + +```bash + +15:36:22 [WARNING]: Cumulative fields `type_params.window` and +`type_params.grain_to_date` has been moved and will soon be deprecated. Please +nest those values under `type_params.cumulative_type_params.window` and +`type_params.cumulative_type_params.grain_to_date`. See documentation on +behavior changes: +https://docs.getdbt.com/reference/global-configs/behavior-changes. + +``` + +If you set `require_nested_cumulative_type_params` to `True` and re-run `dbt parse` you will now receive an error like: + +```bash + +21:39:18 Cumulative fields `type_params.window` and `type_params.grain_to_date` should be nested under `type_params.cumulative_type_params.window` and `type_params.cumulative_type_params.grain_to_date`. Invalid metrics: orders_last_7_days. See documentation on behavior changes: https://docs.getdbt.com/reference/global-configs/behavior-changes. + +``` + +Once the metric is updated, it will work as expected: + +```yaml + + type: cumulative + type_params: + measure: + name: order_count + cumulative_type_params: + window: 7 days + +``` diff --git a/website/docs/reference/global-configs/indirect-selection.md b/website/docs/reference/global-configs/indirect-selection.md index 729176a1ff4..03048b57119 100644 --- a/website/docs/reference/global-configs/indirect-selection.md +++ b/website/docs/reference/global-configs/indirect-selection.md @@ -6,7 +6,7 @@ sidebar: "Indirect selection" import IndirSelect from '/snippets/_indirect-selection-definitions.md'; -Use the `--indirect_selection` flag to `dbt test` or `dbt build` to configure which tests to run for the nodes you specify. You can set this as a CLI flag or an environment variable. In dbt Core, you can also configure user configurations in [YAML selectors](/reference/node-selection/yaml-selectors) or in the `flags:` block of `dbt_project.yml`, which sets project-level flags. +Use the `--indirect-selection` flag to `dbt test` or `dbt build` to configure which tests to run for the nodes you specify. You can set this as a CLI flag or an environment variable. In dbt Core, you can also configure user configurations in [YAML selectors](/reference/node-selection/yaml-selectors) or in the `flags:` block of `dbt_project.yml`, which sets project-level flags. When all flags are set, the order of precedence is as follows. Refer to [About global configs](/reference/global-configs/about-global-configs) for more details: diff --git a/website/docs/reference/global-configs/logs.md b/website/docs/reference/global-configs/logs.md index 682b9fc8393..85969a5bc02 100644 --- a/website/docs/reference/global-configs/logs.md +++ b/website/docs/reference/global-configs/logs.md @@ -66,19 +66,28 @@ See [structured logging](/reference/events-logging#structured-logging) for more The `LOG_LEVEL` config sets the minimum severity of events captured in the console and file logs. This is a more flexible alternative to the `--debug` flag. 
The available options for the log levels are `debug`, `info`, `warn`, `error`, or `none`. -Setting the `--log-level` will configure console and file logs. +- Setting the `--log-level` will configure console and file logs. + ```text + dbt --log-level debug run + ``` -```text -dbt --log-level debug run -``` +- Setting the `LOG_LEVEL` to `none` will disable information from being sent to either the console or file logs. + + ```text + dbt --log-level none + ``` -To set the file log level as a different value than the console, use the `--log-level-file` flag. +- To set the file log level as a different value than the console, use the `--log-level-file` flag. + ```text + dbt --log-level-file error run + ``` -```text -dbt --log-level-file error run -``` +- To only disable writing to the logs file but keep console logs, set `LOG_LEVEL_FILE` config to none. + ```text + dbt --log-level-file none + ``` ### Debug-level logging diff --git a/website/docs/reference/global-configs/resource-type.md b/website/docs/reference/global-configs/resource-type.md index 431b6c049cb..9a888c73885 100644 --- a/website/docs/reference/global-configs/resource-type.md +++ b/website/docs/reference/global-configs/resource-type.md @@ -6,7 +6,7 @@ sidebar: "resource type" <VersionBlock lastVersion="1.8"> -The `--resource-type` and `--exclude-resource-type` flags include or exclude resource types from the `dbt build`, `dbt clone`, and `dbt list` commands. In Versionless and from dbt v1.9 onwards, these flags are also supported in the `dbt test` command. +The `--resource-type` and `--exclude-resource-type` flags include or exclude resource types from the `dbt build`, `dbt clone`, and `dbt list` commands. In dbt v1.9 onwards, these flags are also supported in the `dbt test` command. </VersionBlock> diff --git a/website/docs/reference/global-configs/version-compatibility.md b/website/docs/reference/global-configs/version-compatibility.md index 80841678a85..7667dcfda9c 100644 --- a/website/docs/reference/global-configs/version-compatibility.md +++ b/website/docs/reference/global-configs/version-compatibility.md @@ -14,7 +14,7 @@ Running with dbt=1.0.0 Found 13 models, 2 tests, 1 archives, 0 analyses, 204 macros, 2 operations.... 
``` -:::info Versionless +:::info dbt Cloud release tracks <Snippet path="_config-dbt-version-check" /> ::: diff --git a/website/docs/reference/model-configs.md b/website/docs/reference/model-configs.md index 9508cf68ceb..6c37b69758c 100644 --- a/website/docs/reference/model-configs.md +++ b/website/docs/reference/model-configs.md @@ -36,9 +36,11 @@ models: [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): <materialization_name> [+](/reference/resource-configs/plus-prefix)[sql_header](/reference/resource-configs/sql_header): <string> [+](/reference/resource-configs/plus-prefix)[on_configuration_change](/reference/resource-configs/on_configuration_change): apply | continue | fail #only for materialized views on supported adapters + [+](/reference/resource-configs/plus-prefix)[unique_key](/reference/resource-configs/unique_key): <column_name_or_expression> ``` + </File> </TabItem> @@ -57,6 +59,7 @@ models: [materialized](/reference/resource-configs/materialized): <materialization_name> [sql_header](/reference/resource-configs/sql_header): <string> [on_configuration_change](/reference/resource-configs/on_configuration_change): apply | continue | fail #only for materialized views on supported adapters + [unique_key](/reference/resource-configs/unique_key): <column_name_or_expression> ``` @@ -69,12 +72,13 @@ models: <File name='models/<model_name>.sql'> -```jinja +```sql {{ config( [materialized](/reference/resource-configs/materialized)="<materialization_name>", [sql_header](/reference/resource-configs/sql_header)="<string>" [on_configuration_change](/reference/resource-configs/on_configuration_change): apply | continue | fail #only for materialized views for supported adapters + [unique_key](/reference/resource-configs/unique_key)='column_name_or_expression' ) }} ``` @@ -212,7 +216,7 @@ models: <VersionBlock lastVersion="1.8"> -```jinja +```sql {{ config( [enabled](/reference/resource-configs/enabled)=true | false, @@ -233,7 +237,7 @@ models: <VersionBlock firstVersion="1.9"> -```jinja +```sql {{ config( [enabled](/reference/resource-configs/enabled)=true | false, @@ -246,8 +250,9 @@ models: [persist_docs](/reference/resource-configs/persist_docs)={<dict>}, [meta](/reference/resource-configs/meta)={<dict>}, [grants](/reference/resource-configs/grants)={<dict>}, - [contract](/reference/resource-configs/contract)={<dictionary>} - [event_time](/reference/resource-configs/event-time): my_time_field + [contract](/reference/resource-configs/contract)={<dictionary>}, + [event_time](/reference/resource-configs/event-time)='my_time_field', + ) }} ``` diff --git a/website/docs/reference/node-selection/defer.md b/website/docs/reference/node-selection/defer.md index 863494de12e..eddb1ece9d4 100644 --- a/website/docs/reference/node-selection/defer.md +++ b/website/docs/reference/node-selection/defer.md @@ -29,11 +29,12 @@ dbt test --models [...] --defer --state path/to/artifacts </VersionBlock> -When the `--defer` flag is provided, dbt will resolve `ref` calls differently depending on two criteria: -1. Is the referenced node included in the model selection criteria of the current run? -2. Does the referenced node exist as a database object in the current environment? +By default, dbt uses the [`target`](/reference/dbt-jinja-functions/target) namespace to resolve `ref` calls. 
-If the answer to both is **no**—a node is not included _and_ it does not exist as a database object in the current environment—references to it will use the other namespace instead, provided by the state manifest. +When `--defer` is enabled, dbt resolves ref calls using the state manifest instead, but only if: + +1. The node isn’t among the selected nodes, _and_ +2. It doesn’t exist in the database (or `--favor-state` is used). Ephemeral models are never deferred, since they serve as "passthroughs" for other `ref` calls. @@ -46,7 +47,7 @@ Deferral requires both `--defer` and `--state` to be set, either by passing flag #### Favor state -You can optionally skip the second criterion by passing the `--favor-state` flag. If passed, dbt will favor using the node defined in your `--state` namespace, even if the node exists in the current target. +When `--favor-state` is passed, dbt prioritizes node definitions from the `--state directory`. However, this doesn’t apply if the node is also part of the selected nodes. ### Example diff --git a/website/docs/reference/project-configs/analysis-paths.md b/website/docs/reference/project-configs/analysis-paths.md index 5c3d223a5cb..20e2e65c2ad 100644 --- a/website/docs/reference/project-configs/analysis-paths.md +++ b/website/docs/reference/project-configs/analysis-paths.md @@ -13,12 +13,31 @@ analysis-paths: [directorypath] </File> ## Definition -Specify a custom list of directories where [analyses](/docs/build/analyses) are located. +Specify a custom list of directories where [analyses](/docs/build/analyses) are located. ## Default Without specifying this config, dbt will not compile any `.sql` files as analyses. -However, the [`dbt init` command](/reference/commands/init) populates this value as `analyses` ([source](https://github.com/dbt-labs/dbt-starter-project/blob/HEAD/dbt_project.yml#L15)) +However, the [`dbt init` command](/reference/commands/init) populates this value as `analyses` ([source](https://github.com/dbt-labs/dbt-starter-project/blob/HEAD/dbt_project.yml#L15)). + +import RelativePath from '/snippets/_relative-path.md'; + +<RelativePath +path="analysis-paths" +absolute="/Users/username/project/analyses" +/> + +- ✅ **Do** + - Use relative path: + ```yml + analysis-paths: ["analyses"] + ``` + +- ❌ **Don't** + - Avoid absolute paths: + ```yml + analysis-paths: ["/Users/username/project/analyses"] + ``` ## Examples ### Use a subdirectory named `analyses` diff --git a/website/docs/reference/project-configs/asset-paths.md b/website/docs/reference/project-configs/asset-paths.md index 1fb3cf9f260..effae8bad7f 100644 --- a/website/docs/reference/project-configs/asset-paths.md +++ b/website/docs/reference/project-configs/asset-paths.md @@ -15,8 +15,29 @@ asset-paths: [directorypath] ## Definition Optionally specify a custom list of directories to copy to the `target` directory as part of the `docs generate` command. This is useful for rendering images in your repository in your project documentation. + ## Default -By default, dbt will not copy any additional files as part of docs generate, i.e. `asset-paths: []` + +By default, dbt will not copy any additional files as part of docs generate. For example, `asset-paths: []`. 
+ +import RelativePath from '/snippets/_relative-path.md'; + +<RelativePath +path="asset-paths" +absolute="/Users/username/project/assets" +/> + +- ✅ **Do** + - Use relative path: + ```yml + asset-paths: ["assets"] + ``` + +- ❌ **Don't** + - Avoid absolute paths: + ```yml + asset-paths: ["/Users/username/project/assets"] + ``` ## Examples ### Compile files in the `assets` subdirectory as part of `docs generate` diff --git a/website/docs/reference/project-configs/docs-paths.md b/website/docs/reference/project-configs/docs-paths.md index 5481c19c9fd..6cd179201fc 100644 --- a/website/docs/reference/project-configs/docs-paths.md +++ b/website/docs/reference/project-configs/docs-paths.md @@ -30,6 +30,25 @@ By default, dbt will search in all resource paths for docs blocks (i.e. the comb </VersionBlock> +import RelativePath from '/snippets/_relative-path.md'; + +<RelativePath +path="docs-paths" +absolute="/Users/username/project/docs" +/> + +- ✅ **Do** + - Use relative path: + ```yml + docs-paths: ["docs"] + ``` + +- ❌ **Don't** + - Avoid absolute paths: + ```yml + docs-paths: ["/Users/username/project/docs"] + ``` + ## Example Use a subdirectory named `docs` for docs blocks: diff --git a/website/docs/reference/project-configs/macro-paths.md b/website/docs/reference/project-configs/macro-paths.md index 486ec08ffdf..d790899689e 100644 --- a/website/docs/reference/project-configs/macro-paths.md +++ b/website/docs/reference/project-configs/macro-paths.md @@ -16,7 +16,26 @@ macro-paths: [directorypath] Optionally specify a custom list of directories where [macros](/docs/build/jinja-macros#macros) are located. Note that you cannot co-locate models and macros. ## Default -By default, dbt will search for macros in a directory named `macros`, i.e. `macro-paths: ["macros"]` +By default, dbt will search for macros in a directory named `macros`. For example, `macro-paths: ["macros"]`. + +import RelativePath from '/snippets/_relative-path.md'; + +<RelativePath +path="macro-paths" +absolute="/Users/username/project/macros" +/> + +- ✅ **Do** + - Use relative path: + ```yml + macro-paths: ["macros"] + ``` + +- ❌ **Don't:** + - Avoid absolute paths: + ```yml + macro-paths: ["/Users/username/project/macros"] + ``` ## Examples ### Use a subdirectory named `custom_macros` instead of `macros` diff --git a/website/docs/reference/project-configs/model-paths.md b/website/docs/reference/project-configs/model-paths.md index a0652432787..44a40c33066 100644 --- a/website/docs/reference/project-configs/model-paths.md +++ b/website/docs/reference/project-configs/model-paths.md @@ -12,10 +12,29 @@ model-paths: [directorypath] </File> ## Definition -Optionally specify a custom list of directories where [models](/docs/build/models) and [sources](/docs/build/sources) are located. +Optionally specify a custom list of directories where [models](/docs/build/models), [sources](/docs/build/sources), and [unit tests](/docs/build/unit-tests) are located. ## Default -By default, dbt will search for models and sources in the `models` directory, i.e. `model-paths: ["models"]` +By default, dbt will search for models and sources in the `models` directory. For example, `model-paths: ["models"]`. 
+ +import RelativePath from '/snippets/_relative-path.md'; + +<RelativePath +path="model-paths" +absolute="/Users/username/project/models" +/> + +- ✅ **Do** + - Use relative path: + ```yml + model-paths: ["models"] + ``` + +- ❌ **Don't:** + - Avoid absolute paths: + ```yml + model-paths: ["/Users/username/project/models"] + ``` ## Examples ### Use a subdirectory named `transformations` instead of `models` diff --git a/website/docs/reference/project-configs/on-run-start-on-run-end.md b/website/docs/reference/project-configs/on-run-start-on-run-end.md index 74557839f11..347ce54ab63 100644 --- a/website/docs/reference/project-configs/on-run-start-on-run-end.md +++ b/website/docs/reference/project-configs/on-run-start-on-run-end.md @@ -27,8 +27,6 @@ A SQL statement (or list of SQL statements) to be run at the start or end of the ## Examples -<Snippet path="hooks-to-grants" /> - ### Grant privileges on all schemas that dbt uses at the end of a run This leverages the [schemas](/reference/dbt-jinja-functions/schemas) variable that is only available in an `on-run-end` hook. diff --git a/website/docs/reference/project-configs/query-comment.md b/website/docs/reference/project-configs/query-comment.md index 7e654350306..f7f9472e947 100644 --- a/website/docs/reference/project-configs/query-comment.md +++ b/website/docs/reference/project-configs/query-comment.md @@ -30,7 +30,7 @@ query-comment: </File> ## Definition -A string to inject as a comment in each query that dbt runs against your database. This comment can be used to attribute SQL statements to specific dbt resources like models and tests. +A string to inject as a comment in each query that dbt runs against your database. This comment can attribute SQL statements to specific dbt resources like models and tests. The `query-comment` configuration can also call a macro that returns a string. @@ -51,7 +51,7 @@ create view analytics.analytics.orders as ( ## Using the dictionary syntax The dictionary syntax includes two keys: - * `comment` (optional, see above for default): The string to be injected to a query as a comment. + * `comment` (optional, for more information, refer to the [default](#default) section): The string to be injected into a query as a comment. * `append` (optional, default=`false`): Whether a comment should be appended (added to the bottom of a query) or not (i.e. added to the top of a query). By default, comments are added to the top of queries (i.e. `append: false`). This syntax is useful on databases like Snowflake which [remove leading SQL comments](https://docs.snowflake.com/en/release-notes/2017-04.html#queries-leading-comments-removed-during-execution). @@ -275,4 +275,6 @@ The following context variables are available when generating a query comment: | var | See [var](/reference/dbt-jinja-functions/var) | | target | See [target](/reference/dbt-jinja-functions/target) | | connection_name | A string representing the internal name for the connection. This string is generated by dbt. | -| node | A dictionary representation of the parsed node object. Use `node.unique_id`, `node.database`, `node.schema`, etc | +| node | A dictionary representation of the parsed node object. Use `node.unique_id`, `node.database`, `node.schema`, and so on. | + +Note: The `var()` function in `query-comment` macros only access variables passed through the `--vars` argument in the CLI. Variables defined in the vars block of your `dbt_project.yml` are not accessible when generating query comments. 
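+As a sketch of that constraint (the macro and variable names here are hypothetical), a query-comment macro can read a CLI variable with a fallback default:
+
+<File name='macros/query_comment.sql'>
+
+```sql
+{% macro query_comment(node) %}
+    {#- "orchestrator" must be passed with --vars; a vars: block in dbt_project.yml is not visible here -#}
+    run by {{ var("orchestrator", "manual") }} on target {{ target.name }}
+{% endmacro %}
+```
+
+</File>
+
+Wire it up with `query-comment: "{{ query_comment(node) }}"` in `dbt_project.yml`, then pass the variable as `dbt run --vars '{"orchestrator": "airflow"}'`.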
diff --git a/website/docs/reference/project-configs/require-dbt-version.md b/website/docs/reference/project-configs/require-dbt-version.md index 97b42e036ec..f659370af4e 100644 --- a/website/docs/reference/project-configs/require-dbt-version.md +++ b/website/docs/reference/project-configs/require-dbt-version.md @@ -22,7 +22,7 @@ When you set this configuration, dbt sends a helpful error message for any user If this configuration is not specified, no version check will occur. -:::info Versionless +:::info dbt Cloud release tracks <Snippet path="_config-dbt-version-check" /> diff --git a/website/docs/reference/project-configs/seed-paths.md b/website/docs/reference/project-configs/seed-paths.md index 614bda62cd2..53e2902cae0 100644 --- a/website/docs/reference/project-configs/seed-paths.md +++ b/website/docs/reference/project-configs/seed-paths.md @@ -16,10 +16,29 @@ Optionally specify a custom list of directories where [seed](/docs/build/seeds) ## Default -By default, dbt expects seeds to be located in the `seeds` directory, i.e. `seed-paths: ["seeds"]` +By default, dbt expects seeds to be located in the `seeds` directory. For example, `seed-paths: ["seeds"]`. + +import RelativePath from '/snippets/_relative-path.md'; + +<RelativePath +path="seed-paths" +absolute="/Users/username/project/seed" +/> + +- ✅ **Do** + - Use relative path: + ```yml + seed-paths: ["seed"] + ``` + +- ❌ **Don't:** + - Avoid absolute paths: + ```yml + seed-paths: ["/Users/username/project/seed"] + ``` ## Examples -### Use a subdirectory named `custom_seeds` instead of `seeds` +### Use a directory named `custom_seeds` instead of `seeds` <File name='dbt_project.yml'> diff --git a/website/docs/reference/project-configs/snapshot-paths.md b/website/docs/reference/project-configs/snapshot-paths.md index 8319833f1e6..a13697fc705 100644 --- a/website/docs/reference/project-configs/snapshot-paths.md +++ b/website/docs/reference/project-configs/snapshot-paths.md @@ -16,15 +16,35 @@ snapshot-paths: [directorypath] Optionally specify a custom list of directories where [snapshots](/docs/build/snapshots) are located. <VersionBlock firstVersion="1.9"> -In [Versionless](/docs/dbt-versions/versionless-cloud) and on dbt v1.9 and higher, you can co-locate your snapshots with models if they are [defined using the latest YAML syntax](/docs/build/snapshots). +In dbt Core v1.9+, you can co-locate your snapshots with models if they are [defined using the latest YAML syntax](/docs/build/snapshots). </VersionBlock> <VersionBlock lastVersion="1.8"> -Note that you cannot co-locate models and snapshots. However, in [Versionless](/docs/dbt-versions/versionless-cloud) and on dbt v1.9 and higher, you can co-locate your snapshots with models if they are [defined using the latest YAML syntax](/docs/build/snapshots). +Note that you cannot co-locate models and snapshots. However, in dbt Core v1.9+, you can co-locate your snapshots with models if they are [defined using the latest YAML syntax](/docs/build/snapshots). </VersionBlock> ## Default -By default, dbt will search for snapshots in the `snapshots` directory, i.e. `snapshot-paths: ["snapshots"]` +By default, dbt will search for snapshots in the `snapshots` directory. For example, `snapshot-paths: ["snapshots"]`. 
+ + +import RelativePath from '/snippets/_relative-path.md'; + +<RelativePath +path="snapshot-paths" +absolute="/Users/username/project/snapshots" +/> + +- ✅ **Do** + - Use relative path: + ```yml + snapshot-paths: ["snapshots"] + ``` + +- ❌ **Don't:** + - Avoid absolute paths: + ```yml + snapshot-paths: ["/Users/username/project/snapshots"] + ``` ## Examples ### Use a subdirectory named `archives` instead of `snapshots` diff --git a/website/docs/reference/project-configs/test-paths.md b/website/docs/reference/project-configs/test-paths.md index 6749a07d23d..ab816eec973 100644 --- a/website/docs/reference/project-configs/test-paths.md +++ b/website/docs/reference/project-configs/test-paths.md @@ -21,6 +21,25 @@ Without specifying this config, dbt will search for tests in the `tests` directo - Generic test definitions in the `tests/generic` subdirectory - Singular tests (all other files) +import RelativePath from '/snippets/_relative-path.md'; + +<RelativePath +path="test-paths" +absolute="/Users/username/project/test" +/> + +- ✅ **Do** + - Use relative path: + ```yml + test-paths: ["test"] + ``` + +- ❌ **Don't:** + - Avoid absolute paths: + ```yml + test-paths: ["/Users/username/project/test"] + ``` + ## Examples ### Use a subdirectory named `custom_tests` instead of `tests` for data tests diff --git a/website/docs/reference/resource-configs/alias.md b/website/docs/reference/resource-configs/alias.md index 3f36bbd0d8f..5beaa238806 100644 --- a/website/docs/reference/resource-configs/alias.md +++ b/website/docs/reference/resource-configs/alias.md @@ -8,9 +8,11 @@ datatype: string <Tabs> <TabItem value="model" label="Models"> -Specify a custom alias for a model in your `dbt_project.yml` file or config block. +Specify a custom alias for a model in your `dbt_project.yml` file, `models/properties.yml` file, or config block in a SQL file. -For example, if you have a model that calculates `sales_total` and want to give it a more user-friendly alias, you can alias it like this: +For example, if you have a model that calculates `sales_total` and want to give it a more user-friendly alias, you can alias it as shown in the following examples. + +In the `dbt_project.yml` file, the following example sets a default `alias` for the `sales_total` model at the project level: <File name='dbt_project.yml'> @@ -22,16 +24,40 @@ models: ``` </File> +The following specifies an `alias` as part of the `models/properties.yml` file metadata, useful for centralized configuration: + +<File name='models/properties.yml'> + +```yml +version: 2 + +models: + - name: sales_total + config: + alias: sales_dashboard +``` +</File> + +The following assigns the `alias` directly in the In `models/sales_total.sql` file: + +<File name='models/sales_total.sql'> + +```sql +{{ config( + alias="sales_dashboard" +) }} +``` +</File> + This would return `analytics.finance.sales_dashboard` in the database, instead of the default `analytics.finance.sales_total`. </TabItem> <TabItem value="seeds" label="Seeds"> +Configure a seed's alias in your `dbt_project.yml` file or a `properties.yml` file. The following examples demonstrate how to `alias` a seed named `product_categories` to `categories_data`. -Configure a seed's alias in your `dbt_project.yml` file or config block. 
- -For example, if you have a seed that represents `product_categories` and want to alias it as `categories_data`, you would alias like this: +In the `dbt_project.yml` file at the project level: <File name='dbt_project.yml'> @@ -41,6 +67,21 @@ seeds: product_categories: +alias: categories_data ``` +</File> + +In the `seeds/properties.yml` file: + +<File name='seeds/properties.yml'> + +```yml +version: 2 + +seeds: + - name: product_categories + config: + alias: categories_data +``` +</File> This would return the name `analytics.finance.categories_data` in the database. @@ -55,9 +96,6 @@ seeds: +alias: country_mappings ``` - -</File> - </File> </TabItem> @@ -65,7 +103,9 @@ seeds: Configure a snapshots's alias in your `dbt_project.yml` file or config block. -For example, if you have a snapshot that is named `your_snapshot` and want to alias it as `the_best_snapshot`, you would alias like this: +The following examples demonstrate how to `alias` a snapshot named `your_snapshot` to `the_best_snapshot`. + +In the `dbt_project.yml` file at the project level: <File name='dbt_project.yml'> @@ -75,20 +115,57 @@ snapshots: your_snapshot: +alias: the_best_snapshot ``` +</File> -This would build your snapshot to `analytics.finance.the_best_snapshot` in the database. +In the `snapshots/properties.yml` file: + +<File name='snapshots/properties.yml'> + +```yml +version: 2 + +snapshots: + - name: your_snapshot + config: + alias: the_best_snapshot +``` +</File> + +In the `snapshots/your_snapshot.sql` file: +<File name='snapshots/your_snapshot.sql'> + +```sql +{{ config( + alias="the_best_snapshot" +) }} +``` </File> +This would build your snapshot to `analytics.finance.the_best_snapshot` in the database. + </TabItem> <TabItem value="test" label="Tests"> -Configure a test's alias in your `schema.yml` file or config block. +Configure a data test's alias in your `dbt_project.yml` file, `properties.yml` file, or config block in the model file. + +The following examples demonstrate how to give a `unique` data test on the `order_id` column the alias `unique_order_id_test` so you can identify that specific test. -For example, to add a unique test to the `order_id` column and give it an alias `unique_order_id_test` to identify this specific test, you would alias like this: +In the `dbt_project.yml` file at the project level: -<File name='schema.yml'> +<File name='dbt_project.yml'> + +```yml +tests: + your_project: + +alias: unique_order_id_test +``` +</File> + +In the `models/properties.yml` file: + +<File name='models/properties.yml'> ```yml models: @@ -99,10 +176,22 @@ models: - unique: alias: unique_order_id_test ``` +</File> -When using `--store-failures`, this would return the name `analytics.finance.orders_order_id_unique_order_id_test` in the database. +In the `tests/unique_order_id_test.sql` file: +<File name='tests/unique_order_id_test.sql'> + +```sql +{{ config( + alias="unique_order_id_test", + severity="error" +) }} +``` </File> + +When using [`store_failures_as`](/reference/resource-configs/store_failures_as), this would return the name `analytics.finance.orders_order_id_unique_order_id_test` in the database.
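+As a minimal sketch (the model and column names are hypothetical), `alias` can be combined with `store_failures_as` so failing rows land in a predictably named table:
+
+<File name='models/properties.yml'>
+
+```yml
+models:
+  - name: orders
+    columns:
+      - name: order_id
+        data_tests:
+          - unique:
+              alias: unique_order_id_test
+              store_failures_as: table
+```
+
+</File>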
+ + </TabItem> </Tabs> diff --git a/website/docs/reference/resource-configs/athena-configs.md b/website/docs/reference/resource-configs/athena-configs.md index f871ede9fab..fd5bc663ee7 100644 --- a/website/docs/reference/resource-configs/athena-configs.md +++ b/website/docs/reference/resource-configs/athena-configs.md @@ -109,7 +109,7 @@ lf_grants={ There are some limitations and recommendations that should be considered: - `lf_tags` and `lf_tags_columns` configs support only attaching lf tags to corresponding resources. -- We recommend managing LF Tags permissions somewhere outside dbt. For example, [terraform](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lakeformation_permissions) or [aws cdk](https://docs.aws.amazon.com/cdk/api/v1/docs/aws-lakeformation-readme.html). +- We recommend managing LF Tags permissions somewhere outside dbt. For example, [terraform](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lakeformation_permissions) or [aws cdk](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_lakeformation-readme.html). - `data_cell_filters` management can't be automated outside dbt because the filter can't be attached to the table, which doesn't exist. Once you `enable` this config, dbt will set all filters and their permissions during every dbt run. Such an approach keeps the actual state of row-level security configuration after every dbt run and applies changes if they occur: drop, create, and update filters and their permissions. - Any tags listed in `lf_inherited_tags` should be strictly inherited from the database level and never overridden at the table and column level. - Currently, `dbt-athena` does not differentiate between an inherited tag association and an override it made previously. diff --git a/website/docs/reference/resource-configs/batch_size.md b/website/docs/reference/resource-configs/batch_size.md new file mode 100644 index 00000000000..4001545778a --- /dev/null +++ b/website/docs/reference/resource-configs/batch_size.md @@ -0,0 +1,56 @@ +--- +title: "batch_size" +id: "batch-size" +sidebar_label: "batch_size" +resource_types: [models] +description: "dbt uses `batch_size` to determine how large batches are when running a microbatch incremental model." +datatype: hour | day | month | year +--- + +Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher. + +## Definition + +The`batch_size` config determines how large batches are when running a microbatch. Accepted values are `hour`, `day`, `month`, or `year`. You can configure `batch_size` for a [model](/docs/build/models) in your `dbt_project.yml` file, property YAML file, or config block. + +## Examples + +The following examples set `day` as the `batch_size` for the `user_sessions` model. 
+ +Example of the `batch_size` config in the `dbt_project.yml` file: + +<File name='dbt_project.yml'> + +```yml +models: + my_project: + user_sessions: + +batch_size: day +``` +</File> + +Example in a properties YAML file: + +<File name='models/properties.yml'> + +```yml +models: + - name: user_sessions + config: + batch_size: day +``` + +</File> + +Example in a SQL model config block: + +<File name="models/user_sessions.sql"> + +```sql +{{ config( + batch_size='day' +) }} +``` + +</File> + diff --git a/website/docs/reference/resource-configs/begin.md new file mode 100644 index 00000000000..dd47419be21 --- /dev/null +++ b/website/docs/reference/resource-configs/begin.md @@ -0,0 +1,55 @@ +--- +title: "begin" +id: "begin" +sidebar_label: "begin" +resource_types: [models] +description: "dbt uses `begin` to determine when a microbatch incremental model should begin from. When defined on a microbatch incremental model, `begin` is used as the lower time bound when the model is built for the first time or fully refreshed." +datatype: string +--- + +Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher. + +## Definition + +Set the `begin` config to the timestamp value at which your microbatch model data should begin — at the point the data becomes relevant for the microbatch model. You can configure `begin` for a [model](/docs/build/models) in your `dbt_project.yml` file, property YAML file, or config block. The value for `begin` must be a string representing an ISO formatted date OR date and time. + +## Examples + +The following examples set `2024-01-01 00:00:00` as the `begin` config for the `user_sessions` model. + +Example in the `dbt_project.yml` file: + +<File name='dbt_project.yml'> + +```yml +models: + my_project: + user_sessions: + +begin: "2024-01-01 00:00:00" +``` +</File> + +Example in a properties YAML file: + +<File name='models/properties.yml'> + +```yml +models: + - name: user_sessions + config: + begin: "2024-01-01 00:00:00" +``` + +</File> + +Example in a SQL model config block: + +<File name="models/user_sessions.sql"> + +```sql +{{ config( + begin='2024-01-01 00:00:00' +) }} +``` + +</File> diff --git a/website/docs/reference/resource-configs/bigquery-configs.md index 9dd39c936b6..c912bca0688 100644 --- a/website/docs/reference/resource-configs/bigquery-configs.md +++ b/website/docs/reference/resource-configs/bigquery-configs.md @@ -425,9 +425,10 @@ Please note that in order for policy tags to take effect, [column-level `persist The [`incremental_strategy` config](/docs/build/incremental-strategy) controls how dbt builds incremental models. dbt uses a [merge statement](https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax) on BigQuery to refresh incremental tables. -The `incremental_strategy` config can be set to one of two values: - - `merge` (default) - - `insert_overwrite` +The `incremental_strategy` config can be set to one of the following values: +- `merge` (default) +- `insert_overwrite` +- [`microbatch`](/docs/build/incremental-microbatch) ### Performance and cost @@ -561,7 +562,7 @@ If no `partitions` configuration is provided, dbt will instead: 3. Query the destination table to find the _max_ partition in the database When building your model SQL, you can take advantage of the introspection performed -by dbt to filter for only _new_ data.
The max partition in the destination table +by dbt to filter for only _new_ data. The maximum value in the partitioned field in the destination table will be available using the `_dbt_max_partition` BigQuery scripting variable. **Note:** this is a BigQuery SQL variable, not a dbt Jinja variable, so no jinja brackets are required to access this variable. @@ -908,3 +909,10 @@ By default, this is set to `True` to support the default `intermediate_format` o ### The `intermediate_format` parameter The `intermediate_format` parameter specifies which file format to use when writing records to a table. The default is `parquet`. +<VersionBlock firstVersion="1.8"> + +## Unit test limitations + +You must specify all fields in a BigQuery `STRUCT` for [unit tests](/docs/build/unit-tests). You cannot use only a subset of fields in a `STRUCT`. + +</VersionBlock> diff --git a/website/docs/reference/resource-configs/contract.md b/website/docs/reference/resource-configs/contract.md index fb25076b0d9..18266ec672f 100644 --- a/website/docs/reference/resource-configs/contract.md +++ b/website/docs/reference/resource-configs/contract.md @@ -14,6 +14,13 @@ When the `contract` configuration is enforced, dbt will ensure that your model's This is to ensure that the people querying your model downstream—both inside and outside dbt—have a predictable and consistent set of columns to use in their analyses. Even a subtle change in data type, such as from `boolean` (`true`/`false`) to `integer` (`0`/`1`), could cause queries to fail in surprising ways. +## Support + +At present, model contracts are supported for: +- SQL models (not yet Python) +- Models materialized as `table`, `view`, and `incremental` (with `on_schema_change: append_new_columns` or `on_schema_change: fail`) +- The most popular data platforms — though support and enforcement of different [constraint types](/reference/resource-properties/constraints) vary by platform + ## Data type aliasing dbt uses built-in type aliasing for the `data_type` defined in your YAML. For example, you can specify `string` in your contract, and on Postgres/Redshift, dbt will convert it to `text`. If dbt doesn't recognize the `data_type` name among its known aliases, it will pass it through as-is. This is enabled by default, but you can opt-out by setting `alias_types` to `false`. @@ -91,12 +98,6 @@ When you `dbt run` your model, _before_ dbt has materialized it as a table in th 20:53:45 > in macro assert_columns_equivalent (macros/materializations/models/table/columns_spec_ddl.sql) ``` -## Support - -At present, model contracts are supported for: -- SQL models (not yet Python) -- Models materialized as `table`, `view`, and `incremental` (with `on_schema_change: append_new_columns`) -- The most popular data platforms — though support and enforcement of different [constraint types](/reference/resource-properties/constraints) vary by platform ### Incremental models and `on_schema_change` diff --git a/website/docs/reference/resource-configs/database.md b/website/docs/reference/resource-configs/database.md index 338159b30dc..6c57e7e2c69 100644 --- a/website/docs/reference/resource-configs/database.md +++ b/website/docs/reference/resource-configs/database.md @@ -49,7 +49,7 @@ This would result in the generated relation being located in the `staging` datab <VersionBlock lastVersion="1.8"> -Available for versionless dbt Cloud or dbt Core v1.9+. Select v1.9 or newer from the version dropdown to view the configs. +Available for dbt Cloud release tracks or dbt Core v1.9+. 
Select v1.9 or newer from the version dropdown to view the configs. </VersionBlock> @@ -79,22 +79,19 @@ This results in the generated relation being located in the `snapshots` database <TabItem value="test" label="Tests"> -Configure a database in your `dbt_project.yml` file. +Customize the database for storing test results in your `dbt_project.yml` file. -For example, to load a test into a database called `reporting` instead of the target database, you can configure it like this: +For example, to save test results in a specific database, you can configure it like this: <File name='dbt_project.yml'> ```yml tests: - - my_not_null_test: - column_name: order_id - type: not_null - +database: reporting + +store_failures: true + +database: test_results ``` -This would result in the generated relation being located in the `reporting` database, so the full relation name would be `reporting.finance.my_not_null_test`. - +This would result in the test results being stored in the `test_results` database. </File> </TabItem> </Tabs> diff --git a/website/docs/reference/resource-configs/databricks-configs.md b/website/docs/reference/resource-configs/databricks-configs.md index c77f3494aa7..6ac3e23c113 100644 --- a/website/docs/reference/resource-configs/databricks-configs.md +++ b/website/docs/reference/resource-configs/databricks-configs.md @@ -51,7 +51,7 @@ We do not yet have a PySpark API to set tblproperties at table creation, so this <VersionBlock firstVersion="1.9"> -dbt Core v.9 and Versionless dbt Cloud support for `table_format: iceberg`, in addition to all previous table configurations supported in 1.8. +dbt-databricks v1.9 adds support for the `table_format: iceberg` config. Try it now on the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks). All other table configurations were also supported in 1.8. | Option | Description | Required? | Model Support | Example | |---------------------|-----------------------------|-------------------------------------------|-----------------|--------------------------| @@ -76,7 +76,7 @@ dbt Core v.9 and Versionless dbt Cloud support for `table_format: iceberg`, in a ### Python submission methods -In dbt v1.9 and higher, or in [Versionless](/docs/dbt-versions/versionless-cloud) dbt Cloud, you can use these four options for `submission_method`: +In dbt-databricks v1.9 (try it now in [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks)), you can use these four options for `submission_method`: * `all_purpose_cluster`: Executes the python model either directly using the [command api](https://docs.databricks.com/api/workspace/commandexecution) or by uploading a notebook and creating a one-off job run * `job_cluster`: Creates a new job cluster to execute an uploaded notebook as a one-off job run diff --git a/website/docs/reference/resource-configs/dbt_valid_to_current.md b/website/docs/reference/resource-configs/dbt_valid_to_current.md index 7c0e33aa5d7..2a6cf3abe6d 100644 --- a/website/docs/reference/resource-configs/dbt_valid_to_current.md +++ b/website/docs/reference/resource-configs/dbt_valid_to_current.md @@ -6,7 +6,7 @@ default_value: {NULL} id: "dbt_valid_to_current" --- -Available from dbt v1.9 or with [Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) dbt Cloud. +Available from dbt v1.9 or with [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) dbt Cloud. 
<File name='snapshots/schema.yml'> diff --git a/website/docs/reference/resource-configs/event-time.md b/website/docs/reference/resource-configs/event-time.md index d8c0c0e0472..c18c8de6397 100644 --- a/website/docs/reference/resource-configs/event-time.md +++ b/website/docs/reference/resource-configs/event-time.md @@ -7,7 +7,7 @@ description: "dbt uses event_time to understand when an event occurred. When def datatype: string --- -Available in dbt Cloud Versionless and dbt Core v1.9 and higher. +Available in [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher. <Tabs> <TabItem value="model" label="Models"> diff --git a/website/docs/reference/resource-configs/hard-deletes.md b/website/docs/reference/resource-configs/hard-deletes.md new file mode 100644 index 00000000000..50c8046f4e1 --- /dev/null +++ b/website/docs/reference/resource-configs/hard-deletes.md @@ -0,0 +1,111 @@ +--- +title: hard_deletes +resource_types: [snapshots] +description: "Use the `hard_deletes` config to control how deleted rows are tracked in your snapshot table." +datatype: "boolean" +default_value: {ignore} +id: "hard-deletes" +sidebar_label: "hard_deletes" +--- + +Available from dbt v1.9 or with [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks). + + +<File name='snapshots/schema.yml'> + +```yaml +snapshots: + - name: <snapshot_name> + config: + hard_deletes: 'ignore' | 'invalidate' | 'new_record' +``` +</File> + +<File name='dbt_project.yml'> + +```yml +snapshots: + [<resource-path>](/reference/resource-configs/resource-path): + +hard_deletes: "ignore" | "invalidate" | "new_record" +``` + +</File> + +<File name='snapshots/<filename>.sql'> + +```sql +{{ + config( + unique_key='id', + strategy='timestamp', + updated_at='updated_at', + hard_deletes='ignore' | 'invalidate' | 'new_record' + ) +}} +``` + +</File> + + +## Description + +The `hard_deletes` config gives you more control on how to handle deleted rows from the source. Supported options are `ignore` (default), `invalidate` (replaces the legacy `invalidate_hard_deletes=true`), and `new_record`. Note that `new_record` will create a new metadata column in the snapshot table. + +import HardDeletes from '/snippets/_hard-deletes.md'; + +<HardDeletes /> + +:::warning + +If you're updating an existing snapshot to use the `hard_deletes` config, dbt _will not_ handle migrations automatically. We recommend either only using these settings for net-new snapshots, or [arranging an update](/reference/snapshot-configs#snapshot-configuration-migration) of pre-existing tables before enabling this setting. +::: + +## Default + +By default, if you don’t specify `hard_deletes`, it'll automatically default to `ignore`. Deleted rows will not be tracked and their `dbt_valid_to` column remains `NULL`. + +The `hard_deletes` config has three methods: + +| Methods | Description | +| --------- | ----------- | +| `ignore` (default) | No action for deleted records. | +| `invalidate` | Behaves the same as the existing `invalidate_hard_deletes=true`, where deleted records are invalidated by setting `dbt_valid_to` to current time. This method replaces the `invalidate_hard_deletes` config to give you more control on how to handle deleted rows from the source. 
| +| `new_record` | Tracks deleted records as new rows using the `dbt_is_deleted` meta field when records are deleted.| + +## Considerations +- **Backward compatibility**: The `invalidate_hard_deletes` config is still supported for existing snapshots but can't be used alongside `hard_deletes`. +- **New snapshots**: For new snapshots, we recommend using `hard_deletes` instead of `invalidate_hard_deletes`. +- **Migration**: If you switch an existing snapshot to use `hard_deletes` without migrating your data, you may encounter inconsistent or incorrect results, such as a mix of old and new data formats. + +## Example + +<File name='snapshots/schema.yml'> + +```yaml +snapshots: + - name: my_snapshot + config: + hard_deletes: new_record # options are: 'ignore', 'invalidate', or 'new_record' + strategy: timestamp + updated_at: updated_at + columns: + - name: dbt_valid_from + description: Timestamp when the record became valid. + - name: dbt_valid_to + description: Timestamp when the record stopped being valid. + - name: dbt_is_deleted + description: Indicates whether the record was deleted. +``` + +</File> + +The resulting snapshot table contains the `hard_deletes: new_record` configuration. If a record is deleted and later restored, the resulting snapshot table might look like this: + +| id | dbt_scd_id | Status | dbt_updated_at | dbt_valid_from | dbt_valid_to | dbt_is_deleted | +| -- | -------------------- | ----- | -------------------- | --------------------| -------------------- | ----------- | +| 1 | 60a1f1dbdf899a4dd... | pending | 2024-10-02 ... | 2024-05-19... | 2024-05-20 ... | False | +| 1 | b1885d098f8bcff51... | pending | 2024-10-02 ... | 2024-05-20 ... | 2024-06-03 ... | True | +| 1 | b1885d098f8bcff53... | shipped | 2024-10-02 ... | 2024-06-03 ... | | False | +| 2 | b1885d098f8bcff55... | active | 2024-10-02 ... | 2024-05-19 ... | | False | + +In this example, the `dbt_is_deleted` column is set to `True` when the record is deleted. When the record is restored, the `dbt_is_deleted` column is set to `False`. diff --git a/website/docs/reference/resource-configs/invalidate_hard_deletes.md b/website/docs/reference/resource-configs/invalidate_hard_deletes.md index bdaec7e33a9..67123487fa1 100644 --- a/website/docs/reference/resource-configs/invalidate_hard_deletes.md +++ b/website/docs/reference/resource-configs/invalidate_hard_deletes.md @@ -1,9 +1,17 @@ --- +title: invalidate_hard_deletes (legacy) resource_types: [snapshots] description: "Invalidate_hard_deletes - Read this in-depth guide to learn about configurations in dbt." datatype: column_name +sidebar_label: invalidate_hard_deletes (legacy) --- +:::warning This is a legacy config — Use the [`hard_deletes`](/reference/resource-configs/hard-deletes) config instead. + +In Versionless and dbt Core 1.9 and higher, the [`hard_deletes`](/reference/resource-configs/hard-deletes) config replaces the `invalidate_hard_deletes` config for better control over how to handle deleted rows from the source. + +For new snapshots, set the config to `hard_deletes='invalidate'` instead of `invalidate_hard_deletes=true`. For existing snapshots, [arrange an update](/reference/snapshot-configs#snapshot-configuration-migration) of pre-existing tables before enabling this setting. 
+::: <VersionBlock firstVersion="1.9"> diff --git a/website/docs/reference/resource-configs/lookback.md new file mode 100644 index 00000000000..037ffdeb68f --- /dev/null +++ b/website/docs/reference/resource-configs/lookback.md @@ -0,0 +1,55 @@ +--- +title: "lookback" +id: "lookback" +sidebar_label: "lookback" +resource_types: [models] +description: "dbt uses `lookback` to determine how many 'batches' of `batch_size` to reprocess when a microbatch incremental model is running incrementally." +datatype: int +--- + +Available in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9 and higher. + +## Definition + +Set the `lookback` config to an integer greater than or equal to zero. The default value is `1`. You can configure `lookback` for a [model](/docs/build/models) in your `dbt_project.yml` file, property YAML file, or config block. + +## Examples + +The following examples set `2` as the `lookback` config for the `user_sessions` model. + +Example in the `dbt_project.yml` file: + +<File name='dbt_project.yml'> + +```yml +models: + my_project: + user_sessions: + +lookback: 2 +``` +</File> + +Example in a properties YAML file: + +<File name='models/properties.yml'> + +```yml +models: + - name: user_sessions + config: + lookback: 2 +``` + +</File> + +Example in a SQL model config block: + +<File name="models/user_sessions.sql"> + +```sql +{{ config( + lookback=2 +) }} +``` + +</File> diff --git a/website/docs/reference/resource-configs/no-configs.md index 5eec26917c8..f72b286c837 100644 --- a/website/docs/reference/resource-configs/no-configs.md +++ b/website/docs/reference/resource-configs/no-configs.md @@ -1,11 +1,12 @@ --- -title: "No specifc configurations for this Adapter" +title: "No specific configurations for this adapter" id: "no-configs" --- If you were guided to this page from a data platform setup article, it most likely means: - Setting up the profile is the only action the end-user needs to take on the data platform, or -- The subsequent actions the end-user needs to take are not currently documented +- The subsequent actions the end-user needs to take are not currently documented, or +- Relevant information is provided on the documentation pages of the data platform vendor.
If you'd like to contribute to data platform-specific configuration information, refer to [Documenting a new adapter](/guides/adapter-creation) diff --git a/website/docs/reference/resource-configs/postgres-configs.md b/website/docs/reference/resource-configs/postgres-configs.md index f2bf90a93c0..e71c6f1484d 100644 --- a/website/docs/reference/resource-configs/postgres-configs.md +++ b/website/docs/reference/resource-configs/postgres-configs.md @@ -11,6 +11,7 @@ In dbt-postgres, the following incremental materialization strategies are suppor - `append` (default when `unique_key` is not defined) - `merge` - `delete+insert` (default when `unique_key` is defined) +- [`microbatch`](/docs/build/incremental-microbatch) ## Performance optimizations diff --git a/website/docs/reference/resource-configs/pre-hook-post-hook.md b/website/docs/reference/resource-configs/pre-hook-post-hook.md index bd01a7be840..ee3c81b0fd6 100644 --- a/website/docs/reference/resource-configs/pre-hook-post-hook.md +++ b/website/docs/reference/resource-configs/pre-hook-post-hook.md @@ -160,8 +160,6 @@ import SQLCompilationError from '/snippets/_render-method.md'; ## Examples -<Snippet path="hooks-to-grants" /> - ### [Redshift] Unload one model to S3 <File name='model.sql'> diff --git a/website/docs/reference/resource-configs/redshift-configs.md b/website/docs/reference/resource-configs/redshift-configs.md index b033cd6267e..01c9bffd055 100644 --- a/website/docs/reference/resource-configs/redshift-configs.md +++ b/website/docs/reference/resource-configs/redshift-configs.md @@ -17,6 +17,7 @@ In dbt-redshift, the following incremental materialization strategies are suppor - `append` (default when `unique_key` is not defined) - `merge` - `delete+insert` (default when `unique_key` is defined) +- [`microbatch`](/docs/build/incremental-microbatch) All of these strategies are inherited from dbt-postgres. diff --git a/website/docs/reference/resource-configs/schema.md b/website/docs/reference/resource-configs/schema.md index 1e2ff47729c..6f56215de61 100644 --- a/website/docs/reference/resource-configs/schema.md +++ b/website/docs/reference/resource-configs/schema.md @@ -50,7 +50,7 @@ This would result in the generated relation being located in the `mappings` sche <VersionBlock lastVersion="1.8"> -Available for versionless dbt Cloud or dbt Core v1.9+. Select v1.9 or newer from the version dropdown to view the configs. +Available in dbt Core v1.9+. Select v1.9 or newer from the version dropdown to view the configs. Try it now in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks). </VersionBlock> @@ -108,7 +108,9 @@ This would result in the test results being stored in the `test_results` schema. Refer to [Usage](#usage) for more examples. ## Definition -Optionally specify a custom schema for a [model](/docs/build/sql-models) or [seed](/docs/build/seeds). (To specify a schema for a [snapshot](/docs/build/snapshots), use the [`target_schema` config](/reference/resource-configs/target_schema)). +Optionally specify a custom schema for a [model](/docs/build/sql-models), [seed](/docs/build/seeds), [snapshot](/docs/build/snapshots), [saved query](/docs/build/saved-queries), or [test](/docs/build/data-tests). + +For users on dbt Cloud v1.8 or earlier, use the [`target_schema` config](/reference/resource-configs/target_schema) to specify a custom schema for a snapshot. 
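+For example, a minimal sketch (the project name is hypothetical) of the v1.9+ approach for snapshots:
+
+<File name='dbt_project.yml'>
+
+```yml
+snapshots:
+  my_project:
+    +schema: snapshots
+```
+
+</File>
+
+With the default `generate_schema_name` macro, these snapshots build into `<target_schema>_snapshots` rather than replacing the target schema outright.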
When dbt creates a relation (<Term id="table" />/<Term id="view" />) in a database, it creates it as: `{{ database }}.{{ schema }}.{{ identifier }}`, e.g. `analytics.finance.payments` diff --git a/website/docs/reference/resource-configs/snapshot_meta_column_names.md b/website/docs/reference/resource-configs/snapshot_meta_column_names.md index 46aba7886d0..f1d29ba8bee 100644 --- a/website/docs/reference/resource-configs/snapshot_meta_column_names.md +++ b/website/docs/reference/resource-configs/snapshot_meta_column_names.md @@ -6,7 +6,7 @@ default_value: {"dbt_valid_from": "dbt_valid_from", "dbt_valid_to": "dbt_valid_t id: "snapshot_meta_column_names" --- -Starting in 1.9 or with [versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) dbt Cloud. +Available in dbt Core v1.9+. Select v1.9 or newer from the version dropdown to view the configs. Try it now in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks). <File name='snapshots/schema.yml'> @@ -19,6 +19,7 @@ snapshots: dbt_valid_to: <string> dbt_scd_id: <string> dbt_updated_at: <string> + dbt_is_deleted: <boolean> ``` @@ -34,6 +35,7 @@ snapshots: "dbt_valid_to": "<string>", "dbt_scd_id": "<string>", "dbt_updated_at": "<string>", + "dbt_is_deleted": "<boolean>", } ) }} @@ -52,7 +54,7 @@ snapshots: dbt_valid_to: <string> dbt_scd_id: <string> dbt_updated_at: <string> - + dbt_is_deleted: <boolean> ``` </File> @@ -71,6 +73,7 @@ By default, dbt snapshots use the following column names to track change history | `dbt_valid_to` | The timestamp when this row is no longer valid. | | | `dbt_scd_id` | A unique key generated for each snapshot row. | This is used internally by dbt. | | `dbt_updated_at` | The `updated_at` timestamp of the source record when this snapshot row was inserted. | This is used internally by dbt. | +| `dbt_is_deleted` | A boolean value indicating if the record has been deleted. `True` if deleted, `False` otherwise. | Added when `hard_deletes='new_record'` is configured. | However, these column names can be customized using the `snapshot_meta_column_names` config. @@ -92,18 +95,21 @@ snapshots: unique_key: id strategy: check check_cols: all + hard_deletes: new_record snapshot_meta_column_names: dbt_valid_from: start_date dbt_valid_to: end_date dbt_scd_id: scd_id dbt_updated_at: modified_date + dbt_is_deleted: is_deleted ``` </File> The resulting snapshot table contains the configured meta column names: -| id | scd_id | modified_date | start_date | end_date | -| -- | -------------------- | -------------------- | -------------------- | -------------------- | -| 1 | 60a1f1dbdf899a4dd... | 2024-10-02 ... | 2024-10-02 ... | 2024-10-02 ... | -| 2 | b1885d098f8bcff51... | 2024-10-02 ... | 2024-10-02 ... | | +| id | scd_id | modified_date | start_date | end_date | is_deleted | +| -- | -------------------- | -------------------- | -------------------- | -------------------- | ---------- | +| 1 | 60a1f1dbdf899a4dd... | 2024-10-02 ... | 2024-10-02 ... | 2024-10-03 ... | False | +| 1 | 60a1f1dbdf899a4dd... | 2024-10-03 ... | 2024-10-03 ... | | True | +| 2 | b1885d098f8bcff51... | 2024-10-02 ... | 2024-10-02 ... | | False | diff --git a/website/docs/reference/resource-configs/snowflake-configs.md b/website/docs/reference/resource-configs/snowflake-configs.md index 7bef180e3d3..d576b195b65 100644 --- a/website/docs/reference/resource-configs/snowflake-configs.md +++ b/website/docs/reference/resource-configs/snowflake-configs.md @@ -38,11 +38,11 @@ flags: The following configurations are supported. 
For more information, check out the Snowflake reference for [`CREATE ICEBERG TABLE` (Snowflake as the catalog)](https://docs.snowflake.com/en/sql-reference/sql/create-iceberg-table-snowflake). -| Field | Type | Required | Description | Sample input | Note | -| --------------------- | ------ | -------- | -------------------------------------------------------------------------------------------------------------------------- | ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Table Format | String | Yes | Configures the objects table format. | `iceberg` | `iceberg` is the only accepted value. | +| Field | Type | Required | Description | Sample input | Note | +| ------ | ----- | -------- | ------------- | ------------ | ------ | +| Table Format | String | Yes | Configures the objects table format. | `iceberg` | `iceberg` is the only accepted value. | | External volume | String | Yes(*) | Specifies the identifier (name) of the external volume where Snowflake writes the Iceberg table's metadata and data files. | `my_s3_bucket` | *You don't need to specify this if the account, database, or schema already has an associated external volume. [More info](https://docs.snowflake.com/en/sql-reference/sql/create-iceberg-table-snowflake#:~:text=Snowflake%20Table%20Structures.-,external_volume) | -| Base location Subpath | String | No | An optional suffix to add to the `base_location` path that dbt automatically specifies. | `jaffle_marketing_folder` | We recommend that you do not specify this. Modifying this parameter results in a new Iceberg table. See [Base Location](#base-location) for more info. | +| Base location Subpath | String | No | An optional suffix to add to the `base_location` path that dbt automatically specifies. | `jaffle_marketing_folder` | We recommend that you do not specify this. Modifying this parameter results in a new Iceberg table. See [Base Location](#base-location) for more info. | ### Example configuration @@ -470,8 +470,15 @@ In this example, you can set up a query tag to be applied to every query with th The [`incremental_strategy` config](/docs/build/incremental-strategy) controls how dbt builds incremental models. By default, dbt will use a [merge statement](https://docs.snowflake.net/manuals/sql-reference/sql/merge.html) on Snowflake to refresh incremental tables. +Snowflake supports the following incremental strategies: +- Merge (default) +- Append +- Delete+insert +- [`microbatch`](/docs/build/incremental-microbatch) + Snowflake's `merge` statement fails with a "nondeterministic merge" error if the `unique_key` specified in your model config is not actually unique. If you encounter this error, you can instruct dbt to use a two-step incremental approach by setting the `incremental_strategy` config for your model to `delete+insert`. + ## Configuring table clustering dbt supports [table clustering](https://docs.snowflake.net/manuals/user-guide/tables-clustering-keys.html) on Snowflake. To control clustering for a <Term id="table" /> or incremental model, use the `cluster_by` config. 
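+A minimal sketch (the model and column names are hypothetical):
+
+```sql
+{{ config(
+    materialized='table',
+    cluster_by=['session_start', 'account_id']
+) }}
+
+select * from {{ ref('stg_events') }}
+```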
When this configuration is applied, dbt will do two things: @@ -701,4 +708,4 @@ flags: ``` -</VersionBlock> \ No newline at end of file +</VersionBlock> diff --git a/website/docs/reference/resource-configs/spark-configs.md index 3b2174b8ff5..a52fd93eace 100644 --- a/website/docs/reference/resource-configs/spark-configs.md +++ b/website/docs/reference/resource-configs/spark-configs.md @@ -37,7 +38,8 @@ For that reason, the dbt-spark plugin leans heavily on the [`incremental_strateg - **`append`** (default): Insert new records without updating or overwriting any existing data. - **`insert_overwrite`**: If `partition_by` is specified, overwrite partitions in the <Term id="table" /> with new data. If no `partition_by` is specified, overwrite the entire table with new data. - **`merge`** (Delta, Iceberg and Hudi file format only): Match records based on a `unique_key`; update old records, insert new ones. (If no `unique_key` is specified, all new data is inserted, similar to `append`.) - +- **`microbatch`**: Implements the [microbatch strategy](/docs/build/incremental-microbatch) using `event_time` to define time-based ranges for filtering data. + Each of these strategies has its pros and cons, which we'll discuss below. As with any model config, `incremental_strategy` may be specified in `dbt_project.yml` or within a model file's `config()` block. ### The `append` strategy diff --git a/website/docs/reference/resource-configs/target_database.md index 3c07b442107..f80dd31f214 100644 --- a/website/docs/reference/resource-configs/target_database.md +++ b/website/docs/reference/resource-configs/target_database.md @@ -6,7 +6,9 @@ datatype: string :::note -For [versionless](/docs/dbt-versions/core-upgrade/upgrading-to-v1.8#versionless) dbt Cloud accounts and dbt Core v1.9+, this functionality is no longer utilized. Use the [database](/reference/resource-configs/database) config as an alternative to define a custom database while still respecting the `generate_database_name` macro. +Starting in dbt Core v1.9+, this functionality is no longer utilized. Use the [database](/reference/resource-configs/database) config as an alternative to define a custom database while still respecting the `generate_database_name` macro. + +Try it now in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks). ::: diff --git a/website/docs/reference/resource-configs/target_schema.md index ffa95df9be7..1117e3ec42c 100644 --- a/website/docs/reference/resource-configs/target_schema.md +++ b/website/docs/reference/resource-configs/target_schema.md @@ -6,7 +6,9 @@ datatype: string :::info -For [versionless](/docs/dbt-versions/core-upgrade/upgrading-to-v1.8#versionless) dbt Cloud accounts and dbt Core v1.9+, this configuration is no longer required. Use the [schema](/reference/resource-configs/schema) config as an alternative to define a custom schema while still respecting the `generate_schema_name` macro. +Starting in dbt Core v1.9+, this configuration is no longer required. Use the [schema](/reference/resource-configs/schema) config as an alternative to define a custom schema while still respecting the `generate_schema_name` macro. + +Try it now in the [dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks). ::: @@ -40,7 +42,7 @@ On **BigQuery**, this is analogous to a `dataset`.
## Default <VersionBlock lastVersion="1.8" >This is a required parameter, no default is provided. </VersionBlock> -<VersionBlock firstVersion="1.9.1">For versionless dbt Cloud accounts and dbt Core v1.9+, this is not a required parameter. </VersionBlock> +<VersionBlock firstVersion="1.9.1">In dbt Core v1.9+ and dbt Cloud "Latest" release track, this is not a required parameter. </VersionBlock> ## Examples ### Build all snapshots in a schema named `snapshots` diff --git a/website/docs/reference/resource-configs/unique_key.md b/website/docs/reference/resource-configs/unique_key.md index 41884e175d2..071102bae6d 100644 --- a/website/docs/reference/resource-configs/unique_key.md +++ b/website/docs/reference/resource-configs/unique_key.md @@ -1,12 +1,65 @@ --- -resource_types: [snapshots] +resource_types: [snapshots, models] description: "Learn more about unique_key configurations in dbt." datatype: column_name_or_expression --- +<Tabs> + +<TabItem value="models" label="Models"> + +Configure the `unique_key` in the `config` block of your [incremental model's](/docs/build/incremental-models) SQL file, in your `models/properties.yml` file, or in your `dbt_project.yml` file. + +<File name='models/my_incremental_model.sql'> + +```sql +{{ + config( + materialized='incremental', + unique_key='id' + ) +}} + +``` + +</File> + +<File name='models/properties.yml'> + +```yaml +models: + - name: my_incremental_model + description: "An incremental model example with a unique key." + config: + materialized: incremental + unique_key: id + +``` + +</File> + +<File name='dbt_project.yml'> + +```yaml +name: jaffle_shop + +models: + jaffle_shop: + staging: + +unique_key: id +``` + +</File> + +</TabItem> + +<TabItem value="snapshots" label="Snapshots"> + <VersionBlock firstVersion="1.9"> +For [snapshots](/docs/build/snapshots), configure the `unique_key` in the your `snapshot/filename.yml` file or in your `dbt_project.yml` file. + <File name='snapshots/<filename>.yml'> ```yaml @@ -23,6 +76,8 @@ snapshots: <VersionBlock lastVersion="1.8"> +Configure the `unique_key` in the `config` block of your snapshot SQL file or in your `dbt_project.yml` file. + import SnapshotYaml from '/snippets/_snapshot-yaml-spec.md'; <SnapshotYaml/> @@ -49,10 +104,13 @@ snapshots: </File> +</TabItem> +</Tabs> + ## Description -A column name or expression that is unique for the inputs of a snapshot. dbt uses this to match records between a result set and an existing snapshot, so that changes can be captured correctly. +A column name or expression that is unique for the inputs of a snapshot or incremental model. dbt uses this to match records between a result set and an existing snapshot or incremental model, so that changes can be captured correctly. -In Versionless and dbt v1.9 and later, [snapshots](/docs/build/snapshots) are defined and configured in YAML files within your `snapshots/` directory. You can specify one or multiple `unique_key` values within your snapshot YAML file's `config` key. +In dbt Cloud "Latest" release track and from dbt v1.9, [snapshots](/docs/build/snapshots) are defined and configured in YAML files within your `snapshots/` directory. You can specify one or multiple `unique_key` values within your snapshot YAML file's `config` key. :::caution @@ -67,6 +125,32 @@ This is a **required parameter**. No default is provided. ## Examples ### Use an `id` column as a unique key +<Tabs> + +<TabItem value="models" label="Models"> + +In this example, the `id` column is the unique key for an incremental model. 
+ +<File name='models/my_incremental_model.sql'> + +```sql +{{ + config( + materialized='incremental', + unique_key='id' + ) +}} + +select * from .. +``` + +</File> +</TabItem> + +<TabItem value="snapshots" label="Snapshots"> + +In this example, the `id` column is used as a unique key for a snapshot. + <VersionBlock firstVersion="1.9"> <File name="snapshots/orders_snapshot.yml"> @@ -114,10 +198,38 @@ snapshots: </File> +</TabItem> +</Tabs> + <VersionBlock firstVersion="1.9"> ### Use multiple unique keys +<Tabs> +<TabItem value="models" label="Models"> + +Configure multiple unique keys for an incremental model as a string representing a single column or a list of single-quoted column names that can be used together, for example, `['col1', 'col2', …]`. + +Columns must not contain null values, otherwise the incremental model will fail to match rows and generate duplicate rows. Refer to [Defining a unique key](/docs/build/incremental-models#defining-a-unique-key-optional) for more information. + +<File name='models/my_incremental_model.sql'> + +```sql +{{ config( + materialized='incremental', + unique_key=['order_id', 'location_id'] +) }} + +with... + +``` + +</File> + +</TabItem> + +<TabItem value="snapshots" label="Snapshots"> + You can configure snapshots to use multiple unique keys for `primary_key` columns. <File name='snapshots/transaction_items_snapshot.yml'> @@ -137,12 +249,35 @@ snapshots: ``` </File> +</TabItem> +</Tabs> </VersionBlock> <VersionBlock lastVersion="1.8"> ### Use a combination of two columns as a unique key +<Tabs> +<TabItem value="models" label="Models"> + +<File name='models/my_incremental_model.sql'> + +```sql +{{ config( + materialized='incremental', + unique_key=['order_id', 'location_id'] +) }} + +with... + +``` + +</File> + +</TabItem> + +<TabItem value="snapshots" label="Snapshots"> + This configuration accepts a valid column expression. As such, you can concatenate two columns together as a unique key if required. It's a good idea to use a separator (for example, `'-'`) to ensure uniqueness. <File name='snapshots/transaction_items_snapshot.sql'> @@ -170,7 +305,6 @@ from {{ source('erp', 'transactions') }} Though, it's probably a better idea to construct this column in your query and use that as the `unique_key`: - <File name='models/transaction_items_ephemeral.sql'> ```sql @@ -211,4 +345,6 @@ from {{ source('erp', 'transactions') }} ``` </File> +</TabItem> +</Tabs> </VersionBlock> diff --git a/website/docs/reference/resource-properties/concurrent_batches.md b/website/docs/reference/resource-properties/concurrent_batches.md new file mode 100644 index 00000000000..4d6b2ea0af4 --- /dev/null +++ b/website/docs/reference/resource-properties/concurrent_batches.md @@ -0,0 +1,90 @@ +--- +title: "concurrent_batches" +resource_types: [models] +datatype: model_name +description: "Learn about concurrent_batches in dbt." +--- + +:::note + +Available in dbt Core v1.9+ or the [dbt Cloud "Latest" release tracks](/docs/dbt-versions/cloud-release-tracks). + +::: + +<Tabs> +<TabItem value="Project file"> + + +<File name='dbt_project.yml'> + +```yaml +models: + +concurrent_batches: true +``` + +</File> + +</TabItem> + + +<TabItem value="sql file"> + +<File name='models/my_model.sql'> + +```sql +{{ + config( + materialized='incremental', + concurrent_batches=true, + incremental_strategy='microbatch' + ... + ) +}} +select ... 
+``` + +</File> + +</TabItem> +</Tabs> + +## Definition + +`concurrent_batches` is an override which allows you to decide whether or not you want to run batches in parallel or sequentially (one at a time). + +For more information, refer to [how batch execution works](/docs/build/incremental-microbatch#how-parallel-batch-execution-works). +## Example + +By default, dbt auto-detects whether batches can run in parallel for microbatch models. However, you can override dbt's detection by setting the `concurrent_batches` config to `false` in your `dbt_project.yml` or model `.sql` file to specify parallel or sequential execution, given you meet these conditions: +* You've configured a microbatch incremental strategy. +* You're working with cumulative metrics or any logic that depends on batch order. + +Set `concurrent_batches` config to `false` to ensure batches are processed sequentially. For example: + +<File name='dbt_project.yml'> + +```yaml +models: + my_project: + cumulative_metrics_model: + +concurrent_batches: false +``` +</File> + + +<File name='models/my_model.sql'> + +```sql +{{ + config( + materialized='incremental', + incremental_strategy='microbatch' + concurrent_batches=false + ) +}} +select ... + +``` +</File> + + diff --git a/website/docs/reference/resource-properties/constraints.md b/website/docs/reference/resource-properties/constraints.md index 6ba20db090f..1e418e884be 100644 --- a/website/docs/reference/resource-properties/constraints.md +++ b/website/docs/reference/resource-properties/constraints.md @@ -29,7 +29,7 @@ Foreign key constraints accept two additional inputs: - `to`: A relation input, likely `ref()`, indicating the referenced table. - `to_columns`: A list of column(s) in that table containing the corresponding primary or unique key. -This syntax for defining foreign keys uses `ref`, meaning it will capture dependencies and works across different environments. It's available in [dbt Cloud Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) and versions of dbt Core starting with v1.9. +This syntax for defining foreign keys uses `ref`, meaning it will capture dependencies and works across different environments. It's available in [dbt Cloud "Latest""](/docs/dbt-versions/cloud-release-tracks) and [dbt Core v1.9+](/docs/dbt-versions/core-upgrade/upgrading-to-v1.9). <File name='models/schema.yml'> diff --git a/website/docs/reference/resource-properties/unit-tests.md b/website/docs/reference/resource-properties/unit-tests.md index 08081c4c24a..7bc177a133c 100644 --- a/website/docs/reference/resource-properties/unit-tests.md +++ b/website/docs/reference/resource-properties/unit-tests.md @@ -7,7 +7,7 @@ datatype: test :::note -This functionality is only supported in dbt Core v1.8+ or dbt Cloud accounts that have gone ["Versionless"](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless). +This functionality is available in dbt Core v1.8+ and [dbt Cloud release tracks](/docs/dbt-versions/cloud-release-tracks). 
::: diff --git a/website/docs/reference/resource-properties/versions.md b/website/docs/reference/resource-properties/versions.md index f6b71852aef..748aa477a4f 100644 --- a/website/docs/reference/resource-properties/versions.md +++ b/website/docs/reference/resource-properties/versions.md @@ -73,13 +73,13 @@ Note that the value of `defined_in` and the `alias` configuration of a model are When you use the `state:modified` selection method in Slim CI, dbt will detect changes to versioned model contracts, and raise an error if any of those changes could be breaking for downstream consumers. -Breaking changes include: -- Removing an existing column -- Changing the `data_type` of an existing column -- Removing or modifying one of the `constraints` on an existing column (dbt v1.6 or higher) -- Changing unversioned, contracted models. - - dbt also warns if a model has or had a contract but isn't versioned - +import BreakingChanges from '/snippets/_versions-contracts.md'; + +<BreakingChanges +value="Changing unversioned, contracted models." +value2="dbt also warns if a model has or had a contract but isn't versioned." +/> + <Tabs> <TabItem value="unversioned" label="Example message for unversioned models"> diff --git a/website/docs/reference/snapshot-configs.md b/website/docs/reference/snapshot-configs.md index 7b3c0f8e5b1..018988a4934 100644 --- a/website/docs/reference/snapshot-configs.md +++ b/website/docs/reference/snapshot-configs.md @@ -8,30 +8,16 @@ meta: import ConfigResource from '/snippets/_config-description-resource.md'; import ConfigGeneral from '/snippets/_config-description-general.md'; - ## Related documentation * [Snapshots](/docs/build/snapshots) * The `dbt snapshot` [command](/reference/commands/snapshot) -<!-- -Parts of a snapshot: -- name -- query ---> ## Available configurations ### Snapshot-specific configurations <ConfigResource meta={frontMatter.meta} /> -<VersionBlock lastVersion="1.8"> - -import SnapshotYaml from '/snippets/_snapshot-yaml-spec.md'; - -<SnapshotYaml/> - -</VersionBlock> - <Tabs groupId="config-languages" defaultValue="project-yaml" @@ -79,7 +65,8 @@ snapshots: [+](/reference/resource-configs/plus-prefix)[updated_at](/reference/resource-configs/updated_at): <column_name> [+](/reference/resource-configs/plus-prefix)[check_cols](/reference/resource-configs/check_cols): [<column_name>] | all [+](/reference/resource-configs/plus-prefix)[snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {<dictionary>} - [+](/reference/resource-configs/plus-prefix)[invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) : true | false + [+](/reference/resource-configs/plus-prefix)[dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current): <string> + [+](/reference/resource-configs/plus-prefix)[hard_deletes](/reference/resource-configs/hard-deletes): string ``` </File> @@ -113,7 +100,8 @@ snapshots: [updated_at](/reference/resource-configs/updated_at): <column_name> [check_cols](/reference/resource-configs/check_cols): [<column_name>] | all [snapshot_meta_column_names](/reference/resource-configs/snapshot_meta_column_names): {<dictionary>} - [invalidate_hard_deletes](/reference/resource-configs/invalidate_hard_deletes) : true | false + [hard_deletes](/reference/resource-configs/hard-deletes): string + [dbt_valid_to_current](/reference/resource-configs/dbt_valid_to_current): <string> ``` </File> @@ -123,11 +111,9 @@ snapshots: <TabItem value="config-resource"> -<VersionBlock firstVersion="1.9"> - -Configurations 
can be applied to snapshots using the [YAML syntax](/docs/build/snapshots), available in Versionless and dbt v1.9 and higher, in the `snapshot` directory file. +import LegacySnapshotConfig from '/snippets/_legacy-snapshot-config.md'; -</VersionBlock> +<LegacySnapshotConfig /> <VersionBlock lastVersion="1.8"> @@ -150,11 +136,25 @@ Configurations can be applied to snapshots using the [YAML syntax](/docs/build/s </Tabs> +### Snapshot configuration migration + +The latest snapshot configurations introduced in dbt Core v1.9 (such as [`snapshot_meta_column_names`](/reference/resource-configs/snapshot_meta_column_names), [`dbt_valid_to_current`](/reference/resource-configs/dbt_valid_to_current), and `hard_deletes`) are best suited for new snapshots. For existing snapshots, we recommend the following to avoid any inconsistencies in your snapshots: + +#### For existing snapshots +- Migrate tables — Migrate the previous snapshot to the new table schema and values: + - Create a backup copy of your snapshots. + - Use `alter` statements as needed (or a script to apply `alter` statements) to ensure table consistency. +- New configurations — Convert the configs one at a time, testing as you go. + +:::warning +If you use one of the latest configs, such as `dbt_valid_to_current`, without migrating your data, you may have mixed old and new data, leading to an incorrect downstream result. +::: ### General configurations <ConfigGeneral /> + <Tabs groupId="config-languages" defaultValue="project-yaml" @@ -170,6 +170,7 @@ Configurations can be applied to snapshots using the [YAML syntax](/docs/build/s <VersionBlock firstVersion="1.9"> + ```yaml snapshots: [<resource-path>](/reference/resource-configs/resource-path): @@ -254,11 +255,7 @@ snapshots: <TabItem value="config"> -<VersionBlock firstVersion="1.9"> - -Configurations can be applied to snapshots using [YAML syntax](/docs/build/snapshots), available in Versionless and dbt v1.9 and higher, in the `snapshot` directory file. - -</VersionBlock> +<LegacySnapshotConfig /> <VersionBlock lastVersion="1.8"> @@ -287,24 +284,29 @@ Snapshots can be configured in multiple ways: <VersionBlock firstVersion="1.9"> -1. Defined in YAML files using a `config` [resource property](/reference/model-properties), typically in your [snapshots directory](/reference/project-configs/snapshot-paths) (available in [Versionless](/docs/dbt-versions/versionless-cloud) or and dbt Core v1.9 and higher). +1. Defined in YAML files using a `config` [resource property](/reference/model-properties), typically in your [snapshots directory](/reference/project-configs/snapshot-paths) (available in [the dbt Cloud release track](/docs/dbt-versions/cloud-release-tracks) and dbt v1.9 and higher). 2. From the `dbt_project.yml` file, under the `snapshots:` key. To apply a configuration to a snapshot, or directory of snapshots, define the resource path as nested dictionary keys. </VersionBlock> <VersionBlock lastVersion="1.8"> -1. Defined in YAML files using a `config` [resource property](/reference/model-properties), typically in your [snapshots directory](/reference/project-configs/snapshot-paths) (available in [Versionless](/docs/dbt-versions/versionless-cloud) or and dbt Core v1.9 and higher). -2. Using a `config` block within a snapshot defined in Jinja SQL +1. 
Defined in a YAML file using a `config` [resource property](/reference/model-properties), typically in your [snapshots directory](/reference/project-configs/snapshot-paths) (available in [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt v1.9 and higher). The latest snapshot YAML syntax provides faster and more efficient management. +2. Using a `config` block within a snapshot defined in Jinja SQL. 3. From the `dbt_project.yml` file, under the `snapshots:` key. To apply a configuration to a snapshot, or directory of snapshots, define the resource path as nested dictionary keys. -Note that in Versionless and dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). - </VersionBlock> Snapshot configurations are applied hierarchically in the order above with higher taking precedence. ### Examples -The following examples demonstrate how to configure snapshots using the `dbt_project.yml` file, a `config` block within a snapshot, and a `.yml` file. + +<VersionBlock firstVersion="1.9"> +The following examples demonstrate how to configure snapshots using the `dbt_project.yml` file and a `.yml` file. +</VersionBlock> + +<VersionBlock lastVersion="1.8"> +The following examples demonstrate how to configure snapshots using the `dbt_project.yml` file, a `config` block within a snapshot (legacy method), and a `.yml` file. +</VersionBlock> - #### Apply configurations to all snapshots To apply a configuration to all snapshots, including those in any installed [packages](/docs/build/packages), nest the configuration directly under the `snapshots` key: @@ -347,6 +349,7 @@ The following examples demonstrate how to configure snapshots using the `dbt_pro {{ config( unique_key='id', + target_schema='snapshots', strategy='timestamp', updated_at='updated_at' ) @@ -396,7 +399,7 @@ The following examples demonstrate how to configure snapshots using the `dbt_pro </File> - You can also define some common configs in a snapshot's `config` block. We don't recommend this for a snapshot's required configuration, however. + You can also define some common configs in a snapshot's `config` block. However, we don't recommend this for a snapshot's required configuration. <File name='dbt_project.yml'> diff --git a/website/docs/reference/snapshot-properties.md b/website/docs/reference/snapshot-properties.md index d940a9f344c..11fb956a163 100644 --- a/website/docs/reference/snapshot-properties.md +++ b/website/docs/reference/snapshot-properties.md @@ -5,7 +5,7 @@ description: "Read this guide to learn about using source properties in dbt." <VersionBlock firstVersion="1.9"> -In Versionless and dbt v1.9 and later, snapshots are defined and configured in YAML files within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). Snapshot properties are declared within these YAML files, allowing you to define both the snapshot configurations and properties in one place. +In dbt v1.9 and later, snapshots are defined and configured in YAML files within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). 
Snapshot properties are declared within these YAML files, allowing you to define both the snapshot configurations and properties in one place. </VersionBlock> @@ -15,7 +15,7 @@ Snapshots properties can be declared in `.yml` files in: - your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). - your `models/` directory (as defined by the [`model-paths` config](/reference/project-configs/model-paths)) -Note, in Versionless and dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). For faster and more efficient management, consider the updated snapshot YAML syntax, [available in Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core). +Note, in dbt v1.9 and later, snapshots are defined in an updated syntax using a YAML file within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). For faster and more efficient management, consider the updated snapshot YAML syntax, available now in [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and soon in [dbt Core v1.9](/docs/dbt-versions/core-upgrade/upgrading-to-v1.9). </VersionBlock> diff --git a/website/docs/reference/source-configs.md b/website/docs/reference/source-configs.md index c5264e82fc7..959d4c542e9 100644 --- a/website/docs/reference/source-configs.md +++ b/website/docs/reference/source-configs.md @@ -255,7 +255,7 @@ sources: <VersionBlock lastVersion="1.8"> -Configuring an [`event_time`](/reference/resource-configs/event-time) for a source is only available in dbt Cloud Versionless or dbt Core versions 1.9 and later. +Configuring an [`event_time`](/reference/resource-configs/event-time) for a source is only available in [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) or dbt Core versions 1.9 and later. 
</VersionBlock> diff --git a/website/package-lock.json b/website/package-lock.json index 936f05624bb..8d573ee3426 100644 --- a/website/package-lock.json +++ b/website/package-lock.json @@ -5,7 +5,6 @@ "requires": true, "packages": { "": { - "name": "website", "version": "0.0.0", "dependencies": { "@docusaurus/core": "3.4.0", diff --git a/website/sidebars.js b/website/sidebars.js index 04afb7c0c99..9a93980b12c 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -49,6 +49,7 @@ const sidebarSettings = { items: [ "docs/cloud/about-cloud-setup", "docs/cloud/account-settings", + "docs/cloud/account-integrations", "docs/dbt-cloud-environments", "docs/cloud/migration", { @@ -221,6 +222,7 @@ const sidebarSettings = { "docs/core/connect-data-platform/athena-setup", "docs/core/connect-data-platform/glue-setup", "docs/core/connect-data-platform/clickhouse-setup", + "docs/core/connect-data-platform/cratedb-setup", "docs/core/connect-data-platform/databend-setup", "docs/core/connect-data-platform/decodable-setup", "docs/core/connect-data-platform/doris-setup", @@ -776,7 +778,7 @@ const sidebarSettings = { link: { type: "doc", id: "docs/dbt-versions/core" }, items: [ "docs/dbt-versions/core", - "docs/dbt-versions/versionless-cloud", + "docs/dbt-versions/cloud-release-tracks", "docs/dbt-versions/upgrade-dbt-version-in-cloud", "docs/dbt-versions/product-lifecycles", "docs/dbt-versions/experimental-features", @@ -805,6 +807,7 @@ const sidebarSettings = { }, items: [ "docs/dbt-versions/dbt-cloud-release-notes", + "docs/dbt-versions/compatible-track-changelog", "docs/dbt-versions/2023-release-notes", "docs/dbt-versions/2022-release-notes", { @@ -924,6 +927,8 @@ const sidebarSettings = { items: [ "reference/resource-configs/access", "reference/resource-configs/alias", + "reference/resource-configs/batch-size", + "reference/resource-configs/begin", "reference/resource-configs/database", "reference/resource-configs/enabled", "reference/resource-configs/event-time", @@ -932,10 +937,12 @@ const sidebarSettings = { "reference/resource-configs/grants", "reference/resource-configs/group", "reference/resource-configs/docs", + "reference/resource-configs/lookback", "reference/resource-configs/persist_docs", "reference/resource-configs/pre-hook-post-hook", "reference/resource-configs/schema", "reference/resource-configs/tags", + "reference/resource-configs/unique_key", "reference/resource-configs/meta", "reference/advanced-config-usage", "reference/resource-configs/plus-prefix", @@ -951,6 +958,7 @@ const sidebarSettings = { "reference/resource-configs/materialized", "reference/resource-configs/on_configuration_change", "reference/resource-configs/sql_header", + "reference/resource-properties/concurrent_batches", ], }, { @@ -969,17 +977,17 @@ const sidebarSettings = { label: "For snapshots", items: [ "reference/snapshot-properties", - "reference/resource-configs/snapshot_name", "reference/snapshot-configs", "reference/resource-configs/check_cols", + "reference/resource-configs/dbt_valid_to_current", + "reference/resource-configs/hard-deletes", + "reference/resource-configs/invalidate_hard_deletes", + "reference/resource-configs/snapshot_meta_column_names", + "reference/resource-configs/snapshot_name", "reference/resource-configs/strategy", "reference/resource-configs/target_database", "reference/resource-configs/target_schema", - "reference/resource-configs/unique_key", "reference/resource-configs/updated_at", - "reference/resource-configs/invalidate_hard_deletes", - 
"reference/resource-configs/snapshot_meta_column_names", - "reference/resource-configs/dbt_valid_to_current", ], }, { diff --git a/website/snippets/_cloud-environments-info.md b/website/snippets/_cloud-environments-info.md index 6addd6a3a7a..6d202d01998 100644 --- a/website/snippets/_cloud-environments-info.md +++ b/website/snippets/_cloud-environments-info.md @@ -33,9 +33,7 @@ Both development and deployment environments have a section called **General Set :::note About dbt version -- dbt Cloud allows users to select any dbt release. At this time, **environments must use a dbt version greater than or equal to v1.0.0;** [lower versions are no longer supported](/docs/dbt-versions/upgrade-dbt-version-in-cloud). -- If you select a current version with `(latest)` in the name, your environment will automatically install the latest stable version of the minor version selected. -- Go **Versionless**, which removes the need for manually upgrading environment, while ensuring you are always up to date with the latest fixes and features. +dbt Cloud allows users to select a [release track](/docs/dbt-versions/cloud-release-tracks) to receive ongoing dbt version upgrades at the cadence that makes sense for their team. ::: ### Custom branch behavior diff --git a/website/snippets/_config-dbt-version-check.md b/website/snippets/_config-dbt-version-check.md index d4e495bd379..6dc2e702895 100644 --- a/website/snippets/_config-dbt-version-check.md +++ b/website/snippets/_config-dbt-version-check.md @@ -1,5 +1,5 @@ -Starting in 2024, when you select **Versionless** in dbt Cloud, dbt will ignore the `require-dbt-version` config. Refer to [Versionless](/docs/dbt-versions/upgrade-dbt-version-in-cloud#versionless) for more details. +Starting in 2024, when you select a [release track in dbt Cloud](/docs/dbt-versions/cloud-release-tracks) to receive ongoing dbt version upgrades, dbt will ignore the `require-dbt-version` config. dbt Labs is committed to zero breaking changes for code in dbt projects, with ongoing releases to dbt Cloud and new versions of dbt Core. We also recommend these best practices: diff --git a/website/snippets/_enterprise-permissions-table.md b/website/snippets/_enterprise-permissions-table.md index a5b825d34d2..b39337697c1 100644 --- a/website/snippets/_enterprise-permissions-table.md +++ b/website/snippets/_enterprise-permissions-table.md @@ -104,7 +104,7 @@ Key: | Custom env. variables | W | W | W | W | W | W | - | R | - | - | R | W | - | | Data platform configs | W | W | W | W | R | W | - | - | - | - | R | R | - | | Develop (IDE or CLI) | W | W | - | W | - | - | - | - | - | - | - | - | - | -| Environments | W | R* | R* | R* | R* | W | - | R | - | - | R | R* | - | +| Environments | W | R | R | R | R | W | - | R | - | - | R | R | - | | Jobs | W | R* | R* | R* | R* | W | R | R | - | - | R | R* | - | | Metadata GraphQL API access| R | R | R | R | R | R | - | R | R | - | R | R | - | | Permissions | W | - | R | R | R | - | - | - | - | - | - | R | - | diff --git a/website/snippets/_hard-deletes.md b/website/snippets/_hard-deletes.md new file mode 100644 index 00000000000..59c2e3af99e --- /dev/null +++ b/website/snippets/_hard-deletes.md @@ -0,0 +1,13 @@ +<Expandable alt_header="When to use the hard_deletes and invalidate_hard_deletes config?"> + +**Use `invalidate_hard_deletes` (v1.8 and earlier) if:** +- Gaps in the snapshot history (missing records for deleted rows) are acceptable. 
+- You want to invalidate deleted rows by setting their `dbt_valid_to` timestamp to the current time (implicit delete). +- You are working with smaller datasets where tracking deletions as a separate state is unnecessary. + +**Use `hard_deletes: new_record` (v1.9 and higher) if:** +- You want to maintain continuous snapshot history without gaps. +- You want to explicitly track deletions by adding new rows with a `dbt_is_deleted` column (explicit delete). +- You are working with larger datasets where explicitly tracking deleted records improves data lineage clarity. + +</Expandable> diff --git a/website/snippets/_legacy-snapshot-config.md b/website/snippets/_legacy-snapshot-config.md new file mode 100644 index 00000000000..a38995308e9 --- /dev/null +++ b/website/snippets/_legacy-snapshot-config.md @@ -0,0 +1,4 @@ + +:::info +Starting from [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks) and dbt Core v1.9, defining snapshots in a `.sql` file using a config block is a legacy method. You can define snapshots in YAML format using the latest [snapshot-specific configurations](/docs/build/snapshots#configuring-snapshots). For new snapshots, we recommend using these latest configs. If applying them to existing snapshots, you'll need to [migrate](#snapshot-configuration-migration) over. +::: diff --git a/website/snippets/_relative-path.md b/website/snippets/_relative-path.md new file mode 100644 index 00000000000..791f8e83f7e --- /dev/null +++ b/website/snippets/_relative-path.md @@ -0,0 +1 @@ +<span>Paths specified in <code>{props.path}</code> must be relative to the location of your `dbt_project.yml` file. Avoid using absolute paths like <code>{props.absolute}</code>, as they will lead to unexpected behavior and outcomes.</span> diff --git a/website/snippets/_release-stages-from-versionless.md b/website/snippets/_release-stages-from-versionless.md new file mode 100644 index 00000000000..f6fbf9153b0 --- /dev/null +++ b/website/snippets/_release-stages-from-versionless.md @@ -0,0 +1,5 @@ +:::note Versionless is now the "latest" release track + +This blog post was updated on December 04, 2024 to rename "versionless" to the "latest" release track, allowing for the introduction of less-frequent release tracks. Learn more about [Release Tracks](/docs/dbt-versions/cloud-release-tracks) and how to use them. + +::: diff --git a/website/snippets/_sl-measures-parameters.md b/website/snippets/_sl-measures-parameters.md index 728d63c6b4f..8d6b84a71dd 100644 --- a/website/snippets/_sl-measures-parameters.md +++ b/website/snippets/_sl-measures-parameters.md @@ -1,11 +1,11 @@ -| Parameter | Description | | -| --- | --- | --- | -| [`name`](/docs/build/measures#name) | Provide a name for the measure, which must be unique and can't be repeated across all semantic models in your dbt project. | Required | -| [`description`](/docs/build/measures#description) | Describes the calculated measure. | Optional | -| [`agg`](/docs/build/measures#aggregation) | dbt supports the following aggregations: `sum`, `max`, `min`, `average`, `median`, `count_distinct`, `percentile`, and `sum_boolean`. | Required | -| [`expr`](/docs/build/measures#expr) | Either reference an existing column in the table or use a SQL expression to create or derive a new one.
| Optional | -| [`non_additive_dimension`](/docs/build/measures#non-additive-dimensions) | Non-additive dimensions can be specified for measures that cannot be aggregated over certain dimensions, such as bank account balances, to avoid producing incorrect results. | Optional | -| `agg_params` | Specific aggregation properties, such as a percentile. | Optional | -| `agg_time_dimension` | The time field. Defaults to the default agg time dimension for the semantic model. | Optional | 1.6 and higher | -| `label` | String that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as orders_total or "orders_total"). Available in dbt version 1.7 or higher. | Optional -| `create_metric` | Create a `simple` metric from a measure by setting `create_metric: True`. The `label` and `description` attributes will be automatically propagated to the created metric. Available in dbt version 1.7 or higher. | Optional | +| Parameter | Description | Required | Type | +| --- | --- | --- | --- | +| [`name`](/docs/build/measures#name) | Provide a name for the measure, which must be unique and can't be repeated across all semantic models in your dbt project. | Required | String | +| [`description`](/docs/build/measures#description) | Describes the calculated measure. | Optional | String | +| [`agg`](/docs/build/measures#aggregation) | dbt supports the following aggregations: `sum`, `max`, `min`, `average`, `median`, `count_distinct`, `percentile`, and `sum_boolean`. | Required | String | +| [`expr`](/docs/build/measures#expr) | Either reference an existing column in the table or use a SQL expression to create or derive a new one. | Optional | String | +| [`non_additive_dimension`](/docs/build/measures#non-additive-dimensions) | Non-additive dimensions can be specified for measures that cannot be aggregated over certain dimensions, such as bank account balances, to avoid producing incorrect results. | Optional | String | +| `agg_params` | Specific aggregation properties, such as a percentile. | Optional | Dict | +| `agg_time_dimension` | The time field. Defaults to the default agg time dimension for the semantic model. | Optional | String | +| `label` | String that defines the display value in downstream tools. Accepts plain text, spaces, and quotes (such as `orders_total` or `"orders_total"`). Available in dbt version 1.7 or higher. | Optional | String | +| `create_metric` | Create a `simple` metric from a measure by setting `create_metric: True`. The `label` and `description` attributes will be automatically propagated to the created metric. Available in dbt version 1.7 or higher. | Optional | Boolean | diff --git a/website/snippets/_snapshot-yaml-spec.md b/website/snippets/_snapshot-yaml-spec.md index cb1675ce5bd..f306abb21dd 100644 --- a/website/snippets/_snapshot-yaml-spec.md +++ b/website/snippets/_snapshot-yaml-spec.md @@ -1,6 +1,4 @@ :::info Use the latest snapshot syntax -In [dbt Cloud Versionless](/docs/dbt-versions/versionless-cloud) or [dbt Core v1.9 and later](/docs/dbt-versions/core), you can configure snapshots in YAML files using the updated syntax within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). - -This syntax allows for faster, more efficient snapshot management. To use it, upgrade to Versionless or dbt v1.9 or newer. 
+In [dbt Cloud "Latest"](/docs/dbt-versions/cloud-release-tracks) or [dbt Core v1.9+](/docs/dbt-versions/core-upgrade/upgrading-to-v1.9), you can configure snapshots in YAML files using the updated syntax within your `snapshots/` directory (as defined by the [`snapshot-paths` config](/reference/project-configs/snapshot-paths)). This syntax allows for faster, more efficient snapshot management. ::: diff --git a/website/snippets/_sso-docs-mt-available.md b/website/snippets/_sso-docs-mt-available.md index e56403988a4..fdcdc8249ba 100644 --- a/website/snippets/_sso-docs-mt-available.md +++ b/website/snippets/_sso-docs-mt-available.md @@ -2,6 +2,6 @@ This guide describes a feature of the dbt Cloud Enterprise plan. If you’re interested in learning more about an Enterprise plan, contact us at [sales@getdbt.com](mailto:sales@getdbt.com). -These SSO configuration documents apply to multi-tenant Enterprise deployments only. [Single-tenant](/docs/cloud/about-cloud/tenancy#single-tenant) Virtual Private users can [email dbt Cloud Support](mailto:support@getdbt.com) to set up or update their SSO configuration. +These SSO configuration documents apply to multi-tenant Enterprise deployments only. ::: diff --git a/website/snippets/_state-modified-compare.md b/website/snippets/_state-modified-compare.md index c7bba1c8bdf..f89d63162ae 100644 --- a/website/snippets/_state-modified-compare.md +++ b/website/snippets/_state-modified-compare.md @@ -1,3 +1,3 @@ -You need to build the state directory using dbt v1.9 or higher, or [Versionless](/docs/dbt-versions/versionless-cloud) dbt Cloud, and you need to set `state_modified_compare_more_unrendered_values` to `true` within your dbt_project.yml. +You need to build the state directory using dbt v1.9 or higher, or [the dbt Cloud "Latest" release track](/docs/dbt-versions/cloud-release-tracks), and you need to set `state_modified_compare_more_unrendered_values` to `true` within your dbt_project.yml. If the state directory was built with an older dbt version or if the `state_modified_compare_more_unrendered_values` behavior change flag was either not set or set to `false`, you need to rebuild the state directory to avoid false positives during state comparison with `state:modified`. diff --git a/website/snippets/_versions-contracts.md b/website/snippets/_versions-contracts.md new file mode 100644 index 00000000000..1207e02fba9 --- /dev/null +++ b/website/snippets/_versions-contracts.md @@ -0,0 +1,7 @@ +Breaking changes include: + +- Removing an existing column +- Changing the `data_type` of an existing column +- Removing or modifying one of the `constraints` on an existing column (dbt v1.6 or higher) +- {props.value} + - {props.value2} diff --git a/website/snippets/access_url.md b/website/snippets/access_url.md index 4fb7aa776ae..90a9238618a 100644 --- a/website/snippets/access_url.md +++ b/website/snippets/access_url.md @@ -1 +1 @@ -The following steps use `YOUR_AUTH0_URI` and `YOUR_AUTH0_ENTITYID`, which need to be replaced with the [appropriate Auth0 SSO URI and Auth0 Entity ID](/docs/cloud/manage-access/set-up-sso-saml-2.0#auth0-multi-tenant-uris) for your region. +The following steps use `YOUR_AUTH0_URI` and `YOUR_AUTH0_ENTITYID`, which need to be replaced with the [appropriate Auth0 SSO URI and Auth0 Entity ID](#auth0-uris) for your region.
diff --git a/website/snippets/auth0-uri.md b/website/snippets/auth0-uri.md index d040fb372a7..e1d05ebbe86 100644 --- a/website/snippets/auth0-uri.md +++ b/website/snippets/auth0-uri.md @@ -1,11 +1,11 @@ -The URI used for SSO connections on multi-tenant dbt Cloud instances will vary based on your dbt Cloud hosted region. Use your login URL (also referred to as your Access URL) to determine the correct Auth0 URI for your environment. +The URI used for SSO connections on multi-tenant dbt Cloud instances will vary based on your dbt Cloud hosted region. To find the URIs for your environment in dbt Cloud: + +1. Navigate to your **Account settings** and click **Single sign-on** on the left menu. +1. Click **Edit** in the **Single sign-on** pane. +1. Select the appropriate **Identity provider** from the dropdown and the **Login slug** and **Identity provider values** will populate for that provider. + +<Lightbox src="/img/docs/dbt-cloud/access-control/sso-uri.png" title="Example of the identity provider values for a SAML 2.0 provider" /> + -| Region | dbt Cloud Access URL | Auth0 SSO URI <YOUR_AUTH0_URI> | Auth0 Entity ID <YOUR_AUTH0_ENTITYID>* | -|--------|-----------------------|-------------------------------|----------------------------------------| -| US multi-tenant | cloud.getdbt.com | auth.cloud.getdbt.com | us-production-mt | -| US cell 1 | \{account prefix\}.us1.dbt.com | auth.cloud.getdbt.com | us-production-mt | -| EMEA | emea.dbt.com | auth.emea.dbt.com | emea-production-mt | -| APAC | au.dbt.com | auth.au.dbt.com | au-production-mt | -*Only applicable to SAML and Okta configurations. diff --git a/website/snippets/core-versions-table.md b/website/snippets/core-versions-table.md index 743b59c6bb7..0d82ab35573 100644 --- a/website/snippets/core-versions-table.md +++ b/website/snippets/core-versions-table.md @@ -2,7 +2,8 @@ | dbt Core | Initial release | Support level and end date | |:-------------------------------------------------------------:|:---------------:|:-------------------------------------:| -| [**v1.8**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.8) | May 9 2024 | <b>Active Support — May 8, 2025</b> | +| [**v1.9**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.9) | Dec 9, 2024 | <b> Active Support — Dec 8, 2025</b>| +| [**v1.8**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.8) | May 9, 2024 | <b>Active Support — May 8, 2025</b>| | [**v1.7**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.7) | Nov 2, 2023 | <div align="left">**dbt Core and dbt Cloud Developer & Team customers:** End of Life <br /> **dbt Cloud Enterprise customers:** Critical Support until further notice <sup>1</sup></div> | | [**v1.6**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.6) | Jul 31, 2023 | End of Life ⚠️ | | [**v1.5**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.5) | Apr 27, 2023 | End of Life ⚠️ | @@ -13,8 +14,8 @@ | [**v1.0**](/docs/dbt-versions/core-upgrade/Older%20versions/upgrading-to-v1.0) | Dec 3, 2021 | End of Life ⚠️ | | **v0.X** ⛔️ | (Various dates) | Deprecated ⛔️ | Deprecated ⛔️ | -All functionality in dbt Core since the v1.7 release is available in dbt Cloud, early and continuously, by selecting ["Versionless"](https://docs.getdbt.com/docs/dbt-versions/versionless-cloud). +All functionality in dbt Core since the v1.7 release is available in [dbt Cloud release tracks](/docs/dbt-versions/cloud-release-tracks), which provide automated upgrades at a cadence appropriate for your team. 
-<sup>1</sup> "Versionless" is now required for the Developer and Teams plans on dbt Cloud. Accounts using older dbt versions will be migrated to "Versionless." +<sup>1</sup> Release tracks are required for the Developer and Teams plans on dbt Cloud. Accounts using older dbt versions will be migrated to the "Latest" release track. -For customers of dbt Cloud Enterprise, dbt v1.7 will continue to be available as an option while dbt Labs rolls out a mechanism for "extended" upgrades. In the meantime, dbt Labs strongly recommends migrating any environments that are still running on older unsupported versions to "Versionless" dbt or dbt v1.7. +For customers of dbt Cloud Enterprise, dbt v1.7 will continue to be available as an option until dbt Labs announces that "Compatible" and "Extended" release tracks are Generally Available, planned for March 2025. (They are currently available to all eligible accounts in Preview.) In the meantime, dbt Labs strongly recommends migrating any environments that are still running on older unsupported versions to either release tracks or dbt v1.7. diff --git a/website/snippets/hooks-to-grants.md b/website/snippets/hooks-to-grants.md deleted file mode 100644 index d7586ec53ca..00000000000 --- a/website/snippets/hooks-to-grants.md +++ /dev/null @@ -1,3 +0,0 @@ - -In older versions of dbt, the most common use of `post-hook` was to execute `grant` statements, to apply database permissions to models right after creating them. We recommend using the [`grants` resource config](/reference/resource-configs/grants) instead, in order to automatically apply grants when your dbt model runs. - diff --git a/website/src/components/expandable/styles.module.css b/website/src/components/expandable/styles.module.css index fc6f258286b..4d3957228b9 100644 --- a/website/src/components/expandable/styles.module.css +++ b/website/src/components/expandable/styles.module.css @@ -145,4 +145,5 @@ .headerText { display: flex; align-items: center; -} \ No newline at end of file +} + diff --git a/website/static/img/blog/2024-11-27-test-smarter-part-2/testing_pipeline.png b/website/static/img/blog/2024-11-27-test-smarter-part-2/testing_pipeline.png new file mode 100644 index 00000000000..223846b043c Binary files /dev/null and b/website/static/img/blog/2024-11-27-test-smarter-part-2/testing_pipeline.png differ diff --git a/website/static/img/docs/dbt-cloud/access-control/sso-uri.png b/website/static/img/docs/dbt-cloud/access-control/sso-uri.png new file mode 100644 index 00000000000..c557b903e57 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/access-control/sso-uri.png differ diff --git a/website/static/img/docs/dbt-cloud/account-integration-ai.jpg b/website/static/img/docs/dbt-cloud/account-integration-ai.jpg new file mode 100644 index 00000000000..7dd42ee037b Binary files /dev/null and b/website/static/img/docs/dbt-cloud/account-integration-ai.jpg differ diff --git a/website/static/img/docs/dbt-cloud/account-integration-azure-manual.jpg b/website/static/img/docs/dbt-cloud/account-integration-azure-manual.jpg new file mode 100644 index 00000000000..3b509d1c965 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/account-integration-azure-manual.jpg differ diff --git a/website/static/img/docs/dbt-cloud/account-integration-azure-target.jpg b/website/static/img/docs/dbt-cloud/account-integration-azure-target.jpg new file mode 100644 index 00000000000..c8ff5dd8cf6 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/account-integration-azure-target.jpg differ diff 
--git a/website/static/img/docs/dbt-cloud/account-integration-dbtlabs.jpg b/website/static/img/docs/dbt-cloud/account-integration-dbtlabs.jpg new file mode 100644 index 00000000000..a2d1386e0fa Binary files /dev/null and b/website/static/img/docs/dbt-cloud/account-integration-dbtlabs.jpg differ diff --git a/website/static/img/docs/dbt-cloud/account-integration-git.jpg b/website/static/img/docs/dbt-cloud/account-integration-git.jpg new file mode 100644 index 00000000000..70a275bd039 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/account-integration-git.jpg differ diff --git a/website/static/img/docs/dbt-cloud/account-integration-oauth.jpg b/website/static/img/docs/dbt-cloud/account-integration-oauth.jpg new file mode 100644 index 00000000000..6efb135c46f Binary files /dev/null and b/website/static/img/docs/dbt-cloud/account-integration-oauth.jpg differ diff --git a/website/static/img/docs/dbt-cloud/account-integration-openai.jpg b/website/static/img/docs/dbt-cloud/account-integration-openai.jpg new file mode 100644 index 00000000000..f92fec5c712 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/account-integration-openai.jpg differ diff --git a/website/static/img/docs/dbt-cloud/account-integrations.jpg b/website/static/img/docs/dbt-cloud/account-integrations.jpg new file mode 100644 index 00000000000..56ff1859636 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/account-integrations.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png b/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png index 02e5073fd16..7e0d2ea747a 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png and b/website/static/img/docs/dbt-cloud/cloud-configuring-dbt-cloud/choosing-dbt-version/example-environment-settings.png differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/copilot-sql-generation-prompt.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/copilot-sql-generation-prompt.jpg new file mode 100644 index 00000000000..da42bbd83dd Binary files /dev/null and b/website/static/img/docs/dbt-cloud/cloud-ide/copilot-sql-generation-prompt.jpg differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/copilot-sql-generation.gif b/website/static/img/docs/dbt-cloud/cloud-ide/copilot-sql-generation.gif new file mode 100644 index 00000000000..74e6409e34d Binary files /dev/null and b/website/static/img/docs/dbt-cloud/cloud-ide/copilot-sql-generation.gif differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/dbt-assist-toggle.jpg b/website/static/img/docs/dbt-cloud/cloud-ide/dbt-assist-toggle.jpg deleted file mode 100644 index 50dfbe7f51a..00000000000 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/dbt-assist-toggle.jpg and /dev/null differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/dbt-assist.gif b/website/static/img/docs/dbt-cloud/cloud-ide/dbt-assist.gif deleted file mode 100644 index be3236a5123..00000000000 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/dbt-assist.gif and /dev/null differ diff --git a/website/static/img/docs/dbt-cloud/cloud-ide/dbt-copilot-doc.gif b/website/static/img/docs/dbt-cloud/cloud-ide/dbt-copilot-doc.gif index cca8db37a0a..2e4d42e2efe 100644 Binary files a/website/static/img/docs/dbt-cloud/cloud-ide/dbt-copilot-doc.gif and 
b/website/static/img/docs/dbt-cloud/cloud-ide/dbt-copilot-doc.gif differ diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/job-override.gif b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/job-override.gif index 3ce6cee6259..1fb2cbd3e97 100644 Binary files a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/job-override.gif and b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/job-override.gif differ diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/personal-override.gif b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/personal-override.gif index 4185e3c98d8..d3e64f2c4af 100644 Binary files a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/personal-override.gif and b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/personal-override.gif differ diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/personal-override.png b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/personal-override.png index 64b0ac8170f..b221a0b73ba 100644 Binary files a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/personal-override.png and b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/personal-override.png differ diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/refresh-ide.gif b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/refresh-ide.gif deleted file mode 100644 index 14b700547ca..00000000000 Binary files a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/refresh-ide.gif and /dev/null differ diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/refresh-ide.png b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/refresh-ide.png new file mode 100644 index 00000000000..54588f53d5d Binary files /dev/null and b/website/static/img/docs/dbt-cloud/using-dbt-cloud/Environment Variables/refresh-ide.png differ diff --git a/website/static/img/docs/dbt-cloud/using-dbt-cloud/prod-settings-1.png b/website/static/img/docs/dbt-cloud/using-dbt-cloud/prod-settings-1.png new file mode 100644 index 00000000000..5fd53ffde78 Binary files /dev/null and b/website/static/img/docs/dbt-cloud/using-dbt-cloud/prod-settings-1.png differ diff --git a/website/vercel.json b/website/vercel.json index 3340a4ab684..b68dc053db9 100644 --- a/website/vercel.json +++ b/website/vercel.json @@ -102,6 +102,11 @@ "destination": "/docs/dbt-versions/core-upgrade/Older%20versions/upgrading-to-v1.4", "permanent": true }, + { + "source": "/docs/dbt-versions/versionless-cloud", + "destination": "/docs/dbt-versions/cloud-release-tracks", + "permanent": true + }, { "source": "/best-practices/how-we-mesh/mesh-4-faqs", "destination": "/best-practices/how-we-mesh/mesh-5-faqs", @@ -3646,7 +3651,7 @@ }, { "key": "Content-Security-Policy", - "value": "img-src 'self' data: https:;" + "value": "img-src 'self' data: https:; frame-ancestors 'self' https://*.mutinyhq.com https://*.getdbt.com" }, { "key": "Strict-Transport-Security",