diff --git a/contributing/content-style-guide.md b/contributing/content-style-guide.md index 0d2bf243d45..e6fafec4593 100644 --- a/contributing/content-style-guide.md +++ b/contributing/content-style-guide.md @@ -284,7 +284,7 @@ If the list starts getting lengthy and dense, consider presenting the same conte A bulleted list with introductory text: -> A dbt project is a directory of `.sql` and .yml` files. The directory must contain at a minimum: +> A dbt project is a directory of `.sql` and `.yml` files. The directory must contain at a minimum: > > - Models: A model is a single `.sql` file. Each model contains a single `select` statement that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation. > - A project file: A `dbt_project.yml` file, which configures and defines your dbt project. diff --git a/website/blog/2021-11-22-dbt-labs-pr-template.md b/website/blog/2021-11-22-dbt-labs-pr-template.md index 40d4960ac18..439a02371ec 100644 --- a/website/blog/2021-11-22-dbt-labs-pr-template.md +++ b/website/blog/2021-11-22-dbt-labs-pr-template.md @@ -70,7 +70,7 @@ Checking for things like modularity and 1:1 relationships between sources and st #### Validation of models: -This section should show something to confirm that your model is doing what you intended it to do. This could be a [dbt test](/docs/build/tests) like uniqueness or not null, or could be an ad-hoc query that you wrote to validate your data. Here is a screenshot from a test run on a local development branch: +This section should show something to confirm that your model is doing what you intended it to do. This could be a [dbt test](/docs/build/data-tests) like uniqueness or not null, or could be an ad-hoc query that you wrote to validate your data. Here is a screenshot from a test run on a local development branch: ![test validation](/img/blog/pr-template-test-validation.png "dbt test validation") diff --git a/website/blog/2021-11-22-primary-keys.md b/website/blog/2021-11-22-primary-keys.md index 84c92055eb0..d5f87cddd94 100644 --- a/website/blog/2021-11-22-primary-keys.md +++ b/website/blog/2021-11-22-primary-keys.md @@ -51,7 +51,7 @@ In the days before testing your data was commonplace, you often found out that y ## How to test primary keys with dbt -Today, you can add two simple [dbt tests](/docs/build/tests) onto your primary keys and feel secure that you are going to catch the vast majority of problems in your data. +Today, you can add two simple [dbt tests](/docs/build/data-tests) onto your primary keys and feel secure that you are going to catch the vast majority of problems in your data. 
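As a rough sketch, applying them takes only a few lines in the model's `.yml` file (the model and column names here are illustrative; swap in your own primary key):

```yaml
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
```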
Not surprisingly, these two tests correspond to the two most common errors found on your primary keys, and are usually the first tests that teams testing data with dbt implement: diff --git a/website/blog/2021-11-29-dbt-airflow-spiritual-alignment.md b/website/blog/2021-11-29-dbt-airflow-spiritual-alignment.md index b179c0f5c7c..d20c7d139d0 100644 --- a/website/blog/2021-11-29-dbt-airflow-spiritual-alignment.md +++ b/website/blog/2021-11-29-dbt-airflow-spiritual-alignment.md @@ -90,7 +90,7 @@ So instead of getting bogged down in defining roles, let’s focus on hard skill The common skills needed for implementing any flavor of dbt (Core or Cloud) are: * SQL: ‘nuff said -* YAML: required to generate config files for [writing tests on data models](/docs/build/tests) +* YAML: required to generate config files for [writing tests on data models](/docs/build/data-tests) * [Jinja](/guides/using-jinja): allows you to write DRY code (using [macros](/docs/build/jinja-macros), for loops, if statements, etc) YAML + Jinja can be learned pretty quickly, but SQL is the non-negotiable you’ll need to get started. diff --git a/website/blog/2021-12-05-how-to-build-a-mature-dbt-project-from-scratch.md b/website/blog/2021-12-05-how-to-build-a-mature-dbt-project-from-scratch.md index 8ea387cf00c..f3a24a0febd 100644 --- a/website/blog/2021-12-05-how-to-build-a-mature-dbt-project-from-scratch.md +++ b/website/blog/2021-12-05-how-to-build-a-mature-dbt-project-from-scratch.md @@ -87,7 +87,7 @@ The most important thing we’re introducing when your project is an infant is t * Introduce modularity with [{{ ref() }}](/reference/dbt-jinja-functions/ref) and [{{ source() }}](/reference/dbt-jinja-functions/source) -* [Document](/docs/collaborate/documentation) and [test](/docs/build/tests) your first models +* [Document](/docs/collaborate/documentation) and [test](/docs/build/data-tests) your first models ![image alt text](/img/blog/building-a-mature-dbt-project-from-scratch/image_3.png) diff --git a/website/blog/2022-04-19-complex-deduplication.md b/website/blog/2022-04-19-complex-deduplication.md index daacff4eec6..f33e6a8fe35 100644 --- a/website/blog/2022-04-19-complex-deduplication.md +++ b/website/blog/2022-04-19-complex-deduplication.md @@ -146,7 +146,7 @@ select * from filter_real_diffs > *What happens in this step? You check your data because you are thorough!* -Good thing dbt has already built this for you. Add a [unique test](/docs/build/tests#generic-tests) to your YAML model block for your `grain_id` in this de-duped staging model, and give it a dbt test! +Good thing dbt has already built this for you. Add a [unique test](/docs/build/data-tests#generic-data-tests) to your YAML model block for your `grain_id` in this de-duped staging model, and give it a dbt test! ```yaml models: diff --git a/website/blog/2022-09-28-analyst-to-ae.md b/website/blog/2022-09-28-analyst-to-ae.md index 7c8ccaeabec..bf19bbae59e 100644 --- a/website/blog/2022-09-28-analyst-to-ae.md +++ b/website/blog/2022-09-28-analyst-to-ae.md @@ -111,7 +111,7 @@ The analyst caught the issue because they have the appropriate context to valida An analyst is able to identify which areas do *not* need to be 100% accurate, which means they can also identify which areas *do* need to be 100% accurate. -> dbt makes it very quick to add [data quality tests](/docs/build/tests). In fact, it’s so quick, that it’ll take an analyst longer to write up what tests they want than it would take for an analyst to completely finish coding them. 
+> dbt makes it very quick to add [data quality tests](/docs/build/data-tests). In fact, it’s so quick, that it’ll take an analyst longer to write up what tests they want than it would take for an analyst to completely finish coding them. When data quality issues are identified by the business, we often see that analysts are the first ones to be asked: diff --git a/website/blog/2022-10-19-polyglot-dbt-python-dataframes-and-sql.md b/website/blog/2022-10-19-polyglot-dbt-python-dataframes-and-sql.md index bab92000a16..694f6ddc105 100644 --- a/website/blog/2022-10-19-polyglot-dbt-python-dataframes-and-sql.md +++ b/website/blog/2022-10-19-polyglot-dbt-python-dataframes-and-sql.md @@ -133,9 +133,9 @@ This model tries to parse the raw string value into a Python datetime. When not #### Testing the result -During the build process, dbt will check if any of the values are null. This is using the built-in [`not_null`](https://docs.getdbt.com/docs/building-a-dbt-project/tests#generic-tests) test, which will generate and execute SQL in the data platform. +During the build process, dbt will check if any of the values are null. This is using the built-in [`not_null`](https://docs.getdbt.com/docs/building-a-dbt-project/tests#generic-data-tests) test, which will generate and execute SQL in the data platform. -Our initial recommendation for testing Python models is to use [generic](https://docs.getdbt.com/docs/building-a-dbt-project/tests#generic-tests) and [singular](https://docs.getdbt.com/docs/building-a-dbt-project/tests#singular-tests) tests. +Our initial recommendation for testing Python models is to use [generic](https://docs.getdbt.com/docs/building-a-dbt-project/tests#generic-data-tests) and [singular](https://docs.getdbt.com/docs/building-a-dbt-project/tests#singular-data-tests) tests. ```yaml version: 2 diff --git a/website/blog/2023-01-24-aggregating-test-failures.md b/website/blog/2023-01-24-aggregating-test-failures.md index d82c202b376..2319da910a6 100644 --- a/website/blog/2023-01-24-aggregating-test-failures.md +++ b/website/blog/2023-01-24-aggregating-test-failures.md @@ -30,7 +30,7 @@ _It should be noted that this framework is for dbt v1.0+ on BigQuery. Small adap When we talk about high quality data tests, we aren’t just referencing high quality code, but rather the informational quality of our testing framework and their corresponding error messages. Originally, we theorized that any test that cannot be acted upon is a test that should not be implemented. Later, we realized there is a time and place for tests that should receive attention at a critical mass of failures. All we needed was a higher specificity system: tests should have an explicit severity ranking associated with them, equipped to filter out the noise of common, but low concern, failures. Each test should also mesh into established [RACI](https://project-management.com/understanding-responsibility-assignment-matrix-raci-matrix/) guidelines that state which groups tackle what failures, and what constitutes a critical mass. -To ensure that tests are always acted upon, we implement tests differently depending on the user groups that must act when a test fails. This led us to have two main classes of tests — Data Integrity Tests (called [Generic Tests](https://docs.getdbt.com/docs/build/tests) in dbt docs) and Context Driven Tests (called [Singular Tests](https://docs.getdbt.com/docs/build/tests#singular-tests) in dbt docs), with varying levels of severity across both test classes. 
+To ensure that tests are always acted upon, we implement tests differently depending on the user groups that must act when a test fails. This led us to have two main classes of tests — Data Integrity Tests (called [Generic Tests](https://docs.getdbt.com/docs/build/tests) in dbt docs) and Context Driven Tests (called [Singular Tests](https://docs.getdbt.com/docs/build/tests#singular-data-tests) in dbt docs), with varying levels of severity across both test classes. Data Integrity tests (Generic Tests)  are simple — they’re tests akin to a uniqueness check or not null constraint. These tests are usually actionable by the data platform team rather than subject matter experts. We define Data Integrity tests in our YAML files, similar to how they are [outlined by dbt’s documentation on generic tests](https://docs.getdbt.com/docs/build/tests). They look something like this — diff --git a/website/blog/2023-07-03-data-vault-2-0-with-dbt-cloud.md b/website/blog/2023-07-03-data-vault-2-0-with-dbt-cloud.md index 2a4879ac98d..6b1012a5320 100644 --- a/website/blog/2023-07-03-data-vault-2-0-with-dbt-cloud.md +++ b/website/blog/2023-07-03-data-vault-2-0-with-dbt-cloud.md @@ -143,7 +143,7 @@ To help you get started, [we have created a template GitHub project](https://git ### Entity Relation Diagrams (ERDs) and dbt -Data lineage is dbt's strength, but sometimes it's not enough to help you to understand the relationships between Data Vault components like a classic ERD would. There are a few open source packages to visualize the entities in your Data Vault built with dbt. I recommend checking out the [dbterd](https://dbterd.datnguyen.de/1.2/index.html) which turns your [dbt relationship data quality checks](https://docs.getdbt.com/docs/build/tests#generic-tests) into an ERD. +Data lineage is dbt's strength, but sometimes it's not enough to help you to understand the relationships between Data Vault components like a classic ERD would. There are a few open source packages to visualize the entities in your Data Vault built with dbt. I recommend checking out the [dbterd](https://dbterd.datnguyen.de/1.2/index.html) which turns your [dbt relationship data quality checks](https://docs.getdbt.com/docs/build/tests#generic-data-tests) into an ERD. ## Summary diff --git a/website/blog/2023-12-11-semantic-layer-on-semantic-layer.md b/website/blog/2023-12-11-semantic-layer-on-semantic-layer.md new file mode 100644 index 00000000000..ea77072a6dd --- /dev/null +++ b/website/blog/2023-12-11-semantic-layer-on-semantic-layer.md @@ -0,0 +1,97 @@ +--- +title: "How we built consistent product launch metrics with the dbt Semantic Layer." +description: "We built an end-to-end data pipeline for measuring the launch of the dbt Semantic Layer using the dbt Semantic Layer." +slug: product-analytics-pipeline-with-dbt-semantic-layer + +authors: [jordan_stein] + +tags: [dbt Cloud] +hide_table_of_contents: false + +date: 2023-12-12 +is_featured: false +--- +There’s nothing quite like the feeling of launching a new product. +On launch day emotions can range from excitement, to fear, to accomplishment all in the same hour. +Once the dust settles and the product is in the wild, the next thing the team needs to do is track how the product is doing. +How many users do we have? How is performance looking? What features are customers using? How often? Answering these questions is vital to understanding the success of any product launch. 
+At dbt we recently made the [Semantic Layer Generally Available](https://www.getdbt.com/blog/new-dbt-cloud-features-announced-at-coalesce-2023). The Semantic Layer lets teams define business metrics centrally, in dbt, and access them in multiple analytics tools through our semantic layer APIs.
+I’m a Product Manager on the Semantic Layer team, and the launch of the Semantic Layer put our team in an interesting, somewhat “meta,” position: we need to understand how a product launch is doing, and the product we just launched is designed to make defining and consuming metrics much more efficient. It’s the perfect opportunity to put the semantic layer through its paces for product analytics. This blog post walks through the end-to-end process we used to set up product analytics for the dbt Semantic Layer using the dbt Semantic Layer.
+
+## Getting your data ready for metrics
+
+The first steps to building a product analytics pipeline with the Semantic Layer look the same as just using dbt - it’s all about data transformation. The steps we followed were broadly:
+
+1. Work with engineering to understand the data sources. In our case, it’s database exports from the Semantic Layer Server.
+2. Load the data into our warehouse. We use Fivetran and Snowflake.
+3. Transform the data into normalized tables with dbt. This step is a classic: dbt’s bread and butter. You probably know the drill by now.
+
+There are [plenty of other great resources](https://docs.getdbt.com/docs/build/projects) on how to accomplish the above steps, so I’m going to skip that in this post and focus on how we built business metrics using the Semantic Layer. Once the data is loaded and modeling is complete, our DAG for the Semantic Layer data looks like the following:
+
+
+
+Let’s walk through the DAG from left to right: first, we have raw tables from the Semantic Layer Server loaded into our warehouse; next, staging models where we apply business logic; and then a clean, normalized `fct_semantic_layer_queries` model. Finally, we built a semantic model named `semantic_layer_queries` on top of our normalized fact model. This is a typical DAG for a dbt project that contains semantic objects. Now let’s zoom in to the section of the DAG that contains our semantic layer objects and look in more detail at how we defined our semantic layer product metrics.
+
+## [How we build semantic models and metrics](https://docs.getdbt.com/best-practices/how-we-build-our-metrics/semantic-layer-1-intro)
+
+What [is a semantic model](https://docs.getdbt.com/docs/build/semantic-models)? Put simply, semantic models contain the components we need to build metrics. Semantic models are YAML files that live in your dbt project. They contain metadata about your dbt models in a format that MetricFlow, the query builder that powers the semantic layer, can understand. The DAG below in [dbt Explorer](https://docs.getdbt.com/docs/collaborate/explore-projects) shows the metrics we’ve built off of `semantic_layer_queries`.
+
+
+
+Let’s dig into semantic models and metrics a bit more, and explain some of the data modeling decisions we made. First, we needed to decide what model to use as a base for our semantic model. We decided to use `fct_semantic_layer_queries` as our base model because defining a semantic model on top of a normalized fact table gives us maximum flexibility to join to other tables. This increased the number of dimensions available, which means we can answer more questions.
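To make this concrete, here is a simplified sketch of what a semantic model built on top of `fct_semantic_layer_queries` could look like. The column names (`query_id`, `created_at`, `query_status`) are illustrative rather than our exact production schema:

```yaml
semantic_models:
  - name: semantic_layer_queries
    description: One record per query issued against the dbt Semantic Layer.
    model: ref('fct_semantic_layer_queries')
    defaults:
      agg_time_dimension: created_at
    entities:
      - name: query_id
        type: primary
      - name: dbt_cloud_environment_id
        type: foreign
    dimensions:
      - name: created_at
        type: time
        type_params:
          time_granularity: day
      - name: query_status
        type: categorical
    measures:
      - name: queries
        description: The total number of queries run
        agg: count
        expr: query_id
```

The `queries` measure defined here is the building block that the metric definition shown later in this post references.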
+
+You may wonder: why not just build our metrics on top of raw tables and let MetricFlow figure out the rest? The reality is that you will almost always need to do some form of data modeling to create the data set you want to build your metrics off of. MetricFlow’s job isn’t to do data modeling. The transformation step is done with dbt.
+
+Next, we had to decide what we wanted to put into our semantic models. Semantic models contain [dimensions](https://docs.getdbt.com/docs/build/dimensions), [measures](https://docs.getdbt.com/docs/build/measures), and [entities](https://docs.getdbt.com/docs/build/entities). We took the following approach to add each of these components:
+
+- Dimensions: We included all the relevant dimensions in our semantic model that stakeholders might ask for, like the time a query was created, the query status, and booleans showing if a query contained certain elements like a where filter or multiple metrics.
+- Entities: We added entities to our semantic model, like dbt Cloud environment ID. Entities function as join keys in semantic models, which means any other semantic models that have a [joinable entity](https://docs.getdbt.com/docs/build/join-logic) can be used when querying metrics.
+- Measures: Next, we added measures. Measures define the aggregation you want to run on your data. I think of measures as a metric primitive: we’ll use them to build metrics and can reuse them to keep our code [DRY](https://docs.getdbt.com/terms/dry).
+
+Finally, we reference the measures defined in our semantic model to create metrics. Our initial set of usage metrics is made up of relatively simple aggregations; for example, the total number of queries run.
+
+```yaml
+## Example of a metric definition
+metrics:
+  - name: queries
+    description: The total number of queries run
+    type: simple
+    label: Semantic Layer Queries
+    type_params:
+      measure: queries
+```
+
+Having our metrics in the semantic layer is powerful in a few ways. Firstly, metric definitions and the generated SQL are centralized and live in our dbt project, instead of being scattered across BI tools or SQL clients. Secondly, the types of queries I can run are dynamic and flexible. Traditionally, I would materialize a cube or rollup table that needs to contain all the different dimensional slices my users might be curious about. Now, users can join tables and add dimensionality to their metrics queries on the fly at query time, saving our data team cycles of updating and adding new fields to rollup tables. Thirdly, we can expose these metrics to a variety of downstream BI tools so stakeholders in product, finance, or GTM can understand product performance regardless of their technical skills.
+
+Now that we’ve done the pipeline work to set up our metrics for the semantic layer launch, we’re ready to analyze how the launch went!
+
+## Our Finance, Operations and GTM teams are all looking at the same metrics 😊
+
+To query the Semantic Layer you have two paths: you can query metrics directly through the Semantic Layer APIs or use one of our [first-class integrations](https://docs.getdbt.com/docs/use-dbt-semantic-layer/avail-sl-integrations). Our analytics team and product teams are big Hex users, while our operations and finance teams live and breathe Google Sheets, so it’s important for us to have the same metric definitions available in both tools.
+
+The legwork of building our pipeline and defining metrics is all done, which makes last-mile consumption much easier.
First, we set up a launch dashboard in Hex as the source of truth for semantic layer product metrics. This tool is used by cross-functional partners like marketing, sales, and the executive team to easily check product and usage metrics like total semantic layer queries, or weekly active semantic layer users. To set up our Hex connection, we simply enter a few details from our dbt Cloud environment and then we can work with metrics directly in Hex notebooks. We can use the JDBC interface, or use Hex’s GUI metric builder to build reports. We run all our WBRs off this dashboard, which allows us to spot trends in consumption and react quickly to changes in our business. + + + + + +On the finance and operations side, product usage data is crucial to making informed pricing decisions. All our pricing models are created in spreadsheets, so we leverage the Google Sheets integration to give those teams access to consistent data sets without the need to download CSVs from the Hex dashboard. This lets the Pricing team add dimensional slices, like tier and company size, to the data in a self-serve manner without having to request data team resources to generate those insights. This allows our finance team to iteratively build financial models and be more self-sufficient in pulling data, instead of relying on data team resources. + + + + + +As a former data scientist and data engineer, I personally think this is a huge improvement over the approach I would have used without the semantic layer. My old approach would have been to materialize One Big Table with all the numeric and categorical columns I needed for analysis. Then write a ton of SQL in Hex or various notebooks to create reports for stakeholders. Inevitably I’m signing up for more development cycles to update the pipeline whenever a new dimension needs to be added or the data needs to be aggregated in a slightly different way. From a data team management perspective, using a central semantic layer saves data analysts cycles since users can more easily self-serve. At every company I’ve ever worked at, data analysts are always in high demand, with more requests than they can reasonably accomplish. This means any time a stakeholder can self-serve their data without pulling us in is a huge win. + +## The Result: Consistent Governed Metrics + +And just like that, we have an end-to-end pipeline for product analytics on the dbt Semantic Layer using the dbt Semantic Layer 🤯. Part of the foundational work to build this pipeline will be familiar to you, like building out a normalized fact table using dbt. Hopefully walking through the next step of adding semantic models and metrics on top of those dbt models helped give you some ideas about how you can use the semantic layer for your team. Having launch metrics defined in dbt made keeping the entire organization up to date on product adoption and performance much easier. Instead of a rollup table or static materialized cubes, we added flexible metrics without rewriting logic in SQL, or adding additional tables to the end of our DAG. + +The result is access to consistent and governed metrics in the tool our stakeholders are already using to do their jobs. We are able to keep the entire organization aligned and give them access to consistent, accurate data they need to do their part to make the semantic layer product successful. Thanks for reading! 
If you’re thinking of using the semantic layer, or have questions, we’re always happy to keep the conversation going in the [dbt community Slack](https://www.getdbt.com/community/join-the-community). Drop us a note in #dbt-cloud-semantic-layer. We’d love to hear from you!
diff --git a/website/blog/authors.yml b/website/blog/authors.yml
index 31d69824ed4..cd2bd162935 100644
--- a/website/blog/authors.yml
+++ b/website/blog/authors.yml
@@ -287,6 +287,14 @@ jonathan_natkins:
      url: https://twitter.com/nattyice
   name: Jon "Natty" Natkins
   organization: dbt Labs
+jordan_stein:
+  image_url: /img/blog/authors/jordan.jpeg
+  job_title: Product Manager
+  links:
+    - icon: fa-linkedin
+      url: https://www.linkedin.com/in/jstein5/
+  name: Jordan Stein
+  organization: dbt Labs
 josh_fell:
   image_url: /img/blog/authors/josh-fell.jpeg
diff --git a/website/docs/best-practices/clone-incremental-models.md b/website/docs/best-practices/clone-incremental-models.md
new file mode 100644
index 00000000000..4096af489ab
--- /dev/null
+++ b/website/docs/best-practices/clone-incremental-models.md
@@ -0,0 +1,79 @@
+---
+title: "Clone incremental models as the first step of your CI job"
+id: "clone-incremental-models"
+description: Learn how to clone incremental models as the first step of your CI job.
+displayText: Clone incremental models as the first step of your CI job
+hoverSnippet: Learn how to clone incremental models for CI jobs.
+---
+
+Before you begin, you must be aware of a few conditions:
+- `dbt clone` is only available with dbt version 1.6 and newer. Refer to our [upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) for help enabling newer versions in dbt Cloud.
+- This strategy only works for warehouses that support zero copy cloning (otherwise `dbt clone` will just create pointer views).
+- Some teams may want to test that their incremental models run in both incremental mode and full-refresh mode.
+
+Imagine you've created a [Slim CI job](/docs/deploy/continuous-integration) in dbt Cloud and it is configured to:
+
+- Defer to your production environment.
+- Run the command `dbt build --select state:modified+` to run and test all of the models you've modified and their downstream dependencies.
+- Trigger whenever a developer on your team opens a PR against the main branch.
+
+
+
+Now imagine your dbt project looks something like this in the DAG:
+
+
+
+When you open a pull request (PR) that modifies `dim_wizards`, your CI job will kick off and build _only the modified models and their downstream dependencies_ (in this case, `dim_wizards` and `fct_orders`) into a temporary schema that's unique to your PR.
+
+This build mimics the behavior of what will happen once the PR is merged into the main branch. It ensures you're not introducing breaking changes, without needing to build your entire dbt project.
+
+## What happens when one of the modified models (or one of their downstream dependencies) is an incremental model?
+
+Because your CI job is building modified models into a PR-specific schema, on the first execution of `dbt build --select state:modified+`, the modified incremental model will be built in its entirety _because it does not yet exist in the PR-specific schema_ and [is_incremental will be false](/docs/build/incremental-models#understanding-the-is_incremental-macro). You're running in `full-refresh` mode.
+
+This can be suboptimal because:
+- Typically incremental models are your largest datasets, so they take a long time to build in their entirety, which can slow down development time and incur high warehouse costs.
+- There are situations where a `full-refresh` of the incremental model passes successfully in your CI job but an _incremental_ build of that same table in prod would fail when the PR is merged into main (think schema drift where the [on_schema_change](/docs/build/incremental-models#what-if-the-columns-of-my-incremental-model-change) config is set to `fail`).
+
+You can alleviate these problems by zero copy cloning the relevant, pre-existing incremental models into your PR-specific schema as the first step of the CI job using the `dbt clone` command. This way, the incremental models already exist in the PR-specific schema when you first execute the command `dbt build --select state:modified+`, so the `is_incremental` flag will be `true`.
+
+You'll have two commands for your dbt Cloud CI check to execute:
+1. Clone all of the pre-existing incremental models that have been modified or are downstream of another model that has been modified: `dbt clone --select state:modified+,config.materialized:incremental,state:old`
+2. Build all of the models that have been modified and their downstream dependencies: `dbt build --select state:modified+`
+
+Because of your first clone step, the incremental models selected in your `dbt build` on the second step will run in incremental mode.
+
+
+
+Your CI jobs will run faster, and you're more accurately mimicking the behavior of what will happen once the PR has been merged into main.
+
+### Expansion on "think schema drift where [on_schema_change](/docs/build/incremental-models#what-if-the-columns-of-my-incremental-model-change) config is set to `fail`" from above
+
+Imagine you have an incremental model `my_incremental_model` with the following config:
+
+```sql
+
+{{
+  config(
+    materialized='incremental',
+    unique_key='unique_id',
+    on_schema_change='fail'
+  )
+}}
+
+```
+
+Now, let’s say you open up a PR that adds a new column to `my_incremental_model`. In this case:
+- An incremental build will fail.
+- A `full-refresh` will succeed.
+
+If you have a daily production job that just executes `dbt build` without a `--full-refresh` flag, once the PR is merged into main and the job kicks off, you will get a failure. So the question is: what do you want to happen in CI?
+- Do you want to also get a failure in CI, so that you know that once this PR is merged into main you need to immediately execute a `dbt build --full-refresh --select my_incremental_model` in production in order to avoid a failure in prod? This will block your CI check from passing.
+- Do you want your CI check to succeed, because once you do run a `full-refresh` for this model in prod you will be in a successful state? This may lead to unpleasant surprises if your production job suddenly fails when you merge this PR into main because you don’t remember that you need to execute a `dbt build --full-refresh --select my_incremental_model` in production.
+
+There’s probably no perfect solution here; it’s all just tradeoffs! Our preference would be to have the failing CI job and have to manually override the blocking branch protection rule so that there are no surprises and we can proactively run the appropriate command in production once the PR is merged.
+ +### Expansion on "why `state:old`" + +For brand new incremental models, you want them to run in `full-refresh` mode in CI, because they will run in `full-refresh` mode in production when the PR is merged into `main`. They also don't exist yet in the production environment... they're brand new! +If you don't specify this, you won't get an error just a “No relation found in state manifest for…”. So, it technically works without specifying `state:old` but adding `state:old` is more explicit and means it won't even try to clone the brand new incremental models. diff --git a/website/docs/best-practices/custom-generic-tests.md b/website/docs/best-practices/custom-generic-tests.md index f2d84e38853..e96fc864ee6 100644 --- a/website/docs/best-practices/custom-generic-tests.md +++ b/website/docs/best-practices/custom-generic-tests.md @@ -1,15 +1,15 @@ --- -title: "Writing custom generic tests" +title: "Writing custom generic data tests" id: "writing-custom-generic-tests" -description: Learn how to define your own custom generic tests. -displayText: Writing custom generic tests -hoverSnippet: Learn how to define your own custom generic tests. +description: Learn how to define your own custom generic data tests. +displayText: Writing custom generic data tests +hoverSnippet: Learn how to write your own custom generic data tests. --- -dbt ships with [Not Null](/reference/resource-properties/tests#not-null), [Unique](/reference/resource-properties/tests#unique), [Relationships](/reference/resource-properties/tests#relationships), and [Accepted Values](/reference/resource-properties/tests#accepted-values) generic tests. (These used to be called "schema tests," and you'll still see that name in some places.) Under the hood, these generic tests are defined as `test` blocks (like macros) in a globally accessible dbt project. You can find the source code for these tests in the [global project](https://github.com/dbt-labs/dbt-core/tree/main/core/dbt/include/global_project/macros/generic_test_sql). +dbt ships with [Not Null](/reference/resource-properties/data-tests#not-null), [Unique](/reference/resource-properties/data-tests#unique), [Relationships](/reference/resource-properties/data-tests#relationships), and [Accepted Values](/reference/resource-properties/data-tests#accepted-values) generic data tests. (These used to be called "schema tests," and you'll still see that name in some places.) Under the hood, these generic data tests are defined as `test` blocks (like macros) in a globally accessible dbt project. You can find the source code for these tests in the [global project](https://github.com/dbt-labs/dbt-core/tree/main/core/dbt/include/global_project/macros/generic_test_sql). :::info -There are tons of generic tests defined in open source packages, such as [dbt-utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) and [dbt-expectations](https://hub.getdbt.com/calogica/dbt_expectations/latest/) — the test you're looking for might already be here! +There are tons of generic data tests defined in open source packages, such as [dbt-utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) and [dbt-expectations](https://hub.getdbt.com/calogica/dbt_expectations/latest/) — the test you're looking for might already be here! 
::: ### Generic tests with standard arguments diff --git a/website/docs/best-practices/dbt-unity-catalog-best-practices.md b/website/docs/best-practices/dbt-unity-catalog-best-practices.md index 89153fe1b86..a55e1d121af 100644 --- a/website/docs/best-practices/dbt-unity-catalog-best-practices.md +++ b/website/docs/best-practices/dbt-unity-catalog-best-practices.md @@ -49,9 +49,9 @@ The **prod** service principal should have “read” access to raw source data, | | Source Data | Development catalog | Production catalog | Test catalog | | --- | --- | --- | --- | --- | -| developers | use | use, create table & create view | use or none | none | -| production service principal | use | none | use, create table & create view | none | -| Test service principal | use | none | none | use, create table & create view | +| developers | use | use, create schema, table, & view | use or none | none | +| production service principal | use | none | use, create schema, table & view | none | +| Test service principal | use | none | none | use, create schema, table & view | ## Next steps diff --git a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md index 19c6717063c..ee3d4262882 100644 --- a/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md +++ b/website/docs/best-practices/how-we-build-our-metrics/semantic-layer-1-intro.md @@ -4,10 +4,6 @@ description: Getting started with the dbt and MetricFlow hoverSnippet: Learn how to get started with the dbt and MetricFlow --- -:::tip -**This is a guide for a beta product.** We anticipate this guide will evolve alongside the Semantic Layer through community collaboration. We welcome discussions, ideas, issues, and contributions to refining best practices. -::: - Flying cars, hoverboards, and true self-service analytics: this is the future we were promised. The first two might still be a few years out, but real self-service analytics is here today. With dbt Cloud's Semantic Layer, you can resolve the tension between accuracy and flexibility that has hampered analytics tools for years, empowering everybody in your organization to explore a shared reality of metrics. Best of all for analytics engineers, building with these new tools will significantly [DRY](https://docs.getdbt.com/terms/dry) up and simplify your codebase. As you'll see, the deep interaction between your dbt models and the Semantic Layer make your dbt project the ideal place to craft your metrics. ## Learning goals diff --git a/website/docs/best-practices/materializations/materializations-guide-4-incremental-models.md b/website/docs/best-practices/materializations/materializations-guide-4-incremental-models.md index 71b24ef58f2..a195812ff1c 100644 --- a/website/docs/best-practices/materializations/materializations-guide-4-incremental-models.md +++ b/website/docs/best-practices/materializations/materializations-guide-4-incremental-models.md @@ -7,7 +7,7 @@ displayText: Materializations best practices hoverSnippet: Read this guide to understand the incremental models you can create in dbt. --- -So far we’ve looked at tables and views, which map to the traditional objects in the data warehouse. As mentioned earlier, incremental models are a little different. This where we start to deviate from this pattern with more powerful and complex materializations. +So far we’ve looked at tables and views, which map to the traditional objects in the data warehouse. 
As mentioned earlier, incremental models are a little different. This is where we start to deviate from this pattern with more powerful and complex materializations. - 📚 **Incremental models generate tables.** They physically persist the data itself to the warehouse, just piece by piece. What’s different is **how we build that table**. - 💅 **Only apply our transformations to rows of data with new or updated information**, this maximizes efficiency. @@ -53,7 +53,7 @@ where updated_at > (select max(updated_at) from {{ this }}) ``` -Let’s break down that `where` clause a bit, because this where the action is with incremental models. Stepping through the code **_right-to-left_** we: +Let’s break down that `where` clause a bit, because this is where the action is with incremental models. Stepping through the code **_right-to-left_** we: 1. Get our **cutoff.** 1. Select the `max(updated_at)` timestamp — the **most recent record** @@ -138,7 +138,7 @@ where {% endif %} ``` -Fantastic! We’ve got a working incremental model. On our first run, when there is no corresponding table in the warehouse, `is_incremental` will evaluate to false and we’ll capture the entire table. On subsequent runs is it will evaluate to true and we’ll apply our filter logic, capturing only the newer data. +Fantastic! We’ve got a working incremental model. On our first run, when there is no corresponding table in the warehouse, `is_incremental` will evaluate to false and we’ll capture the entire table. On subsequent runs it will evaluate to true and we’ll apply our filter logic, capturing only the newer data. ### Late arriving facts diff --git a/website/docs/community/resources/getting-help.md b/website/docs/community/resources/getting-help.md index 2f30644186e..19b7c22fbdf 100644 --- a/website/docs/community/resources/getting-help.md +++ b/website/docs/community/resources/getting-help.md @@ -60,4 +60,4 @@ If you want to receive dbt training, check out our [dbt Learn](https://learn.get - Billing - Bug reports related to the web interface -As a rule of thumb, if you are using dbt Cloud, but your problem is related to code within your dbt project, then please follow the above process rather than reaching out to support. +As a rule of thumb, if you are using dbt Cloud, but your problem is related to code within your dbt project, then please follow the above process rather than reaching out to support. Refer to [dbt Cloud support](/docs/dbt-support) for more information. diff --git a/website/docs/community/resources/oss-expectations.md b/website/docs/community/resources/oss-expectations.md index 9c916de1240..a57df7fe67f 100644 --- a/website/docs/community/resources/oss-expectations.md +++ b/website/docs/community/resources/oss-expectations.md @@ -94,6 +94,8 @@ PRs are your surest way to make the change you want to see in dbt / packages / d **Every PR should be associated with an issue.** Why? Before you spend a lot of time working on a contribution, we want to make sure that your proposal will be accepted. You should open an issue first, describing your desired outcome and outlining your planned change. If you've found an older issue that's already open, comment on it with an outline for your planned implementation. Exception to this rule: If you're just opening a PR for a cosmetic fix, such as a typo in documentation, an issue isn't needed. +**PRs should include robust testing.** Comprehensive testing within pull requests is crucial for the stability of our project. 
By prioritizing robust testing, we ensure the reliability of our codebase, minimize unforeseen issues and safeguard against potential regressions. We understand that creating thorough tests often requires significant effort, and your dedication to this process greatly contributes to the project's overall reliability. Thank you for your commitment to maintaining the integrity of our codebase!" + **Our goal is to review most new PRs within 7 days.** The first review will include some high-level comments about the implementation, including (at a high level) whether it’s something we think suitable to merge. Depending on the scope of the PR, the first review may include line-level code suggestions, or we may delay specific code review until the PR is more finalized / until we have more time. **Automation that can help us:** Many repositories have a template for pull request descriptions, which will include a checklist that must be completed before the PR can be merged. You don’t have to do all of these things to get an initial PR, but they definitely help. Those many include things like: diff --git a/website/docs/community/spotlight/alison-stanton.md b/website/docs/community/spotlight/alison-stanton.md index 2054f78b0b7..fd4a1796411 100644 --- a/website/docs/community/spotlight/alison-stanton.md +++ b/website/docs/community/spotlight/alison-stanton.md @@ -17,6 +17,7 @@ socialLinks: link: https://github.com/alison985/ dateCreated: 2023-11-07 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/bruno-de-lima.md b/website/docs/community/spotlight/bruno-de-lima.md index 0365ee6c6a8..3cce6135ae0 100644 --- a/website/docs/community/spotlight/bruno-de-lima.md +++ b/website/docs/community/spotlight/bruno-de-lima.md @@ -20,6 +20,7 @@ socialLinks: link: https://medium.com/@bruno.szdl dateCreated: 2023-11-05 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/dakota-kelley.md b/website/docs/community/spotlight/dakota-kelley.md index 57834d9cdff..85b79f0e85a 100644 --- a/website/docs/community/spotlight/dakota-kelley.md +++ b/website/docs/community/spotlight/dakota-kelley.md @@ -15,6 +15,7 @@ socialLinks: link: https://www.linkedin.com/in/dakota-kelley/ dateCreated: 2023-11-08 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/fabiyi-opeyemi.md b/website/docs/community/spotlight/fabiyi-opeyemi.md index f67ff4aaefc..67efd90e1c5 100644 --- a/website/docs/community/spotlight/fabiyi-opeyemi.md +++ b/website/docs/community/spotlight/fabiyi-opeyemi.md @@ -18,6 +18,7 @@ socialLinks: link: https://www.linkedin.com/in/opeyemifabiyi/ dateCreated: 2023-11-06 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? 
diff --git a/website/docs/community/spotlight/josh-devlin.md b/website/docs/community/spotlight/josh-devlin.md index d8a9b91c282..6036105940b 100644 --- a/website/docs/community/spotlight/josh-devlin.md +++ b/website/docs/community/spotlight/josh-devlin.md @@ -23,6 +23,7 @@ socialLinks: link: https://www.linkedin.com/in/josh-devlin/ dateCreated: 2023-11-10 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/karen-hsieh.md b/website/docs/community/spotlight/karen-hsieh.md index 5147f39ce59..22d6915baf7 100644 --- a/website/docs/community/spotlight/karen-hsieh.md +++ b/website/docs/community/spotlight/karen-hsieh.md @@ -24,6 +24,7 @@ socialLinks: link: https://medium.com/@ijacwei dateCreated: 2023-11-04 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/oliver-cramer.md b/website/docs/community/spotlight/oliver-cramer.md index bfd62db0908..7e9974a8a2c 100644 --- a/website/docs/community/spotlight/oliver-cramer.md +++ b/website/docs/community/spotlight/oliver-cramer.md @@ -16,6 +16,7 @@ socialLinks: link: https://www.linkedin.com/in/oliver-cramer/ dateCreated: 2023-11-02 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/sam-debruyn.md b/website/docs/community/spotlight/sam-debruyn.md index 166adf58b09..24ce7aa1b15 100644 --- a/website/docs/community/spotlight/sam-debruyn.md +++ b/website/docs/community/spotlight/sam-debruyn.md @@ -18,6 +18,7 @@ socialLinks: link: https://debruyn.dev/ dateCreated: 2023-11-03 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/stacy-lo.md b/website/docs/community/spotlight/stacy-lo.md index f0b70fcc225..23a5491dd18 100644 --- a/website/docs/community/spotlight/stacy-lo.md +++ b/website/docs/community/spotlight/stacy-lo.md @@ -17,6 +17,7 @@ socialLinks: link: https://www.linkedin.com/in/olycats/ dateCreated: 2023-11-01 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/community/spotlight/sydney-burns.md b/website/docs/community/spotlight/sydney-burns.md index ecebd6cdec0..25278ac6ecf 100644 --- a/website/docs/community/spotlight/sydney-burns.md +++ b/website/docs/community/spotlight/sydney-burns.md @@ -15,6 +15,7 @@ socialLinks: link: https://www.linkedin.com/in/sydneyeburns/ dateCreated: 2023-11-09 hide_table_of_contents: true +communityAward: true --- ## When did you join the dbt community and in what way has it impacted your career? diff --git a/website/docs/docs/about-setup.md b/website/docs/docs/about-setup.md index ceb34a5ccbb..1021c1b65ac 100644 --- a/website/docs/docs/about-setup.md +++ b/website/docs/docs/about-setup.md @@ -21,14 +21,14 @@ To begin configuring dbt now, select the option that is right for you. 
diff --git a/website/docs/docs/build/build-metrics-intro.md b/website/docs/docs/build/build-metrics-intro.md index cdac51224ed..24af2a0864a 100644 --- a/website/docs/docs/build/build-metrics-intro.md +++ b/website/docs/docs/build/build-metrics-intro.md @@ -14,7 +14,7 @@ Use MetricFlow in dbt to centrally define your metrics. As a key component of th MetricFlow allows you to: - Intuitively define metrics in your dbt project -- Develop from your preferred environment, whether that's the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation), [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud), or [dbt Core](/docs/core/installation) +- Develop from your preferred environment, whether that's the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation), [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud), or [dbt Core](/docs/core/installation-overview) - Use [MetricFlow commands](/docs/build/metricflow-commands) to query and test those metrics in your development environment - Harness the true magic of the universal dbt Semantic Layer and dynamically query these metrics in downstream tools (Available for dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) accounts only). diff --git a/website/docs/docs/build/cumulative-metrics.md b/website/docs/docs/build/cumulative-metrics.md index 708045c1f3e..45a136df751 100644 --- a/website/docs/docs/build/cumulative-metrics.md +++ b/website/docs/docs/build/cumulative-metrics.md @@ -38,10 +38,7 @@ metrics: ## Limitations Cumulative metrics are currently under active development and have the following limitations: - -1. You can only use the [`metric_time` dimension](/docs/build/dimensions#time) to check cumulative metrics. If you don't use `metric_time` in the query, the cumulative metric will return incorrect results because it won't perform the time spine join. This means you cannot reference time dimensions other than the `metric_time` in the query. -2. If you use `metric_time` in your query filter but don't include "start_time" and "end_time," cumulative metrics will left-censor the input data. For example, if you query a cumulative metric with a 7-day window with the filter `{{ TimeDimension('metric_time') }} BETWEEN '2023-08-15' AND '2023-08-30' `, the values for `2023-08-15` to `2023-08-20` return missing or incomplete data. This is because we apply the `metric_time` filter to the aggregation input. To avoid this, you must use `start_time` and `end_time` in the query filter. - +- You are required to use [`metric_time` dimension](/docs/build/dimensions#time) when querying cumulative metrics. If you don't use `metric_time` in the query, the cumulative metric will return incorrect results because it won't perform the time spine join. This means you cannot reference time dimensions other than the `metric_time` in the query. ## Cumulative metrics example diff --git a/website/docs/docs/build/tests.md b/website/docs/docs/build/data-tests.md similarity index 56% rename from website/docs/docs/build/tests.md rename to website/docs/docs/build/data-tests.md index 3d86dc6a81b..d981d7e272d 100644 --- a/website/docs/docs/build/tests.md +++ b/website/docs/docs/build/data-tests.md @@ -1,43 +1,43 @@ --- -title: "Add tests to your DAG" -sidebar_label: "Tests" -description: "Read this tutorial to learn how to use tests when building in dbt." +title: "Add data tests to your DAG" +sidebar_label: "Data tests" +description: "Read this tutorial to learn how to use data tests when building in dbt." 
search_weight: "heavy" -id: "tests" +id: "data-tests" keywords: - test, tests, testing, dag --- ## Related reference docs * [Test command](/reference/commands/test) -* [Test properties](/reference/resource-properties/tests) -* [Test configurations](/reference/test-configs) +* [Data test properties](/reference/resource-properties/data-tests) +* [Data test configurations](/reference/data-test-configs) * [Test selection examples](/reference/node-selection/test-selection-examples) ## Overview -Tests are assertions you make about your models and other resources in your dbt project (e.g. sources, seeds and snapshots). When you run `dbt test`, dbt will tell you if each test in your project passes or fails. +Data tests are assertions you make about your models and other resources in your dbt project (e.g. sources, seeds and snapshots). When you run `dbt test`, dbt will tell you if each test in your project passes or fails. -You can use tests to improve the integrity of the SQL in each model by making assertions about the results generated. Out of the box, you can test whether a specified column in a model only contains non-null values, unique values, or values that have a corresponding value in another model (for example, a `customer_id` for an `order` corresponds to an `id` in the `customers` model), and values from a specified list. You can extend tests to suit business logic specific to your organization – any assertion that you can make about your model in the form of a select query can be turned into a test. +You can use data tests to improve the integrity of the SQL in each model by making assertions about the results generated. Out of the box, you can test whether a specified column in a model only contains non-null values, unique values, or values that have a corresponding value in another model (for example, a `customer_id` for an `order` corresponds to an `id` in the `customers` model), and values from a specified list. You can extend data tests to suit business logic specific to your organization – any assertion that you can make about your model in the form of a select query can be turned into a data test. -Both types of tests return a set of failing records. Previously, generic/schema tests returned a numeric value representing failures. Generic tests (f.k.a. schema tests) are defined using `test` blocks instead of macros prefixed `test_`. +Data tests return a set of failing records. Generic data tests (f.k.a. schema tests) are defined using `test` blocks. -Like almost everything in dbt, tests are SQL queries. In particular, they are `select` statements that seek to grab "failing" records, ones that disprove your assertion. If you assert that a column is unique in a model, the test query selects for duplicates; if you assert that a column is never null, the test seeks after nulls. If the test returns zero failing rows, it passes, and your assertion has been validated. +Like almost everything in dbt, data tests are SQL queries. In particular, they are `select` statements that seek to grab "failing" records, ones that disprove your assertion. If you assert that a column is unique in a model, the test query selects for duplicates; if you assert that a column is never null, the test seeks after nulls. If the data test returns zero failing rows, it passes, and your assertion has been validated. 
-There are two ways of defining tests in dbt: -* A **singular** test is testing in its simplest form: If you can write a SQL query that returns failing rows, you can save that query in a `.sql` file within your [test directory](/reference/project-configs/test-paths). It's now a test, and it will be executed by the `dbt test` command. -* A **generic** test is a parameterized query that accepts arguments. The test query is defined in a special `test` block (like a [macro](jinja-macros)). Once defined, you can reference the generic test by name throughout your `.yml` files—define it on models, columns, sources, snapshots, and seeds. dbt ships with four generic tests built in, and we think you should use them! +There are two ways of defining data tests in dbt: +* A **singular** data test is testing in its simplest form: If you can write a SQL query that returns failing rows, you can save that query in a `.sql` file within your [test directory](/reference/project-configs/test-paths). It's now a data test, and it will be executed by the `dbt test` command. +* A **generic** data test is a parameterized query that accepts arguments. The test query is defined in a special `test` block (like a [macro](jinja-macros)). Once defined, you can reference the generic test by name throughout your `.yml` files—define it on models, columns, sources, snapshots, and seeds. dbt ships with four generic data tests built in, and we think you should use them! -Defining tests is a great way to confirm that your code is working correctly, and helps prevent regressions when your code changes. Because you can use them over and over again, making similar assertions with minor variations, generic tests tend to be much more common—they should make up the bulk of your dbt testing suite. That said, both ways of defining tests have their time and place. +Defining data tests is a great way to confirm that your outputs and inputs are as expected, and helps prevent regressions when your code changes. Because you can use them over and over again, making similar assertions with minor variations, generic data tests tend to be much more common—they should make up the bulk of your dbt data testing suite. That said, both ways of defining data tests have their time and place. -:::tip Creating your first tests +:::tip Creating your first data tests If you're new to dbt, we recommend that you check out our [quickstart guide](/guides) to build your first dbt project with models and tests. ::: -## Singular tests +## Singular data tests -The simplest way to define a test is by writing the exact SQL that will return failing records. We call these "singular" tests, because they're one-off assertions usable for a single purpose. +The simplest way to define a data test is by writing the exact SQL that will return failing records. We call these "singular" data tests, because they're one-off assertions usable for a single purpose. -These tests are defined in `.sql` files, typically in your `tests` directory (as defined by your [`test-paths` config](/reference/project-configs/test-paths)). You can use Jinja (including `ref` and `source`) in the test definition, just like you can when creating models. Each `.sql` file contains one `select` statement, and it defines one test: +These tests are defined in `.sql` files, typically in your `tests` directory (as defined by your [`test-paths` config](/reference/project-configs/test-paths)). You can use Jinja (including `ref` and `source`) in the test definition, just like you can when creating models. 
Each `.sql` file contains one `select` statement, and it defines one data test: @@ -56,10 +56,10 @@ having not(total_amount >= 0) The name of this test is the name of the file: `assert_total_payment_amount_is_positive`. Simple enough. -Singular tests are easy to write—so easy that you may find yourself writing the same basic structure over and over, only changing the name of a column or model. By that point, the test isn't so singular! In that case, we recommend... +Singular data tests are easy to write—so easy that you may find yourself writing the same basic structure over and over, only changing the name of a column or model. By that point, the test isn't so singular! In that case, we recommend... -## Generic tests -Certain tests are generic: they can be reused over and over again. A generic test is defined in a `test` block, which contains a parametrized query and accepts arguments. It might look like: +## Generic data tests +Certain data tests are generic: they can be reused over and over again. A generic data test is defined in a `test` block, which contains a parametrized query and accepts arguments. It might look like: ```sql {% test not_null(model, column_name) %} @@ -77,7 +77,7 @@ You'll notice that there are two arguments, `model` and `column_name`, which are If this is your first time working with adding properties to a resource, check out the docs on [declaring properties](/reference/configs-and-properties). ::: -Out of the box, dbt ships with four generic tests already defined: `unique`, `not_null`, `accepted_values` and `relationships`. Here's a full example using those tests on an `orders` model: +Out of the box, dbt ships with four generic data tests already defined: `unique`, `not_null`, `accepted_values` and `relationships`. Here's a full example using those tests on an `orders` model: ```yml version: 2 @@ -100,19 +100,19 @@ models: field: id ``` -In plain English, these tests translate to: +In plain English, these data tests translate to: * `unique`: the `order_id` column in the `orders` model should be unique * `not_null`: the `order_id` column in the `orders` model should not contain null values * `accepted_values`: the `status` column in the `orders` should be one of `'placed'`, `'shipped'`, `'completed'`, or `'returned'` * `relationships`: each `customer_id` in the `orders` model exists as an `id` in the `customers` (also known as referential integrity) -Behind the scenes, dbt constructs a `select` query for each test, using the parametrized query from the generic test block. These queries return the rows where your assertion is _not_ true; if the test returns zero rows, your assertion passes. +Behind the scenes, dbt constructs a `select` query for each data test, using the parametrized query from the generic test block. These queries return the rows where your assertion is _not_ true; if the test returns zero rows, your assertion passes. -You can find more information about these tests, and additional configurations (including [`severity`](/reference/resource-configs/severity) and [`tags`](/reference/resource-configs/tags)) in the [reference section](/reference/resource-properties/tests). +You can find more information about these data tests, and additional configurations (including [`severity`](/reference/resource-configs/severity) and [`tags`](/reference/resource-configs/tags)) in the [reference section](/reference/resource-properties/data-tests). -### More generic tests +### More generic data tests -Those four tests are enough to get you started. 
You'll quickly find you want to use a wider variety of tests—a good thing! You can also install generic tests from a package, or write your own, to use (and reuse) across your dbt project. Check out the [guide on custom generic tests](/best-practices/writing-custom-generic-tests) for more information. +Those four tests are enough to get you started. You'll quickly find you want to use a wider variety of tests—a good thing! You can also install generic data tests from a package, or write your own, to use (and reuse) across your dbt project. Check out the [guide on custom generic tests](/best-practices/writing-custom-generic-tests) for more information. :::info There are generic tests defined in some open source packages, such as [dbt-utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/) and [dbt-expectations](https://hub.getdbt.com/calogica/dbt_expectations/latest/) — skip ahead to the docs on [packages](/docs/build/packages) to learn more! @@ -241,7 +241,7 @@ where {{ column_name }} is null ## Storing test failures -Normally, a test query will calculate failures as part of its execution. If you set the optional `--store-failures` flag, the [`store_failures`](/reference/resource-configs/store_failures), or the [`store_failures_as`](/reference/resource-configs/store_failures_as) configs, dbt will first save the results of a test query to a table in the database, and then query that table to calculate the number of failures. +Normally, a data test query will calculate failures as part of its execution. If you set the optional `--store-failures` flag, the [`store_failures`](/reference/resource-configs/store_failures), or the [`store_failures_as`](/reference/resource-configs/store_failures_as) configs, dbt will first save the results of a test query to a table in the database, and then query that table to calculate the number of failures. This workflow allows you to query and examine failing records much more quickly in development: diff --git a/website/docs/docs/build/dimensions.md b/website/docs/docs/build/dimensions.md index b8679fe11b0..3c4edd9aef0 100644 --- a/website/docs/docs/build/dimensions.md +++ b/website/docs/docs/build/dimensions.md @@ -15,7 +15,8 @@ In a data platform, dimensions is part of a larger structure called a semantic m Groups are defined within semantic models, alongside entities and measures, and correspond to non-aggregatable columns in your dbt model that provides categorical or time-based context. In SQL, dimensions is typically included in the GROUP BY clause.--> -All dimensions require a `name`, `type` and in some cases, an `expr` parameter. +All dimensions require a `name`, `type` and in some cases, an `expr` parameter. The `name` for your dimension must be unique to the semantic model and can not be the same as an existing `entity` or `measure` within that same model. + | Parameter | Description | Type | | --------- | ----------- | ---- | @@ -251,7 +252,8 @@ This example shows how to create slowly changing dimensions (SCD) using a semant | 333 | 2 | 2020-08-19 | 2021-10-22| | 333 | 3 | 2021-10-22 | 2048-01-01| -Take note of the extra arguments under `validity_params`: `is_start` and `is_end`. These arguments indicate the columns in the SCD table that contain the start and end dates for each tier (or beginning or ending timestamp column for a dimensional value). + +The `validity_params` include two important arguments — `is_start` and `is_end`. 
These specify the columns in the SCD table that mark the start and end dates (or timestamps) for each tier or dimension. Additionally, the entity is tagged as `natural` to differentiate it from a `primary` entity. In a `primary` entity, each entity value has one row. In contrast, a `natural` entity has one row for each combination of entity value and its validity period. ```yaml semantic_models: @@ -279,9 +281,11 @@ semantic_models: - name: tier type: categorical + primary_entity: sales_person + entities: - name: sales_person - type: primary + type: natural expr: sales_person_id ``` diff --git a/website/docs/docs/build/entities.md b/website/docs/docs/build/entities.md index 464fa2c3b8c..e44f9e79af6 100644 --- a/website/docs/docs/build/entities.md +++ b/website/docs/docs/build/entities.md @@ -8,7 +8,7 @@ tags: [Metrics, Semantic Layer] Entities are real-world concepts in a business such as customers, transactions, and ad campaigns. We often focus our analyses around specific entities, such as customer churn or annual recurring revenue modeling. We represent entities in our semantic models using id columns that serve as join keys to other semantic models in your semantic graph. -Within a semantic graph, the required parameters for an entity are `name` and `type`. The `name` refers to either the key column name from the underlying data table, or it may serve as an alias with the column name referenced in the `expr` parameter. +Within a semantic graph, the required parameters for an entity are `name` and `type`. The `name` refers to either the key column name from the underlying data table, or it may serve as an alias with the column name referenced in the `expr` parameter. The `name` for your entity must be unique to the semantic model and can not be the same as an existing `measure` or `dimension` within that same model. Entities can be specified with a single column or multiple columns. Entities (join keys) in a semantic model are identified by their name. Each entity name must be unique within a semantic model, but it doesn't have to be unique across different semantic models. 
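As a rough sketch of how those entity requirements look in practice, the fragment below defines two entities on a semantic model; the semantic model, dbt model, and column names are assumptions for illustration only.

```yaml
semantic_models:
  - name: transactions               # assumed semantic model name
    model: ref('fct_transactions')   # assumed dbt model
    entities:
      # Each entity name must be unique within this semantic model.
      - name: transaction
        type: primary
        expr: transaction_id         # the entity name aliases this column
      - name: customer
        type: foreign
        expr: customer_id            # join key out to a customer-level model
```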
diff --git a/website/docs/docs/build/incremental-models.md b/website/docs/docs/build/incremental-models.md index 3a597499f04..46788758ee6 100644 --- a/website/docs/docs/build/incremental-models.md +++ b/website/docs/docs/build/incremental-models.md @@ -249,31 +249,29 @@ The `merge` strategy is available in dbt-postgres and dbt-redshift beginning in - -| data platform adapter | default strategy | additional supported strategies | -| :-------------------| ---------------- | -------------------- | -| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | `append` | `delete+insert` | -| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | `append` | `delete+insert` | -| [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | `merge` | `insert_overwrite` | -| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | `append` | `merge` (Delta only) `insert_overwrite` | -| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | `append` | `merge` (Delta only) `insert_overwrite` | -| [dbt-snowflake](/reference/resource-configs/snowflake-configs#merge-behavior-incremental-models) | `merge` | `append`, `delete+insert` | -| [dbt-trino](/reference/resource-configs/trino-configs#incremental) | `append` | `merge` `delete+insert` | +| data platform adapter | `append` | `merge` | `delete+insert` | `insert_overwrite` | +|-----------------------------------------------------------------------------------------------------|:--------:|:-------:|:---------------:|:------------------:| +| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | ✅ | | ✅ | | +| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | ✅ | | ✅ | | +| [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | | ✅ | | ✅ | +| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | ✅ | ✅ | | ✅ | +| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | ✅ | ✅ | | ✅ | +| [dbt-snowflake](/reference/resource-configs/snowflake-configs#merge-behavior-incremental-models) | ✅ | ✅ | ✅ | | +| [dbt-trino](/reference/resource-configs/trino-configs#incremental) | ✅ | ✅ | ✅ | | - -| data platform adapter | default strategy | additional supported strategies | -| :----------------- | :----------------| : ---------------------------------- | -| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | `append` | `merge` , `delete+insert` | -| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | `append` | `merge`, `delete+insert` | -| [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | `merge` | `insert_overwrite` | -| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | `append` | `merge` (Delta only) `insert_overwrite` | -| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | `append` | `merge` (Delta only) `insert_overwrite` | -| [dbt-snowflake](/reference/resource-configs/snowflake-configs#merge-behavior-incremental-models) | `merge` | `append`, `delete+insert` | -| [dbt-trino](/reference/resource-configs/trino-configs#incremental) | `append` | `merge` `delete+insert` | +| data platform adapter | 
`append` | `merge` | `delete+insert` | `insert_overwrite` | +|-----------------------------------------------------------------------------------------------------|:--------:|:-------:|:---------------:|:------------------:| +| [dbt-postgres](/reference/resource-configs/postgres-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ | | +| [dbt-redshift](/reference/resource-configs/redshift-configs#incremental-materialization-strategies) | ✅ | ✅ | ✅ | | +| [dbt-bigquery](/reference/resource-configs/bigquery-configs#merge-behavior-incremental-models) | | ✅ | | ✅ | +| [dbt-spark](/reference/resource-configs/spark-configs#incremental-models) | ✅ | ✅ | | ✅ | +| [dbt-databricks](/reference/resource-configs/databricks-configs#incremental-models) | ✅ | ✅ | | ✅ | +| [dbt-snowflake](/reference/resource-configs/snowflake-configs#merge-behavior-incremental-models) | ✅ | ✅ | ✅ | | +| [dbt-trino](/reference/resource-configs/trino-configs#incremental) | ✅ | ✅ | ✅ | | diff --git a/website/docs/docs/build/jinja-macros.md b/website/docs/docs/build/jinja-macros.md index 135db740f75..074e648d410 100644 --- a/website/docs/docs/build/jinja-macros.md +++ b/website/docs/docs/build/jinja-macros.md @@ -23,7 +23,7 @@ Using Jinja turns your dbt project into a programming environment for SQL, givin In fact, if you've used the [`{{ ref() }}` function](/reference/dbt-jinja-functions/ref), you're already using Jinja! -Jinja can be used in any SQL in a dbt project, including [models](/docs/build/sql-models), [analyses](/docs/build/analyses), [tests](/docs/build/tests), and even [hooks](/docs/build/hooks-operations). +Jinja can be used in any SQL in a dbt project, including [models](/docs/build/sql-models), [analyses](/docs/build/analyses), [tests](/docs/build/data-tests), and even [hooks](/docs/build/hooks-operations). :::info Ready to get started with Jinja and macros? diff --git a/website/docs/docs/build/join-logic.md b/website/docs/docs/build/join-logic.md index 9039822c9fd..29b9d101a59 100644 --- a/website/docs/docs/build/join-logic.md +++ b/website/docs/docs/build/join-logic.md @@ -84,10 +84,6 @@ mf query --metrics average_purchase_price --dimensions metric_time,user_id__type ## Multi-hop joins -:::info -This feature is currently in development and not currently available. -::: - MetricFlow allows users to join measures and dimensions across a graph of entities, which we refer to as a 'multi-hop join.' This is because users can move from one table to another like a 'hop' within a graph. Here's an example schema for reference: @@ -134,9 +130,6 @@ semantic_models: ### Query multi-hop joins -:::info -This feature is currently in development and not currently available. -::: To query dimensions _without_ a multi-hop join involved, you can use the fully qualified dimension name with the syntax entity double underscore (dunder) dimension, like `entity__dimension`. diff --git a/website/docs/docs/build/materializations.md b/website/docs/docs/build/materializations.md index 8846f4bb0c5..192284a31ca 100644 --- a/website/docs/docs/build/materializations.md +++ b/website/docs/docs/build/materializations.md @@ -14,6 +14,8 @@ pagination_next: "docs/build/incremental-models" - ephemeral - materialized view +You can also configure [custom materializations](/guides/create-new-materializations?step=1) in dbt. Custom materializations are a powerful way to extend dbt's functionality to meet your specific needs. + ## Configuring materializations By default, dbt models are materialized as "views". 
Models can be configured with a different materialization by supplying the `materialized` configuration parameter as shown below. diff --git a/website/docs/docs/build/measures.md b/website/docs/docs/build/measures.md index 74d37b70e94..feea2b30ca4 100644 --- a/website/docs/docs/build/measures.md +++ b/website/docs/docs/build/measures.md @@ -34,7 +34,8 @@ measures: When you create a measure, you can either give it a custom name or use the `name` of the data platform column directly. If the `name` of the measure is different from the column name, you need to add an `expr` to specify the column name. The `name` of the measure is used when creating a metric. -Measure names must be **unique** across all semantic models in a project. +Measure names must be unique across all semantic models in a project and can not be the same as an existing `entity` or `dimension` within that same model. + ### Description diff --git a/website/docs/docs/build/metricflow-commands.md b/website/docs/docs/build/metricflow-commands.md index 67589c07836..e3bb93da964 100644 --- a/website/docs/docs/build/metricflow-commands.md +++ b/website/docs/docs/build/metricflow-commands.md @@ -8,7 +8,7 @@ tags: [Metrics, Semantic Layer] Once you define metrics in your dbt project, you can query metrics, dimensions, and dimension values, and validate your configs using the MetricFlow commands. -MetricFlow allows you to define and query metrics in your dbt project in the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation), [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud), or [dbt Core](/docs/core/installation). To experience the power of the universal [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dynamically query those metrics in downstream tools, you'll need a dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) account. +MetricFlow allows you to define and query metrics in your dbt project in the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation), [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud), or [dbt Core](/docs/core/installation-overview). To experience the power of the universal [dbt Semantic Layer](/docs/use-dbt-semantic-layer/dbt-sl) and dynamically query those metrics in downstream tools, you'll need a dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) account. MetricFlow is compatible with Python versions 3.8, 3.9, 3.10, and 3.11. @@ -556,3 +556,8 @@ Keep in mind that modifying your shell configuration files can have an impact on +
+Why is my query limited to 100 rows in the dbt Cloud CLI? +The default limit for queries issued from the dbt Cloud CLI is 100 rows. We set this default to prevent returning unnecessarily large data sets, as the dbt Cloud CLI is typically used to query the dbt Semantic Layer during the development process, not for production reporting or to access large data sets. For most workflows, you only need to return a subset of the data.

+However, you can change this limit if needed by setting the `--limit` option in your query. For example, to return 1000 rows, you can run `dbt sl list metrics --limit 1000`. +
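As a hedged sketch of both forms, the commands below raise the limit on a list and on a query; `order_total` and `metric_time` are placeholder names that may not exist in your project, so check the MetricFlow commands reference for the exact flags your version supports.

```bash
# List metrics, returning up to 1000 rows instead of the default 100
dbt sl list metrics --limit 1000

# Raise the limit on a query the same way (metric and dimension names are placeholders)
dbt sl query --metrics order_total --group-by metric_time --limit 1000
```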
diff --git a/website/docs/docs/build/packages.md b/website/docs/docs/build/packages.md index 8d18a55e949..b60b4ba5b5e 100644 --- a/website/docs/docs/build/packages.md +++ b/website/docs/docs/build/packages.md @@ -25,15 +25,9 @@ dbt _packages_ are in fact standalone dbt projects, with models and macros that * It's important to note that defining and installing dbt packages is different from [defining and installing Python packages](/docs/build/python-models#using-pypi-packages) -:::info `dependencies.yml` has replaced `packages.yml` -Starting from dbt v1.6, `dependencies.yml` has replaced `packages.yml`. This file can now contain both types of dependencies: "package" and "project" dependencies. -- "Package" dependencies lets you add source code from someone else's dbt project into your own, like a library. -- "Project" dependencies provide a different way to build on top of someone else's work in dbt. Refer to [Project dependencies](/docs/collaborate/govern/project-dependencies) for more info. -- -You can rename `packages.yml` to `dependencies.yml`, _unless_ you need to use Jinja within your packages specification. This could be necessary, for example, if you want to add an environment variable with a git token in a private git package specification. - -::: +import UseCaseInfo from '/snippets/_packages_or_dependencies.md'; + ## How do I add a package to my project? 1. Add a file named `dependencies.yml` or `packages.yml` to your dbt project. This should be at the same level as your `dbt_project.yml` file. diff --git a/website/docs/docs/build/projects.md b/website/docs/docs/build/projects.md index a54f6042cce..c5e08177dee 100644 --- a/website/docs/docs/build/projects.md +++ b/website/docs/docs/build/projects.md @@ -14,7 +14,7 @@ At a minimum, all a project needs is the `dbt_project.yml` project configuration | [models](/docs/build/models) | Each model lives in a single file and contains logic that either transforms raw data into a dataset that is ready for analytics or, more often, is an intermediate step in such a transformation. | | [snapshots](/docs/build/snapshots) | A way to capture the state of your mutable tables so you can refer to it later. | | [seeds](/docs/build/seeds) | CSV files with static data that you can load into your data platform with dbt. | -| [tests](/docs/build/tests) | SQL queries that you can write to test the models and resources in your project. | +| [data tests](/docs/build/data-tests) | SQL queries that you can write to test the models and resources in your project. | | [macros](/docs/build/jinja-macros) | Blocks of code that you can reuse multiple times. | | [docs](/docs/collaborate/documentation) | Docs for your project that you can build. | | [sources](/docs/build/sources) | A way to name and describe the data loaded into your warehouse by your Extract and Load tools. 
| diff --git a/website/docs/docs/build/semantic-models.md b/website/docs/docs/build/semantic-models.md index 09f808d7a17..5c6883cdcee 100644 --- a/website/docs/docs/build/semantic-models.md +++ b/website/docs/docs/build/semantic-models.md @@ -43,7 +43,7 @@ semantic_models: - name: the_name_of_the_semantic_model ## Required description: same as always ## Optional model: ref('some_model') ## Required - default: ## Required + defaults: ## Required agg_time_dimension: dimension_name ## Required if the model contains dimensions entities: ## Required - see more information in entities diff --git a/website/docs/docs/build/sl-getting-started.md b/website/docs/docs/build/sl-getting-started.md index d5a59c33ec2..4274fccf509 100644 --- a/website/docs/docs/build/sl-getting-started.md +++ b/website/docs/docs/build/sl-getting-started.md @@ -74,21 +74,9 @@ import SlSetUp from '/snippets/_new-sl-setup.md'; If you're encountering some issues when defining your metrics or setting up the dbt Semantic Layer, check out a list of answers to some of the questions or problems you may be experiencing. -
- How do I migrate from the legacy Semantic Layer to the new one? -
-
If you're using the legacy Semantic Layer, we highly recommend you upgrade your dbt version to dbt v1.6 or higher to use the new dbt Semantic Layer. Refer to the dedicated migration guide for more info.
-
-
-
-How are you storing my data? -User data passes through the Semantic Layer on its way back from the warehouse. dbt Labs ensures security by authenticating through the customer's data warehouse. Currently, we don't cache data for the long term, but it might temporarily stay in the system for up to 10 minutes, usually less. In the future, we'll introduce a caching feature that allows us to cache data on our infrastructure for up to 24 hours. -
- -
-Is the dbt Semantic Layer open source? -The dbt Semantic Layer is proprietary; however, some components of the dbt Semantic Layer are open source, such as dbt-core and MetricFlow.

dbt Cloud Developer or dbt Core users can define metrics in their project, including a local dbt Core project, using the dbt Cloud IDE, dbt Cloud CLI, or dbt Core CLI. However, to experience the universal dbt Semantic Layer and access those metrics using the API or downstream tools, users must be on a dbt Cloud Team or Enterprise plan.

Refer to Billing for more information. -
+import SlFaqs from '/snippets/_sl-faqs.md'; + + ## Next steps diff --git a/website/docs/docs/build/sources.md b/website/docs/docs/build/sources.md index a657b6257c9..466bcedc688 100644 --- a/website/docs/docs/build/sources.md +++ b/website/docs/docs/build/sources.md @@ -88,10 +88,10 @@ Using the `{{ source () }}` function also creates a dependency between the model ### Testing and documenting sources You can also: -- Add tests to sources +- Add data tests to sources - Add descriptions to sources, that get rendered as part of your documentation site -These should be familiar concepts if you've already added tests and descriptions to your models (if not check out the guides on [testing](/docs/build/tests) and [documentation](/docs/collaborate/documentation)). +These should be familiar concepts if you've already added tests and descriptions to your models (if not check out the guides on [testing](/docs/build/data-tests) and [documentation](/docs/collaborate/documentation)). diff --git a/website/docs/docs/build/sql-models.md b/website/docs/docs/build/sql-models.md index 237ac84c0c2..a0dd174278b 100644 --- a/website/docs/docs/build/sql-models.md +++ b/website/docs/docs/build/sql-models.md @@ -262,7 +262,7 @@ Additionally, the `ref` function encourages you to write modular transformations ## Testing and documenting models -You can also document and test models — skip ahead to the section on [testing](/docs/build/tests) and [documentation](/docs/collaborate/documentation) for more information. +You can also document and test models — skip ahead to the section on [testing](/docs/build/data-tests) and [documentation](/docs/collaborate/documentation) for more information. ## Additional FAQs diff --git a/website/docs/docs/cloud/about-cloud-develop.md b/website/docs/docs/cloud/about-cloud-develop.md deleted file mode 100644 index 90abbb98bf4..00000000000 --- a/website/docs/docs/cloud/about-cloud-develop.md +++ /dev/null @@ -1,33 +0,0 @@ ---- -title: About developing in dbt Cloud -id: about-cloud-develop -description: "Learn how to develop your dbt projects using dbt Cloud." -sidebar_label: "About developing in dbt Cloud" -pagination_next: "docs/cloud/cloud-cli-installation" -hide_table_of_contents: true ---- - -dbt Cloud offers a fast and reliable way to work on your dbt project. It runs dbt Core in a hosted (single or multi-tenant) environment. You can develop in your browser using an integrated development environment (IDE) or in a dbt Cloud-powered command line interface (CLI): - -
- - - - - -

- -The following sections provide detailed instructions on setting up the dbt Cloud CLI and dbt Cloud IDE. To get started with dbt development, you'll need a [developer](/docs/cloud/manage-access/seats-and-users) account. For a more comprehensive guide about developing in dbt, refer to our [quickstart guides](/guides). - - ---------- -**Note**: The dbt Cloud CLI and the open-sourced dbt Core are both command line tools that let you run dbt commands. The key distinction is the dbt Cloud CLI is tailored for dbt Cloud's infrastructure and integrates with all its [features](/docs/cloud/about-cloud/dbt-cloud-features). - diff --git a/website/docs/docs/cloud/about-cloud-setup.md b/website/docs/docs/cloud/about-cloud-setup.md index 5c8e5525bf1..7daf33a4684 100644 --- a/website/docs/docs/cloud/about-cloud-setup.md +++ b/website/docs/docs/cloud/about-cloud-setup.md @@ -13,14 +13,13 @@ dbt Cloud is the fastest and most reliable way to deploy your dbt jobs. It conta - Configuring access to [GitHub](/docs/cloud/git/connect-github), [GitLab](/docs/cloud/git/connect-gitlab), or your own [git repo URL](/docs/cloud/git/import-a-project-by-git-url). - [Managing users and licenses](/docs/cloud/manage-access/seats-and-users) - [Configuring secure access](/docs/cloud/manage-access/about-user-access) -- Configuring the [dbt Cloud IDE](/docs/cloud/about-cloud-develop) -- Installing and configuring the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) These settings are intended for dbt Cloud administrators. If you need a more detailed first-time setup guide for specific data platforms, read our [quickstart guides](/guides). If you want a more in-depth learning experience, we recommend taking the dbt Fundamentals on our [dbt Learn online courses site](https://courses.getdbt.com/). ## Prerequisites + - To set up dbt Cloud, you'll need to have a dbt Cloud account with administrator access. If you still need to create a dbt Cloud account, [sign up today](https://getdbt.com) on our North American servers or [contact us](https://getdbt.com/contact) for international options. - To have the best experience using dbt Cloud, we recommend you use modern and up-to-date web browsers like Chrome, Safari, Edge, and Firefox. diff --git a/website/docs/docs/cloud/about-cloud/regions-ip-addresses.md b/website/docs/docs/cloud/about-cloud/regions-ip-addresses.md index cc1c2531f56..119201b389d 100644 --- a/website/docs/docs/cloud/about-cloud/regions-ip-addresses.md +++ b/website/docs/docs/cloud/about-cloud/regions-ip-addresses.md @@ -11,8 +11,8 @@ dbt Cloud is [hosted](/docs/cloud/about-cloud/architecture) in multiple regions | Region | Location | Access URL | IP addresses | Developer plan | Team plan | Enterprise plan | |--------|----------|------------|--------------|----------------|-----------|-----------------| -| North America multi-tenant [^1] | AWS us-east-1 (N. Virginia) | cloud.getdbt.com | 52.45.144.63
54.81.134.249
52.22.161.231 | ✅ | ✅ | ✅ | -| North America Cell 1 [^1] | AWS us-east-1 (N.Virginia) | {account prefix}.us1.dbt.com | [Located in Account Settings](#locating-your-dbt-cloud-ip-addresses) | ❌ | ❌ | ✅ | +| North America multi-tenant [^1] | AWS us-east-1 (N. Virginia) | cloud.getdbt.com | 52.45.144.63
54.81.134.249
52.22.161.231
52.3.77.232
3.214.191.130
34.233.79.135 | ✅ | ✅ | ✅ | +| North America Cell 1 [^1] | AWS us-east-1 (N. Virginia) | {account prefix}.us1.dbt.com | 52.45.144.63
54.81.134.249
52.22.161.231
52.3.77.232
3.214.191.130
34.233.79.135 | ✅ | ❌ | ✅ | | EMEA [^1] | AWS eu-central-1 (Frankfurt) | emea.dbt.com | 3.123.45.39
3.126.140.248
3.72.153.148 | ❌ | ❌ | ✅ | | APAC [^1] | AWS ap-southeast-2 (Sydney)| au.dbt.com | 52.65.89.235
3.106.40.33
13.239.155.206
| ❌ | ❌ | ✅ | | Virtual Private dbt or Single tenant | Customized | Customized | Ask [Support](/community/resources/getting-help#dbt-cloud-support) for your IPs | ❌ | ❌ | ✅ | diff --git a/website/docs/docs/cloud/about-develop-dbt.md b/website/docs/docs/cloud/about-develop-dbt.md new file mode 100644 index 00000000000..a71c32d5352 --- /dev/null +++ b/website/docs/docs/cloud/about-develop-dbt.md @@ -0,0 +1,30 @@ +--- +title: About developing in dbt +id: about-develop-dbt +description: "Learn how to develop your dbt projects using dbt Cloud." +sidebar_label: "About developing in dbt" +pagination_next: "docs/cloud/about-cloud-develop-defer" +hide_table_of_contents: true +--- + +Develop dbt projects using dbt Cloud, which offers a fast and reliable way to work on your dbt project. It runs dbt Core in a hosted (single or multi-tenant) environment. + +You can develop in your browser using an integrated development environment (IDE) or in a dbt Cloud-powered command line interface (CLI). + +
+ + + + + +

+ +To get started with dbt development, you'll need a [dbt Cloud](https://www.getdbt.com/signup) account and developer seat. For a more comprehensive guide about developing in dbt, refer to our [quickstart guides](/guides). diff --git a/website/docs/docs/cloud/billing.md b/website/docs/docs/cloud/billing.md index 31b7689ceb9..b677f06ccfe 100644 --- a/website/docs/docs/cloud/billing.md +++ b/website/docs/docs/cloud/billing.md @@ -126,6 +126,8 @@ All included successful models built numbers above reflect our most current pric As an Enterprise customer, you pay annually via invoice, monthly in arrears for additional usage (if applicable), and may benefit from negotiated usage rates. Please refer to your order form or contract for your specific pricing details, or [contact the account team](https://www.getdbt.com/contact-demo) with any questions. +Enterprise plan billing information is not available in the dbt Cloud UI. Changes are handled through your dbt Labs Solutions Architect or account team manager. + ### Legacy plans Customers who purchased the dbt Cloud Team plan before August 11, 2023, remain on a legacy pricing plan as long as your account is in good standing. The legacy pricing plan is based on seats and includes unlimited models, subject to reasonable use. diff --git a/website/docs/docs/cloud/cloud-cli-installation.md b/website/docs/docs/cloud/cloud-cli-installation.md index f3294477611..8d2196696aa 100644 --- a/website/docs/docs/cloud/cloud-cli-installation.md +++ b/website/docs/docs/cloud/cloud-cli-installation.md @@ -26,7 +26,7 @@ dbt commands are run against dbt Cloud's infrastructure and benefit from: The dbt Cloud CLI is available in all [deployment regions](/docs/cloud/about-cloud/regions-ip-addresses) and for both multi-tenant and single-tenant accounts (Azure single-tenant not supported at this time). - Ensure you are using dbt version 1.5 or higher. Refer to [dbt Cloud versions](/docs/dbt-versions/upgrade-core-in-cloud) to upgrade. -- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections and [Single sign-on (SSO)](/docs/cloud/manage-access/sso-overview) doesn't support the dbt Cloud CLI yet. +- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections doesn't support the dbt Cloud CLI yet. 
## Install dbt Cloud CLI diff --git a/website/docs/docs/cloud/connect-data-platform/about-connections.md index 1329d179900..93bbf83584f 100644 --- a/website/docs/docs/cloud/connect-data-platform/about-connections.md +++ b/website/docs/docs/cloud/connect-data-platform/about-connections.md @@ -3,7 +3,7 @@ title: "About data platform connections" id: about-connections description: "Information about data platform connections" sidebar_label: "About data platform connections" -pagination_next: "docs/cloud/connect-data-platform/connect-starburst-trino" +pagination_next: "docs/cloud/connect-data-platform/connect-microsoft-fabric" pagination_prev: null --- dbt Cloud can connect with a variety of data platform providers including: @@ -11,6 +11,7 @@ dbt Cloud can connect with a variety of data platform providers including: - [Apache Spark](/docs/cloud/connect-data-platform/connect-apache-spark) - [Databricks](/docs/cloud/connect-data-platform/connect-databricks) - [Google BigQuery](/docs/cloud/connect-data-platform/connect-bigquery) +- [Microsoft Fabric](/docs/cloud/connect-data-platform/connect-microsoft-fabric) - [PostgreSQL](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) - [Snowflake](/docs/cloud/connect-data-platform/connect-snowflake) - [Starburst or Trino](/docs/cloud/connect-data-platform/connect-starburst-trino) diff --git a/website/docs/docs/cloud/connect-data-platform/connect-microsoft-fabric.md new file mode 100644 index 00000000000..e9d67524e89 --- /dev/null +++ b/website/docs/docs/cloud/connect-data-platform/connect-microsoft-fabric.md @@ -0,0 +1,43 @@ +--- +title: "Connect Microsoft Fabric" +description: "Configure Microsoft Fabric connection." +sidebar_label: "Connect Microsoft Fabric" +--- + +## Supported authentication methods +The supported authentication methods are: +- Azure Active Directory (Azure AD) service principal +- Azure AD password + +SQL password (LDAP) is not supported in Microsoft Fabric Synapse Data Warehouse, so you must use Azure AD. This means that to use [Microsoft Fabric](https://www.microsoft.com/en-us/microsoft-fabric) in dbt Cloud, you will need at least one Azure AD service principal to connect dbt Cloud to Fabric, ideally one service principal for each user. + +### Active Directory service principal +The following are the required fields for setting up a connection with Microsoft Fabric using Azure AD service principal authentication. + +| Field | Description | +| --- | --- | +| **Server** | The service principal's **host** value for the Fabric test endpoint. | +| **Port** | The port to connect to Microsoft Fabric. You can use `1433` (the default), which is the standard SQL server port number. | +| **Database** | The service principal's **database** value for the Fabric test endpoint. | +| **Authentication** | Choose **Service Principal** from the dropdown. | +| **Tenant ID** | The service principal's **Directory (tenant) ID**. | +| **Client ID** | The service principal's **application (client) ID**. | +| **Client secret** | The service principal's **client secret** (not the **client secret id**). | + + +### Active Directory password + +The following are the required fields for setting up a connection with Microsoft Fabric using Azure AD password authentication. + +| Field | Description | +| --- | --- | +| **Server** | The server hostname to connect to Microsoft Fabric. 
| +| **Port** | The server port. You can use `1433` (the default), which is the standard SQL server port number. | +| **Database** | The database name. | +| **Authentication** | Choose **Active Directory Password** from the dropdown. | +| **User** | The AD username. | +| **Password** | The AD username's password. | + +## Configuration + +To learn how to optimize performance with data platform-specific configurations in dbt Cloud, refer to [Microsoft Fabric DWH configurations](/reference/resource-configs/fabric-configs). diff --git a/website/docs/docs/cloud/dbt-cloud-ide/dbt-cloud-ide.md b/website/docs/docs/cloud/dbt-cloud-ide/dbt-cloud-ide.md deleted file mode 100644 index 3c41432bc62..00000000000 --- a/website/docs/docs/cloud/dbt-cloud-ide/dbt-cloud-ide.md +++ /dev/null @@ -1,37 +0,0 @@ ---- -title: "dbt Cloud IDE" -description: "Learn how to configure Git in dbt Cloud" -pagination_next: "docs/cloud/dbt-cloud-ide/develop-in-the-cloud" -pagination_prev: null ---- - -
- - - - - -
-
-
- - - - -
\ No newline at end of file diff --git a/website/docs/docs/cloud/manage-access/audit-log.md b/website/docs/docs/cloud/manage-access/audit-log.md index b90bceef570..774400529e9 100644 --- a/website/docs/docs/cloud/manage-access/audit-log.md +++ b/website/docs/docs/cloud/manage-access/audit-log.md @@ -34,7 +34,7 @@ On the audit log page, you will see a list of various events and their associate Click the event card to see the details about the activity that triggered the event. This view provides important details, including when it happened and what type of event was triggered. For example, if someone changes the settings for a job, you can use the event details to see which job was changed (type of event: `v1.events.job_definition.Changed`), by whom (person who triggered the event: `actor`), and when (time it was triggered: `created_at_utc`). For types of events and their descriptions, see [Events in audit log](#events-in-audit-log). -The event details provides the key factors of an event: +The event details provide the key factors of an event: | Name | Description | | -------------------- | --------------------------------------------- | @@ -160,16 +160,22 @@ The audit log supports various events for different objects in dbt Cloud. You wi You can search the audit log to find a specific event or actor, which is limited to the ones listed in [Events in audit log](#events-in-audit-log). The audit log successfully lists historical events spanning the last 90 days. You can search for an actor or event using the search bar, and then narrow your results using the time window. - + ## Exporting logs You can use the audit log to export all historical audit results for security, compliance, and analysis purposes: -- For events within 90 days — dbt Cloud will automatically display the 90-day selectable date range. Select **Export Selection** to download a CSV file of all the events that occurred in your organization within 90 days. -- For events beyond 90 days — Select **Export All**. The Account Admin will receive an email link to download a CSV file of all the events that occurred in your organization. +- **For events within 90 days** — dbt Cloud will automatically display the 90-day selectable date range. Select **Export Selection** to download a CSV file of all the events that occurred in your organization within 90 days. - +- **For events beyond 90 days** — Select **Export All**. The Account Admin will receive an email link to download a CSV file of all the events that occurred in your organization. + +### Azure Single-tenant + +For users deployed in [Azure single tenant](/docs/cloud/about-cloud/tenancy), while the **Export All** button isn't available, you can conveniently use specific APIs to access all events: + +- [Get recent audit log events CSV](/dbt-cloud/api-v3#/operations/Get%20Recent%20Audit%20Log%20Events%20CSV) — This API returns all events in a single CSV without pagination. +- [List recent audit log events](/dbt-cloud/api-v3#/operations/List%20Recent%20Audit%20Log%20Events) — This API returns a limited number of events at a time, which means you will need to paginate the results. 
diff --git a/website/docs/docs/cloud/manage-access/sso-overview.md index f613df7907e..b4954955c8c 100644 --- a/website/docs/docs/cloud/manage-access/sso-overview.md +++ b/website/docs/docs/cloud/manage-access/sso-overview.md @@ -57,8 +57,9 @@ Non-admin users that currently login with a password will no longer be able to d ### Security best practices There are a few scenarios that might require you to login with a password. We recommend these security best-practices for the two most common scenarios: -* **Onboarding partners and contractors** - We highly recommend that you add partners and contractors to your Identity Provider. IdPs like Okta and Azure Active Directory (AAD) offer capabilities explicitly for temporary employees. We highly recommend that you reach out to your IT team to provision an SSO license for these situations. Using an IdP highly secure, reduces any breach risk, and significantly increases the security posture of your dbt Cloud environment. -* **Identity Provider is down -** Account admins will continue to be able to log in with a password which would allow them to work with your Identity Provider to troubleshoot the problem. +* **Onboarding partners and contractors** — We highly recommend that you add partners and contractors to your Identity Provider. IdPs like Okta and Azure Active Directory (AAD) offer capabilities explicitly for temporary employees. We highly recommend that you reach out to your IT team to provision an SSO license for these situations. Using an IdP is highly secure, reduces any breach risk, and significantly increases the security posture of your dbt Cloud environment. +* **Identity Provider is down** — Account admins will continue to be able to log in with a password, which would allow them to work with your Identity Provider to troubleshoot the problem. +* **Offboarding admins** — When offboarding admins, revoke access to dbt Cloud by deleting the user from your environment; otherwise, they can continue to use username/password credentials to log in. ### Next steps for non-admin users currently logging in with passwords @@ -67,4 +68,5 @@ If you have any non-admin users logging into dbt Cloud with a password today: 1. Ensure that all users have a user account in your identity provider and are assigned dbt Cloud so they won’t lose access. 2. Alert all dbt Cloud users that they won’t be able to use a password for logging in anymore unless they are already an Admin with a password. 3. We **DO NOT** recommend promoting any users to Admins just to preserve password-based logins because you will reduce security of your dbt Cloud environment. 
-** + + diff --git a/website/docs/docs/cloud/secure/about-privatelink.md index b31e4c08a26..2134ab25cfe 100644 --- a/website/docs/docs/cloud/secure/about-privatelink.md +++ b/website/docs/docs/cloud/secure/about-privatelink.md @@ -23,3 +23,4 @@ dbt Cloud supports the following data platforms for use with the PrivateLink fea - [Databricks](/docs/cloud/secure/databricks-privatelink) - [Redshift](/docs/cloud/secure/redshift-privatelink) - [Postgres](/docs/cloud/secure/postgres-privatelink) +- [VCS](/docs/cloud/secure/vcs-privatelink) diff --git a/website/docs/docs/cloud/secure/vcs-privatelink.md new file mode 100644 index 00000000000..13bb97dd6cd --- /dev/null +++ b/website/docs/docs/cloud/secure/vcs-privatelink.md @@ -0,0 +1,82 @@ +--- +title: "Configuring PrivateLink for self-hosted cloud version control systems (VCS)" +id: vcs-privatelink +description: "Setting up a PrivateLink connection between dbt Cloud and an organization’s cloud hosted git server" +sidebar_label: "PrivateLink for VCS" +--- + +import SetUpPages from '/snippets/_available-tiers-privatelink.md'; + + + +AWS PrivateLink provides private connectivity from dbt Cloud to your self-hosted cloud version control system (VCS) service by routing requests through your virtual private cloud (VPC). This type of connection does not require you to publicly expose an endpoint to your VCS repositories or for requests to the service to traverse the public internet, ensuring the most secure connection possible. AWS recommends PrivateLink connectivity as part of its [Well-Architected Framework](https://docs.aws.amazon.com/wellarchitected/latest/framework/welcome.html) and details this particular pattern in the **Shared Services** section of the [AWS PrivateLink whitepaper](https://docs.aws.amazon.com/pdfs/whitepapers/latest/aws-privatelink/aws-privatelink.pdf). + +You will learn, at a high level, the resources necessary to implement this solution. Cloud environments and provisioning processes vary greatly, so information from this guide may need to be adapted to fit your requirements. + +## PrivateLink connection overview + + + +### Required resources for creating a connection + +Creating an Interface VPC PrivateLink connection requires creating multiple AWS resources in your AWS account(s) and private network containing the self-hosted VCS instance. You are responsible for provisioning and maintaining these resources. Once provisioned, connection information and permissions are shared with dbt Labs to complete the connection, allowing for direct VPC to VPC private connectivity. + +This approach is distinct from and does not require you to implement VPC peering between your AWS account(s) and dbt Cloud. + +You need these resources to create a PrivateLink connection, which allows the dbt Cloud application to connect to your self-hosted cloud VCS. These resources can be created via the AWS Console, AWS CLI, or Infrastructure-as-Code such as [Terraform](https://registry.terraform.io/providers/hashicorp/aws/latest/docs) or [AWS CloudFormation](https://aws.amazon.com/cloudformation/). + +- **Target Group(s)** - A [Target Group](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html) is attached to a [Listener](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-listeners.html) on the NLB and is responsible for routing incoming requests to healthy targets in the group. 
If connecting to the VCS system over both SSH and HTTPS, two **Target Groups** will need to be created. + - **Target Type (choose most applicable):** + - **Instance/ASG:** Select existing EC2 instance(s) where the VCS system is running, or [an autoscaling group](https://docs.aws.amazon.com/autoscaling/ec2/userguide/attach-load-balancer-asg.html) (ASG) to automatically attach any instances launched from that ASG. + - **Application Load Balancer (ALB):** Select an ALB that already has VCS EC2 instances attached (HTTP/S traffic only). + - **IP Addresses:** Select the IP address(es) of the EC2 instances where the VCS system is installed. Keep in mind that the IP of the EC2 instance can change if the instance is relaunched for any reason. + - **Protocol/Port:** Choose one protocol and port pair per Target Group, for example: + - TG1 - SSH: TCP/22 + - TG2 - HTTPS: TCP/443 or TLS if you want to attach a certificate to decrypt TLS connections ([details](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/create-tls-listener.html)). + - **VPC:** Choose the VPC in which the VPC Endpoint Service and NLB will be created. + - **Health checks:** Targets must register as healthy in order for the NLB to forward requests. Configure a health check that’s appropriate for your service and the protocol of the Target Group ([details](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-health-checks.html)). + - **Register targets:** Register the targets (see above) for the VCS service ([details](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/target-group-register-targets.html)). _It's critical to be sure targets are healthy before attempting connection from dbt Cloud._ +- **Network Load Balancer (NLB)** - Requires creating a Listener that attaches to the newly created Target Group(s) for port `443` and/or `22`, as applicable. + - **Scheme:** Internal + - **IP address type:** IPv4 + - **Network mapping:** Choose the VPC that the VPC Endpoint Service and NLB are being deployed in, and choose subnets from at least two Availability Zones. + - **Listeners:** Create one Listener per Target Group that maps the appropriate incoming port to the corresponding Target Group ([details](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-listeners.html)). +- **Endpoint Service** - The VPC Endpoint Service is what allows for the VPC to VPC connection, routing incoming requests to the configured load balancer. + - **Load balancer type:** Network. + - **Load balancer:** Attach the NLB created in the previous step. + - **Acceptance required (recommended)**: When enabled, requires a new connection request to the VPC Endpoint Service to be accepted by the customer before connectivity is allowed ([details](https://docs.aws.amazon.com/vpc/latest/privatelink/configure-endpoint-service.html#accept-reject-connection-requests)). + + Once these resources have been provisioned, access needs to be granted for the dbt Labs AWS account to create a VPC Endpoint in our VPC. On the newly created VPC Endpoint Service, add a new [Allowed Principal](https://docs.aws.amazon.com/vpc/latest/privatelink/configure-endpoint-service.html#add-remove-permissions) for the appropriate dbt Labs principal: + + - **AWS Account ID:** `arn:aws:iam:::root` (contact your dbt Labs account representative for appropriate account ID). + +### Completing the connection + +To complete the connection, dbt Labs must now provision a VPC Endpoint to connect to your VPC Endpoint Service. 
This requires you send the following information: + + - VPC Endpoint Service name: + + + + - **DNS configuration:** If the connection to the VCS service requires a custom domain and/or URL for TLS, a private hosted zone can be configured by the dbt Labs Infrastructure team in the dbt Cloud private network. For example: + - **Private hosted zone:** `examplecorp.com` + - **DNS record:** `github.examplecorp.com` + +### Accepting the connection request + +When you have been notified that the resources are provisioned within the dbt Cloud environment, you must accept the endpoint connection (unless the VPC Endpoint Service is set to auto-accept connection requests). Requests can be accepted through the AWS console, as seen below, or through the AWS CLI. + + + +Once you accept the endpoint connection request, you can use the PrivateLink endpoint in dbt Cloud. + +## Configure in dbt Cloud + +Once dbt confirms that the PrivateLink integration is complete, you can use it in a new or existing git configuration. +1. Select **PrivateLink Endpoint** as the connection type, and your configured integrations will appear in the dropdown menu. +2. Select the configured endpoint from the drop down list. +3. Click **Save**. + + + + \ No newline at end of file diff --git a/website/docs/docs/collaborate/documentation.md b/website/docs/docs/collaborate/documentation.md index 16a4e610c70..1a989806851 100644 --- a/website/docs/docs/collaborate/documentation.md +++ b/website/docs/docs/collaborate/documentation.md @@ -15,7 +15,7 @@ pagination_prev: null ## Assumed knowledge -* [Tests](/docs/build/tests) +* [Tests](/docs/build/data-tests) ## Overview @@ -32,7 +32,7 @@ Here's an example docs site: ## Adding descriptions to your project -To add descriptions to your project, use the `description:` key in the same files where you declare [tests](/docs/build/tests), like so: +To add descriptions to your project, use the `description:` key in the same files where you declare [tests](/docs/build/data-tests), like so: diff --git a/website/docs/docs/collaborate/explore-multiple-projects.md b/website/docs/docs/collaborate/explore-multiple-projects.md new file mode 100644 index 00000000000..3be35110a37 --- /dev/null +++ b/website/docs/docs/collaborate/explore-multiple-projects.md @@ -0,0 +1,46 @@ +--- +title: "Explore multiple projects" +sidebar_label: "Explore multiple projects" +description: "Learn about project-level lineage in dbt Explorer and its uses." +pagination_next: null +--- + +You can also view all the different projects and public models in the account, where the public models are defined, and how they are used to gain a better understanding about your cross-project resources. + +The resource-level lineage graph for a given project displays the cross-project relationships in the DAG. The different icons indicate whether you’re looking at an upstream producer project (parent) or a downstream consumer project (child). + +When you view an upstream (parent) project, its public models display a counter icon in the upper right corner indicating how many downstream (child) projects depend on them. Selecting a model reveals the lineage indicating the projects dependent on that model. These counts include all projects listing the upstream one as a dependency in its `dependencies.yml`, even without a direct `{{ ref() }}`. Selecting a project node from a public model opens its detailed lineage graph, which is subject to your [permission](/docs/cloud/manage-access/enterprise-permissions). 
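A minimal sketch of what such a properties file can look like, with the model and column names assumed for illustration only:

```yml
version: 2

models:
  - name: events                    # assumed model name
    description: One record per clickstream event collected from the website
    columns:
      - name: event_id
        description: Unique identifier for the event
```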
+ + + +When viewing a downstream (child) project that imports and refs public models from upstream (parent) projects, public models will show up in the lineage graph and display an icon on the graph edge that indicates what the relationship is to a model from another project. Hovering over this icon indicates the specific dbt Cloud project that produces that model. Double-clicking on a model from another project opens the resource-level lineage graph of the parent project, which is subject to your permissions. + + + + +## Explore the project-level lineage graph + +For cross-project collaboration, you can interact with the DAG in all the same ways as described in [Explore your project's lineage](/docs/collaborate/explore-projects#project-lineage) but you can also interact with it at the project level and view the details. + +To get a list view of all the projects, select the account name at the top of the **Explore** page near the navigation bar. This view includes a public model list, project list, and a search bar for project searches. You can also view the project-level lineage graph by clicking the Lineage view icon in the page's upper right corner. + +If you have permissions for a project in the account, you can view all public models used across the entire account. However, you can only view full public model details and private models if you have permissions for a project where the models are defined. + +From the project-level lineage graph, you can: + +- Click the Lineage view icon (in the graph’s upper right corner) to view the cross-project lineage graph. +- Click the List view icon (in the graph’s upper right corner) to view the project list. + - Select a project from the **Projects** tab to switch to that project’s main **Explore** page. + - Select a model from the **Public Models** tab to view the [model’s details page](/docs/collaborate/explore-projects#view-resource-details). + - Perform searches on your projects with the search bar. +- Select a project node in the graph (double-clicking) to switch to that particular project’s lineage graph. + +When you select a project node in the graph, a project details panel opens on the graph’s right-hand side where you can: + +- View counts of the resources defined in the project. +- View a list of its public models, if any. +- View a list of other projects that uses the project, if any. +- Click **Open Project Lineage** to switch to the project’s lineage graph. +- Click the Share icon to copy the project panel link to your clipboard so you can share the graph with someone. + + \ No newline at end of file diff --git a/website/docs/docs/collaborate/explore-projects.md b/website/docs/docs/collaborate/explore-projects.md index 282ef566356..ed5dee93317 100644 --- a/website/docs/docs/collaborate/explore-projects.md +++ b/website/docs/docs/collaborate/explore-projects.md @@ -2,7 +2,7 @@ title: "Explore your dbt projects" sidebar_label: "Explore dbt projects" description: "Learn about dbt Explorer and how to interact with it to understand, improve, and leverage your data pipelines." -pagination_next: null +pagination_next: "docs/collaborate/model-performance" pagination_prev: null --- @@ -36,7 +36,7 @@ For a richer experience with dbt Explorer, you must: - Run [dbt source freshness](/reference/commands/source#dbt-source-freshness) within a job in the environment to view source freshness data. - Run [dbt snapshot](/reference/commands/snapshot) or [dbt build](/reference/commands/build) within a job in the environment to view snapshot details. 
-Richer and more timely metadata will become available as dbt, the Discovery API, and the underlying dbt Cloud platform evolves. +Richer and more timely metadata will become available as dbt Core, the Discovery API, and the underlying dbt Cloud platform evolves. ## Explore your project's lineage graph {#project-lineage} @@ -46,6 +46,8 @@ If you don't see the project lineage graph immediately, click **Render Lineage** The nodes in the lineage graph represent the project’s resources and the edges represent the relationships between the nodes. Nodes are color-coded and include iconography according to their resource type. +By default, dbt Explorer shows the project's [applied state](/docs/dbt-cloud-apis/project-state#definition-logical-vs-applied-state-of-dbt-nodes) lineage. That is, it shows models that have been successfully built and are available to query, not just the models defined in the project. + To explore the lineage graphs of tests and macros, view [their resource details pages](#view-resource-details). By default, dbt Explorer excludes these resources from the full lineage graph unless a search query returns them as results. To interact with the full lineage graph, you can: @@ -53,17 +55,23 @@ To interact with the full lineage graph, you can: - Hover over any item in the graph to display the resource’s name and type. - Zoom in and out on the graph by mouse-scrolling. - Grab and move the graph and the nodes. +- Right click on a node (context menu) to: + - Refocus on the node, including its parent and child nodes + - Refocus on the node and its children only + - Refocus on the node and it parents only + - View the node's [resource details](#view-resource-details) page + - Select a resource to highlight its relationship with other resources in your project. A panel opens on the graph’s right-hand side that displays a high-level summary of the resource’s details. The side panel includes a **General** tab for information like description, materialized type, and other details. - Click the Share icon in the side panel to copy the graph’s link to your clipboard. - Click the View Resource icon in the side panel to [view the resource details](#view-resource-details). -- [Search and select specific resources](#search-resources) or a subset of the DAG using selectors and graph operators. For example: +- [Search and select specific resources](#search-resources) or a subset of the DAG using [selectors](/reference/node-selection/methods) and [graph operators](/reference/node-selection/graph-operators). This can help you narrow the focus on the resources that interest you. For example: - `+[RESOURCE_NAME]` — Displays all parent nodes of the resource - `resource_type:model [RESOURCE_NAME]` — Displays all models matching the name search - [View resource details](#view-resource-details) by selecting a node (double-clicking) in the graph. - Click the List view icon in the graph's upper right corner to return to the main **Explore** page. - + ## Search for resources {#search-resources} @@ -74,9 +82,15 @@ Select a node (single-click) in the lineage graph to highlight its relationship ### Search with keywords When searching with keywords, dbt Explorer searches through your resource metadata (such as resource type, resource name, column name, source name, tags, schema, database, version, alias/identifier, and package name) and returns any matches. -### Search with selector methods +- Keyword search features a side panel (to the right of the main section) to filter search results by resource type. 
+- Use this panel to select specific resource tags or model access levels under the **Models** option. + - For example, a search for "sale" returns results that include all resources with the keyword "sale" in their metadata. Filtering by **Models** and **Sources** refines these results to only include models or sources. + +- When searching for an exact column name, the results show all relational nodes containing that column in their schemas. If there's a match, a notice in the search result indicates the resource contains the specified column. -You can search with [selector methods](/reference/node-selection/methods). Below are the selectors currently available in dbt Explorer: +### Search with selectors + +You can search with [selectors](/reference/node-selection/methods). Below are the selectors currently available in dbt Explorer: - `fqn:` — Find resources by [file or fully qualified name](/reference/node-selection/methods#the-fqn-method). This selector is the search bar's default. If you want to use the default, it's unnecessary to add `fqn:` before the search term. - `source:` — Find resources by a specified [source](/reference/node-selection/methods#the-source-method). @@ -91,23 +105,15 @@ You can search with [selector methods](/reference/node-selection/methods). Below -### Search with graph operators - -You can use [graph operators](/reference/node-selection/graph-operators) on keywords or selector methods. For example, `+orders` returns all the parents of `orders`. +Because the results of selectors are immutable, the filter side panel is not available with this search method. -### Search with set operators +When searching with selector methods, you can also use [graph operators](/reference/node-selection/graph-operators). For example, `+orders` returns all the parents of `orders`. This functionality is not available for keyword search. You can use multiple selector methods in your search query with [set operators](/reference/node-selection/set-operators). A space implies a union set operator and a comma for an intersection. For example: - `resource_type:metric,tag:nightly` — Returns metrics with the tag `nightly` - `+snowplow_sessions +fct_orders` — Returns resources that are parent nodes of either `snowplow_sessions` or `fct_orders` -### Search with both keywords and selector methods - -You can use keyword search to highlight results that are filtered by the selector search. For example, if you don't have a resource called `customers`, then `resource_type:metric customers` returns all the metrics in your project and highlights those that are related to the term `customers` in the name, in a column, tagged as customers, and so on. - -When searching in this way, the selectors behave as filters that you can use to narrow the search and keywords as a way to find matches within those filtered results. - - + ## Browse with the sidebar @@ -120,7 +126,7 @@ To browse using a different view, you can choose one of these options from the * - **File Tree** — All resources in the project organized by the file in which they are defined. This mirrors the file tree in your dbt project repository. - **Database** — All resources in the project organized by the database and schema in which they are built. This mirrors your data platform's structure that represents the [applied state](/docs/dbt-cloud-apis/project-state) of your project. 
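To make the union and intersection behavior described above more concrete: the same set semantics exist in dbt's [YAML selectors](/reference/node-selection/yaml-selectors), so a search such as `resource_type:metric,tag:nightly` corresponds roughly to a selector definition like this sketch (the selector name is illustrative):

```yaml
# selectors.yml: a rough equivalent of the search query resource_type:metric,tag:nightly
selectors:
  - name: nightly_metrics
    definition:
      intersection:
        - method: resource_type
          value: metric
        - method: tag
          value: nightly
```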
- + ## View model versions @@ -132,7 +138,7 @@ You can view the definition and latest run results of any resource in your proje The details (metadata) available to you depends on the resource’s type, its definition, and the [commands](/docs/deploy/job-commands) that run within jobs in the production environment. - + ### Example of model details @@ -143,11 +149,11 @@ An example of the details you might get for a model: - **Lineage** graph — The model’s lineage graph that you can interact with. The graph includes one parent node and one child node from the model. Click the Expand icon in the graph's upper right corner to view the model in full lineage graph mode. - **Description** section — A [description of the model](/docs/collaborate/documentation#adding-descriptions-to-your-project). - **Recent** section — Information on the last time the model ran, how long it ran for, whether the run was successful, the job ID, and the run ID. - - **Tests** section — [Tests](/docs/build/tests) for the model. + - **Tests** section — [Tests](/docs/build/data-tests) for the model, including a status indicator for the latest test status. A :white_check_mark: denotes a passing test. - **Details** section — Key properties like the model’s relation name (for example, how it’s represented and how you can query it in the data platform: `database.schema.identifier`); model governance attributes like access, group, and if contracted; and more. - **Relationships** section — The nodes the model **Depends On**, is **Referenced by**, and (if applicable) is **Used by** for projects that have declared the models' project as a dependency. - **Code** tab — The source code and compiled code for the model. -- **Columns** tab — The available columns in the model. This tab also shows tests results (if any) that you can select to view the test's details page. A :white_check_mark: denotes a passing test. +- **Columns** tab — The available columns in the model. This tab also shows tests results (if any) that you can select to view the test's details page. A :white_check_mark: denotes a passing test. To filter the columns in the resource, you can use the search bar that's located at the top of the columns view. ### Example of exposure details @@ -189,47 +195,6 @@ An example of the details you might get for each source table within a source co - **Relationships** section — A table that lists all the sources used with their freshness status, the timestamp of when freshness was last checked, and the timestamp of when the source was last loaded. - **Columns** tab — The available columns in the source. This tab also shows tests results (if any) that you can select to view the test's details page. A :white_check_mark: denotes a passing test. -## About project-level lineage -You can also view all the different projects and public models in the account, where the public models are defined, and how they are used to gain a better understanding about your cross-project resources. - -When viewing the resource-level lineage graph for a given project that uses cross-project references, you can see cross-project relationships represented in the DAG. The iconography is slightly different depending on whether you're viewing the lineage of an upstream producer project or a downstream consumer project. 
- -When viewing an upstream (parent) project that produces public models that are imported by downstream (child) projects, public models will have a counter icon in their upper right corner that indicates the number of projects that declare the current project as a dependency. Selecting that model reveals the lineage to show the specific projects that are dependent on this model. Projects show up in this counter if they declare the parent project as a dependency in its `dependencies.yml` regardless of whether or not there's a direct `{{ ref() }}` against the public model. Selecting a project node from a public model opens the resource-level lineage graph for that project, which is subject to your permissions. - - - -When viewing a downstream (child) project that imports and refs public models from upstream (parent) projects, public models will show up in the lineage graph and display an icon on the graph edge that indicates what the relationship is to a model from another project. Hovering over this icon indicates the specific dbt Cloud project that produces that model. Double-clicking on a model from another project opens the resource-level lineage graph of the parent project, which is subject to your permissions. - - - - -### Explore the project-level lineage graph - -For cross-project collaboration, you can interact with the DAG in all the same ways as described in [Explore your project's lineage](#project-lineage) but you can also interact with it at the project level and view the details. - -To get a list view of all the projects, select the account name at the top of the **Explore** page near the navigation bar. This view includes a public model list, project list, and a search bar for project searches. You can also view the project-level lineage graph by clicking the Lineage view icon in the page's upper right corner. - -If you have permissions for a project in the account, you can view all public models used across the entire account. However, you can only view full public model details and private models if you have permissions for a project where the models are defined. - -From the project-level lineage graph, you can: - -- Click the Lineage view icon (in the graph’s upper right corner) to view the cross-project lineage graph. -- Click the List view icon (in the graph’s upper right corner) to view the project list. - - Select a project from the **Projects** tab to switch to that project’s main **Explore** page. - - Select a model from the **Public Models** tab to view the [model’s details page](#view-resource-details). - - Perform searches on your projects with the search bar. -- Select a project node in the graph (double-clicking) to switch to that particular project’s lineage graph. - -When you select a project node in the graph, a project details panel opens on the graph’s right-hand side where you can: - -- View counts of the resources defined in the project. -- View a list of its public models, if any. -- View a list of other projects that uses the project, if any. -- Click **Open Project Lineage** to switch to the project’s lineage graph. -- Click the Share icon to copy the project panel link to your clipboard so you can share the graph with someone. 
- - - ## Related content - [Enterprise permissions](/docs/cloud/manage-access/enterprise-permissions) - [About model governance](/docs/collaborate/govern/about-model-governance) diff --git a/website/docs/docs/collaborate/govern/model-contracts.md b/website/docs/docs/collaborate/govern/model-contracts.md index bb011119958..e3ea1e8c70c 100644 --- a/website/docs/docs/collaborate/govern/model-contracts.md +++ b/website/docs/docs/collaborate/govern/model-contracts.md @@ -183,9 +183,9 @@ Any model meeting the criteria described above _can_ define a contract. We recom A model's contract defines the **shape** of the returned dataset. If the model's logic or input data doesn't conform to that shape, the model does not build. -[Tests](/docs/build/tests) are a more flexible mechanism for validating the content of your model _after_ it's built. So long as you can write the query, you can run the test. Tests are more configurable, such as with [custom severity thresholds](/reference/resource-configs/severity). They are easier to debug after finding failures, because you can query the already-built model, or [store the failing records in the data warehouse](/reference/resource-configs/store_failures). +[Data Tests](/docs/build/data-tests) are a more flexible mechanism for validating the content of your model _after_ it's built. So long as you can write the query, you can run the data test. Data tests are more configurable, such as with [custom severity thresholds](/reference/resource-configs/severity). They are easier to debug after finding failures, because you can query the already-built model, or [store the failing records in the data warehouse](/reference/resource-configs/store_failures). -In some cases, you can replace a test with its equivalent constraint. This has the advantage of guaranteeing the validation at build time, and it probably requires less compute (cost) in your data platform. The prerequisites for replacing a test with a constraint are: +In some cases, you can replace a data test with its equivalent constraint. This has the advantage of guaranteeing the validation at build time, and it probably requires less compute (cost) in your data platform. The prerequisites for replacing a data test with a constraint are: - Making sure that your data platform can support and enforce the constraint that you need. Most platforms only enforce `not_null`. - Materializing your model as `table` or `incremental` (**not** `view` or `ephemeral`). - Defining a full contract for this model by specifying the `name` and `data_type` of each column. diff --git a/website/docs/docs/collaborate/govern/project-dependencies.md b/website/docs/docs/collaborate/govern/project-dependencies.md index 174e4572890..569d69a87e6 100644 --- a/website/docs/docs/collaborate/govern/project-dependencies.md +++ b/website/docs/docs/collaborate/govern/project-dependencies.md @@ -22,8 +22,12 @@ This year, dbt Labs is introducing an expanded notion of `dependencies` across m - **Packages** — Familiar and pre-existing type of dependency. You take this dependency by installing the package's full source code (like a software library). - **Projects** — A _new_ way to take a dependency on another project. Using a metadata service that runs behind the scenes, dbt Cloud resolves references on-the-fly to public models defined in other projects. You don't need to parse or run those upstream models yourself. Instead, you treat your dependency on those models as an API that returns a dataset. 
The maintainer of the public model is responsible for guaranteeing its quality and stability. +import UseCaseInfo from '/snippets/_packages_or_dependencies.md'; + + + +Refer to the [FAQs](#faqs) for more info. -Starting in dbt v1.6 or higher, `packages.yml` has been renamed to `dependencies.yml`. However, if you need use Jinja within your packages config, such as an environment variable for your private package, you need to keep using `packages.yml` for your packages for now. Refer to the [FAQs](#faqs) for more info. ## Prerequisites @@ -33,22 +37,6 @@ In order to add project dependencies and resolve cross-project `ref`, you must: - Have a successful run of the upstream ("producer") project - Have a multi-tenant or single-tenant [dbt Cloud Enterprise](https://www.getdbt.com/pricing) account (Azure ST is not supported but coming soon) - ## Example As an example, let's say you work on the Marketing team at the Jaffle Shop. The name of your team's project is `jaffle_marketing`: diff --git a/website/docs/docs/collaborate/model-performance.md b/website/docs/docs/collaborate/model-performance.md new file mode 100644 index 00000000000..7ef675b4e1e --- /dev/null +++ b/website/docs/docs/collaborate/model-performance.md @@ -0,0 +1,41 @@ +--- +title: "Model performance" +sidebar_label: "Model performance" +description: "Learn about ." +--- + +dbt Explorer provides metadata on dbt Cloud runs for in-depth model performance and quality analysis. This feature assists in reducing infrastructure costs and saving time for data teams by highlighting where to fine-tune projects and deployments — such as model refactoring or job configuration adjustments. + + + +:::tip Beta + +The model performance beta feature is now available in dbt Explorer! Check it out! +::: + +## The Performance overview page + +You can pinpoint areas for performance enhancement by using the Performance overview page. This page presents a comprehensive analysis across all project models and displays the longest-running models, those most frequently executed, and the ones with the highest failure rates during runs/tests. Data can be segmented by environment and job type which can offer insights into: + +- Most executed models (total count). +- Models with the longest execution time (average duration). +- Models with the most failures, detailing run failures (percentage and count) and test failures (percentage and count). + +Each data point links to individual models in Explorer. + + + +You can view historical metadata for up to the past three months. Select the time horizon using the filter, which defaults to a two-week lookback. + + + +## The Model performance tab + +You can view trends in execution times, counts, and failures by using the Model performance tab for historical performance analysis. Daily execution data includes: + +- Average model execution time. +- Model execution counts, including failures/errors (total sum). + +Clicking on a data point reveals a table listing all job runs for that day, with each row providing a direct link to the details of a specific run. 
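Picking the `jaffle_marketing` example above back up, here is a sketch of the two files involved in declaring the upstream project (the upstream project name `jaffle_finance` is illustrative and assumes that project has already run successfully):

```yaml
# dbt_project.yml (downstream project)
name: jaffle_marketing

# dependencies.yml (same project)
projects:
  - name: jaffle_finance   # upstream project whose public models can be referenced
```

With that in place, models in `jaffle_marketing` can reference public models from the upstream project with a two-argument ref, for example `{{ ref('jaffle_finance', 'monthly_revenue') }}` (the model name is also illustrative).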
+ + \ No newline at end of file diff --git a/website/docs/docs/collaborate/project-recommendations.md b/website/docs/docs/collaborate/project-recommendations.md new file mode 100644 index 00000000000..e6263a875fc --- /dev/null +++ b/website/docs/docs/collaborate/project-recommendations.md @@ -0,0 +1,50 @@ +--- +title: "Project recommendations" +sidebar_label: "Project recommendations" +description: "dbt Explorer provides recommendations that you can take to improve the quality of your dbt project." +--- + +:::tip Beta + +The project recommendations beta feature is now available in dbt Explorer! Check it out! + +::: + +dbt Explorer provides recommendations about your project from the `dbt_project_evaluator` [package](https://hub.getdbt.com/dbt-labs/dbt_project_evaluator/latest/) using metadata from the Discovery API. + +Explorer also offers a global view, showing all the recommendations across the project for easy sorting and summarizing. + +These recommendations provide insight into how you can build a more well documented, well tested, and well built project, leading to less confusion and more trust. + +The Recommendations overview page includes two top-level metrics measuring the test and documentation coverage of the models in your project. + +- **Model test coverage** — The percent of models in your project (models not from a package or imported via dbt Mesh) with at least one dbt test configured on them. +- **Model documentation coverage** — The percent of models in your project (models not from a package or imported via dbt Mesh) with a description. + + + +## List of rules + +| Category | Name | Description | Package Docs Link | +| --- | --- | --- | --- | +| Modeling | Direct Join to Source | Model that joins both a model and source, indicating a missing staging model | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#direct-join-to-source) | +| Modeling | Duplicate Sources | More than one source node corresponds to the same data warehouse relation | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#duplicate-sources) | +| Modeling | Multiple Sources Joined | Models with more than one source parent, indicating lack of staging models | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#multiple-sources-joined) | +| Modeling | Root Model | Models with no parents, indicating potential hardcoded references and need for sources | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#root-models) | +| Modeling | Source Fanout | Sources with more than one model child, indicating a need for staging models | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#source-fanout) | +| Modeling | Unused Source | Sources that are not referenced by any resource | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/modeling/#unused-sources) | +| Performance | Exposure Dependent on View | Exposures with at least one model parent materialized as a view, indicating potential query performance issues | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/performance/#exposure-parents-materializations) | +| Testing | Missing Primary Key Test | Models with insufficient testing on the grain of the model. 
| [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/testing/#missing-primary-key-tests) | +| Documentation | Undocumented Models | Models without a model-level description | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-models) | +| Documentation | Undocumented Source | Sources (collections of source tables) without descriptions | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-sources) | +| Documentation | Undocumented Source Tables | Source tables without descriptions | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/documentation/#undocumented-source-tables) | +| Governance | Public Model Missing Contract | Models with public access that do not have a model contract to ensure the data types | [GitHub](https://dbt-labs.github.io/dbt-project-evaluator/0.8/rules/governance/#public-models-without-contracts) | + + +## The Recommendations tab + +Models, sources and exposures each also have a Recommendations tab on their resource details page, with the specific recommendations that correspond to that resource: + + + + diff --git a/website/docs/docs/community-adapters.md b/website/docs/docs/community-adapters.md index 444ea0e04b4..d1e63f03128 100644 --- a/website/docs/docs/community-adapters.md +++ b/website/docs/docs/community-adapters.md @@ -17,4 +17,4 @@ Community adapters are adapter plugins contributed and maintained by members of | [TiDB](/docs/core/connect-data-platform/tidb-setup) | [Firebolt](/docs/core/connect-data-platform/firebolt-setup) | [MindsDB](/docs/core/connect-data-platform/mindsdb-setup) | [Vertica](/docs/core/connect-data-platform/vertica-setup) | [AWS Glue](/docs/core/connect-data-platform/glue-setup) | [MySQL](/docs/core/connect-data-platform/mysql-setup) | | [Upsolver](/docs/core/connect-data-platform/upsolver-setup) | [Databend Cloud](/docs/core/connect-data-platform/databend-setup) | [fal - Python models](/docs/core/connect-data-platform/fal-setup) | - +| [TimescaleDB](https://dbt-timescaledb.debruyn.dev/) | | | diff --git a/website/docs/docs/connect-adapters.md b/website/docs/docs/connect-adapters.md index 6ccc1b4f376..56ff538dc9b 100644 --- a/website/docs/docs/connect-adapters.md +++ b/website/docs/docs/connect-adapters.md @@ -15,7 +15,7 @@ Explore the fastest and most reliable way to deploy dbt using dbt Cloud, a hoste Install dbt Core, an open-source tool, locally using the command line. dbt communicates with a number of different data platforms by using a dedicated adapter plugin for each. When you install dbt Core, you'll also need to install the specific adapter for your database, [connect to dbt Core](/docs/core/about-core-setup), and set up a `profiles.yml` file. -With a few exceptions [^1], you can install all [Verified adapters](/docs/supported-data-platforms) from PyPI using `python -m pip install adapter-name`. For example to install Snowflake, use the command `python -m pip install dbt-snowflake`. The installation will include `dbt-core` and any other required dependencies, which may include both other dependencies and even other adapter plugins. Read more about [installing dbt](/docs/core/installation). +With a few exceptions [^1], you can install all [Verified adapters](/docs/supported-data-platforms) from PyPI using `python -m pip install adapter-name`. For example to install Snowflake, use the command `python -m pip install dbt-snowflake`. 
The installation will include `dbt-core` and any other required dependencies, which may include both other dependencies and even other adapter plugins. Read more about [installing dbt](/docs/core/installation-overview). [^1]: Here are the two different adapters. Use the PyPI package name when installing with `pip` diff --git a/website/docs/docs/core/about-core-setup.md b/website/docs/docs/core/about-core-setup.md index 64e7694b793..8b170ba70d4 100644 --- a/website/docs/docs/core/about-core-setup.md +++ b/website/docs/docs/core/about-core-setup.md @@ -3,7 +3,7 @@ title: About dbt Core setup id: about-core-setup description: "Configuration settings for dbt Core." sidebar_label: "About dbt Core setup" -pagination_next: "docs/core/about-dbt-core" +pagination_next: "docs/core/dbt-core-environments" pagination_prev: null --- @@ -11,9 +11,10 @@ dbt Core is an [open-source](https://github.com/dbt-labs/dbt-core) tool that ena This section of our docs will guide you through various settings to get started: -- [About dbt Core](/docs/core/about-dbt-core) -- [Installing dbt](/docs/core/installation) - [Connecting to a data platform](/docs/core/connect-data-platform/profiles.yml) - [How to run your dbt projects](/docs/running-a-dbt-project/run-your-dbt-projects) +To learn about developing dbt projects in dbt Cloud, refer to [Develop with dbt Cloud](/docs/cloud/about-develop-dbt). + - dbt Cloud provides a command line interface with the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation). Both dbt Core and the dbt Cloud CLI are command line tools that let you run dbt commands. The key distinction is the dbt Cloud CLI is tailored for dbt Cloud's infrastructure and integrates with all its [features](/docs/cloud/about-cloud/dbt-cloud-features). + If you need a more detailed first-time setup guide for specific data platforms, read our [quickstart guides](https://docs.getdbt.com/guides). diff --git a/website/docs/docs/core/about-dbt-core.md b/website/docs/docs/core/about-dbt-core.md deleted file mode 100644 index a35d92420f3..00000000000 --- a/website/docs/docs/core/about-dbt-core.md +++ /dev/null @@ -1,25 +0,0 @@ ---- -title: "About dbt Core" -id: "about-dbt-core" -sidebar_label: "About dbt Core" ---- - -[dbt Core](https://github.com/dbt-labs/dbt-core) is an open sourced project where you can develop from the command line and run your dbt project. - -To use dbt Core, your workflow generally looks like: - -1. **Build your dbt project in a code editor —** popular choices include VSCode and Atom. - -2. **Run your project from the command line —** macOS ships with a default Terminal program, however you can also use iTerm or the command line prompt within a code editor to execute dbt commands. - -:::info How we set up our computers for working on dbt projects - -We've written a [guide](https://discourse.getdbt.com/t/how-we-set-up-our-computers-for-working-on-dbt-projects/243) for our recommended setup when running dbt projects using dbt Core. - -::: - -If you're using the command line, we recommend learning some basics of your terminal to help you work more effectively. In particular, it's important to understand `cd`, `ls` and `pwd` to be able to navigate through the directory structure of your computer easily. - -You can find more information on installing and setting up the dbt Core [here](/docs/core/installation). - -**Note** — dbt supports a dbt Cloud CLI and dbt Core, both command line interface tools that enable you to run dbt commands. 
The key distinction is the dbt Cloud CLI is tailored for dbt Cloud's infrastructure and integrates with all its [features](/docs/cloud/about-cloud/dbt-cloud-features).
diff --git a/website/docs/docs/core/connect-data-platform/about-core-connections.md b/website/docs/docs/core/connect-data-platform/about-core-connections.md
index 492e5ae878a..61a7805d232 100644
--- a/website/docs/docs/core/connect-data-platform/about-core-connections.md
+++ b/website/docs/docs/core/connect-data-platform/about-core-connections.md
@@ -14,6 +14,7 @@ dbt Core can connect with a variety of data platform providers including:
 - [Apache Spark](/docs/core/connect-data-platform/spark-setup)
 - [Databricks](/docs/core/connect-data-platform/databricks-setup)
 - [Google BigQuery](/docs/core/connect-data-platform/bigquery-setup)
+- [Microsoft Fabric](/docs/core/connect-data-platform/fabric-setup)
 - [PostgreSQL](/docs/core/connect-data-platform/postgres-setup)
 - [Snowflake](/docs/core/connect-data-platform/snowflake-setup)
 - [Starburst or Trino](/docs/core/connect-data-platform/trino-setup)
diff --git a/website/docs/docs/core/connect-data-platform/fabric-setup.md b/website/docs/docs/core/connect-data-platform/fabric-setup.md
index 11a8cf6f98b..deef1e04b22 100644
--- a/website/docs/docs/core/connect-data-platform/fabric-setup.md
+++ b/website/docs/docs/core/connect-data-platform/fabric-setup.md
@@ -8,7 +8,7 @@ meta:
   github_repo: 'Microsoft/dbt-fabric'
   pypi_package: 'dbt-fabric'
   min_core_version: '1.4.0'
-  cloud_support: Not Supported
+  cloud_support: Supported
   platform_name: 'Microsoft Fabric'
   config_page: '/reference/resource-configs/fabric-configs'
 ---
diff --git a/website/docs/docs/core/connect-data-platform/infer-setup.md b/website/docs/docs/core/connect-data-platform/infer-setup.md
index 7642c553cc4..c04fba59a56 100644
--- a/website/docs/docs/core/connect-data-platform/infer-setup.md
+++ b/website/docs/docs/core/connect-data-platform/infer-setup.md
@@ -12,16 +12,21 @@ meta:
   slack_channel_name: n/a
   slack_channel_link:
   platform_name: 'Infer'
-  config_page: '/reference/resource-configs/no-configs'
+  config_page: '/reference/resource-configs/infer-configs'
   min_supported_version: n/a
 ---
+:::info Vendor-supported plugin
+
+Certain core functionality may vary. If you would like to report a bug, request a feature, or contribute, you can check out the linked repository and open an issue.
+
+:::
+
 import SetUpPages from '/snippets/_setup-pages-intro.md';
-
 ## Connecting to Infer with **dbt-infer**
 Infer allows you to perform advanced ML Analytics within SQL as if native to your data warehouse.
@@ -30,10 +35,18 @@ you can build advanced analysis for any business use case.
 Read more about SQL-inf and Infer in the [Infer documentation](https://docs.getinfer.io/).
 The `dbt-infer` package allow you to use SQL-inf easily within your DBT models.
-You can read more about the `dbt-infer` package itself and how it connecst to Infer in the [dbt-infer documentation](https://dbt.getinfer.io/).
+You can read more about the `dbt-infer` package itself and how it connects to Infer in the [dbt-infer documentation](https://dbt.getinfer.io/).
+
+The dbt-infer adapter is maintained via PyPI and installed with pip.
+To install the latest dbt-infer package, simply run the following within the same shell as you run dbt.
+```shell
+pip install dbt-infer
+```
+
+Versioning of dbt-infer follows the standard dbt versioning scheme, meaning that if you are using dbt 1.2, the corresponding dbt-infer version will be named 1.2.x, where x is the latest minor version number.
Before using SQL-inf in your DBT models you need to setup an Infer account and generate an API-key for the connection. -You can read how to do that in the [Getting Started Guide](https://dbt.getinfer.io/docs/getting_started#sign-up-to-infer). +You can read how to do that in the [Getting Started Guide](https://docs.getinfer.io/docs/reference/integrations/dbt). The profile configuration in `profiles.yml` for `dbt-infer` should look something like this: @@ -101,10 +114,10 @@ as native SQL functions. Infer supports a number of SQL-inf commands, including `PREDICT`, `EXPLAIN`, `CLUSTER`, `SIMILAR_TO`, `TOPICS`, `SENTIMENT`. -You can read more about SQL-inf and the commands it supports in the [SQL-inf Reference Guide](https://docs.getinfer.io/docs/reference). +You can read more about SQL-inf and the commands it supports in the [SQL-inf Reference Guide](https://docs.getinfer.io/docs/category/commands). To get you started we will give a brief example here of what such a model might look like. -You can find other more complex examples on the [dbt-infer examples page](https://dbt.getinfer.io/docs/examples). +You can find other more complex examples in the [dbt-infer examples repo](https://github.com/inferlabs/dbt-infer-examples). In our simple example, we will show how to use a previous model 'user_features' to predict churn by predicting the column `has_churned`. diff --git a/website/docs/docs/core/connect-data-platform/profiles.yml.md b/website/docs/docs/core/connect-data-platform/profiles.yml.md index 97254dda1c4..f8acb65f3d2 100644 --- a/website/docs/docs/core/connect-data-platform/profiles.yml.md +++ b/website/docs/docs/core/connect-data-platform/profiles.yml.md @@ -3,7 +3,7 @@ title: "About profiles.yml" id: profiles.yml --- -If you're using [dbt Core](/docs/core/about-dbt-core), you'll need a `profiles.yml` file that contains the connection details for your data platform. When you run dbt Core from the command line, it reads your `dbt_project.yml` file to find the `profile` name, and then looks for a profile with the same name in your `profiles.yml` file. This profile contains all the information dbt needs to connect to your data platform. +If you're using [dbt Core](/docs/core/installation-overview), you'll need a `profiles.yml` file that contains the connection details for your data platform. When you run dbt Core from the command line, it reads your `dbt_project.yml` file to find the `profile` name, and then looks for a profile with the same name in your `profiles.yml` file. This profile contains all the information dbt needs to connect to your data platform. For detailed info, you can refer to the [Connection profiles](/docs/core/connect-data-platform/connection-profiles). diff --git a/website/docs/docs/core/docker-install.md b/website/docs/docs/core/docker-install.md index 8de3bcb5c06..6c1ec9da9e1 100644 --- a/website/docs/docs/core/docker-install.md +++ b/website/docs/docs/core/docker-install.md @@ -11,7 +11,7 @@ You might also be able to use Docker to install and develop locally if you don't ### Prerequisites * You've installed Docker. For more information, see the [Docker](https://docs.docker.com/) site. -* You understand which database adapter(s) you need. For more information, see [About dbt adapters](/docs/core/installation#about-dbt-adapters). +* You understand which database adapter(s) you need. For more information, see [About dbt adapters](docs/core/installation-overview#about-dbt-data-platforms-and-adapters). * You understand how dbt Core is versioned. 
For more information, see [About dbt Core versions](/docs/dbt-versions/core). * You have a general understanding of the dbt, dbt workflow, developing locally in the command line interface (CLI). For more information, see [About dbt](/docs/introduction#how-do-i-use-dbt). diff --git a/website/docs/docs/core/installation-overview.md b/website/docs/docs/core/installation-overview.md index cb1df26b0f8..8c139012667 100644 --- a/website/docs/docs/core/installation-overview.md +++ b/website/docs/docs/core/installation-overview.md @@ -1,25 +1,35 @@ --- -title: "About installing dbt" -id: "installation" +title: "About dbt Core and installation" description: "You can install dbt Core using a few different tested methods." pagination_next: "docs/core/homebrew-install" pagination_prev: null --- +[dbt Core](https://github.com/dbt-labs/dbt-core) is an open sourced project where you can develop from the command line and run your dbt project. + +To use dbt Core, your workflow generally looks like: + +1. **Build your dbt project in a code editor —** popular choices include VSCode and Atom. + +2. **Run your project from the command line —** macOS ships with a default Terminal program, however you can also use iTerm or the command line prompt within a code editor to execute dbt commands. + +:::info How we set up our computers for working on dbt projects + +We've written a [guide](https://discourse.getdbt.com/t/how-we-set-up-our-computers-for-working-on-dbt-projects/243) for our recommended setup when running dbt projects using dbt Core. + +::: + +If you're using the command line, we recommend learning some basics of your terminal to help you work more effectively. In particular, it's important to understand `cd`, `ls` and `pwd` to be able to navigate through the directory structure of your computer easily. + +## Install dbt Core + You can install dbt Core on the command line by using one of these methods: - [Use pip to install dbt](/docs/core/pip-install) (recommended) - [Use Homebrew to install dbt](/docs/core/homebrew-install) - [Use a Docker image to install dbt](/docs/core/docker-install) - [Install dbt from source](/docs/core/source-install) - -:::tip Pro tip: Using the --help flag - -Most command-line tools, including dbt, have a `--help` flag that you can use to show available commands and arguments. For example, you can use the `--help` flag with dbt in two ways:

-— `dbt --help`: Lists the commands available for dbt
-— `dbt run --help`: Lists the flags available for the `run` command - -::: +- You can also develop locally using the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation). The dbt Cloud CLI and dbt Core are both command line tools that let you run dbt commands. The key distinction is the dbt Cloud CLI is tailored for dbt Cloud's infrastructure and integrates with all its [features](/docs/cloud/about-cloud/dbt-cloud-features). ## Upgrading dbt Core @@ -32,3 +42,11 @@ dbt provides a number of resources for understanding [general best practices](/b ## About dbt data platforms and adapters dbt works with a number of different data platforms (databases, query engines, and other SQL-speaking technologies). It does this by using a dedicated _adapter_ for each. When you install dbt Core, you'll also want to install the specific adapter for your database. For more details, see [Supported Data Platforms](/docs/supported-data-platforms). + +:::tip Pro tip: Using the --help flag + +Most command-line tools, including dbt, have a `--help` flag that you can use to show available commands and arguments. For example, you can use the `--help` flag with dbt in two ways:

+— `dbt --help`: Lists the commands available for dbt
+— `dbt run --help`: Lists the flags available for the `run` command + +::: diff --git a/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx b/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx index 8b02c5601ad..faebcd9ec2c 100644 --- a/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx +++ b/website/docs/docs/dbt-cloud-apis/schema-discovery-job.mdx @@ -61,4 +61,4 @@ query JobQueryExample { ### Fields When querying an `job`, you can use the following fields. - + diff --git a/website/docs/docs/dbt-cloud-apis/schema.jsx b/website/docs/docs/dbt-cloud-apis/schema.jsx index 31568671573..9ac4656c984 100644 --- a/website/docs/docs/dbt-cloud-apis/schema.jsx +++ b/website/docs/docs/dbt-cloud-apis/schema.jsx @@ -173,7 +173,7 @@ export const NodeArgsTable = ({ parent, name, useBetaAPI }) => { ) } -export const SchemaTable = ({ nodeName, useBetaAPI }) => { +export const SchemaTable = ({ nodeName, useBetaAPI, exclude = [] }) => { const [data, setData] = useState(null) useEffect(() => { const fetchData = () => { @@ -255,6 +255,7 @@ export const SchemaTable = ({ nodeName, useBetaAPI }) => { {data.data.__type.fields.map(function ({ name, description, type }) { + if (exclude.includes(name)) return; return ( {name} diff --git a/website/docs/docs/dbt-cloud-apis/service-tokens.md b/website/docs/docs/dbt-cloud-apis/service-tokens.md index f1369711d2b..b0b5fbd6cfe 100644 --- a/website/docs/docs/dbt-cloud-apis/service-tokens.md +++ b/website/docs/docs/dbt-cloud-apis/service-tokens.md @@ -51,7 +51,7 @@ Job admin service tokens can authorize requests for viewing, editing, and creati Member service tokens can authorize requests for viewing and editing resources, triggering runs, and inviting members to the account. Tokens assigned the Member permission set will have the same permissions as a Member user. For more information about Member users, see "[Self-service permissions](/docs/cloud/manage-access/self-service-permissions)". **Read-only**
-Read-only service tokens can authorize requests for viewing a read-only dashboard, viewing generated documentation, and viewing source freshness reports. +Read-only service tokens can authorize requests for viewing a read-only dashboard, viewing generated documentation, and viewing source freshness reports. This token can access and retrieve account-level information endpoints on the [Admin API](/docs/dbt-cloud-apis/admin-cloud-api) and authorize requests to the [Discovery API](/docs/dbt-cloud-apis/discovery-api). ### Enterprise plans using service account tokens diff --git a/website/docs/docs/dbt-cloud-apis/sl-jdbc.md b/website/docs/docs/dbt-cloud-apis/sl-jdbc.md index 931666dd10c..aba309566f8 100644 --- a/website/docs/docs/dbt-cloud-apis/sl-jdbc.md +++ b/website/docs/docs/dbt-cloud-apis/sl-jdbc.md @@ -352,6 +352,8 @@ semantic_layer.query(metrics=['food_order_amount', 'order_gross_profit'], ## FAQs + + - **Why do some dimensions use different syntax, like `metric_time` versus `[Dimension('metric_time')`?**
When you select a dimension on its own, such as `metric_time` you can use the shorthand method which doesn't need the “Dimension” syntax. However, when you perform operations on the dimension, such as adding granularity, the object syntax `[Dimension('metric_time')` is required. diff --git a/website/docs/docs/dbt-support.md b/website/docs/docs/dbt-support.md index 40968b9d763..84bf92482c5 100644 --- a/website/docs/docs/dbt-support.md +++ b/website/docs/docs/dbt-support.md @@ -17,7 +17,9 @@ If you're developing on the command line (CLI) and have questions or need some h ## dbt Cloud support -The global dbt Support team is available to dbt Cloud customers by email or in-product live chat. We want to help you work through implementing and utilizing dbt Cloud at your organization. Have a question you can't find an answer to in [our docs](https://docs.getdbt.com/) or [the Community Forum](https://discourse.getdbt.com/)? Our Support team is here to `dbt help` you! +The global dbt Support team is available to dbt Cloud customers by [email](mailto:support@getdbt.com) or using the in-product live chat (💬). + +We want to help you work through implementing and utilizing dbt Cloud at your organization. Have a question you can't find an answer to in [our docs](https://docs.getdbt.com/) or [the Community Forum](https://discourse.getdbt.com/)? Our Support team is here to `dbt help` you! - **Enterprise plans** — Priority [support](#severity-level-for-enterprise-support), options for custom support coverage hours, implementation assistance, dedicated management, and dbt Labs security reviews depending on price point. - **Developer and Team plans** — 24x5 support (no service level agreement (SLA); [contact Sales](https://www.getdbt.com/pricing/) for Enterprise plan inquiries). 
diff --git a/website/docs/docs/dbt-versions/core-upgrade/00-upgrading-to-v1.7.md b/website/docs/docs/dbt-versions/core-upgrade/00-upgrading-to-v1.7.md index 18863daba6f..af098860e6f 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/00-upgrading-to-v1.7.md +++ b/website/docs/docs/dbt-versions/core-upgrade/00-upgrading-to-v1.7.md @@ -12,7 +12,7 @@ import UpgradeMove from '/snippets/_upgrade-move.md'; ## Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/8aaed0e29f9560bc53d9d3e88325a9597318e375/CHANGELOG.md) -- [CLI Installation guide](/docs/core/installation) +- [CLI Installation guide](/docs/core/installation-overview) - [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) - [Release schedule](https://github.com/dbt-labs/dbt-core/issues/8260) diff --git a/website/docs/docs/dbt-versions/core-upgrade/01-upgrading-to-v1.6.md b/website/docs/docs/dbt-versions/core-upgrade/01-upgrading-to-v1.6.md index d36cc544814..33a038baa9b 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/01-upgrading-to-v1.6.md +++ b/website/docs/docs/dbt-versions/core-upgrade/01-upgrading-to-v1.6.md @@ -17,7 +17,7 @@ dbt Core v1.6 has three significant areas of focus: ## Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.6.latest/CHANGELOG.md) -- [CLI Installation guide](/docs/core/installation) +- [dbt Core installation guide](/docs/core/installation-overview) - [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) - [Release schedule](https://github.com/dbt-labs/dbt-core/issues/7481) diff --git a/website/docs/docs/dbt-versions/core-upgrade/02-upgrading-to-v1.5.md b/website/docs/docs/dbt-versions/core-upgrade/02-upgrading-to-v1.5.md index dded8a690fe..e739caa477a 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/02-upgrading-to-v1.5.md +++ b/website/docs/docs/dbt-versions/core-upgrade/02-upgrading-to-v1.5.md @@ -16,7 +16,7 @@ dbt Core v1.5 is a feature release, with two significant additions: ## Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.5.latest/CHANGELOG.md) -- [CLI Installation guide](/docs/core/installation) +- [CLI Installation guide](/docs/core/installation-overview) - [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) - [Release schedule](https://github.com/dbt-labs/dbt-core/issues/6715) diff --git a/website/docs/docs/dbt-versions/core-upgrade/03-upgrading-to-dbt-utils-v1.0.md b/website/docs/docs/dbt-versions/core-upgrade/03-upgrading-to-dbt-utils-v1.0.md index a7b302c9a58..229a54627fc 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/03-upgrading-to-dbt-utils-v1.0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/03-upgrading-to-dbt-utils-v1.0.md @@ -82,7 +82,7 @@ models: # ...with this... where: "created_at > '2018-12-31'" ``` -**Note** — This may cause some tests to get the same autogenerated names. To resolve this, you can [define a custom name for a test](/reference/resource-properties/tests#define-a-custom-name-for-one-test). +**Note** — This may cause some tests to get the same autogenerated names. To resolve this, you can [define a custom name for a test](/reference/resource-properties/data-tests#define-a-custom-name-for-one-test). - The deprecated `unique_where` and `not_null_where` tests have been removed, because [where is now available natively to all tests](https://docs.getdbt.com/reference/resource-configs/where). To migrate, find and replace `dbt_utils.unique_where` with `unique` and `dbt_utils.not_null_where` with `not_null`. 
- `dbt_utils.current_timestamp()` is replaced by `dbt.current_timestamp()`. - Note that Postgres and Snowflake’s implementation of `dbt.current_timestamp()` differs from the old `dbt_utils` one ([full details here](https://github.com/dbt-labs/dbt-utils/pull/597#issuecomment-1231074577)). If you use Postgres or Snowflake and need identical backwards-compatible behavior, use `dbt.current_timestamp_backcompat()`. This discrepancy will hopefully be reconciled in a future version of dbt Core. diff --git a/website/docs/docs/dbt-versions/core-upgrade/04-upgrading-to-v1.4.md b/website/docs/docs/dbt-versions/core-upgrade/04-upgrading-to-v1.4.md index 6c6d96b2326..a946bdf369b 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/04-upgrading-to-v1.4.md +++ b/website/docs/docs/dbt-versions/core-upgrade/04-upgrading-to-v1.4.md @@ -12,7 +12,7 @@ import UpgradeMove from '/snippets/_upgrade-move.md'; ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.4.latest/CHANGELOG.md) -- [CLI Installation guide](/docs/core/installation) +- [CLI Installation guide](/docs/core/installation-overview) - [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) **Final release:** January 25, 2023 diff --git a/website/docs/docs/dbt-versions/core-upgrade/05-upgrading-to-v1.3.md b/website/docs/docs/dbt-versions/core-upgrade/05-upgrading-to-v1.3.md index f66d9bb9706..d9d97f17dc5 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/05-upgrading-to-v1.3.md +++ b/website/docs/docs/dbt-versions/core-upgrade/05-upgrading-to-v1.3.md @@ -12,7 +12,7 @@ import UpgradeMove from '/snippets/_upgrade-move.md'; ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.3.latest/CHANGELOG.md) -- [CLI Installation guide](/docs/core/installation) +- [CLI Installation guide](/docs/core/installation-overview) - [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) ## What to know before upgrading diff --git a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.2.md b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.2.md index 16825ff4e2b..72a3e0c82ad 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.2.md +++ b/website/docs/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.2.md @@ -12,7 +12,7 @@ import UpgradeMove from '/snippets/_upgrade-move.md'; ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.2.latest/CHANGELOG.md) -- [CLI Installation guide](/docs/core/installation) +- [CLI Installation guide](/docs/core/installation-overview) - [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) ## What to know before upgrading diff --git a/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.1.md b/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.1.md index 403264a46e6..868f3c7ed04 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.1.md +++ b/website/docs/docs/dbt-versions/core-upgrade/07-upgrading-to-v1.1.md @@ -12,7 +12,7 @@ import UpgradeMove from '/snippets/_upgrade-move.md'; ### Resources - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.1.latest/CHANGELOG.md) -- [CLI Installation guide](/docs/core/installation) +- [CLI Installation guide](/docs/core/installation-overview) - [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) ## What to know before upgrading @@ -43,7 +43,7 @@ Expected a schema version of "https://schemas.getdbt.com/dbt/manifest/v5.json" i [**Incremental models**](/docs/build/incremental-models) can now accept a list of 
multiple columns as their `unique_key`, for models that need a combination of columns to uniquely identify each row. This is supported by the most common data warehouses, for incremental strategies that make use of the `unique_key` config (`merge` and `delete+insert`). -[**Generic tests**](/reference/resource-properties/tests) can define custom names. This is useful to "prettify" the synthetic name that dbt applies automatically. It's needed to disambiguate the case when the same generic test is defined multiple times with different configurations. +[**Generic tests**](/reference/resource-properties/data-tests) can define custom names. This is useful to "prettify" the synthetic name that dbt applies automatically. It's needed to disambiguate the case when the same generic test is defined multiple times with different configurations. [**Sources**](/reference/source-properties) can define configuration inline with other `.yml` properties, just like other resource types. The only supported config is `enabled`; you can use this to dynamically enable/disable sources based on environment or package variables. diff --git a/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.0.md b/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.0.md index c0ba804cd78..0460186551d 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/08-upgrading-to-v1.0.md @@ -13,7 +13,7 @@ import UpgradeMove from '/snippets/_upgrade-move.md'; - [Discourse](https://discourse.getdbt.com/t/3180) - [Changelog](https://github.com/dbt-labs/dbt-core/blob/1.0.latest/CHANGELOG.md) -- [CLI Installation guide](/docs/core/installation) +- [CLI Installation guide](/docs/core/installation-overview) - [Cloud upgrade guide](/docs/dbt-versions/upgrade-core-in-cloud) ## What to know before upgrading @@ -34,7 +34,7 @@ dbt Core major version 1.0 includes a number of breaking changes! Wherever possi ### Tests -The two **test types** are now "singular" and "generic" (instead of "data" and "schema", respectively). The `test_type:` selection method accepts `test_type:singular` and `test_type:generic`. (It will also accept `test_type:schema` and `test_type:data` for backwards compatibility.) **Not backwards compatible:** The `--data` and `--schema` flags to dbt test are no longer supported, and tests no longer have the tags `'data'` and `'schema'` automatically applied. Updated docs: [tests](/docs/build/tests), [test selection](/reference/node-selection/test-selection-examples), [selection methods](/reference/node-selection/methods). +The two **test types** are now "singular" and "generic" (instead of "data" and "schema", respectively). The `test_type:` selection method accepts `test_type:singular` and `test_type:generic`. (It will also accept `test_type:schema` and `test_type:data` for backwards compatibility.) **Not backwards compatible:** The `--data` and `--schema` flags to dbt test are no longer supported, and tests no longer have the tags `'data'` and `'schema'` automatically applied. Updated docs: [tests](/docs/build/data-tests), [test selection](/reference/node-selection/test-selection-examples), [selection methods](/reference/node-selection/methods). The `greedy` flag/property has been renamed to **`indirect_selection`**, which is now eager by default. **Note:** This reverts test selection to its pre-v0.20 behavior by default. `dbt test -s my_model` _will_ select multi-parent tests, such as `relationships`, that depend on unselected resources. 
To achieve the behavior change in v0.20 + v0.21, set `--indirect-selection=cautious` on the CLI or `indirect_selection: cautious` in YAML selectors. Updated docs: [test selection examples](/reference/node-selection/test-selection-examples), [yaml selectors](/reference/node-selection/yaml-selectors). diff --git a/website/docs/docs/dbt-versions/core-upgrade/10-upgrading-to-v0.20.md b/website/docs/docs/dbt-versions/core-upgrade/10-upgrading-to-v0.20.md index 9ff5695d5dc..be6054087b3 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/10-upgrading-to-v0.20.md +++ b/website/docs/docs/dbt-versions/core-upgrade/10-upgrading-to-v0.20.md @@ -29,9 +29,9 @@ dbt Core v0.20 has reached the end of critical support. No new patch versions wi ### Tests -- [Building a dbt Project: tests](/docs/build/tests) -- [Test Configs](/reference/test-configs) -- [Test properties](/reference/resource-properties/tests) +- [Building a dbt Project: tests](/docs/build/data-tests) +- [Test Configs](/reference/data-test-configs) +- [Test properties](/reference/resource-properties/data-tests) - [Node Selection](/reference/node-selection/syntax) (with updated [test selection examples](/reference/node-selection/test-selection-examples)) - [Writing custom generic tests](/best-practices/writing-custom-generic-tests) diff --git a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md index 036a9a2aedf..48aa14a42e5 100644 --- a/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md +++ b/website/docs/docs/dbt-versions/core-upgrade/11-Older versions/upgrading-to-0-14-0.md @@ -189,7 +189,7 @@ models:
-**Configuring the `incremental_strategy for a single model:** +**Configuring the `incremental_strategy` for a single model:** diff --git a/website/docs/docs/dbt-versions/core-versions.md b/website/docs/docs/dbt-versions/core-versions.md index c497401a17d..3ebf988c136 100644 --- a/website/docs/docs/dbt-versions/core-versions.md +++ b/website/docs/docs/dbt-versions/core-versions.md @@ -18,7 +18,7 @@ dbt Labs provides different support levels for different versions, which may inc ### Further reading - To learn how you can use dbt Core versions in dbt Cloud, see [Choosing a dbt Core version](/docs/dbt-versions/upgrade-core-in-cloud). -- To learn about installing dbt Core, see "[How to install dbt Core](/docs/core/installation)." +- To learn about installing dbt Core, see "[How to install dbt Core](/docs/core/installation-overview)." - To restrict your project to only work with a range of dbt Core versions, or use the currently running dbt Core version, see [`require-dbt-version`](/reference/project-configs/require-dbt-version) and [`dbt_version`](/reference/dbt-jinja-functions/dbt_version). ## Version support prior to v1.0 @@ -29,7 +29,7 @@ All dbt Core versions released prior to 1.0 and their version-specific documenta All dbt Core minor versions that have reached end-of-life (EOL) will have no new patch releases. This means they will no longer receive any fixes, including for known bugs that have been identified. Fixes for those bugs will instead be made in newer minor versions that are still under active support. -We recommend upgrading to a newer version in [dbt Cloud](/docs/dbt-versions/upgrade-core-in-cloud) or [dbt Core](/docs/core/installation#upgrading-dbt-core) to continue receiving support. +We recommend upgrading to a newer version in [dbt Cloud](/docs/dbt-versions/upgrade-core-in-cloud) or [dbt Core](/docs/core/installation-overview#upgrading-dbt-core) to continue receiving support. All dbt Core v1.0 and later are available in dbt Cloud until further notice. In the future, we intend to align dbt Cloud availability with dbt Core ongoing support. You will receive plenty of advance notice before any changes take place. diff --git a/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/external-attributes.md b/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/external-attributes.md new file mode 100644 index 00000000000..25791b66fb1 --- /dev/null +++ b/website/docs/docs/dbt-versions/release-notes/74-Dec-2023/external-attributes.md @@ -0,0 +1,16 @@ +--- +title: "Update: Extended attributes is GA" +description: "December 2023: The extended attributes feature is now GA in dbt Cloud. It enables you to override dbt adapter YAML attributes at the environment level." +sidebar_label: "Update: Extended attributes is GA" +sidebar_position: 10 +tags: [Dec-2023] +date: 2023-12-06 +--- + +The extended attributes feature in dbt Cloud is now GA! It allows for an environment level override on any YAML attribute that a dbt adapter accepts in its `profiles.yml`. You can provide a YAML snippet to add or replace any [profile](/docs/core/connect-data-platform/profiles.yml) value. + +To learn more, refer to [Extended attributes](/docs/dbt-cloud-environments#extended-attributes). 
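For illustration, here's a minimal, hypothetical sketch of the kind of YAML snippet you might provide. The attribute names (`dbname`, `schema`, `threads`) are placeholder examples for a Postgres-style adapter; any attribute your adapter accepts in `profiles.yml` could be used instead:

```yaml
# Hypothetical example values -- use whichever attributes your adapter accepts in profiles.yml
dbname: jaffle_shop   # point this environment at a different database
schema: dbt_staging   # override the target schema for this environment
threads: 8            # override the number of threads
```

Attributes provided this way apply only to the environment where the snippet is defined.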
+ +The **Extended Attributes** text box is available from your environment's settings page: + + diff --git a/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/explorer-updates-rn.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/explorer-updates-rn.md new file mode 100644 index 00000000000..8b829311d81 --- /dev/null +++ b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/explorer-updates-rn.md @@ -0,0 +1,33 @@ +--- +title: "Enhancement: New features and UI changes to dbt Explorer" +description: "November 2023: New features and UI changes to dbt Explorer, including a new filter panel, improved lineage graph, and detailed resource information." +sidebar_label: "Enhancement: New features and UI changes to dbt Explorer" +sidebar_position: 08 +tags: [Nov-2023] +date: 2023-11-28 +--- + +dbt Labs is excited to announce the latest features and UI updates to dbt Explorer! + +For more details, refer to [Explore your dbt projects](/docs/collaborate/explore-projects). + +## The project's lineage graph + +- The search bar in the full lineage graph is now more prominent. +- It's easier to navigate across projects using the breadcrumbs. +- The new context menu (right click) makes it easier to focus on a node or to view its lineage. + + + +## Search improvements + +- When searching with keywords, a new side panel UI helps you filter search results by resource type, tag, column, and other key properties (instead of manually defining selectors). +- Search result logic is clearly explained. For instance, the results indicate whether a resource contains a column name (exact match only). + + + +## Resource details +- Model test result statuses are now displayed on the model details page. +- Column names can now be searched within the list. + + \ No newline at end of file diff --git a/website/docs/docs/dbt-versions/release-notes/02-Nov-2023/job-notifications-rn.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/job-notifications-rn.md similarity index 98% rename from website/docs/docs/dbt-versions/release-notes/02-Nov-2023/job-notifications-rn.md rename to website/docs/docs/dbt-versions/release-notes/75-Nov-2023/job-notifications-rn.md index 660129513d7..02fe2e037df 100644 --- a/website/docs/docs/dbt-versions/release-notes/02-Nov-2023/job-notifications-rn.md +++ b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/job-notifications-rn.md @@ -4,6 +4,7 @@ description: "November 2023: New quality-of-life improvements for setting up and sidebar_label: "Enhancement: Job notifications" sidebar_position: 10 tags: [Nov-2023] +date: 2023-11-28 --- There are new quality-of-life improvements in dbt Cloud for email and Slack notifications about your jobs: diff --git a/website/docs/docs/dbt-versions/release-notes/02-Nov-2023/microsoft-fabric-support-rn.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/microsoft-fabric-support-rn.md similarity index 65% rename from website/docs/docs/dbt-versions/release-notes/02-Nov-2023/microsoft-fabric-support-rn.md rename to website/docs/docs/dbt-versions/release-notes/75-Nov-2023/microsoft-fabric-support-rn.md index 13aefa80ffc..b416817f3a0 100644 --- a/website/docs/docs/dbt-versions/release-notes/02-Nov-2023/microsoft-fabric-support-rn.md +++ b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/microsoft-fabric-support-rn.md @@ -4,11 +4,14 @@ description: "November 2023: Public Preview now available for Microsoft Fabric i sidebar_label: "New: Public Preview of Microsoft Fabric support" sidebar_position: 09 tags: [Nov-2023] +date: 
2023-11-28 --- Public Preview is now available in dbt Cloud for Microsoft Fabric! -To learn more, check out the [Quickstart for dbt Cloud and Microsoft Fabric](/guides/microsoft-fabric?step=1). The guide walks you through: +To learn more, refer to [Connect Microsoft Fabric](/docs/cloud/connect-data-platform/connect-microsoft-fabric) and [Microsoft Fabric DWH configurations](/reference/resource-configs/fabric-configs). + +Also, check out the [Quickstart for dbt Cloud and Microsoft Fabric](/guides/microsoft-fabric?step=1). The guide walks you through: - Loading the Jaffle Shop sample data (provided by dbt Labs) into your Microsoft Fabric warehouse. - Connecting dbt Cloud to Microsoft Fabric. diff --git a/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md new file mode 100644 index 00000000000..7c35991e961 --- /dev/null +++ b/website/docs/docs/dbt-versions/release-notes/75-Nov-2023/repo-caching.md @@ -0,0 +1,14 @@ +--- +title: "New: Support for Git repository caching" +description: "November 2023: dbt Cloud can cache your project's code (as well as other dbt packages) to ensure runs can begin despite an upstream Git provider's outage." +sidebar_label: "New: Support for Git repository caching" +sidebar_position: 07 +tags: [Nov-2023] +date: 2023-11-29 +--- + +Now available for dbt Cloud Enterprise plans is a new option to enable Git repository caching for your job runs. When enabled, dbt Cloud caches your dbt project's Git repository and uses the cached copy instead if there's an outage with the Git provider. This feature improves the reliability and stability of your job runs. + +To learn more, refer to [Repo caching](/docs/deploy/deploy-environments#git-repository-caching). 
+ + \ No newline at end of file diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/api-v2v3-limit.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/api-v2v3-limit.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/api-v2v3-limit.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/api-v2v3-limit.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/cloud-cli-pp.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/cloud-cli-pp.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/cloud-cli-pp.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/cloud-cli-pp.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/custom-branch-fix-rn.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/custom-branch-fix-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/custom-branch-fix-rn.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/custom-branch-fix-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/dbt-deps-auto-install.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/dbt-deps-auto-install.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/dbt-deps-auto-install.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/dbt-deps-auto-install.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/explorer-public-preview-rn.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/explorer-public-preview-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/explorer-public-preview-rn.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/explorer-public-preview-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/native-retry-support-rn.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/native-retry-support-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/native-retry-support-rn.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/native-retry-support-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/product-docs-sept-rn.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/product-docs-sept-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/product-docs-sept-rn.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/product-docs-sept-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/03-Oct-2023/sl-ga.md b/website/docs/docs/dbt-versions/release-notes/76-Oct-2023/sl-ga.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/03-Oct-2023/sl-ga.md rename to website/docs/docs/dbt-versions/release-notes/76-Oct-2023/sl-ga.md diff --git a/website/docs/docs/dbt-versions/release-notes/04-Sept-2023/ci-updates-phase2-rn.md b/website/docs/docs/dbt-versions/release-notes/77-Sept-2023/ci-updates-phase2-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/04-Sept-2023/ci-updates-phase2-rn.md rename to website/docs/docs/dbt-versions/release-notes/77-Sept-2023/ci-updates-phase2-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/04-Sept-2023/ci-updates-phase3-rn.md 
b/website/docs/docs/dbt-versions/release-notes/77-Sept-2023/ci-updates-phase3-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/04-Sept-2023/ci-updates-phase3-rn.md rename to website/docs/docs/dbt-versions/release-notes/77-Sept-2023/ci-updates-phase3-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/04-Sept-2023/product-docs-summer-rn.md b/website/docs/docs/dbt-versions/release-notes/77-Sept-2023/product-docs-summer-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/04-Sept-2023/product-docs-summer-rn.md rename to website/docs/docs/dbt-versions/release-notes/77-Sept-2023/product-docs-summer-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/04-Sept-2023/removing-prerelease-versions.md b/website/docs/docs/dbt-versions/release-notes/77-Sept-2023/removing-prerelease-versions.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/04-Sept-2023/removing-prerelease-versions.md rename to website/docs/docs/dbt-versions/release-notes/77-Sept-2023/removing-prerelease-versions.md diff --git a/website/docs/docs/dbt-versions/release-notes/05-Aug-2023/deprecation-endpoints-discovery.md b/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/deprecation-endpoints-discovery.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/05-Aug-2023/deprecation-endpoints-discovery.md rename to website/docs/docs/dbt-versions/release-notes/78-Aug-2023/deprecation-endpoints-discovery.md diff --git a/website/docs/docs/dbt-versions/release-notes/05-Aug-2023/ide-v1.2.md b/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/ide-v1.2.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/05-Aug-2023/ide-v1.2.md rename to website/docs/docs/dbt-versions/release-notes/78-Aug-2023/ide-v1.2.md diff --git a/website/docs/docs/dbt-versions/release-notes/05-Aug-2023/sl-revamp-beta.md b/website/docs/docs/dbt-versions/release-notes/78-Aug-2023/sl-revamp-beta.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/05-Aug-2023/sl-revamp-beta.md rename to website/docs/docs/dbt-versions/release-notes/78-Aug-2023/sl-revamp-beta.md diff --git a/website/docs/docs/dbt-versions/release-notes/06-July-2023/faster-run.md b/website/docs/docs/dbt-versions/release-notes/79-July-2023/faster-run.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/06-July-2023/faster-run.md rename to website/docs/docs/dbt-versions/release-notes/79-July-2023/faster-run.md diff --git a/website/docs/docs/dbt-versions/release-notes/07-June-2023/admin-api-rn.md b/website/docs/docs/dbt-versions/release-notes/80-June-2023/admin-api-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/07-June-2023/admin-api-rn.md rename to website/docs/docs/dbt-versions/release-notes/80-June-2023/admin-api-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/07-June-2023/ci-updates-phase1-rn.md b/website/docs/docs/dbt-versions/release-notes/80-June-2023/ci-updates-phase1-rn.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/07-June-2023/ci-updates-phase1-rn.md rename to website/docs/docs/dbt-versions/release-notes/80-June-2023/ci-updates-phase1-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/07-June-2023/lint-format-rn.md b/website/docs/docs/dbt-versions/release-notes/80-June-2023/lint-format-rn.md similarity index 100% rename from 
website/docs/docs/dbt-versions/release-notes/07-June-2023/lint-format-rn.md rename to website/docs/docs/dbt-versions/release-notes/80-June-2023/lint-format-rn.md diff --git a/website/docs/docs/dbt-versions/release-notes/07-June-2023/product-docs-jun.md b/website/docs/docs/dbt-versions/release-notes/80-June-2023/product-docs-jun.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/07-June-2023/product-docs-jun.md rename to website/docs/docs/dbt-versions/release-notes/80-June-2023/product-docs-jun.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/discovery-api-public-preview.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/discovery-api-public-preview.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/discovery-api-public-preview.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/discovery-api-public-preview.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/may-ide-updates.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/may-ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/may-ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/may-ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/product-docs-may.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/product-docs-may.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/product-docs-may.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/product-docs-may.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/run-details-and-logs-improvements.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/run-details-and-logs-improvements.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/run-details-and-logs-improvements.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/run-details-and-logs-improvements.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/run-history-endpoint.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/run-history-endpoint.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/run-history-endpoint.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/run-history-endpoint.md diff --git a/website/docs/docs/dbt-versions/release-notes/08-May-2023/run-history-improvements.md b/website/docs/docs/dbt-versions/release-notes/81-May-2023/run-history-improvements.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/08-May-2023/run-history-improvements.md rename to website/docs/docs/dbt-versions/release-notes/81-May-2023/run-history-improvements.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/api-endpoint-restriction.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/api-endpoint-restriction.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/api-endpoint-restriction.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/api-endpoint-restriction.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/apr-ide-updates.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/apr-ide-updates.md similarity index 100% rename from 
website/docs/docs/dbt-versions/release-notes/09-April-2023/apr-ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/apr-ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/product-docs.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/product-docs.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/product-docs.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/product-docs.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/scheduler-optimized.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/scheduler-optimized.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/scheduler-optimized.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/scheduler-optimized.md diff --git a/website/docs/docs/dbt-versions/release-notes/09-April-2023/starburst-trino-ga.md b/website/docs/docs/dbt-versions/release-notes/82-April-2023/starburst-trino-ga.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/09-April-2023/starburst-trino-ga.md rename to website/docs/docs/dbt-versions/release-notes/82-April-2023/starburst-trino-ga.md diff --git a/website/docs/docs/dbt-versions/release-notes/10-Mar-2023/1.0-deprecation.md b/website/docs/docs/dbt-versions/release-notes/83-Mar-2023/1.0-deprecation.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/10-Mar-2023/1.0-deprecation.md rename to website/docs/docs/dbt-versions/release-notes/83-Mar-2023/1.0-deprecation.md diff --git a/website/docs/docs/dbt-versions/release-notes/10-Mar-2023/apiv2-limit.md b/website/docs/docs/dbt-versions/release-notes/83-Mar-2023/apiv2-limit.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/10-Mar-2023/apiv2-limit.md rename to website/docs/docs/dbt-versions/release-notes/83-Mar-2023/apiv2-limit.md diff --git a/website/docs/docs/dbt-versions/release-notes/10-Mar-2023/mar-ide-updates.md b/website/docs/docs/dbt-versions/release-notes/83-Mar-2023/mar-ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/10-Mar-2023/mar-ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/83-Mar-2023/mar-ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/10-Mar-2023/public-preview-trino-in-dbt-cloud.md b/website/docs/docs/dbt-versions/release-notes/83-Mar-2023/public-preview-trino-in-dbt-cloud.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/10-Mar-2023/public-preview-trino-in-dbt-cloud.md rename to website/docs/docs/dbt-versions/release-notes/83-Mar-2023/public-preview-trino-in-dbt-cloud.md diff --git a/website/docs/docs/dbt-versions/release-notes/11-Feb-2023/feb-ide-updates.md b/website/docs/docs/dbt-versions/release-notes/84-Feb-2023/feb-ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/11-Feb-2023/feb-ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/84-Feb-2023/feb-ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/11-Feb-2023/no-partial-parse-config.md b/website/docs/docs/dbt-versions/release-notes/84-Feb-2023/no-partial-parse-config.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/11-Feb-2023/no-partial-parse-config.md rename to 
website/docs/docs/dbt-versions/release-notes/84-Feb-2023/no-partial-parse-config.md diff --git a/website/docs/docs/dbt-versions/release-notes/12-Jan-2023/ide-updates.md b/website/docs/docs/dbt-versions/release-notes/85-Jan-2023/ide-updates.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/12-Jan-2023/ide-updates.md rename to website/docs/docs/dbt-versions/release-notes/85-Jan-2023/ide-updates.md diff --git a/website/docs/docs/dbt-versions/release-notes/23-Dec-2022/default-thread-value.md b/website/docs/docs/dbt-versions/release-notes/86-Dec-2022/default-thread-value.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/23-Dec-2022/default-thread-value.md rename to website/docs/docs/dbt-versions/release-notes/86-Dec-2022/default-thread-value.md diff --git a/website/docs/docs/dbt-versions/release-notes/23-Dec-2022/new-jobs-default-as-off.md b/website/docs/docs/dbt-versions/release-notes/86-Dec-2022/new-jobs-default-as-off.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/23-Dec-2022/new-jobs-default-as-off.md rename to website/docs/docs/dbt-versions/release-notes/86-Dec-2022/new-jobs-default-as-off.md diff --git a/website/docs/docs/dbt-versions/release-notes/23-Dec-2022/private-packages-clone-git-token.md b/website/docs/docs/dbt-versions/release-notes/86-Dec-2022/private-packages-clone-git-token.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/23-Dec-2022/private-packages-clone-git-token.md rename to website/docs/docs/dbt-versions/release-notes/86-Dec-2022/private-packages-clone-git-token.md diff --git a/website/docs/docs/dbt-versions/release-notes/24-Nov-2022/dbt-databricks-unity-catalog-support.md b/website/docs/docs/dbt-versions/release-notes/87-Nov-2022/dbt-databricks-unity-catalog-support.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/24-Nov-2022/dbt-databricks-unity-catalog-support.md rename to website/docs/docs/dbt-versions/release-notes/87-Nov-2022/dbt-databricks-unity-catalog-support.md diff --git a/website/docs/docs/dbt-versions/release-notes/24-Nov-2022/ide-features-ide-deprecation.md b/website/docs/docs/dbt-versions/release-notes/87-Nov-2022/ide-features-ide-deprecation.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/24-Nov-2022/ide-features-ide-deprecation.md rename to website/docs/docs/dbt-versions/release-notes/87-Nov-2022/ide-features-ide-deprecation.md diff --git a/website/docs/docs/dbt-versions/release-notes/25-Oct-2022/cloud-integration-azure.md b/website/docs/docs/dbt-versions/release-notes/88-Oct-2022/cloud-integration-azure.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/25-Oct-2022/cloud-integration-azure.md rename to website/docs/docs/dbt-versions/release-notes/88-Oct-2022/cloud-integration-azure.md diff --git a/website/docs/docs/dbt-versions/release-notes/25-Oct-2022/new-ide-launch.md b/website/docs/docs/dbt-versions/release-notes/88-Oct-2022/new-ide-launch.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/25-Oct-2022/new-ide-launch.md rename to website/docs/docs/dbt-versions/release-notes/88-Oct-2022/new-ide-launch.md diff --git a/website/docs/docs/dbt-versions/release-notes/26-Sept-2022/liststeps-endpoint-deprecation.md b/website/docs/docs/dbt-versions/release-notes/89-Sept-2022/liststeps-endpoint-deprecation.md similarity index 100% rename from 
website/docs/docs/dbt-versions/release-notes/26-Sept-2022/liststeps-endpoint-deprecation.md rename to website/docs/docs/dbt-versions/release-notes/89-Sept-2022/liststeps-endpoint-deprecation.md diff --git a/website/docs/docs/dbt-versions/release-notes/26-Sept-2022/metadata-api-data-retention-limits.md b/website/docs/docs/dbt-versions/release-notes/89-Sept-2022/metadata-api-data-retention-limits.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/26-Sept-2022/metadata-api-data-retention-limits.md rename to website/docs/docs/dbt-versions/release-notes/89-Sept-2022/metadata-api-data-retention-limits.md diff --git a/website/docs/docs/dbt-versions/release-notes/27-Aug-2022/ide-improvement-beta.md b/website/docs/docs/dbt-versions/release-notes/91-Aug-2022/ide-improvement-beta.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/27-Aug-2022/ide-improvement-beta.md rename to website/docs/docs/dbt-versions/release-notes/91-Aug-2022/ide-improvement-beta.md diff --git a/website/docs/docs/dbt-versions/release-notes/27-Aug-2022/support-redshift-ra3.md b/website/docs/docs/dbt-versions/release-notes/91-Aug-2022/support-redshift-ra3.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/27-Aug-2022/support-redshift-ra3.md rename to website/docs/docs/dbt-versions/release-notes/91-Aug-2022/support-redshift-ra3.md diff --git a/website/docs/docs/dbt-versions/release-notes/28-July-2022/render-lineage-feature.md b/website/docs/docs/dbt-versions/release-notes/92-July-2022/render-lineage-feature.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/28-July-2022/render-lineage-feature.md rename to website/docs/docs/dbt-versions/release-notes/92-July-2022/render-lineage-feature.md diff --git a/website/docs/docs/dbt-versions/release-notes/29-May-2022/gitlab-auth.md b/website/docs/docs/dbt-versions/release-notes/93-May-2022/gitlab-auth.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/29-May-2022/gitlab-auth.md rename to website/docs/docs/dbt-versions/release-notes/93-May-2022/gitlab-auth.md diff --git a/website/docs/docs/dbt-versions/release-notes/30-April-2022/audit-log.md b/website/docs/docs/dbt-versions/release-notes/94-April-2022/audit-log.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/30-April-2022/audit-log.md rename to website/docs/docs/dbt-versions/release-notes/94-April-2022/audit-log.md diff --git a/website/docs/docs/dbt-versions/release-notes/30-April-2022/credentials-saved.md b/website/docs/docs/dbt-versions/release-notes/94-April-2022/credentials-saved.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/30-April-2022/credentials-saved.md rename to website/docs/docs/dbt-versions/release-notes/94-April-2022/credentials-saved.md diff --git a/website/docs/docs/dbt-versions/release-notes/30-April-2022/email-verification.md b/website/docs/docs/dbt-versions/release-notes/94-April-2022/email-verification.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/30-April-2022/email-verification.md rename to website/docs/docs/dbt-versions/release-notes/94-April-2022/email-verification.md diff --git a/website/docs/docs/dbt-versions/release-notes/30-April-2022/scheduler-improvements.md b/website/docs/docs/dbt-versions/release-notes/94-April-2022/scheduler-improvements.md similarity index 100% rename from 
website/docs/docs/dbt-versions/release-notes/30-April-2022/scheduler-improvements.md rename to website/docs/docs/dbt-versions/release-notes/94-April-2022/scheduler-improvements.md diff --git a/website/docs/docs/dbt-versions/release-notes/31-March-2022/ide-timeout-message.md b/website/docs/docs/dbt-versions/release-notes/95-March-2022/ide-timeout-message.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/31-March-2022/ide-timeout-message.md rename to website/docs/docs/dbt-versions/release-notes/95-March-2022/ide-timeout-message.md diff --git a/website/docs/docs/dbt-versions/release-notes/31-March-2022/prep-and-waiting-time.md b/website/docs/docs/dbt-versions/release-notes/95-March-2022/prep-and-waiting-time.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/31-March-2022/prep-and-waiting-time.md rename to website/docs/docs/dbt-versions/release-notes/95-March-2022/prep-and-waiting-time.md diff --git a/website/docs/docs/dbt-versions/release-notes/32-February-2022/DAG-updates-more.md b/website/docs/docs/dbt-versions/release-notes/96-February-2022/DAG-updates-more.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/32-February-2022/DAG-updates-more.md rename to website/docs/docs/dbt-versions/release-notes/96-February-2022/DAG-updates-more.md diff --git a/website/docs/docs/dbt-versions/release-notes/32-February-2022/service-tokens-more.md b/website/docs/docs/dbt-versions/release-notes/96-February-2022/service-tokens-more.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/32-February-2022/service-tokens-more.md rename to website/docs/docs/dbt-versions/release-notes/96-February-2022/service-tokens-more.md diff --git a/website/docs/docs/dbt-versions/release-notes/33-January-2022/IDE-autocomplete-more.md b/website/docs/docs/dbt-versions/release-notes/97-January-2022/IDE-autocomplete-more.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/33-January-2022/IDE-autocomplete-more.md rename to website/docs/docs/dbt-versions/release-notes/97-January-2022/IDE-autocomplete-more.md diff --git a/website/docs/docs/dbt-versions/release-notes/33-January-2022/model-timing-more.md b/website/docs/docs/dbt-versions/release-notes/97-January-2022/model-timing-more.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/33-January-2022/model-timing-more.md rename to website/docs/docs/dbt-versions/release-notes/97-January-2022/model-timing-more.md diff --git a/website/docs/docs/dbt-versions/release-notes/34-dbt-cloud-changelog-2021.md b/website/docs/docs/dbt-versions/release-notes/98-dbt-cloud-changelog-2021.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/34-dbt-cloud-changelog-2021.md rename to website/docs/docs/dbt-versions/release-notes/98-dbt-cloud-changelog-2021.md diff --git a/website/docs/docs/dbt-versions/release-notes/35-dbt-cloud-changelog-2019-2020.md b/website/docs/docs/dbt-versions/release-notes/99-dbt-cloud-changelog-2019-2020.md similarity index 100% rename from website/docs/docs/dbt-versions/release-notes/35-dbt-cloud-changelog-2019-2020.md rename to website/docs/docs/dbt-versions/release-notes/99-dbt-cloud-changelog-2019-2020.md diff --git a/website/docs/docs/deploy/airgapped.md b/website/docs/docs/deploy/airgapped.md deleted file mode 100644 index a08370fef8c..00000000000 --- a/website/docs/docs/deploy/airgapped.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -id: airgapped-deployment -title: Airgapped 
(Beta) ---- - -:::info Airgapped - -This section provides a high level summary of the airgapped deployment type for dbt Cloud. This deployment type is currently in Beta and may not be supported in the long term. -If you’re interested in learning more about airgapped deployments for dbt Cloud, contact us at sales@getdbt.com. - -::: - -The airgapped deployment is similar to an on-premise installation in that the dbt Cloud instance will live in your network, and is subject to your security procedures, technologies, and controls. An airgapped install allows you to run dbt Cloud without any external network dependencies and is ideal for organizations that have strict rules around installing software from the cloud. - -The installation process for airgapped is a bit different. Instead of downloading and installing images during installation time, you will download all of the necessary configuration and Docker images before starting the installation process, and manage uploading these images yourself. This means that you can remove all external network dependencies and run this application in a very secure environment. - -For more information about the dbt Cloud Airgapped deployment see the below. - -- [Customer Managed Network Architecture](/docs/cloud/about-cloud/architecture) diff --git a/website/docs/docs/deploy/deploy-jobs.md b/website/docs/docs/deploy/deploy-jobs.md index e43020bf66e..cee6e245359 100644 --- a/website/docs/docs/deploy/deploy-jobs.md +++ b/website/docs/docs/deploy/deploy-jobs.md @@ -84,15 +84,22 @@ To fully customize the scheduling of your job, choose the **Custom cron schedule Use tools such as [crontab.guru](https://crontab.guru/) to generate the correct cron syntax. This tool allows you to input cron snippets and returns their plain English translations. -Refer to the following example snippets: +Here are examples of cron job schedules. The dbt Cloud job scheduler supports using `L` to schedule jobs on the last day of the month: -- `0 * * * *`: Every hour, at minute 0 -- `*/5 * * * *`: Every 5 minutes -- `5 4 * * *`: At exactly 4:05 AM UTC -- `30 */4 * * *`: At minute 30 past every 4th hour (e.g. 4:30AM, 8:30AM, 12:30PM, etc., all UTC) -- `0 0 */2 * *`: At midnight UTC every other day +- `0 * * * *`: Every hour, at minute 0. +- `*/5 * * * *`: Every 5 minutes. +- `5 4 * * *`: At exactly 4:05 AM UTC. +- `30 */4 * * *`: At minute 30 past every 4th hour (such as 4:30 AM, 8:30 AM, 12:30 PM, and so on, all UTC). +- `0 0 */2 * *`: At 12:00 AM (midnight) UTC every other day. - `0 0 * * 1`: At midnight UTC every Monday. +- `0 0 L * *`: At 12:00 AM (midnight), on the last day of the month. +- `0 0 L 1,2,3,4,5,6,8,9,10,11,12 *`: At 12:00 AM, on the last day of the month, only in January, February, March, April, May, June, August, September, October, November, and December. +- `0 0 L 7 *`: At 12:00 AM, on the last day of the month, only in July. +- `0 0 L * FRI,SAT`: At 12:00 AM, on the last day of the month, and on Friday and Saturday. +- `0 12 L * *`: At 12:00 PM (afternoon), on the last day of the month. +- `0 7 L * 5`: At 07:00 AM, on the last day of the month, and on Friday. +- `30 14 L * *`: At 02:30 PM, on the last day of the month. 
## Related docs diff --git a/website/docs/docs/deploy/job-commands.md b/website/docs/docs/deploy/job-commands.md index db284c78a05..26fe1931db6 100644 --- a/website/docs/docs/deploy/job-commands.md +++ b/website/docs/docs/deploy/job-commands.md @@ -41,7 +41,7 @@ For every job, you have the option to select the [Generate docs on run](/docs/co ### Command list -You can add or remove as many [dbt commands](/reference/dbt-commands) as necessary for every job. However, you need to have at least one dbt command. There are few commands listed as "dbt Core" in the [dbt Command reference doc](/reference/dbt-commands) page. This means they are meant for use in [dbt Core](/docs/core/about-dbt-core) only and are not available in dbt Cloud. +You can add or remove as many dbt commands as necessary for every job. However, you need to have at least one dbt command. There are a few commands listed as "dbt Cloud CLI" or "dbt Core" in the [dbt Command reference](/reference/dbt-commands) page. This means they are meant for use in dbt Core or the dbt Cloud CLI, and not in the dbt Cloud IDE. :::tip Using selectors diff --git a/website/docs/docs/deploy/retry-jobs.md b/website/docs/docs/deploy/retry-jobs.md index ea616121f38..beefb35379e 100644 --- a/website/docs/docs/deploy/retry-jobs.md +++ b/website/docs/docs/deploy/retry-jobs.md @@ -26,7 +26,7 @@ If your dbt job run completed with a status of **Error**, you can rerun it from ## Related content -- [Retry a failed run for a job](/dbt-cloud/api-v2#/operations/Retry%20a%20failed%20run%20for%20a%20job) API endpoint +- [Retry a failed run for a job](/dbt-cloud/api-v2#/operations/Retry%20Failed%20Job) API endpoint - [Run visibility](/docs/deploy/run-visibility) - [Jobs](/docs/deploy/jobs) -- [Job commands](/docs/deploy/job-commands) \ No newline at end of file +- [Job commands](/docs/deploy/job-commands) diff --git a/website/docs/docs/introduction.md b/website/docs/docs/introduction.md index c575a9ae657..08564aeb2f0 100644 --- a/website/docs/docs/introduction.md +++ b/website/docs/docs/introduction.md @@ -56,7 +56,7 @@ As a dbt user, your main focus will be on writing models (i.e. select queries) t | Use a code compiler | SQL files can contain Jinja, a lightweight templating language. Using Jinja in SQL provides a way to use control structures in your queries. For example, `if` statements and `for` loops. It also enables repeated SQL to be shared through `macros`. Read more about [Macros](/docs/build/jinja-macros).| | Determine the order of model execution | Often, when transforming data, it makes sense to do so in a staged approach. dbt provides a mechanism to implement transformations in stages through the [ref function](/reference/dbt-jinja-functions/ref). Rather than selecting from existing tables and views in your warehouse, you can select from another model.| | Document your dbt project | dbt provides a mechanism to write, version-control, and share documentation for your dbt models. You can write descriptions (in plain text or markdown) for each model and field. In dbt Cloud, you can auto-generate the documentation when your dbt project runs. Read more about the [Documentation](/docs/collaborate/documentation).| -| Test your models | Tests provide a way to improve the integrity of the SQL in each model by making assertions about the results generated by a model. 
Read more about writing tests for your models in [Testing](/docs/build/data-tests)| | Manage packages | dbt ships with a package manager, which allows analysts to use and publish both public and private repositories of dbt code which can then be referenced by others. Read more about [Package Management](/docs/build/packages). | | Load seed files| Often in analytics, raw values need to be mapped to a more readable value (for example, converting a country-code to a country name) or enriched with static or infrequently changing data. These data sources, known as seed files, can be saved as a CSV file in your `project` and loaded into your data warehouse using the `seed` command. Read more about [Seeds](/docs/build/seeds).| | Snapshot data | Often, records in a data source are mutable, in that they change over time. This can be difficult to handle in analytics if you want to reconstruct historic values. dbt provides a mechanism to snapshot raw data for a point in time, through use of [snapshots](/docs/build/snapshots).| diff --git a/website/docs/docs/running-a-dbt-project/run-your-dbt-projects.md b/website/docs/docs/running-a-dbt-project/run-your-dbt-projects.md index b3b6ffb3e45..f1e631f0d78 100644 --- a/website/docs/docs/running-a-dbt-project/run-your-dbt-projects.md +++ b/website/docs/docs/running-a-dbt-project/run-your-dbt-projects.md @@ -11,9 +11,9 @@ You can run your dbt projects with [dbt Cloud](/docs/cloud/about-cloud/dbt-cloud - Share your [dbt project's documentation](/docs/collaborate/build-and-view-your-docs) with your team. - Integrates with the dbt Cloud IDE, allowing you to run development tasks and environment in the dbt Cloud UI for a seamless experience. - The dbt Cloud CLI to develop and run dbt commands against your dbt Cloud development environment from your local command line. - - For more details, refer to [Develop in the Cloud](/docs/cloud/about-cloud-develop). + - For more details, refer to [Develop dbt](/docs/cloud/about-develop-dbt). -- **dbt Core**: An open source project where you can develop from the [command line](/docs/core/about-dbt-core). +- **dbt Core**: An open source project where you can develop from the [command line](/docs/core/installation-overview). The dbt Cloud CLI and dbt Core are both command line tools that enable you to run dbt commands. The key distinction is the dbt Cloud CLI is tailored for dbt Cloud's infrastructure and integrates with all its [features](/docs/cloud/about-cloud/dbt-cloud-features). diff --git a/website/docs/docs/use-dbt-semantic-layer/avail-sl-integrations.md b/website/docs/docs/use-dbt-semantic-layer/avail-sl-integrations.md index 4f4621fa860..be02fedb230 100644 --- a/website/docs/docs/use-dbt-semantic-layer/avail-sl-integrations.md +++ b/website/docs/docs/use-dbt-semantic-layer/avail-sl-integrations.md @@ -33,6 +33,7 @@ import AvailIntegrations from '/snippets/_sl-partner-links.md'; - {frontMatter.meta.api_name} to learn how to integrate and query your metrics in downstream tools. - [dbt Semantic Layer API query syntax](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata) - [Hex dbt Semantic Layer cells](https://learn.hex.tech/docs/logic-cell-types/transform-cells/dbt-metrics-cells) to set up SQL cells in Hex. 
+- [Resolve 'Failed ALPN'](/faqs/Troubleshooting/sl-alpn-error) error when connecting to the dbt Semantic Layer. diff --git a/website/docs/docs/use-dbt-semantic-layer/gsheets.md b/website/docs/docs/use-dbt-semantic-layer/gsheets.md index cb9f4014803..d7525fa7b26 100644 --- a/website/docs/docs/use-dbt-semantic-layer/gsheets.md +++ b/website/docs/docs/use-dbt-semantic-layer/gsheets.md @@ -17,6 +17,8 @@ The dbt Semantic Layer offers a seamless integration with Google Sheets through - You have a Google account with access to Google Sheets. - You can install Google add-ons. - You have a dbt Cloud Environment ID and a [service token](/docs/dbt-cloud-apis/service-tokens) to authenticate with from a dbt Cloud account. +- You must have a dbt Cloud Team or Enterprise [account](https://www.getdbt.com/pricing). Suitable for both Multi-tenant and Single-tenant deployment. + - Single-tenant accounts should contact their account representative for necessary setup and enablement. ## Installing the add-on @@ -54,10 +56,9 @@ To use the filter functionality, choose the [dimension](/docs/build/dimensions) y - For categorical dimensions, type in the dimension value you want to filter by (no quotes needed) and press enter. - Continue adding additional filters as needed with AND and OR. If it's a time dimension, choose the operator and select from the calendar. - - **Limited Use Policy Disclosure** The dbt Semantic Layer for Sheet's use and transfer to any other app of information received from Google APIs will adhere to [Google API Services User Data Policy](https://developers.google.com/terms/api-services-user-data-policy), including the Limited Use requirements. - +## FAQs + diff --git a/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md b/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md index 84e3227b4e7..62437f4ecd6 100644 --- a/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md +++ b/website/docs/docs/use-dbt-semantic-layer/quickstart-sl.md @@ -26,7 +26,7 @@ MetricFlow, a powerful component of the dbt Semantic Layer, simplifies the creat Use this guide to fully experience the power of the universal dbt Semantic Layer. Here are the following steps you'll take: - [Create a semantic model](#create-a-semantic-model) in dbt Cloud using MetricFlow -- [Define metrics](#define-metrics) in dbt Cloud using MetricFlow +- [Define metrics](#define-metrics) in dbt using MetricFlow - [Test and query metrics](#test-and-query-metrics) with MetricFlow - [Run a production job](#run-a-production-job) in dbt Cloud - [Set up dbt Semantic Layer](#setup) in dbt Cloud @@ -88,20 +88,9 @@ import SlSetUp from '/snippets/_new-sl-setup.md'; If you're encountering some issues when defining your metrics or setting up the dbt Semantic Layer, check out a list of answers to some of the questions or problems you may be experiencing. -
- How do I migrate from the legacy Semantic Layer to the new one? -
-
If you're using the legacy Semantic Layer, we highly recommend you upgrade your dbt version to dbt v1.6 or higher to use the new dbt Semantic Layer. Refer to the dedicated migration guide for more info.
-
-
-
-How are you storing my data? -User data passes through the Semantic Layer on its way back from the warehouse. dbt Labs ensures security by authenticating through the customer's data warehouse. Currently, we don't cache data for the long term, but it might temporarily stay in the system for up to 10 minutes, usually less. In the future, we'll introduce a caching feature that allows us to cache data on our infrastructure for up to 24 hours. -
-
- Is the dbt Semantic Layer open source? - The dbt Semantic Layer is proprietary; however, some components of the dbt Semantic Layer are open source, such as dbt-core and MetricFlow.

dbt Cloud Developer or dbt Core users can define metrics in their project, including a local dbt Core project, using the dbt Cloud IDE, dbt Cloud CLI, or dbt Core CLI. However, to experience the universal dbt Semantic Layer and access those metrics using the API or downstream tools, users must be on a dbt Cloud Team or Enterprise plan.

Refer to Billing for more information. -
+import SlFaqs from '/snippets/_sl-faqs.md'; + + ## Next steps diff --git a/website/docs/docs/use-dbt-semantic-layer/sl-architecture.md b/website/docs/docs/use-dbt-semantic-layer/sl-architecture.md index 75a853fcbe8..9aea2ab42b0 100644 --- a/website/docs/docs/use-dbt-semantic-layer/sl-architecture.md +++ b/website/docs/docs/use-dbt-semantic-layer/sl-architecture.md @@ -14,43 +14,38 @@ The dbt Semantic Layer allows you to define metrics and use various interfaces t -## dbt Semantic Layer components +## Components The dbt Semantic Layer includes the following components: | Components | Information | dbt Core users | Developer plans | Team plans | Enterprise plans | License | -| --- | --- | :---: | :---: | :---: | --- | +| --- | --- | :---: | :---: | :---: | :---: | | **[MetricFlow](/docs/build/about-metricflow)** | MetricFlow in dbt allows users to centrally define their semantic models and metrics with YAML specifications. | ✅ | ✅ | ✅ | ✅ | BSL package (code is source available) | -| **MetricFlow Server**| A proprietary server that takes metric requests and generates optimized SQL for the specific data platform. | ❌ | ❌ | ✅ | ✅ | Proprietary, Cloud (Team & Enterprise)| -| **Semantic Layer Gateway** | A service that passes queries to the MetricFlow server and executes the SQL generated by MetricFlow against the data platform|

❌ | ❌ |✅ | ✅ | Proprietary, Cloud (Team & Enterprise) | -| **Semantic Layer APIs** | The interfaces allow users to submit metric queries using GraphQL and JDBC APIs. They also serve as the foundation for building first-class integrations with various tools. | ❌ | ❌ | ✅ | ✅ | Proprietary, Cloud (Team & Enterprise)| +| **dbt Semantic interfaces**| A configuration spec for defining metrics, dimensions, how they link to each other, and how to query them. The [dbt-semantic-interfaces](https://github.com/dbt-labs/dbt-semantic-interfaces) is available under Apache 2.0. | ❌ | ❌ | ✅ | ✅ | Proprietary, Cloud (Team & Enterprise)| +| **Service layer** | Coordinates query requests and dispatching the relevant metric query to the target query engine. This is provided through dbt Cloud and is available to all users on dbt version 1.6 or later. The service layer includes a Gateway service for executing SQL against the data platform. | ❌ | ❌ | ✅ | ✅ | Proprietary, Cloud (Team & Enterprise) | +| **[Semantic Layer APIs](/docs/dbt-cloud-apis/sl-api-overview)** | The interfaces allow users to submit metric queries using GraphQL and JDBC APIs. They also serve as the foundation for building first-class integrations with various tools. | ❌ | ❌ | ✅ | ✅ | Proprietary, Cloud (Team & Enterprise)| -## Related questions +## Feature comparison -
- How do I migrate from the legacy Semantic Layer to the new one? -
-
If you're using the legacy Semantic Layer, we highly recommend you upgrade your dbt version to dbt v1.6 or higher to use the new dbt Semantic Layer. Refer to the dedicated migration guide for more info.
-
-
- -
-How are you storing my data? -User data passes through the Semantic Layer on its way back from the warehouse. dbt Labs ensures security by authenticating through the customer's data warehouse. Currently, we don't cache data for the long term, but it might temporarily stay in the system for up to 10 minutes, usually less. In the future, we'll introduce a caching feature that allows us to cache data on our infrastructure for up to 24 hours. -
-
- Is the dbt Semantic Layer open source? -The dbt Semantic Layer is proprietary; however, some components of the dbt Semantic Layer are open source, such as dbt-core and MetricFlow.

dbt Cloud Developer or dbt Core users can define metrics in their project, including a local dbt Core project, using the dbt Cloud IDE, dbt Cloud CLI, or dbt Core CLI. However, to experience the universal dbt Semantic Layer and access those metrics using the API or downstream tools, users must be on a dbt Cloud Team or Enterprise plan.

Refer to Billing for more information. -
-
- Is there a dbt Semantic Layer discussion hub? -
-
Yes absolutely! Join the dbt Slack community and #dbt-cloud-semantic-layer slack channel for all things related to the dbt Semantic Layer. -
-
-
+The following table compares the features available in dbt Cloud and source available in dbt Core: + +| Feature | MetricFlow Source available | dbt Semantic Layer with dbt Cloud | +| ----- | :------: | :------: | +| Define metrics and semantic models in dbt using the MetricFlow spec | ✅ | ✅ | +| Generate SQL from a set of config files | ✅ | ✅ | +| Query metrics and dimensions through the command line interface (CLI) | ✅ | ✅ | +| Query dimension, entity, and metric metadata through the CLI | ✅ | ✅ | +| Query metrics and dimensions through semantic APIs (ADBC, GQL) | ❌ | ✅ | +| Connect to downstream integrations (Tableau, Hex, Mode, Google Sheets, and so on.) | ❌ | ✅ | +| Create and run Exports to save metrics queries as tables in your data platform. | ❌ | Coming soon | + +## FAQs + +import SlFaqs from '/snippets/_sl-faqs.md'; + + diff --git a/website/docs/docs/use-dbt-semantic-layer/tableau.md b/website/docs/docs/use-dbt-semantic-layer/tableau.md index 1d283023dda..0f12a75f468 100644 --- a/website/docs/docs/use-dbt-semantic-layer/tableau.md +++ b/website/docs/docs/use-dbt-semantic-layer/tableau.md @@ -21,7 +21,8 @@ This integration provides a live connection to the dbt Semantic Layer through Ta - Note that Tableau Online does not currently support custom connectors natively. If you use Tableau Online, you will only be able to access the connector in Tableau Desktop. - Log in to Tableau Desktop (with Online or Server credentials) or a license to Tableau Server - You need your dbt Cloud host, [Environment ID](/docs/use-dbt-semantic-layer/setup-sl#set-up-dbt-semantic-layer) and [service token](/docs/dbt-cloud-apis/service-tokens) to log in. This account should be set up with the dbt Semantic Layer. -- You must have a dbt Cloud Team or Enterprise [account](https://www.getdbt.com/pricing) and multi-tenant [deployment](/docs/cloud/about-cloud/regions-ip-addresses). (Single-Tenant coming soon) +- You must have a dbt Cloud Team or Enterprise [account](https://www.getdbt.com/pricing). Suitable for both Multi-tenant and Single-tenant deployment. + - Single-tenant accounts should contact their account representative for necessary setup and enablement. ## Installing the Connector @@ -36,7 +37,7 @@ This integration provides a live connection to the dbt Semantic Layer through Ta 2. Install the [JDBC driver](/docs/dbt-cloud-apis/sl-jdbc) to the folder based on your operating system: - Windows: `C:\Program Files\Tableau\Drivers` - - Mac: `~/Library/Tableau/Drivers` + - Mac: `~/Library/Tableau/Drivers` or `/Library/JDBC` or `~/Library/JDBC` - Linux: ` /opt/tableau/tableau_driver/jdbc` 3. Open Tableau Desktop or Tableau Server and find the **dbt Semantic Layer by dbt Labs** connector on the left-hand side. You may need to restart these applications for the connector to be available. 4. Connect with your Host, Environment ID, and Service Token information dbt Cloud provides during [Semantic Layer configuration](/docs/use-dbt-semantic-layer/setup-sl#:~:text=After%20saving%20it%2C%20you%27ll%20be%20provided%20with%20the%20connection%20information%20that%20allows%20you%20to%20connect%20to%20downstream%20tools). 
@@ -80,3 +81,5 @@ The following Tableau features aren't supported at this time, however, the dbt S - Filtering on a Date Part time dimension for a Cumulative metric type - Changing your date dimension to use "Week Number" +## FAQs + diff --git a/website/docs/faqs/API/_category_.yaml b/website/docs/faqs/API/_category_.yaml new file mode 100644 index 00000000000..fac67328a7a --- /dev/null +++ b/website/docs/faqs/API/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'API' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: API FAQs +customProps: + description: Frequently asked questions about dbt APIs diff --git a/website/docs/faqs/Accounts/_category_.yaml b/website/docs/faqs/Accounts/_category_.yaml new file mode 100644 index 00000000000..b8ebee5fe2a --- /dev/null +++ b/website/docs/faqs/Accounts/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Accounts' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Account FAQs +customProps: + description: Frequently asked questions about your account in dbt diff --git a/website/docs/faqs/Core/_category_.yaml b/website/docs/faqs/Core/_category_.yaml new file mode 100644 index 00000000000..bac4ad4a655 --- /dev/null +++ b/website/docs/faqs/Core/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'dbt Core' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: 'dbt Core FAQs' +customProps: + description: Frequently asked questions about dbt Core diff --git a/website/docs/faqs/Core/install-pip-os-prereqs.md b/website/docs/faqs/Core/install-pip-os-prereqs.md index 41a4e4ec60e..1eb6205512a 100644 --- a/website/docs/faqs/Core/install-pip-os-prereqs.md +++ b/website/docs/faqs/Core/install-pip-os-prereqs.md @@ -57,7 +57,7 @@ pip install cryptography~=3.4 ``` -#### Windows +### Windows Windows requires Python and git to successfully install and run dbt Core. 
diff --git a/website/docs/faqs/Docs/_category_.yaml b/website/docs/faqs/Docs/_category_.yaml new file mode 100644 index 00000000000..8c7925dcc15 --- /dev/null +++ b/website/docs/faqs/Docs/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'dbt Docs' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: dbt Docs FAQs +customProps: + description: Frequently asked questions about dbt Docs diff --git a/website/docs/faqs/Environments/_category_.yaml b/website/docs/faqs/Environments/_category_.yaml new file mode 100644 index 00000000000..8d252d2c5d3 --- /dev/null +++ b/website/docs/faqs/Environments/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Environments' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: 'Environments FAQs' +customProps: + description: Frequently asked questions about Environments in dbt diff --git a/website/docs/faqs/Git/_category_.yaml b/website/docs/faqs/Git/_category_.yaml new file mode 100644 index 00000000000..0d9e5ee6e91 --- /dev/null +++ b/website/docs/faqs/Git/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Git' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Git FAQs +customProps: + description: Frequently asked questions about Git and dbt diff --git a/website/docs/faqs/Jinja/_category_.yaml b/website/docs/faqs/Jinja/_category_.yaml new file mode 100644 index 00000000000..809ca0bb8eb --- /dev/null +++ b/website/docs/faqs/Jinja/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Jinja' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Jinja FAQs +customProps: + description: Frequently asked questions about Jinja and dbt diff --git a/website/docs/faqs/Models/_category_.yaml b/website/docs/faqs/Models/_category_.yaml new file mode 100644 index 00000000000..7398058db2b --- /dev/null +++ b/website/docs/faqs/Models/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Models' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Models FAQs +customProps: + description: Frequently asked questions about Models in dbt diff --git a/website/docs/faqs/Models/specifying-column-types.md b/website/docs/faqs/Models/specifying-column-types.md index 8e8379c4ec1..904c616d89a 100644 --- a/website/docs/faqs/Models/specifying-column-types.md +++ b/website/docs/faqs/Models/specifying-column-types.md @@ -38,6 +38,6 @@ So long as your model queries return the correct column type, the table you crea To define additional column options: -* Rather than enforcing uniqueness and not-null constraints on your column, use dbt's [testing](/docs/build/tests) functionality to check that your assertions about your model hold true. +* Rather than enforcing uniqueness and not-null constraints on your column, use dbt's [data testing](/docs/build/data-tests) functionality to check that your assertions about your model hold true. 
* Rather than creating default values for a column, use SQL to express defaults (e.g. `coalesce(updated_at, current_timestamp()) as updated_at`) * In edge-cases where you _do_ need to alter a column (e.g. column-level encoding on Redshift), consider implementing this via a [post-hook](/reference/resource-configs/pre-hook-post-hook). diff --git a/website/docs/faqs/Project/_category_.yaml b/website/docs/faqs/Project/_category_.yaml new file mode 100644 index 00000000000..d2f695773f8 --- /dev/null +++ b/website/docs/faqs/Project/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Projects' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Project FAQs +customProps: + description: Frequently asked questions about projects in dbt diff --git a/website/docs/faqs/Project/properties-not-in-config.md b/website/docs/faqs/Project/properties-not-in-config.md index d1aea32b687..76de58404a9 100644 --- a/website/docs/faqs/Project/properties-not-in-config.md +++ b/website/docs/faqs/Project/properties-not-in-config.md @@ -16,7 +16,7 @@ Certain properties are special, because: These properties are: - [`description`](/reference/resource-properties/description) -- [`tests`](/reference/resource-properties/tests) +- [`tests`](/reference/resource-properties/data-tests) - [`docs`](/reference/resource-configs/docs) - `columns` - [`quote`](/reference/resource-properties/quote) diff --git a/website/docs/faqs/Runs/_category_.yaml b/website/docs/faqs/Runs/_category_.yaml new file mode 100644 index 00000000000..5867a0d3710 --- /dev/null +++ b/website/docs/faqs/Runs/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Runs' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Runs FAQs +customProps: + description: Frequently asked questions about runs in dbt diff --git a/website/docs/faqs/Seeds/_category_.yaml b/website/docs/faqs/Seeds/_category_.yaml new file mode 100644 index 00000000000..fd2f7d3d925 --- /dev/null +++ b/website/docs/faqs/Seeds/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Seeds' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Seeds FAQs +customProps: + description: Frequently asked questions about seeds in dbt diff --git a/website/docs/faqs/Snapshots/_category_.yaml b/website/docs/faqs/Snapshots/_category_.yaml new file mode 100644 index 00000000000..743b508fefe --- /dev/null +++ b/website/docs/faqs/Snapshots/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Snapshots' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Snapshots FAQs +customProps: + description: Frequently asked questions about snapshots in dbt diff --git a/website/docs/faqs/Tests/_category_.yaml b/website/docs/faqs/Tests/_category_.yaml new file mode 100644 index 00000000000..754b8ec267b --- /dev/null +++ b/website/docs/faqs/Tests/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Tests' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default 
+className: red +link: + type: generated-index + title: Tests FAQs +customProps: + description: Frequently asked questions about tests in dbt diff --git a/website/docs/faqs/Tests/available-tests.md b/website/docs/faqs/Tests/available-tests.md index f08e6841bd0..2b5fd3ff55c 100644 --- a/website/docs/faqs/Tests/available-tests.md +++ b/website/docs/faqs/Tests/available-tests.md @@ -12,6 +12,6 @@ Out of the box, dbt ships with the following tests: * `accepted_values` * `relationships` (i.e. referential integrity) -You can also write your own [custom schema tests](/docs/build/tests). +You can also write your own [custom schema data tests](/docs/build/data-tests). Some additional custom schema tests have been open-sourced in the [dbt-utils package](https://github.com/dbt-labs/dbt-utils/tree/0.2.4/#schema-tests), check out the docs on [packages](/docs/build/packages) to learn how to make these tests available in your project. diff --git a/website/docs/faqs/Tests/custom-test-thresholds.md b/website/docs/faqs/Tests/custom-test-thresholds.md index 34d2eec7494..400a5b4e28b 100644 --- a/website/docs/faqs/Tests/custom-test-thresholds.md +++ b/website/docs/faqs/Tests/custom-test-thresholds.md @@ -10,5 +10,5 @@ As of `v0.20.0`, you can use the `error_if` and `warn_if` configs to set custom For dbt `v0.19.0` and earlier, you could try these possible solutions: -* Setting the [severity](/reference/resource-properties/tests#severity) to `warn`, or: +* Setting the [severity](/reference/resource-properties/data-tests#severity) to `warn`, or: * Writing a [custom generic test](/best-practices/writing-custom-generic-tests) that accepts a threshold argument ([example](https://discourse.getdbt.com/t/creating-an-error-threshold-for-schema-tests/966)) diff --git a/website/docs/faqs/Tests/testing-sources.md b/website/docs/faqs/Tests/testing-sources.md index 8eb769026e5..5e68b88dcbf 100644 --- a/website/docs/faqs/Tests/testing-sources.md +++ b/website/docs/faqs/Tests/testing-sources.md @@ -9,7 +9,7 @@ id: testing-sources To run tests on all sources, use the following command: ```shell -$ dbt test --select source:* + dbt test --select "source:*" ``` (You can also use the `-s` shorthand here instead of `--select`) diff --git a/website/docs/faqs/Troubleshooting/_category_.yaml b/website/docs/faqs/Troubleshooting/_category_.yaml new file mode 100644 index 00000000000..14c4b49044d --- /dev/null +++ b/website/docs/faqs/Troubleshooting/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Troubleshooting' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Troubleshooting FAQs +customProps: + description: Frequently asked questions about troubleshooting dbt diff --git a/website/docs/faqs/Troubleshooting/sl-alpn-error.md b/website/docs/faqs/Troubleshooting/sl-alpn-error.md new file mode 100644 index 00000000000..f588d690fac --- /dev/null +++ b/website/docs/faqs/Troubleshooting/sl-alpn-error.md @@ -0,0 +1,14 @@ +--- +title: I'm receiving a `Failed ALPN` error when trying to connect to the dbt Semantic Layer. +description: "To resolve the 'Failed ALPN' error in the dbt Semantic Layer, create an SSL interception exception for the dbt Cloud domain."
+sidebar_label: 'Use SSL exception to resolve `Failed ALPN` error' +--- + +If you're receiving a `Failed ALPN` error when trying to connect to the dbt Semantic Layer with one of the various [data integration tools](/docs/use-dbt-semantic-layer/avail-sl-integrations) (such as Tableau, DBeaver, Datagrip, ADBC, or JDBC), it typically happens when connecting from a computer behind a corporate VPN or proxy (like Zscaler or Check Point). + +The root cause is typically the proxy interfering with the TLS handshake because the dbt Semantic Layer uses gRPC/HTTP2 for connectivity. To resolve this: + +- If your proxy supports gRPC/HTTP2 but isn't configured to allow ALPN, adjust its settings to allow ALPN, or create an exception for the dbt Cloud domain. +- If your proxy does not support gRPC/HTTP2, add an SSL interception exception for the dbt Cloud domain in your proxy settings. + +This should help you successfully establish the connection without the `Failed ALPN` error. diff --git a/website/docs/faqs/Warehouse/_category_.yaml b/website/docs/faqs/Warehouse/_category_.yaml new file mode 100644 index 00000000000..4de6e2e7d5e --- /dev/null +++ b/website/docs/faqs/Warehouse/_category_.yaml @@ -0,0 +1,10 @@ +# position: 2.5 # float position is supported +label: 'Warehouse' +collapsible: true # make the category collapsible +collapsed: true # keep the category collapsed by default +className: red +link: + type: generated-index + title: Warehouse FAQs +customProps: + description: Frequently asked questions about warehouses and dbt diff --git a/website/docs/guides/airflow-and-dbt-cloud.md b/website/docs/guides/airflow-and-dbt-cloud.md index a3ff59af14e..e7f754ef02d 100644 --- a/website/docs/guides/airflow-and-dbt-cloud.md +++ b/website/docs/guides/airflow-and-dbt-cloud.md @@ -11,50 +11,29 @@ recently_updated: true ## Introduction -In some cases, [Airflow](https://airflow.apache.org/) may be the preferred orchestrator for your organization over working fully within dbt Cloud. There are a few reasons your team might be considering using Airflow to orchestrate your dbt jobs: - -- Your team is already using Airflow to orchestrate other processes -- Your team needs to ensure that a [dbt job](https://docs.getdbt.com/docs/dbt-cloud/cloud-overview#schedule-and-run-dbt-jobs-in-production) kicks off before or after another process outside of dbt Cloud -- Your team needs flexibility to manage more complex scheduling, such as kicking off one dbt job only after another has completed -- Your team wants to own their own orchestration solution -- You need code to work right now without starting from scratch - -### Prerequisites - -- [dbt Cloud Teams or Enterprise account](https://www.getdbt.com/pricing/) (with [admin access](https://docs.getdbt.com/docs/cloud/manage-access/enterprise-permissions)) in order to create a service token. Permissions for service tokens can be found [here](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens#permissions-for-service-account-tokens). -- A [free Docker account](https://hub.docker.com/signup) in order to sign in to Docker Desktop, which will be installed in the initial setup. -- A local digital scratchpad for temporarily copy-pasting API keys and URLs - -### Airflow + dbt Core - -There are [so many great examples](https://gitlab.com/gitlab-data/analytics/-/blob/master/dags/transformation/dbt_snowplow_backfill.py) from GitLab through their open source data engineering work.
This is especially appropriate if you are well-versed in Kubernetes, CI/CD, and docker task management when building your airflow pipelines. If this is you and your team, you’re in good hands reading through more details [here](https://about.gitlab.com/handbook/business-technology/data-team/platform/infrastructure/#airflow) and [here](https://about.gitlab.com/handbook/business-technology/data-team/platform/dbt-guide/). - -### Airflow + dbt Cloud API w/Custom Scripts - -This has served as a bridge until the fabled Astronomer + dbt Labs-built dbt Cloud provider became generally available [here](https://registry.astronomer.io/providers/dbt%20Cloud/versions/latest). - -There are many different permutations of this over time: - -- [Custom Python Scripts](https://github.com/sungchun12/airflow-dbt-cloud/blob/main/archive/dbt_cloud_example.py): This is an airflow DAG based on [custom python API utilities](https://github.com/sungchun12/airflow-dbt-cloud/blob/main/archive/dbt_cloud_utils.py) -- [Make API requests directly through the BashOperator based on the docs](https://docs.getdbt.com/dbt-cloud/api-v2-legacy#operation/triggerRun): You can make cURL requests to invoke dbt Cloud to do what you want -- For more options, check out the [official dbt Docs](/docs/deploy/deployments#airflow) on the various ways teams are running dbt in airflow - -These solutions are great, but can be difficult to trust as your team grows and management for things like: testing, job definitions, secrets, and pipelines increase past your team’s capacity. Roles become blurry (or were never clearly defined at the start!). Both data and analytics engineers start digging through custom logging within each other’s workflows to make heads or tails of where and what the issue really is. Not to mention that when the issue is found, it can be even harder to decide on the best path forward for safely implementing fixes. This complex workflow and unclear delineation on process management results in a lot of misunderstandings and wasted time just trying to get the process to work smoothly! +Many organizations already use [Airflow](https://airflow.apache.org/) to orchestrate their data workflows. dbt Cloud works great with Airflow, letting you execute your dbt code in dbt Cloud while keeping orchestration duties with Airflow. This ensures your project's metadata (important for tools like dbt Explorer) is available and up-to-date, while still enabling you to use Airflow for general tasks such as: +- Scheduling other processes outside of dbt runs +- Ensuring that a [dbt job](/docs/deploy/job-scheduler) kicks off before or after another process outside of dbt Cloud +- Triggering a dbt job only after another has completed In this guide, you'll learn how to: -1. Creating a working local Airflow environment -2. Invoking a dbt Cloud job with Airflow (with proof!) -3. Reusing tested and trusted Airflow code for your specific use cases +1. Create a working local Airflow environment +2. Invoke a dbt Cloud job with Airflow +3. 
Reuse tested and trusted Airflow code for your specific use cases You’ll also gain a better understanding of how this will: - Reduce the cognitive load when building and maintaining pipelines - Avoid dependency hell (think: `pip install` conflicts) -- Implement better recoveries from failures -- Define clearer workflows so that data and analytics engineers work better, together ♥️ +- Define clearer handoff of workflows between data engineers and analytics engineers + +## Prerequisites +- [dbt Cloud Teams or Enterprise account](https://www.getdbt.com/pricing/) (with [admin access](/docs/cloud/manage-access/enterprise-permissions)) in order to create a service token. Permissions for service tokens can be found [here](/docs/dbt-cloud-apis/service-tokens#permissions-for-service-account-tokens). +- A [free Docker account](https://hub.docker.com/signup) in order to sign in to Docker Desktop, which will be installed in the initial setup. +- A local digital scratchpad for temporarily copy-pasting API keys and URLs 🙌 Let’s get started! 🙌 @@ -72,7 +51,7 @@ brew install astro ## Install and start Docker Desktop -Docker allows us to spin up an environment with all the apps and dependencies we need for the example. +Docker allows us to spin up an environment with all the apps and dependencies we need for this guide. Follow the instructions [here](https://docs.docker.com/desktop/) to install Docker desktop for your own operating system. Once Docker is installed, ensure you have it up and running for the next steps. @@ -80,7 +59,7 @@ Follow the instructions [here](https://docs.docker.com/desktop/) to install Dock ## Clone the airflow-dbt-cloud repository -Open your terminal and clone the [airflow-dbt-cloud repository](https://github.com/sungchun12/airflow-dbt-cloud.git). This contains example Airflow DAGs that you’ll use to orchestrate your dbt Cloud job. Once cloned, navigate into the `airflow-dbt-cloud` project. +Open your terminal and clone the [airflow-dbt-cloud repository](https://github.com/sungchun12/airflow-dbt-cloud). This contains example Airflow DAGs that you’ll use to orchestrate your dbt Cloud job. Once cloned, navigate into the `airflow-dbt-cloud` project. ```bash git clone https://github.com/sungchun12/airflow-dbt-cloud.git @@ -91,12 +70,9 @@ cd airflow-dbt-cloud ## Start the Docker container -You can initialize an Astronomer project in an empty local directory using a Docker container, and then run your project locally using the `start` command. - -1. Run the following commands to initialize your project and start your local Airflow deployment: +1. From the `airflow-dbt-cloud` directory you cloned and opened in the prior step, run the following command to start your local Airflow deployment: ```bash - astro dev init astro dev start ``` @@ -110,10 +86,10 @@ You can initialize an Astronomer project in an empty local directory using a Doc Airflow Webserver: http://localhost:8080 Postgres Database: localhost:5432/postgres The default Airflow UI credentials are: admin:admin - The default Postrgres DB credentials are: postgres:postgres + The default Postgres DB credentials are: postgres:postgres ``` -2. Open the Airflow interface. Launch your web browser and navigate to the address for the **Airflow Webserver** from your output in Step 1. +2. Open the Airflow interface. Launch your web browser and navigate to the address for the **Airflow Webserver** from your output above (for us, `http://localhost:8080`). This will take you to your local instance of Airflow. 
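As a quick recap, the environment setup covered so far condenses to a handful of commands. This sketch assumes macOS with Homebrew and that Docker Desktop is already running; adjust the Astro CLI install step for your own operating system:

```bash
# Install the Astro CLI (Homebrew shown; see the Astronomer docs for other platforms)
brew install astro

# Clone the example project and start the local Airflow deployment
git clone https://github.com/sungchun12/airflow-dbt-cloud.git
cd airflow-dbt-cloud
astro dev start

# Then open http://localhost:8080 and log in with the default admin / admin credentials
```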
You’ll need to log in with the **default credentials**: @@ -126,15 +102,15 @@ You can initialize an Astronomer project in an empty local directory using a Doc ## Create a dbt Cloud service token -Create a service token from within dbt Cloud using the instructions [found here](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens). Ensure that you save a copy of the token, as you won’t be able to access this later. In this example we use `Account Admin`, but you can also use `Job Admin` instead for token permissions. +[Create a service token](/docs/dbt-cloud-apis/service-tokens) with `Job Admin` privileges from within dbt Cloud. Ensure that you save a copy of the token, as you won’t be able to access this later. ## Create a dbt Cloud job -In your dbt Cloud account create a job, paying special attention to the information in the bullets below. Additional information for creating a dbt Cloud job can be found [here](/guides/bigquery). +[Create a job in your dbt Cloud account](/docs/deploy/deploy-jobs#create-and-schedule-jobs), paying special attention to the information in the bullets below. -- Configure the job with the commands that you want to include when this job kicks off, as Airflow will be referring to the job’s configurations for this rather than being explicitly coded in the Airflow DAG. This job will run a set of commands rather than a single command. +- Configure the job with the full commands that you want to include when this job kicks off. This sample code has Airflow triggering the dbt Cloud job and all of its commands, instead of explicitly identifying individual models to run from inside of Airflow. - Ensure that the schedule is turned **off** since we’ll be using Airflow to kick things off. - Once you hit `save` on the job, make sure you copy the URL and save it for referencing later. The url will look similar to this: @@ -144,77 +120,59 @@ https://cloud.getdbt.com/#/accounts/{account_id}/projects/{project_id}/jobs/{job -## Add your dbt Cloud API token as a secure connection - - +## Connect dbt Cloud to Airflow -Now you have all the working pieces to get up and running with Airflow + dbt Cloud. Let’s dive into make this all work together. We will **set up a connection** and **run a DAG in Airflow** that kicks off a dbt Cloud job. +Now you have all the working pieces to get up and running with Airflow + dbt Cloud. It's time to **set up a connection** and **run a DAG in Airflow** that kicks off a dbt Cloud job. -1. Navigate to Admin and click on **Connections** +1. From the Airflow interface, navigate to Admin and click on **Connections** ![Airflow connections menu](/img/guides/orchestration/airflow-and-dbt-cloud/airflow-connections-menu.png) 2. Click on the `+` sign to add a new connection, then click on the drop down to search for the dbt Cloud Connection Type - ![Create connection](/img/guides/orchestration/airflow-and-dbt-cloud/create-connection.png) - ![Connection type](/img/guides/orchestration/airflow-and-dbt-cloud/connection-type.png) 3. Add in your connection details and your default dbt Cloud account id. 
This is found in your dbt Cloud URL after the accounts route section (`/accounts/{YOUR_ACCOUNT_ID}`), for example the account with id 16173 would see this in their URL: `https://cloud.getdbt.com/#/accounts/16173/projects/36467/jobs/65767/` -![https://lh3.googleusercontent.com/sRxe5xbv_LYhIKblc7eiY7AmByr1OibOac2_fIe54rpU3TBGwjMpdi_j0EPEFzM1_gNQXry7Jsm8aVw9wQBSNs1I6Cyzpvijaj0VGwSnmVf3OEV8Hv5EPOQHrwQgK2RhNBdyBxN2](https://lh3.googleusercontent.com/sRxe5xbv_LYhIKblc7eiY7AmByr1OibOac2_fIe54rpU3TBGwjMpdi_j0EPEFzM1_gNQXry7Jsm8aVw9wQBSNs1I6Cyzpvijaj0VGwSnmVf3OEV8Hv5EPOQHrwQgK2RhNBdyBxN2) - -## Add your `job_id` and `account_id` config details to the python file + ![Connection type](/img/guides/orchestration/airflow-and-dbt-cloud/connection-type-configured.png) - Add your `job_id` and `account_id` config details to the python file: [dbt_cloud_provider_eltml.py](https://github.com/sungchun12/airflow-dbt-cloud/blob/main/dags/dbt_cloud_provider_eltml.py). +## Update the placeholders in the sample code -1. You’ll find these details within the dbt Cloud job URL, see the comments in the code snippet below for an example. + Add your `account_id` and `job_id` to the python file [dbt_cloud_provider_eltml.py](https://github.com/sungchun12/airflow-dbt-cloud/blob/main/dags/dbt_cloud_provider_eltml.py). - ```python - # dbt Cloud Job URL: https://cloud.getdbt.com/#/accounts/16173/projects/36467/jobs/65767/ - # account_id: 16173 - #job_id: 65767 +Both IDs are included inside of the dbt Cloud job URL as shown in the following snippets: - # line 28 - default_args={"dbt_cloud_conn_id": "dbt_cloud", "account_id": 16173}, +```python +# For the dbt Cloud Job URL https://cloud.getdbt.com/#/accounts/16173/projects/36467/jobs/65767/ +# The account_id is 16173 - trigger_dbt_cloud_job_run = DbtCloudRunJobOperator( - task_id="trigger_dbt_cloud_job_run", - job_id=65767, # line 39 - check_interval=10, - timeout=300, - ) - ``` - -2. Turn on the DAG and verify the job succeeded after running. Note: screenshots taken from different job runs, but the user experience is consistent. - - ![https://lh6.googleusercontent.com/p8AqQRy0UGVLjDGPmcuGYmQ_BRodyL0Zis-eQgSmp69EHbKW51o4S-bCl1fXHlOmwpYEBxD0A-O1Q1hwt-VDVMO1wWH-AIeaoelBx06JXRJ0m1OcHaPpFKH0xDiduIhNlQhhbLiy](https://lh6.googleusercontent.com/p8AqQRy0UGVLjDGPmcuGYmQ_BRodyL0Zis-eQgSmp69EHbKW51o4S-bCl1fXHlOmwpYEBxD0A-O1Q1hwt-VDVMO1wWH-AIeaoelBx06JXRJ0m1OcHaPpFKH0xDiduIhNlQhhbLiy) - - ![Airflow DAG](/img/guides/orchestration/airflow-and-dbt-cloud/airflow-dag.png) - - ![Task run instance](/img/guides/orchestration/airflow-and-dbt-cloud/task-run-instance.png) - - ![https://lh6.googleusercontent.com/S9QdGhLAdioZ3x634CChugsJRiSVtTTd5CTXbRL8ADA6nSbAlNn4zV0jb3aC946c8SGi9FRTfyTFXqjcM-EBrJNK5hQ0HHAsR5Fj7NbdGoUfBI7xFmgeoPqnoYpjyZzRZlXkjtxS](https://lh6.googleusercontent.com/S9QdGhLAdioZ3x634CChugsJRiSVtTTd5CTXbRL8ADA6nSbAlNn4zV0jb3aC946c8SGi9FRTfyTFXqjcM-EBrJNK5hQ0HHAsR5Fj7NbdGoUfBI7xFmgeoPqnoYpjyZzRZlXkjtxS) - -## How do I rerun the dbt Cloud job and downstream tasks in my pipeline? - -If you have worked with dbt Cloud before, you have likely encountered cases where a job fails. In those cases, you have likely logged into dbt Cloud, investigated the error, and then manually restarted the job. - -This section of the guide will show you how to restart the job directly from Airflow. This will specifically run *just* the `trigger_dbt_cloud_job_run` and downstream tasks of the Airflow DAG and not the entire DAG. 
If only the transformation step fails, you don’t need to re-run the extract and load processes. Let’s jump into how to do that in Airflow. - -1. Click on the task +# Update line 28 +default_args={"dbt_cloud_conn_id": "dbt_cloud", "account_id": 16173}, +``` - ![Task DAG view](/img/guides/orchestration/airflow-and-dbt-cloud/task-dag-view.png) +```python +# For the dbt Cloud Job URL https://cloud.getdbt.com/#/accounts/16173/projects/36467/jobs/65767/ +# The job_id is 65767 + +# Update line 39 +trigger_dbt_cloud_job_run = DbtCloudRunJobOperator( + task_id="trigger_dbt_cloud_job_run", + job_id=65767, + check_interval=10, + timeout=300, + ) +``` -2. Clear the task instance + - ![Clear task instance](/img/guides/orchestration/airflow-and-dbt-cloud/clear-task-instance.png) +## Run the Airflow DAG - ![Approve clearing](/img/guides/orchestration/airflow-and-dbt-cloud/approve-clearing.png) +Turn on the DAG and trigger it to run. Verify the job succeeded after running. -3. Watch it rerun in real time +![Airflow DAG](/img/guides/orchestration/airflow-and-dbt-cloud/airflow-dag.png) - ![Re-run](/img/guides/orchestration/airflow-and-dbt-cloud/re-run.png) +Click Monitor Job Run to open the run details in dbt Cloud. +![Task run instance](/img/guides/orchestration/airflow-and-dbt-cloud/task-run-instance.png) ## Cleaning up @@ -224,9 +182,9 @@ At the end of this guide, make sure you shut down your docker container. When y $ astrocloud dev stop [+] Running 3/3 - ⠿ Container airflow-dbt-cloud_e3fe3c-webserver-1 Stopped 7.5s - ⠿ Container airflow-dbt-cloud_e3fe3c-scheduler-1 Stopped 3.3s - ⠿ Container airflow-dbt-cloud_e3fe3c-postgres-1 Stopped 0.3s + ⠿ Container airflow-dbt-cloud_e3fe3c-webserver-1 Stopped 7.5s + ⠿ Container airflow-dbt-cloud_e3fe3c-scheduler-1 Stopped 3.3s + ⠿ Container airflow-dbt-cloud_e3fe3c-postgres-1 Stopped 0.3s ``` To verify that the deployment has stopped, use the following command: @@ -244,37 +202,29 @@ airflow-dbt-cloud_e3fe3c-scheduler-1 exited airflow-dbt-cloud_e3fe3c-postgres-1 exited ``` - + ## Frequently asked questions ### How can we run specific subsections of the dbt DAG in Airflow? -Because of the way we configured the dbt Cloud job to run in Airflow, you can leave this job to your analytics engineers to define in the job configurations from dbt Cloud. If, for example, we need to run hourly-tagged models every hour and daily-tagged models daily, we can create jobs like `Hourly Run` or `Daily Run` and utilize the commands `dbt run -s tag:hourly` and `dbt run -s tag:daily` within each, respectively. We only need to grab our dbt Cloud `account` and `job id`, configure it in an Airflow DAG with the code provided, and then we can be on your way. See more node selection options: [here](/reference/node-selection/syntax) - -### How can I re-run models from the point of failure? - -You may want to parse the dbt DAG in Airflow to get the benefit of re-running from the point of failure. However, when you have hundreds of models in your DAG expanded out, it becomes useless for diagnosis and rerunning due to the overhead that comes along with creating an expansive Airflow DAG. +Because the Airflow DAG references dbt Cloud jobs, your analytics engineers can take responsibility for configuring the jobs in dbt Cloud. -You can’t re-run from failure natively in dbt Cloud today (feature coming!), but you can use a custom rerun parser. 
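To show how the `account_id` and `job_id` snippets above fit together, here is a minimal sketch of a complete DAG file using the dbt Cloud provider. The DAG function name, schedule, and IDs are placeholders rather than values taken from the example repository:

```python
from pendulum import datetime

from airflow.decorators import dag
from airflow.providers.dbt.cloud.operators.dbt import DbtCloudRunJobOperator


@dag(
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,  # trigger manually, or from an upstream process
    catchup=False,
    default_args={"dbt_cloud_conn_id": "dbt_cloud", "account_id": 16173},
)
def trigger_dbt_cloud_job():
    # Kick off the dbt Cloud job and poll until it finishes
    DbtCloudRunJobOperator(
        task_id="trigger_dbt_cloud_job_run",
        job_id=65767,
        check_interval=10,
        timeout=300,
    )


trigger_dbt_cloud_job()
```

With this setup, the operator polls the dbt Cloud job until it completes, so any downstream Airflow tasks only start once the dbt run has finished.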
+For example, to run some models hourly and others daily, there will be jobs like `Hourly Run` or `Daily Run` using the commands `dbt run --select tag:hourly` and `dbt run --select tag:daily` respectively. Once configured in dbt Cloud, these can be added as steps in an Airflow DAG as shown in this guide. Refer to our full [node selection syntax docs here](/reference/node-selection/syntax). -Using a simple python script coupled with the dbt Cloud provider, you can: - -- Avoid managing artifacts in a separate storage bucket(dbt Cloud does this for you) -- Avoid building your own parsing logic -- Get clear logs on what models you're rerunning in dbt Cloud (without hard coding step override commands) - -Watch the video below to see how it works! +### How can I re-run models from the point of failure? - +You can trigger re-run from point of failure with the `rerun` API endpoint. See the docs on [retrying jobs](/docs/deploy/retry-jobs) for more information. ### Should Airflow run one big dbt job or many dbt jobs? -Overall we recommend being as purposeful and minimalistic as you can. This is because dbt manages all of the dependencies between models and the orchestration of running those dependencies in order, which in turn has benefits in terms of warehouse processing efforts. +dbt jobs are most effective when a build command contains as many models at once as is practical. This is because dbt manages the dependencies between models and coordinates running them in order, which ensures that your jobs can run in a highly parallelized fashion. It also streamlines the debugging process when a model fails and enables re-run from point of failure. + +As an explicit example, it's not recommended to have a dbt job for every single node in your DAG. Try combining your steps according to desired run frequency, or grouping by department (finance, marketing, customer success...) instead. ### We want to kick off our dbt jobs after our ingestion tool (such as Fivetran) / data pipelines are done loading data. Any best practices around that? -Our friends at Astronomer answer this question with this example: [here](https://registry.astronomer.io/dags/fivetran-dbt-cloud-census) +Astronomer's DAG registry has a sample workflow combining Fivetran, dbt Cloud and Census [here](https://registry.astronomer.io/dags/fivetran-dbt_cloud-census/versions/3.0.0). ### How do you set up a CI/CD workflow with Airflow? @@ -285,12 +235,12 @@ Check out these two resources for accomplishing your own CI/CD pipeline: ### Can dbt dynamically create tasks in the DAG like Airflow can? -We prefer to keep models bundled vs. unbundled. You can go this route, but if you have hundreds of dbt models, it’s more effective to let the dbt Cloud job handle the models and dependencies. Bundling provides the solution to clear observability when things go wrong - we've seen more success in having the ability to clearly see issues in a bundled dbt Cloud job than combing through the nodes of an expansive Airflow DAG. If you still have a use case for this level of control though, our friends at Astronomer answer this question [here](https://www.astronomer.io/blog/airflow-dbt-1/)! +As discussed above, we prefer to keep jobs bundled together and containing as many nodes as are necessary. If you must run nodes one at a time for some reason, then review [this article](https://www.astronomer.io/blog/airflow-dbt-1/) for some pointers. -### Can you trigger notifications if a dbt job fails with Airflow? 
Is there any way to access the status of the dbt Job to do that? +### Can you trigger notifications if a dbt job fails with Airflow? -Yes, either through [Airflow's email/slack](https://www.astronomer.io/guides/error-notifications-in-airflow/) functionality by itself or combined with [dbt Cloud's notifications](/docs/deploy/job-notifications), which support email and slack notifications. +Yes, either through [Airflow's email/slack](https://www.astronomer.io/guides/error-notifications-in-airflow/) functionality, or [dbt Cloud's notifications](/docs/deploy/job-notifications), which support email and Slack notifications. You could also create a [webhook](/docs/deploy/webhooks). -### Are there decision criteria for how to best work with dbt Cloud and airflow? +### How should I plan my dbt Cloud + Airflow implementation? -Check out this deep dive into planning your dbt Cloud + Airflow implementation [here](https://www.youtube.com/watch?v=n7IIThR8hGk)! +Check out [this recording](https://www.youtube.com/watch?v=n7IIThR8hGk) of a dbt meetup for some tips. diff --git a/website/docs/guides/bigquery-qs.md b/website/docs/guides/bigquery-qs.md index c1f632f0621..9cf2447fa52 100644 --- a/website/docs/guides/bigquery-qs.md +++ b/website/docs/guides/bigquery-qs.md @@ -78,7 +78,7 @@ In order to let dbt connect to your warehouse, you'll need to generate a keyfile - Click **Next** to create a new service account. 2. Create a service account for your new project from the [Service accounts page](https://console.cloud.google.com/projectselector2/iam-admin/serviceaccounts?supportedpurview=project). For more information, refer to [Create a service account](https://developers.google.com/workspace/guides/create-credentials#create_a_service_account) in the Google Cloud docs. As an example for this guide, you can: - Type `dbt-user` as the **Service account name** - - From the **Select a role** dropdown, choose **BigQuery Admin** and click **Continue** + - From the **Select a role** dropdown, choose **BigQuery Job User** and **BigQuery Data Editor** roles and click **Continue** - Leave the **Grant users access to this service account** fields blank - Click **Done** 3. Create a service account key for your new project from the [Service accounts page](https://console.cloud.google.com/iam-admin/serviceaccounts?walkthrough_id=iam--create-service-account-keys&start_index=1#step_index=1). For more information, refer to [Create a service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating) in the Google Cloud docs. When downloading the JSON file, make sure to use a filename you can easily remember. For example, `dbt-user-creds.json`. For security reasons, dbt Labs recommends that you protect this JSON file like you would your identity credentials; for example, don't check the JSON file into your version control software. diff --git a/website/docs/guides/building-packages.md b/website/docs/guides/building-packages.md index 641a1c6af6d..55f0c2ed912 100644 --- a/website/docs/guides/building-packages.md +++ b/website/docs/guides/building-packages.md @@ -104,7 +104,7 @@ dbt makes it possible for users of your package to override your model -This completes the integration setup and data is ready for business consumption. \ No newline at end of file +This completes the integration setup and data is ready for business consumption. 
diff --git a/website/docs/guides/how-to-use-databricks-workflows-to-run-dbt-cloud-jobs.md b/website/docs/guides/how-to-use-databricks-workflows-to-run-dbt-cloud-jobs.md index 30221332355..cb3a6804247 100644 --- a/website/docs/guides/how-to-use-databricks-workflows-to-run-dbt-cloud-jobs.md +++ b/website/docs/guides/how-to-use-databricks-workflows-to-run-dbt-cloud-jobs.md @@ -29,14 +29,6 @@ Using Databricks workflows to call the dbt Cloud job API can be useful for sever - [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html) - **Note**: You only need to set up your authentication. Once you have set up your Host and Token and are able to run `databricks workspace ls /Users/`, you can proceed with the rest of this guide. -## Configure Databricks workflows for dbt Cloud jobs - -To use Databricks workflows for running dbt Cloud jobs, you need to perform the following steps: - -- [Set up a Databricks secret scope](#set-up-a-databricks-secret-scope) -- [Create a Databricks Python notebook](#create-a-databricks-python-notebook) -- [Configure the workflows to run the dbt Cloud jobs](#configure-the-workflows-to-run-the-dbt-cloud-jobs) - ## Set up a Databricks secret scope 1. Retrieve **[User API Token](https://docs.getdbt.com/docs/dbt-cloud-apis/user-tokens#user-api-tokens) **or **[Service Account Token](https://docs.getdbt.com/docs/dbt-cloud-apis/service-tokens#generating-service-account-tokens) **from dbt Cloud diff --git a/website/docs/guides/manual-install-qs.md b/website/docs/guides/manual-install-qs.md index 61796fe008a..e9c1af259ac 100644 --- a/website/docs/guides/manual-install-qs.md +++ b/website/docs/guides/manual-install-qs.md @@ -15,8 +15,8 @@ When you use dbt Core to work with dbt, you will be editing files locally using ### Prerequisites * To use dbt Core, it's important that you know some basics of the Terminal. In particular, you should understand `cd`, `ls` and `pwd` to navigate through the directory structure of your computer easily. -* Install dbt Core using the [installation instructions](/docs/core/installation) for your operating system. -* Complete [Setting up (in BigQuery)](/guides/bigquery?step=2) and [Loading data (BigQuery)](/guides/bigquery?step=3). +* Install dbt Core using the [installation instructions](/docs/core/installation-overview) for your operating system. +* Complete appropriate Setting up and Loading data steps in the Quickstart for dbt Cloud series. For example, for BigQuery, complete [Setting up (in BigQuery)](/guides/bigquery?step=2) and [Loading data (BigQuery)](/guides/bigquery?step=3). * [Create a GitHub account](https://github.com/join) if you don't already have one. ### Create a starter project diff --git a/website/docs/guides/microsoft-fabric-qs.md b/website/docs/guides/microsoft-fabric-qs.md index c7c53a2aac7..1d1e016a6f1 100644 --- a/website/docs/guides/microsoft-fabric-qs.md +++ b/website/docs/guides/microsoft-fabric-qs.md @@ -9,7 +9,7 @@ recently_updated: true --- ## Introduction -In this quickstart guide, you'll learn how to use dbt Cloud with Microsoft Fabric. It will show you how to: +In this quickstart guide, you'll learn how to use dbt Cloud with [Microsoft Fabric](https://www.microsoft.com/en-us/microsoft-fabric). It will show you how to: - Load the Jaffle Shop sample data (provided by dbt Labs) into your Microsoft Fabric warehouse. - Connect dbt Cloud to Microsoft Fabric. @@ -27,7 +27,7 @@ A public preview of Microsoft Fabric in dbt Cloud is now available! 
### Prerequisites - You have a [dbt Cloud](https://www.getdbt.com/signup/) account. - You have started the Microsoft Fabric (Preview) trial. For details, refer to [Microsoft Fabric (Preview) trial](https://learn.microsoft.com/en-us/fabric/get-started/fabric-trial) in the Microsoft docs. -- As a Microsoft admin, you’ve enabled service principal authentication. For details, refer to [Enable service principal authentication](https://learn.microsoft.com/en-us/fabric/admin/metadata-scanning-enable-read-only-apis) in the Microsoft docs. dbt Cloud needs these authentication credentials to connect to Microsoft Fabric. +- As a Microsoft admin, you’ve enabled service principal authentication. You must add the service principal to the Microsoft Fabric workspace with either a Member (recommended) or Admin permission set. For details, refer to [Enable service principal authentication](https://learn.microsoft.com/en-us/fabric/admin/metadata-scanning-enable-read-only-apis) in the Microsoft docs. dbt Cloud needs these authentication credentials to connect to Microsoft Fabric. ### Related content - [dbt Courses](https://courses.getdbt.com/collections) @@ -54,8 +54,8 @@ A public preview of Microsoft Fabric in dbt Cloud is now available! CREATE TABLE dbo.customers ( [ID] [int], - [FIRST_NAME] [varchar] (8000), - [LAST_NAME] [varchar] (8000) + \[FIRST_NAME] [varchar](8000), + \[LAST_NAME] [varchar](8000) ); COPY INTO [dbo].[customers] @@ -72,7 +72,7 @@ A public preview of Microsoft Fabric in dbt Cloud is now available! [USER_ID] [int], -- [ORDER_DATE] [int], [ORDER_DATE] [date], - [STATUS] [varchar] (8000) + \[STATUS] [varchar](8000) ); COPY INTO [dbo].[orders] @@ -87,8 +87,8 @@ A public preview of Microsoft Fabric in dbt Cloud is now available! ( [ID] [int], [ORDERID] [int], - [PAYMENTMETHOD] [varchar] (8000), - [STATUS] [varchar] (8000), + \[PAYMENTMETHOD] [varchar](8000), + \[STATUS] [varchar](8000), [AMOUNT] [int], [CREATED] [date] ); @@ -108,6 +108,9 @@ A public preview of Microsoft Fabric in dbt Cloud is now available! 2. Enter a project name and click **Continue**. 3. Choose **Fabric** as your connection and click **Next**. 4. In the **Configure your environment** section, enter the **Settings** for your new project: + - **Server** — Use the service principal's **host** value for the Fabric test endpoint. + - **Port** — 1433 (which is the default). + - **Database** — Use the service principal's **database** value for the Fabric test endpoint. 5. Enter the **Development credentials** for your new project: - **Authentication** — Choose **Service Principal** from the dropdown. - **Tenant ID** — Use the service principal’s **Directory (tenant) id** as the value. diff --git a/website/docs/guides/productionize-your-dbt-databricks-project.md b/website/docs/guides/productionize-your-dbt-databricks-project.md index b95d8ffd2dd..3584cffba77 100644 --- a/website/docs/guides/productionize-your-dbt-databricks-project.md +++ b/website/docs/guides/productionize-your-dbt-databricks-project.md @@ -81,7 +81,7 @@ CI/CD, or Continuous Integration and Continuous Deployment/Delivery, has become The steps below show how to create a CI test for your dbt project. CD in dbt Cloud requires no additional steps, as your jobs will automatically pick up the latest changes from the branch assigned to the environment your job is running in. You may choose to add steps depending on your deployment strategy. 
If you want to dive deeper into CD options, check out [this blog on adopting CI/CD with dbt Cloud](https://www.getdbt.com/blog/adopting-ci-cd-with-dbt-cloud/). -dbt allows you to write [tests](/docs/build/tests) for your data pipeline, which can be run at every step of the process to ensure the stability and correctness of your data transformations. The main places you’ll use your dbt tests are: +dbt allows you to write [tests](/docs/build/data-tests) for your data pipeline, which can be run at every step of the process to ensure the stability and correctness of your data transformations. The main places you’ll use your dbt tests are: 1. **Daily runs:** Regularly running tests on your data pipeline helps catch issues caused by bad source data, ensuring the quality of data that reaches your users. 2. **Development**: Running tests during development ensures that your code changes do not break existing assumptions, enabling developers to iterate faster by catching problems immediately after writing code. diff --git a/website/docs/guides/set-up-your-databricks-dbt-project.md b/website/docs/guides/set-up-your-databricks-dbt-project.md index c17c6a1f99e..b2988f36589 100644 --- a/website/docs/guides/set-up-your-databricks-dbt-project.md +++ b/website/docs/guides/set-up-your-databricks-dbt-project.md @@ -62,7 +62,7 @@ Let’s [create a Databricks SQL warehouse](https://docs.databricks.com/sql/admi 5. Click *Create* 6. Configure warehouse permissions to ensure our service principal and developer have the right access. -We are not covering python in this post but if you want to learn more, check out these [docs](https://docs.getdbt.com/docs/build/python-models#specific-data-platforms). Depending on your workload, you may wish to create a larger SQL Warehouse for production workflows while having a smaller development SQL Warehouse (if you’re not using Serverless SQL Warehouses). +We are not covering python in this post but if you want to learn more, check out these [docs](https://docs.getdbt.com/docs/build/python-models#specific-data-platforms). Depending on your workload, you may wish to create a larger SQL Warehouse for production workflows while having a smaller development SQL Warehouse (if you’re not using Serverless SQL Warehouses). As your project grows, you might want to apply [compute per model configurations](/reference/resource-configs/databricks-configs#specifying-the-compute-for-models). ## Configure your dbt project diff --git a/website/docs/guides/sl-migration.md b/website/docs/guides/sl-migration.md index c3cca81f68e..8ede40a6a2d 100644 --- a/website/docs/guides/sl-migration.md +++ b/website/docs/guides/sl-migration.md @@ -91,13 +91,11 @@ At this point, both the new semantic layer and the old semantic layer will be ru Now that your Semantic Layer is set up, you will need to update any downstream integrations that used the legacy Semantic Layer. -### Migration guide for Hex +### Migration guide for Hex -To learn more about integrating with Hex, check out their [documentation](https://learn.hex.tech/docs/connect-to-data/data-connections/dbt-integration#dbt-semantic-layer-integration) for more info. Additionally, refer to [dbt Semantic Layer cells](https://learn.hex.tech/docs/logic-cell-types/transform-cells/dbt-metrics-cells) to set up SQL cells in Hex. +To learn more about integrating with Hex, check out their [documentation](https://learn.hex.tech/docs/connect-to-data/data-connections/dbt-integration#dbt-semantic-layer-integration) for more info. 
Additionally, refer to [dbt Semantic Layer cells](https://learn.hex.tech/docs/logic-cell-types/transform-cells/dbt-metrics-cells) to set up SQL cells in Hex. -1. Set up a new connection for the Semantic Layer for your account. Something to note is that your old connection will still work. The following Loom video guides you in setting up your Semantic Layer with Hex: - - +1. Set up a new connection for the dbt Semantic Layer for your account. Something to note is that your legacy connection will still work. 2. Re-create the dashboards or reports that use the legacy dbt Semantic Layer. diff --git a/website/docs/reference/artifacts/dbt-artifacts.md b/website/docs/reference/artifacts/dbt-artifacts.md index 859fde7c908..31525777500 100644 --- a/website/docs/reference/artifacts/dbt-artifacts.md +++ b/website/docs/reference/artifacts/dbt-artifacts.md @@ -48,3 +48,6 @@ In the manifest, the `metadata` may also include: #### Notes: - The structure of dbt artifacts is canonized by [JSON schemas](https://json-schema.org/), which are hosted at **schemas.getdbt.com**. - Artifact versions may change in any minor version of dbt (`v1.x.0`). Each artifact is versioned independently. + +## Related docs +- [Other artifacts](/reference/artifacts/other-artifacts) files such as `index.html` or `graph_summary.json`. diff --git a/website/docs/reference/artifacts/other-artifacts.md b/website/docs/reference/artifacts/other-artifacts.md index 205bdfc1a14..60050a6be66 100644 --- a/website/docs/reference/artifacts/other-artifacts.md +++ b/website/docs/reference/artifacts/other-artifacts.md @@ -21,7 +21,7 @@ This file is used to store a compressed representation of files dbt has parsed. **Produced by:** commands supporting [node selection](/reference/node-selection/syntax) -Stores the networkx representation of the dbt resource DAG. +Stores the network representation of the dbt resource DAG. ### graph_summary.json diff --git a/website/docs/reference/commands/test.md b/website/docs/reference/commands/test.md index c050d82a0ab..373ad9b6db3 100644 --- a/website/docs/reference/commands/test.md +++ b/website/docs/reference/commands/test.md @@ -28,4 +28,4 @@ dbt test --select "one_specific_model,test_type:singular" dbt test --select "one_specific_model,test_type:generic" ``` -For more information on writing tests, see the [Testing Documentation](/docs/build/tests). +For more information on writing tests, see the [Testing Documentation](/docs/build/data-tests). diff --git a/website/docs/reference/configs-and-properties.md b/website/docs/reference/configs-and-properties.md index c6458babeaa..9464faf719d 100644 --- a/website/docs/reference/configs-and-properties.md +++ b/website/docs/reference/configs-and-properties.md @@ -11,7 +11,7 @@ A rule of thumb: properties declare things _about_ your project resources; confi For example, you can use resource **properties** to: * Describe models, snapshots, seed files, and their columns -* Assert "truths" about a model, in the form of [tests](/docs/build/tests), e.g. "this `id` column is unique" +* Assert "truths" about a model, in the form of [data tests](/docs/build/data-tests), e.g. 
"this `id` column is unique" * Define pointers to existing tables that contain raw data, in the form of [sources](/docs/build/sources), and assert the expected "freshness" of this raw data * Define official downstream uses of your data models, in the form of [exposures](/docs/build/exposures) @@ -33,7 +33,7 @@ Depending on the resource type, configurations can be defined: dbt prioritizes configurations in order of specificity, from most specificity to least specificity. This generally follows the order above: an in-file `config()` block --> properties defined in a `.yml` file --> config defined in the project file. -Note - Generic tests work a little differently when it comes to specificity. See [test configs](/reference/test-configs). +Note - Generic data tests work a little differently when it comes to specificity. See [test configs](/reference/data-test-configs). Within the project file, configurations are also applied hierarchically. The most specific config always "wins": In the project file, configurations applied to a `marketing` subdirectory will take precedence over configurations applied to the entire `jaffle_shop` project. To apply a configuration to a model, or directory of models, define the resource path as nested dictionary keys. @@ -76,7 +76,7 @@ Certain properties are special, because: These properties are: - [`description`](/reference/resource-properties/description) -- [`tests`](/reference/resource-properties/tests) +- [`tests`](/reference/resource-properties/data-tests) - [`docs`](/reference/resource-configs/docs) - [`columns`](/reference/resource-properties/columns) - [`quote`](/reference/resource-properties/quote) diff --git a/website/docs/reference/test-configs.md b/website/docs/reference/data-test-configs.md similarity index 87% rename from website/docs/reference/test-configs.md rename to website/docs/reference/data-test-configs.md index 960e8d5471a..5f922d08c6b 100644 --- a/website/docs/reference/test-configs.md +++ b/website/docs/reference/data-test-configs.md @@ -1,8 +1,8 @@ --- -title: Test configurations -description: "Read this guide to learn about using test configurations in dbt." +title: Data test configurations +description: "Read this guide to learn about using data test configurations in dbt." meta: - resource_type: Tests + resource_type: Data tests --- import ConfigResource from '/snippets/_config-description-resource.md'; import ConfigGeneral from '/snippets/_config-description-general.md'; @@ -10,20 +10,20 @@ import ConfigGeneral from '/snippets/_config-description-general.md'; ## Related documentation -* [Tests](/docs/build/tests) +* [Data tests](/docs/build/data-tests) -Tests can be configured in a few different ways: -1. Properties within `.yml` definition (generic tests only, see [test properties](/reference/resource-properties/tests) for full syntax) +Data tests can be configured in a few different ways: +1. Properties within `.yml` definition (generic tests only, see [test properties](/reference/resource-properties/data-tests) for full syntax) 2. A `config()` block within the test's SQL definition 3. In `dbt_project.yml` -Test configs are applied hierarchically, in the order of specificity outlined above. In the case of a singular test, the `config()` block within the SQL definition takes precedence over configs in the project file. 
In the case of a specific instance of a generic test, the test's `.yml` properties would take precedence over any values set in its generic SQL definition's `config()`, which in turn would take precedence over values set in `dbt_project.yml`. +Data test configs are applied hierarchically, in the order of specificity outlined above. In the case of a singular test, the `config()` block within the SQL definition takes precedence over configs in the project file. In the case of a specific instance of a generic test, the test's `.yml` properties would take precedence over any values set in its generic SQL definition's `config()`, which in turn would take precedence over values set in `dbt_project.yml`. ## Available configurations Click the link on each configuration option to read more about what it can do. -### Test-specific configurations +### Data test-specific configurations @@ -204,7 +204,7 @@ version: 2 [alias](/reference/resource-configs/alias): ``` -This configuration mechanism is supported for specific instances of generic tests only. To configure a specific singular test, you should use the `config()` macro in its SQL definition. +This configuration mechanism is supported for specific instances of generic data tests only. To configure a specific singular test, you should use the `config()` macro in its SQL definition. @@ -216,7 +216,7 @@ This configuration mechanism is supported for specific instances of generic test #### Add a tag to one test -If a specific instance of a generic test: +If a specific instance of a generic data test: @@ -232,7 +232,7 @@ models: -If a singular test: +If a singular data test: @@ -244,7 +244,7 @@ select ... -#### Set the default severity for all instances of a generic test +#### Set the default severity for all instances of a generic data test @@ -260,7 +260,7 @@ select ... -#### Disable all tests from a package +#### Disable all data tests from a package diff --git a/website/docs/reference/dbt-commands.md b/website/docs/reference/dbt-commands.md index d5f0bfcd2ad..4cb20051ea2 100644 --- a/website/docs/reference/dbt-commands.md +++ b/website/docs/reference/dbt-commands.md @@ -5,7 +5,7 @@ title: "dbt Command reference" You can run dbt using the following tools: - In your browser with the [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/develop-in-the-cloud) -- On the command line interface using the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) or open-source [dbt Core](/docs/core/about-dbt-core), both of which enable you to execute dbt commands. The key distinction is the dbt Cloud CLI is tailored for dbt Cloud's infrastructure and integrates with all its [features](/docs/cloud/about-cloud/dbt-cloud-features). +- On the command line interface using the [dbt Cloud CLI](/docs/cloud/cloud-cli-installation) or open-source [dbt Core](/docs/core/installation-overview), both of which enable you to execute dbt commands. The key distinction is the dbt Cloud CLI is tailored for dbt Cloud's infrastructure and integrates with all its [features](/docs/cloud/about-cloud/dbt-cloud-features). The following sections outline the commands supported by dbt and their relevant flags. For information about selecting models on the command line, consult the docs on [Model selection syntax](/reference/node-selection/syntax). @@ -71,7 +71,7 @@ Use the following dbt commands in the [dbt Cloud IDE](/docs/cloud/dbt-cloud-ide/ -Use the following dbt commands in [dbt Core](/docs/core/about-dbt-core) and use the `dbt` prefix. For example, to run the `test` command, type `dbt test`. 
+Use the following dbt commands in [dbt Core](/docs/core/installation-overview) and use the `dbt` prefix. For example, to run the `test` command, type `dbt test`. - [build](/reference/commands/build): build and test all selected resources (models, seeds, snapshots, tests) - [clean](/reference/commands/clean): deletes artifacts present in the dbt project diff --git a/website/docs/reference/dbt-jinja-functions/config.md b/website/docs/reference/dbt-jinja-functions/config.md index c2fc8f96e5b..3903c82eef7 100644 --- a/website/docs/reference/dbt-jinja-functions/config.md +++ b/website/docs/reference/dbt-jinja-functions/config.md @@ -25,8 +25,6 @@ is responsible for handling model code that looks like this: ``` Review [Model configurations](/reference/model-configs) for examples and more information on valid arguments. -https://docs.getdbt.com/reference/model-configs - ## config.get __Args__: diff --git a/website/docs/reference/dbt-jinja-functions/return.md b/website/docs/reference/dbt-jinja-functions/return.md index 43bbddfa2d1..d2069bc9254 100644 --- a/website/docs/reference/dbt-jinja-functions/return.md +++ b/website/docs/reference/dbt-jinja-functions/return.md @@ -1,6 +1,6 @@ --- title: "About return function" -sidebar_variable: "return" +sidebar_label: "return" id: "return" description: "Read this guide to understand the return Jinja function in dbt." --- diff --git a/website/docs/reference/dbt_project.yml.md b/website/docs/reference/dbt_project.yml.md index 34af0f696c7..7b5d54c3e03 100644 --- a/website/docs/reference/dbt_project.yml.md +++ b/website/docs/reference/dbt_project.yml.md @@ -81,7 +81,7 @@ sources: [](source-configs) tests: - [](/reference/test-configs) + [](/reference/data-test-configs) vars: [](/docs/build/project-variables) @@ -153,7 +153,7 @@ sources: [](source-configs) tests: - [](/reference/test-configs) + [](/reference/data-test-configs) vars: [](/docs/build/project-variables) @@ -222,7 +222,7 @@ sources: [](source-configs) tests: - [](/reference/test-configs) + [](/reference/data-test-configs) vars: [](/docs/build/project-variables) diff --git a/website/docs/reference/model-properties.md b/website/docs/reference/model-properties.md index 63adc1f0d63..65f9307b5b3 100644 --- a/website/docs/reference/model-properties.md +++ b/website/docs/reference/model-properties.md @@ -23,9 +23,9 @@ models: [](/reference/model-configs): [constraints](/reference/resource-properties/constraints): - - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - - ... # declare additional tests + - ... # declare additional data tests [columns](/reference/resource-properties/columns): - name: # required [description](/reference/resource-properties/description): @@ -33,9 +33,9 @@ models: [quote](/reference/resource-properties/quote): true | false [constraints](/reference/resource-properties/constraints): - - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - - ... # declare additional tests + - ... # declare additional data tests [tags](/reference/resource-configs/tags): [] - name: ... # declare properties of additional columns @@ -51,9 +51,9 @@ models: - [config](/reference/resource-properties/config): [](/reference/model-configs): - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - - ... # declare additional tests + - ... 
# declare additional data tests columns: # include/exclude columns from the top-level model properties - [include](/reference/resource-properties/include-exclude): @@ -63,9 +63,9 @@ models: [quote](/reference/resource-properties/quote): true | false [constraints](/reference/resource-properties/constraints): - - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - - ... # declare additional tests + - ... # declare additional data tests [tags](/reference/resource-configs/tags): [] - v: ... # declare additional versions diff --git a/website/docs/reference/node-selection/defer.md b/website/docs/reference/node-selection/defer.md index 03c3b2aac12..81a0f4a0328 100644 --- a/website/docs/reference/node-selection/defer.md +++ b/website/docs/reference/node-selection/defer.md @@ -234,3 +234,8 @@ dbt will check to see if `dev_alice.model_a` exists. If it doesn't exist, dbt wi + +## Related docs + +- [Using defer in dbt Cloud](/docs/cloud/about-cloud-develop-defer) + diff --git a/website/docs/reference/node-selection/methods.md b/website/docs/reference/node-selection/methods.md index e29612e3401..2ffe0ea599e 100644 --- a/website/docs/reference/node-selection/methods.md +++ b/website/docs/reference/node-selection/methods.md @@ -173,7 +173,7 @@ dbt test --select "test_type:singular" # run all singular tests The `test_name` method is used to select tests based on the name of the generic test that defines it. For more information about how generic tests are defined, read about -[tests](/docs/build/tests). +[tests](/docs/build/data-tests). ```bash diff --git a/website/docs/reference/node-selection/syntax.md b/website/docs/reference/node-selection/syntax.md index d0ea4a9acd8..22946903b7d 100644 --- a/website/docs/reference/node-selection/syntax.md +++ b/website/docs/reference/node-selection/syntax.md @@ -35,7 +35,7 @@ To follow [POSIX standards](https://pubs.opengroup.org/onlinepubs/9699919799/bas 3. dbt now has a list of still-selected resources of varying types. As a final step, it tosses away any resource that does not match the resource type of the current task. (Only seeds are kept for `dbt seed`, only models for `dbt run`, only tests for `dbt test`, and so on.) -### Shorthand +## Shorthand Select resources to build (run, test, seed, snapshot) or check freshness: `--select`, `-s` @@ -43,6 +43,9 @@ Select resources to build (run, test, seed, snapshot) or check freshness: `--sel By default, `dbt run` will execute _all_ of the models in the dependency graph. During development (and deployment), it is useful to specify only a subset of models to run. Use the `--select` flag with `dbt run` to select a subset of models to run. Note that the following arguments (`--select`, `--exclude`, and `--selector`) also apply to other dbt tasks, such as `test` and `build`. + + + The `--select` flag accepts one or more arguments. Each argument can be one of: 1. a package name @@ -52,8 +55,7 @@ The `--select` flag accepts one or more arguments. 
Each argument can be one of: Examples: - - ```bash +```bash dbt run --select "my_dbt_project_name" # runs all models in your project dbt run --select "my_dbt_model" # runs a specific model dbt run --select "path.to.my.models" # runs all models in a specific directory @@ -61,14 +63,30 @@ dbt run --select "my_package.some_model" # run a specific model in a specific pa dbt run --select "tag:nightly" # run models with the "nightly" tag dbt run --select "path/to/models" # run models contained in path/to/models dbt run --select "path/to/my_model.sql" # run a specific model by its path - ``` +``` -dbt supports a shorthand language for defining subsets of nodes. This language uses the characters `+`, `@`, `*`, and `,`. + + - ```bash +dbt supports a shorthand language for defining subsets of nodes. This language uses the following characters: + +- plus operator [(`+`)](/reference/node-selection/graph-operators#the-plus-operator) +- at operator [(`@`)](/reference/node-selection/graph-operators#the-at-operator) +- asterisk operator (`*`) +- comma operator (`,`) + +Examples: + +```bash # multiple arguments can be provided to --select - dbt run --select "my_first_model my_second_model" +dbt run --select "my_first_model my_second_model" + +# select my_model and all of its children +dbt run --select "my_model+" + +# select my_model, its children, and the parents of its children +dbt run --select "@my_model" # these arguments can be projects, models, directory paths, tags, or sources dbt run --select "tag:nightly my_model finance.base.*" @@ -77,6 +95,10 @@ dbt run --select "tag:nightly my_model finance.base.*" dbt run --select "path:marts/finance,tag:nightly,config.materialized:table" ``` + + + + As your selection logic gets more complex, and becomes unwieldy to type out as command-line arguments, consider using a [yaml selector](/reference/node-selection/yaml-selectors). You can use a predefined definition with the `--selector` flag. Note that when you're using `--selector`, most other flags (namely `--select` and `--exclude`) will be ignored. diff --git a/website/docs/reference/project-configs/test-paths.md index e3d0e0b76fa..59f17db05eb 100644 --- a/website/docs/reference/project-configs/test-paths.md +++ b/website/docs/reference/project-configs/test-paths.md @@ -13,7 +13,7 @@ test-paths: [directorypath] ## Definition -Optionally specify a custom list of directories where [singular tests](/docs/build/tests) are located. +Optionally specify a custom list of directories where [singular tests](/docs/build/data-tests) are located.
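For example, a minimal sketch of this setting in `dbt_project.yml`; the second directory name here is hypothetical:

```yaml
# dbt_project.yml
# By default dbt looks in ["tests"]; add any extra directories that hold singular tests.
test-paths: ["tests", "custom_tests"]
```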
## Default diff --git a/website/docs/reference/resource-configs/access.md b/website/docs/reference/resource-configs/access.md index da50e48d2f0..fcde93647a1 100644 --- a/website/docs/reference/resource-configs/access.md +++ b/website/docs/reference/resource-configs/access.md @@ -27,7 +27,7 @@ You can apply access modifiers in config files, including `the dbt_project.yml`, There are multiple approaches to configuring access: -In the model configs of `dbt_project.yml``: +In the model configs of `dbt_project.yml`: ```yaml models: diff --git a/website/docs/reference/resource-configs/alias.md b/website/docs/reference/resource-configs/alias.md index 6b7588ecaf7..e1d3ae41f8b 100644 --- a/website/docs/reference/resource-configs/alias.md +++ b/website/docs/reference/resource-configs/alias.md @@ -112,7 +112,7 @@ When using `--store-failures`, this would return the name `analytics.finance.ord ## Definition -Optionally specify a custom alias for a [model](/docs/build/models), [tests](/docs/build/tests), [snapshots](/docs/build/snapshots), or [seed](/docs/build/seeds). +Optionally specify a custom alias for a [model](/docs/build/models), [data test](/docs/build/data-tests), [snapshot](/docs/build/snapshots), or [seed](/docs/build/seeds). When dbt creates a relation (/) in a database, it creates it as: `{{ database }}.{{ schema }}.{{ identifier }}`, e.g. `analytics.finance.payments` diff --git a/website/docs/reference/resource-configs/bigquery-configs.md b/website/docs/reference/resource-configs/bigquery-configs.md index ffbaa37c059..d3497a02caf 100644 --- a/website/docs/reference/resource-configs/bigquery-configs.md +++ b/website/docs/reference/resource-configs/bigquery-configs.md @@ -718,3 +718,188 @@ Views with this configuration will be able to select from objects in `project_1. The `grant_access_to` config is not thread-safe when multiple views need to be authorized for the same dataset. The initial `dbt run` operation after a new `grant_access_to` config is added should therefore be executed in a single thread. Subsequent runs using the same configuration will not attempt to re-apply existing access grants, and can make use of multiple threads. 
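As a sketch of what this looks like in practice (the project, dataset, and folder names below are hypothetical), the config can be set once in `dbt_project.yml` and the first run executed single-threaded:

```yaml
# dbt_project.yml
models:
  my_project:
    authorized_views:
      +materialized: view
      +grant_access_to:
        - project: project_1
          dataset: dataset_1
# After adding this config for the first time, run once with a single thread,
# for example: dbt run --threads 1 --select authorized_views
```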
+ + +## Materialized views + +The BigQuery adapter supports [materialized views](https://cloud.google.com/bigquery/docs/materialized-views-intro) +with the following configuration parameters: + +| Parameter | Type | Required | Default | Change Monitoring Support | +|-------------------------------------------------------------|------------------------|----------|---------|---------------------------| +| `on_configuration_change` | `` | no | `apply` | n/a | +| [`cluster_by`](#clustering-clause) | `[]` | no | `none` | drop/create | +| [`partition_by`](#partition-clause) | `{}` | no | `none` | drop/create | +| [`enable_refresh`](#auto-refresh) | `` | no | `true` | alter | +| [`refresh_interval_minutes`](#auto-refresh) | `` | no | `30` | alter | +| [`max_staleness`](#auto-refresh) (in Preview) | `` | no | `none` | alter | +| [`description`](/reference/resource-properties/description) | `` | no | `none` | alter | +| [`labels`](#specifying-labels) | `{: }` | no | `none` | alter | +| [`hours_to_expiration`](#controlling-table-expiration) | `` | no | `none` | alter | +| [`kms_key_name`](#using-kms-encryption) | `` | no | `none` | alter | + + + + + + + + +```yaml +models: + [](/reference/resource-configs/resource-path): + [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): materialized_view + [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail + [+](/reference/resource-configs/plus-prefix)[cluster_by](#clustering-clause): | [] + [+](/reference/resource-configs/plus-prefix)[partition_by](#partition-clause): + - field: + - data_type: timestamp | date | datetime | int64 + # only if `data_type` is not 'int64' + - granularity: hour | day | month | year + # only if `data_type` is 'int64' + - range: + - start: + - end: + - interval: + [+](/reference/resource-configs/plus-prefix)[enable_refresh](#auto-refresh): true | false + [+](/reference/resource-configs/plus-prefix)[refresh_interval_minutes](#auto-refresh): + [+](/reference/resource-configs/plus-prefix)[max_staleness](#auto-refresh): + [+](/reference/resource-configs/plus-prefix)[description](/reference/resource-properties/description): + [+](/reference/resource-configs/plus-prefix)[labels](#specifying-labels): {: } + [+](/reference/resource-configs/plus-prefix)[hours_to_expiration](#acontrolling-table-expiration): + [+](/reference/resource-configs/plus-prefix)[kms_key_name](##using-kms-encryption): +``` + + + + + + + + + + +```yaml +version: 2 + +models: + - name: [] + config: + [materialized](/reference/resource-configs/materialized): materialized_view + on_configuration_change: apply | continue | fail + [cluster_by](#clustering-clause): | [] + [partition_by](#partition-clause): + - field: + - data_type: timestamp | date | datetime | int64 + # only if `data_type` is not 'int64' + - granularity: hour | day | month | year + # only if `data_type` is 'int64' + - range: + - start: + - end: + - interval: + [enable_refresh](#auto-refresh): true | false + [refresh_interval_minutes](#auto-refresh): + [max_staleness](#auto-refresh): + [description](/reference/resource-properties/description): + [labels](#specifying-labels): {: } + [hours_to_expiration](#acontrolling-table-expiration): + [kms_key_name](##using-kms-encryption): +``` + + + + + + + + + + +```jinja +{{ config( + [materialized](/reference/resource-configs/materialized)='materialized_view', + on_configuration_change="apply" | "continue" | "fail", + [cluster_by](#clustering-clause)="" | [""], + 
[partition_by](#partition-clause)={ + "field": "", + "data_type": "timestamp" | "date" | "datetime" | "int64", + + # only if `data_type` is not 'int64' + "granularity": "hour" | "day" | "month" | "year", + + # only if `data_type` is 'int64' + "range": { + "start": , + "end": , + "interval": , + } + }, + + # auto-refresh options + [enable_refresh](#auto-refresh)= true | false, + [refresh_interval_minutes](#auto-refresh)=, + [max_staleness](#auto-refresh)="", + + # additional options + [description](/reference/resource-properties/description)="", + [labels](#specifying-labels)={ + "": "", + }, + [hours_to_expiration](#controlling-table-expiration)=, + [kms_key_name](#using-kms-encryption)="", +) }} +``` + + + + + + + +Many of these parameters correspond to their table counterparts and have been linked above. +The set of parameters unique to materialized views covers [auto-refresh functionality](#auto-refresh). + +Find more information about these parameters in the BigQuery docs: +- [CREATE MATERIALIZED VIEW statement](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_materialized_view_statement) +- [materialized_view_option_list](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#materialized_view_option_list) + +### Auto-refresh + +| Parameter | Type | Required | Default | Change Monitoring Support | +|------------------------------|--------------|----------|---------|---------------------------| +| `enable_refresh` | `` | no | `true` | alter | +| `refresh_interval_minutes` | `` | no | `30` | alter | +| `max_staleness` (in Preview) | `` | no | `none` | alter | + +BigQuery supports [automatic refresh](https://cloud.google.com/bigquery/docs/materialized-views-manage#automatic_refresh) configuration for materialized views. +By default, a materialized view will automatically refresh within 5 minutes of changes in the base table, but not more frequently than once every 30 minutes. +BigQuery only officially supports the configuration of the frequency (the "once every 30 minutes" frequency); +however, there is a feature in preview that allows for the configuration of the staleness (the "5 minutes" refresh). +dbt will monitor these parameters for changes and apply them using an `ALTER` statement. + +Find more information about these parameters in the BigQuery docs: +- [materialized_view_option_list](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#materialized_view_option_list) +- [max_staleness](https://cloud.google.com/bigquery/docs/materialized-views-create#max_staleness) + +### Limitations + +As with most data platforms, there are limitations associated with materialized views. Some worth noting include: + +- Materialized view SQL has a [limited feature set](https://cloud.google.com/bigquery/docs/materialized-views-create#supported-mvs). +- Materialized view SQL cannot be updated; the materialized view must go through a `--full-refresh` (DROP/CREATE). +- The `partition_by` clause on a materialized view must match that of the underlying base table. +- While materialized views can have descriptions, materialized view *columns* cannot. +- Recreating/dropping the base table requires recreating/dropping the materialized view. + +Find more information about materialized view limitations in Google's BigQuery [docs](https://cloud.google.com/bigquery/docs/materialized-views-intro#limitations).
+ + diff --git a/website/docs/reference/resource-configs/database.md b/website/docs/reference/resource-configs/database.md index 7d91358ff01..19c9eca272d 100644 --- a/website/docs/reference/resource-configs/database.md +++ b/website/docs/reference/resource-configs/database.md @@ -70,7 +70,7 @@ This would result in the generated relation being located in the `reporting` dat ## Definition -Optionally specify a custom database for a [model](/docs/build/sql-models), [seed](/docs/build/seeds), or [tests](/docs/build/tests). (To specify a database for a [snapshot](/docs/build/snapshots), use the [`target_database` config](/reference/resource-configs/target_database)). +Optionally specify a custom database for a [model](/docs/build/sql-models), [seed](/docs/build/seeds), or [data test](/docs/build/data-tests). (To specify a database for a [snapshot](/docs/build/snapshots), use the [`target_database` config](/reference/resource-configs/target_database)). When dbt creates a relation (/) in a database, it creates it as: `{{ database }}.{{ schema }}.{{ identifier }}`, e.g. `analytics.finance.payments` diff --git a/website/docs/reference/resource-configs/databricks-configs.md b/website/docs/reference/resource-configs/databricks-configs.md index 65c6607cdcd..677cad57ce6 100644 --- a/website/docs/reference/resource-configs/databricks-configs.md +++ b/website/docs/reference/resource-configs/databricks-configs.md @@ -38,9 +38,9 @@ When materializing a model as `table`, you may include several optional configs ## Incremental models dbt-databricks plugin leans heavily on the [`incremental_strategy` config](/docs/build/incremental-models#about-incremental_strategy). This config tells the incremental materialization how to build models in runs beyond their first. It can be set to one of four values: - - **`append`** (default): Insert new records without updating or overwriting any existing data. + - **`append`**: Insert new records without updating or overwriting any existing data. - **`insert_overwrite`**: If `partition_by` is specified, overwrite partitions in the with new data. If no `partition_by` is specified, overwrite the entire table with new data. - - **`merge`** (Delta and Hudi file format only): Match records based on a `unique_key`, updating old records, and inserting new ones. (If no `unique_key` is specified, all new data is inserted, similar to `append`.) + - **`merge`** (default; Delta and Hudi file format only): Match records based on a `unique_key`, updating old records, and inserting new ones. (If no `unique_key` is specified, all new data is inserted, similar to `append`.) - **`replace_where`** (Delta file format only): Match records based on `incremental_predicates`, replacing all records that match the predicates from the existing table with records matching the predicates from the new data. (If no `incremental_predicates` are specified, all new data is inserted, similar to `append`.) Each of these strategies has its pros and cons, which we'll discuss below. As with any model config, `incremental_strategy` may be specified in `dbt_project.yml` or within a model file's `config()` block. @@ -49,8 +49,6 @@ Each of these strategies has its pros and cons, which we'll discuss below. As wi Following the `append` strategy, dbt will perform an `insert into` statement with all new data. The appeal of this strategy is that it is straightforward and functional across all platforms, file types, connection methods, and Apache Spark versions. 
However, this strategy _cannot_ update, overwrite, or delete existing data, so it is likely to insert duplicate records for many data sources. -Specifying `append` as the incremental strategy is optional, since it's the default strategy used when none is specified. - + + +## Selecting compute per model + +Beginning in version 1.7.2, you can assign which compute resource to use on a per-model basis. +For SQL models, you can select a SQL Warehouse (serverless or provisioned) or an all purpose cluster. +For details on how this feature interacts with python models, see [Specifying compute for Python models](#specifying-compute-for-python-models). + +:::note + +This is an optional setting. If you do not configure this as shown below, we will default to the compute specified by http_path in the top level of the output section in your profile. +This is also the compute that will be used for tasks not associated with a particular model, such as gathering metadata for all tables in a schema. + +::: + + +To take advantage of this capability, you will need to add compute blocks to your profile: + + + +```yaml + +: + target: # this is the default target + outputs: + : + type: databricks + catalog: [optional catalog name if you are using Unity Catalog] + schema: [schema name] # Required + host: [yourorg.databrickshost.com] # Required + + ### This path is used as the default compute + http_path: [/sql/your/http/path] # Required + + ### New compute section + compute: + + ### Name that you will use to refer to an alternate compute + Compute1: + http_path: [‘/sql/your/http/path’] # Required of each alternate compute + + ### A third named compute, use whatever name you like + Compute2: + http_path: [‘/some/other/path’] # Required of each alternate compute + ... + + : # additional targets + ... + ### For each target, you need to define the same compute, + ### but you can specify different paths + compute: + + ### Name that you will use to refer to an alternate compute + Compute1: + http_path: [‘/sql/your/http/path’] # Required of each alternate compute + + ### A third named compute, use whatever name you like + Compute2: + http_path: [‘/some/other/path’] # Required of each alternate compute + ... + +``` + + + +The new compute section is a map of user chosen names to objects with an http_path property. +Each compute is keyed by a name which is used in the model definition/configuration to indicate which compute you wish to use for that model/selection of models. +We recommend choosing a name that is easily recognized as the compute resources you're using, such as the name of the compute resource inside the Databricks UI. + +:::note + +You need to use the same set of names for compute across your outputs, though you may supply different http_paths, allowing you to use different computes in different deployment scenarios. + +::: + +To configure this inside of dbt Cloud, use the [extended attributes feature](/docs/dbt-cloud-environments#extended-attributes-) on the desired environments: + +```yaml + +compute: + Compute1: + http_path:[`/some/other/path'] + Compute2: + http_path:[`/some/other/path'] + +``` + +### Specifying the compute for models + +As with many other configuaration options, you can specify the compute for a model in multiple ways, using `databricks_compute`. +In your `dbt_project.yml`, the selected compute can be specified for all the models in a given directory: + + + +```yaml + +... 
+ +models: + +databricks_compute: "Compute1" # use the `Compute1` warehouse/cluster for all models in the project... + my_project: + clickstream: + +databricks_compute: "Compute2" # ...except for the models in the `clickstream` folder, which will use `Compute2`. + +snapshots: + +databricks_compute: "Compute1" # all Snapshot models are configured to use `Compute1`. + +``` + + + +For an individual model the compute can be specified in the model config in your schema file. + + + +```yaml + +models: + - name: table_model + config: + databricks_compute: Compute1 + columns: + - name: id + data_type: int + +``` + + + + +Alternatively the warehouse can be specified in the config block of a model's SQL file. + + + +```sql + +{{ + config( + materialized='table', + databricks_compute='Compute1' + ) +}} +select * from {{ ref('seed') }} + +``` + + + + +To validate that the specified compute is being used, look for lines in your dbt.log like: + +``` +Databricks adapter ... using default compute resource. +``` + +or + +``` +Databricks adapter ... using compute resource . +``` + +### Specifying compute for Python models + +Materializing a python model requires execution of SQL as well as python. +Specifically, if your python model is incremental, the current execution pattern involves executing python to create a staging table that is then merged into your target table using SQL. +The python code needs to run on an all purpose cluster, while the SQL code can run on an all purpose cluster or a SQL Warehouse. +When you specify your `databricks_compute` for a python model, you are currently only specifying which compute to use when running the model-specific SQL. +If you wish to use a different compute for executing the python itself, you must specify an alternate `http_path` in the config for the model. Please note that declaring a separate SQL compute and a python compute for your python dbt models is optional. If you wish to do this: + + + + ```python + +def model(dbt, session): + dbt.config( + http_path="sql/protocolv1/..." + ) + +``` + + + +If your default compute is a SQL Warehouse, you will need to specify an all purpose cluster `http_path` in this way. + + ## Persisting model descriptions diff --git a/website/docs/reference/resource-configs/infer-configs.md b/website/docs/reference/resource-configs/infer-configs.md new file mode 100644 index 00000000000..c4837806935 --- /dev/null +++ b/website/docs/reference/resource-configs/infer-configs.md @@ -0,0 +1,39 @@ +--- +title: "Infer configurations" +description: "Read this guide to understand how to configure Infer with dbt." +id: "infer-configs" +--- + + +## Authentication + +To connect to Infer from your dbt instance you need to set up a correct profile in your `profiles.yml`. + +The format of this should look like this: + + + +```yaml +: + target: + outputs: + : + type: infer + url: "" + username: "" + apikey: "" + data_config: + [configuration for your underlying data warehouse] +``` + + + +### Description of Infer Profile Fields + +| Field | Required | Description | +|------------|----------|---------------------------------------------------------------------------------------------------------------------------------------------------| +| `type` | Yes | Must be set to `infer`. This must be included either in `profiles.yml` or in the `dbt_project.yml` file. | +| `url` | Yes | The host name of the Infer server to connect to. Typically this is `https://app.getinfer.io`. | +| `username` | Yes | Your Infer username - the one you use to login. 
| +| `apikey` | Yes | Your Infer api key. | +| `data_config` | Yes | The configuration for your underlying data warehouse. The format of this follows the format of the configuration for your data warehouse adapter. | diff --git a/website/docs/reference/resource-configs/meta.md b/website/docs/reference/resource-configs/meta.md index bc0c0c7c041..bd9755548ab 100644 --- a/website/docs/reference/resource-configs/meta.md +++ b/website/docs/reference/resource-configs/meta.md @@ -127,7 +127,7 @@ See [configs and properties](/reference/configs-and-properties) for details. -You can't add YAML `meta` configs for [generic tests](/docs/build/tests#generic-tests). However, you can add `meta` properties to [singular tests](/docs/build/tests#singular-tests) using `config()` at the top of the test file. +You can't add YAML `meta` configs for [generic tests](/docs/build/data-tests#generic-data-tests). However, you can add `meta` properties to [singular tests](/docs/build/data-tests#singular-data-tests) using `config()` at the top of the test file. diff --git a/website/docs/reference/resource-configs/postgres-configs.md b/website/docs/reference/resource-configs/postgres-configs.md index 97a695ee12e..fcc0d91a47c 100644 --- a/website/docs/reference/resource-configs/postgres-configs.md +++ b/website/docs/reference/resource-configs/postgres-configs.md @@ -10,16 +10,16 @@ In dbt-postgres, the following incremental materialization strategies are suppor -- `append` (default) -- `delete+insert` +- `append` (default when `unique_key` is not defined) +- `delete+insert` (default when `unique_key` is defined) -- `append` (default) +- `append` (default when `unique_key` is not defined) - `merge` -- `delete+insert` +- `delete+insert` (default when `unique_key` is defined) @@ -108,21 +108,100 @@ models: ## Materialized views -The Postgres adapter supports [materialized views](https://www.postgresql.org/docs/current/rules-materializedviews.html). -Indexes are the only configuration that is specific to `dbt-postgres`. -The remaining configuration follows the general [materialized view](/docs/build/materializations#materialized-view) configuration. -There are also some limitations that we hope to address in the next version. +The Postgres adapter supports [materialized views](https://www.postgresql.org/docs/current/rules-materializedviews.html) +with the following configuration parameters: -### Monitored configuration changes +| Parameter | Type | Required | Default | Change Monitoring Support | +|---------------------------|--------------------|----------|---------|---------------------------| +| `on_configuration_change` | `` | no | `apply` | n/a | +| [`indexes`](#indexes) | `[{}]` | no | `none` | alter | -The settings below are monitored for changes applicable to `on_configuration_change`. + -#### Indexes -Index changes (`CREATE`, `DROP`) can be applied without the need to rebuild the materialized view. -This differs from a table model, where the table needs to be dropped and re-created to update the indexes. -If the `indexes` portion of the `config` block is updated, the changes will be detected and applied -directly to the materialized view in place. 
+ + + + +```yaml +models: + [](/reference/resource-configs/resource-path): + [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): materialized_view + [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail + [+](/reference/resource-configs/plus-prefix)[indexes](#indexes): + - columns: [] + unique: true | false + type: hash | btree +``` + + + + + + + + + + +```yaml +version: 2 + +models: + - name: [] + config: + [materialized](/reference/resource-configs/materialized): materialized_view + on_configuration_change: apply | continue | fail + [indexes](#indexes): + - columns: [] + unique: true | false + type: hash | btree +``` + + + + + + + + + + +```jinja +{{ config( + [materialized](/reference/resource-configs/materialized)="materialized_view", + on_configuration_change="apply" | "continue" | "fail", + [indexes](#indexes)=[ + { + "columns": [""], + "unique": true | false, + "type": "hash" | "btree", + } + ] +) }} +``` + + + + + + + +The [`indexes`](#indexes) parameter corresponds to that of a table, as explained above. +It's worth noting that, unlike tables, dbt monitors this parameter for changes and applies the changes without dropping the materialized view. +This happens via a `DROP/CREATE` of the indexes, which can be thought of as an `ALTER` of the materialized view. + +Find more information about materialized view parameters in the Postgres docs: +- [CREATE MATERIALIZED VIEW](https://www.postgresql.org/docs/current/sql-creatematerializedview.html) + + ### Limitations @@ -138,3 +217,5 @@ If the user changes the model's config to `materialized="materialized_view"`, th The solution is to execute `DROP TABLE my_model` on the data warehouse before trying the model again. + + diff --git a/website/docs/reference/resource-configs/redshift-configs.md b/website/docs/reference/resource-configs/redshift-configs.md index 9bd127a1e1a..85b2af0c552 100644 --- a/website/docs/reference/resource-configs/redshift-configs.md +++ b/website/docs/reference/resource-configs/redshift-configs.md @@ -16,16 +16,16 @@ In dbt-redshift, the following incremental materialization strategies are suppor -- `append` (default) -- `delete+insert` - +- `append` (default when `unique_key` is not defined) +- `delete+insert` (default when `unique_key` is defined) + -- `append` (default) +- `append` (default when `unique_key` is not defined) - `merge` -- `delete+insert` +- `delete+insert` (default when `unique_key` is defined) @@ -111,40 +111,138 @@ models: ## Materialized views -The Redshift adapter supports [materialized views](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-overview.html). -Redshift-specific configuration includes the typical `dist`, `sort_type`, `sort`, and `backup`. -For materialized views, there is also the `auto_refresh` setting, which allows Redshift to [automatically refresh](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-refresh.html) the materialized view for you. -The remaining configuration follows the general [materialized view](/docs/build/materializations#Materialized-View) configuration. -There are also some limitations that we hope to address in the next version. 
+The Redshift adapter supports [materialized views](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-overview.html) +with the following configuration parameters: + +| Parameter | Type | Required | Default | Change Monitoring Support | +|-------------------------------------------|--------------|----------|------------------------------------------------|---------------------------| +| `on_configuration_change` | `` | no | `apply` | n/a | +| [`dist`](#using-sortkey-and-distkey) | `` | no | `even` | drop/create | +| [`sort`](#using-sortkey-and-distkey) | `[]` | no | `none` | drop/create | +| [`sort_type`](#using-sortkey-and-distkey) | `` | no | `auto` if no `sort`
`compound` if `sort` | drop/create | +| [`auto_refresh`](#auto-refresh) | `` | no | `false` | alter | +| [`backup`](#backup) | `` | no | `true` | n/a | + + + + + + + + +```yaml +models: + [](/reference/resource-configs/resource-path): + [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): materialized_view + [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail + [+](/reference/resource-configs/plus-prefix)[dist](#using-sortkey-and-distkey): all | auto | even | + [+](/reference/resource-configs/plus-prefix)[sort](#using-sortkey-and-distkey): | [] + [+](/reference/resource-configs/plus-prefix)[sort_type](#using-sortkey-and-distkey): auto | compound | interleaved + [+](/reference/resource-configs/plus-prefix)[auto_refresh](#auto-refresh): true | false + [+](/reference/resource-configs/plus-prefix)[backup](#backup): true | false +``` + + + + + + + + + + +```yaml +version: 2 + +models: + - name: [] + config: + [materialized](/reference/resource-configs/materialized): materialized_view + on_configuration_change: apply | continue | fail + [dist](#using-sortkey-and-distkey): all | auto | even | + [sort](#using-sortkey-and-distkey): | [] + [sort_type](#using-sortkey-and-distkey): auto | compound | interleaved + [auto_refresh](#auto-refresh): true | false + [backup](#backup): true | false +``` + + + + + + + + + + +```jinja +{{ config( + [materialized](/reference/resource-configs/materialized)="materialized_view", + on_configuration_change="apply" | "continue" | "fail", + [dist](#using-sortkey-and-distkey)="all" | "auto" | "even" | "", + [sort](#using-sortkey-and-distkey)=[""], + [sort_type](#using-sortkey-and-distkey)="auto" | "compound" | "interleaved", + [auto_refresh](#auto-refresh)=true | false, + [backup](#backup)=true | false, +) }} +``` -### Monitored configuration changes + + + + + -The settings below are monitored for changes applicable to `on_configuration_change`. +Many of these parameters correspond to their table counterparts and have been linked above. +The parameters unique to materialized views are the [auto-refresh](#auto-refresh) and [backup](#backup) functionality, which are covered below. -#### Dist +Find more information about the [CREATE MATERIALIZED VIEW](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html) parameters in the Redshift docs. -Changes to `dist` will result in a full refresh of the existing materialized view (applied at the time of the next `dbt run` of the model). Redshift requires a materialized view to be -dropped and recreated to apply a change to the `distkey` or `diststyle`. +#### Auto-refresh -#### Sort type, sort +| Parameter | Type | Required | Default | Change Monitoring Support | +|----------------|-------------|----------|---------|---------------------------| +| `auto_refresh` | `` | no | `false` | alter | -Changes to `sort_type` or `sort` will result in a full refresh. Redshift requires a materialized -view to be dropped and recreated to apply a change to the `sortkey` or `sortstyle`. +Redshift supports [automatic refresh](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-refresh.html#materialized-view-auto-refresh) configuration for materialized views. +By default, a materialized view does not automatically refresh. +dbt monitors this parameter for changes and applies them using an `ALTER` statement. 
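For example, a minimal sketch that turns auto-refresh on for one model (the model name is hypothetical):

```yaml
version: 2

models:
  - name: recent_orders_mv
    config:
      materialized: materialized_view
      auto_refresh: true
```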
+ +Learn more information about the [parameters](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html#mv_CREATE_MATERIALIZED_VIEW-parameters) in the Redshift docs. #### Backup -Changes to `backup` will result in a full refresh. Redshift requires a materialized -view to be dropped and recreated to apply a change to the `backup` setting. +| Parameter | Type | Required | Default | Change Monitoring Support | +|-----------|-------------|----------|---------|---------------------------| +| `backup` | `` | no | `true` | n/a | -#### Auto refresh +Redshift supports [backup](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-snapshots.html) configuration of clusters at the object level. +This parameter identifies if the materialized view should be backed up as part of the cluster snapshot. +By default, a materialized view will be backed up during a cluster snapshot. +dbt cannot monitor this parameter as it is not queryable within Redshift. +If the value is changed, the materialized view will need to go through a `--full-refresh` in order to set it. -The `auto_refresh` setting can be updated via an `ALTER` statement. This setting effectively toggles -automatic refreshes on or off. The default setting for this config is off (`False`). If this -is the only configuration change for the materialized view, dbt will choose to apply -an `ALTER` statement instead of issuing a full refresh, +Learn more about the [parameters](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html#mv_CREATE_MATERIALIZED_VIEW-parameters) in the Redshift docs. ### Limitations +As with most data platforms, there are limitations associated with materialized views. Some worth noting include: + +- Materialized views cannot reference views, temporary tables, user-defined functions, or late-binding tables. +- Auto-refresh cannot be used if the materialized view references mutable functions, external schemas, or another materialized view. + +Find more information about materialized view limitations in Redshift's [docs](https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-create-sql-command.html#mv_CREATE_MATERIALIZED_VIEW-limitations). + + + #### Changing materialization from "materialized_view" to "table" or "view" Swapping a materialized view to a table or view is not supported. @@ -157,3 +255,5 @@ If the user changes the model's config to `materialized="table"`, they will get The workaround is to execute `DROP MATERIALIZED VIEW my_mv CASCADE` on the data warehouse before trying the model again. + + diff --git a/website/docs/reference/resource-configs/snowflake-configs.md b/website/docs/reference/resource-configs/snowflake-configs.md index 30c7966ec68..aa2bf370df6 100644 --- a/website/docs/reference/resource-configs/snowflake-configs.md +++ b/website/docs/reference/resource-configs/snowflake-configs.md @@ -346,82 +346,122 @@ In the configuration format for the model SQL file: ## Dynamic tables -The Snowflake adapter supports [dynamic tables](https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table). +The Snowflake adapter supports [dynamic tables](https://docs.snowflake.com/en/user-guide/dynamic-tables-about). This materialization is specific to Snowflake, which means that any model configuration that would normally come along for the ride from `dbt-core` (e.g. as with a `view`) may not be available for dynamic tables. This gap will decrease in future patches and versions. 
While this materialization is specific to Snowflake, it very much follows the implementation of [materialized views](/docs/build/materializations#Materialized-View). In particular, dynamic tables have access to the `on_configuration_change` setting. -There are also some limitations that we hope to address in the next version. +Dynamic tables are supported with the following configuration parameters: -### Parameters +| Parameter | Type | Required | Default | Change Monitoring Support | +|----------------------------------------------------------|------------|----------|---------|---------------------------| +| `on_configuration_change` | `` | no | `apply` | n/a | +| [`target_lag`](#target-lag) | `` | yes | | alter | +| [`snowflake_warehouse`](#configuring-virtual-warehouses) | `` | yes | | alter | -Dynamic tables in `dbt-snowflake` require the following parameters: -- `target_lag` -- `snowflake_warehouse` -- `on_configuration_change` + -To learn more about each parameter and what values it can take, see -the Snowflake docs page: [`CREATE DYNAMIC TABLE: Parameters`](https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table). -### Usage + -You can create a dynamic table by editing _one_ of these files: + -- the SQL file for your model -- the `dbt_project.yml` configuration file +```yaml +models: + [](/reference/resource-configs/resource-path): + [+](/reference/resource-configs/plus-prefix)[materialized](/reference/resource-configs/materialized): dynamic_table + [+](/reference/resource-configs/plus-prefix)on_configuration_change: apply | continue | fail + [+](/reference/resource-configs/plus-prefix)[target_lag](#target-lag): downstream | + [+](/reference/resource-configs/plus-prefix)[snowflake_warehouse](#configuring-virtual-warehouses): +``` -The following examples create a dynamic table: + - + -```sql -{{ config( - materialized = 'dynamic_table', - snowflake_warehouse = 'snowflake_warehouse', - target_lag = '10 minutes', -) }} -``` -
+ - + ```yaml +version: 2 + models: - path: - materialized: dynamic_table - snowflake_warehouse: snowflake_warehouse - target_lag: '10 minutes' + - name: [] + config: + [materialized](/reference/resource-configs/materialized): dynamic_table + on_configuration_change: apply | continue | fail + [target_lag](#target-lag): downstream | + [snowflake_warehouse](#configuring-virtual-warehouses): ``` -### Monitored configuration changes + -The settings below are monitored for changes applicable to `on_configuration_change`. -#### Target lag + -Changes to `target_lag` can be applied by running an `ALTER` statement. Refreshing is essentially -always on for dynamic tables; this setting changes how frequently the dynamic table is updated. + -#### Warehouse +```jinja +{{ config( + [materialized](/reference/resource-configs/materialized)="dynamic_table", + on_configuration_change="apply" | "continue" | "fail", + [target_lag](#target-lag)="downstream" | " seconds | minutes | hours | days", + [snowflake_warehouse](#configuring-virtual-warehouses)="", +) }} +``` + + + + + + + +Find more information about these parameters in Snowflake's [docs](https://docs.snowflake.com/en/sql-reference/sql/create-dynamic-table): -Changes to `snowflake_warehouse` can be applied via an `ALTER` statement. +### Target lag + +Snowflake allows two configuration scenarios for scheduling automatic refreshes: +- **Time-based** — Provide a value of the form ` { seconds | minutes | hours | days }`. For example, if the dynamic table needs to be updated every 30 minutes, use `target_lag='30 minutes'`. +- **Downstream** — Applicable when the dynamic table is referenced by other dynamic tables. In this scenario, `target_lag='downstream'` allows for refreshes to be controlled at the target, instead of at each layer. + +Find more information about `target_lag` in Snowflake's [docs](https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh#understanding-target-lag). ### Limitations +As with materialized views on most data platforms, there are limitations associated with dynamic tables. Some worth noting include: + +- Dynamic table SQL has a [limited feature set](https://docs.snowflake.com/en/user-guide/dynamic-tables-tasks-create#query-constructs-not-currently-supported-in-dynamic-tables). +- Dynamic table SQL cannot be updated; the dynamic table must go through a `--full-refresh` (DROP/CREATE). +- Dynamic tables cannot be downstream from: materialized views, external tables, streams. +- Dynamic tables cannot reference a view that is downstream from another dynamic table. + +Find more information about dynamic table limitations in Snowflake's [docs](https://docs.snowflake.com/en/user-guide/dynamic-tables-tasks-create#dynamic-table-limitations-and-supported-functions). + + + #### Changing materialization to and from "dynamic_table" -Swapping an already materialized model to be a dynamic table and vice versa. -The workaround is manually dropping the existing materialization in the data warehouse prior to calling `dbt run`. -Normally, re-running with the `--full-refresh` flag would resolve this, but not in this case. -This would only need to be done once as the existing object would then be a dynamic table. +Version `1.6.x` does not support altering the materialization from a non-dynamic table be a dynamic table and vice versa. +Re-running with the `--full-refresh` does not resolve this either. +The workaround is manually dropping the existing model in the warehouse prior to calling `dbt run`. 
+This only needs to be done once for the conversion. For example, assume for the example model below, `my_model`, has already been materialized to the underlying data platform via `dbt run`. -If the user changes the model's config to `materialized="dynamic_table"`, they will get an error. +If the model config is updated to `materialized="dynamic_table"`, dbt will return an error. The workaround is to execute `DROP TABLE my_model` on the data warehouse before trying the model again. @@ -429,7 +469,7 @@ The workaround is to execute `DROP TABLE my_model` on the data warehouse before ```yaml {{ config( - materialized="table" # or any model type eg view, incremental + materialized="table" # or any model type (e.g. view, incremental) ) }} ``` @@ -437,3 +477,5 @@ The workaround is to execute `DROP TABLE my_model` on the data warehouse before + + diff --git a/website/docs/reference/resource-configs/store_failures_as.md b/website/docs/reference/resource-configs/store_failures_as.md index a9149360089..dd61030afb8 100644 --- a/website/docs/reference/resource-configs/store_failures_as.md +++ b/website/docs/reference/resource-configs/store_failures_as.md @@ -17,7 +17,7 @@ You can configure it in all the same places as `store_failures`, including singu #### Singular test -[Singular test](https://docs.getdbt.com/docs/build/tests#singular-tests) in `tests/singular/check_something.sql` file +[Singular test](https://docs.getdbt.com/docs/build/tests#singular-data-tests) in `tests/singular/check_something.sql` file ```sql {{ config(store_failures_as="table") }} @@ -29,7 +29,7 @@ where 1=0 #### Generic test -[Generic tests](https://docs.getdbt.com/docs/build/tests#generic-tests) in `models/_models.yml` file +[Generic tests](https://docs.getdbt.com/docs/build/tests#generic-data-tests) in `models/_models.yml` file ```yaml models: @@ -70,7 +70,7 @@ As with most other configurations, `store_failures_as` is "clobbered" when appli Additional resources: -- [Test configurations](/reference/test-configs#related-documentation) -- [Test-specific configurations](/reference/test-configs#test-specific-configurations) +- [Data test configurations](/reference/data-test-configs#related-documentation) +- [Data test-specific configurations](/reference/data-test-configs#test-data-specific-configurations) - [Configuring directories of models in dbt_project.yml](/reference/model-configs#configuring-directories-of-models-in-dbt_projectyml) - [Config inheritance](/reference/configs-and-properties#config-inheritance) \ No newline at end of file diff --git a/website/docs/reference/resource-properties/columns.md b/website/docs/reference/resource-properties/columns.md index ff8aa8734c6..74727977feb 100644 --- a/website/docs/reference/resource-properties/columns.md +++ b/website/docs/reference/resource-properties/columns.md @@ -28,7 +28,7 @@ models: data_type: [description](/reference/resource-properties/description): [quote](/reference/resource-properties/quote): true | false - [tests](/reference/resource-properties/tests): ... + [tests](/reference/resource-properties/data-tests): ... [tags](/reference/resource-configs/tags): ... [meta](/reference/resource-configs/meta): ... - name: @@ -55,7 +55,7 @@ sources: [description](/reference/resource-properties/description): data_type: [quote](/reference/resource-properties/quote): true | false - [tests](/reference/resource-properties/tests): ... + [tests](/reference/resource-properties/data-tests): ... [tags](/reference/resource-configs/tags): ... [meta](/reference/resource-configs/meta): ... 
- name: @@ -81,7 +81,7 @@ seeds: [description](/reference/resource-properties/description): data_type: [quote](/reference/resource-properties/quote): true | false - [tests](/reference/resource-properties/tests): ... + [tests](/reference/resource-properties/data-tests): ... [tags](/reference/resource-configs/tags): ... [meta](/reference/resource-configs/meta): ... - name: @@ -106,7 +106,7 @@ snapshots: [description](/reference/resource-properties/description): data_type: [quote](/reference/resource-properties/quote): true | false - [tests](/reference/resource-properties/tests): ... + [tests](/reference/resource-properties/data-tests): ... [tags](/reference/resource-configs/tags): ... [meta](/reference/resource-configs/meta): ... - name: diff --git a/website/docs/reference/resource-properties/config.md b/website/docs/reference/resource-properties/config.md index 55d2f64d9ff..89d189d8a78 100644 --- a/website/docs/reference/resource-properties/config.md +++ b/website/docs/reference/resource-properties/config.md @@ -98,7 +98,7 @@ version: 2 - [](#test_name): : config: - [](/reference/test-configs): + [](/reference/data-test-configs): ... ``` diff --git a/website/docs/reference/resource-properties/tests.md b/website/docs/reference/resource-properties/data-tests.md similarity index 83% rename from website/docs/reference/resource-properties/tests.md rename to website/docs/reference/resource-properties/data-tests.md index 0fe86ccc57d..ce557ebeb4f 100644 --- a/website/docs/reference/resource-properties/tests.md +++ b/website/docs/reference/resource-properties/data-tests.md @@ -1,8 +1,8 @@ --- -title: "About tests property" -sidebar_label: "tests" +title: "About data tests property" +sidebar_label: "Data tests" resource_types: all -datatype: test +datatype: data-test keywords: [test, tests, custom tests, custom test name, test name] --- @@ -30,7 +30,7 @@ models: - [](#test_name): : [config](/reference/resource-properties/config): - [](/reference/test-configs): + [](/reference/data-test-configs): [columns](/reference/resource-properties/columns): - name: @@ -39,7 +39,7 @@ models: - [](#test_name): : [config](/reference/resource-properties/config): - [](/reference/test-configs): + [](/reference/data-test-configs): ```
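As a concrete, hedged illustration of the spec above (the model, column, and values are hypothetical), a generic test instance with its own `config` block might look like:

```yaml
version: 2

models:
  - name: orders
    columns:
      - name: status
        tests:
          - accepted_values:
              values: ['placed', 'shipped', 'returned']
              config:
                severity: warn
```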
@@ -62,7 +62,7 @@ sources: - [](#test_name): : [config](/reference/resource-properties/config): - [](/reference/test-configs): + [](/reference/data-test-configs): columns: - name: @@ -71,7 +71,7 @@ sources: - [](#test_name): : [config](/reference/resource-properties/config): - [](/reference/test-configs): + [](/reference/data-test-configs): ``` @@ -93,7 +93,7 @@ seeds: - [](#test_name): : [config](/reference/resource-properties/config): - [](/reference/test-configs): + [](/reference/data-test-configs): columns: - name: @@ -102,7 +102,7 @@ seeds: - [](#test_name): : [config](/reference/resource-properties/config): - [](/reference/test-configs): + [](/reference/data-test-configs): ``` @@ -124,7 +124,7 @@ snapshots: - [](#test_name): : [config](/reference/resource-properties/config): - [](/reference/test-configs): + [](/reference/data-test-configs): columns: - name: @@ -133,7 +133,7 @@ snapshots: - [](#test_name): : [config](/reference/resource-properties/config): - [](/reference/test-configs): + [](/reference/data-test-configs): ``` @@ -152,17 +152,17 @@ This feature is not implemented for analyses. ## Related documentation -* [Testing guide](/docs/build/tests) +* [Data testing guide](/docs/build/data-tests) ## Description -The `tests` property defines assertions about a column, , or . The property contains a list of [generic tests](/docs/build/tests#generic-tests), referenced by name, which can include the four built-in generic tests available in dbt. For example, you can add tests that ensure a column contains no duplicates and zero null values. Any arguments or [configurations](/reference/test-configs) passed to those tests should be nested below the test name. +The data `tests` property defines assertions about a column, , or . The property contains a list of [generic tests](/docs/build/data-tests#generic-data-tests), referenced by name, which can include the four built-in generic tests available in dbt. For example, you can add tests that ensure a column contains no duplicates and zero null values. Any arguments or [configurations](/reference/data-test-configs) passed to those tests should be nested below the test name. Once these tests are defined, you can validate their correctness by running `dbt test`. -## Out-of-the-box tests +## Out-of-the-box data tests -There are four generic tests that are available out of the box, for everyone using dbt. +There are four generic data tests that are available out of the box, for everyone using dbt. ### `not_null` @@ -262,7 +262,7 @@ The `to` argument accepts a [Relation](/reference/dbt-classes#relation) – this ## Additional examples ### Test an expression -Some tests require multiple columns, so it doesn't make sense to nest them under the `columns:` key. In this case, you can apply the test to the model (or source, seed, or snapshot) instead: +Some data tests require multiple columns, so it doesn't make sense to nest them under the `columns:` key. In this case, you can apply the data test to the model (or source, seed, or snapshot) instead: @@ -300,7 +300,7 @@ models: Check out the guide on writing a [custom generic test](/best-practices/writing-custom-generic-tests) for more information. -### Custom test name +### Custom data test name By default, dbt will synthesize a name for your generic test by concatenating: - test name (`not_null`, `unique`, etc) @@ -434,11 +434,11 @@ $ dbt test 12:48:04 Done. 
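For instance, the simplest of these applied to a hypothetical primary key column:

```yaml
version: 2

models:
  - name: customers
    columns:
      - name: customer_id
        tests:
          - not_null
          - unique
```

Running `dbt test` then executes both assertions against the built model.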
PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2 ``` -**If using [`store_failures`](/reference/resource-configs/store_failures):** dbt uses each test's name as the name of the table in which to store any failing records. If you have defined a custom name for one test, that custom name will also be used for its table of failures. You may optionally configure an [`alias`](/reference/resource-configs/alias) for the test, to separately control both the name of the test (for metadata) and the name of its database table (for storing failures). +**If using [`store_failures`](/reference/resource-configs/store_failures):** dbt uses each data test's name as the name of the table in which to store any failing records. If you have defined a custom name for one test, that custom name will also be used for its table of failures. You may optionally configure an [`alias`](/reference/resource-configs/alias) for the test, to separately control both the name of the test (for metadata) and the name of its database table (for storing failures). ### Alternative format for defining tests -When defining a generic test with several arguments and configurations, the YAML can look and feel unwieldy. If you find it easier, you can define the same test properties as top-level keys of a single dictionary, by providing the test name as `test_name` instead. It's totally up to you. +When defining a generic data test with several arguments and configurations, the YAML can look and feel unwieldy. If you find it easier, you can define the same test properties as top-level keys of a single dictionary, by providing the test name as `test_name` instead. It's totally up to you. This example is identical to the one above: diff --git a/website/docs/reference/seed-properties.md b/website/docs/reference/seed-properties.md index 85e7be21ae1..9201df65f4c 100644 --- a/website/docs/reference/seed-properties.md +++ b/website/docs/reference/seed-properties.md @@ -18,7 +18,7 @@ seeds: show: true | false [config](/reference/resource-properties/config): [](/reference/seed-configs): - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - ... # declare additional tests columns: @@ -27,7 +27,7 @@ seeds: [meta](/reference/resource-configs/meta): {} [quote](/reference/resource-properties/quote): true | false [tags](/reference/resource-configs/tags): [] - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - ... # declare additional tests diff --git a/website/docs/reference/snapshot-properties.md b/website/docs/reference/snapshot-properties.md index 301747e9325..8f01fd8e988 100644 --- a/website/docs/reference/snapshot-properties.md +++ b/website/docs/reference/snapshot-properties.md @@ -22,7 +22,7 @@ snapshots: show: true | false [config](/reference/resource-properties/config): [](/reference/snapshot-configs): - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - ... columns: @@ -31,7 +31,7 @@ snapshots: [meta](/reference/resource-configs/meta): {} [quote](/reference/resource-properties/quote): true | false [tags](/reference/resource-configs/tags): [] - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - ... # declare additional tests - ... 
# declare properties of additional columns diff --git a/website/docs/reference/source-properties.md b/website/docs/reference/source-properties.md index d107881967e..aa95a19327c 100644 --- a/website/docs/reference/source-properties.md +++ b/website/docs/reference/source-properties.md @@ -57,7 +57,7 @@ sources: [meta](/reference/resource-configs/meta): {} [identifier](/reference/resource-properties/identifier): [loaded_at_field](/reference/resource-properties/freshness#loaded_at_field): - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - ... # declare additional tests [tags](/reference/resource-configs/tags): [] @@ -80,7 +80,7 @@ sources: [description](/reference/resource-properties/description): [meta](/reference/resource-configs/meta): {} [quote](/reference/resource-properties/quote): true | false - [tests](/reference/resource-properties/tests): + [tests](/reference/resource-properties/data-tests): - - ... # declare additional tests [tags](/reference/resource-configs/tags): [] diff --git a/website/docs/terms/data-wrangling.md b/website/docs/terms/data-wrangling.md index 58034fe8e91..46a14a25949 100644 --- a/website/docs/terms/data-wrangling.md +++ b/website/docs/terms/data-wrangling.md @@ -51,7 +51,7 @@ The cleaning stage involves using different functions so that the values in your - Removing appropriate duplicates or nulls you found in the discovery process - Eliminating unnecessary characters or spaces from values -Certain cleaning steps, like removing rows with null values, are helpful to do at the beginning of the process because removing nulls and duplicates from the start can increase the performance of your downstream models. In the cleaning step, it’s important to follow a standard for your transformations here. This means you should be following a consistent naming convention for your columns (especially for your primary keys) and casting to the same timezone and datatypes throughout your models. Examples include making sure all dates are in UTC time rather than source timezone-specific, all string in either lower or upper case, etc. +Certain cleaning steps, like removing rows with null values, are helpful to do at the beginning of the process because removing nulls and duplicates from the start can increase the performance of your downstream models. In the cleaning step, it’s important to follow a standard for your transformations here. This means you should be following a consistent naming convention for your columns (especially for your primary keys) and casting to the same timezone and datatypes throughout your models. Examples include making sure all dates are in UTC time rather than source timezone-specific, all strings are in either lower or upper case, etc. :::tip dbt to the rescue! If you're struggling to do all the cleaning on your own, remember that dbt packages ([dbt expectations](https://github.com/calogica/dbt-expectations), [dbt_utils](https://hub.getdbt.com/dbt-labs/dbt_utils/latest/), and [re_data](https://www.getre.io/)) and their macros are also available to help you clean up your data. @@ -150,9 +150,9 @@ For nested data types such as JSON, you’ll want to check out the JSON parsing ### Validating -dbt offers [generic tests](/docs/build/tests#more-generic-tests) in every dbt project that allows you to validate accepted, unique, and null values. They also allow you to validate the relationships between tables and that the primary key is unique. 
+dbt offers [generic data tests](/docs/build/data-tests#more-generic-data-tests) in every dbt project that allows you to validate accepted, unique, and null values. They also allow you to validate the relationships between tables and that the primary key is unique. -If you can’t find what you need with the generic tests, you can download an additional dbt testing package called [dbt_expectations](https://hub.getdbt.com/calogica/dbt_expectations/0.1.2/) that dives even deeper into how you can test the values in your columns. This package has useful tests like `expect_column_values_to_be_in_type_list`, `expect_column_values_to_be_between`, and `expect_column_value_lengths_to_equal`. +If you can’t find what you need with the generic tests, you can download an additional dbt testing package called [dbt_expectations](https://hub.getdbt.com/calogica/dbt_expectations/0.1.2/) that dives even deeper into how you can test the values in your columns. This package has useful data tests like `expect_column_values_to_be_in_type_list`, `expect_column_values_to_be_between`, and `expect_column_value_lengths_to_equal`. ## Conclusion diff --git a/website/docs/terms/primary-key.md b/website/docs/terms/primary-key.md index 4acd1e8c46d..fde3ff44ac7 100644 --- a/website/docs/terms/primary-key.md +++ b/website/docs/terms/primary-key.md @@ -108,7 +108,7 @@ In general for Redshift, it’s still good practice to define your primary keys ### Google BigQuery -BigQuery is pretty unique here in that it doesn’t support or enforce primary keys. If your team is on BigQuery, you’ll need to have some [pretty solid testing](/docs/build/tests) in place to ensure your primary key fields are unique and non-null. +BigQuery is pretty unique here in that it doesn’t support or enforce primary keys. If your team is on BigQuery, you’ll need to have some [pretty solid data testing](/docs/build/data-tests) in place to ensure your primary key fields are unique and non-null. ### Databricks @@ -141,7 +141,7 @@ If you don't have a field in your table that would act as a natural primary key, If your data warehouse doesn’t provide out-of-the box support and enforcement for primary keys, it’s important to clearly label and put your own constraints on primary key fields. This could look like: * **Creating a consistent naming convention for your primary keys**: You may see an `id` field or fields prefixed with `pk_` (ex. `pk_order_id`) to identify primary keys. You may also see the primary key be named as the obvious table grain (ex. In the jaffle shop’s `orders` table, the primary key is called `order_id`). -* **Adding automated [tests](/docs/build/tests) to your data models**: Use a data tool, such as dbt, to create not null and unique tests for your primary key fields. +* **Adding automated [data tests](/docs/build/data-tests) to your data models**: Use a data tool, such as dbt, to create not null and unique tests for your primary key fields. ## Testing primary keys diff --git a/website/docs/terms/surrogate-key.md b/website/docs/terms/surrogate-key.md index e57a0b74a7f..1c4d7f21d57 100644 --- a/website/docs/terms/surrogate-key.md +++ b/website/docs/terms/surrogate-key.md @@ -177,7 +177,7 @@ After executing this, the table would now have the `unique_id` field now uniquel Amazing, you just made a surrogate key! You can just move on to the next data model, right? No!! It’s critically important to test your surrogate keys for uniqueness and non-null values to ensure that the correct fields were chosen to create the surrogate key. 
-In order to test for null and unique values you can utilize code-based tests like [dbt tests](/docs/build/tests), that can check fields for nullness and uniqueness. You can additionally utilize simple SQL queries or unit tests to check if surrogate key count and non-nullness is correct. +In order to test for null and unique values you can utilize code-based data tests like [dbt tests](/docs/build/data-tests), that can check fields for nullness and uniqueness. You can additionally utilize simple SQL queries or unit tests to check if surrogate key count and non-nullness is correct. ## A note on hashing algorithms diff --git a/website/docusaurus.config.js b/website/docusaurus.config.js index 13c284dd557..b4b758e7744 100644 --- a/website/docusaurus.config.js +++ b/website/docusaurus.config.js @@ -71,7 +71,7 @@ var siteSettings = { }, announcementBar: { id: "biweekly-demos", - content: "Join our weekly demos and dbt Cloud in action!", + content: "Join our weekly demos and see dbt Cloud in action!", backgroundColor: "#047377", textColor: "#fff", isCloseable: true, diff --git a/website/sidebars.js b/website/sidebars.js index 720b752ed41..8d7be07d491 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -1,6 +1,11 @@ const sidebarSettings = { docs: [ "docs/introduction", + { + type: "link", + label: "Guides", + href: `/guides`, + }, { type: "category", label: "Supported data platforms", @@ -27,12 +32,7 @@ const sidebarSettings = { "docs/cloud/about-cloud/browsers", ], }, // About dbt Cloud directory - { - type: "link", - label: "Guides", - href: `/guides`, - }, - { + { type: "category", label: "Set up dbt", collapsed: true, @@ -54,6 +54,7 @@ const sidebarSettings = { link: { type: "doc", id: "docs/cloud/connect-data-platform/about-connections" }, items: [ "docs/cloud/connect-data-platform/about-connections", + "docs/cloud/connect-data-platform/connect-microsoft-fabric", "docs/cloud/connect-data-platform/connect-starburst-trino", "docs/cloud/connect-data-platform/connect-snowflake", "docs/cloud/connect-data-platform/connect-bigquery", @@ -121,35 +122,6 @@ const sidebarSettings = { }, ], }, // Supported Git providers - { - type: "category", - label: "Develop in dbt Cloud", - link: { type: "doc", id: "docs/cloud/about-cloud-develop" }, - items: [ - "docs/cloud/about-cloud-develop", - "docs/cloud/about-cloud-develop-defer", - { - type: "category", - label: "dbt Cloud CLI", - link: { type: "doc", id: "docs/cloud/cloud-cli-installation" }, - items: [ - "docs/cloud/cloud-cli-installation", - "docs/cloud/configure-cloud-cli", - ], - }, - { - type: "category", - label: "dbt Cloud IDE", - link: { type: "doc", id: "docs/cloud/dbt-cloud-ide/develop-in-the-cloud" }, - items: [ - "docs/cloud/dbt-cloud-ide/develop-in-the-cloud", - "docs/cloud/dbt-cloud-ide/ide-user-interface", - "docs/cloud/dbt-cloud-ide/lint-format", - "docs/cloud/dbt-cloud-ide/dbt-cloud-tips", - ], - }, - ], - }, // dbt Cloud develop directory { type: "category", label: "Secure your tenant", @@ -162,6 +134,7 @@ const sidebarSettings = { "docs/cloud/secure/databricks-privatelink", "docs/cloud/secure/redshift-privatelink", "docs/cloud/secure/postgres-privatelink", + "docs/cloud/secure/vcs-privatelink", "docs/cloud/secure/ip-restrictions", ], }, // PrivateLink @@ -175,14 +148,13 @@ const sidebarSettings = { link: { type: "doc", id: "docs/core/about-core-setup" }, items: [ "docs/core/about-core-setup", - "docs/core/about-dbt-core", "docs/core/dbt-core-environments", { type: "category", - label: "Install dbt", - link: { type: "doc", id: 
"docs/core/installation" }, + label: "Install dbt Core", + link: { type: "doc", id: "docs/core/installation-overview", }, items: [ - "docs/core/installation", + "docs/core/installation-overview", "docs/core/homebrew-install", "docs/core/pip-install", "docs/core/docker-install", @@ -249,6 +221,37 @@ const sidebarSettings = { "docs/running-a-dbt-project/using-threads", ], }, + { + type: "category", + label: "Develop with dbt Cloud", + collapsed: true, + link: { type: "doc", id: "docs/cloud/about-develop-dbt" }, + items: [ + "docs/cloud/about-develop-dbt", + "docs/cloud/about-cloud-develop-defer", + { + type: "category", + label: "dbt Cloud CLI", + collapsed: true, + link: { type: "doc", id: "docs/cloud/cloud-cli-installation" }, + items: [ + "docs/cloud/cloud-cli-installation", + "docs/cloud/configure-cloud-cli", + ], + }, + { + type: "category", + label: "dbt Cloud IDE", + link: { type: "doc", id: "docs/cloud/dbt-cloud-ide/develop-in-the-cloud" }, + items: [ + "docs/cloud/dbt-cloud-ide/develop-in-the-cloud", + "docs/cloud/dbt-cloud-ide/ide-user-interface", + "docs/cloud/dbt-cloud-ide/lint-format", + "docs/cloud/dbt-cloud-ide/dbt-cloud-tips", + ], + }, + ], + }, { type: "category", label: "Build dbt projects", @@ -274,7 +277,7 @@ const sidebarSettings = { }, "docs/build/snapshots", "docs/build/seeds", - "docs/build/tests", + "docs/build/data-tests", "docs/build/jinja-macros", "docs/build/sources", "docs/build/exposures", @@ -415,7 +418,17 @@ const sidebarSettings = { link: { type: "doc", id: "docs/collaborate/collaborate-with-others" }, items: [ "docs/collaborate/collaborate-with-others", - "docs/collaborate/explore-projects", + { + type: "category", + label: "Explore dbt projects", + link: { type: "doc", id: "docs/collaborate/explore-projects" }, + items: [ + "docs/collaborate/explore-projects", + "docs/collaborate/model-performance", + "docs/collaborate/project-recommendations", + "docs/collaborate/explore-multiple-projects", + ], + }, { type: "category", label: "Git version control", @@ -710,6 +723,7 @@ const sidebarSettings = { "reference/resource-configs/oracle-configs", "reference/resource-configs/upsolver-configs", "reference/resource-configs/starrocks-configs", + "reference/resource-configs/infer-configs", ], }, { @@ -730,7 +744,7 @@ const sidebarSettings = { "reference/resource-properties/latest_version", "reference/resource-properties/include-exclude", "reference/resource-properties/quote", - "reference/resource-properties/tests", + "reference/resource-properties/data-tests", "reference/resource-properties/versions", ], }, @@ -796,7 +810,7 @@ const sidebarSettings = { type: "category", label: "For tests", items: [ - "reference/test-configs", + "reference/data-test-configs", "reference/resource-configs/fail_calc", "reference/resource-configs/limit", "reference/resource-configs/severity", @@ -956,11 +970,11 @@ const sidebarSettings = { type: "category", label: "Database Permissions", items: [ - "reference/database-permissions/about-database-permissions", + "reference/database-permissions/about-database-permissions", "reference/database-permissions/databricks-permissions", "reference/database-permissions/postgres-permissions", - "reference/database-permissions/redshift-permissions", - "reference/database-permissions/snowflake-permissions", + "reference/database-permissions/redshift-permissions", + "reference/database-permissions/snowflake-permissions", ], }, ], @@ -1050,6 +1064,7 @@ const sidebarSettings = { "best-practices/materializations/materializations-guide-7-conclusion", ], }, 
+ "best-practices/clone-incremental-models", "best-practices/writing-custom-generic-tests", "best-practices/best-practice-workflows", "best-practices/dbt-unity-catalog-best-practices", diff --git a/website/snippets/_adapters-verified.md b/website/snippets/_adapters-verified.md index b9a71c67c36..c3607b50125 100644 --- a/website/snippets/_adapters-verified.md +++ b/website/snippets/_adapters-verified.md @@ -46,7 +46,7 @@ + +:::note + +This feature is only available on the dbt Cloud Enterprise plan. + +::: + ### Custom branch behavior By default, all environments will use the default branch in your repository (usually the `main` branch) when accessing your dbt code. This is overridable within each dbt Cloud Environment using the **Default to a custom branch** option. This setting have will have slightly different behavior depending on the environment type: @@ -44,7 +62,7 @@ By default, all environments will use the default branch in your repository (usu For more info, check out this [FAQ page on this topic](/faqs/Environments/custom-branch-settings)! -### Extended attributes +### Extended attributes :::note Extended attributes are retrieved and applied only at runtime when `profiles.yml` is requested for a specific Cloud run. Extended attributes are currently _not_ taken into consideration for Cloud-specific features such as PrivateLink or SSH Tunneling that do not rely on `profiles.yml` values. diff --git a/website/snippets/_new-sl-setup.md b/website/snippets/_new-sl-setup.md index 3cb6e09eb4c..18e75c3278d 100644 --- a/website/snippets/_new-sl-setup.md +++ b/website/snippets/_new-sl-setup.md @@ -1,6 +1,7 @@ You can set up the dbt Semantic Layer in dbt Cloud at the environment and project level. Before you begin: -- You must have a dbt Cloud Team or Enterprise [multi-tenant](/docs/cloud/about-cloud/regions-ip-addresses) deployment. Single-tenant coming soon. +- You must have a dbt Cloud Team or Enterprise account. Suitable for both Multi-tenant and Single-tenant deployment. + - Single-tenant accounts should contact their account representative for necessary setup and enablement. - You must be part of the Owner group, and have the correct [license](/docs/cloud/manage-access/seats-and-users) and [permissions](/docs/cloud/manage-access/self-service-permissions) to configure the Semantic Layer: * Enterprise plan — Developer license with Account Admin permissions. Or Owner with a Developer license, assigned Project Creator, Database Admin, or Admin permissions. * Team plan — Owner with a Developer license. diff --git a/website/snippets/_packages_or_dependencies.md b/website/snippets/_packages_or_dependencies.md new file mode 100644 index 00000000000..5cc4c67e63c --- /dev/null +++ b/website/snippets/_packages_or_dependencies.md @@ -0,0 +1,34 @@ + +## Use cases + +Starting from dbt v1.6, `dependencies.yml` has replaced `packages.yml`. The `dependencies.yml` file can now contain both types of dependencies: "package" and "project" dependencies. +- ["Package" dependencies](/docs/build/packages) lets you add source code from someone else's dbt project into your own, like a library. +- ["Project" dependencies](/docs/collaborate/govern/project-dependencies) provide a different way to build on top of someone else's work in dbt. + +If your dbt project doesn't require the use of Jinja within the package specifications, you can simply rename your existing `packages.yml` to `dependencies.yml`. 
However, something to note is if your project's package specifications use Jinja, particularly for scenarios like adding an environment variable or a [Git token method](/docs/build/packages#git-token-method) in a private Git package specification, you should continue using the `packages.yml` file name. + +There are some important differences between Package dependencies and Project dependencies: + + + + +Project dependencies are designed for the [dbt Mesh](/best-practices/how-we-mesh/mesh-1-intro) and [cross-project reference](/docs/collaborate/govern/project-dependencies#how-to-use-ref) workflow: + +- Use `dependencies.yml` when you need to set up cross-project references between different dbt projects, especially in a dbt Mesh setup. +- Use `dependencies.yml` when you want to include both projects and non-private dbt packages in your project's dependencies. + - Private packages are not supported in `dependencies.yml` because they intentionally don't support Jinja rendering or conditional configuration. This is to maintain static and predictable configuration and ensures compatibility with other services, like dbt Cloud. +- Use `dependencies.yml` for organization and maintainability. It can help maintain your project's organization by allowing you to specify [dbt Hub packages](https://hub.getdbt.com/) like `dbt_utils`. This reduces the need for multiple YAML files to manage dependencies. + + + + + +Package dependencies allow you to add source code from someone else's dbt project into your own, like a library: + +- Use `packages.yml` when you want to download dbt packages, such as dbt projects, into your root or parent dbt project. Something to note is that it doesn't contribute to the dbt Mesh workflow. +- Use `packages.yml` to include packages, including private packages, in your project's dependencies. If you have private packages that you need to reference, `packages.yml` is the way to go. +- `packages.yml` supports Jinja rendering for historical reasons, allowing dynamic configurations. This can be useful if you need to insert values, like a [Git token method](/docs/build/packages#git-token-method) from an environment variable, into your package specifications. + +Currently, to use private git repositories in dbt, you need to use a workaround that involves embedding a git token with Jinja. This is not ideal as it requires extra steps like creating a user and sharing a git token. We're planning to introduce a simpler method soon that won't require Jinja-embedded secret environment variables. For that reason, `dependencies.yml` does not support Jinja. + + diff --git a/website/snippets/_run-result.md b/website/snippets/_run-result.md index 77a35676e86..28de3a97cb6 100644 --- a/website/snippets/_run-result.md +++ b/website/snippets/_run-result.md @@ -1,2 +1,2 @@ -- `adapter_response`: Dictionary of metadata returned from the database, which varies by adapter. For example, success `code`, number of `rows_affected`, total `bytes_processed`, and so on. Not applicable for [tests](/docs/build/tests). +- `adapter_response`: Dictionary of metadata returned from the database, which varies by adapter. For example, success `code`, number of `rows_affected`, total `bytes_processed`, and so on. Not applicable for [tests](/docs/build/data-tests). * `rows_affected` returns the number of rows modified by the last statement executed. 
In cases where the query's row count can't be determined or isn't applicable (such as when creating a view), a [standard value](https://peps.python.org/pep-0249/#rowcount) of `-1` is returned for `rowcount`. diff --git a/website/snippets/_sl-connect-and-query-api.md b/website/snippets/_sl-connect-and-query-api.md index 429f41c3bf6..f7f1d2add24 100644 --- a/website/snippets/_sl-connect-and-query-api.md +++ b/website/snippets/_sl-connect-and-query-api.md @@ -1,10 +1,8 @@ You can query your metrics in a JDBC-enabled tool or use existing first-class integrations with the dbt Semantic Layer. -You must have a dbt Cloud Team or Enterprise [multi-tenant](/docs/cloud/about-cloud/regions-ip-addresses) deployment. Single-tenant coming soon. - +- You must have a dbt Cloud Team or Enterprise account. Suitable for both Multi-tenant and Single-tenant deployment. + - Single-tenant accounts should contact their account representative for necessary setup and enablement. - To learn how to use the JDBC or GraphQL API and what tools you can query it with, refer to [dbt Semantic Layer APIs](/docs/dbt-cloud-apis/sl-api-overview). - * To authenticate, you need to [generate a service token](/docs/dbt-cloud-apis/service-tokens) with Semantic Layer Only and Metadata Only permissions. * Refer to the [SQL query syntax](/docs/dbt-cloud-apis/sl-jdbc#querying-the-api-for-metric-metadata) to query metrics using the API. - - To learn more about the sophisticated integrations that connect to the dbt Semantic Layer, refer to [Available integrations](/docs/use-dbt-semantic-layer/avail-sl-integrations) for more info. diff --git a/website/snippets/_sl-faqs.md b/website/snippets/_sl-faqs.md new file mode 100644 index 00000000000..def8f3837f6 --- /dev/null +++ b/website/snippets/_sl-faqs.md @@ -0,0 +1,33 @@ +- **Is the dbt Semantic Layer open source?** + - The dbt Semantic Layer is proprietary; however, some components of the dbt Semantic Layer are open source, such as dbt-core and MetricFlow. + + dbt Cloud Developer or dbt Core users can define metrics in their project, including a local dbt Core project, using the dbt Cloud IDE, dbt Cloud CLI, or dbt Core CLI. However, to experience the universal dbt Semantic Layer and access those metrics using the API or downstream tools, users must be on a dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) plan. + + Refer to [Billing](https://docs.getdbt.com/docs/cloud/billing) for more information. + +- **How can open-source users use the dbt Semantic Layer?** + - The dbt Semantic Layer requires the use of the dbt Cloud-provided service for coordinating query requests. Open source users who don’t use dbt Cloud can currently work around the lack of a service layer. They can do this by running `mf query --explain` in the command line. This command generates SQL code, which they can then use in their current systems for running and managing queries. + + As we refine MetricFlow’s API layers, some users may find it easier to set up their own custom service layers for managing query requests. This is not currently recommended, as the API boundaries around MetricFlow are not sufficiently well-defined for broad-based community use + +- **Why is my query limited to 100 rows in the dbt Cloud CLI?** +- The default `limit` for query issues from the dbt Cloud CLI is 100 rows. 
We set this default to prevent returning unnecessarily large data sets as the dbt Cloud CLI is typically used to query the dbt Semantic Layer during the development process, not for production reporting or to access large data sets. For most workflows, you only need to return a subset of the data. + + However, you can change this limit if needed by setting the `--limit` option in your query. For example, to return 1000 rows, you can run `dbt sl list metrics --limit 1000`. + +- **Can I reference MetricFlow queries inside dbt models?** + - dbt relies on Jinja macros to compile SQL, while MetricFlow is Python-based and does direct SQL rendering targeting at a specific dialect. MetricFlow does not support pass-through rendering of Jinja macros, so we can’t easily reference MetricFlow queries inside of dbt models. + + Beyond the technical challenges that could be overcome, we see Metrics as the leaf node of your DAG, and a place for users to consume metrics. If you need to do additional transformation on top of a metric, this is usually a sign that there is more modeling that needs to be done. + +- **Can I create tables in my data platform using MetricFlow?** + - You can use the upcoming feature, Exports, which will allow you to create a [pre-defined](/docs/build/saved-queries) MetricFlow query as a table in your data platform. This feature will be available to dbt Cloud customers only. This is because MetricFlow is primarily for query rendering while dispatching the relevant query and performing any DDL is the domain of the service layer on top of MetricFlow. + +- **How do I migrate from the legacy Semantic Layer to the new one?** + - If you're using the legacy Semantic Layer, we highly recommend you [upgrade your dbt version](/docs/dbt-versions/upgrade-core-in-cloud) to dbt v1.6 or higher to use the new dbt Semantic Layer. Refer to the dedicated [migration guide](/guides/sl-migration) for more info. + +- **How are you storing my data?** + - User data passes through the Semantic Layer on its way back from the warehouse. dbt Labs ensures security by authenticating through the customer's data warehouse. Currently, we don't cache data for the long term, but it might temporarily stay in the system for up to 10 minutes, usually less. In the future, we'll introduce a caching feature that allows us to cache data on our infrastructure for up to 24 hours. + +- **Is there a dbt Semantic Layer discussion hub?** + - Yes absolutely! Join the [dbt Slack community](https://getdbt.slack.com) and [#dbt-cloud-semantic-layer slack channel](https://getdbt.slack.com/archives/C046L0VTVR6) for all things related to the dbt Semantic Layer. diff --git a/website/snippets/_sl-plan-info.md b/website/snippets/_sl-plan-info.md index 083ab2209bc..fe4e6024226 100644 --- a/website/snippets/_sl-plan-info.md +++ b/website/snippets/_sl-plan-info.md @@ -1,2 +1,2 @@ -To define and query metrics with the {props.product}, you must be on a {props.plan} multi-tenant plan .


+To define and query metrics with the {props.product}, you must be on a {props.plan} account. This applies to both Multi-tenant and Single-tenant accounts. Note: Single-tenant accounts should contact their account representative for the necessary setup and enablement.
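As a reviewer's aid for the new `_packages_or_dependencies.md` snippet added above: a minimal sketch of the two file shapes it contrasts. This is illustrative only; the package name, version, repository URL, and upstream project name are placeholders rather than anything defined in this PR.

```yaml
# packages.yml -- kept when package specifications need Jinja,
# for example the Git token method for a private package.
# Package name, version, and repository URL are placeholders.
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
  - git: "https://{{ env_var('DBT_ENV_SECRET_GIT_CREDENTIAL') }}@github.com/example-org/private_package.git"
    revision: main
```

```yaml
# dependencies.yml -- no Jinja rendering; can mix dbt Hub packages
# with cross-project (dbt Mesh) dependencies. Project name is a placeholder.
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
projects:
  - name: jaffle_finance
```

If a project needs nothing like the `env_var` call above, the snippet's guidance is that renaming `packages.yml` to `dependencies.yml` is sufficient; otherwise it should stay on `packages.yml`.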

diff --git a/website/snippets/_v2-sl-prerequisites.md b/website/snippets/_v2-sl-prerequisites.md index c80db4d1c8f..eb8b5fc27e4 100644 --- a/website/snippets/_v2-sl-prerequisites.md +++ b/website/snippets/_v2-sl-prerequisites.md @@ -1,15 +1,16 @@ -- Have a dbt Cloud Team or Enterprise [multi-tenant](/docs/cloud/about-cloud/regions-ip-addresses) deployment. Single-tenant coming soon. -- Have both your production and development environments running dbt version 1.6 or higher. Refer to [upgrade in dbt Cloud](/docs/dbt-versions/upgrade-core-in-cloud) for more info. +- Have a dbt Cloud Team or Enterprise account. Suitable for both Multi-tenant and Single-tenant deployment. + - Note: Single-tenant accounts should contact their account representative for necessary setup and enablement. +- Have both your production and development environments running [dbt version 1.6 or higher](/docs/dbt-versions/upgrade-core-in-cloud). - Use Snowflake, BigQuery, Databricks, or Redshift. - Create a successful run in the environment where you configure the Semantic Layer. - **Note:** Semantic Layer currently supports the Deployment environment for querying. (_development querying experience coming soon_) - Set up the [Semantic Layer API](/docs/dbt-cloud-apis/sl-api-overview) in the integrated tool to import metric definitions. - - To access the API and query metrics in downstream tools, you must have a dbt Cloud [Team or Enterprise](https://www.getdbt.com/pricing/) account. dbt Core or Developer accounts can define metrics but won't be able to dynamically query them.
+ - dbt Core or Developer accounts can define metrics but won't be able to dynamically query them.
- Understand [MetricFlow's](/docs/build/about-metricflow) key concepts, which powers the latest dbt Semantic Layer.
-- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections, [PrivateLink](/docs/cloud/secure/about-privatelink), and [Single sign-on (SSO)](/docs/cloud/manage-access/sso-overview) isn't supported yet.
+- Note that SSH tunneling for [Postgres and Redshift](/docs/cloud/connect-data-platform/connect-redshift-postgresql-alloydb) connections, [PrivateLink](/docs/cloud/secure/about-privatelink), and [Single sign-on (SSO)](/docs/cloud/manage-access/sso-overview) aren't supported with the dbt Semantic Layer yet.
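Many of the pages touched in this diff now point at `unique` and `not_null` data tests for primary and surrogate keys. A minimal sketch of that pattern, using a hypothetical model and column name that are not part of this PR:

```yaml
# models/staging/schema.yml -- model and column names are illustrative placeholders
models:
  - name: stg_orders
    columns:
      - name: order_id
        tests:
          - unique      # no duplicate primary key values
          - not_null    # no missing primary key values
```

Running `dbt test` against a project containing this block checks both assertions and reports a pass or fail per test.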
diff --git a/website/snippets/core-version-support.md b/website/snippets/core-version-support.md index ff9fa94ff8c..4ec976d4df6 100644 --- a/website/snippets/core-version-support.md +++ b/website/snippets/core-version-support.md @@ -2,4 +2,4 @@ - **[Active](/docs/dbt-versions/core#ongoing-patches)** — We will patch regressions, new bugs, and include fixes for older bugs / quality-of-life improvements. We implement these changes when we have high confidence that they're narrowly scoped and won't cause unintended side effects. - **[Critical](/docs/dbt-versions/core#ongoing-patches)** — Newer minor versions transition the previous minor version into "Critical Support" with limited "security" releases for critical security and installation fixes. - **[End of Life](/docs/dbt-versions/core#eol-version-support)** — Minor versions that have reached EOL no longer receive new patch releases. -- **Deprecated** — dbt-core versions older than v1.0 are no longer maintained by dbt Labs, nor supported in dbt Cloud. +- **Deprecated** — dbt Core versions older than v1.0 are no longer maintained by dbt Labs, nor supported in dbt Cloud. diff --git a/website/snippets/core-versions-table.md b/website/snippets/core-versions-table.md index f1241d8301b..fc7b054bc0a 100644 --- a/website/snippets/core-versions-table.md +++ b/website/snippets/core-versions-table.md @@ -1,14 +1,15 @@ -### Latest Releases +### Latest releases -| dbt Core | Initial Release | Support Level | Critical Support Until | -|------------------------------------------------------------|-----------------|----------------|-------------------------| -| [**v1.7**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.7) | Nov 2, 2023 | Active | Nov 1, 2024 | -| [**v1.6**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.6) | Jul 31, 2023 | Critical | Jul 30, 2024 | -| [**v1.5**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.5) | Apr 27, 2023 | Critical | Apr 27, 2024 | -| [**v1.4**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.4) | Jan 25, 2023 | Critical | Jan 25, 2024 | -| [**v1.3**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.3) | Oct 12, 2022 | End of Life* ⚠️ | Oct 12, 2023 | -| [**v1.2**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.2) | Jul 26, 2022 | End of Life* ⚠️ | Jul 26, 2023 | -| [**v1.1**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.1) ⚠️ | Apr 28, 2022 | Deprecated ⛔️ | Deprecated ⛔️ | -| [**v1.0**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.0) ⚠️ | Dec 3, 2021 | Deprecated ⛔️ | Deprecated ⛔️ | +| dbt Core | Initial release | Support level and end date | +|:----------------------------------------------------:|:---------------:|:-------------------------------------:| +| [**v1.7**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.7) | Nov 2, 2023 | Active — Nov 1, 2024 | +| [**v1.6**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.6) | Jul 31, 2023 | Critical — Jul 30, 2024 | +| [**v1.5**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.5) | Apr 27, 2023 | Critical — Apr 27, 2024 | +| [**v1.4**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.4) | Jan 25, 2023 | Critical — Jan 25, 2024 | +| [**v1.3**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.3) | Oct 12, 2022 | End of Life* ⚠️ | +| [**v1.2**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.2) | Jul 26, 2022 | End of Life* ⚠️ | +| [**v1.1**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.1) | Apr 28, 2022 | End of Life* ⚠️ | +| [**v1.0**](/docs/dbt-versions/core-upgrade/upgrading-to-v1.0) | Dec 3, 2021 | End of Life* ⚠️ | | **v0.X** ⛔️ | (Various dates) | 
Deprecated ⛔️ | Deprecated ⛔️ | _*All versions of dbt Core since v1.0 are available in dbt Cloud until further notice. Versions that are EOL do not receive any fixes. For the best support, we recommend upgrading to a version released within the past 12 months._ + diff --git a/website/snippets/tutorial-add-tests-to-models.md b/website/snippets/tutorial-add-tests-to-models.md index 491fc72ba85..f743c2bf947 100644 --- a/website/snippets/tutorial-add-tests-to-models.md +++ b/website/snippets/tutorial-add-tests-to-models.md @@ -1,4 +1,4 @@ -Adding [tests](/docs/build/tests) to a project helps validate that your models are working correctly. +Adding [tests](/docs/build/data-tests) to a project helps validate that your models are working correctly. To add tests to your project: diff --git a/website/src/components/communitySpotlightCard/index.js b/website/src/components/communitySpotlightCard/index.js index 08707a93dd4..122edee8f06 100644 --- a/website/src/components/communitySpotlightCard/index.js +++ b/website/src/components/communitySpotlightCard/index.js @@ -1,5 +1,6 @@ import React from 'react' import Link from '@docusaurus/Link'; +import Head from "@docusaurus/Head"; import styles from './styles.module.css'; import imageCacheWrapper from '../../../functions/image-cache-wrapper'; @@ -47,24 +48,45 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) { jobTitle, companyName, organization, - socialLinks + socialLinks, + communityAward } = frontMatter - return ( - + // Get meta description text + const metaDescription = stripHtml(description) + + return ( + + {isSpotlightMember && metaDescription ? ( + + + + + ) : null} + {communityAward ? ( +
+ Community Award Recipient +
+ ) : null} {image && (
{id && isSpotlightMember ? ( - {title} + {title} ) : ( - - {title} + + {title} )}
@@ -72,19 +94,26 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) {
{!isSpotlightMember && id ? (

- {title} + + {title} +

- ) : ( + ) : (

{title}

)} - {pronouns &&
{pronouns}
} - + {pronouns && ( +
{pronouns}
+ )} + {isSpotlightMember && (
{(jobTitle || companyName) && (
{jobTitle && jobTitle} - {jobTitle && companyName && ', '} + {jobTitle && companyName && ", "} {companyName && companyName}
)} @@ -101,7 +130,10 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) {
)} {description && !isSpotlightMember && ( -

+

)} {socialLinks && isSpotlightMember && socialLinks?.length > 0 && (

@@ -109,8 +141,15 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) { <> {item?.name && item?.link && ( <> - {i !== 0 && ' | '} - {item.name} + {i !== 0 && " | "} + + {item.name} + )} @@ -118,29 +157,33 @@ function CommunitySpotlightCard({ frontMatter, isSpotlightMember = false }) {
)} {id && !isSpotlightMember && ( - Read More + > + Read More + )}
{description && isSpotlightMember && (

About

-

- +

)}
- ) + ); } -// Truncate text +// Truncate description text for community member cards function truncateText(str) { // Max length of string let maxLength = 300 - // Check if anchor link starts within first 300 characters + // Check if anchor link starts within maxLength let hasLinks = false if(str.substring(0, maxLength - 3).match(/(?:]+)>)/gi, "") + + // Strip new lines and return 130 character substring for description + const updatedDesc = strippedHtml + ?.substring(0, maxLength) + ?.replace(/(\r\n|\r|\n)/g, ""); + + return desc?.length > maxLength ? `${updatedDesc}...` : updatedDesc +} + export default CommunitySpotlightCard diff --git a/website/src/components/communitySpotlightCard/styles.module.css b/website/src/components/communitySpotlightCard/styles.module.css index 253a561ebea..5df85c8a4cc 100644 --- a/website/src/components/communitySpotlightCard/styles.module.css +++ b/website/src/components/communitySpotlightCard/styles.module.css @@ -15,6 +15,19 @@ header.spotlightMemberCard { div.spotlightMemberCard { margin-bottom: 2.5rem; } +.spotlightMemberCard .awardBadge { + flex: 0 0 100%; + margin-bottom: .5rem; +} +.spotlightMemberCard .awardBadge span { + max-width: fit-content; + color: #fff; + background: var(--ifm-color-primary); + display: block; + border-radius: 1rem; + padding: 5px 10px; + font-size: .7rem; +} .spotlightMemberCard .spotlightMemberImgContainer { flex: 0 0 100%; } @@ -81,6 +94,9 @@ div.spotlightMemberCard { margin-bottom: 0; padding-left: 0; } + .spotlightMemberCard .awardBadge span { + font-size: .8rem; + } .spotlightMemberCard .spotlightMemberImgContainer { flex: 0 0 346px; margin-right: 2rem; @@ -100,7 +116,3 @@ div.spotlightMemberCard { line-height: 2rem; } } - - - - diff --git a/website/src/components/communitySpotlightList/index.js b/website/src/components/communitySpotlightList/index.js index 6885f5ff2ac..ed0dbf6d653 100644 --- a/website/src/components/communitySpotlightList/index.js +++ b/website/src/components/communitySpotlightList/index.js @@ -36,6 +36,7 @@ function CommunitySpotlightList({ spotlightData }) { {metaTitle} +