Merge branch 'current' into dbt-teradata-1.8.x
runleonarun authored Jun 20, 2024
2 parents 97c6bfc + 5462ebb commit 60822d0
Showing 4 changed files with 35 additions and 4 deletions.
24 changes: 24 additions & 0 deletions contributing/content-types.md
@@ -9,6 +9,7 @@ These content types can all form articles. Some content types can form sections of other types.
* [Procedural](#procedural)
* [Guide](#guide)
* [Quickstart](#quickstart-guide)
* [Cookbook recipes](#cookbook-recipes)


## Conceptual
@@ -165,3 +166,26 @@ Quickstart guides are generally more conversational in tone than our other documentation

Examples
TBD

## Cookbook recipes
The dbt Cookbook recipes are a collection of scenario-based, real-world examples for building with dbt. Each recipe offers a practical, hands-on example of using a specific feature.

Code examples could be written in SQL or [Python](/docs/build/python-models), though most will be in SQL.

If there are examples or guides you'd like to see, feel free to suggest them on the [documentation issues page](https://github.com/dbt-labs/docs.getdbt.com/issues/new/choose). We're also happy to accept high-quality pull requests, as long as they fit the scope of the cookbook.

### Contents of a cookbook recipe article or header

Cookbook recipes should contain real-life scenarios with objectives, prerequisites, detailed steps, code snippets, and outcomes, giving users a dedicated section for implementing solutions based on their needs. Cookbook recipes complement the existing guides by providing hands-on, actionable instructions and code.

Each cookbook recipe should include objectives, a clear use case, prerequisites, step-by-step instructions, code snippets, expected output, and troubleshooting tips.

### Titles for cookbook recipe content

Cookbook recipe headers should always start with "How to create [topic]" or "How to [verb] [topic]".

### Examples of cookbook recipe content

- How to calculate annual recurring revenue (ARR) using metrics in dbt
- How to calculate customer acquisition cost (CAC) using metrics in dbt
- How to track the total number of sale transactions using metrics in dbt
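
For illustration, here's a minimal sketch of the kind of SQL snippet a recipe such as the first example might include. The model and column names are hypothetical placeholders, not part of an existing project:

```sql
-- Hypothetical recipe snippet: roll monthly recurring revenue up to ARR per customer.
-- Model and column names are placeholders; adapt them to your own project.
select
    customer_id,
    sum(monthly_recurring_revenue) * 12 as annual_recurring_revenue
from {{ ref('fct_subscriptions') }}
where is_active
group by customer_id
```
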
3 changes: 2 additions & 1 deletion website/docs/docs/build/incremental-models.md
@@ -70,7 +70,8 @@ from {{ ref('app_data_events') }}

-- this filter will only be applied on an incremental run
-- (uses >= to include records whose timestamp occurred since the last run of this model)
where event_time >= (select coalesce(max(event_time), '1900-01-01') from {{ this }})
-- (If max(event_time) is NULL, for example because the table is empty or was truncated, coalesce falls back to '1900-01-01' and all records are loaded)
where event_time >= (select coalesce(max(event_time), '1900-01-01'::TIMESTAMP) from {{ this }})

{% endif %}
```
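
For context, this filter only makes sense inside an incremental model. A minimal sketch of the surrounding model might look like the following; the `config` block and `select` statement are assumptions based on the excerpt rather than part of this change:

```sql
{{
    config(
        materialized='incremental'
    )
}}

select * from {{ ref('app_data_events') }}

{% if is_incremental() %}

-- only applied on an incremental run: load rows newer than the latest
-- event_time already present in this model's table
where event_time >= (select coalesce(max(event_time), '1900-01-01'::TIMESTAMP) from {{ this }})

{% endif %}
```
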
10 changes: 8 additions & 2 deletions website/docs/docs/core/connect-data-platform/spark-setup.md
@@ -48,7 +48,7 @@ $ python -m pip install "dbt-spark[session]"

<p>For further info, refer to the GitHub repository: <a href={`https://github.com/${frontMatter.meta.github_repo}`}>{frontMatter.meta.github_repo}</a></p>

## Connection Methods
## Connection methods

dbt-spark can connect to Spark clusters by four different methods:

@@ -211,7 +211,7 @@ Spark can be customized using [Application Properties](https://spark.apache.org/
### Usage with EMR
To connect to Apache Spark running on an Amazon EMR cluster, you will need to run `sudo /usr/lib/spark/sbin/start-thriftserver.sh` on the master node of the cluster to start the Thrift server (see [the docs](https://aws.amazon.com/premiumsupport/knowledge-center/jdbc-connection-emr/) for more information). You will also need to connect to port 10001, which will connect to the Spark backend Thrift server; port 10000 will instead connect to a Hive backend, which will not work correctly with dbt.

### Supported Functionality
### Supported functionality

Most dbt Core functionality is supported, but some features are only available
on Delta Lake (Databricks).
@@ -220,3 +220,9 @@ Delta-only features:
1. Incremental model updates by `unique_key` instead of `partition_by` (see [`merge` strategy](/reference/resource-configs/spark-configs#the-merge-strategy))
2. [Snapshots](/docs/build/snapshots)
3. [Persisting](/reference/resource-configs/persist_docs) column-level descriptions as database comments

### Default namespace with Thrift connection method

If your Spark cluster doesn't have a default namespace, metadata queries that run before any dbt workflow will fail, causing the entire workflow to fail even if your configurations are correct. The metadata queries fail because there's no default namespace in which to run them.

To debug, review the debug-level logs to confirm which query dbt is running when it encounters the error: run `dbt run --debug` or check `logs/dbt.log`.
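
As a sketch rather than an official troubleshooting step, one way to check for this condition is to query the cluster directly (for example, through `spark-sql` or `beeline`) before running dbt:

```sql
-- List the namespaces the cluster exposes. If none exist, dbt's metadata
-- queries have nowhere to run and fail as described above.
SHOW DATABASES;

-- Creating a namespace (the name here is a placeholder) and pointing your
-- profile's schema at it is one possible way to avoid the failure.
CREATE DATABASE IF NOT EXISTS analytics;
```
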
2 changes: 1 addition & 1 deletion website/docs/docs/deploy/deploy-jobs.md
@@ -10,7 +10,7 @@ You can use deploy jobs to build production data assets. Deploy jobs make it eas
- Commit SHA
- Environment name
- Sources and documentation info, if applicable
- Job run details, including run timing, [model timing data](#model-timing), and [artifacts](/docs/deploy/artifacts)
- Job run details, including run timing, [model timing data](/docs/deploy/run-visibility#model-timing), and [artifacts](/docs/deploy/artifacts)
- Detailed run steps with logs and their run step statuses

You can create a deploy job and configure it to run on [scheduled days and times](#schedule-days) or enter a [custom cron schedule](#cron-schedule).
