[Core] microbatch, running batches in parallel #6550

Closed
graciegoheen opened this issue Nov 26, 2024 · 7 comments
Labels
content (Improvements or additions to content), dbt Core (The changes proposed in this issue relate to dbt Core), dbt-core v1.9


@graciegoheen
Collaborator

Link to the page(s) on docs.getdbt.com requiring updates

Tell us more about this update

  • We need docs for running microbatch batches in parallel:
    • when you would want to run batches in parallel and when you would not
    • the new config for running batches in parallel, concurrent_batches (the default is based on usage of {{ this }} and whether or not the adapter supports it)
    • which adapters support it
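
The default described in the concurrent_batches bullet can be sketched roughly as follows. This is only an illustration, not dbt Core's implementation: the function name, its parameters, and the regex are hypothetical, and it assumes that a reference to {{ this }} makes batches unsafe to run in parallel and that adapter support is checked before anything else, which the bullet implies but does not spell out.

```python
import re
from typing import Optional

# Hypothetical pattern: matches a Jinja reference to `this`, such as {{ this }}.
THIS_REF = re.compile(r"\{\{\s*this\b")


def resolve_concurrent_batches(
    explicit: Optional[bool],
    model_sql: str,
    adapter_supports_concurrency: bool,
) -> bool:
    """Decide whether a microbatch model's batches may run in parallel."""
    # Assumption: an adapter without support can never run batches in parallel.
    if not adapter_supports_concurrency:
        return False
    # An explicit concurrent_batches setting overrides auto-detection.
    if explicit is not None:
        return explicit
    # Assumed default: parallel only when the model does not reference {{ this }}.
    return THIS_REF.search(model_sql) is None
```

Under these assumptions, a model that selects from {{ this }} would fall back to sequential batches unless concurrent_batches is set to true explicitly.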

Core PR here - dbt-labs/dbt-core#10958
Adapter PRs to come

Reviewers/Stakeholders/SMEs

Grace, Doug, Quigley, Michelle

Related GitHub issues

No response

Additional information

No response

graciegoheen added the content (Improvements or additions to content) and dbt Core (The changes proposed in this issue relate to dbt Core) labels Nov 26, 2024
@mirnawong1
Contributor

mirnawong1 commented Nov 27, 2024

Once documented, link to the new section from line 35 (see #6544 (comment)). Specifically, 'concurrently' in this line needs to link out to the new section:

This is a powerful abstraction that makes it possible for dbt to run batches separately, concurrently, and retry them independently.

nataliefiann self-assigned this Nov 27, 2024
@nataliefiann
Contributor

Hiya @mirnawong1

I've assigned this one to myself. I'll pick this up.

Kind Regards
Natalie

@nataliefiann
Contributor

Hiya @graciegoheen

I'm picking this up for @mirnawong1
I've created a draft in Notion for this, but I wanted to ask about the adapters that support running microbatch batches in parallel: are those the adapters mentioned here under the microbatch section?

Kind Regards
Natalie

@runleonarun
Collaborator

Hey @nataliefiann, you should work with @mirnawong1 and @QMalcolm on this! Grace is OOO

@nataliefiann
Contributor

Thanks @runleonarun
@mirnawong1 has scheduled a meeting with me on this today so I should be able to get this PR'd and I'll @ Malcolm for the tech review

Kind Regards
Natalie

@nataliefiann
Contributor

Hiya @QMalcolm

I'm going to start drafting my PR on this but whilst I do so, I have a few questions if you wouldn't mind shedding some light please:

Is running in parallel only supported for models? And within models, can the concurrent_batches config only be set in the project.yml file, properties.yml, and the SQL file?

Is there a limit to how many models can be run in parallel? Does it depend on factors such as batch size, event time, etc.?

Also, with regard to the adapters that support running microbatch batches in parallel: are those the adapters mentioned here under the microbatch section?

Kind Regards
Natalie

runleonarun added a commit that referenced this issue Dec 7, 2024
## What are you changing in this pull request and why?

I have created this PR following GitHub issue #6550 for parallel batch execution.

## Checklist
- [ ] I have reviewed the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) so my content adheres to these guidelines.
- [ ] The topic I'm writing about is for specific dbt version(s) and I have versioned it according to the [version a whole page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) and/or [version a block of content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content) guidelines.
- [ ] I have added checklist item(s) to this list for anything that needs to happen before this PR is merged, such as "needs technical review" or "change base branch."
- [ ] The content in this PR requires a dbt release note, so I added one to the [release notes page](https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes).

<!-- vercel-deployment-preview -->
---
🚀 Deployment available! Here are the direct links to the updated files:


- https://docs-getdbt-com-git-nfiann-rbip-dbt-labs.vercel.app/docs/build/incremental-microbatch
- https://docs-getdbt-com-git-nfiann-rbip-dbt-labs.vercel.app/docs/dbt-versions/core-upgrade/06-upgrading-to-v1.9
- https://docs-getdbt-com-git-nfiann-rbip-dbt-labs.vercel.app/reference/resource-properties/concurrent_batches

<!-- end-vercel-deployment-preview -->

---------

Co-authored-by: Mirna Wong <[email protected]>
Co-authored-by: Quigley Malcolm <[email protected]>
Co-authored-by: Leona B. Campbell <[email protected]>
@runleonarun
Collaborator

closed by #6589


4 participants