Dependency between landing table and source node #6214
enrico-spadaTR
started this conversation in
Ideas
Problem
DBT builds the dependency graph for all 'ref' nodes. This lets us delegate scheduling to DBT: if Model B depends on Model A, then Model B runs only after Model A. Unfortunately, this feature is not implemented for 'source' nodes. If DBT runs before the landing table referenced by a 'source' node has been refreshed, the pipeline starts anyway. This limitation forces us to (1) split DBT into multiple pipelines and (2) use an external orchestrator to manually define the logic for synchronous scheduling.
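To make the gap concrete, here is a minimal sketch of the ordering problem in plain Python. It is not how DBT builds its graph internally; the node names (`source.landing.orders`, `model_a`, `model_b`) are hypothetical, and the standard-library `graphlib` module stands in for the scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each key maps to the set of nodes it depends on.
# The 'source' node appears only as a dependency: it has no upstream node,
# so nothing in the graph represents the job that loads the landing table.
graph = {
    "model_a": {"source.landing.orders"},
    "model_b": {"model_a"},
}


def run_order(deps):
    """Return an execution order in which every node follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(graph)
print(order)  # ['source.landing.orders', 'model_a', 'model_b']
```

Model B is correctly placed after Model A, but the source is a root of the graph, so the scheduler has nothing to wait for before starting Model A.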
IDEA
Evolve DBT to implement a dependency logic between the 'source' node and its corresponding landing table.
POC
I tried to implement a very hacky POC: a pre_hook that keeps the model waiting until the landing table is refreshed.
Besides all the risks and limitations of such a hacky solution, one of the many blockers is that it keeps threads busy without doing anything.
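The waiting logic behind such a POC can be sketched roughly as below. This is an illustration in plain Python, not the actual pre_hook: `wait_for_refresh`, `get_loaded_at`, and `fresh_after` are hypothetical names, and `get_loaded_at` is assumed to be a callable that asks the warehouse when the landing table was last loaded.

```python
import time
from datetime import datetime
from typing import Callable


def wait_for_refresh(
    get_loaded_at: Callable[[], datetime],
    fresh_after: datetime,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Block until the landing table was loaded at or after `fresh_after`.

    Returns True once the table is fresh, False if the timeout expires first.
    """
    deadline = clock() + timeout_seconds
    while True:
        if get_loaded_at() >= fresh_after:
            return True
        if clock() >= deadline:
            return False
        # The calling thread is held here for the whole wait, which is the
        # blocker described above: it is busy but does no useful work.
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting. The timeout is an addition of this sketch; without it, a landing table that never refreshes would hold the thread forever.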