-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
needs to use {{this}} format for incremental #3341
Conversation
hmm, seems like something is wrong with config? cc @jeff-dude |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{{this}} provides a more precise incremental updating option.
Original idea is having an overlapping region when incremental update, to ensure 100% coverage of new incoming records, and dbt uses unique_key
in the config block to filter out the duplicated/existed records when updating.
@@ -24,7 +24,7 @@ WITH | |||
FROM {{ source('solana','account_activity') }} | |||
WHERE tx_success | |||
{% if is_incremental() %} | |||
AND block_time >= date_trunc("day", now() - interval '1 day') | |||
AND block_time > (SELECT max(day) FROM {{this}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should probably use >=
here to not drop later incremental updates from the same day?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since day
is derived from date_trunc(...)
, it represents the lower bound of the time, I think >
will include all events in that date after UTC 00:00.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to update on same day, only next day. To prevent duplicates.
There is the edge case of missing an address if their balance changes later that day but not the next day.
But ignoring that case for now!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah @Hosuke you're right! I was confused but indeed because we compare block_time with the day timestamp this works and there's no real difference between >
and >=
. 👍
CI issues should be resolved now, I've triggered a re-run |
can you confirm the failing test on duplicates? |
Looking |
the test schema is gone, but I really can't see any reason there would be duplicates given we're using a window function. |
@jeff-dude I'm like 100% confident there should be no duplicates - can we merge? |
if there are dupes here, there'll be dupes in prod. have you tried compiling the model, then placing in CTE and running group by/having > 1 on the unique fields? you can likely start with a subset of data for faster results, then open up more if nothing returns |
I think I know the cause of the bug, |
I tried to fix the problem by giving a default empty string I actually found out this problem is an opening issue in dbt: dbt-labs/dbt-adapters#159 |
hmm odd. I've run a group by check and got no nulls https://dune.com/queries/2479876?d=1&q=v2+&sidebar=query-explorer |
thank you @Hosuke !! |
great catch! we've seen this elsewhere before too. it's a good thing to keep in mind, that unique keys can't be null or the merge statement fails the join condition, leading to duplicate inserts. i wonder if this means we need to apply |
i'm trying to figure out why the ci test action is picking up previously merged PR spells still. we've had the successful run in prod that should in theory make sure those don't run anymore. i've raised a ticket internally to investigate we're getting timeouts due to all of that running at the same time |
seems like we're getting failures where not null tests were applied (assuming the unique keys) |
Oh token_mint_address and token_balance can be null, those tests should he removed. Just address and balance should have not null tests. |
what's the move here? |
closing PRs marked as |
Brief comments on the purpose of your changes:
We were getting duplicates with this always running with one day of history, I've changed it to only run if the block_time is in the next day.
For Dune Engine V2
I've checked that:
General checks:
lowercase_snake_cased
Pricing checks:
coin_id
represents the ID of the coin on coinpaprika.comJoin logic:
Incremental logic: