adapter/sources: Support custom timelines on source tables to handle CdCv2 envelope type correctly #30013

Merged
merged 2 commits into from
Oct 21, 2024

@rjobanp rjobanp commented Oct 15, 2024

Motivation

The bug above caused queries to hang whenever a CREATE TABLE .. FROM SOURCE .. ENVELOPE MATERIALIZE statement was used, because ENVELOPE MATERIALIZE (also known as Envelope::CdcV2) does not default to the standard EpochMilliseconds timeline.

This PR introduces the ability to set a custom timeline on CREATE TABLE .. FROM SOURCE and corrects the default timeline when using ENVELOPE MATERIALIZE, matching the existing timeline handling semantics of CREATE SOURCE statements.

The tests for this are in @nrainer-materialize's PR #29801, and I tested this change locally on his branch to verify the fix.

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

Comment on lines +1727 to +1743
// Allow users to specify a timeline. If they do not, determine a default
// timeline for the source.
let timeline = match timeline {
    None => match envelope {
        SourceEnvelope::CdcV2 => {
            Timeline::External(scx.catalog.resolve_full_name(&name).to_string())
        }
        _ => Timeline::EpochMilliseconds,
    },
    // TODO(benesch): if we stabilize this, can we find a better name than
    // `mz_epoch_ms`? Maybe just `mz_system`?
    Some(timeline) if timeline == "mz_epoch_ms" => Timeline::EpochMilliseconds,
    Some(timeline) if timeline.starts_with("mz_") => {
        return Err(PlanError::UnacceptableTimelineName(timeline));
    }
    Some(timeline) => Timeline::User(timeline),
};
Contributor Author

This is the same logic that already exists in plan_create_source. It will be removed from that function once we drop support for outputting data to the primary source collection.
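
For readers who want to try the resolution rules outside the planner, here is a minimal, self-contained sketch of the same match. It is not the code from this PR: the `Timeline`, `SourceEnvelope`, and `PlanError` definitions are simplified, hypothetical stand-ins, `resolve_timeline` is an invented helper name, and the `full_name` parameter stands in for the result of `scx.catalog.resolve_full_name(&name).to_string()`.

```rust
// Hypothetical, simplified stand-ins for the planner types named in the diff.
// The real definitions live in the Materialize codebase and may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Timeline {
    EpochMilliseconds,
    External(String),
    User(String),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceEnvelope {
    // Placeholder for every envelope other than CdcV2 (assumption).
    Other,
    CdcV2,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PlanError {
    UnacceptableTimelineName(String),
}

/// Resolve the timeline for a source table (illustrative helper, not a real API).
///
/// `requested` is the user-supplied timeline, if any; `full_name` stands in
/// for the fully qualified name the catalog would resolve.
pub fn resolve_timeline(
    requested: Option<String>,
    envelope: SourceEnvelope,
    full_name: &str,
) -> Result<Timeline, PlanError> {
    match requested {
        // No explicit timeline: CdcV2 gets its own external timeline,
        // everything else defaults to epoch milliseconds.
        None => Ok(match envelope {
            SourceEnvelope::CdcV2 => Timeline::External(full_name.to_string()),
            _ => Timeline::EpochMilliseconds,
        }),
        // The one reserved name that users may spell out explicitly.
        Some(t) if t == "mz_epoch_ms" => Ok(Timeline::EpochMilliseconds),
        // Any other `mz_`-prefixed name is rejected.
        Some(t) if t.starts_with("mz_") => Err(PlanError::UnacceptableTimelineName(t)),
        // Anything else becomes a user-defined timeline.
        Some(t) => Ok(Timeline::User(t)),
    }
}
```

Under these assumptions, calling `resolve_timeline(None, SourceEnvelope::CdcV2, "db.schema.t")` yields an external timeline named after the table, while an explicit `"mz_custom"` is rejected because of the reserved `mz_` prefix.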

@rjobanp rjobanp force-pushed the source-table-timelines branch from 395421b to be5fa3a Compare October 17, 2024 17:52
@@ -99,7 +99,7 @@ idx_c FOR 240000
# Test subsource propagation. Test sources with and without subsources and view dependencies to
Contributor Author

The changes in this file revert 9bfaac4, which seemed to break the intent of the test (to validate that the retain history values are propagated from source -> subsources / tables).

@rjobanp rjobanp marked this pull request as ready for review October 17, 2024 18:22
@rjobanp rjobanp requested a review from a team as a code owner October 17, 2024 18:22
@rjobanp rjobanp requested review from ParkMyCar and jkosh44 October 17, 2024 18:22
@jkosh44 jkosh44 left a comment

LGTM

pub fn timeline(&self) -> Timeline {
    Timeline::EpochMilliseconds
Contributor

I'm honestly worried about how many places in the code blindly assume that all tables are on the EpochMilliseconds timeline. Nothing to do, I suppose, but see what breaks.

Contributor Author

Yeah, I ran the nightly tests on this branch since I had a similar worry, but things seemed okay 🤷🏽

@rjobanp rjobanp merged commit 769fa0d into MaterializeInc:main Oct 21, 2024
140 checks passed
@rjobanp rjobanp deleted the source-table-timelines branch October 21, 2024 17:58