Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage/sql: Refactor subsource purification (prep for source-fed table purification) [ENG-TASK-19] #28310

Merged
merged 3 commits into from
Jul 21, 2024

Conversation

rjobanp
Copy link
Contributor

@rjobanp rjobanp commented Jul 17, 2024

Motivation

Inspired by & replaces Sean's draft PR #27320 to shuffle when subsource statements are generated.

This PR is prep work for the implementation described in #27907

This PR shouldn't introduce any functional changes -- it is a refactor of subsource purification such that we now generate CreateSubsourceStatements after purification and using the same method in both 'create source' and 'alter source' statements.

The intent is to tease apart the purification logic (obtaining state from external systems) from the statement generation logic, such that I can re-use as much of the subsource purification logic as possible to purify & plan CREATE TABLE .. FROM SOURCE statements in subsequent PRs.

In the spirit of yesterday's cluster-team whiteboard discussions I also cleaned up the naming in much of the purification code.

Tips for reviewer

The first commit is the actual refactor

The 2nd and 3rd commits are a rename of many overloaded structs/variables using the word 'subsource', to clarify whether they were referring to an 'external reference' (e.g. upstream postgres table) or a 'source export' - the object (e.g. either a subsource or table) being exported by a source. This ended up more LOC than expected but clarifying these terms will likely make later PRs easier to grok.

Checklist

@rjobanp rjobanp force-pushed the subsource-schema-planning branch 2 times, most recently from 55397a1 to 2ad317f Compare July 18, 2024 14:07
@rjobanp rjobanp changed the title storage/sql: Reorder subsource purification / planning to happen after source planning storage/sql: Refactor subsource purification part 1 Jul 18, 2024
@rjobanp rjobanp force-pushed the subsource-schema-planning branch from f272c67 to 4f8ca41 Compare July 18, 2024 15:48
@rjobanp rjobanp changed the title storage/sql: Refactor subsource purification part 1 storage/sql: Refactor subsource purification (prep for source-fed table purification) Jul 18, 2024
@rjobanp rjobanp requested a review from petrosagg July 18, 2024 17:03
@rjobanp rjobanp marked this pull request as ready for review July 18, 2024 17:04
@rjobanp rjobanp requested review from a team as code owners July 18, 2024 17:04
@rjobanp rjobanp requested a review from ParkMyCar July 18, 2024 17:04
Copy link
Contributor

@petrosagg petrosagg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I think a nice future step would be to replace some of the enums with a generic parameter that describes the source, which will get rid of the unreachable! statements that we must have now.

Also thanks for separating the commit of the renames, made reviewing so much easier!

@rjobanp rjobanp merged commit 343710c into MaterializeInc:main Jul 21, 2024
79 checks passed
@rjobanp rjobanp deleted the subsource-schema-planning branch July 21, 2024 17:31
@bosconi bosconi changed the title storage/sql: Refactor subsource purification (prep for source-fed table purification) storage/sql: Refactor subsource purification (prep for source-fed table purification) [ENG-TASK-19] Jul 24, 2024
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants