
DNM Maybe working migrations #30980

Closed
wants to merge 50 commits

Conversation

@jkosh44 (Contributor) commented Jan 8, 2025

Motivation

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

jkosh44 and others added 30 commits January 8, 2025 13:15
Previously, async storage workers would calculate the as_of of a
dataflow from the sinces of the dataflow outputs' remap shards. If
there was more than one distinct remap shard among the outputs, then
the storage worker would panic. It's expected that the only collection
that will ever have a remap shard is the ingestion collection itself.

Furthermore, we are planning to remove the ingestion collection from
the outputs of the dataflow (in fact there's already a feature flag
that does this). If the ingestion is removed from the outputs, then no
output will have a remap shard, and the as_of will always be empty.

This commit simplifies the existing as_of calculation and fixes the
as_of calculation when the ingestion collection is removed from the
outputs. It does this by calculating the as_of directly from the
ingestion's remap shard. Additionally, it asserts that if any of the
outputs have a remap shard, then it must be equal to the ingestion's
remap shard.

Works towards resolving #MaterializeInc/database-issues/issues/8620
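
A rough sketch of the simplified calculation described above; the types and names below (`RemapShard`, `Output`, `as_of_for_ingestion`) are illustrative stand-ins, not the actual storage-controller API:

```rust
use timely::progress::Antichain;

type Timestamp = u64;

/// Illustrative stand-in for a remap shard identifier.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct RemapShard(u128);

/// Illustrative stand-in for an ingestion dataflow output.
struct Output {
    remap_shard: Option<RemapShard>,
}

/// Compute the dataflow as_of directly from the since of the ingestion's
/// remap shard, asserting that any output that carries a remap shard agrees
/// with the ingestion's.
fn as_of_for_ingestion(
    ingestion_remap_shard: RemapShard,
    remap_since: Antichain<Timestamp>,
    outputs: &[Output],
) -> Antichain<Timestamp> {
    for output in outputs {
        if let Some(shard) = output.remap_shard {
            // Outputs may only reference the ingestion's own remap shard.
            assert_eq!(shard, ingestion_remap_shard);
        }
    }
    // The as_of is the since of the ingestion's remap shard, regardless of
    // whether the ingestion collection itself is among the outputs.
    remap_since
}
```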
Previously, collection statistics were only initialized for ingestion
dataflow outputs. When the `force_source_table_syntax` flag is enabled,
the ingestion collection is excluded from the ingestion dataflow
outputs. As a result, statistics are never created for the ingestion
collection. This causes later parts of the code to panic because it is
assumed that all ingestion collections have statistics initialized.

This commit fixes the issue by ensuring that statistics are always
initialized for ingestion collections, even if it's not included in
the dataflow outputs.

Works towards resolving #MaterializeInc/database-issues/issues/8620
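
A minimal sketch of the idea, with placeholder types (`SourceStatistics`, `initialize_statistics`) standing in for the real statistics machinery:

```rust
use std::collections::BTreeMap;

type GlobalId = u64;

/// Illustrative stand-in for per-collection source statistics.
#[derive(Default)]
struct SourceStatistics {
    // counters elided
}

/// Initialize statistics for every dataflow output and, unconditionally, for
/// the ingestion collection itself, which may not be among the outputs when
/// `force_source_table_syntax` is enabled.
fn initialize_statistics(
    ingestion_id: GlobalId,
    output_ids: &[GlobalId],
) -> BTreeMap<GlobalId, SourceStatistics> {
    let mut stats = BTreeMap::new();
    for id in output_ids {
        stats.entry(*id).or_insert_with(SourceStatistics::default);
    }
    // Ensure the ingestion collection always has an entry, even when it is
    // not a dataflow output.
    stats.entry(ingestion_id).or_insert_with(SourceStatistics::default);
    stats
}
```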
This commit changes `ReadHold` to contain a sender closure, rather than an
`UnboundedSender` directly, for transmitting read hold changes. This
provides the ability to construct `ReadHold`s that are connected to
channels with an item type different from `(GlobalId, ChangeBatch)`,
performing the necessary transformation inside the closure.
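
Roughly, the change amounts to storing a type-erased closure instead of the raw sender. The `ChangeTx` alias, `Command` enum, and helper below are illustrative assumptions, not the controller's actual types:

```rust
use std::sync::Arc;

use timely::progress::ChangeBatch;

type Timestamp = u64;

#[derive(Clone, Copy, Debug)]
struct GlobalId(u64);

/// An instance command that wraps read hold changes (illustrative).
enum Command {
    ReadHoldChange(GlobalId, ChangeBatch<Timestamp>),
}

/// A type-erased sender for read hold changes. The closure can forward to a
/// channel with any item type, transforming `(GlobalId, ChangeBatch)` on the
/// way.
type ChangeTx = Arc<dyn Fn(GlobalId, ChangeBatch<Timestamp>) -> Result<(), ()> + Send + Sync>;

struct ReadHold {
    id: GlobalId,
    change_tx: ChangeTx,
}

/// Construct a `ReadHold` whose changes are routed through an instance
/// command channel rather than a dedicated `(GlobalId, ChangeBatch)` channel.
fn read_hold_via_command_channel(
    id: GlobalId,
    tx: tokio::sync::mpsc::UnboundedSender<Command>,
) -> ReadHold {
    let change_tx: ChangeTx = Arc::new(move |id, changes| {
        // Wrap the raw change batch in the instance's own command type.
        tx.send(Command::ReadHoldChange(id, changes)).map_err(|_| ())
    });
    ReadHold { id, change_tx }
}
```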
Changes to external `ReadHold`s on compute collections are now sequenced
through the `command_{tx,rx}` channels of the respective `Instance`
controllers. This ensures that an `Instance` always observes read hold
changes after it has observed the creation of the affected collection,
which wasn't guaranteed previously.

As a side effect of this change, each external compute `ReadHold` now
contains a copy of an instance `Client`. These copies keep the connected
`Instance`s alive, even when the `ComputeController` has already dropped
them. This seems fine in that arguably an instance shouldn't shut down
if somebody still has read holds on its collections. But it is a change
to the previous behavior.
Now that some of the read holds held by `Instance` are sequencing their
changes through the command channel, it is no longer guaranteed that an
`apply_read_hold_changes` call also applies cascading changes on
dependencies. The previous `check_empty` check expected this to be the
case, so it would incorrectly fail its asserts.

This is fixed by having the check also drain and apply the contents of
the command channel, until no new read capability changes are produced.
To clarify the semantics, `check_empty` is also renamed to `shutdown`
and explicitly terminates the execution of the `Instance` task.
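
A sketch of what such a drain-then-assert check could look like; `Command`, the `read_capabilities` map, and the method body are simplified stand-ins for the real `Instance` internals:

```rust
use std::collections::BTreeMap;

use timely::progress::ChangeBatch;
use tokio::sync::mpsc::UnboundedReceiver;

type Timestamp = u64;
type GlobalId = u64;

enum Command {
    ReadHoldChange(GlobalId, ChangeBatch<Timestamp>),
    // ... other instance commands elided
}

struct Instance {
    command_rx: UnboundedReceiver<Command>,
    // Outstanding read capabilities per collection (illustrative).
    read_capabilities: BTreeMap<GlobalId, ChangeBatch<Timestamp>>,
}

impl Instance {
    /// Drain the command channel, applying read hold changes as they arrive.
    /// Applying a change may enqueue cascading changes for dependencies back
    /// onto the channel, so we keep looping until it is empty. Afterwards,
    /// no collection may retain a read capability.
    fn shutdown(mut self) {
        while let Ok(command) = self.command_rx.try_recv() {
            match command {
                Command::ReadHoldChange(id, mut changes) => {
                    self.read_capabilities
                        .entry(id)
                        .or_insert_with(ChangeBatch::new)
                        .extend(changes.drain());
                }
            }
        }
        for (id, caps) in self.read_capabilities.iter_mut() {
            assert!(caps.is_empty(), "collection {id} still has read capabilities");
        }
    }
}
```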
This commit changes the compute controller to also sequence internal
read hold changes through `command_tx`. This simplifies things as now
all read hold changes flow through the same channels and we don't need
to worry about having two kinds of `ChangeTx` that secretly point to
different destinations.

The commit removes the separate `read_holds_{tx,rx}` channel in the
compute controller and moves to applying received read hold changes
directly, rather than batching them. This removes a bit of complexity
but also carries a risk of a performance regression. My hope is that the
performance impact will be negligible.
The changes made in the previous commits make the compute controller
slower at applying read hold downgrades. In tests that spawn two envd
instances in sequence, this can lead to issues because the first
instance might still try to apply read hold downgrades while the second
instance is already running, causing `PersistEpoch` conflicts.
This race condition existed before but apparently wasn't hit because
read hold downgrades by the first instance didn't get delayed long
enough.

The fix chosen here is to convert the tests that want to run two envd
instances in sequence from using `TestHarness::start` to using
`TestHarness::start_blocking`. The latter runs the `TestServer` in a
separate tokio runtime that gets dropped when the `TestServer` is
dropped, ensuring that no controller tasks can still be alive when the
second envd instance is started.
This commit moves detection of topic recreation from the reader operator
into the metadata fetcher. This is done to make the metadata fetcher
aware of the unrecoverable error so it can seal the collection by
emitting a probe with an empty upstream frontier. Prior to this change,
the collection wouldn't be sealed under the reclock-to-latest scheme,
and a replica restart could cause it to receive new, incorrect updates.

Apart from fixing the sealing bug, this commit makes two functional
changes:

 * It adds detection of topic deletion. That was easy to add and not
   having it seemed like an oversight.
 * It omits the "high watermark regressed"/"partition count regressed"
   detail in the "topic recreated" error. It didn't feel like a useful
   piece of information and omitting it simplifies the code a lot.
Shutting down the metadata fetcher operator releases its capability on
the probe output. When the reclock-to-latest reclocking is enabled, this
makes the remap operator believe that the upstream has advanced to the
empty frontier, making it seal the collection. We must thus not shut
down the metadata fetcher when it encounters a connection error, as the
source did not in fact advance to the empty frontier in this case.
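
Under the reclock-to-latest scheme, these two commits together boil down to the following sketch; `Probe`, `TopicStatus`, and `probe_for_status` are illustrative names, not the actual operator types:

```rust
use timely::progress::Antichain;

/// Illustrative probe emitted by the metadata fetcher; the real one carries
/// more information, such as per-partition high watermarks.
struct Probe<T> {
    upstream_frontier: Antichain<T>,
}

enum TopicStatus {
    /// The topic exists and reports the given upper offset.
    Healthy { upper: u64 },
    /// The topic was deleted upstream.
    Deleted,
    /// The topic was deleted and recreated upstream.
    Recreated,
    /// The broker could not be reached; this says nothing about the topic.
    ConnectionLost,
}

/// Deletion and recreation are unrecoverable, so the fetcher reports an empty
/// upstream frontier, which seals the collection. A connection error produces
/// no probe at all and the fetcher keeps running (and keeps its capability),
/// since the upstream has not actually advanced to the empty frontier.
fn probe_for_status(status: TopicStatus) -> Option<Probe<u64>> {
    match status {
        TopicStatus::Healthy { upper } => Some(Probe {
            upstream_frontier: Antichain::from_elem(upper),
        }),
        TopicStatus::Deleted | TopicStatus::Recreated => Some(Probe {
            upstream_frontier: Antichain::new(),
        }),
        TopicStatus::ConnectionLost => None,
    }
}
```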
Before, we wrongly assumed that the new feedback UPSERT operator could
not yield finalized tombstone values, because it only writes finalized
values that "exist". But it can happen that
`ensure_decoded` puts in place a finalized tombstone when state values
consolidate to a diff of zero.

We say "can happen" above, because `merge_update` returns whether or not
a merged value can be deleted, and this returns true when the diff_sum
is zero. So in most cases the finalized tombstone would be cleaned up
right away. However, not all callers immediately (or at all) act on this
"can delete" information, which leads to finalized tombstones actually
existing in state.
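
The interplay can be shown with a heavily simplified sketch; the bodies of `merge_update` and `ensure_decoded` here only mimic the shape of the real UPSERT state code:

```rust
/// Illustrative stand-in for a finalized UPSERT state value.
#[derive(Debug)]
enum StateValue {
    /// A finalized value that "exists" upstream.
    Finalized(Vec<u8>),
    /// A finalized tombstone: the diffs consolidated to zero.
    Tombstone,
}

/// Illustrative stand-in for a not-yet-finalized state value.
struct PendingValue {
    value: Vec<u8>,
    diff_sum: i64,
}

impl PendingValue {
    /// Merge another update into this value. Returns whether the merged
    /// value can be deleted, i.e. whether the diffs consolidate to zero.
    /// Not all callers act on this return value immediately, or at all.
    fn merge_update(&mut self, value: Vec<u8>, diff: i64) -> bool {
        self.value = value;
        self.diff_sum += diff;
        self.diff_sum == 0
    }

    /// Finalize the value. A zero diff_sum yields a finalized tombstone; if
    /// the caller ignored `merge_update`'s "can delete" signal, that
    /// tombstone remains in state.
    fn ensure_decoded(self) -> StateValue {
        if self.diff_sum == 0 {
            StateValue::Tombstone
        } else {
            StateValue::Finalized(self.value)
        }
    }
}
```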
…Inc#30962)

Chatted about this in
[Slack](https://materializeinc.slack.com/archives/CU7ELJ6E9/p1736243826039879).
As-is, we only see non-inlined functions, which makes it hard to follow
timely operators. This PR bumps our debuginfo from line-tables to
limited; let's see if it helps.

### Motivation

Better debug info

### Checklist

- [x] This PR has adequate test coverage / QA involvement has been duly
considered. ([trigger-ci for additional test/nightly
runs](https://trigger-ci.dev.materialize.com/))
- [x] This PR has an associated up-to-date [design
doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md),
is a design doc
([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)),
or is sufficiently small to not require a design.
- [x] If this PR evolves [an existing `$T ⇔ Proto$T`
mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md)
(possibly in a backwards-incompatible way), then it is tagged with a
`T-proto` label.
- [x] If this PR will require changes to cloud orchestration or tests,
there is a companion cloud PR to account for those changes that is
tagged with the release-blocker label
([example](MaterializeInc/cloud#5021)).
- [x] If this PR includes major [user-facing behavior
changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note),
I have pinged the relevant PM to schedule a changelog post.
Previously, the output index of each ingestion export was calculated
assuming that index 0 was reserved for the primary source relation.
When `force_source_table_syntax` is enabled, the primary source
relation is not included in the exports, which caused all export
indexes to be off by one.

This commit fixes the issue by decrementing all export indexes by 1 if
the primary source relation was not included in the exports.

It's unclear to the author how errors are being handled when the
primary relation is not included in the exports.

Works towards resolving #MaterializeInc/database-issues/issues/8620
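
A simplified sketch of the resulting index assignment; the types and the `output_indexes` helper are illustrative only:

```rust
/// Illustrative stand-in for an ingestion export.
struct IngestionExport {
    name: String,
}

/// Assign output indexes to exports. Index 0 is reserved for the primary
/// source relation only when that relation is actually exported; with
/// `force_source_table_syntax` it is excluded and exports start at 0.
fn output_indexes(
    exports: &[IngestionExport],
    primary_included: bool,
) -> Vec<(usize, &IngestionExport)> {
    exports
        .iter()
        .enumerate()
        .map(|(i, export)| {
            let index = if primary_included { i + 1 } else { i };
            (index, export)
        })
        .collect()
}
```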
@jkosh44 closed this Jan 9, 2025
@jkosh44 deleted the maybe-working-migrations branch January 9, 2025 13:47