Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table union support in prototype #477

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open

Conversation

mheinzel
Copy link
Collaborator

Tests pass, but still needs a bit of cleaning up, additional invariants and proper tracing. The lookups are currently quite naive and inefficient.

@mheinzel mheinzel force-pushed the mheinzel/prototype-union branch 3 times, most recently from c0c6301 to 0b4dff4 Compare November 25, 2024 14:14
@mheinzel mheinzel marked this pull request as ready for review November 25, 2024 14:15
@mheinzel mheinzel force-pushed the mheinzel/prototype-union branch from 0b4dff4 to 4e7e82e Compare November 26, 2024 16:26
@mheinzel
Copy link
Collaborator Author

There are still a few TODOs and I might not do proper tracing in this PR, but the lookups are batched now. I'm not sure about commit 4e7e82e, do you think it makes sense to include?

@mheinzel mheinzel force-pushed the mheinzel/prototype-union branch 2 times, most recently from cd7bfb5 to ad30901 Compare November 28, 2024 12:31
Copy link
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! 🚀 This is a rather superficial review, since I don't understand the ins and outs of unions as well as you and @dcoutts do, so I feel I can't comment much on the details (yet). I hope it's useful nonetheless.

BTW, I tried to review the PR per commit but it wasn't always clear to me what the goal of the changes are (e.g., 0a0fb75). Maybe we could add some more hints to commit messages?

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
Comment on lines 133 to 163
newtype MergingTree s =
MergingTree (STRef s (MergingTreeState s))

data MergingTreeState s = CompletedTreeMerge !Run
-- | Reuses MergingRun (with its STRef) to allow
-- sharing existing merges.
| OngoingTreeMerge !(MergingRun s)
| PendingTreeMerge !(PendingMerge s)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two types should be documented

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second Joris's request for these to have some explanation. I know what they are, but this should be for everyone.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I documented the types and also added a big top-level comment explaining the general approach to unions.

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Show resolved Hide resolved
Comment on lines 209 to 211
mergePolicyForLevel _ [] = MergePolicyLevelling
mergePolicyForLevel _ _ = MergePolicyTiering
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also depend on whether there is a union on the last level? The way I see it now, it does not make much sense if the the last real level before the union uses levelling if it does not also remove deletes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. This means that inserts on a table with a union is cheaper than on one without a union, since we do more work on levelling merges than tiering merges. So potentially we could even spend the "saved" credits on merging the union? These credits are probably well spent, as they help to bring down the cost of lookups, but this also further muddies the separation between the progress of union and non-union (level) merges. 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it so we don't do levelling if there is a union level. I also left a TODO for the idea of supplying some of the regular (non-union) credits into the union.

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
Comment on lines 823 to 853
| MergeStartedEvent {
mergePolicy :: MergePolicy,
mergeLast :: MergeLastLevel,
mergeDebt :: MergeDebt,
mergePolicy :: Maybe MergePolicy,
mergeType :: MergeType,
mergeDebt :: Int,
mergeCost :: Int,
mergeRunsSize :: [Int]
}
| MergeCompletedEvent {
mergePolicy :: MergePolicy,
mergeLast :: MergeLastLevel,
mergePolicy :: Maybe MergePolicy,
mergeType :: MergeType,
mergeSize :: Int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have separate constructors for different types of merges? I don't have a preference, and I'm not sure if it's useful, but it might make it clear where in the merge algorithm these things are coming from

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's one of the questions to answer around tracing. It's a little complicated, since level merges (merges created by inserting values) also become part of a union and might get completed through supplyUnionCredits. Should these then have a different CompletedEvent than merges created through union? Seems a bit awkward to have to make this distinction (especially in case we don't want the "tweak PendingMerge representation" change, which currently tracks the origin of a merge in a way).

Copy link
Collaborator Author

@mheinzel mheinzel Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I postponed this question for now, but left a TODO to tackle tracing soonish. In particular, we strictly only trace inside the regular levels now, so with a little refactoring the Maybe wasn't necessary.

Comment on lines 620 to 629
lookups :: LSM s -> [Key] -> ST s [LookupResult Value Blob]
lookups (LSMHandle _ lsmr) ks = do
LSMContent wb ls ul <- readSTRef lsmr
runs <- concat <$> flattenLevels ls
let runsAcc = lookupsBufferAndRuns ks wb runs
acc <- case ul of
NoUnion ->
return runsAcc
Union t _ -> do
treeAcc <- lookupsTree ks t
return $ mergeLookupAccs [runsAcc, treeAcc]
return $ map (\k -> queryLookupAcc k acc) ks

lookup :: LSM s -> Key -> ST s (LookupResult Value Blob)
lookup lsm k = do
runs <- concat <$> allLayers lsm
return $ doLookup k runs

doLookup :: Key -> [Run] -> LookupResult Value Blob
doLookup k =
foldr (\run continue ->
case Map.lookup k run of
Nothing -> continue
Just (Insert v mb) -> Found v mb
Just Delete -> NotFound
Just (Mupsert v) -> case continue of
NotFound -> Found v Nothing
Found v' mb -> Found (resolveValue v v') mb)
NotFound
lookup lsm k = head <$> lookups lsm [k]
Copy link
Collaborator

@jorisdral jorisdral Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced we have to prototype batched lookups so closely to the implementation, though I'm aware I could also be missing context on why it's necessary.

The lookups need to do bracketing properly, but that seems orthogonal to single-key vs. batched lookups style. Would it not lead to simpler code if we were implement the batch lookup in terms of the single key lookup, like it was before?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lookups into unions are somewhat complicated, depend on the representation of MergingTrees and offer many different design choices. So I wanted to be able to play around with them here and get a good idea of what the implications are for the real code.

I think you're right in that I could write very similar code handling a single key only, which means the accumulator would just be a Maybe Op and I could remove a few types. The correspondence between prototype and code then becomes a little less obvious, but should still be easy to translate, so maybe a better choice indeed. 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am handling each key individually now, the code still looks pretty similar. It still follows the real implementation relatively closely: accumulating a Maybe Op, starting with the entry from the write buffer. It also has a separate function for performing a batch of lookups, so it's clear how many of these are needed when doing lookups into a merging tree.

prototypes/ScheduledMergesTestQLS.hs Outdated Show resolved Hide resolved
Comment on lines 272 to 277
MergePolicyLevelling -> do
case ir of
Merging _ (MergingRun mergeType _) ->
assertST (mergeType == MergeLevel (mergeLast ls))
_ -> pure ()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that MerginRuns in the levels should have MergeLevel seems like something we could enforce statically with types. Maybe by giving MergingRun a type parameter that can be set to MergeLastLevel?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, aren't we also checking this already above here in levelsInvariant?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes perhaps the merging runs in the levels can always use a MergePolicy while the merging runs in the MergingTree (specifically OngoingTreeMerge) could use a MergeType.

Something like:

data MergingRun s mergetype =
    MergingRun !mergetype !(STRef s MergingRunState)

and use it at MergingRun s MergePolicy or MergingRun s MergeType.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, that check is indeed redundant.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it's worth enforcing these things statically. Maybe if we statically knew the exact type of merge, but in this case we only know that it's not a union merge. In other cases it can be union merge or last level merge, but not mid level.

We could also enforce a lot of other invariants statically, i.e. levelling and last level merges only being possible on the last level, but have so far chosen to check them dynamically instead to keep the types simple. Maybe it's worth giving it a try at some point, but this PR is already quite big.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this work, but there are a few different ways of organise it, so I'll open a separate PR.

@dcoutts
Copy link
Collaborator

dcoutts commented Dec 2, 2024

BTW, I tried to review the PR per commit but it wasn't always clear to me what the goal of the changes are (e.g., 0a0fb75). Maybe we could add some more hints to commit messages?

I second this. For example " prototype: move MergePolicy out of MergingRun " yes it looks fine, but I don't know why we want to do that.

@mheinzel mheinzel force-pushed the mheinzel/prototype-union branch from ad30901 to 489c2d2 Compare December 3, 2024 15:27
@mheinzel
Copy link
Collaborator Author

mheinzel commented Dec 3, 2024

BTW, I tried to review the PR per commit but it wasn't always clear to me what the goal of the changes are (e.g., 0a0fb75). Maybe we could add some more hints to commit messages?

That's good feedback, I updated some commits messages and will make sure to be clearer about each changes purpose in the future.

More documentation and fixes for other comments will follow soon.

Copy link
Collaborator

@dcoutts dcoutts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments so far.

I'm up to (but not finished with) patch "prototype: add supplyUnionCredits".

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
Comment on lines 503 to 626
(wb, layers) <- allLayers lsm
let lookupRes = submitLookups ks (concat layers)
let acc = updateLookupAcc (lookupAccFromBuffer ks wb) lookupRes
return $ map (\k -> queryLookupAcc k acc) ks
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're changing our minds, perhaps just squash this patch with the previous one.

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
@@ -57,7 +57,8 @@ import GHC.Stack (HasCallStack, callStack)

data LSM s = LSMHandle !(STRef s Counter)
!(STRef s (LSMContent s))
data LSMContent s = LSMContent Buffer (Levels s)

data LSMContent s = LSMContent Buffer (Levels s) (Maybe (MergingTree s))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do with documenting the obvious here: these are effectively all the levels of the LSM tree. First the buffer (level 0), the normal levels, and optionally a final level resulting from a table union (the merging tree).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it useful in a few places to have a term for the non-writebuffer non-union levels. I went with "regular levels", could also be "normal levels", any preferences?

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
[r] -> return (CompletedMerge r)
rs' -> do
let !cost = sum (map runSize rs')
let !debt = cost -- TODO: is there a reason to claim a larger debt?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So with normal runs we claim a debt that would cover the maximum cost, because we want to be sure we're supplying enough credits for the worst case.

But here, the cost is arbitrary, and the user has to supply enough credits to cover it over whatever time period they think makes sense. So yeah, I think debt = cost here is fine.

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
Copy link
Collaborator

@dcoutts dcoutts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looking pretty good.

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
Comment on lines 1013 to 1014
[t1, t2] -> splitEqually credits t1 t2
_ -> error $ "supplyCreditsPendingMerge: "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes fair enough. We don't need to support n-way union. And if we did we'd need a more general n-way split evenly impl.

Copy link
Collaborator Author

@mheinzel mheinzel Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We support n-way unions it now, but that is a separate commit, otherwise I'll have to squash a lot of things or deal with the conflicts.

AUnion <$> elements vars
<*> elements vars)
]
-- TODO: only supply to tables with unions?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or using the latest QC lockstep lib the generators can inspect the model, so they can just look and see if the table contains a union or not. At least, assuming it's visible to the model. Perhaps that's what you mean about adding a flag.

prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
prototypes/ScheduledMerges.hs Outdated Show resolved Hide resolved
@mheinzel mheinzel force-pushed the mheinzel/prototype-union branch 3 times, most recently from b3e5d95 to fbf574b Compare January 13, 2025 15:48
IncomingRun should be `ir`, MergingRun `mr`.
@mheinzel mheinzel force-pushed the mheinzel/prototype-union branch from fbf574b to fccb9c0 Compare January 14, 2025 20:07
MergingRuns will soon not only exist inside the regular levels, but also
as part of a MergingTree, where the concept of MergePolicy doesn't make
sense. Therefore, we move it out into the thing that holds the level's
MergingRun, where it's still available where it was before, but is not
required for union-related MergingRuns.
As a preparation for table unions, we need to know how many credits
are left over when supplying them to a MergingRun.
Also, this commit cleans up the namespace a little before new function
are added.
@mheinzel mheinzel force-pushed the mheinzel/prototype-union branch 2 times, most recently from 0a8e334 to af068b9 Compare January 17, 2025 20:36
Lookups in unions bring some complexity, so we want this prototype to
demonstrate a viable approach that is somewhat similar to the real
implementation: Accumulating with a left fold, starting with the
writebuffer.

As opposed to the real implementation, we accumulate the result for each
key individually, as opposed to a batched vector of lookup results. The
principle remains the same.
This more closely follows the implementation and will allow to trace
different events in depending on whether the MergingRun is used in the
regular levels or the MergingTree.
This statically enforces invariants on pending level merges, also making
it more straight-forward to exploit when doing lookups.
Additionally, it avoids having to wrap a table's runs into additional
STRefs to make them MergingTrees.
We will however still supply union credits to tables with a union debt
that has already been fully paid off (tree merge got completed). This is
difficult to avoid, since the model doesn't know how much work it is to
complete a particular tree/union merge.
@mheinzel mheinzel force-pushed the mheinzel/prototype-union branch from af068b9 to e9cc0e8 Compare January 17, 2025 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants