-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Table union support in prototype #477
base: main
Are you sure you want to change the base?
Conversation
c0c6301
to
0b4dff4
Compare
0b4dff4
to
4e7e82e
Compare
There are still a few TODOs and I might not do proper tracing in this PR, but the lookups are batched now. I'm not sure about commit 4e7e82e, do you think it makes sense to include? |
cd7bfb5
to
ad30901
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! 🚀 This is a rather superficial review, since I don't understand the ins and outs of unions as well as you and @dcoutts do, so I feel I can't comment much on the details (yet). I hope it's useful nonetheless.
BTW, I tried to review the PR per commit but it wasn't always clear to me what the goal of the changes are (e.g., 0a0fb75). Maybe we could add some more hints to commit messages?
prototypes/ScheduledMerges.hs
Outdated
newtype MergingTree s = | ||
MergingTree (STRef s (MergingTreeState s)) | ||
|
||
data MergingTreeState s = CompletedTreeMerge !Run | ||
-- | Reuses MergingRun (with its STRef) to allow | ||
-- sharing existing merges. | ||
| OngoingTreeMerge !(MergingRun s) | ||
| PendingTreeMerge !(PendingMerge s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two types should be documented
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second Joris's request for these to have some explanation. I know what they are, but this should be for everyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I documented the types and also added a big top-level comment explaining the general approach to unions.
prototypes/ScheduledMerges.hs
Outdated
mergePolicyForLevel _ [] = MergePolicyLevelling | ||
mergePolicyForLevel _ _ = MergePolicyTiering |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also depend on whether there is a union on the last level? The way I see it now, it does not make much sense if the the last real level before the union uses levelling if it does not also remove deletes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. This means that inserts on a table with a union is cheaper than on one without a union, since we do more work on levelling merges than tiering merges. So potentially we could even spend the "saved" credits on merging the union? These credits are probably well spent, as they help to bring down the cost of lookups, but this also further muddies the separation between the progress of union and non-union (level) merges. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it so we don't do levelling if there is a union level. I also left a TODO for the idea of supplying some of the regular (non-union) credits into the union.
prototypes/ScheduledMerges.hs
Outdated
| MergeStartedEvent { | ||
mergePolicy :: MergePolicy, | ||
mergeLast :: MergeLastLevel, | ||
mergeDebt :: MergeDebt, | ||
mergePolicy :: Maybe MergePolicy, | ||
mergeType :: MergeType, | ||
mergeDebt :: Int, | ||
mergeCost :: Int, | ||
mergeRunsSize :: [Int] | ||
} | ||
| MergeCompletedEvent { | ||
mergePolicy :: MergePolicy, | ||
mergeLast :: MergeLastLevel, | ||
mergePolicy :: Maybe MergePolicy, | ||
mergeType :: MergeType, | ||
mergeSize :: Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have separate constructors for different types of merges? I don't have a preference, and I'm not sure if it's useful, but it might make it clear where in the merge algorithm these things are coming from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's one of the questions to answer around tracing. It's a little complicated, since level merges (merges created by inserting values) also become part of a union and might get completed through supplyUnionCredits
. Should these then have a different CompletedEvent than merges created through union
? Seems a bit awkward to have to make this distinction (especially in case we don't want the "tweak PendingMerge representation" change, which currently tracks the origin of a merge in a way).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I postponed this question for now, but left a TODO to tackle tracing soonish. In particular, we strictly only trace inside the regular levels now, so with a little refactoring the Maybe
wasn't necessary.
prototypes/ScheduledMerges.hs
Outdated
lookups :: LSM s -> [Key] -> ST s [LookupResult Value Blob] | ||
lookups (LSMHandle _ lsmr) ks = do | ||
LSMContent wb ls ul <- readSTRef lsmr | ||
runs <- concat <$> flattenLevels ls | ||
let runsAcc = lookupsBufferAndRuns ks wb runs | ||
acc <- case ul of | ||
NoUnion -> | ||
return runsAcc | ||
Union t _ -> do | ||
treeAcc <- lookupsTree ks t | ||
return $ mergeLookupAccs [runsAcc, treeAcc] | ||
return $ map (\k -> queryLookupAcc k acc) ks | ||
|
||
lookup :: LSM s -> Key -> ST s (LookupResult Value Blob) | ||
lookup lsm k = do | ||
runs <- concat <$> allLayers lsm | ||
return $ doLookup k runs | ||
|
||
doLookup :: Key -> [Run] -> LookupResult Value Blob | ||
doLookup k = | ||
foldr (\run continue -> | ||
case Map.lookup k run of | ||
Nothing -> continue | ||
Just (Insert v mb) -> Found v mb | ||
Just Delete -> NotFound | ||
Just (Mupsert v) -> case continue of | ||
NotFound -> Found v Nothing | ||
Found v' mb -> Found (resolveValue v v') mb) | ||
NotFound | ||
lookup lsm k = head <$> lookups lsm [k] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced we have to prototype batched lookups so closely to the implementation, though I'm aware I could also be missing context on why it's necessary.
The lookups need to do bracketing properly, but that seems orthogonal to single-key vs. batched lookups style. Would it not lead to simpler code if we were implement the batch lookup in terms of the single key lookup, like it was before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lookups into unions are somewhat complicated, depend on the representation of MergingTrees and offer many different design choices. So I wanted to be able to play around with them here and get a good idea of what the implications are for the real code.
I think you're right in that I could write very similar code handling a single key only, which means the accumulator would just be a Maybe Op
and I could remove a few types. The correspondence between prototype and code then becomes a little less obvious, but should still be easy to translate, so maybe a better choice indeed. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am handling each key individually now, the code still looks pretty similar. It still follows the real implementation relatively closely: accumulating a Maybe Op
, starting with the entry from the write buffer. It also has a separate function for performing a batch of lookups, so it's clear how many of these are needed when doing lookups into a merging tree.
prototypes/ScheduledMerges.hs
Outdated
MergePolicyLevelling -> do | ||
case ir of | ||
Merging _ (MergingRun mergeType _) -> | ||
assertST (mergeType == MergeLevel (mergeLast ls)) | ||
_ -> pure () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that MerginRun
s in the levels should have MergeLevel
seems like something we could enforce statically with types. Maybe by giving MergingRun
a type parameter that can be set to MergeLastLevel
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, aren't we also checking this already above here in levelsInvariant
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes perhaps the merging runs in the levels can always use a MergePolicy
while the merging runs in the MergingTree
(specifically OngoingTreeMerge
) could use a MergeType
.
Something like:
data MergingRun s mergetype =
MergingRun !mergetype !(STRef s MergingRunState)
and use it at MergingRun s MergePolicy
or MergingRun s MergeType
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, that check is indeed redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it's worth enforcing these things statically. Maybe if we statically knew the exact type of merge, but in this case we only know that it's not a union merge. In other cases it can be union merge or last level merge, but not mid level.
We could also enforce a lot of other invariants statically, i.e. levelling and last level merges only being possible on the last level, but have so far chosen to check them dynamically instead to keep the types simple. Maybe it's worth giving it a try at some point, but this PR is already quite big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this work, but there are a few different ways of organise it, so I'll open a separate PR.
I second this. For example " prototype: move MergePolicy out of MergingRun " yes it looks fine, but I don't know why we want to do that. |
ad30901
to
489c2d2
Compare
That's good feedback, I updated some commits messages and will make sure to be clearer about each changes purpose in the future. More documentation and fixes for other comments will follow soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments so far.
I'm up to (but not finished with) patch "prototype: add supplyUnionCredits".
prototypes/ScheduledMerges.hs
Outdated
(wb, layers) <- allLayers lsm | ||
let lookupRes = submitLookups ks (concat layers) | ||
let acc = updateLookupAcc (lookupAccFromBuffer ks wb) lookupRes | ||
return $ map (\k -> queryLookupAcc k acc) ks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're changing our minds, perhaps just squash this patch with the previous one.
prototypes/ScheduledMerges.hs
Outdated
@@ -57,7 +57,8 @@ import GHC.Stack (HasCallStack, callStack) | |||
|
|||
data LSM s = LSMHandle !(STRef s Counter) | |||
!(STRef s (LSMContent s)) | |||
data LSMContent s = LSMContent Buffer (Levels s) | |||
|
|||
data LSMContent s = LSMContent Buffer (Levels s) (Maybe (MergingTree s)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could do with documenting the obvious here: these are effectively all the levels of the LSM tree. First the buffer (level 0), the normal levels, and optionally a final level resulting from a table union (the merging tree).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found it useful in a few places to have a term for the non-writebuffer non-union levels. I went with "regular levels", could also be "normal levels", any preferences?
prototypes/ScheduledMerges.hs
Outdated
[r] -> return (CompletedMerge r) | ||
rs' -> do | ||
let !cost = sum (map runSize rs') | ||
let !debt = cost -- TODO: is there a reason to claim a larger debt? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So with normal runs we claim a debt that would cover the maximum cost, because we want to be sure we're supplying enough credits for the worst case.
But here, the cost is arbitrary, and the user has to supply enough credits to cover it over whatever time period they think makes sense. So yeah, I think debt = cost here is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looking pretty good.
prototypes/ScheduledMerges.hs
Outdated
[t1, t2] -> splitEqually credits t1 t2 | ||
_ -> error $ "supplyCreditsPendingMerge: " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes fair enough. We don't need to support n-way union. And if we did we'd need a more general n-way split evenly impl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We support n-way unions it now, but that is a separate commit, otherwise I'll have to squash a lot of things or deal with the conflicts.
prototypes/ScheduledMergesTestQLS.hs
Outdated
AUnion <$> elements vars | ||
<*> elements vars) | ||
] | ||
-- TODO: only supply to tables with unions? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or using the latest QC lockstep lib the generators can inspect the model, so they can just look and see if the table contains a union or not. At least, assuming it's visible to the model. Perhaps that's what you mean about adding a flag.
b3e5d95
to
fbf574b
Compare
IncomingRun should be `ir`, MergingRun `mr`.
fbf574b
to
fccb9c0
Compare
MergingRuns will soon not only exist inside the regular levels, but also as part of a MergingTree, where the concept of MergePolicy doesn't make sense. Therefore, we move it out into the thing that holds the level's MergingRun, where it's still available where it was before, but is not required for union-related MergingRuns.
As a preparation for table unions, we need to know how many credits are left over when supplying them to a MergingRun. Also, this commit cleans up the namespace a little before new function are added.
0a8e334
to
af068b9
Compare
Lookups in unions bring some complexity, so we want this prototype to demonstrate a viable approach that is somewhat similar to the real implementation: Accumulating with a left fold, starting with the writebuffer. As opposed to the real implementation, we accumulate the result for each key individually, as opposed to a batched vector of lookup results. The principle remains the same.
This more closely follows the implementation and will allow to trace different events in depending on whether the MergingRun is used in the regular levels or the MergingTree.
Also starts tracking union debt in Table.
This statically enforces invariants on pending level merges, also making it more straight-forward to exploit when doing lookups. Additionally, it avoids having to wrap a table's runs into additional STRefs to make them MergingTrees.
We will however still supply union credits to tables with a union debt that has already been fully paid off (tree merge got completed). This is difficult to avoid, since the model doesn't know how much work it is to complete a particular tree/union merge.
af068b9
to
e9cc0e8
Compare
Tests pass, but still needs a bit of cleaning up, additional invariants and proper tracing. The lookups are currently quite naive and inefficient.