-
Notifications
You must be signed in to change notification settings - Fork 247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor snap-sync. #2916
Refactor snap-sync. #2916
Conversation
- add optional target block - add conditional block import
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you describe why this is needed and how it will be used? There is answer to "what?", but not "why?" right now.
Also it'd make review significantly easier if you can extract get_blocks_from_target_segment
before adding features to it, otherwise there is code moving around that makes the diff appear much larger than it is and it no longer fits the screen, so a lot of jumping back and forth trying to understand what has actually changed, also I tend to do things like last_segment_index
to target_segment_index
renaming done in a separate commit as well to further reduce visual noise.
let signed_block = decode_block::<Block>(&block_bytes) | ||
.map_err(|error| format!("Failed to decode archived block: {error}"))?; | ||
let (header, extrinsics) = signed_block.block.deconstruct(); | ||
if import_blocks_from_downloaded_segment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the use case of not doing this though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name it something like import_a_single_block
or similar? Right now ti indicates we do not import blocks, but we do, we just import one instead of all.
Here is the original snap-sync for domains @nazar-pc Domain snap-syncPrerequisites
Algorithm
|
Co-authored-by: Nazar Mokrynskyi <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above comment provides additional context, which is certainly helpful, but it answers the question "How?", it still doesn't say "Why?" or "What is the problem that we're trying to solve?".
It helps a lot if you open a PR with a problem and solution description, such that reviewer can understand what the goal is.
This is not just for me (you assumed I know everything that I need to for review), but also for auditors and community members that may review this PR.
For example recent refactoring PR #2912 didn't introduce any features at all, but explained briefly why refactoring is done and what is coming after it for some context of the changes.
What I expected in this PR is a description that looks something like this (based on my understanding after our 1:1 discussion):
This is the first step towards implementing Snap sync for domains.
One of the initial requirements for it is that we need to find the last archived block that updated last confirmed block for a particular domain and resume sync from there.
Currently Snap sync assumes we always sync first block of the last segment, but with this new requirement we need to be able to potentially purposefully sync to an older block instead and this PR updates Snap sync API to do just that.
As you can see your comment contains instructions on "How?" we do that step by step, but doesn't actually explain why would we need to do that in the first place.
This is not the first time I'm asking "Why?" and I hope example helps now that I undestand what we're trying to achieve here.
On a separate topic I thought about implementation some more and I'm not sure we need to Snap sync/import the newer block before (without notification, etc.) before syncing older block.
IIRC there should be a way to download just a subset of the state with a proof.
So potentially after you download and decode the block, you can get a state root out of it and ask peers to give you the contents of a known key and a proof that it ties back to this storage root.
I'm not sure how difficult it would be to do that, but it would avoid some additional edge-cases like when one of the future blocks is already imported (after this initial snap sync) and there is no corresponding notification about its import anymore even though domain (or something on consensus side) may actually expect consistent notifications for each block without gaps and would break in interesting ways otherwise.
let mut current_segment_index = last_segment_index; | ||
|
||
loop { | ||
if segment_header.last_archived_block().number == target_block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think this is a correct check. The fact that you have a specific last archived block number doesn't mean it is contained fully in this segment. I think this segment_header.last_archived_block().number == target_block
part should simply be removed, the actual correct check is done below with < target_block
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check would be incorrect if we have blocks spanning more than two segments. Otherwise, we will download and import an extra segment. However, your requested change is consistent with the previous discussion about the theoretical big blocks. Please, let me know if you want to change it despite the extra segments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do want to change it. It will make no difference now and will allow us to potentially avoid very tricky bugs in the future. We do not have any inherent guarantees about block size here. Also it is not like we need to add code, we need to remove a condition that makes things easier to understand, not harder (in this particular case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above comment provides additional context, which is certainly helpful, but it answers the question "How?", it still doesn't say "Why?" or "What is the problem that we're trying to solve?".
Indeed, the original PR description misses this explanation. I provided the context after our offline discussion and I will add more information next time.
So potentially after you download and decode the block, you can get a state root out of it and ask peers to give you the contents of a known key and a proof that it ties back to this storage root.
I'm not sure how difficult it would be to do that,
It makes sense as an optimization but I'm also not sure about the implementation difficulty.
...but it would avoid some additional edge-cases like when one of the future blocks is already imported (after this initial snap sync) and there is no corresponding notification about its import anymore even though domain (or something on consensus side) may actually expect consistent notifications for each block without gaps and would break in interesting ways otherwise.
This edge case must be addressed anyway to implement the "domains snap-sync" (Prerequisites section 3.2)
let mut current_segment_index = last_segment_index; | ||
|
||
loop { | ||
if segment_header.last_archived_block().number == target_block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check would be incorrect if we have blocks spanning more than two segments. Otherwise, we will download and import an extra segment. However, your requested change is consistent with the previous discussion about the theoretical big blocks. Please, let me know if you want to change it despite the extra segments.
let signed_block = decode_block::<Block>(&block_bytes) | ||
.map_err(|error| format!("Failed to decode archived block: {error}"))?; | ||
let (header, extrinsics) = signed_block.block.deconstruct(); | ||
if import_blocks_from_downloaded_segment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This edge case must be addressed anyway to implement the "domains snap-sync" (Prerequisites section 3.2)
What I meant is that with suggested way of implementing it we do not have that issue anymore because we avoid importing future block in the first place, so there will be nothing to address.
let mut current_segment_index = last_segment_index; | ||
|
||
loop { | ||
if segment_header.last_archived_block().number == target_block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do want to change it. It will make no difference now and will allow us to potentially avoid very tricky bugs in the future. We do not have any inherent guarantees about block size here. Also it is not like we need to add code, we need to remove a condition that makes things easier to understand, not harder (in this particular case).
let signed_block = decode_block::<Block>(&block_bytes) | ||
.map_err(|error| format!("Failed to decode archived block: {error}"))?; | ||
let (header, extrinsics) = signed_block.block.deconstruct(); | ||
if import_blocks_from_downloaded_segment { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name it something like import_a_single_block
or similar? Right now ti indicates we do not import blocks, but we do, we just import one instead of all.
/// Note: setting import_state_block_only will import only the block related to the state (a block | ||
/// number equal or less than target_block or the first block of the last archived segment) and | ||
/// disable importing the remaining blocks of the downloaded segment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought target_block
is target block, but this comment is saying a lower block number may be imported instead, is that really true and if so then why not importing the block that is requested as one would expect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our current code supports the case of importing the first block of the segment with a state and possibly importing the remaining blocks later. Are you suggesting to implement: 1) import blocks before the block, 2) import the state block, 3) import the remaining blocks ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not suggesting anything, I'm asking about what was actually implemented because the comment doesn't match what I expected the code would do.
5007ab7
to
43d991e
Compare
The last commit removes the inconsistent feature of skipping block import using a special flag for snap-sync (following the offline discussion). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll squash-merge
The PR contains changes related to the upcoming snap-sync for domains:
New
get_blocks_from_target_segment
calculates the target segment for download by the optional target block (None = last segment).Code contributor checklist: