[feat] Make online and offline memory use vectors of pages instead of hashmaps #1224

Merged
merged 17 commits into main from feat/offline-memory-paged-vec
Jan 22, 2025

Contributor

@Golovanov399 Golovanov399 commented Jan 16, 2025

This resolves INT-2985 and INT-2984.

Instead of an FxHashMap for memory, we now use a new structure, PagedVec, which is a Vec<Option<Vec<T>>>. Memory is split into pages of fixed size, and a whole page is created the first time any element in it is touched. This should give us faster access to elements.
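A minimal sketch of the idea, for illustration only. The type name `PagedVecSketch` and its methods are hypothetical and are not the API added by this PR; the sketch only assumes the `Vec<Option<Vec<T>>>` layout and the allocate-on-first-touch behavior described above.

```rust
/// Illustrative paged vector: a page slot stays `None` until one of its
/// elements is written, at which point the whole page is allocated.
#[derive(Debug, Clone)]
pub struct PagedVecSketch<T, const PAGE_SIZE: usize> {
    pages: Vec<Option<Vec<T>>>,
}

impl<T: Clone + Default, const PAGE_SIZE: usize> PagedVecSketch<T, PAGE_SIZE> {
    /// Reserves `num_pages` page slots without allocating any page.
    pub fn new(num_pages: usize) -> Self {
        Self { pages: vec![None; num_pages] }
    }

    /// Returns the element at `index`, or `None` if its page is untouched
    /// or the index is past the last page.
    pub fn get(&self, index: usize) -> Option<&T> {
        let page = self.pages.get(index / PAGE_SIZE)?.as_ref()?;
        page.get(index % PAGE_SIZE)
    }

    /// Writes `value` at `index`, allocating the page on first touch.
    /// Returns the previous value (`T::default()` for a fresh page).
    /// Panics if `index` is past the last page.
    pub fn set(&mut self, index: usize, value: T) -> T {
        let page = self.pages[index / PAGE_SIZE]
            .get_or_insert_with(|| vec![T::default(); PAGE_SIZE]);
        std::mem::replace(&mut page[index % PAGE_SIZE], value)
    }

    /// Number of pages that have actually been allocated.
    pub fn allocated_pages(&self) -> usize {
        self.pages.iter().filter(|p| p.is_some()).count()
    }
}
```

An access is two index computations and two array lookups with no hashing, which is where the expected speedup over a hash map would come from. The cost is that touching a single element allocates `PAGE_SIZE` elements.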

In practice, the gain depends on the page size, the locality of the queries, and probably other factors. For regex, execution time and trace generation time both dropped by 12%, but results for the other benchmarks differ.

Among other dependent changes, the default as_height is now 3 instead of 29. Otherwise we would have to create half a billion (2^29) PagedVecs every time, and we don't need more than 8 address spaces anyway.
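The arithmetic behind that change, as a small sketch. It assumes one paged vector per address space and `2^as_height` address spaces; the helper name is hypothetical.

```rust
/// Number of address spaces representable with the given `as_height`
/// (assumed to be 2^as_height).
pub fn num_address_spaces(as_height: u32) -> u64 {
    1u64 << as_height
}
```

With `as_height = 29` this gives 536,870,912 address spaces, hence the "half a billion" figure; with `as_height = 3` it gives 8.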

@Golovanov399 Golovanov399 force-pushed the feat/offline-memory-paged-vec branch from 6ca9442 to 5070149 Compare January 16, 2025 22:21

@Golovanov399 Golovanov399 changed the title from "Feat/offline memory paged vec" to "[feat] Make online and offline memory use vectors of pages instead of a hashmap" Jan 17, 2025
@Golovanov399 Golovanov399 changed the title from "[feat] Make online and offline memory use vectors of pages instead of a hashmap" to "[feat] Make online and offline memory use vectors of pages instead of hashmaps" Jan 17, 2025

crates/vm/src/system/memory/controller/mod.rs (outdated)
crates/vm/src/system/memory/controller/mod.rs (outdated)
crates/vm/src/system/memory/merkle/tests/mod.rs (outdated)
let label = pointer / CHUNK as u32;
assert!(address_space - as_offset < (1 << as_height));
assert!(label < (1 << address_height));
if initial_memory.get(&(address_space, pointer)) != Some(value) {
assert!(pointer < ((CHUNK << address_height).div_ceil(PAGE_SIZE) * PAGE_SIZE) as u32);
Contributor

Isn't the bound just pointer < (CHUNK << address_height)? The boundary chip won't support pointers outside this range.

Contributor Author

Unless the pointer is in the last page, which was padded to PAGE_SIZE elements and can therefore extend past the end of the intended memory.

Contributor

But that would still be an invalid pointer?

Contributor Author

Why?

Contributor

If address_height is 27 and CHUNK is 8, then the max addressable cell is 2^30 - 1. If I understand correctly, this assertion might allow me to address into 2^30 or higher, if the PAGE_SIZE doesn't evenly divide 2^30. From the perspective of PagedVec, that's a fine index. But from the perspective of memory, that's not a valid pointer.

Or what am I missing?

Contributor Author

Yes, but how does it break anything here?

Contributor

This is a test, so I'm not really that concerned. But the point of the assertion is to check that pointer is a valid pointer. Valid pointers are in the range 0..(CHUNK << address_height) and, in particular, have nothing to do with the PAGE_SIZE of the underlying PagedVec.
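The two bounds discussed in this thread can be compared directly. The sketch below uses hypothetical helper names and takes the constants as parameters; the page size of 3000 in the note afterwards is a made-up value chosen only because it does not divide 2^30.

```rust
/// Exclusive upper bound on valid memory pointers: CHUNK << address_height.
pub fn memory_bound(chunk: u64, address_height: u32) -> u64 {
    chunk << address_height
}

/// The same bound rounded up to a whole number of pages, as in the
/// assertion under review.
pub fn padded_bound(chunk: u64, address_height: u32, page_size: u64) -> u64 {
    (chunk << address_height).div_ceil(page_size) * page_size
}
```

With `CHUNK = 8` and `address_height = 27`, `memory_bound` is 2^30. If the page size divides 2^30 (for example 4096), `padded_bound` is also 2^30 and the two assertions agree. With a page size of 3000, `padded_bound` is 1,073,742,000, so the padded assertion would accept pointers from 2^30 up to 1,073,741,999 that are not valid memory addresses.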

crates/vm/src/system/memory/mod.rs
  let block = self
      .block_data
-     .entry((address_space, pointer + i))
-     .or_insert_with(|| Self::initial_block_data(pointer + i, self.initial_block_size));
+     .get_mut(&(address_space, pointer + i))
Contributor

Ideally we would do get_mut on line 336 so we don't need to do it here, but I guess this would be problematic for the borrow checker. We should find a way to do this without so many accesses though.

crates/vm/src/system/memory/offline.rs (outdated)
crates/vm/src/system/memory/paged_vec.rs (outdated)
let result = page[range.start - page_start..range.end - page_start].to_vec();
for (j, value) in range.zip(values.into_iter()) {
page[j - page_start] = value.clone();
}
Contributor

Can't we do copy_from_slice or something?
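A sketch of what that suggestion could look like for a write that stays inside one page. The function names are hypothetical, and the code assumes the new values are available as a slice whose length fits in the page.

```rust
/// Overwrites `page[start..start + values.len()]` with `values` and
/// returns the old contents. Requires `T: Copy`, so the write is a
/// single block copy.
pub fn replace_in_page<T: Copy>(page: &mut [T], start: usize, values: &[T]) -> Vec<T> {
    let target = &mut page[start..start + values.len()];
    let old = target.to_vec();
    target.copy_from_slice(values);
    old
}

/// Same operation for types that are only `Clone`.
pub fn replace_in_page_cloned<T: Clone>(page: &mut [T], start: usize, values: &[T]) -> Vec<T> {
    let target = &mut page[start..start + values.len()];
    let old = target.to_vec();
    target.clone_from_slice(values);
    old
}
```

Both `copy_from_slice` and `clone_from_slice` panic if the two slices differ in length, which the slicing above rules out. Whether this is measurably faster than the element-wise loop would need to be benchmarked.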

crates/vm/src/system/memory/paged_vec.rs (outdated)

Contributor

@jonathanpwang jonathanpwang left a comment

LGTM, but I'd like us to make get_range more performant (or what I feel like is the more performant implementation) before merging
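The review does not spell out which implementation is meant. One plausible direction, sketched here under that assumption, is to copy a range page by page with slice operations instead of looking up each index separately. The free function below is hypothetical and works on a bare `Vec<Option<Vec<T>>>`; it assumes `page_size > 0`, that every allocated page has exactly `page_size` elements, and that untouched pages read as `T::default()`.

```rust
use std::ops::Range;

/// Reads `range` from paged storage, copying one contiguous slice per page.
pub fn get_range<T: Copy + Default>(
    pages: &[Option<Vec<T>>],
    page_size: usize,
    range: Range<usize>,
) -> Vec<T> {
    let mut out = Vec::with_capacity(range.len());
    let mut pos = range.start;
    while pos < range.end {
        let page_idx = pos / page_size;
        let offset = pos % page_size;
        // Elements to take from this page: up to the page end or the range end.
        let len = (page_size - offset).min(range.end - pos);
        match pages.get(page_idx).and_then(|p| p.as_ref()) {
            // Allocated page: one block copy.
            Some(page) => out.extend_from_slice(&page[offset..offset + len]),
            // Untouched page: fill with default values.
            None => out.resize(out.len() + len, T::default()),
        }
        pos += len;
    }
    out
}
```

The loop runs once per page touched by the range rather than once per element, so a range that fits in a single page costs one page lookup and one copy.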

| group | app.proof_time_ms | app.cycles | app.cells_used | leaf.proof_time_ms | leaf.cycles | leaf.cells_used |
| --- | --- | --- | --- | --- | --- | --- |
| verify_fibair | (+51 [+2.2%]) 2,321 | 511,919 | 19,310,859 | - | - | - |
| fibonacci_program | (+13 [+0.2%]) 6,025 | 1,500,137 | 51,487,838 | - | - | - |
| regex_program | (-1167 [-6.2%]) 17,760 | 4,190,904 | 165,010,909 | - | - | - |
| ecrecover_program | 2,583 | 285,401 | (-17264 [-0.1%]) 15,075,033 | - | - | - |

Commit: 8ce871d

Benchmark Workflow

@Golovanov399 Golovanov399 merged commit acdb0e2 into main Jan 22, 2025
22 checks passed
@Golovanov399 Golovanov399 deleted the feat/offline-memory-paged-vec branch January 22, 2025 17:42
3 participants