style+perf: clean-up and optimize remove_empty_byte_from_padded_bytes_unchecked fn #41
…_unchecked function
There were a bunch of warnings that some of our set fmt properties were not being run: Warning: can't set `wrap_comments = true`, unstable features are only available in nightly channel. Warning: can't set `normalize_comments = true`, unstable features are only available in nightly channel.
Getting "error: toolchain 'nightly-x86_64-unknown-linux-gnu' is not installed" on github, and don't feel like debugging. Not even sure how cargo/rust are installed. Do they come preloaded by default? This reverts commit 6e87e0a.
Note: Apologies about the large number of edits that are just formatting.... applied
src/kzg.rs
Outdated
- /// Precompute the primitive roots of unity for binary powers that divide r - 1
- /// TODO(anupsv): Move this to the constants file. Ref: https://github.com/Layr-Labs/rust-kzg-bn254/issues/31
+ /// Precompute the primitive roots of unity for binary powers that divide r
+ /// - 1 TODO(anupsv): Move this to the constants file. Ref: https://github.com/Layr-Labs/rust-kzg-bn254/issues/31
why break a line at -1
argh, that's what our linter does when run with the nightly version... think I should just revert that commit?
@anupsv we'll need to look at that linter config at some point. It seems not that great.
Reverted the `cargo +nightly fmt` commit and formatted with stable Rust instead. PTAL
wait, exactly which approach did you implement? It's confusing to read "I do have to not however that the version with iterators (the one in this PR) is faster on 32KiB inputs but (slightly) slower on 32MiB." Code-wise it looks right to me.
lgtm
Updated PR description, should have read "I do have to NOTE however"
This reverts commit ae70bf5.
This was a fun weekend. Got to learn a crap ton about Rust iterators, assembly output, godbolt, LLVM, etc.
I was just trying to make this function cleaner by rewriting it with functional-style iterators, but in doing so realized the code was then much slower (up to 7x depending on input size). With 2 small modifications, I managed to get it to write into a pre-allocated output vector and use SIMD instructions for copying, which made the code 2-7x FASTER depending on input size.
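To make the two modifications concrete, here is a minimal sketch, not the code in this PR. It assumes the function drops the first byte of every 32-byte chunk; the constant `CHUNK_SIZE` and both function names are hypothetical. The first version collects through `flat_map`, which likely cannot tell `collect` the final length up front, so the output grows as it goes. The second reserves the exact capacity and copies each chunk's payload as a slice, which gives the compiler a chance to emit a bulk (possibly SIMD) copy.

```rust
/// Assumed chunk width (hypothetical constant for this sketch).
const CHUNK_SIZE: usize = 32;

/// Naive functional version: byte-by-byte collection through `flat_map`.
fn strip_leading_bytes_naive(input: &[u8]) -> Vec<u8> {
    input
        .chunks(CHUNK_SIZE)
        .flat_map(|chunk| chunk[1..].iter().copied())
        .collect()
}

/// Pre-allocated version: exact capacity up front, slice-wise copies.
fn strip_leading_bytes_prealloc(input: &[u8]) -> Vec<u8> {
    // One byte is dropped per chunk, including a trailing partial chunk.
    let n_chunks = (input.len() + CHUNK_SIZE - 1) / CHUNK_SIZE;
    let mut out = Vec::with_capacity(input.len() - n_chunks);
    input
        .chunks(CHUNK_SIZE)
        .for_each(|chunk| out.extend_from_slice(&chunk[1..]));
    out
}
```

Both functions return the same bytes; only the allocation and copy strategy differ, so any speed difference should be measured on the target machine rather than assumed.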
Benchmarks are available in master...perf--remove-empty-byte-from-padded-bytes-fn-benchmark. Here are the results (functional_fast is the function implemented in this PR):
for 32B inputs
for 32KiB inputs
for 32MiB inputs
Note: I decided to implement the functional_fast function instead of the fast function (which contains the same logic but written without iterators), because I personally find it cleaner to read. I do have to note however that the version with iterators (the one in this PR) is faster on 32KiB inputs but (slightly) slower on 32MiB. If we ever have teams sending huge bytes in the future, we might want to implement both approaches and let them pick and choose? Or perhaps have a wrapper that dispatches to the correct implementation based on input size?
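If we went the wrapper route, it could look roughly like the sketch below. Everything here is hypothetical: the function names, the assumption that one leading byte is dropped per 32-byte chunk, and especially `LARGE_INPUT_THRESHOLD`, which is a placeholder that would have to be picked from benchmarks.

```rust
/// Assumed chunk width (hypothetical constant for this sketch).
const CHUNK_SIZE: usize = 32;
/// Placeholder cutoff, NOT a measured value; tune from benchmarks.
const LARGE_INPUT_THRESHOLD: usize = 1 << 20;

/// Number of output bytes: one byte is dropped per (possibly partial) chunk.
fn output_len(input_len: usize) -> usize {
    input_len - (input_len + CHUNK_SIZE - 1) / CHUNK_SIZE
}

/// Iterator-based variant.
fn strip_with_iterators(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(output_len(input.len()));
    input
        .chunks(CHUNK_SIZE)
        .for_each(|chunk| out.extend_from_slice(&chunk[1..]));
    out
}

/// Same logic written as an explicit index loop.
fn strip_with_loop(input: &[u8]) -> Vec<u8> {
    let mut out = vec![0u8; output_len(input.len())];
    let (mut src, mut dst) = (0, 0);
    while src < input.len() {
        let end = usize::min(src + CHUNK_SIZE, input.len());
        let payload = &input[src + 1..end]; // skip the chunk's first byte
        out[dst..dst + payload.len()].copy_from_slice(payload);
        dst += payload.len();
        src = end;
    }
    out
}

/// Wrapper that picks an implementation based on input size.
fn strip_dispatch(input: &[u8]) -> Vec<u8> {
    if input.len() >= LARGE_INPUT_THRESHOLD {
        strip_with_loop(input)
    } else {
        strip_with_iterators(input)
    }
}
```

Callers would only see `strip_dispatch`, so the cutoff could be retuned later without touching any call site.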