Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create more efficient diffs #140

Closed
wants to merge 6 commits into from
Closed

Conversation

heifner
Copy link
Member

@heifner heifner commented May 14, 2024

Use a more efficient diff for finalizer_policy and proposer_policy.

Resolves #5

@heifner heifner requested a review from linh2931 May 14, 2024 18:57
@heifner heifner added the OCI Work exclusive to OCI team label May 14, 2024
@heifner heifner changed the title Gh 5 ordered diff with move Create move efficient diffs May 14, 2024
size_t s = 0;
size_t t = 0;

/// Generate diff_result that when `apply_diff(orig, diff_result)` will modify source to be equal to target.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't modify source any more.

@@ -24,87 +34,138 @@ template <typename T, typename SizeType = size_t, template<typename Y, typename.
requires std::equality_comparable<T> && std::random_access_iterator<typename Container<T>::iterator>
class ordered_diff {
public:

/// All indexes are in reference to the target. Apply in order: remove, insert, move.
/// remove_indexes are in reverse order so they can be applied directly, see apply_diff.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can give a simple example here, if not too much trouble.

auto targ_sorted = sort_input(target);

size_t orig_index = 0;
size_t cur_index = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not named targ_index?

result.remove_indexes.emplace_back(orig_sorted[orig_index].second);
++orig_index;
} else if (cmp == std::weak_ordering::greater) {
assert(targ_sorted[cur_index].second <= std::numeric_limits<SizeType>::max());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not strictly <?

} else {
cmp = (orig_sorted[orig_index].first <=> targ_sorted[cur_index].first);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add comment to each of the cmp ==

}
}

std::ranges::sort(result.remove_indexes, [](auto&& lhs, auto&& rhs) { return lhs > rhs; }); // descending
std::ranges::sort(result.insert_indexes, [](auto&& lhs, auto&& rhs) { return lhs.first < rhs.first; });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do insert_indexes and move_indexes need to be sorted?

/// @return the modified container now equal to original target
/// @param diff the diff_result created from diff(orig, target), apply_diff(std::move(orig), diff_result) => target
/// @param container the orig of diff(orig, target) to modify using the diff_result to produce the target
/// @return the modified container now equal to target
template <typename X>
requires std::same_as<std::decay_t<X>, diff_result>
static Container<T> apply_diff(Container<T>&& container, X&& diff) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should container be sorted first?

It is also better to be named original or orig.

for (SizeType index : diff.remove_indexes) {
container.erase(container.begin() + index + offset);
--offset;
// Remove from the container based on diff.remove_indexes which is descending order
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possible way to do apply_diff is:

  1. Have a fresh target.
  2. Skip every index in diff.remove_indexes
  3. Insert every value in diff.insert_indexes into the target
  4. Move value in diff.move_indexes to right location in target.

This way the erases/copies can be avoided, and seems less code.

std::ranges::sort(result.insert_indexes, [](auto&& lhs, auto&& rhs) { return lhs.first < rhs.first; });
std::ranges::sort(result.move_indexes, [](auto&& lhs, auto&& rhs) { return lhs.first < rhs.first; });

// update move indexes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section of updating move indexes is quite complicated. Can it somehow be avoided by tweaking other parts of the algorithm?

@@ -24,87 +34,138 @@ template <typename T, typename SizeType = size_t, template<typename Y, typename.
requires std::equality_comparable<T> && std::random_access_iterator<typename Container<T>::iterator>
class ordered_diff {
public:

/// All indexes are in reference to the target. Apply in order: remove, insert, move.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't remove_indexes for original?

@ericpassmore
Copy link
Contributor

Note:start
group: IF
category: PERFORMANCE
summary: Implement a more efficient diff for finalizer_policy and proposer_policy.
Note:end

@heifner heifner marked this pull request as draft May 17, 2024 16:51
@greg7mdp greg7mdp changed the title Create move efficient diffs Create more efficient diffs Jun 13, 2024
@heifner
Copy link
Member Author

heifner commented Aug 6, 2024

Not going to make release. May re-visit for a future consensus release.

@heifner heifner closed this Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCI Work exclusive to OCI team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use diffs for proposer finalizer policy and proposer policy in block_header extension
4 participants