feat(webhook): webhook waiting for persistency. #20164

Open
wants to merge 25 commits into
base: main

Conversation

KeXiangWang
Contributor

@KeXiangWang KeXiangWang commented Jan 15, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR does two things:

  1. Introduces a new interface on task_service called fast_insert (perhaps it should be called lightweight_insert).
    With this interface:
    a. RW can insert DML data directly into tables, skipping the complicated optimization and batch execution processes, which is more efficient.
    b. Callers can wait for persistency: if enabled, the call returns only after the data has been committed to the persistence layer. Specifically, it waits until the epoch in which the insert happened has been committed in Hummock.
  2. Refactors the webhook implementation on top of this new interface.
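The wait-for-persistency behaviour in 1.b can be pictured with a small, self-contained model. This is only a sketch under stated assumptions, not the code in this PR: `CommitTracker`, `wait_committed`, the in-memory `buffer`, and the signature of `fast_insert` below are all hypothetical stand-ins, and a plain `Condvar` replaces whatever notification mechanism the real implementation uses.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the storage layer's commit progress.
struct CommitTracker {
    committed_epoch: Mutex<u64>,
    advanced: Condvar,
}

impl CommitTracker {
    fn new() -> Self {
        Self { committed_epoch: Mutex::new(0), advanced: Condvar::new() }
    }

    /// Called by the (simulated) storage layer once `epoch` is durable.
    fn commit(&self, epoch: u64) {
        let mut committed = self.committed_epoch.lock().unwrap();
        if epoch > *committed {
            *committed = epoch;
        }
        self.advanced.notify_all();
    }

    /// Blocks until `epoch` has been committed, then returns the committed epoch observed.
    fn wait_committed(&self, epoch: u64) -> u64 {
        let guard = self.committed_epoch.lock().unwrap();
        let guard = self.advanced.wait_while(guard, |c| *c < epoch).unwrap();
        *guard
    }
}

/// Hypothetical fast insert: buffers the row under `insert_epoch` and,
/// if requested, returns only after that epoch has been committed.
fn fast_insert(
    tracker: &CommitTracker,
    buffer: &Mutex<Vec<(u64, String)>>,
    insert_epoch: u64,
    row: &str,
    wait_for_persistence: bool,
) -> Option<u64> {
    buffer.lock().unwrap().push((insert_epoch, row.to_string()));
    if wait_for_persistence {
        Some(tracker.wait_committed(insert_epoch))
    } else {
        None
    }
}

fn main() {
    let tracker = Arc::new(CommitTracker::new());
    let buffer = Mutex::new(Vec::new());

    // Simulated committer: makes epoch 1 durable after a short delay.
    let committer = {
        let tracker = Arc::clone(&tracker);
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(20));
            tracker.commit(1);
        })
    };

    // Without waiting, the call returns immediately.
    assert_eq!(fast_insert(&tracker, &buffer, 1, "a", false), None);
    // With waiting, the call returns only once epoch 1 is committed.
    assert_eq!(fast_insert(&tracker, &buffer, 1, "b", true), Some(1));
    committer.join().unwrap();
}
```

The point of the model is the contract: with waiting enabled, a successful return implies the insert's epoch is already durable, which is what a webhook caller needs before acknowledging the request.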

One ongoing refactor: with FastInsertContext saved in a map, we no longer need to create a new SessionImpl for every request. After discussing with @st1page, this is paused for now.

Two known bugs to fix:

  1. The dummy session leads to a memory leak.
  2. Tables with a webhook should not be allowed to add columns via ALTER TABLE.

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

  • My PR needs documentation updates.
Release note

Contributor

@github-actions github-actions bot left a comment

license-eye has checked 5561 files.

| Valid | Invalid | Ignored | Fixed |
| --- | --- | --- | --- |
| 2346 | 3 | 3212 | 0 |
Click to see the invalid file list
  • src/batch/src/executor/fast_insert.rs
  • src/frontend/src/scheduler/distributed/fast_insert.rs
  • src/frontend/src/scheduler/fast_insert.rs
Use this command to fix any missing license headers:

```bash
docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix
```

src/batch/src/executor/fast_insert.rs (outdated, resolved)
src/frontend/src/scheduler/distributed/fast_insert.rs (outdated, resolved)
src/frontend/src/scheduler/fast_insert.rs (outdated, resolved)
@KeXiangWang KeXiangWang marked this pull request as ready for review January 21, 2025 23:48
@KeXiangWang KeXiangWang requested a review from a team as a code owner January 21, 2025 23:48
@KeXiangWang KeXiangWang requested a review from lmatz January 21, 2025 23:48

gru-agent bot commented Jan 22, 2025

This pull request has been modified. If you want me to regenerate unit test for any of the files related, please find the file in "Files Changed" tab and add a comment @gru-agent. (The github "Comment on this file" feature is in the upper right corner of each file in "Files Changed" tab.)

@KeXiangWang KeXiangWang requested a review from st1page January 22, 2025 01:01
@st1page st1page requested a review from chenzl25 January 22, 2025 02:21
@fuyufjh fuyufjh self-requested a review January 22, 2025 02:48
// Can be any address; we use the port of meta to indicate that it's an internal request.
let dummy_addr = Address::Tcp(SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), 5691));

// FIXME(kexiang): the dummy_session can lead to memory leakage
Member

I think we should get rid of Session in this fast_insert approach. Any reason that blocks us from doing so?

Contributor

+1. Can we bypass creating a session to get the catalog_reader? I think the catalog is already inside the SESSION_MANAGER's FrontendEnv.

Contributor Author

Updated
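
The suggestion in this thread, reading the catalog from a shared environment instead of building a session per request, can be sketched roughly as follows. Every name here (`Catalog`, `Env`, `register_table`, `resolve_table`) is hypothetical and chosen for illustration; none of it is the actual frontend API, and the catalog is reduced to a name-to-id map.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical catalog: maps a table name to a table id.
#[derive(Default)]
struct Catalog {
    tables: HashMap<String, u32>,
}

/// Hypothetical process-wide environment that owns the shared catalog.
#[derive(Clone, Default)]
struct Env {
    catalog: Arc<RwLock<Catalog>>,
}

impl Env {
    /// Registers a table in the shared catalog.
    fn register_table(&self, name: &str, id: u32) {
        self.catalog.write().unwrap().tables.insert(name.to_string(), id);
    }

    /// Resolves a table id directly from the shared catalog; no session is created.
    fn resolve_table(&self, name: &str) -> Option<u32> {
        self.catalog.read().unwrap().tables.get(name).copied()
    }
}

fn main() {
    let env = Env::default();
    env.register_table("webhook_events", 42);

    // A request handler only needs a cheap clone of the environment.
    let handler_env = env.clone();
    assert_eq!(handler_env.resolve_table("webhook_events"), Some(42));
    assert_eq!(handler_env.resolve_table("missing"), None);
}
```

Because the handler holds only an `Arc` clone, nothing session-shaped is allocated per request, which is the property that would sidestep the dummy-session leak noted in the FIXME.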

Comment on lines 198 to 199
// A special insert node for non-pgwire insert, not really a batch node.
message FastInsertNode {
Member

not really a batch node.

Yeah, I recommend flattening these fields into message FastInsertRequest.

Contributor Author

Done

3 participants