Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize merge commit sync mode #54995

Open
banmoy opened this issue Jan 13, 2025 · 0 comments
Open

Optimize merge commit sync mode #54995

banmoy opened this issue Jan 13, 2025 · 0 comments
Labels
type/enhancement Make an enhancement to StarRocks

Comments

@banmoy
Copy link
Contributor

banmoy commented Jan 13, 2025

Backgroud

The response to a merge commit request has two modes

  • sync: BE adds data to the buffer, waits for the load to finish, and returns a response with the load result.
  • async: BE adds data to the buffer and returns a response immediately. It can not guarantee the load is successful. The client needs to poll the load state from FE (via HTTP _get_load_state) later.

The synchronous mode returns the final result to the client, simplifying subsequent processing on the client side and reducing the pressure on the FE HTTP server caused by polling load state. When used in conjunction with brpc, it does not block the client because both brpc client and server can handle requests asynchronously. The combination of sync mode and brpc is a good choice.

Currently, for sync mode, each request on the BE side polls FE periodically to obtain the load result (via thrift rpc getLoadTxnStatus). This approach is simple but has the following drawbacks, especially under high concurrency:

  1. Requests merged into a single transaction do not share the rpc result, leading to redundant rpc calls.
  2. The polling mechanism is inefficient: a small polling interval increases the FE rpc pressure, while a large interval reduces real-time responsiveness.

Therefore, we aim to optimize the mechanism how to get the load result on BE side.

Solution

Overall, we aim to address the issue through the following approaches:

  1. FE Actively Sends Results to Coordinator:
    After the load is completed, the FE will proactively send the result (transaction state) to the coordinator backend. This reduces unnecessary polling on the BE side and improves real-time responsiveness.

  2. Introduce a TxnStateCache Component on the BE Side:
    This component will manage transaction states and provide the following functionalities:

    • Caching FE-Sent Transaction States:
      In real-time imports, only the most recent transactions are used. An LRUCache can store a sufficient number of transactions while automatically evicting older ones.
    • Subscription Mechanism:
      Each load request can subscribe to the state of a transaction. Once the transaction state is updated, the request will be notified proactively.
    • Low-Frequency Polling for Subscribed Transactions:
      For transactions with active subscriptions, the BE will poll the FE for their state at a low frequency. This addresses scenarios where FE fails to push results (e.g., FE leader switch or crash), avoiding unnecessary waiting.

The framework is as follows:
image

FE Side TxnStateDispatcher
responsible for sending the transaction state to the coordinator BE. After the load is completed, it will submit a dispatch task to the dispatcher, including information such as the database, transaction id, and backend id. The dispatcher will make its best effort to send the state. If the retry limit is exceeded, the task will be discarded. Additionally, the dispatcher is stateless, meaning it will not continue sending previous tasks after an FE leader switch or restart. In such cases, the BE-side polling mechanism will handle resolving the transaction state.

BRPC update_transaction_load
A new brpc interface, update_transaction_state, has been added to allow the FE to send transaction states to the BE.

BE Side TxnStateCache

  • DynamicCache: an lru cache whose entries are TxnStateHandler. The handler stores the transaction state, the information of subscribers (TxnStateSubscriber), and decides whether to poll state. The cache manages the lifecycle of the handler. If reach the capacity, it will evict the oldest handler on which there is no subscriber. The transaction state will be updated when receiving txn state pushed by FE or polled by TxnStatePoller, then it notifies subscribers who are waiting on it
  • TxnStateSubscriber: each load request subscribes the state. It holds a reference to TxnStateHandler so that the lru cache will not evict the entry before the subscriber stop waiting
  • TxnStatePoller: schedule and execute the txn state poll tasks. The task will update the transaction state after finished
    • pending poll tasks: the initial poll task for a transaction is submitted to the poller when the first TxnStateSubscriber is created, and will be scheduled periodically until the transaction reaches the final state (VISIBLE/AOBRTED/UNKNOWN) or there is no subscriber
    • schedule thread: a single thread that schedules pending tasks to execute according to their execution time
    • Execute thread pool: the thread pool that execute the poll task. The task will send thrift rpc getLoadTxnStatus to FE to get the current state, and then update TxnStateHandler

You can see the overall implementation in this draft pr #54676. It will be divided into two PRs

@banmoy banmoy added the type/enhancement Make an enhancement to StarRocks label Jan 13, 2025
gengjun-git pushed a commit that referenced this issue Jan 15, 2025
…55001 (#55071)

This is the second PR of merge commit sync mode optimization #54995. Introduce TxnStateDispatcher on FE side. You can see #54995 for details

Signed-off-by: PengFei Li <[email protected]>
banmoy added a commit to banmoy/starrocks that referenced this issue Jan 15, 2025
…tarRocks#55001 (StarRocks#55071)

This is the second PR of merge commit sync mode optimization StarRocks#54995. Introduce TxnStateDispatcher on FE side. You can see StarRocks#54995 for details

Signed-off-by: PengFei Li <[email protected]>
banmoy added a commit to banmoy/starrocks that referenced this issue Jan 15, 2025
…tarRocks#55001 (StarRocks#55071)

This is the second PR of merge commit sync mode optimization StarRocks#54995. Introduce TxnStateDispatcher on FE side. You can see StarRocks#54995 for details

Signed-off-by: PengFei Li <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement Make an enhancement to StarRocks
Projects
None yet
Development

No branches or pull requests

1 participant