Optimize merge commit sync mode #54995

banmoy · 2025-01-13T06:25:42Z

Backgroud

The response to a merge commit request has two modes

sync: BE adds data to the buffer, waits for the load to finish, and returns a response with the load result.
async: BE adds data to the buffer and returns a response immediately. It can not guarantee the load is successful. The client needs to poll the load state from FE (via HTTP _get_load_state) later.

The synchronous mode returns the final result to the client, simplifying subsequent processing on the client side and reducing the pressure on the FE HTTP server caused by polling load state. When used in conjunction with brpc, it does not block the client because both brpc client and server can handle requests asynchronously. The combination of sync mode and brpc is a good choice.

Currently, for sync mode, each request on the BE side polls FE periodically to obtain the load result (via thrift rpc getLoadTxnStatus). This approach is simple but has the following drawbacks, especially under high concurrency:

Requests merged into a single transaction do not share the rpc result, leading to redundant rpc calls.
The polling mechanism is inefficient: a small polling interval increases the FE rpc pressure, while a large interval reduces real-time responsiveness.

Therefore, we aim to optimize the mechanism how to get the load result on BE side.

Solution

Overall, we aim to address the issue through the following approaches:

FE Actively Sends Results to Coordinator:
After the load is completed, the FE will proactively send the result (transaction state) to the coordinator backend. This reduces unnecessary polling on the BE side and improves real-time responsiveness.
Introduce a TxnStateCache Component on the BE Side:
This component will manage transaction states and provide the following functionalities:
- Caching FE-Sent Transaction States:
  In real-time imports, only the most recent transactions are used. An LRUCache can store a sufficient number of transactions while automatically evicting older ones.
- Subscription Mechanism:
  Each load request can subscribe to the state of a transaction. Once the transaction state is updated, the request will be notified proactively.
- Low-Frequency Polling for Subscribed Transactions:
  For transactions with active subscriptions, the BE will poll the FE for their state at a low frequency. This addresses scenarios where FE fails to push results (e.g., FE leader switch or crash), avoiding unnecessary waiting.

The framework is as follows:

FE Side TxnStateDispatcher
responsible for sending the transaction state to the coordinator BE. After the load is completed, it will submit a dispatch task to the dispatcher, including information such as the database, transaction id, and backend id. The dispatcher will make its best effort to send the state. If the retry limit is exceeded, the task will be discarded. Additionally, the dispatcher is stateless, meaning it will not continue sending previous tasks after an FE leader switch or restart. In such cases, the BE-side polling mechanism will handle resolving the transaction state.

BRPC update_transaction_load
A new brpc interface, update_transaction_state, has been added to allow the FE to send transaction states to the BE.

BE Side TxnStateCache

DynamicCache: an lru cache whose entries are TxnStateHandler. The handler stores the transaction state, the information of subscribers (TxnStateSubscriber), and decides whether to poll state. The cache manages the lifecycle of the handler. If reach the capacity, it will evict the oldest handler on which there is no subscriber. The transaction state will be updated when receiving txn state pushed by FE or polled by TxnStatePoller, then it notifies subscribers who are waiting on it
TxnStateSubscriber: each load request subscribes the state. It holds a reference to TxnStateHandler so that the lru cache will not evict the entry before the subscriber stop waiting
TxnStatePoller: schedule and execute the txn state poll tasks. The task will update the transaction state after finished
- pending poll tasks: the initial poll task for a transaction is submitted to the poller when the first TxnStateSubscriber is created, and will be scheduled periodically until the transaction reaches the final state (VISIBLE/AOBRTED/UNKNOWN) or there is no subscriber
- schedule thread: a single thread that schedules pending tasks to execute according to their execution time
- Execute thread pool: the thread pool that execute the poll task. The task will send thrift rpc getLoadTxnStatus to FE to get the current state, and then update TxnStateHandler

You can see the overall implementation in this draft pr #54676. It will be divided into two PRs

Introduce Introduce TxnStateCache on BE side [Enhancement] Introduce TxnStateCache for merge commit sync mode #55001
Introduce TxnStateDispatch on FE side [Enhancement] Introduce TxnStateDispatcher for merge commit sync mode #55001 #55071

The text was updated successfully, but these errors were encountered:

…55001 (#55071) This is the second PR of merge commit sync mode optimization #54995. Introduce TxnStateDispatcher on FE side. You can see #54995 for details Signed-off-by: PengFei Li <[email protected]>

…tarRocks#55001 (StarRocks#55071) This is the second PR of merge commit sync mode optimization StarRocks#54995. Introduce TxnStateDispatcher on FE side. You can see StarRocks#54995 for details Signed-off-by: PengFei Li <[email protected]>

banmoy added the type/enhancement Make an enhancement to StarRocks label Jan 13, 2025

This was referenced Jan 13, 2025

[Enhancement] Introduce TxnStateCache for merge commit sync mode #55001

Merged

[Enhancement] Introduce TxnStateDispatcher for merge commit sync mode #55001 #55071

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize merge commit sync mode #54995

Optimize merge commit sync mode #54995

banmoy commented Jan 13, 2025 •

edited

Loading

Optimize merge commit sync mode #54995

Optimize merge commit sync mode #54995

Comments

banmoy commented Jan 13, 2025 • edited Loading

Backgroud

Solution

banmoy commented Jan 13, 2025 •

edited

Loading