[Enhancement] adaptive enable the exchange compression #54956

Merged

@murphyatwork murphyatwork commented Jan 10, 2025

Why I'm doing:

What I'm doing:

Change the default value of the variable transmission_compression_type from NO_COMPRESSION to AUTO, which means:

  1. Compression will be automatically enabled if it is determined to be beneficial.
  2. The current compression codec used is LZ4.

Rationale:

  1. Profitability in Network-Intensive Workloads:
    For workloads that involve transferring large amounts of data, such as shuffling, network bandwidth can become a bottleneck. In such cases, data compression can significantly improve performance.
  2. Complementing Existing Encoding:
    While StarRocks already employs data encoding techniques to reduce data size, there are specific scenarios where additional compression can further optimize data transmission.
  3. Adaptive Strategy Using Thompson Sampling:
    The effectiveness of compression depends on factors such as data type, data distribution, and network utilization, so an adaptive strategy is preferable to a hard-coded one. This PR uses Thompson Sampling, a multi-armed bandit algorithm that picks an action by sampling from its current estimate of each action's reward and updates those estimates from observed outcomes.
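
To make the Thompson Sampling idea concrete, here is a minimal sketch and not the code from this PR. It assumes two actions (send uncompressed, or compress) and a binary reward ("this choice paid off" or not); the class name `CompressionBandit`, the reward definition, and the Beta(1, 1) priors are all hypothetical, and the actual strategy in be/src/serde/compress_strategy.cpp may measure reward differently.

```java
import java.util.Random;

// Hypothetical sketch of Thompson Sampling with two arms:
// arm 0 = send uncompressed, arm 1 = compress before sending.
// Each arm keeps a Beta(alpha, beta) posterior over its success probability.
class CompressionBandit {
    private final int[] alpha = {1, 1}; // 1 + observed successes per arm (uniform prior)
    private final int[] beta = {1, 1};  // 1 + observed failures per arm
    private final Random rng;

    CompressionBandit(long seed) {
        this.rng = new Random(seed);
    }

    // Gamma(shape, 1) for an integer shape, as a sum of `shape` Exp(1) draws.
    // Cost grows with the number of observations; acceptable for a sketch.
    private double sampleGamma(int shape) {
        double sum = 0.0;
        for (int i = 0; i < shape; i++) {
            sum += -Math.log(1.0 - rng.nextDouble()); // 1 - U lies in (0, 1]
        }
        return sum;
    }

    // Beta(a, b) = X / (X + Y) with X ~ Gamma(a, 1) and Y ~ Gamma(b, 1).
    private double sampleBeta(int a, int b) {
        double x = sampleGamma(a);
        double y = sampleGamma(b);
        return x / (x + y);
    }

    // One decision: draw a plausible success rate for each arm, pick the larger.
    int chooseArm() {
        double uncompressed = sampleBeta(alpha[0], beta[0]);
        double compressed = sampleBeta(alpha[1], beta[1]);
        return compressed > uncompressed ? 1 : 0;
    }

    // Feed back the observed outcome for the arm that was used.
    void update(int arm, boolean success) {
        if (success) {
            alpha[arm]++;
        } else {
            beta[arm]++;
        }
    }
}
```

A caller would invoke `chooseArm()` before each batch and `update(arm, success)` afterwards. Because each decision is a random draw from the posterior, an arm that currently looks worse is still tried occasionally, which lets the choice shift if data or network conditions change.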

Evaluation on TPCDS:

| Iteration | AUTO (seconds) | LZ4_FRAME (seconds) | NO_COMPRESSION (seconds) |
| --- | --- | --- | --- |
| 5 | 203.286 | 203.281 | 197.032 |
| 6 | 190.394 | 198.979 | 194.756 |
| 7 | 192.274 | 199.043 | 197.174 |
| 8 | 190.508 | 203.807 | 196.597 |
| 9 | 188.213 | 205.744 | 202.426 |
| 10 | 199.782 | 209.569 | 201.487 |
| Grand Total | 1164.457 | 1220.423 | 1189.472 |
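
As a quick check of the totals above, the following sketch sums the per-iteration timings and computes how much total time AUTO saves relative to the other two settings. The numbers are copied from the evaluation table; the class and method names are illustrative only.

```java
// Recomputes the grand totals and relative savings from the table above.
class TpcdsTotals {
    // Seconds per iteration (iterations 5 through 10), copied from the table.
    static final double[] AUTO = {203.286, 190.394, 192.274, 190.508, 188.213, 199.782};
    static final double[] LZ4_FRAME = {203.281, 198.979, 199.043, 203.807, 205.744, 209.569};
    static final double[] NO_COMPRESSION = {197.032, 194.756, 197.174, 196.597, 202.426, 201.487};

    static double sum(double[] xs) {
        double total = 0.0;
        for (double x : xs) {
            total += x;
        }
        return total;
    }

    // Percentage by which `candidate` is lower than `baseline`.
    static double reductionPercent(double baseline, double candidate) {
        return 100.0 * (baseline - candidate) / baseline;
    }

    public static void main(String[] args) {
        double auto = sum(AUTO);
        double lz4 = sum(LZ4_FRAME);
        double none = sum(NO_COMPRESSION);
        System.out.printf("AUTO vs NO_COMPRESSION: %.2f%% less total time%n",
                reductionPercent(none, auto));
        System.out.printf("AUTO vs LZ4_FRAME: %.2f%% less total time%n",
                reductionPercent(lz4, auto));
    }
}
```

Over these six iterations, AUTO's total is about 2.1% lower than NO_COMPRESSION and about 4.6% lower than LZ4_FRAME, while always-on LZ4_FRAME is slower than no compression at all.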

Fixes #issue

@wanpengfei-git wanpengfei-git requested a review from a team January 10, 2025 10:26
@murphyatwork murphyatwork force-pushed the murphy_opt_exchange_compress branch from bc15ba8 to 4bd1e16 Compare January 10, 2025 10:30
@murphyatwork murphyatwork enabled auto-merge (squash) January 10, 2025 12:38

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

[FE Incremental Coverage Report]

pass : 2 / 2 (100.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 com/starrocks/qe/SessionVariable.java | 1 | 1 | 100.00% | [] |
| 🔵 com/starrocks/common/util/CompressionUtils.java | 1 | 1 | 100.00% | [] |

[BE Incremental Coverage Report]

pass : 40 / 44 (90.91%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 be/src/serde/encode_context.cpp | 2 | 4 | 50.00% | [54, 55] |
| 🔵 be/src/exec/pipeline/exchange/exchange_sink_operator.cpp | 19 | 21 | 90.48% | [406, 745] |
| 🔵 be/src/util/runtime_profile.h | 1 | 1 | 100.00% | [] |
| 🔵 be/src/util/compression/block_compression.cpp | 1 | 1 | 100.00% | [] |
| 🔵 be/src/serde/compress_strategy.cpp | 17 | 17 | 100.00% | [] |

  @VariableMgr.VarAttr(name = TRANSMISSION_COMPRESSION_TYPE)
- private String transmissionCompressionType = "NO_COMPRESSION";
+ private String transmissionCompressionType = "AUTO";

Contributor

Will this be a compatibility issue if the FE is upgraded to the newer version while the BE is still on the old version, especially during the upgrade?

Contributor Author

Yes. It's a session variable, so the behavior on the BE is driven entirely by the FE.

Contributor

Does that mean that during the upgrade the FE sends "AUTO" to the BE, and the BE doesn't recognize it and fails the request? Or does the BE just fall back to the no_compression behavior?

Contributor Author

The best practice is to upgrade the BE first.

Contributor

It isn't wise to count on a "best practice" that doesn't exist in any doc.

Contributor Author

Well, actually it's a common practice in StarRocks; otherwise adding new functionality to the codebase would become overcomplicated. The common practice is to use a session variable to control the behavior, so the new feature is not enabled until the FE is upgraded.

@murphyatwork murphyatwork merged commit b92f139 into StarRocks:main Jan 13, 2025
71 of 73 checks passed

@Mergifyio backport branch-3.4

@github-actions github-actions bot removed the 3.4 label Jan 13, 2025

mergify bot commented Jan 13, 2025

backport branch-3.4

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request Jan 13, 2025