[INLONG-7056][Sort]Adjust sort resources according to data scale #10915

Closed
wants to merge 1 commit into from

PeterZh6
Contributor

Fixes #7056

Motivation

Currently, the total resources for a Flink Sort job are read from the configuration file flink-sort-plugin.properties, so every submitted sort job gets the same amount of resources. When the data scale is large, these resources may be insufficient; when it is small, they may be wasted. Sort jobs therefore need their resources adjusted dynamically according to the amount of data.

Modifications

Before a job is submitted to Flink through org.apache.inlong.manager.plugin.flink.FlinkService#submitJobBySavepoint, org.apache.inlong.manager.plugin.flink.FlinkParallelismOptimizer first queries the average data volume over the past hour and adjusts the job's parallelism based on that volume.
This feature can be switched on or off, and the maximum number of messages handled by one core can be configured, both in flink-sort-plugin.properties.
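
The adjustment described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in this PR: the class `ParallelismSketch`, the method `calculateParallelism`, and all parameter names are hypothetical, and it assumes the parallelism is the average volume divided by the per-core message limit, rounded up, and never lower than the default.

```java
/**
 * Illustrative sketch only. The names here are hypothetical and are not
 * taken from FlinkParallelismOptimizer.
 */
final class ParallelismSketch {

    private ParallelismSketch() {
    }

    /**
     * Derives a parallelism from the average data volume of the past hour.
     *
     * @param enabled            whether dynamic adjustment is switched on
     * @param averageVolume      average number of messages observed in the past hour
     * @param maxMessagesPerCore assumed per-core message limit from the properties file
     * @param defaultParallelism parallelism used when no adjustment applies
     */
    static int calculateParallelism(boolean enabled, long averageVolume,
            long maxMessagesPerCore, int defaultParallelism) {
        // Fall back to the default when the feature is off or the inputs are unusable.
        if (!enabled || maxMessagesPerCore <= 0 || averageVolume <= 0) {
            return defaultParallelism;
        }
        // Ceiling division without overflow: one core per maxMessagesPerCore messages.
        long needed = (averageVolume - 1) / maxMessagesPerCore + 1;
        // Never go below the default, and keep the result within int range.
        long bounded = Math.max(defaultParallelism, Math.min(needed, Integer.MAX_VALUE));
        return (int) bounded;
    }
}
```

Under these assumptions, `ParallelismSketch.calculateParallelism(true, 2500, 1000, 1)` returns 3, while a disabled switch or a volume below the per-core limit leaves the parallelism at the default of 1.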

Verifying this change

  • This change is a trivial rework/code cleanup without any test coverage.

  • This change is already covered by existing tests, such as:
    When creating a stream in Data Ingestion, make the source data grow steadily until it reaches a significant volume (roughly more than 2000 per hour), then resubmit the job. The parallelism of the Flink job for that stream should now be larger than the default value of 1, and the change is also recorded in the manager logs.

  • [] This change added tests and can be verified as follows:

@github-actions github-actions bot left a comment

Hello @PeterZh6, thank you for submitting a PR to InLong 💖 We will respond as soon as possible ⏳
This seems to be your first PR 🌠 Please be sure to follow our Contribution Guidelines.
If you have any questions in the meantime, you can also ask us on the InLong Discussions 🔍

@PeterZh6 PeterZh6 closed this Aug 27, 2024
Development

Successfully merging this pull request may close these issues.

[Feature][Sort] Adjust sort resources according to data scale