-
Notifications
You must be signed in to change notification settings - Fork 448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
limits memory and cpu used by compaction reservation request #5185
Conversation
Added threads pools to execute compaction reservation request in order to limit memory and cpu used by executing reservations. Request queued up for the pool could still potentially use a lot of memory. Did two things to control memory of things in the queue. First only allow a compactor process to have one reservation processing at time. Second made the data related to a resevation request a soft reference which should allow it be garbage collected if memory gets low while it sitting in the queue. Once the request starts executing it obtains a strong refrence to the data so it can no longer be garbage collected. fixes apache#5177
protected CompactionMetadata reserveCompaction(CompactionJobQueues.MetaJob metaJob, | ||
String compactorAddress, ExternalCompactionId externalCompactionId) { | ||
|
||
if (activeCompactorReservationRequest.contains(compactorAddress)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to think if there would be a race condition here if the thread pool was configured to be greater than 1 to handle multiple requests. I don't think there is because the same compactor should not be sending in multiple requests at the exact same time, so if there was multiple attempts and that leads to one of the Preconditions.checkState() checks on the add/remove failing then that is probably a good thing as it means there is an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah if something really unusual is going on that Precondition check would catch it. Under normal case of compactor retrying the first check will catch it. It would probably be good add some info to the preconditions check in case it does fire.
// Use a soft reference for this in case free memory gets low while this is sitting in the queue | ||
// waiting to process. This object can contain the tablets list of files and if there are lots | ||
// of tablet with lots of files then that could start to cause memory problems. | ||
private final SoftReference<CompactionJobQueues.MetaJob> metaJobRef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a neat way to help with the memory issue. I haven't looked but I am curious if there's some way to get statistics from the JVM to find out how often this reference is being freed that could be useful to monitor this. It may not be necessary though as the just watching total memory usage is probably good enough as this reference should only get cleared when low in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was trying to deal with a problem I have been seeing while testing using the FlakyBulkBatchWriter in accumulo-testing that will sometimes submit a large numbers of files for a tablet. This can cause lots of memory pressure on the manager when lots of tablet have lots of files and the manager is trying to keep a lot of that in memory. For this case its really hard to reason about what will be on the thread pool queue and how much memory it is using. So decided to use a soft refernce for now, but I am uncertain about it. Opened #5188 and added comment pointing to that issue in this PR. I think if #5188 were implemented that this soft reference could be removed. Would still want to restrict compactors from having multiple request queued/running. I think if compactors can only have one thing queued for reservation and the queued data is only compaction jobs and no tablet metadata, then the memory usage would be unlikely to cause a problem.
Added threads pools to execute compaction reservation request in order to limit memory and cpu used by executing reservations. Request queued up for the pool could still potentially use a lot of memory. Did two things to control memory of things in the queue. First only allow a compactor process to have one reservation processing at time. Second made the data related to a resevation request a soft reference which should allow it be garbage collected if memory gets low while it sitting in the queue. Once the request starts executing it obtains a strong refrence to the data so it can no longer be garbage collected.
fixes #5177