Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move the tracking of unsplittable tablets to metadata table #4317

Merged
merged 17 commits into from
Mar 8, 2024

Conversation

cshannon
Copy link
Contributor

@cshannon cshannon commented Feb 28, 2024

This adds a new column to store information for tracking unsplittable tablets in the metadata table instead of in memory. This information can be used by the tablet management iterator to know if a tablet needs to split by checking the column and comparing it to the current metadata and avoid unnecessarily trying to split a tablet that can't be split. The data stored includes a hash of the file set and the relevant config related to splits and if this changes then the iterator will try and split again.

The metadata is stored as Json and the configs included include split threshold, table max end row size, and max open files as well as the file set hash. If a split happens on the tablet in the future (because the metadata changed such a the file set and the iterator attempted to split again) the column will be cleaned up by fate in UpdateTablets repo.

For the schema, I wasn't sure of the best approach so I just went ahead and created a new Split column family and unspilttable column qualifier. I thought it might be useful in case we wanted to store more things in the future related to splits in the new column family, however I can easily change the schema if desired or store the new metadata as a column part of an existing column family or just make it a stand alone column family/qualifier like the UserCompactionRequested or Merged column families.

I marked this as a draft for now as it still needs more tests (only one test was modified so far) to be added and some other clean up work plus as I said I wasn't sure if the schema was the best approach.

This closes #4177

This adds a new column to store information for tracking unsplittable
tablets in the metadata table instead of in memory. This information
can be used by the tablet management iterator to know if a tablet needs
to split and avoid unnecessarily trying to split a tablet that can't be
split. The data stored includes a hash of the file set and the relevant
config related to splits and if this changes then the iterator will try
and split again.
@cshannon cshannon self-assigned this Feb 28, 2024
Copy link
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized there may be cases where the unsplittable marker could just hang around forever, even when its no longer applicable. Something like the following

  • A tablet needs to split, but cannot so the new unsplittable column is written
  • A compaction happens that filters a lot of the tablets data. The tablet is no longer a candidate for split because it shrunk.

In the case above the unsplittable marker does not get removed. It does not cause any harm. The only problem with this case is if someone is actively scanning the metadata table looking for problematic tablets then they will find false postives.

@cshannon cshannon marked this pull request as ready for review March 2, 2024 18:20
@cshannon
Copy link
Contributor Author

cshannon commented Mar 2, 2024

@keith-turner - This is ready for another review, I realized I had to add a recheck in the FindSplits repo to see if we still are above the split threshold if no split points are found and the marker already exists. This is because the tablet management iterator will now try and split anytime there is a metadata change for an unsplittable tablet and it's possible we no longer need to split (like after a compaction) and we need to clear the marker. Before this change the marker was only cleared after a split was done.

I also added tests for TabletManagement and constraints as well as a couple ITs. There is one IT that will test that the marker is set and then that it is cleared after splits run successfully and a second test that will test that the marker is set and then cleared if we no longer need to split such as the split threshold increasing but no split was done.

@cshannon
Copy link
Contributor Author

cshannon commented Mar 2, 2024

One thing to mention is now that FindSplits fate repo has to recheck if a split is needed to clear the marker, it requires having the same logic for checking the threshold in multiple spots so I am wondering if we should consolidate that. For example, now SplitUtils, FindSplits, and TabletManagementIterator all do checks on the split threshold and it seems error prone if we change the logic in the future and one spot is updated and not the others.

Also, all of those spots now need to compute the size by iterating over the files so I was wondering if we should move this into TabletMetadata class itself so it could be cached and re-used. Currently I updated the SplitUtils findSplits() method to now take the estimatedSize as an argument so it's not computed more than once in FindSplits but that seems a bit odd. I was wondering if maybe we should add a getter that can compute the size and cache it (we could use Suppliers.memoize()) and then if any calling code needs to get the size of the files it could just call something like tabletMetadata.getFileSize(). The supplier could default to just returning 0 if files are not loaded.

I was also trying to think if we could simplify having to recheck the threshold in the condition of no splits found and the marker exists. SplitUtils findSplits() is already called before that point and already does the threshold check (as noted above) but it doesn't return why it found no splits. Another option is to return a reason from that method for finding no splits (besides the empty set currently) as if the reason returned was the threshold wasn't met then we could avoid having to recheck.

The only other thing was I was trying to think of if there are more test cases we should write.

Copy link
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still reviewing changes, posting comment made so far.

@keith-turner
Copy link
Contributor

I was wondering if maybe we should add a getter that can compute the size and cache it (we could use Suppliers.memoize()) and then if any calling code needs to get the size of the files it could just call something like tabletMetadata.getFileSize(). The supplier could default to just returning 0 if files are not loaded.

That sounds nice, could throw an exception instead of return zero if files were not fetched.

@keith-turner
Copy link
Contributor

One thing to mention is now that FindSplits fate repo has to recheck if a split is needed to clear the marker, it requires having the same logic for checking the threshold in multiple spots so I am wondering if we should consolidate that. For example, now SplitUtils, FindSplits, and TabletManagementIterator all do checks on the split threshold and it seems error prone if we change the logic in the future and one spot is updated and not the others.

Consolidating that decision logic to a single class would be really nice and less error prone.

Copy link
Contributor

@keith-turner keith-turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finished review everything.

@cshannon
Copy link
Contributor Author

cshannon commented Mar 8, 2024

@keith-turner - looks like your new test in #4323 caused this PR to break...which is good news as that was the intent. I will update this PR to add the column to the test

[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   UpdateTabletsTest.checkColumns:100 The split code does not know how to handle UNSPLITTABLE ==> expected: <true> but was: <false>
[ERROR]   UpdateTabletsTest.testManyColumns:363 
  Unexpected method call TabletMetadata.getUnSplittable():
[INFO] 
[ERROR] Tests run: 56, Failures: 2, Errors: 0, Skipped: 0

@cshannon cshannon merged commit 490a463 into apache:elasticity Mar 8, 2024
8 checks passed
@cshannon cshannon deleted the accumulo-4177 branch March 8, 2024 22:35
@ctubbsii ctubbsii added this to the 4.0.0 milestone Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants