Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SERVER-98272 Create remediation for if the timestamp embedded in the bucket's ID does not match the control.min timestamp #144

Open
wants to merge 22 commits into
base: master
Choose a base branch
from

Conversation

seristof
Copy link
Collaborator

@seristof seristof commented Jan 8, 2025

Tested on the following cases:

  1. HELP ticket case repro
  2. validate_timeseries_id_timestamp.js
  3. Checking that it doesn't overwrite extended range timestamps (where it is a known issue and it is ok for the timestamp embedded in the bucket's ID does not match the control.min timestamp; it will not cause an warning/error on validation versions with SERVER-97441, so a description was added about how this script should only be run on versions that have validation with SERVER-97441)

Additional improvements:

  • Added some minor changes to documentation for recently-added bucket version mismatch to clarify the difference in errors vs. warnings in validate for v8.1+ and prior to v8.1 respectively.
  • Changed the bucket version mismatch and mixed schema remediation to use {background: true} during validation.

print(
'Validating that there are no buckets that have a mismatched embedded bucket id timestamp and control.min timestamp ...\n');
db.getMongo().setReadPref('secondaryPreferred');
const validateRes = coll.validate();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this use {background:true} for the validation step? Considering this will run on live customer clusters and will take time to execute.

}

print(
'\nScript successfully fixed have a mismatched embedded bucket id timestamp and control.min timestamp!');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove have here (I think). This log line also implies (semantically) that it only fixed a single bucket.

Copy link

@4hp-4int 4hp-4int left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, the script worked for me, although I had difficulty validating exactly what changed. My feedback for the script at a high level would be to make the logged output more meaningful, and slightly less noisy if possible. Perhaps only logging when we detect an actual mismatch.

embedded-timestamp-mismatch/README.md Outdated Show resolved Hide resolved
Comment on lines 44 to 46
// b) Insert the measurements back into the collection. These will go into
// new buckets.
// c) Delete the mismatched embedded bucket id timestamp and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] If we update the wording in the README, we should also update it here.

@seristof
Copy link
Collaborator Author

seristof commented Jan 9, 2025

Overall, the script worked for me, although I had difficulty validating exactly what changed. My feedback for the script at a high level would be to make the logged output more meaningful, and slightly less noisy if possible. Perhaps only logging when we detect an actual mismatch.

Thanks for the feedback! Deleted some redundant log lines and changed the validate to be with {background: true} for mixed-schema, embedded-timestamp, and bucket verson mismatch (and tested that these all returned the same results).

@seristof seristof requested a review from 4hp-4int January 9, 2025 17:36
const oidTimestamp = bucket._id.getTimestamp();
const controlMinTimestamp = bucket.control.min.t

if (oidTimestamp != controlMinTimestamp) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should take extended range dates into account. We shouldn't reinsert all data with extended range.

Copy link
Collaborator Author

@seristof seristof Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and tested using extended range docs from timeseries_filter_extended_range.js and ensuring that they don't get overwritten.

// Prior to v8.1, buckets that have a mismatched embedded bucket id timestamp
// and control.min timestamp will lead to a warning during validation.
//
if (validateRes.errors.length != 0 || validateRes.warnings.length != 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we compare the actual error/warning output? I don't want unrelated warnings to report that there still exist buckets with mismatched _id

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added checking for the correct log id for bucket version mismatch as well.

Copy link

@DamianWasilewicz DamianWasilewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script itself looks good to me! Nice work Stephanie


// If this collection has extended-range measurements, we cannot assert that
// the minTimestamp matches the embedded timestamp.
print(controlMinTimestamp == oidTimestamp)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to keep this log in for each measurement, should we maybe add more detail to it? More-so than just True/False.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this was a debugging print I forgot to remove 😅 . Thanks for pointing it out!

@seristof seristof requested a review from henrikedin January 10, 2025 12:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants