-
-
Notifications
You must be signed in to change notification settings - Fork 200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(3742): improve user segmentation with BigInt-based random generation #5110
Conversation
9762ab9
to
b5337b6
Compare
b5337b6
to
c650ada
Compare
packages/remote-feature-flag-controller/src/utils/user-segmentation-utils.ts
Outdated
Show resolved
Hide resolved
packages/remote-feature-flag-controller/src/utils/user-segmentation-utils.ts
Outdated
Show resolved
Hide resolved
packages/remote-feature-flag-controller/src/utils/user-segmentation-utils.ts
Outdated
Show resolved
Hide resolved
packages/remote-feature-flag-controller/src/utils/user-segmentation-utils.ts
Outdated
Show resolved
Hide resolved
packages/remote-feature-flag-controller/src/utils/user-segmentation-utils.ts
Show resolved
Hide resolved
… 4 in uuidv4 properly
9e0b3cc
to
87cd8f5
Compare
* | ||
* For UUIDv4 format, the following normalizations are applied: | ||
* - Removes all dashes from the UUID | ||
* - Remove version (4) bits and replace with 'f' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this, I think we need to remove it rather than replace it with f
. If we replace it with f
, we're skewing the result that amount away from zero. But if we remove it completely, it won't impact the result.
* - Remove version (4) bits and replace with 'f' | |
* - Remove version (4) bits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, though I notice that the test does show that the minimum UUID does result in zero. Odd. It looks like the result would be skewed to me though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I tested it and it does skew the results, but by too small an amount. You need to use a precision of 10^15 to see a non-zero result from the minimum input, but we're only using precision of 10^6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using an implicit minimum of zero, and a maximum of 'f' * length
, perhaps we can achieve perfect distribution by updating the final calculation to something like this:
function uuidToBigInt(id: string) {
return BigInt(`0x${uuid.replace(/-/gu, '')}`);
}
const MIN_UUID_V4 = '00000000-0000-4000-8000-000000000000';
const MAX_UUID_V4 = 'ffffffff-ffff-4fff-bfff-ffffffffffff';
const MIN_UUID_V4_BIGINT = uuidToBigInt(MIN_UUI_V4);
const MAX_UUID_V4_BIGINT = uuidToBigInt(MAX_UUI_V4);
...
export function generateDeterministicRandomNumber(
metaMetricsId: string,
): number {
let idValue: BigInt;
let maxValue: BigInt;
// uuidv4 format
if (uuidValidate(metaMetricsId) && uuidVersion(metaMetricsId) === 4) {
// Adjust both values by subtracting the minimum value, so that the result isn't biased away from zero
idValue = uuidToBigInt(metaMetricsId) - MIN_UUID_V4_BIGINT;
maxId = MAX_UUID_V4_BIGINT - MIN_UUID_V4_BIGINT;
} else {
// hex format with 0x prefix
idValue = BigInt(`0x${metaMetricsId.slice(2)}`;
maxValue = BigInt(`0x${'f'.repeat(cleanId.length)}`);
}
// Use BigInt division first, then convert to number to maintain precision
return Number((value * BigInt(1_000_000)) / maxValue) / 1_000_000;
}
By adjusting the minimum and maximum for the UUIDv4 case, it ensures we use the entire range of 0-1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified the max, and it's also skewed for the same reason as the minimum, but only if you increase precision to 10^15. Not bad! Close enough that we won't notice, it's well under a percentage.
The suggestion I left here would give us a perfect distribution (JS rounding aside; it could get slightly better with bignumber.js
and/or more precision). But what you have here is good enough I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion ! Converting uuid to hex value and then normalizing the UUID range to start from 0 instead of removing the bits within is a much better solution indeed! Adapted in 8724cc3
packages/remote-feature-flag-controller/src/utils/user-segmentation-utils.test.ts
Show resolved
Hide resolved
packages/remote-feature-flag-controller/src/utils/user-segmentation-utils.test.ts
Show resolved
Hide resolved
packages/remote-feature-flag-controller/src/utils/user-segmentation-utils.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Two pending non-blocking suggestions, one to improve the distribution (we're biased slightly away from 0 and 1 right now for UUIDv4 IDs), and the second to add a guard against unexpected IDs.
a9a7672
to
a66b49f
Compare
a66b49f
to
3ae4dd8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
## Explanation - fix(3742): improve user segmentation with BigInt-based random generation ([#5110](#5110)) - fix(3742): change getMetaMetricsId to only sync func type ([#5108](#5108)) <!-- Thanks for your contribution! Take a moment to answer these questions so that reviewers have the information they need to properly understand your changes: * What is the current state of things and why does it need to change? * What is the solution your changes offer and how does it work? * Are there any changes whose purpose might not obvious to those unfamiliar with the domain? * If your primary goal was to update one package but you found you had to update another one along the way, why did you do so? * If you had to upgrade a dependency, why did you do so? --> ## References <!-- Are there any issues that this pull request is tied to? Are there other links that reviewers should consult to understand these changes better? Are there client or consumer pull requests to adopt any breaking changes? For example: * Fixes #12345 * Related to #67890 --> ## Changelog <!-- If you're making any consumer-facing changes, list those changes here as if you were updating a changelog, using the template below as a guide. (CATEGORY is one of BREAKING, ADDED, CHANGED, DEPRECATED, REMOVED, or FIXED. For security-related issues, follow the Security Advisory process.) Please take care to name the exact pieces of the API you've added or changed (e.g. types, interfaces, functions, or methods). If there are any breaking changes, make sure to offer a solution for consumers to follow once they upgrade to the changes. Finally, if you're only making changes to development scripts or tests, you may replace the template below with "None". --> ## Checklist - [ ] I've updated the test suite for new or updated code as appropriate - [x] I've updated documentation (JSDoc, Markdown, etc.) for new or updated code as appropriate - [x] I've highlighted breaking changes using the "BREAKING" category above as appropriate - [x] I've prepared draft pull requests for clients and consumer packages to resolve any breaking changes
Explanation
Replace hash-based random number generation with BigInt-based implementation for better distribution and format support. The new implementation properly handles both UUIDv4 and hex-format metaMetricsIds, providing more consistent
and reliable user segmentation.
References
Addresses: #5051 (comment)
Changelog
@metamask/remote-feature-flag-controller
generateDeterministicRandomNumber
to handle both uuidv4(mobile new) and hex(mobile old and extension) sideChecklist