-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Incorporate komi into ratings (particularly for small boards) #46
Incorporate komi into ratings (particularly for small boards) #46
Conversation
For now, only Also, there are some MASSIVE small board handicaps in there, such as 8-stones on a 9x9 board, which is probably about equivalent to 52 stones on a 19x19 board. |
(I've put most updates in the bug, #45) |
Add komi integration to the ratings math (optionally). - This slightly adjusts 19x19 games, since the "first" handicap stone (dropping komi) only gives black a 1/2-rank head start. - Big changes for 9x9 and 13x13 games. The effective rank (after adjustment) is bounded between 0 and 39. This path can be turned on using: - `--compute-handicap-via-komi-small` for 9x9 and 13x13 boards - `--compute-handicap-via-komi-19x19` for 19x19 boards
This is more accurate, especially for small boards.
Use `get_handicap_rank_difference` for skipping games. This is a WIP because I've only updated analysis/analyze_glicko2_one_game_at_a_time.py, but it's a problem everywhere.
2e260f7
to
2197d9e
Compare
Did a force-push after a big rebase which cleans up the commit history substantially. |
…incorporate-komi-into-ratings And centralize the check for rank difference > 9.
After #49 lands, this will be ready as well. The code is now fairly clean. Still probably needs some iteration (and definitely more analysis). |
@anoek, I think this might be ready to merge now. Happy to split up further into multiple PRs though if you'd prefer to review pieces separately. Still a few things I want to do (add configuration options for) but I figure it can be a different PR. Here's what's in the branch (doesn't have clean commits in this order, but could rewrite this way if it's useful for reviewing):
|
(Most of the "noise" in the diff is from adding parameters to |
|
||
def get_handicap_adjustment(rating: float, handicap: int, size: int, komi: float, rules: str) -> float: | ||
rank_difference = get_handicap_rank_difference(handicap, size, komi, rules) | ||
effective_rank = min(39, max(0, rating_to_rank(rating) + rank_difference)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to ponder here, I think the reason why I didn't clamp the rank between 30k-9d for this was intentional (but not necessarily correct, that decision is certainly open to discussion).
My gut says that by clamping this we're going to introduce some unnecessary noise. There are legitimately terrible players (including purposefully terrible beginner bots), and there are some legitimately >9d "players" (think every strong KataGo bot). I call out the bots in specific because they have a whole lot of games and can actually move the needle when it comes to where a lot of players land with their ratings. While we clamp those for display to the users, that's more out of convention and UX than a good mathematical reason. I think for the purposes of doing good ratings I think it does a bit of harm to pretend like everyone is squished within those ranks.
Again, open to discussion if you have a good reason to clamp them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main reasoning here was practical, to avoid a MathDomainError (can't remember if it was in rank_to_rating
or in the subsequent ratings update calculation). I did NOT play around with the bounds of the clamping... maybe -10 (40k) to 45 would be sufficient to avoid the MathDomainError. Maybe these clamp values could be parameterized? Maybe we should depend on SkipLogic to preclude the need for this?
In terms of handling players with true ranks outside the UX, I'm not sure how reasonable results can be if the range (output) of rating_to_rank
and the domain (input) of rank_to_rating
don't match. But that's the kind of data I think I was seeing (after applying small board handicap).
Note that I saw the math coming up with effective handicap as high as 52, from an 8-stone 9x9 game early in OGS history (not sure there was only one....). (Maybe that error predated the SkipLogic
change and wouldn't be necessary now...)
BTW, IMO,:
- For
rank_to_rating
/rating_to_rank
calculations, we should choose a domain and range forranks
that match. Either we clamp for things outside of that, or don't rate games that handicaps put outside of that, or something. - For UX, we can choose a smaller set of ranks to clamp to. So, the
rating_to_rank
might have a player down as 0 or -5, but the UX would show 25k (not 30k or 35k).
Not sure how that matches what we have right now...
(Feels a bit like it would be worthwhile to consider more of these changes in isolation... maybe I should split up this PR more after all... not today though)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main reasoning here was practical, to avoid a MathDomainError (can't remember if it was in
rank_to_rating
or in the subsequent ratings update calculation).
Don't have much time today, but my initial attempts to reproduce this (by commenting out the mitigation and the new SkipLogic code) haven't been successful...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converted back to a draft for now— since I can't reproduce the MathDomainError anymore it seems better to remove the clamping for now, and if we add it back it should be parameterized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, found it, when I temporarily reverted the clamping, I inserted a bug in get_handicap_adjustment
, which is why I couldn't reproduce.
The domain error is on tallygameanalytics:
Processing OGS data
2,739,321 / 30,811,698 games processed. 192.3s remainingTraceback (most recent call last):
File "/Users/dexonsmith/Repos/online-go/goratings/analysis/analyze_glicko2_one_game_at_a_time.py", line 99, in <module>
tally.add_glicko2_analytics(analytics)
File "/Users/dexonsmith/Repos/online-go/goratings/analysis/util/TallyGameAnalytics.py", line 102, in add_glicko2_analytics
self.prediction_cost[size][speed][rank][handicap] += - math.log(result.expected_win_rate if black_won else 1 - result.expected_win_rate)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: math domain error
Anyway, I still think the options can land without fixing the MathDomainError, and then we can decide whether to fix with clamping or to update the TalylGameAnalytics to handle an expected_win_rate
of 0 (which I assume is what's causing the domain error). Sending a new PR soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, though honestly if we're hitting the domain error it's because our rating code is busted somewhere, so having it blow up isn't necessarily a bad thing (although having a more appropriate error would probably be better in an ideal world).
Fix an off-by-one bug in the value of extra moves in get_handicap_rank_difference, and rewrite the math using extra variables so it's easier to visually verify. The bug was in the calculation of the number of extra moves. The correct value is: num_extra_moves = handicap - 1 if handicap > 1 else 0 but the old code forgot the `- 1`.
Closing in favour of #51, which is a bit cleaner. |
Add komi integration to the ratings math. This path can be turned on using:
--compute-handicap-via-komi-small
for 9x9 and 13x13 boards--compute-handicap-via-komi-19x19
for 19x19 boardsThis requires passing extra info to
get_handicap_adjustment
:Unfortunately the rules don't seem to be in the database (starting out by hardcoding to
"japanese"
). Maybe we can fix that?Relates to #45