x/build/cmd/coordinator: add health check and graph to track hourly remaining GitHub API rate limit #44406
Comments
Based on golang/crypto#143 (comment), this issue might be affecting more services than just GopherBot.
This is happening today too, affecting the "close cherry pick issues" task.
From GerritBot logs today:
Ticket opened with GitHub; awaiting response.
Poked GitHub. No rate limit increase is in the cards. We'll be held to 5k/hour unless we upgrade to an enterprise account (being related to Google, which has an enterprise account, doesn't seem to help us here). I think we'll need to fix the underlying issues. Edit: we'd get 15k/hour if we upgraded.
Thanks for the update. I'm working on getting a graph of our rate limit usage. Having that should give a sense of how much and how often the rate limit is being exceeded, and by how much we need to decrease our usage.
Change https://golang.org/cl/303670 mentions this issue:
Change https://golang.org/cl/303669 mentions this issue:
Replace low-level Stackdriver monitoring API usage for OpenCensus with a Stackdriver exporter. To benefit local development, expose metrics at an /metrics endpoint (to be picked up with Prometheus).

This makes it much easier to add new metrics, to test them locally, and brings our metrics solution in sync with what's currently in use in x/playground (see CL 302769).

It's expected to be preferable to migrate to OpenTelemetry in the future when a good migration path becomes available, and both x/build and x/playground can be updated at that time.

This CL is based on work in CL 229679 and CL 138522.

For golang/go#26779.
For golang/go#44406.
For golang/go#17104.

Co-authored-by: Alexander Rakoczy <[email protected]>
Co-authored-by: Emmanuel T Odeke <[email protected]>
Change-Id: Iad45730feace471db1668e828b7c9775377be8a9
Reviewed-on: https://go-review.googlesource.com/c/build/+/303669
Run-TryBot: Dmitri Shuralyov <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Trust: Dmitri Shuralyov <[email protected]>
Reviewed-by: Alexander Rakoczy <[email protected]>
Reviewed-by: Emmanuel Odeke <[email protected]>
In recent times, it has been observed that the GitHub API rate limit quota of 5000 requests per hour is being occasionally exceeded. It should be very helpful to have a graph that tracks remaining rate limit over time to better understand the current state and how much effect future code changes have on improving it.

Also add a health check to coordinator's health section that prints a warning when the GitHub rate limit is known to be exceeded. This can help when observing GopherBot or GerritBot problems: we'll be able to tell if they're likely caused by GitHub rate limit issues or if the cause must be something else.

For golang/go#44406.

Change-Id: Id75d70129a75292a6d3f9c722636a8b740ca05a1
Reviewed-on: https://go-review.googlesource.com/c/build/+/303670
Run-TryBot: Dmitri Shuralyov <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Alexander Rakoczy <[email protected]>
Trust: Dmitri Shuralyov <[email protected]>
I'm going to retitle this issue to be about adding a health check and metrics to improve visibility into this issue, and close it since it's done. We can file new issues for future improvements to reduce the amount of time the rate limit is exceeded.
Change https://golang.org/cl/308790 mentions this issue:
A good amount of time has passed since the deletedIssues map was last updated, and the "freeze old issues" task was needlessly making 34 API calls to freeze issues that are gone. After this change, that task is making 0 API calls (whenever there aren't existing issues to freeze).

Some gardening tasks were converted to be more general and run on more issue trackers in CL 233377, so update the deletedIssues map to track the repo ID in addition to the issue number.

For golang/go#28320.
Updates golang/go#22635.
Updates golang/go#44406.
Updates golang/go#39008.

Change-Id: I3b477bf717f7d97676e9ef950214a3598ec3abd2
Reviewed-on: https://go-review.googlesource.com/c/build/+/308790
Trust: Dmitri Shuralyov <[email protected]>
Reviewed-by: Carlos Amedee <[email protected]>
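The idea of keying deletedIssues by repo ID as well as issue number, as described in the commit message above, might look roughly like the sketch below. Only the map name comes from the commit message; the `issueKey` type, its field names, the `needsAPICall` helper, and the sample entries are hypothetical placeholders rather than real x/build code or real IDs.

```go
package main

import "fmt"

// issueKey identifies an issue across trackers. The type and field
// names are hypothetical, not taken from x/build.
type issueKey struct {
	RepoID int64 // repository ID
	Number int32 // issue number within that repository
}

// deletedIssues lists issues known to be gone. The entries are
// made-up placeholders, not real repo IDs or issue numbers.
var deletedIssues = map[issueKey]bool{
	{RepoID: 1, Number: 101}: true,
	{RepoID: 1, Number: 205}: true,
}

// needsAPICall reports whether a gardening task should spend an API
// request on the issue, or skip it because the issue is known to be deleted.
func needsAPICall(repoID int64, number int32) bool {
	return !deletedIssues[issueKey{RepoID: repoID, Number: number}]
}

func main() {
	fmt.Println(needsAPICall(1, 101)) // deleted in repo 1, so skipped
	fmt.Println(needsAPICall(2, 101)) // same number in another repo, still processed
}
```

With a composite key, an issue number deleted in one repository no longer masks a live issue with the same number in another, which matters once tasks run on more than one tracker.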
While investigating #44404, I noticed in GopherBot's logs that it was failing to take some actions due to exceeding the GitHub API rate limit quota:
When the rate limit is exceeded, GopherBot stops being reliable for its users, and regular maintenance tasks do not occur.
This may be related to heavy activity, or perhaps it's caused by increased deterioration of issue #28320. This issue is to keep an eye on how big of a problem it is and what we need to do here.
CC @golang/release.