x/build/cmd/coordinator: add health check and graph to track hourly remaining GitHub API rate limit #44406

Closed
dmitshur opened this issue Feb 19, 2021 · 10 comments
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.

@dmitshur
Contributor

dmitshur commented Feb 19, 2021

While investigating #44404, I noticed in GopherBot's logs that it was failing to take some actions because it had exceeded the GitHub API rate limit quota:

$ kubectl logs -f gopherbot-deployment-6c6d86d5b9-s88c8 | grep "API rate limit"
[...]
2021/02/19 00:19:47 cl2issue: GET https://api.github.com/repos/golang/go/issues/44295/comments?per_page=1000&since=2021-02-19T00%3A14%3A00Z: 403 API rate limit of 5000 still exceeded until 2021-02-19 00:47:55 +0000 UTC, not making remote request. [rate reset in 28m08s]
[...]

When the rate limit is exceeded, GopherBot stops being reliable for its users, and regular maintenance tasks do not occur.

This may be related to heavy activity, or perhaps it's caused by increased deterioration of issue #28320. This issue is to keep an eye on how big of a problem it is and what we need to do about it.

CC @golang/release.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 19, 2021
@dmitshur dmitshur added this to the Unreleased milestone Feb 19, 2021
@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label Feb 19, 2021
@dmitshur dmitshur changed the title from "x/build/cmd/gopherbot: GopherBot may exceed rate limit" to "x/build/cmd/gopherbot: may exceed GitHub API rate limit" Feb 19, 2021
@dmitshur
Contributor Author

Based on golang/crypto#143 (comment), this issue might be affecting more services than just GopherBot.

@dmitshur
Contributor Author

dmitshur commented Mar 1, 2021

This is happening today too, affecting the "close cherry pick issues" task.

2021/03/01 20:54:36 close cherry pick issues: GET https://api.github.com/repos/golang/go/issues/44464/comments?per_page=1000&since=2021-02-26T10%3A29%3A53Z: 403 API rate limit of 5000 still exceeded until 2021-03-01 20:56:13 +0000 UTC, not making remote request. [rate reset in 1m43s]

@dmitshur
Contributor Author

From GerritBot logs today:

2021/03/15 17:47:59 getFullPR(ctx, "golang", "website", 40): b.githubClient.Do: GET https://api.github.com/repos/golang/website/pulls/40: 403 API rate limit of 5000 still exceeded until 2021-03-15 18:05:45 +0000 UTC, not making remote request. [rate reset in 17m45s]

@toothrot toothrot assigned toothrot and jeremyfaller and unassigned toothrot Mar 16, 2021
@jeremyfaller
Contributor

Ticket opened with GitHub; awaiting response.

@jeremyfaller
Contributor

jeremyfaller commented Mar 17, 2021

Poked GitHub. No rate limit increase is in the cards. We'll be held to 5k/hour unless we upgrade to an enterprise account (being related to Google, which is an enterprise account, doesn't seem to help us here). I think we'll need to fix the underlying issues.

edit: We'd get 15k/hour if we upgraded.

@dmitshur
Contributor Author

dmitshur commented Mar 17, 2021

Thanks for the update.

I'm working on getting a graph of our rate limit usage. Having that should help us get a sense of how far and how often the rate limit is being exceeded, and by how much we need to reduce our usage.

@dmitshur dmitshur self-assigned this Mar 19, 2021
@gopherbot
Contributor

Change https://golang.org/cl/303670 mentions this issue: cmd/coordinator: add health check for GitHub API quota

@gopherbot
Contributor

Change https://golang.org/cl/303669 mentions this issue: cmd/coordinator: migrate to OpenCensus for metrics

gopherbot pushed a commit to golang/build that referenced this issue Mar 23, 2021
Replace low-level Stackdriver monitoring API usage for OpenCensus
with a Stackdriver exporter. To benefit local development, expose
metrics at a /metrics endpoint (to be picked up with Prometheus).

This makes it much easier to add new metrics, to test them locally,
and brings our metrics solution in sync with what's currently in
use in x/playground (see CL 302769). It's expected to be preferable
to migrate to OpenTelemetry in the future when a good migration path
becomes available, and both x/build and x/playground can be updated
at that time.

This CL is based on work in CL 229679 and CL 138522.

For golang/go#26779.
For golang/go#44406.
For golang/go#17104.

Co-authored-by: Alexander Rakoczy <[email protected]>
Co-authored-by: Emmanuel T Odeke <[email protected]>
Change-Id: Iad45730feace471db1668e828b7c9775377be8a9
Reviewed-on: https://go-review.googlesource.com/c/build/+/303669
Run-TryBot: Dmitri Shuralyov <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Trust: Dmitri Shuralyov <[email protected]>
Reviewed-by: Alexander Rakoczy <[email protected]>
Reviewed-by: Emmanuel Odeke <[email protected]>
gopherbot pushed a commit to golang/build that referenced this issue Mar 23, 2021
In recent times, it has been observed that the GitHub API rate limit
quota of 5000 requests per hour is being occasionally exceeded.

It should be very helpful to have a graph that tracks remaining rate
limit over time to better understand the current state and how much
effect future code changes have on improving it.

Also add a health check to coordinator's health section that prints
a warning when the GitHub rate limit is known to be exceeded. This
can help when observing GopherBot or GerritBot problems: we'll be
able to tell if they're likely caused by GitHub rate limit issues
or if the cause must be something else.

For golang/go#44406.

Change-Id: Id75d70129a75292a6d3f9c722636a8b740ca05a1
Reviewed-on: https://go-review.googlesource.com/c/build/+/303670
Run-TryBot: Dmitri Shuralyov <[email protected]>
TryBot-Result: Go Bot <[email protected]>
Reviewed-by: Alexander Rakoczy <[email protected]>
Trust: Dmitri Shuralyov <[email protected]>
@dmitshur
Contributor Author

dmitshur commented Apr 6, 2021

I'm going to retitle this issue to be about adding a health check and metrics to improve visibility into this issue, and close it since it's done.

We can file new issues for future improvements to reduce the amount of time the rate limit is exceeded.

@dmitshur dmitshur changed the title from "x/build/cmd/gopherbot: may exceed GitHub API rate limit" to "x/build/cmd/coordinator: add health check and graph to track hourly remaining GitHub API rate limit" Apr 6, 2021
@dmitshur dmitshur closed this as completed Apr 6, 2021
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 6, 2021
@gopherbot
Contributor

Change https://golang.org/cl/308790 mentions this issue: cmd/gopherbot: add more deleted issues to deletedIssues map

gopherbot pushed a commit to golang/build that referenced this issue Apr 9, 2021
A good amount of time has passed since the deletedIssues map was last
updated, and the "freeze old issues" task was needlessly making 34 API
calls to freeze issues that are gone. After this change, that task is
making 0 API calls (whenever there aren't existing issues to freeze).

Some gardening tasks were converted to be more general and run on more
issue trackers in CL 233377, so update the deletedIssues map to track
the repo ID in addition to the issue number.

For golang/go#28320.
Updates golang/go#22635.
Updates golang/go#44406.
Updates golang/go#39008.

Change-Id: I3b477bf717f7d97676e9ef950214a3598ec3abd2
Reviewed-on: https://go-review.googlesource.com/c/build/+/308790
Trust: Dmitri Shuralyov <[email protected]>
Reviewed-by: Carlos Amedee <[email protected]>
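
The idea in the commit message above, tracking deleted issues by repo ID as well as issue number so that a task can skip them without making API calls, can be sketched with a struct-keyed map. The names `issueKey`, `exampleRepoID`, and `issuesToFreeze`, and all of the IDs and issue numbers, are made up for illustration and are not taken from gopherbot.

```go
package main

import "fmt"

// issueKey identifies an issue across issue trackers. An issue number
// alone is ambiguous once tasks run on more than one repository.
type issueKey struct {
	RepoID int64
	Num    int32
}

// exampleRepoID is a placeholder, not a real GitHub repository ID.
const exampleRepoID = 1

// deletedIssues lists issues known to be gone, so that no API calls
// are spent on them. The entries here are invented sample data.
var deletedIssues = map[issueKey]bool{
	{exampleRepoID, 101}: true,
	{exampleRepoID, 202}: true,
}

// issuesToFreeze filters out issues that are known to be deleted.
func issuesToFreeze(repoID int64, candidates []int32) []int32 {
	var out []int32
	for _, n := range candidates {
		if deletedIssues[issueKey{repoID, n}] {
			continue // deleted: skip without calling the API
		}
		out = append(out, n)
	}
	return out
}

func main() {
	fmt.Println(issuesToFreeze(exampleRepoID, []int32{100, 101, 202, 300}))
	// The same issue number in a different repository is not skipped.
	fmt.Println(issuesToFreeze(2, []int32{101}))
}
```

Because Go structs with comparable fields can be used directly as map keys, the composite key needs no string concatenation or nested maps.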
@golang golang locked and limited conversation to collaborators Apr 9, 2022
@heschi heschi moved this to Done in Go Release Sep 27, 2022
4 participants