provide "slow ramp time" to protect new app instances from overloading #463

Open
metskem opened this issue Jan 23, 2025 · 2 comments

@metskem

metskem commented Jan 23, 2025

Proposed Change

As a developer, I would like a "slow ramp time" for app instances that have just become healthy. The behaviour would be the same as the F5 slow ramp time.
The gorouter should know the uptime of individual app instances. For instances that have just become healthy, it should not send their full share of the load immediately, but instead increase the request rate gradually over a period specified by slow_ramp_time.

This should improve the survivability of new app instances that need to warm up first (JIT-compile code, warm up backend connection pools, etc.) before they can provide acceptable response times.

Acceptance criteria

A common scenario these days is the following:

We have a Java app that runs 10 instances by default. The app uses the App Autoscaler to increase the number of instances during high-load periods.
However, as soon as additional app instances become healthy, they receive their full share of the load, which they can barely handle, if at all, because the Java code needs to be JIT-compiled first, connection pools need to be initialized, and so on. This results in excessive response times and outages for some customers, or the instance becomes completely unresponsive and CF kills it because the health check times out.
As a result, some teams decide not to use dynamic scaling and instead deploy the maximum number of instances all the time, which adds to the costs.

A similar scenario occurs when an instance crashes (for whatever reason): it is automatically restarted but keeps crashing once the full load comes in again. Only drastically increasing the health-check timeout helps (but that requires a full redeployment of the app).

If the gorouter could gradually increase the load on new instances over the given ramp time (roughly 10-30 seconds), we would expect these new instances to survive and to provide better average response times and/or fewer slow responses.
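
As an illustration of what "gradually increasing the load" could mean for instance selection, the following Go sketch picks an instance with probability proportional to a per-instance weight. It assumes each instance already has a weight between 0 and 1 (1.0 once fully ramped). The function `pickWeighted` is hypothetical, and weighted random selection is only one possible approach, not how gorouter balances load today.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickWeighted returns the index of the instance to route to, chosen with
// probability proportional to its weight. r must be in [0, 1), for example
// from rand.Float64(); passing it in keeps the function deterministic.
// Hypothetical helper for illustration only.
func pickWeighted(weights []float64, r float64) int {
	total := 0.0
	for _, w := range weights {
		total += w
	}
	target := r * total
	for i, w := range weights {
		target -= w
		if target < 0 {
			return i
		}
	}
	// Fallback for floating-point rounding at the upper edge.
	return len(weights) - 1
}

func main() {
	// Ten warmed-up instances and one new instance a quarter of the way
	// through its ramp (the last entry).
	weights := []float64{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.25}
	hits := make([]int, len(weights))
	const n = 100000
	for i := 0; i < n; i++ {
		hits[pickWeighted(weights, rand.Float64())]++
	}
	fmt.Printf("share of new instance: %.3f\n", float64(hits[len(hits)-1])/n)
}
```

In the scenario above, the new instance would receive roughly 0.25/10.25 of the requests instead of 1/11, and its share would grow as its weight approaches 1.0.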

It could be implemented as a regular gorouter (boshrelease) configuration, or (even better) as a per-route option.

Related links

No response

@peanball
Contributor

Playing devil's advocate a little here, this sounds like a workaround for an issue in the app.

Please note that you can also use some of the existing Gorouter features to implement something like this on your end.

Gorouter will transparently retry requests that were rejected by the backend on another backend.

And finally, there is now the ready check in addition to the health check. When an app reports as ready, its route will be registered. You could use that to mark your app ready and not-ready in increasing intervals while it starts up.

@metskem
Author

metskem commented Jan 24, 2025

Thanks for the quick response.
Although I agree that the app itself can arrange these things, we think it's better not to bother all our developers with it. They should focus on writing functional (business) code, and preferably the platform should handle these kinds of things.
The suggestion about the readiness check is a good one. I will suggest that to them.
