API rate limiting #33

Open
ScotterC opened this issue Sep 19, 2024 · 2 comments

Comments

@ScotterC

This could be added to the API models by interpreting response headers, or maybe as an option given to Reranker that limits the number of requests per minute.

For instance, Jina is limited to 60 RPM when not on premium. Cohere is limited to 10 RPM on a trial key and 1000 RPM on a production key.
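To illustrate the "interpreting response headers" idea, here is a minimal sketch that reads the standard HTTP Retry-After header. Whether Jina or Cohere actually send this header is an assumption, and seconds_to_wait is a hypothetical helper name, not something the library provides.

```python
from typing import Mapping


def seconds_to_wait(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Hypothetical helper: how long to pause, based on a Retry-After header."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        # Assumption: fall back to a fixed pause when the header is absent.
        return default
    try:
        # Retry-After given as a number of seconds.
        return max(0.0, float(raw))
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch does not parse it.
        return default
```

A caller could sleep for seconds_to_wait(response.headers) after a 429 response before trying again.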

@bclavie
Collaborator

bclavie commented Sep 27, 2024

This could be useful, yes!

I'd see an optional max_requests_per_minute argument when loading an API reranker, along with retries_on_failure: int and max_time_between_retries parameters, which would specify the max number of retries and the longest wait between them. Either (or both) being set would result in:

  • On hitting the max RPM: time.sleep(60 - time_spent_on_the_most_recent_max_requests_per_minute + 5), with the extra 5 seconds as a buffer
  • Automatically retry retries_on_failure times, starting with a 1s backoff and increasing to max_time_between_retries
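The two behaviours above could be sketched roughly as follows. This is only an illustration under stated assumptions: RateLimitedCaller is a hypothetical name and not part of the library, the backoff doubles on each retry (the thread only says it starts at 1s and increases), and every exception is treated as retryable. The clock and sleep functions are injectable so the logic can be tested without real waiting.

```python
import time
from collections import deque
from typing import Callable, Deque, TypeVar

T = TypeVar("T")


class RateLimitedCaller:
    """Hypothetical helper; not part of the library's API."""

    def __init__(
        self,
        max_requests_per_minute: int,
        retries_on_failure: int = 3,
        max_time_between_retries: float = 15.0,
        buffer_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_rpm = max_requests_per_minute
        self.retries = retries_on_failure
        self.max_backoff = max_time_between_retries
        self.buffer = buffer_seconds
        self.clock = clock
        self.sleep = sleep
        self.sent: Deque[float] = deque()  # timestamps of recent requests

    def _wait_for_slot(self) -> None:
        while True:
            now = self.clock()
            # Forget requests that left the 60-second window.
            while self.sent and now - self.sent[0] >= 60:
                self.sent.popleft()
            if len(self.sent) < self.max_rpm:
                break
            # Time spent on the most recent max_rpm requests.
            elapsed = now - self.sent[0]
            self.sleep(60 - elapsed + self.buffer)
        self.sent.append(self.clock())

    def call(self, fn: Callable[[], T]) -> T:
        backoff = 1.0  # first retry waits 1 second
        for attempt in range(self.retries + 1):
            self._wait_for_slot()  # retries also count against the limit
            try:
                return fn()
            except Exception:
                # Assumption: any error is retryable; real code would
                # likely catch only rate-limit or transient errors.
                if attempt == self.retries:
                    raise
                self.sleep(min(backoff, self.max_backoff))
                backoff *= 2  # assumption: exponential growth
        raise AssertionError("unreachable")
```

Usage would look something like RateLimitedCaller(60).call(lambda: send_request(payload)), where send_request stands in for whatever function performs the API call.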

These would be optional and most likely default to:

  • max_requests_per_minute: whatever we can find for a production API key for a given provider
  • retries_on_failure: 3
  • max_time_between_retries: 15
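As a rough sketch of how these defaults might be bundled, assuming max_time_between_retries is in seconds: RateLimitOptions, options_for and PRODUCTION_RPM are hypothetical names, and the only figure in the table is the Cohere production limit quoted earlier in this thread, which has not been verified here.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Per-provider production limits. Only the figure quoted in this thread is
# listed; other providers would need to be looked up.
PRODUCTION_RPM: Dict[str, int] = {"cohere": 1000}


@dataclass(frozen=True)
class RateLimitOptions:
    """Hypothetical container for the three proposed parameters."""

    max_requests_per_minute: Optional[int] = None  # None means no limiting
    retries_on_failure: int = 3
    max_time_between_retries: float = 15.0  # assumed to be seconds


def options_for(provider: str, **overrides: Any) -> RateLimitOptions:
    """Build options for a provider, letting explicit arguments win."""
    values: Dict[str, Any] = {
        "max_requests_per_minute": PRODUCTION_RPM.get(provider.lower())
    }
    values.update(overrides)
    return RateLimitOptions(**values)
```

With this shape, someone on a Cohere trial key could call options_for("cohere", max_requests_per_minute=10) to override the production default.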

Is this something you'd be interested in contributing a PR for? Otherwise I'll add it to the to-do list as a low-priority item!

@ScotterC
Author

Sorry, I've moved on from this particular need but I could imagine forking/working on it in the future if it becomes a blocker. Better off categorizing as a low-pri item on your end 👍
