langchain-google-vertexai: retryable errors are not retried #7493
Comments
Hey @siviter-t! I'm here to help you with any bugs, questions, or contributions you have. Let's get this issue sorted out together.
The issue you're encountering with retryable errors not being retried in the […]
To address this, you might consider customizing the retry logic to handle the 429 error specifically. This could involve modifying the […]
If you have control over the code, you could adjust the […]
Thanks for flagging this! @jacoblee93 - Please assign this to me if you don't get to it yourself or if @siviter-t doesn't get to it first.
Thank you @afirstenberg!
In my local tests in our repo, overriding this handler and simply dropping the final throw does the trick for rate limits, and I observed the default exponential back-off taking effect between attempts:
const vertexWorkaroundRetryHandler = (error: any) => {
const status = error?.response?.status ?? 0;
if (status === 0) {
// What is this?
console.error("failedAttemptHandler", error);
}
// What errors shouldn't be retried?
if (STATUS_NO_RETRY.includes(+status)) {
throw error;
}
log.trace({ src: "vertexWorkaroundRetryHandler", error }, "Received retryable error");
};
const model = new ChatVertexAI({
model: "gemini-1.5-flash-002",
temperature: 0,
onFailedAttempt: vertexWorkaroundRetryHandler,
maxRetries: 6,
});
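To give a rough feel for what exponential back-off means with `maxRetries: 6`, here is a minimal sketch of the delay schedule. The helper `backoffDelaysMs` is hypothetical, and the base delay of 1000 ms, the factor of 2, and the absence of jitter are assumptions chosen for illustration; the values LangChain actually passes to its retry layer are not confirmed in this thread.

```typescript
// Hypothetical helper, not part of LangChain or p-retry.
// Assumptions: 1000 ms base delay, factor 2, no jitter, no upper cap.
function backoffDelaysMs(
  maxRetries: number,
  baseMs: number = 1000,
  factor: number = 2
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // The delay before retry number (attempt + 1) grows geometrically.
    delays.push(baseMs * Math.pow(factor, attempt));
  }
  return delays;
}

const delays = backoffDelaysMs(6);
const totalMs = delays.reduce((sum, d) => sum + d, 0);
console.log(delays, totalMs);
```

Under these assumptions, six retries spread over roughly a minute in total, which may or may not be enough to ride out a tokens-per-minute quota window.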
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
Description
We've encountered resource exhaustion errors with the Vertex AI integration in production lately, specifically with Gemini 1.5 Flash. This is essentially langchain-ai/langchain/issues/22241 where others have reported limits on the tokens per minute.
I would have thought these would be automatically retried with langchain's p-retry implementation. Debugging into the source, I see that the defaultFailedAttemptHandler is overridden by a failedAttemptHandler in langchain-google-common.
Although this handler tests for errors that should not be retried, it ultimately throws every error, including retryable ones such as 429 rate limits, so nothing is retried.
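The effect described above can be reproduced with a small self-contained sketch. Everything here is illustrative: `retrySync` is a simplified, synchronous stand-in for a p-retry style loop (no delays), `HttpError` and `NO_RETRY_STATUSES` are hypothetical, and neither handler is the actual code from langchain-google-common. The one behaviour it relies on is that an error thrown from the failed-attempt handler aborts all further retries.

```typescript
type FailedAttemptHandler = (error: unknown) => void;

// Hypothetical error type carrying an HTTP status code.
class HttpError extends Error {
  status: number;
  constructor(status: number) {
    super(`HTTP ${status}`);
    this.status = status;
  }
}

// Illustrative list only; not the real STATUS_NO_RETRY contents.
const NO_RETRY_STATUSES = [400, 401, 403, 404];

// Simplified retry loop: if the handler throws, the loop is abandoned.
function retrySync<T>(
  operation: () => T,
  maxRetries: number,
  onFailedAttempt: FailedAttemptHandler
): T {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return operation();
    } catch (error) {
      lastError = error;
      onFailedAttempt(error); // a throw here escapes the loop
    }
  }
  throw lastError;
}

// Mirrors the reported behaviour: checks the status, then throws anyway.
const alwaysThrows: FailedAttemptHandler = (error) => {
  const status = (error as HttpError).status ?? 0;
  if (NO_RETRY_STATUSES.includes(status)) throw error;
  throw error; // the unconditional final throw
};

// Mirrors the workaround: throws only for non-retryable statuses.
const throwsOnlyWhenFatal: FailedAttemptHandler = (error) => {
  const status = (error as HttpError).status ?? 0;
  if (NO_RETRY_STATUSES.includes(status)) throw error;
};

// Runs an operation that fails twice with 429 and then succeeds.
function countAttempts(handler: FailedAttemptHandler): number {
  let attempts = 0;
  try {
    retrySync(() => {
      attempts++;
      if (attempts < 3) throw new HttpError(429);
      return "ok";
    }, 6, handler);
  } catch {
    // Swallowed: only the attempt count matters for this demonstration.
  }
  return attempts;
}

console.log(countAttempts(alwaysThrows), countAttempts(throwsOnlyWhenFatal));
```

With the always-throwing handler the operation is attempted once and the 429 surfaces immediately; with the final throw removed, the same operation succeeds on its third attempt.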
System Info
Mac
node v22.11.0
pnpm 9.13.2
[email protected]