
langchain-google-vertexai: retryable errors are not retried #7493

Open
siviter-t opened this issue Jan 9, 2025 · 4 comments
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

siviter-t commented Jan 9, 2025

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

import { ChatVertexAI } from "@langchain/google-vertexai";
import { faker } from "@faker-js/faker";

const llm = new ChatVertexAI({
  model: "gemini-1.5-flash-002",
  temperature: 0,
  maxRetries: 2,
});

// Attempt to exceed the suggested 32k tokens/min context limit
// 100 tokens is ~60-80 English words for Gemini
// https://ai.google.dev/gemini-api/docs/tokens?lang=node
// https://github.com/langchain-ai/langchain/issues/22241
const manyTokens = faker.lorem.words({ min: 60, max: 80 }).repeat(1280); // More than 32k, fewer than 128k tokens

const aiMsg = await llm.invoke([
  [
    "system",
    "You are a helpful assistant that summarises the user's content",
  ],
  ["human", manyTokens],
]);

Error Message and Stack Trace (if applicable)

{
  "code": 429,
  "errors": [
    {
      "message": "Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/error-code-429 for more details.",
      "domain": "global",
      "reason": "rateLimitExceeded"
    }
  ]
}

Description

We've recently encountered resource-exhaustion errors with the Vertex AI integration in production, specifically with Gemini 1.5 Flash. This is essentially langchain-ai/langchain/issues/22241, where others have reported tokens-per-minute limits.

I would have expected these errors to be retried automatically by LangChain's p-retry implementation. Debugging into the source, I see that the defaultFailedAttemptHandler is overridden by a failedAttemptHandler in langchain-google-common. Although that handler checks for errors that should not be retried, it ultimately rethrows all errors, so nothing is ever retried.
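For context on why rethrowing disables retries: p-retry stops as soon as its onFailedAttempt callback throws. Below is a minimal model of that contract (a sketch of the semantics, not LangChain's or p-retry's actual source):

```typescript
// Minimal model of p-retry's contract: if onFailedAttempt throws, the
// whole operation fails immediately; if it returns, another attempt runs.
async function retryLike<T>(
  fn: () => Promise<T>,
  onFailedAttempt: (error: unknown) => void,
  retries: number
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      onFailedAttempt(error); // a throw here aborts all remaining retries
      if (attempt > retries) throw error;
    }
  }
}

// A handler that rethrows everything, as the current failedAttemptHandler does:
const rethrowAll = (error: unknown) => { throw error; };

let calls = 0;
try {
  await retryLike(async () => { calls++; throw new Error("429"); }, rethrowAll, 2);
} catch {
  console.log(`attempts made: ${calls}`); // 1 attempt, despite retries = 2
}
```

This matches the behaviour described above: with a handler that rethrows unconditionally, the configured maxRetries never comes into play.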

System Info

Mac
node v22.11.0
pnpm 9.13.2
[email protected]


dosubot bot commented Jan 9, 2025

Hey @siviter-t! I'm here to help you with any bugs, questions, or contributions you have. Let's get this issue sorted out together.

The issue you're encountering with retryable errors not being retried in the langchain-google-vertexai integration is related to how the failedAttemptHandler in langchain-google-common is implemented. This handler checks for specific error conditions and throws errors that should not be retried. However, it appears that it ultimately throws all errors, which might be why your rate limit exceeded error (HTTP 429) is not being retried.

The failedAttemptHandler specifically checks for certain HTTP status codes and error conditions to determine if an error should not be retried. These include status codes like 400, 401, 402, 403, 404, 405, 406, 407, and 409, as well as specific error messages and codes such as "Cancel", "AbortError", and "ECONNABORTED" [1][2].

To address this, you might consider customizing the retry logic to handle the 429 error specifically. This could involve modifying the failedAttemptHandler to allow retries for the 429 status code or implementing a custom retry strategy that handles this case.

If you have control over the code, you could adjust the failedAttemptHandler to not throw the 429 error, allowing the p-retry mechanism to retry the request. Alternatively, you could implement a custom retry logic that specifically handles the 429 error by catching it and retrying the request manually.
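As an illustration of the second suggestion, here is a hypothetical wrapper (retryOn429 is not a LangChain API) that retries only 429 responses with exponential backoff:

```typescript
// Hypothetical helper (not part of LangChain): retry a call when the
// thrown error carries HTTP status 429, with exponential backoff.
async function retryOn429<T>(
  fn: () => Promise<T>,
  maxRetries = 6,
  baseDelayMs = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      const status = error?.response?.status ?? error?.code;
      if (status !== 429 || attempt >= maxRetries) throw error;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}

// Usage (sketch): const aiMsg = await retryOn429(() => llm.invoke(messages));
```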




@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Jan 9, 2025
@afirstenberg
Contributor

Thanks for flagging this!

@jacoblee93 - Please assign this to me if you don't get to it yourself or if @siviter-t doesn't get to it first. {:
I also want to check whether the retries use exponential backoff (they didn't the last time I looked at this).
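For reference, p-retry delegates its timing to the retry package; assuming that package's documented defaults (factor 2, minTimeout 1000 ms, no randomization), the delay schedule can be sketched as:

```typescript
// Backoff schedule per the retry package's documented defaults
// (which p-retry wraps): delay(n) = min(maxTimeout, minTimeout * factor ** (n - 1))
function backoffDelayMs(
  attempt: number, // 1-based retry number
  factor = 2,
  minTimeout = 1000,
  maxTimeout = Infinity
): number {
  return Math.min(maxTimeout, minTimeout * factor ** (attempt - 1));
}

console.log([1, 2, 3, 4].map((n) => backoffDelayMs(n)));
// → [ 1000, 2000, 4000, 8000 ]
```

Whether LangChain passes non-default backoff options through to p-retry is exactly what would need checking here.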

@jacoblee93
Collaborator

Thank you @afirstenberg!

@siviter-t
Author

siviter-t commented Jan 10, 2025

In my local tests in our repo, overriding this handler and simply dropping the final throw does the trick for rate limits, and I observed that it uses p-retry's default exponential-backoff config, though that looks finicky to test in Jest. I'm unsure whether this is a sufficient fix and whether other classes of error ought to be ignored. Happy to file a PR for this when I next find a chance.

// STATUS_NO_RETRY: the non-retryable status codes, mirroring the list
// checked in langchain-google-common's failedAttemptHandler.
const vertexWorkaroundRetryHandler = (error: any) => {
  const status = error?.response?.status ?? 0;

  if (status === 0) {
    // What is this? No HTTP status on the error; surface it for investigation.
    console.error("failedAttemptHandler", error);
  }

  // What errors shouldn't be retried? Rethrow only those.
  if (STATUS_NO_RETRY.includes(+status)) {
    throw error;
  }

  // `log` is our application logger; anything reaching here gets retried.
  log.trace({ src: "vertexWorkaroundRetryHandler", error }, "Received retryable error");
};

const model = new ChatVertexAI({
  model: "gemini-1.5-flash-002",
  temperature: 0,
  onFailedAttempt: vertexWorkaroundRetryHandler,
  maxRetries: 6,
});
