Langgraph agent is too slow #2920
Comments
Thanks for sharing! At first glance, this looks like LLM provider latency rather than LangGraph code latency. Why do you think it's caused by LangGraph? Happy to dig in further if you have more information. From OP's description, the agent makes at least three LLM calls: the agent call requesting the tool, the LLM call within the tool, and the LLM call that produces the response. To speed things up, you typically want to decrease the number of LLM calls if possible, speed up each LLM call (by reducing the amount of context passed to the LLM, or by using a faster/cheaper LLM if the task is simple enough), or, if the situation permits, parallelize work. |
Will close in a few days if no one else provides information indicating that the latency is indeed related to LangGraph. |
In long conversations, where the history keeps getting longer, I can see that only the final step takes by far the most time. It's not a normal response time. |
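To make the "reduce the amount of context" suggestion concrete, here is a minimal sketch in plain Python. It is not LangGraph or LangChain API: the `trim_history` name, the `role`/`content` dict layout, and the character budget are all assumptions for illustration (a real setup would more likely budget by tokens). It keeps the first system message plus as many of the most recent messages as fit.

```python
from typing import Dict, List

# Hypothetical message shape for illustration: {"role": ..., "content": ...}
Message = Dict[str, str]


def trim_history(messages: List[Message], max_chars: int = 4000) -> List[Message]:
    """Keep the first system message plus the newest messages that fit in max_chars."""
    system = [m for m in messages if m["role"] == "system"][:1]
    rest = [m for m in messages if m["role"] != "system"]

    # Whatever the system message uses is subtracted from the budget up front.
    budget = max_chars - sum(len(m["content"]) for m in system)

    kept: List[Message] = []
    for message in reversed(rest):  # walk from newest to oldest
        cost = len(message["content"])
        if cost > budget:
            break  # stop at the first message that no longer fits
        kept.append(message)
        budget -= cost

    # Restore chronological order before returning.
    return system + list(reversed(kept))
```

One caveat with this kind of trimming: if the history contains tool calls, a tool request and its result generally need to be kept or dropped together, which this sketch does not handle. I believe langchain-core also ships a `trim_messages` helper for this purpose, but check the docs for your version before relying on it.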
Based on the trace, the latency is clearly coming from the LLM and the tool call, why do you think this is a LangGraph latency issue? |
I have tested the LLM by calling the API directly, and it responds pretty quickly. I have made similar observations to @bigsela; specifically, I have noticed that it becomes slow when thread history is included, i.e. when we pass a thread ID to the invoke method. |
But is the latency actually coming from LangGraph, or from the LLM having to think longer because of the long message history and tool calls involved? Could you provide some specific examples where latency is caused by the library and not by the LLMs? |
I'll have it on Sunday and will share. In longer sequences, where the agent decides to call another tool before finalizing the flow, it doesn't take this long to decide, even with history. It quickly chooses to use another tool, and then in the final step, when giving the final answer, it takes a long time. This leads me to think that it's related to the last step and LangGraph/LangChain, and not to the LLM. |
Sure, if you can provide a minimal reproducible example that demonstrates an issue with langgraph/langchain, we'll investigate: https://stackoverflow.com/help/minimal-reproducible-example |
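For anyone putting together a reproduction, one way to separate model latency from framework overhead is to time the model calls and the whole agent run separately and compare the two. A rough sketch using only the standard library follows. Everything here is a stand-in: `fake_llm_call` and `fake_agent` are hypothetical placeholders for a real model call and a real graph invocation, and `timed` is just a small helper, not a LangGraph feature.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator

# Accumulated wall-clock seconds per label.
timings: Dict[str, float] = {}


@contextmanager
def timed(label: str) -> Iterator[None]:
    """Add the elapsed wall-clock time of the wrapped block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


def fake_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; sleeps to simulate latency."""
    time.sleep(0.05)
    return f"echo: {prompt}"


def fake_agent(prompt: str) -> str:
    """Hypothetical stand-in for an agent run that makes two model calls."""
    with timed("agent_total"):
        with timed("llm_calls"):
            first = fake_llm_call(prompt)
        with timed("llm_calls"):
            final = fake_llm_call(first)
    return final


def overhead() -> float:
    """Time spent in the agent run that is not accounted for by model calls."""
    return timings["agent_total"] - timings["llm_calls"]
```

If `overhead()` stays small while `llm_calls` grows with the history length, the time is going to the model. If `overhead()` itself grows, that would be worth posting here along with the script.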
All of these are either LLM latency (e.g. OpenAI model calls) or tool execution, none of these indicate |
I think I'm also seeing a slowdown when using langgraph with a similar large context case, in my case with base64 files. Will create a simple script with timings and share. |
Same problem. Did you solve it? |
Been a bit busy on a project so haven't tried yet! But it's the long weekend so should have some time. |
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
No response
Description
I am using a very basic graph structure to call a tool; the code is basically the same as the one in the LangGraph documentation (https://langchain-ai.github.io/langgraph/tutorials/customer-support/customer-support/#define-graph), but I am using Gemini as the model for the agent. The problem is that when I use the tool separately, it generates a response pretty quickly (I am using Gemini inside the tool as well), but when I call the agent, it takes a lot of time. I actually noticed this with a LangChain runnable as well. Does anyone know why this is happening and how it can be resolved?
System Info
langchain-core version: 0.2.40