
Scaling to production -> OSError: [Errno 24] Too many open files socket.accept() out of system resource #714

Closed
lukasugar opened this issue Jul 24, 2024 · 1 comment

@lukasugar

Problem

When my LangServe app receives ~1000 concurrent requests, it fails with:

OSError: [Errno 24] Too many open files
socket.accept() out of system resource

Mitigation/quickfix

I checked the VM's file-descriptor limits: the soft ulimit was only 1024, while the hard limit is 524288. I've raised the soft limit to 100000, which should mitigate the issue for now.
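For reference, the same limit can also be inspected and raised from inside the Python process at startup, so the fix travels with the app. A minimal sketch using the standard-library resource module (Linux/macOS only; the hard limit still caps it):

# python
import resource

# Inspect the current file-descriptor limits for this process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft}, hard={hard}")

# Raise the soft limit; an unprivileged process can go up to the hard limit.
resource.setrlimit(resource.RLIMIT_NOFILE, (min(100_000, hard), hard))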

Better way of doing it?

I'm curious whether there's a better way to handle this. Even with the increased open-file limit, is there something I can do in my app to make it more resilient?

What my code looks like

I define chains like this; it's straightforward:

# python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# REVIEW_SYSTEM_PROMPT is a string constant defined elsewhere in the app.
review_text_chain = ChatPromptTemplate.from_messages(
    [
        ("system", REVIEW_SYSTEM_PROMPT),
        ("user", "{text}"),
    ]
) | ChatOpenAI(model="gpt-4o")

and pass them to the router:

# python
from fastapi import APIRouter
from langserve import add_routes

router = APIRouter()

add_routes(router, review_text_chain, path="/api/v1/review_text")
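The router is then mounted on the FastAPI app in the usual way (simplified here; the module layout is an assumption on my part):

# python
from fastapi import FastAPI

app = FastAPI()
app.include_router(router)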

I'm calling this service from a separate NestJS application, like this:

// typescript
private async callAiReview(document: string): Promise<any> {
    const analysisResponse = await fetch('https://www.path_to_my_endpoint/api/v1/review_text/invoke', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
      },
      // LangServe's /invoke endpoint expects {"input": ...} matching the
      // chain's input schema ({text} here).
      body: JSON.stringify({ input: { text: document } }),
    });

    return await analysisResponse.json();
  }

There are a bunch of documents, and I call the callAiReview method for each one.

Are there things in the app that could be improved? Maybe async?

Should I be using async in LangServe, and if so, how?
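One idea I had is to cap concurrency at the server, so excess connections get rejected cleanly instead of exhausting file descriptors in socket.accept(). A minimal sketch, assuming the app object lives in app.py and is served with uvicorn (the numbers are guesses):

# python
import uvicorn

if __name__ == "__main__":
    uvicorn.run(
        "app:app",              # import string is required when workers > 1
        host="0.0.0.0",
        port=8000,
        workers=4,              # spread load across processes
        limit_concurrency=512,  # reject excess connections with a 503 instead of crashing
    )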

I'm aware that I could use batch instead of invoke, but other than that, are there improvements to be made?
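For example, instead of one /invoke call per document, all the documents could go to LangServe's /batch endpoint in a single request. A rough sketch of what I mean, assuming the service runs at localhost:8000 (the URL and response shape are my assumptions):

# python
import httpx

def review_documents(documents: list[str]) -> list[dict]:
    # /batch takes {"inputs": [...]} where each item matches the chain's input schema.
    response = httpx.post(
        "http://localhost:8000/api/v1/review_text/batch",
        json={"inputs": [{"text": doc} for doc in documents]},
        timeout=120.0,  # batched LLM calls can take a while
    )
    response.raise_for_status()
    return response.json()["output"]  # one result per input, in order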

More generally, how do I make LangServe hold up under production load?

@eyurtsev (Collaborator) commented Aug 2, 2024

Duplicate of #717.

The issue likely stemmed either from a misconfigured LangSmith tracing client or from being rate limited by the LangSmith client.
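If the tracing client is suspected, one quick check is to disable LangSmith tracing and see whether the descriptor exhaustion goes away. A minimal sketch (assumes the v2 tracing environment variable is what controls it):

# python
import os

# Disable LangSmith tracing before any chains are constructed.
os.environ["LANGCHAIN_TRACING_V2"] = "false"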

eyurtsev closed this as completed on Aug 2, 2024