
connect timeout error for Ollama server when trying to insert embeddings for a large number of documents #191

Open
punkish opened this issue Dec 26, 2024 · 5 comments


punkish commented Dec 26, 2024

🐛 Describe the bug

Using the following code (which works for a smaller number of documents -- tested with LIMIT 200)

    .setModel(new Ollama({ modelName: "llama3.2", baseUrl: ollama }))
    .setEmbeddingModel(
        new OllamaEmbeddings({ 
            model: 'nomic-embed-text', 
            baseUrl: ollama 
        })
    )
    .setVectorDatabase(new LibSqlDb({ path: './zai.db' }))
    .build();

async function loadData() {
    const result = await db.execute("SELECT fulltext FROM t LIMIT 20000");
    
    for (const row of result.rows) {
        app.addLoader(new TextLoader({ text: row.fulltext }));
    }
}

loadData();

I got the following error:

node:internal/deps/undici/undici:13178
      Error.captureStackTrace(err);
            ^

TypeError: fetch failed
    at node:internal/deps/undici/undici:13178:13
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async post (file:///Users/punkish/Projects/zai/node_modules/ollama/dist/shared/ollama.cddbc85b.mjs:117:20)
    at async Ollama.embed (file:///Users/punkish/Projects/zai/node_modules/ollama/dist/shared/ollama.cddbc85b.mjs:407:22)
    at async RetryOperation._fn (/Users/punkish/Projects/zai/node_modules/p-retry/index.js:50:12) {
  [cause]: ConnectTimeoutError: Connect Timeout Error (attempted addresses: ::1:11434)
      at onConnectTimeout (node:internal/deps/undici/undici:2331:28)
      at node:internal/deps/undici/undici:2283:50
      at Immediate._onImmediate (node:internal/deps/undici/undici:2315:13)
      at process.processImmediate (node:internal/timers:483:21) {
    code: 'UND_ERR_CONNECT_TIMEOUT'
  }
}

Node.js v20.16.0

Ollama is running and is available on port 11434
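One detail worth noting in the repro above: the loop calls app.addLoader without awaiting it, so every document is submitted to the embedding server at once. A minimal sketch of a sequential feeder, using a hypothetical loadSequentially helper and a stand-in ingest function in place of the real app.addLoader call (assumption: addLoader returns a promise, as it does in recent embedjs versions):

```javascript
// Sketch: feed documents to an async ingest function one at a time.
// "ingest" stands in for app.addLoader(new TextLoader({ text })); the
// key point is the await inside the loop, which applies backpressure so
// the next document is not submitted until the previous one finishes.
async function loadSequentially(rows, ingest) {
    let loaded = 0;
    for (const row of rows) {
        await ingest(row.fulltext); // next row waits for this one
        loaded += 1;
    }
    return loaded;
}

// Demo with a stand-in ingest function instead of a live Ollama server.
const rows = [{ fulltext: "doc one" }, { fulltext: "doc two" }];
loadSequentially(rows, async (text) => text.length)
    .then((n) => console.log(`loaded ${n} documents`));
```

Sequential submission trades throughput for stability, which may be acceptable when the alternative is a connect timeout partway through the run.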

@punkish punkish changed the title unreachable Ollama server when trying to insert embeddings for a large number of documents connect timeout error for Ollama server when trying to insert embeddings for a large number of documents Dec 26, 2024
@adhityan
Collaborator

Do you have an idea of how many documents it can embed safely before this error appears? This timeout seems to depend on the capacity of the machine itself; I tried on two different machines and hit the timeout at two different sizes. I am thinking it may make sense to provide an optional parameter to specify a chunk size, which would break the insert into groups inserted sequentially.
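The proposed chunk-size parameter could look something like the sketch below. The helper name insertInBatches and the stand-in insertOne function are hypothetical, not part of the embedjs API; the point is that within a batch the inserts run concurrently, but batches run strictly one after another, capping how many requests the embedding server sees at once:

```javascript
// Sketch: split items into groups of `batchSize` and insert each group
// before starting the next, so at most `batchSize` embedding requests
// are ever in flight at the same time.
async function insertInBatches(items, batchSize, insertOne) {
    const results = [];
    for (let i = 0; i < items.length; i += batchSize) {
        const batch = items.slice(i, i + batchSize);
        // Concurrent within a batch, sequential across batches.
        results.push(...(await Promise.all(batch.map(insertOne))));
    }
    return results;
}

// Demo with a stand-in insert function.
insertInBatches([1, 2, 3, 4, 5], 2, async (x) => x * 10)
    .then((out) => console.log(out.join(","))); // "10,20,30,40,50"
```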


punkish commented Dec 27, 2024

Do you have an idea of how many documents it can embed safely before this error appears?

I will test incrementally and let you know by the end of today. As of now, I have been able to insert 2000 documents in one go.

Note: When I use the term document, I mean an article of anywhere from a couple hundred to several thousand words (these are excerpts of articles from scientific papers).

This timeout seems to be based on the capacity of the machine itself.

I am using an M1 MacBook Pro with 16 GB RAM. The SQLite database (with almost a million articles) was created on the same machine without any problem, and took a couple of hours to create. The original data were individual XML files that were parsed and broken up into tables, with the main table containing a lot of metadata as well as the fulltext of the articles. This fulltext is what I am using to create embeddings and store as the vector database, also in SQLite format.

I will report back today with the limit at which my process breaks.

Many thanks.


punkish commented Dec 27, 2024

An update:

  • loading 5000 articles took: 14:32.024 (m:ss.mmm), created a 197MB db with 45592 vectors
  • loading 10000 crashed leaving a 120MB db with 27895 vectors


punkish commented Dec 27, 2024

I am thinking it may make sense to provide an optional parameter to specify the chunk size which will break the insert into different groups inserted sequentially.

From what I understand, input text has to be tokenized and chunked so that embeddings of the right size can be generated. That level of chunking of my input text (the individual articles) already happens internally, managed by embedjs. My problem is that I am throwing a very large number of articles at it. But this is what I am unable to understand: the model generating the embeddings, in my case Ollama, should never have to worry about the volume if a queue controls how the input text is fed to it.
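Such a queue amounts to a concurrency limiter (libraries like p-limit provide one off the shelf). A minimal hand-rolled sketch, with a hypothetical createLimiter factory and a fake embed call standing in for the real Ollama request:

```javascript
// Sketch: a limiter that keeps at most `limit` tasks in flight, so the
// embedding server never sees more than a fixed number of concurrent
// requests regardless of how many documents are queued behind it.
function createLimiter(limit) {
    let active = 0;
    const waiting = [];
    const next = () => {
        if (active >= limit || waiting.length === 0) return;
        active += 1;
        const { task, resolve, reject } = waiting.shift();
        task().then(resolve, reject).finally(() => {
            active -= 1;
            next(); // start the next queued task, if any
        });
    };
    return (task) =>
        new Promise((resolve, reject) => {
            waiting.push({ task, resolve, reject });
            next();
        });
}

// Demo: six fake "embed" calls, never more than two running at once.
const limitTo2 = createLimiter(2);
let inFlight = 0, peak = 0;
const fakeEmbed = async (i) => {
    inFlight += 1; peak = Math.max(peak, inFlight);
    await new Promise((r) => setTimeout(r, 10));
    inFlight -= 1;
    return i;
};
Promise.all([1, 2, 3, 4, 5, 6].map((i) => limitTo2(() => fakeEmbed(i))))
    .then((out) => console.log(`done ${out.length}, peak ${peak}`));
```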

Using rough calculations, if 5,000 articles took ~15 minutes, a million articles (200 times as many) will take about 3,000 minutes, or roughly 50 hours, to generate embeddings.


This issue is stale because it has been open for 14 days with no activity.

@github-actions github-actions bot added the stale label Jan 11, 2025