Problems with ChromaDB Integration in LangChain #7283

Open
5 tasks done
stevearagonsite opened this issue Nov 27, 2024 · 6 comments
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@stevearagonsite

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

I am encountering issues when using ChromaDB through the LangChain integration, particularly with the new image version chromadb/chroma:0.5.20. Specifically, queries containing an empty where: {} clause fail with a "bad request" error.

Problem
Below are examples of successful and failing queries to illustrate the issue:

A query with a specific where clause executes correctly:

{
  "query_embeddings":  ...,
  "where": {
    "url": {
      "$eq": "/detalle/carrera/1478/licenciatura-en-higiene-y-seguridad-en-el-trabajo"
    }
  },
  "n_results": 1
}

A query with an empty where clause fails:

{
  "query_embeddings":  ...,
  "where": {},
  "n_results": 10
}

The last query results in an error, making it impossible to proceed with the intended functionality.
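To make the difference between the two payloads concrete, here is a minimal sketch of a request builder that leaves out the where key when no filter is given. The names toChromaWhere, buildQueryBody, Where, and QueryBody are hypothetical and are not part of chromadb or LangChain.js. The $and wrapping for multi-key filters is an assumption based on the "exactly one operator" wording of the error reported in this issue.

```typescript
// Metadata filter as a plain object; this shape is an assumption for illustration.
type Where = Record<string, unknown>;

interface QueryBody {
  query_embeddings: number[][];
  n_results: number;
  where?: Where;
}

// Hypothetical helper: returns undefined for a missing or empty filter,
// a single condition unchanged, and several conditions wrapped in $and.
function toChromaWhere(filter?: Where | null): Where | undefined {
  const entries = Object.entries(filter ?? {});
  if (entries.length === 0) return undefined;
  if (entries.length === 1) return { [entries[0][0]]: entries[0][1] };
  return { $and: entries.map(([key, value]) => ({ [key]: value })) };
}

// Hypothetical builder for the JSON body shown in the examples above.
function buildQueryBody(
  embeddings: number[][],
  nResults: number,
  filter?: Where | null
): QueryBody {
  const body: QueryBody = { query_embeddings: embeddings, n_results: nResults };
  const where = toChromaWhere(filter);
  // Omit the key entirely instead of sending "where": {}.
  if (where !== undefined) body.where = where;
  return body;
}
```

With this sketch, buildQueryBody(embeddings, 10, {}) produces a body with no where key at all, which is the shape the failing request would need under the assumption above.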

Code Example
Below is a snippet of the TypeScript code being used:

import { Injectable } from '@nestjs/common';
import { OpenAIEmbeddings } from '@langchain/openai';
import { Chroma } from '@langchain/community/vectorstores/chroma';
import { Document } from '@langchain/core/documents';

private async getStore(index: string, layer: string): Promise<Chroma> {
  const collectionName = `${index}-${layer}`;
  this.vectorStore = new Chroma(this.embeddings, {
    collectionName: collectionName,
    url: this.configService.get<string>('CHROMA_HOST'),
    collectionMetadata: {
      'hnsw:space': 'cosine',
      'hnsw:construction_ef': 800,
      'hnsw:search_ef': 2000,
      'hnsw:M': 100,
    },
  });

  await this.vectorStore.ensureCollection();
  return this.vectorStore;
}

public async search(
  index: string,
  query: string = '',
  layer: string,
  K: number = 5,
  options: any = {}
) {
  const store = await this.getStore(index, layer);
  if (Object.keys(options).length) {
    return await store.similaritySearch(query, K, flattenedOptions);
  }
  return await store.similaritySearch(query, K, null);
}

Error Message and Stack Trace (if applicable)

[Nest] 70537  - 11/27/2024, 3:44:54 PM   ERROR [WsExceptionsHandler] Expected where to have exactly one operator, got {}
InvalidArgumentError: Expected where to have exactly one operator, got {}
    at createErrorByType (/Project-Name/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected]_/node_modules/chromadb/src/Errors.ts:93:14)
    at chromaFetch (/Project-Name/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected]_/node_modules/chromadb/src/ChromaFetch.ts:56:21)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Collection.query (/Project-Name/node_modules/.pnpm/[email protected][email protected][email protected][email protected][email protected]_/node_modules/chromadb/src/Collection.ts:278:13)
    at async Chroma.similaritySearchVectorWithScore (/Project-Name/node_modules/.pnpm/@[email protected]_@[email protected]_@[email protected]_@aws-sdk+c_3tl3jrvzjtn5vfkkihah6voyku/node_modules/@langchain/community/dist/vectorstores/chroma.cjs:347:24)
    at async Chroma.similaritySearch (/Project-Name/node_modules/.pnpm/@[email protected][email protected][email protected][email protected]_/node_modules/@langchain/core/dist/vectorstores.cjs:108:25)

Description

  • I am encountering issues when using ChromaDB through LangChain integration, particularly with the new image version chromadb/chroma:0.5.20.

System Info

Macbook silicon M1
Node: 20.14.0
pnpm: 9.14.2

 pnpm info langchain

[email protected] | MIT | deps: 12 | versions: 301
Typescript bindings for langchain
https://github.com/langchain-ai/langchainjs/tree/main/langchain/

keywords: llm, ai, gpt3, chain, prompt, prompt engineering, chatgpt, machine learning, ml, openai, embeddings, vectorstores

dist
.tarball: https://registry.npmjs.org/langchain/-/langchain-0.3.6.tgz
.shasum: f4313d202ce168d29bfcf81a551147cd4986779f
.integrity: sha512-erZOIKXzwCOrQHqY9AyjkQmaX62zUap1Sigw1KrwMUOnVoLKkVNRmAyxFlNZDZ9jLs/58MaQcaT9ReJtbj3x6w==
.unpackedSize: 2.9 MB

dependencies:
@langchain/openai: >=0.1.0 <0.4.0        js-yaml: ^4.1.0                          openapi-types: ^12.1.3                   yaml: ^2.2.1                             
@langchain/textsplitters: >=0.0.0 <0.2.0 jsonpointer: ^5.0.1                      p-retry: 4                               zod-to-json-schema: ^3.22.3              
js-tiktoken: ^1.0.12                     langsmith: ^0.2.0                        uuid: ^10.0.0                            zod: ^3.22.4                             

maintainers:
- nfcampos <[email protected]>
- jacoblee93 <[email protected]>
- andrewnguonly <[email protected]>
- davidduong <[email protected]>
- hwchase17 <[email protected]>
- basproul <[email protected]>

dist-tags:
latest: 0.3.6                              next: 0.3.2-rc.0                           tag-for-publishing-older-releases: 0.2.20  

published a week ago by jacoblee93 <[email protected]>

dosubot bot commented Nov 27, 2024

Hey @stevearagonsite! I'm here to help you with any bugs, questions, or contributions you have. Let's work on getting this sorted out together!

The issue you're encountering with the "bad request" error when using an empty where clause in ChromaDB queries through LangChain.js is likely due to how LangChain.js handles empty 'where' clauses. When no filters are provided, LangChain.js returns an empty string for the WHERE clause, which can lead to unexpected behavior or errors if the underlying database or API expects a specific structure [1].

To resolve this, you should ensure that your queries do not include an empty where clause. If you want to perform a query without any specific conditions, you might need to adjust your logic to omit the where field entirely or provide a default condition that is always true.

Here's a potential adjustment to your code:

public async search(
  index: string,
  query: string = '',
  layer: string,
  K: number = 5,
  options: any = {}
) {
  const store = await this.getStore(index, layer);
  const hasValidWhereClause = options && Object.keys(options).length > 0;
  return await store.similaritySearch(query, K, hasValidWhereClause ? options : null);
}

This change checks if the options object has any keys before including it in the query, which should prevent the "bad request" error when the where clause is empty.


@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Nov 27, 2024
@stevearagonsite
Author

The error is in the similaritySearch function. It calls the ChromaDB library, which then calls the chromaFetch method, and it passes an empty where object regardless of the options given in the arguments.
The empty object is sent even when the call is made without filter metadata, that is, when the filter is undefined or null.
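One way an empty object can appear even when the caller passes null or undefined is object spread, since spreading either value yields {}. The sketch below is a hypothetical reconstruction of that failure mode and of a guarded alternative. It is not the source of @langchain/community or chromadb, and buildWhereNaive and buildWhereGuarded are made-up names.

```typescript
type Filter = Record<string, unknown>;

// Hypothetical reconstruction, not the library's code:
// spreading undefined or null produces an empty object.
function buildWhereNaive(filter?: Filter | null): Filter {
  return { ...filter };
}

// Guarded variant: only build a where object when there is at least one key.
function buildWhereGuarded(filter?: Filter | null): Filter | undefined {
  if (filter != null && Object.keys(filter).length > 0) {
    return { ...filter };
  }
  return undefined;
}
```

If the integration builds the where object along these lines, passing null from application code would not avoid the error, which would match the behavior described in this comment.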

@MattMannEC

I am experiencing the same issue with chromadb image 0.5.20 and langchain integration.

    ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_chroma/vectorstores.py", line 680, in similarity_search_with_score
    results = self.__query_collection(
              ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/utils/utils.py", line 53, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/chromadb/api/models/Collection.py", line 224, in query
    query_results = self._client._query(
                    ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 146, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/chromadb/api/fastapi.py", line 528, in _query
    resp_json = self._make_request(
                ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/chromadb/api/fastapi.py", line 90, in _make_request
    BaseHTTPClient._raise_chroma_error(response)
  File "/usr/local/lib/python3.11/site-packages/chromadb/api/base_http_client.py", line 103, in _raise_chroma_error
    raise Exception(f"{resp.text} (trace ID: {trace_id})")
Exception: {"error":"InvalidArgumentError","message":"Expected where to have exactly one operator, got {}"} (trace ID: 0)

@KylinMountain

Same here.

    raise ValueError(f"Expected where to have exactly one operator, got {where}")
ValueError: Expected where to have exactly one operator, got {} in query.

@getsalesgriffin

I solved this issue by creating a virtual environment and installing all my dependencies (chromadb, langchain, etc.) from a requirements.txt file. I'm fairly sure I had an incompatible version of langchain.

@yesidc

yesidc commented Dec 19, 2024

I had the same issue and solved it by upgrading to chromadb==0.5.23 (previous version: chromadb==0.4.22).


5 participants