docs: update integration docs for mistral ai embedding model #25253

Merged
merged 3 commits into from
Aug 14, 2024
205 changes: 184 additions & 21 deletions docs/docs/integrations/text_embedding/mistralai.ipynb
@@ -1,81 +1,244 @@
{
"cells": [
{
"cell_type": "raw",
"id": "afaf8039",
"metadata": {},
"source": [
"---\n",
"sidebar_label: MistralAI\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "b14a24db",
"id": "9a3d6f34",
"metadata": {},
"source": [
"# MistralAI\n",
"# MistralAIEmbeddings\n",
"\n",
"This will help you get started with MistralAI embedding models using LangChain. For detailed documentation on `MistralAIEmbeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_mistralai.embeddings.MistralAIEmbeddings.html).\n",
"\n",
"## Overview\n",
"### Integration details\n",
"\n",
"import { ItemTable } from \"@theme/FeatureTables\";\n",
"\n",
"<ItemTable category=\"text_embedding\" item=\"MistralAI\" />\n",
"\n",
"This notebook explains how to use MistralAIEmbeddings, which is included in the langchain_mistralai package, to embed texts in langchain."
"## Setup\n",
"\n",
"To access MistralAI embedding models you'll need to create a/an MistralAI account, get an API key, and install the `langchain-mistralai` integration package.\n",
"\n",
"### Credentials\n",
"\n",
"Head to [https://console.mistral.ai/](https://console.mistral.ai/) to sign up to MistralAI and generate an API key. Once you've done this set the MISTRALAI_API_KEY environment variable:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "0ab948fc",
"id": "36521c2a",
"metadata": {},
"outputs": [],
"source": [
"# pip install -U langchain-mistralai"
"import getpass\n",
"import os\n",
"\n",
"if not os.getenv(\"MISTRALAI_API_KEY\"):\n",
" os.environ[\"MISTRALAI_API_KEY\"] = getpass.getpass(\"Enter your MistralAI API key: \")"
]
},
{
"cell_type": "markdown",
"id": "67c637ca",
"id": "c84fb993",
"metadata": {},
"source": [
"## import the library"
"If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "5709b030",
"id": "39a4953b",
"metadata": {},
"outputs": [],
"source": [
"from langchain_mistralai import MistralAIEmbeddings"
"# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")"
]
},
{
"cell_type": "markdown",
"id": "d9664366",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"The LangChain MistralAI integration lives in the `langchain-mistralai` package:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "1756b1ba",
"execution_count": null,
"id": "64853226",
"metadata": {},
"outputs": [],
"source": [
"embedding = MistralAIEmbeddings(api_key=\"your-api-key\")"
"%pip install -qU langchain-mistralai"
]
},
{
"cell_type": "markdown",
"id": "4a2a098d",
"id": "45dd1724",
"metadata": {},
"source": [
"# Using the Embedding Model\n",
"With `MistralAIEmbeddings`, you can directly use the default model 'mistral-embed', or set a different one if available."
"## Instantiation\n",
"\n",
"Now we can instantiate our model object and generate chat completions:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "584b9af5",
"id": "9ea7a09b",
"metadata": {},
"outputs": [],
"source": [
"embedding.model = \"mistral-embed\" # or your preferred model if available"
"from langchain_mistralai import MistralAIEmbeddings\n",
"\n",
"embeddings = MistralAIEmbeddings(\n",
" model=\"mistral-embed\",\n",
")"
]
},
{
"cell_type": "markdown",
"id": "77d271b6",
"metadata": {},
"source": [
"## Indexing and Retrieval\n",
"\n",
"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).\n",
"\n",
"Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "be18b873",
"id": "d817716b",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"'LangChain is the framework for building context-aware reasoning applications'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Create a vector store with a sample text\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"\n",
"text = \"LangChain is the framework for building context-aware reasoning applications\"\n",
"\n",
"vectorstore = InMemoryVectorStore.from_texts(\n",
" [text],\n",
" embedding=embeddings,\n",
")\n",
"\n",
"# Use the vectorstore as a retriever\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
"# Retrieve the most similar text\n",
"retrieved_documents = retriever.invoke(\"What is LangChain?\")\n",
"\n",
"# show the retrieved document's content\n",
"retrieved_documents[0].page_content"
]
},
{
"cell_type": "markdown",
"id": "e02b9855",
"metadata": {},
"source": [
"## Direct Usage\n",
"\n",
"Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n",
"\n",
"You can directly call these methods to get embeddings for your own use cases.\n",
"\n",
"### Embed single texts\n",
"\n",
"You can embed single texts or documents with `embed_query`:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "0d2befcd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[-0.04443359375, 0.01885986328125, 0.018035888671875, -0.00864410400390625, 0.049652099609375, -0.00\n"
]
}
],
"source": [
"single_vector = embeddings.embed_query(text)\n",
"print(str(single_vector)[:100]) # Show the first 100 characters of the vector"
]
},
{
"cell_type": "markdown",
"id": "1b5a7d03",
"metadata": {},
"source": [
"### Embed multiple texts\n",
"\n",
"You can embed multiple texts with `embed_documents`:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "2f4d6e97",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[-0.04443359375, 0.01885986328125, 0.0180511474609375, -0.0086517333984375, 0.049652099609375, -0.00\n",
"[-0.02032470703125, 0.02606201171875, 0.051605224609375, -0.0281982421875, 0.055755615234375, 0.0019\n"
]
}
],
"source": [
"res_query = embedding.embed_query(\"The test information\")\n",
"res_document = embedding.embed_documents([\"test1\", \"another test\"])"
"text2 = (\n",
" \"LangGraph is a library for building stateful, multi-actor applications with LLMs\"\n",
")\n",
"two_vectors = embeddings.embed_documents([text, text2])\n",
"for vector in two_vectors:\n",
" print(str(vector)[:100]) # Show the first 100 characters of the vector"
]
},
{
"cell_type": "markdown",
"id": "98785c12",
"metadata": {},
"source": [
"## API Reference\n",
"\n",
"For detailed documentation on `MistralAIEmbeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_mistralai.embeddings.MistralAIEmbeddings.html).\n"
]
}
],
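The "Direct Usage" cells stop at printing raw vectors. As a rough illustration of what a retriever can do with them, the sketch below ranks document vectors against a query vector by cosine similarity. It is not taken from the notebook: the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, cosine similarity is assumed here as a common scoring choice, and toy 3-dimensional vectors stand in for real `mistral-embed` output so the snippet runs without an API key.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query_vector: Sequence[float],
    document_vectors: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar document first."""
    scores = [
        (index, cosine_similarity(query_vector, vector))
        for index, vector in enumerate(document_vectors)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
toy_query = [1.0, 0.0, 0.0]
toy_documents = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]

ranking = rank_by_similarity(toy_query, toy_documents)
print(ranking[0][0])  # index of the closest toy document
```

With real embeddings, the same helpers could presumably be fed `embeddings.embed_query("What is LangChain?")` as the query vector and `embeddings.embed_documents([text, text2])` as the document vectors.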