From 417d75f4a299f49604cfc9a5d09ee6f9b7e238ae Mon Sep 17 00:00:00 2001 From: Eugene Yurtsev Date: Tue, 13 Aug 2024 20:33:13 -0400 Subject: [PATCH] docs[patch]: Update integration docs for AzureOpenAIEmbeddings (#25311) https://github.com/langchain-ai/langchain/issues/24856 --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson --- .../text_embedding/azureopenai.ipynb | 246 +++++++++++------- 1 file changed, 156 insertions(+), 90 deletions(-) diff --git a/docs/docs/integrations/text_embedding/azureopenai.ipynb b/docs/docs/integrations/text_embedding/azureopenai.ipynb index 7bf50a4ce9caa..3ae4a4fb551d5 100644 --- a/docs/docs/integrations/text_embedding/azureopenai.ipynb +++ b/docs/docs/integrations/text_embedding/azureopenai.ipynb @@ -2,195 +2,261 @@ "cells": [ { "cell_type": "raw", - "id": "0aed0743", + "id": "afaf8039", "metadata": {}, "source": [ "---\n", - "keywords: [AzureOpenAIEmbeddings]\n", + "sidebar_label: AzureOpenAI\n", "---" ] }, { "cell_type": "markdown", - "id": "c3852491", + "id": "9a3d6f34", "metadata": {}, "source": [ - "# Azure OpenAI\n", + "# AzureOpenAIEmbeddings\n", "\n", - "Let's load the Azure OpenAI Embedding class with environment variables set to indicate to use Azure endpoints." + "This will help you get started with AzureOpenAI embedding models using LangChain. 
For detailed documentation on `AzureOpenAIEmbeddings` features and configuration options, please refer to the [API reference](https://api.python.langchain.com/en/latest/embeddings/langchain_openai.embeddings.azure.AzureOpenAIEmbeddings.html).\n", + "\n", + "## Overview\n", + "### Integration details\n", + "\n", + "import { ItemTable } from \"@theme/FeatureTables\";\n", + "\n", + "\n", + "\n", + "## Setup\n", + "\n", + "To access AzureOpenAI embedding models you'll need to create an Azure account, get an API key, and install the `langchain-openai` integration package.\n", + "\n", + "### Credentials\n", + "\n", + "You’ll need to have an Azure OpenAI instance deployed. You can deploy a version on Azure Portal following this [guide](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal).\n", + "\n", + "Once you have your instance running, make sure you have the name of your instance and key. You can find the key in the Azure Portal, under the “Keys and Endpoint” section of your instance.\n", + "\n", + "```bash\n", + "AZURE_OPENAI_ENDPOINT=\n", + "AZURE_OPENAI_API_KEY=\n", + "AZURE_OPENAI_API_VERSION=\"2024-02-01\"\n", + "```" ] }, { "cell_type": "code", - "execution_count": null, - "id": "228faf0c", + "execution_count": 8, + "id": "36521c2a", "metadata": {}, "outputs": [], "source": [ - "%pip install --upgrade --quiet langchain-openai" + "import getpass\n", + "import os\n", + "\n", + "if not os.getenv(\"AZURE_OPENAI_API_KEY\"):\n", + "    os.environ[\"AZURE_OPENAI_API_KEY\"] = getpass.getpass(\"Enter your AzureOpenAI API key: \")" + ] + }, + { + "cell_type": "markdown", + "id": "c84fb993", + "metadata": {}, + "source": [ + "If you want to get automated tracing of your model calls you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting below:" ] }, { "cell_type": "code", - "execution_count": 1, - "id": "8a6ed30d-806f-4800-b5fd-d04126be9060", + "execution_count": 9, + "id": "39a4953b", "metadata": {}, 
"outputs": [], "source": [ - "import os\n", - "\n", - "os.environ[\"AZURE_OPENAI_API_KEY\"] = \"...\"\n", - "os.environ[\"AZURE_OPENAI_ENDPOINT\"] = \"https://.openai.azure.com/\"" + "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n", + "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass(\"Enter your LangSmith API key: \")" ] }, { - "cell_type": "code", - "execution_count": 2, - "id": "20179bc7-3f71-4909-be12-d38bce009b18", + "cell_type": "markdown", + "id": "d9664366", "metadata": {}, - "outputs": [], "source": [ - "from langchain_openai import AzureOpenAIEmbeddings\n", + "### Installation\n", "\n", - "embeddings = AzureOpenAIEmbeddings(\n", - " azure_deployment=\"\",\n", - " openai_api_version=\"2023-05-15\",\n", - ")" + "The LangChain AzureOpenAI integration lives in the `langchain-openai` package:" ] }, { "cell_type": "code", - "execution_count": 3, - "id": "f8cb9dca-738b-450f-9986-5c3efd3c6eb3", + "execution_count": null, + "id": "64853226", "metadata": {}, "outputs": [], "source": [ - "text = \"this is a test document\"" + "%pip install -qU langchain-openai" ] }, { - "cell_type": "code", - "execution_count": 4, - "id": "0fae0295-b117-4a5a-8b98-500c79306551", + "cell_type": "markdown", + "id": "45dd1724", "metadata": {}, - "outputs": [], "source": [ - "query_result = embeddings.embed_query(text)" + "## Instantiation\n", + "\n", + "Now we can instantiate our model object and generate embeddings:" ] }, { "cell_type": "code", - "execution_count": 5, - "id": "65a01ddd-0bbf-444f-a87f-93af25ef902c", + "execution_count": 11, + "id": "9ea7a09b", "metadata": {}, "outputs": [], "source": [ - "doc_result = embeddings.embed_documents([text])" + "from langchain_openai import AzureOpenAIEmbeddings\n", + "\n", + "embeddings = AzureOpenAIEmbeddings(\n", + " model=\"text-embedding-3-large\",\n", + " # dimensions: Optional[int] = None, # Can specify dimensions with new text-embedding-3 models\n", + " # azure_endpoint=\"https://.openai.azure.com/\", If not provided, 
will read env variable AZURE_OPENAI_ENDPOINT\n", + " # api_key=... # Can provide an API key directly. If missing read env variable AZURE_OPENAI_API_KEY\n", + " # openai_api_version=..., # If not provided, will read env variable AZURE_OPENAI_API_VERSION\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "77d271b6", + "metadata": {}, + "source": [ + "## Indexing and Retrieval\n", + "\n", + "Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).\n", + "\n", + "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`." ] }, { "cell_type": "code", - "execution_count": 6, - "id": "45771052-68ca-4e03-9c4f-a0c7796d9442", + "execution_count": 5, + "id": "d817716b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "[-0.012222584727053133,\n", - " 0.0072103982392216145,\n", - " -0.014818063280923775,\n", - " -0.026444746872933557,\n", - " -0.0034330499700826883]" + "'LangChain is the framework for building context-aware reasoning applications'" ] }, - "execution_count": 6, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "doc_result[0][:5]" + "# Create a vector store with a sample text\n", + "from langchain_core.vectorstores import InMemoryVectorStore\n", + "\n", + "text = \"LangChain is the framework for building context-aware reasoning applications\"\n", + "\n", + "vectorstore = InMemoryVectorStore.from_texts(\n", + " [text],\n", + " embedding=embeddings,\n", + ")\n", + "\n", + "# Use the vectorstore as a retriever\n", + "retriever = vectorstore.as_retriever()\n", + "\n", + "# Retrieve the most similar text\n", + "retrieved_documents = 
retriever.invoke(\"What is LangChain?\")\n", + "\n", + "# show the retrieved document's content\n", + "retrieved_documents[0].page_content" ] }, { "cell_type": "markdown", - "id": "e66ec1f2-6768-4ee5-84bf-a2d76adc20c8", - "metadata": {}, - "source": [ - "## [Legacy] When using `openai<1`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1b40f827", + "id": "e02b9855", "metadata": {}, - "outputs": [], "source": [ - "# set the environment variables needed for openai package to know to reach out to azure\n", - "import os\n", + "## Direct Usage\n", + "\n", + "Under the hood, the vectorstore and retriever implementations are calling `embeddings.embed_documents(...)` and `embeddings.embed_query(...)` to create embeddings for the text(s) used in `from_texts` and retrieval `invoke` operations, respectively.\n", + "\n", + "You can directly call these methods to get embeddings for your own use cases.\n", "\n", - "os.environ[\"OPENAI_API_TYPE\"] = \"azure\"\n", - "os.environ[\"OPENAI_API_BASE\"] = \"https://