embeddings, fine-tuning
mcharytoniuk committed Jul 4, 2024
1 parent c784944 commit 90c5451
Showing 5 changed files with 40 additions and 2 deletions.
2 changes: 1 addition & 1 deletion book.toml
```diff
@@ -3,7 +3,7 @@ authors = ["Mateusz Charytoniuk"]
 language = "en"
 multilingual = false
 src = "src"
-title = "LLMOps Handbook"
+title = "LLMOps Handbook (work in progress)"
 
 [output.html]
 additional-js = [
```
3 changes: 2 additions & 1 deletion src/SUMMARY.md
```diff
@@ -5,14 +5,15 @@
 - [Contributing](./introduction/contributing.md)
 - [General Concepts]()
 - [Continuous Batching](./general-concepts/continuous-batching/README.md)
-- [Embeddings]()
+- [Embedding](./general-concepts/embedding/README.md)
 - [Input/Output](./general-concepts/input-output/README.md)
 - [Large Language Model](./general-concepts/large-language-model/README.md)
 - [Load Balancing](./general-concepts/load-balancing/README.md)
 - [Forward Proxy]()
 - [Reverse Proxy]()
 - [Model Parameters]()
 - [Supervisor]()
 - [Vector Database]()
 - [Deployments]()
 - [llama.cpp](./deployments/llama.cpp/README.md)
 - [Production Deployment]()
```
6 changes: 6 additions & 0 deletions src/fine-tuning/README.md
```diff
@@ -1 +1,7 @@
 # Fine-tuning
+
+Fine-tuning takes a pre-trained model and trains it further on a new task. It is typically useful when you want to repurpose a model trained on a large-scale dataset for a new task with less data available.
+
+In practice, fine-tuning lets the model adapt to the new data without forgetting what it learned before.
+
+A good example is the [sqlcoder](https://github.com/defog-ai/sqlcoder) model, a fine-tuned [starcoder](https://github.com/bigcode-project/starcoder) model (a general coding model) made exceptionally good at producing SQL.
```
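The idea in the added paragraphs can be sketched with a dependency-free toy, not the handbook's actual method: a linear model is "pre-trained" on plenty of generic data, then briefly fine-tuned on a few task-specific samples with a smaller learning rate, so it adapts without drifting far from what it already learned. All data, rates, and names here are hypothetical.

```python
# Toy illustration of fine-tuning: "pre-train" y = w*x + b on a large generic
# dataset, then fine-tune on a small task-specific dataset with a lower
# learning rate so the pre-trained weights shift only slightly.

def train(w, b, data, lr, epochs):
    """Plain stochastic gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

# "Pre-training": many samples from the generic relation y = 2x + 1.
pretrain_data = [(i / 100, 2 * (i / 100) + 1) for i in range(100)]
w, b = train(0.0, 0.0, pretrain_data, lr=0.1, epochs=50)

# "Fine-tuning": a handful of samples from a slightly different task
# (y = 2.5x + 1); the small learning rate keeps us near the pre-trained weights.
finetune_data = [(1.0, 3.5), (2.0, 6.0), (3.0, 8.5)]
w_ft, b_ft = train(w, b, finetune_data, lr=0.01, epochs=10)

print(f"pre-trained: w={w:.2f}, b={b:.2f}")
print(f"fine-tuned:  w={w_ft:.2f}, b={b_ft:.2f}")
```

The same trade-off drives real fine-tuning setups: few steps and a low learning rate adapt the model to the new data while preserving most of the pre-trained behavior.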
19 changes: 19 additions & 0 deletions src/general-concepts/embedding/README.md
```diff
@@ -0,0 +1,19 @@
+# Embedding
+
+Formally, an embedding represents a word (or a phrase) as a point in a vector space. In this space, words with similar meanings are close to each other.
+
+For example, the words "dog" and "cat" might be close to each other in the vector space because they are both animals.
+
+## RGB Analogy
+
+Because embeddings can be vectors with 4096 or more dimensions, it can be hard to visualize them and build an intuition for how they work in practice.
+
+A good way to build that intuition is to imagine embeddings as points in 3D space first.
+
+Let's assume a color represented by RGB is our embedding. It is a 3D vector with three values: red, green, and blue, representing three dimensions. Similar colors are placed near each other in that space: red is close to orange, blue and green are close to teal, and so on.
+
+Embeddings work similarly. Words and phrases are represented by vectors, and similar words are placed close to each other in the vector space.
+
+Searching for embeddings similar to a given one means looking for vectors placed close to it in the vector space.
+
+![RGB Space](https://upload.wikimedia.org/wikipedia/commons/8/83/RGB_Cube_Show_lowgamma_cutout_b.png)
```
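The RGB analogy from the new page can be put directly into code: each color is a 3-dimensional "embedding", and similarity search is just sorting by distance to the query vector. Real embeddings work the same way, only with thousands of dimensions; the color values below are hypothetical picks, not part of the handbook.

```python
import math

# Each "embedding" is a 3D vector: (red, green, blue).
colors = {
    "red":    (255, 0, 0),
    "orange": (255, 165, 0),
    "teal":   (0, 128, 128),
    "blue":   (0, 0, 255),
    "green":  (0, 128, 0),
}

def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(query, vectors):
    """Return names sorted from most to least similar to the query vector."""
    return sorted(vectors, key=lambda name: distance(query, vectors[name]))

print(nearest(colors["red"], colors))
# "red" itself comes first, then "orange" -- just like in the analogy
```

A vector database does essentially this lookup at scale, with index structures that avoid comparing the query against every stored vector.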
12 changes: 12 additions & 0 deletions src/retrieval-augmented-generation/README.md
```diff
@@ -1 +1,13 @@
 # Retrieval Augmented Generation
+
+Retrieval Augmented Generation (RAG) is a technique to improve the quality of the generated text.
+
+In practice, and with significant simplification, RAG is about injecting data into the [Large Language Model](/general-concepts/large-language-model) prompt.
+
+For example, let's say the user asks the LLM:
+- `What are the latest articles on our website?`
+
+To augment the response, you intercept the user's question and tell the LLM to respond along the lines of:
+- `You are a <insert persona here>. Tell the user that the latest articles on our site are <insert latest articles metadata here>`
+
+That is greatly simplified, but generally, that is how it works. Along the way, [embeddings](/general-concepts/embedding) and [vector databases](/general-concepts/vector-database) are involved.
```
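The intercept-and-inject flow described in the new RAG page can be sketched in a few lines: retrieve relevant data, build an augmented prompt, then hand that prompt to the model. The article list, persona, and retrieval rule below are hypothetical stand-ins; in a real system the retrieval step would use embeddings and a vector database.

```python
# Hypothetical site metadata that the retrieval step can draw from.
articles = [
    {"title": "Continuous Batching Explained", "published": "2024-06-28"},
    {"title": "What Is an Embedding?", "published": "2024-07-01"},
]

def retrieve(question):
    """Stand-in for real retrieval (embedding the question, querying a vector DB)."""
    if "articles" in question.lower():
        return articles
    return []

def build_prompt(question, documents):
    """Inject the retrieved data into the prompt sent to the LLM."""
    context = "\n".join(f"- {d['title']} ({d['published']})" for d in documents)
    return (
        "You are a helpful website assistant.\n"
        "Use only this context to answer:\n"
        f"{context}\n"
        f"Question: {question}"
    )

question = "What are the latest articles on our website?"
prompt = build_prompt(question, retrieve(question))
print(prompt)  # this augmented prompt is what actually reaches the LLM
```

The user never sees the augmented prompt; they only see the model's answer, which now has the retrieved data available to it.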
