
TELCODOCS-2134 updating AI pattern #507

Open: wants to merge 21 commits into base: main
Conversation

kquinn1204
Collaborator

@kquinn1204 kquinn1204 commented Dec 5, 2024

[TELCODOCS-2134]: Implement updates based on audit to LLM and RAG generation pattern

Issue: https://issues.redhat.com/browse/TELCODOCS-2134

Link to docs preview: http://507.docs-pr.validatedpatterns.io/

@mbaldessari
Contributor

This is an automated message:

You can preview this docs PR at http://507.docs-pr.validatedpatterns.io
Note that previews are regenerated every five minutes, so please wait a bit.

@openshift-ci openshift-ci bot added the size/L label Dec 5, 2024
@kquinn1204
Collaborator Author

@hbisht-RH-ccs, I would appreciate your review of this PR.

Collaborator

@hbisht-RH-ccs hbisht-RH-ccs left a comment


@kquinn1204, I've just added a few comments; the rest LGTM. Great work. Thanks!


Use these instructions to add GPU nodes to an OpenShift cluster running in the AWS cloud. GPU nodes are tainted so that only pods that require a GPU are scheduled to them.
By default the GPU nodes deployed are of instance type `g5.2xlarge`. If for some reason you want to change this maybe due to performance issues carry out the following steps:
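The taint-and-toleration mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the pattern's actual manifests; the taint key `nvidia.com/gpu`, the pod name, and the image are all illustrative assumptions:

```yaml
# Hypothetical taint applied to a GPU node (node name is a placeholder):
#   oc adm taint nodes <gpu-node-name> nvidia.com/gpu=true:NoSchedule
#
# A pod that needs a GPU then declares a matching toleration:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload              # illustrative name
spec:
  tolerations:
  - key: "nvidia.com/gpu"         # must match the taint key above
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
  containers:
  - name: inference
    image: example.com/tgi:latest # placeholder image
    resources:
      limits:
        nvidia.com/gpu: 1         # request one GPU
```

Pods without this toleration are kept off the tainted GPU nodes, which is what reserves those nodes for GPU workloads.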
Collaborator


"The GPU nodes deployed are of instance type" is in the passive voice.
"maybe due to performance issues" is informal and could be more precise.
We can rewrite it as:
By default, GPU nodes use the instance type g5.2xlarge. If you need to change the instance type, such as to address performance requirements, follow these steps:
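Assuming the pattern provisions GPU nodes through an OpenShift MachineSet on AWS, the change suggested above might look roughly like this sketch; the MachineSet name is an illustrative placeholder, not taken from the pattern:

```yaml
# Sketch of the relevant portion of a GPU MachineSet on AWS.
# Edit with, e.g.:  oc -n openshift-machine-api edit machineset <gpu-machineset-name>
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: cluster-gpu-us-west-2a        # illustrative name
  namespace: openshift-machine-api
spec:
  template:
    spec:
      providerSpec:
        value:
          instanceType: g5.2xlarge    # change to another GPU instance type, e.g. g5.4xlarge
```

Existing machines are not updated in place; scaling the MachineSet down and back up replaces the nodes with ones of the new instance type.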

- Red Hat OpenShift cluster running in AWS. Supported regions are us-west-2 and us-east-1.
- A GPU node to run the Hugging Face Text Generation Inference server on the Red Hat OpenShift cluster.
- A fork of the [rag-llm-gitops](https://github.com/validatedpatterns/rag-llm-gitops.git) git repository.

## Demo Description & Architecture

The goal of this demo is to demonstrate a Chatbot LLM application augmented with data from Red Hat product documentation running on [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai). It deploys an LLM application that connects to multiple LLM providers such as OpenAI, Hugging Face, and NVIDIA NIM.
Collaborator


IMO we should use some other word here instead of "demonstrate", such as "showcase".

## Deploying the demo
## Prerequisites

- Podman
Collaborator


Instead of just "Podman", maybe we can write:
"Podman is installed on your system."
