TELCODOCS-2134 updating AI pattern #507
base: main
Conversation
This is an automated message: You can preview this docs PR at http://507.docs-pr.validatedpatterns.io
@hbisht-RH-ccs would appreciate your review of this PR
@kquinn1204, I've just added a few comments; the rest LGTM. Great work. Thanks!
Use the instructions to add nodes with GPU in OpenShift cluster running in AWS cloud. Nodes with GPU will be tainted to allow only pods that required GPU to be scheduled to these nodes
By default the GPU nodes deployed are of instance type `g5.2xlarge`. If for some reason you want to change this maybe due to performance issues carry out the following steps:
"The GPU nodes deployed are of instance type" is in the passive voice.
"maybe due to performance issues" is informal and could be more precise.
We can rewrite
By default, GPU nodes use the instance type g5.2xlarge
. If you need to change the instance type—such as to address performance requirements, follow these steps:
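For context only (not part of the PR diff): on OpenShift in AWS, the instance type and GPU taint described above typically live in a MachineSet. The following is a minimal sketch, assuming the standard `machine.openshift.io/v1beta1` API and the common `nvidia.com/gpu` taint key; the manifests the pattern actually generates may differ.

```yaml
# Sketch only: a trimmed AWS MachineSet showing where the GPU instance type
# and taint are set. The name, region, and taint key are assumptions; the
# manifests generated by the pattern may differ.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: <cluster-id>-gpu-us-west-2a      # placeholder name
  namespace: openshift-machine-api
spec:
  replicas: 1
  template:
    spec:
      providerSpec:
        value:
          instanceType: g5.2xlarge       # edit this value to change the GPU node size
          placement:
            region: us-west-2            # one of the supported regions
      taints:
        - key: nvidia.com/gpu            # keeps non-GPU workloads off these nodes
          effect: NoSchedule
```

Note that editing `instanceType` only affects machines created afterwards, so existing GPU nodes must be replaced (for example by scaling the MachineSet down and back up) for the change to take effect.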
- Red Hat OpenShift cluster running in AWS. Supported regions are us-west-2 and us-east-1.
- GPU node to run Hugging Face Text Generation Inference server on Red Hat OpenShift cluster.
- Create a fork of the [rag-llm-gitops](https://github.com/validatedpatterns/rag-llm-gitops.git) git repository.
## Demo Description & Architecture
The goal of this demo is to demonstrate a Chatbot LLM application augmented with data from Red Hat product documentation running on [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai). It deploys an LLM application that connects to multiple LLM providers such as OpenAI, Hugging Face, and NVIDIA NIM.
IMO we should use some other word here instead of "demonstrate", such as "showcase".
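A side note on the multi-provider sentence above: in a GitOps pattern, the list of providers is normally driven by a values file. The fragment below is purely hypothetical; the field names (`llmProviders`, `endpoint`, `model`) and the example models are illustrative and are not the actual schema used by rag-llm-gitops.

```yaml
# Hypothetical values fragment, for illustration only; rag-llm-gitops defines
# its own schema for configuring LLM providers.
llmProviders:
  - name: openai
    endpoint: https://api.openai.com/v1            # hosted provider
    model: gpt-4o-mini
  - name: hugging-face-tgi
    endpoint: http://tgi.rag-llm.svc:3000          # in-cluster TGI server (placeholder service name)
    model: mistralai/Mistral-7B-Instruct-v0.2
  - name: nvidia-nim
    endpoint: https://integrate.api.nvidia.com/v1  # NVIDIA NIM API endpoint
    model: meta/llama3-8b-instruct
```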
## Deploying the demo
## Prerequisites
- Podman
Instead of just "Podman", maybe we can write: "Podman is installed on your system."
[TELCODOCS-2134]: Implement updates based on audit to LLM and RAG generation pattern
Issue: https://issues.redhat.com/browse/TELCODOCS-2134
Link to docs preview: http://507.docs-pr.validatedpatterns.io/