docs: add a link to NeMo's known issues page (#402)
Signed-off-by: ashors1 <[email protected]>
Co-authored-by: Terry Kong <[email protected]>
ashors1 and terrykong authored Nov 15, 2024

1 parent 7d970b8 commit f82bbf4
Showing 10 changed files with 27 additions and 1 deletion.
5 changes: 5 additions & 0 deletions docs/user-guide/cai.rst
@@ -62,6 +62,11 @@ This section is a step-by-step tutorial that walks you through how to run a full

7. Run inference.

.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

.. image:: ../assets/cai_flow.png

Step 1: Download models and datasets
2 changes: 2 additions & 0 deletions docs/user-guide/dpo.rst
@@ -7,6 +7,8 @@ Model Alignment by DPO, RPO, and IPO

.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

The NeMo Framework supports efficient model alignment via the NeMo-Aligner codebase.

2 changes: 2 additions & 0 deletions docs/user-guide/draftp.rst
@@ -8,6 +8,8 @@ Fine-Tuning Stable Diffusion with DRaFT+
.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

In this tutorial, we will go through a step-by-step guide for fine-tuning a Stable Diffusion model using the DRaFT+ algorithm by NVIDIA.
DRaFT+ enhances the `DRaFT <https://arxiv.org/pdf/2309.17400.pdf>`__ algorithm by mitigating mode collapse and improving diversity through regularization.
For more technical details on the DRaFT+ algorithm, check out our technical blog.
6 changes: 6 additions & 0 deletions docs/user-guide/knowledge-distillation.rst
@@ -9,6 +9,12 @@ There are two primary benefits of knowledge distillation compared to standard su

There are many variants of knowledge distillation. NeMo Aligner supports training the student model to match the top-K logits of the teacher model. In this tutorial, we will go through fine-tuning a 2B student using a fine-tuned Nemotron 8B chat model.

.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.


Obtain the fine-tuned teacher and pre-trained student models
############################################################
To start, we must first download both the pre-trained student and fine-tuned teacher models
3 changes: 2 additions & 1 deletion docs/user-guide/modelalignment.rsts
@@ -29,6 +29,7 @@ To use a pre-built container, run the following code:
Please use the latest tag in the form yy.mm.(patch).

.. note::
-   Some of the subsequent tutorials require accessing gated Hugging Face models. For details on how to access these models, refer to ``this document <https://docs.nvidia.com/nemo-framework/user-guide//latest/generaltips.html#working-with-hugging-face-models>``__.
+   - Some of the subsequent tutorials require accessing gated Hugging Face models. For details on how to access these models, refer to `this document <https://docs.nvidia.com/nemo-framework/user-guide/latest/best-practices.html#working-with-hugging-face-models>`__.
+   - If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.


2 changes: 2 additions & 0 deletions docs/user-guide/rlhf.rst
@@ -8,6 +8,8 @@ Model Alignment by RLHF
.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

For the purposes of this tutorial, we will go through the entire Reinforcement Learning from Human Feedback (RLHF) pipeline using models from the NeMo Framework. These models can include LLaMa or Mistral, and our scripts will function consistently across them.

RLHF is usually preceded by Supervised Fine-Tuning (SFT). We should first follow the :ref:`Prerequisite guide <prerequisite>` and the :ref:`SFT guide <sft>`. After obtaining the SFT model, we will use it to start the RLHF process. We will use the `PPO <https://arxiv.org/abs/1707.06347>`__ algorithm for reinforcement learning on the `Anthropic-HH-RLHF <https://huggingface.co/datasets/Anthropic/hh-rlhf>`__ dataset.
2 changes: 2 additions & 0 deletions docs/user-guide/rs.rst
@@ -8,6 +8,8 @@ Model Alignment by Rejection Sampling
.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

In this tutorial, we will guide you through the process of aligning a NeMo Framework model using rejection sampling. This method can be applied to various models, including LLaMa and Mistral, with our scripts functioning consistently across different models.

Rejection Sampling is usually preceded by Supervised Fine-Tuning (SFT). We should first follow the :ref:`Prerequisite guide <prerequisite>` and the :ref:`SFT guide <sft>`. After obtaining the SFT model, we will also need to train a reward model as described in the :ref:`PPO guide <ppo>`. We will use the rejection sampling algorithm on the `Anthropic-HH-RLHF <https://huggingface.co/datasets/Anthropic/hh-rlhf>`__ dataset.
2 changes: 2 additions & 0 deletions docs/user-guide/sft.rst
@@ -71,6 +71,8 @@ Model Alignment by Supervised Fine-Tuning (SFT)

.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

Fine-Tune with a Prompt-Response Dataset
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2 changes: 2 additions & 0 deletions docs/user-guide/spin.rst
@@ -11,6 +11,8 @@ For details on the SPIN algorithm, refer to the paper: `https://arxiv.org/abs/24

.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

Obtain a Pretrained Model
#########################
2 changes: 2 additions & 0 deletions docs/user-guide/steerlm.rst
@@ -47,6 +47,8 @@ This section is a step-by-step tutorial that walks you through how to run a full

.. note::
   Before starting this tutorial, be sure to review the :ref:`introduction <model-aligner-intro>` for tips on setting up your NeMo-Aligner environment.

   If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

Download the Llama 2 LLM Model
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
