Mixtral readme update (#29)
* Updated readme for Mixtral

Signed-off-by: quic-amitraj <[email protected]>

* Fixed error and broken links

Signed-off-by: quic-amitraj <[email protected]>

* Fixed table

Signed-off-by: quic-amitraj <[email protected]>

* Fixed bug

Signed-off-by: quic-amitraj <[email protected]>

* Fixed bug

Signed-off-by: quic-amitraj <[email protected]>

---------

Signed-off-by: quic-amitraj <[email protected]>
quic-amitraj authored May 28, 2024
1 parent 893de86 commit 369f453
Showing 1 changed file with 7 additions and 8 deletions.
README.md: 7 additions & 8 deletions
@@ -11,7 +11,8 @@

*Latest news* :fire: <br>

-- [coming soon] support for more popular [models](#models-coming-soon) and inference optimization techniques like continuous batching and speculative decoding <br>
+- [coming soon] Support for more popular [models](#models-coming-soon) and inference optimization techniques like continuous batching and speculative decoding <br>
+- [05/2024] Added support for [Mixtral-8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1).
- [04/2024] Initial release of [efficient transformers](https://github.com/quic/efficient-transformers) for seamless inference on pre-trained LLMs.

## Train anywhere, Infer on Qualcomm Cloud AI with a Developer-centric Toolchain
@@ -37,7 +38,6 @@ For other models, there is comprehensive documentation to inspire upon the chang

## Validated Models

-
* [GPT2](https://huggingface.co/openai-community/gpt2)
* [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf)
* [Llama-2-13b-chat-hf](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf)
@@ -48,10 +48,10 @@ For other models, there is comprehensive documentation to inspire upon the chang
* [Salesforce/xgen-7b-8k-base](https://huggingface.co/Salesforce/xgen-7b-8k-base)
* [MPT-7b](https://huggingface.co/mosaicml/mpt-7b)
* [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
+* [Mixtral-8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)

-
-**Models Coming Soon..**
-* [Mixtral-8x7B](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)
+## Models Coming Soon
+
* [Falcon-40b](https://huggingface.co/tiiuae/falcon-40b)
* [Starcoder2-15b](https://huggingface.co/bigcode/starcoder2-15b)
* [Phi-3](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)
@@ -110,18 +110,17 @@ In summary:

| High Level APIs | Sample use | Arguments |
|-----------------|------------|-------------------|
-
| QEfficient.cloud.infer | [click here](#1-use-qefficientcloudinfer) | <li>model_name : $\color{green} {Mandatory}$</li> <li>num_cores : $\color{green} {Mandatory}$</li> <li>device_group : $\color{green} {Mandatory}$</li><li>batch_size : Optional [Default-1]</li> <li>prompt_len : Optional [Default-32]</li> <li>ctx_len : Optional [Default-128]</li><li>mxfp6 : Optional </li> <li>mxint8 : Optional </li><li>hf_token : Optional </li><li>cache_dir : Optional ["cache_dir" in current working directory]</li><li>**prompt : Optional</li><li>**prompts_txt_file_path : Optional</li>|
| QEfficient.cloud.execute | [click here](#2-use-of-qefficientcloudexcute) | <li>model_name : $\color{green} {Mandatory}$</li> <li>device_group : $\color{green} {Mandatory}$</li><li>qpc_path : $\color{green} {Mandatory}$</li><li>prompt : Optional [Default-"My name is"]</li> <li>cache_dir : Optional ["cache_dir" in current working directory]</li><li>hf_token : Optional </li><li>**prompt : Optional</li><li>**prompts_txt_file_path : Optional</li> |

-**One argument, prompt or prompts_txt_file_path must be passed.
+**One argument, prompt or prompts_txt_file_path must be passed.**

### 1. Use QEfficient.cloud.infer

This is the single e2e python api in the library, which takes model_card name as input along with other compile args if necessary and does everything in one go.

* Torch Download → Optimize for Cloud AI 100 → Export to ONNX → Verify (CPU) → Compile on Cloud AI 100 → [Execute](#2-use-of-qefficientcloudexecute)
-* Its skips the ONNX export/compile stage if ONNX file or qpc found on path
+* It skips the ONNX export/compile stage if ONNX file or qpc found on path


```bash
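For context, a minimal usage sketch of the two high-level APIs documented in the table above, assuming each listed argument maps to a `--flag`-style option of the corresponding Python module; the model name, core count, device IDs, and qpc path below are illustrative placeholders, not values taken from this commit:

```bash
# Sketch only: flag spellings are assumed from the argument table and the
# values are placeholders. For infer, model_name, num_cores and device_group
# are mandatory, and exactly one of --prompt / --prompts_txt_file_path must
# be passed. This runs the full pipeline: download -> optimize -> ONNX
# export -> verify on CPU -> compile for Cloud AI 100 -> execute.
python -m QEfficient.cloud.infer \
    --model_name gpt2 \
    --num_cores 14 \
    --device_group [0] \
    --batch_size 1 \
    --prompt_len 32 \
    --ctx_len 128 \
    --mxfp6 \
    --prompt "My name is"

# Re-run a previously compiled model: qpc_path (mandatory) points at the
# compiled program produced by the infer step above, skipping export/compile.
python -m QEfficient.cloud.execute \
    --model_name gpt2 \
    --device_group [0] \
    --qpc_path <path/to/qpc_dir> \
    --prompt "My name is"
```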
