A project that aims to create a virtual avatar-based assistant using LLMs, with dynamic voice, dynamic animations, and many other features, similar to Gatebox

NeuralHarbour/LLM-Based-3D-Avatar-Assistant

LLM Based Holographic Assistant

Shaping Perspectives, Shifting Realities

Follow Us

Huggingface Spaces X LinkedIn Instagram

About The Project

A virtual waifu / assistant that you can speak to through your mic, and she will speak back to you! Features include:

  • You can speak to her through a mic
  • She can speak back to you
  • Recognizes gestures and performs appropriate actions in response
  • Has dynamic voice controls
  • Has short-term and long-term memory
  • Can open apps
  • Smarter than you
  • Multilingual
  • Can learn psychology
  • Can recognize different people
  • Can sing, dance, and do a lot of other stuff
  • Can control your smart home like Alexa
  • Dynamic animations

More features I plan to add soon are listed in the Roadmap. For those who want to know, here is a summary of how it works:

Speech captured from the mic is first passed to the OpenAI Whisper model, which detects the language, and is then passed on to the speech recognizer. The recognizer's output is sent to the client via a WebSocket, and the transcribed text is then sent to the Gemini LLM. The response from Jenna is printed to the Unity console and appended to conversation_log.json. Finally, the response is spoken by our fine-tuned Parler TTS, together with the character's facial expressions and voice. Animations will be generated dynamically by our own custom animation model, which we are currently working on; for the time being we are using preset animations.
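The flow above can be sketched in Python as a single conversational turn. This is only an illustration under stated assumptions, not the project's actual code: `run_turn`, `append_to_log`, and the four callables are hypothetical names, the log is assumed to be a JSON list of exchanges, and the WebSocket hop between the STT service and the Unity client is left out for brevity.

```python
import json
from pathlib import Path
from typing import Callable


def append_to_log(log_path: Path, user_text: str, reply: str) -> None:
    """Append one exchange to a JSON log (assumed format: a list of dicts)."""
    entries = []
    if log_path.exists() and log_path.stat().st_size > 0:
        entries = json.loads(log_path.read_text(encoding="utf-8"))
    entries.append({"user": user_text, "assistant": reply})
    log_path.write_text(
        json.dumps(entries, ensure_ascii=False, indent=2), encoding="utf-8"
    )


def run_turn(
    audio: bytes,
    detect_language: Callable[[bytes], str],   # stands in for Whisper
    transcribe: Callable[[bytes, str], str],   # stands in for the speech recognizer
    ask_llm: Callable[[str], str],             # stands in for the Gemini call
    speak: Callable[[str], None],              # stands in for Parler TTS
    log_path: Path,
) -> str:
    """Run one mic-to-speech turn through the stages described above."""
    language = detect_language(audio)
    text = transcribe(audio, language)
    reply = ask_llm(text)
    print(reply)  # the README prints the reply to the Unity console instead
    append_to_log(log_path, text, reply)
    speak(reply)
    return reply
```

Passing each stage in as a callable keeps the sketch runnable without any model installed; real implementations of the four stages would be swapped in where the stubs are.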

Based on our publication: https://ieeexplore.ieee.org/document/10576146

Getting Started

Prerequisites

  1. Install Python 3.10.11 and add it to your PATH environment variable
  2. Install Git
  3. Install CUDA 11.7 if you have an NVIDIA GPU
  4. Install Visual Studio Community 2022 and select "Desktop development with C++" in the install options
  5. Install Unity
  6. Install FFmpeg
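Before moving on, it can help to confirm that the command-line prerequisites are visible on PATH. The following is a minimal sketch, not part of the project: the command names (`git`, `ffmpeg`, `nvcc`) are assumptions about how each tool is exposed on your system, and Unity and Visual Studio are not checked because they are not normally launched from PATH.

```python
import shutil
import sys

# Assumed command names; adjust if your installation exposes them differently.
REQUIRED_TOOLS = ["git", "ffmpeg"]
OPTIONAL_TOOLS = ["nvcc"]  # CUDA compiler, only relevant with an NVIDIA GPU


def check_prerequisites() -> dict:
    """Return a mapping of prerequisite name to whether it was found."""
    status = {"python 3.10": sys.version_info[:2] == (3, 10)}
    for tool in REQUIRED_TOOLS + OPTIONAL_TOOLS:
        status[tool] = shutil.which(tool) is not None
    return status


def report(status: dict) -> None:
    """Print one line per prerequisite."""
    for name, found in status.items():
        print(f"{name}: {'found' if found else 'missing'}")


report(check_prerequisites())
```

A "missing" result for `nvcc` is expected on machines without an NVIDIA GPU.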

How to run / Steps to reproduce


1. Create a new project in Unity with URP (Universal Render Pipeline) settings

2. Replace the Assets folder with the one in this branch

3. Open the Unity scene from the scenes folder

4. Insert the API keys into the .env files

5. Run the STT service using: python STT.py

6. Launch the app by clicking the Play button

7. Wait for the application to download the required models on first start

8. Say 'Start' to activate or 'Stop' to deactivate; it works like an OS
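The 'Start' / 'Stop' behaviour in step 8 can be pictured as a small gate in front of the rest of the pipeline. This is a hypothetical sketch of the idea rather than the project's implementation: `ActivationGate` is an invented name, and it assumes each transcribed utterance arrives as a plain string.

```python
from typing import Optional


class ActivationGate:
    """Forward utterances only while the assistant is active."""

    def __init__(self) -> None:
        self.active = False  # assumed to start deactivated

    def handle(self, utterance: str) -> Optional[str]:
        """Return the utterance if it should be processed, else None."""
        # Normalise so "Start", "start." and " START " all match.
        command = utterance.strip().strip(".!?").lower()
        if command == "start":
            self.active = True
            return None
        if command == "stop":
            self.active = False
            return None
        return utterance if self.active else None
```

Anything said before 'Start' or after 'Stop' is dropped, and the two command words themselves are never passed on to the LLM.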


A compiled EXE version is coming soon!

Roadmap

  • Long-term memory
  • Time and date awareness
  • Mixed reality integration
  • Gatebox-style hologram
  • Dynamic voice like GPT-4o
  • Alexa-like smart home control
  • Multilingual
  • Mobile version
  • Easier setup
  • Compiling into one EXE
  • Localized
  • Home control
  • Outfit change
  • Custom skills
  • Music player
  • Different modes
  • Face recognition
  • Dynamic animations
  • OS-like operation
  • Wake-word detection
  • Mood recognition
  • Weather forecast

License

Distributed under the GNU Affero General Public License v3.0. See LICENSE.txt for more information.

Contact and Socials

E-mail: [email protected]
Project Link: https://github.com/NeuralHarbour/LLM-Based-3D-Avatar-Assistant/v2.0
