Chat With RTX: Your New Personal Assistant

Since the arrival of ChatGPT, AI has become what we might call a personalized solution provider. From writing our literature essays in college to creating workout routines at the gym, it has cut down the workload for almost every task. But all of this is limited by a chatbot's knowledge cutoff. GPT-3.5, for example, will often fumble when asked about anything past January 2022. Beyond that, a Large Language Model (LLM) can only answer your questions based on the information that was available to it on the internet. But what if you needed something more localized? What if your professor gave you a series of PDF documents and asked you to answer questions from them alone? ChatGPT can't help you there. So what now?

Here's where Nvidia's all-new Chat With RTX comes in!

Chat With RTX is a chatbot that lets you apply LLM capabilities to your own digital content: documents, notes, YouTube videos, and more. Powered by Nvidia TensorRT-LLM and RTX acceleration, Chat With RTX strives to deliver fast, contextually relevant answers to queries about your documents' data. All you have to do is head over to the Nvidia website and hit 'Download Now'. It's that simple. You will need an RTX 30 series or newer GPU with at least 8 GB of VRAM.

But how does this all work?

TensorRT-LLM wraps TensorRT's deep learning compiler, which includes optimized kernels from FasterTransformer, pre- and post-processing, and multi-GPU and multi-node communication, in a simple open-source Python API for defining, optimizing, and executing LLMs for inference in production. Using Tensor Cores, the app analyzes newly uploaded files and can immediately generate fairly accurate textual results. It ships with Mistral and Llama 2 to deliver a more casual, friendly presentation of information, and it implements RAG (Retrieval Augmented Generation), a technique that feeds data retrieved from additional sources into a generative AI model to improve the accuracy and reliability of its answers.
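To make the idea concrete, here is a minimal toy sketch of the RAG loop: split your documents into chunks, embed them, retrieve the chunks most similar to the query, and prepend them to the prompt before the LLM ever sees it. Everything here is a stand-in for illustration (the bag-of-words embed(), the helper names, the sample notes are all made up), not Nvidia's actual pipeline, which uses learned embeddings and TensorRT-accelerated models.

```python
# Toy sketch of Retrieval Augmented Generation (RAG).
# NOT Nvidia's implementation: embed(), retrieve(), and build_prompt()
# are illustrative stand-ins for the real components.

import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: a bag-of-words term-frequency vector.
    # A real pipeline would use a learned embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank document chunks by similarity to the query; keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, chunks: list[str]) -> str:
    # Prepend the retrieved context so the LLM answers from YOUR files,
    # not just from whatever it memorized during training.
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical course notes standing in for your local documents.
docs = [
    "The midterm covers chapters 3 through 5.",
    "Office hours are Tuesdays at 2 PM.",
    "The final project is due on May 10.",
]
print(build_prompt("When is the final project due?", docs))
```

In Chat With RTX, the same loop runs with a proper embedding model over your files, and a TensorRT-optimized Mistral or Llama 2 handles the final generation step, which is why its answers stay grounded in the documents you point it at.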

What about Security?

Nvidia promises that a bot built as a personalized assistant can streamline your workflow while keeping user data safe. While this may be true, a chatbot's further development relies primarily on user data: analyzing where and how a generated response went wrong depends on what the user's prompt originally was and how far the result deviated from the expected one. Nvidia claims that since Chat With RTX runs locally, the user retains complete control over their personal data. But then the question arises: what is Nvidia going to use to continue training its new baby?

Now, while one might expect pristine results from Chat With RTX since it only has to work from the resources you feed it, it is important to remember that this is a demo app barely a month old. LLMs are constantly undergoing changes to their neural networks in pursuit of better, more accurate results. Until we get an updated version of Chat With RTX, all we can do is use it to answer the next quiz from one of those nasty college professors, haha!
