In this article, let’s look at what Retrieval-Augmented Generation (RAG) is and how it works.

Imagine creating a ChatGPT-like interface that taps into our own knowledge base to answer our queries.

That’s precisely what RAG offers you!

Today, I’ll delve into each component required to develop a RAG application and share a working project by the end!

Let’s go!

1. Custom Knowledge Base:

A collection of relevant and up-to-date information that serves as the foundation for RAG.

It can be a database, a set of documents, or a combination of both.

2. Chunking:

Chunking is the process of breaking down a large input text into smaller pieces.

This ensures that the text fits the input size of the embedding model and improves retrieval efficiency.

Implementing a smart chunking strategy can greatly enhance your RAG system!
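To make this concrete, here is a minimal sketch of one simple strategy: fixed-size chunks with a small overlap, so a sentence cut at a boundary still appears whole in one of the neighbours. The function name `chunk_text` and the default sizes are my own illustrative choices, and counting words (instead of model tokens) is a simplifying assumption.

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words. Sizes are counted in
    whitespace-separated words, which is only a rough stand-in for the
    token limit of a real embedding model (an assumption of this sketch).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Smarter strategies split on sentence, paragraph, or heading boundaries instead of a fixed word count, but the sliding-window idea stays the same.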

3. Embeddings & Embedding Model:

Embeddings are a technique for representing text data as numerical vectors, which can be fed into machine learning models.

The embedding model is responsible for converting text into these vectors.
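To show the shape of the idea (text in, fixed-length vector out), here is a toy stand-in for an embedding model. It is not a real model: `toy_embed` is a hypothetical helper that just hashes words into buckets, so it captures word overlap but none of the semantic meaning a trained embedding model would.

```python
import hashlib
import math


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Map text to a unit-length vector of size `dim` via feature hashing.

    Toy illustration only: a real RAG system would call a trained
    embedding model here instead.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        # md5 gives a hash that is stable across runs (unlike built-in hash()).
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        vec[index] += 1.0

    # Normalise to unit length so a dot product equals cosine similarity.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec
```

The important property is that every text, long or short, ends up as a vector of the same length, so any two texts can be compared numerically.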

4. Vector Databases:

A vector database stores pre-computed vector representations of text data for fast retrieval and similarity search, with capabilities like CRUD operations, metadata filtering, and horizontal scaling.
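As a rough mental model, a vector database can be pictured as the small in-memory store below. The class `InMemoryVectorStore` and its method names are hypothetical, written for this sketch and not taken from any real product; real vector databases add persistence, approximate nearest-neighbour indexes, and scaling on top of this idea.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Toy vector store: create/update, delete, and filtered similarity search."""

    def __init__(self):
        self._records = {}  # record_id -> (vector, text, metadata)

    def upsert(self, record_id, vector, text, metadata=None):
        # Create a new record or overwrite an existing one.
        self._records[record_id] = (vector, text, metadata or {})

    def delete(self, record_id):
        self._records.pop(record_id, None)

    def search(self, query_vector, top_k=3, where=None):
        """Return up to `top_k` (score, record_id, text) tuples, best first.

        `where` is an optional dict of metadata key/value pairs that a
        record must match exactly to be considered.
        """
        results = []
        for record_id, (vector, text, metadata) in self._records.items():
            if where and any(metadata.get(k) != v for k, v in where.items()):
                continue  # metadata filter rejects this record
            score = cosine_similarity(query_vector, vector)
            results.append((score, record_id, text))
        # Brute-force ranking: fine for a demo, too slow for large collections.
        results.sort(key=lambda r: r[0], reverse=True)
        return results[:top_k]
```

For example, after `store.upsert("a", [1.0, 0.0], "about cats", {"source": "wiki"})`, a call such as `store.search([1.0, 0.0], where={"source": "wiki"})` only ranks records whose metadata matches the filter.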

5. User Chat Interface:

A user-friendly interface that lets users interact with the RAG system by submitting a query and receiving a response.

The query is converted to an embedding, which is used to retrieve relevant context from the vector database!
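Here is a minimal sketch of that query-time step. The key point is that the query goes through the same embedding function as the stored chunks, otherwise the vectors are not comparable. The word-count "embedding" and the names `embed`, `cosine`, and `retrieve` are illustrative assumptions that stand in for a real embedding model and vector database.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy sparse embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Embed the query, score every chunk against it, return the best matches."""
    query_vec = embed(query)  # same embedding function as used for the chunks
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

In a real application the chunk embeddings are computed once and stored, so only the query is embedded at request time.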

6. Prompt Template:

A prompt template defines how the prompt for the RAG system is built, combining the user query with the relevant context retrieved from the custom knowledge base.

This is given as an input to an LLM that produces the final response!
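A prompt template can be as simple as a string with placeholders. The sketch below is one possible wording, not a standard one: the template text, `PROMPT_TEMPLATE`, and `build_prompt` are my own illustrative names, and the call to the LLM itself is left out.

```python
PROMPT_TEMPLATE = """Answer the question using only the context below.
If the context does not contain the answer, say you don't know.

Context:
{context}

Question: {question}
Answer:"""


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Fill the template with the retrieved chunks and the user's question."""
    # Number the chunks so the model (and the reader) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

The resulting string is what gets sent to the LLM. Telling the model to rely only on the supplied context is a common way to keep answers grounded in the knowledge base.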

Thanks to the thread on X from @akshay

By Ramesh Fernandez