
(AI Blog #17) RAG - Preparing the Knowledge Base - Data Extraction, Chunking, Embedding, Vector Store/DB

RAG (Retrieval-Augmented Generation) is a technique for making AI models (LLMs) more accurate, up to date, and context-aware by combining two things:

Retrieval (fetching relevant data)
Generation (creating a response using an LLM)

Why is RAG needed?

Traditional LLMs (like GPT models):

Have fixed knowledge (based on their training data)
Can hallucinate (make up answers)
Don't know your private/company data

RAG addresses these issues by injecting real-time or custom data into the model's context when a question is asked.

Before building a RAG pipeline, we need to prepare our knowledge base. This blog discusses that preparation in detail, because it is a very important step in building a RAG pipeline.

RAG Pipeline: Please refer to the image below for all the topics discussed in this blog and the next one.

Keep in mind that LLMs are pre-trained models: their training data is gathered from the internet and various other sources, and the model is then trained on that data. To make this clear, if you ask an LLM a question like " What i...
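To make the retrieval-plus-generation idea and the knowledge base steps from the title (chunking, embedding, vector store) more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the names `chunk_text`, `embed`, `InMemoryVectorStore`, and `build_prompt` are hypothetical, the sample documents are made up, and the "embedding" is just a word count standing in for a real embedding model. The final call to an LLM is left out; the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8, overlap=2):
    """Split text into chunks of `chunk_size` words, sharing `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding (assumption): lowercase word counts.
    A real pipeline would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector DB: a list of (chunk, vector) pairs."""

    def __init__(self):
        self.items = []

    def add(self, chunk):
        self.items.append((chunk, embed(chunk)))

    def search(self, query, k=1):
        query_vec = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Inject the retrieved chunks into the prompt an LLM would receive."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up "private" documents that a pre-trained LLM would not know about.
documents = [
    "Employees get 24 days of paid leave per year.",
    "The office cafeteria opens at 8 am on weekdays.",
]

# Knowledge base preparation: chunk each document, embed it, store it.
store = InMemoryVectorStore()
for doc in documents:
    for chunk in chunk_text(doc):
        store.add(chunk)

# Query time: retrieve the most similar chunk, then build the prompt.
question = "How many days of paid leave do employees get?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

The chunk overlap is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk. In a real setup, the word-count vectors and the list-based store would typically be replaced by an embedding model and a vector database, while the overall flow stays the same.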