Hello! I'm Shivansh Fulper, a passionate ML and data science enthusiast, currently in my 3rd year at IIIT Jabalpur, with a keen interest in leveraging AI to build things. Recently, I got an opportunity to go for an internship at Sarvam.ai. It involved an assignment: build a RAG system with some agents/tools, with a bonus for integrating Sarvam's TTS.
Okay, so, picture this. I get this assignment from Sarvam.ai. It's all about building a cool interactive tool to help students learn about sound from the NCERT textbook. Now, I'm not gonna lie, I was a bit intimidated at first. AI, LLMs, vector databases... it all sounded super fancy and complex. Also, this was my first time working on a RAG system. But hey, challenge accepted!
The problem I set out to solve was multifaceted:
Now the question was which tools to develop. So I decided to pin down some basic questions, like:
With these questions in mind, I dove into the world of RAG and AI agents. Let me take you through my development journey.
The core of my project is the RAG (Retrieval-Augmented Generation) system. Here's how I built it:
ingest.py: A script to load and process the NCERT Sound chapter PDF. I used both PyPDFLoader and PDFPlumberLoader to ensure robust PDF parsing. The text was then split into manageable chunks using RecursiveCharacterTextSplitter.
vector_db.py: Used to create a Chroma vector store. This allows for efficient similarity searches based on text embeddings. I chose the "sentence-transformers/all-MiniLM-L6-v2" model for generating embeddings, striking a balance between performance and accuracy.
rag_system.py: Here, I integrated Google's Gemini 1.5 Flash model for generating responses. The RAG system retrieves relevant context from the vector store and uses it to inform the AI's responses to user queries.
app.py: This exposes endpoints for various functions like generating responses, creating quizzes, and more.
frontend.py: This provides an intuitive interface for students to ask questions, take quizzes, and access other learning tools.
With the RAG system in place, I focused on developing specialized agents and tools to enhance the learning experience:
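Before getting to the agents, here's a stripped-down sketch of the chunk, retrieve, and prompt flow that the RAG pieces described earlier follow. To be clear, this is not the project's actual code: it uses only the Python standard library, a simple overlapping character splitter stands in for RecursiveCharacterTextSplitter, and a toy bag-of-words similarity stands in for the MiniLM embeddings and the Chroma store. The function names (split_text, embed, retrieve, build_prompt) and the chunk sizes are hypothetical, picked just for illustration.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows.

    Simplified stand-in for a recursive splitter, which would also try
    to break on paragraph and sentence boundaries first.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (stand-in for a vector store lookup)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Tiny usage example with made-up sentences instead of the real chapter text.
notes = [
    "Sound is produced by vibrating objects.",
    "Light travels faster than sound in air.",
    "An echo is a reflected sound wave heard after the original.",
]
question = "What is an echo?"
print(build_prompt(question, retrieve(question, notes, k=1)))
```

The overlap between neighbouring chunks is there so that a sentence cut at a chunk boundary still appears whole in at least one chunk. In a real setup, the string returned by build_prompt is what gets passed to the LLM.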