Deploying Agentic RAG Systems to Perform Various Tasks Using LLMs

This repository showcases the implementation of a Retrieval-Augmented Generation (RAG) system for answering questions using large language models (LLMs) and document retrieval. The system integrates document indexing, chunking, and similarity search with advanced language models like gemma2:9b to provide context-aware responses. Additionally, it incorporates a web-browsing agent for retrieving live data.

Table of Contents

  • Overview
  • Installation
  • Components
  • Results
  • License

Overview

The project is designed to perform tasks like document-based question answering, real-time information retrieval via web scraping, and context-aware response generation. It leverages multiple techniques:

  • RAG (Retrieval-Augmented Generation): Uses document indexing and retrieval for question answering.
  • Web Browsing: Fetches live data to answer real-time queries.
  • Chroma and FAISS: Index and retrieve relevant document chunks efficiently.

The system is multilingual and supports Persian-language queries.

Installation

Steps performed:

  • Document Processing: Documents are chunked into smaller segments for efficient retrieval.
  • Index Creation or Loading: A FAISS index or Chroma-based vector store is created or loaded for similarity search.
  • Query Answering: Each query is processed and an answer is generated by the LLM from the retrieved document chunks or web content; results are saved to an output file (response.txt or agent_results.txt). A sketch of this loop follows the list.
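
A minimal sketch of the query-answering loop is shown below. It assumes a `retrieve` function (returning relevant chunks for a query) and an `ask_llm` function (sending a prompt to the model) already exist; both names are illustrative, not identifiers taken from this repository.

    # Sketch of the query-answering loop described above. `retrieve` and
    # `ask_llm` are assumed helpers, not names from this repository.
    def answer_queries(queries, retrieve, ask_llm, out_path="response.txt"):
        """retrieve(q) -> list[str] of relevant chunks; ask_llm(prompt) -> str."""
        results = []
        for q in queries:
            context = "\n".join(retrieve(q))
            prompt = f"Answer using only this context:\n{context}\n\nQuestion: {q}"
            results.append(f"Q: {q}\nA: {ask_llm(prompt)}")
        with open(out_path, "w", encoding="utf-8") as f:
            f.write("\n\n".join(results))  # matches the response.txt output above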

Components

RAG System

The RAG system includes:

  • Document Chunking: Splitting large documents into smaller chunks to improve retrieval performance.
  • Index Creation: Using FAISS (or Chroma) to index the document chunks by their embeddings.
  • Similarity Search: Using cosine similarity to retrieve relevant chunks during query processing, as sketched below.
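
The sketch below illustrates these three steps with sentence-transformers and FAISS. The embedding model and chunking parameters are assumptions for illustration, not values taken from the notebooks; embeddings are L2-normalized so FAISS's inner-product index computes cosine similarity.

    # Illustrative chunk -> embed -> index -> search pipeline (assumed values).
    import faiss
    import numpy as np
    from sentence_transformers import SentenceTransformer

    def chunk(text, size=500, overlap=50):
        """Split a document into overlapping character windows."""
        return [text[i:i + size] for i in range(0, len(text), size - overlap)]

    docs = ["...document one text...", "...document two text..."]
    chunks = [c for d in docs for c in chunk(d)]

    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
    emb = model.encode(chunks, normalize_embeddings=True)  # unit vectors

    index = faiss.IndexFlatIP(emb.shape[1])  # inner product == cosine here
    index.add(np.asarray(emb, dtype="float32"))

    query = model.encode(["example question"], normalize_embeddings=True)
    top_k = min(3, len(chunks))
    scores, ids = index.search(np.asarray(query, dtype="float32"), top_k)
    context = "\n".join(chunks[i] for i in ids[0])  # chunks handed to the LLM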

Answer Generator

The Answer Generator class interacts with the RAG system to fetch the most relevant document chunks based on a given question. It then uses the LLM to generate a context-aware response.
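
A hedged sketch of such a class is shown below. It assumes a local Ollama server with gemma2:9b pulled (the model the overview mentions) and a `rag` object exposing a `search(question, k)` method; that interface is an assumption, not the repository's actual API.

    # Sketch of an answer-generator class; the rag.search interface is assumed.
    import ollama

    class AnswerGenerator:
        def __init__(self, rag, model="gemma2:9b", k=3):
            self.rag = rag      # retriever with search(question, k) -> list[str]
            self.model = model
            self.k = k

        def generate(self, question):
            chunks = self.rag.search(question, k=self.k)  # most relevant chunks
            prompt = ("Use the context below to answer the question.\n\n"
                      "Context:\n" + "\n---\n".join(chunks) +
                      f"\n\nQuestion: {question}")
            resp = ollama.chat(model=self.model,
                               messages=[{"role": "user", "content": prompt}])
            return resp["message"]["content"]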

Chroma-based RAG

An alternative RAG implementation using Chroma for storing and querying document embeddings is also included. This utilizes LangChain's Chroma integration for efficient vector store management and querying.
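
A minimal sketch of this variant is shown below, using LangChain's Chroma integration; the embedding model and persistence directory are placeholder choices, not values confirmed by the repository.

    # Illustrative Chroma vector store via LangChain (assumed parameters).
    from langchain_community.vectorstores import Chroma
    from langchain_community.embeddings import HuggingFaceEmbeddings

    chunks = ["...chunk one...", "...chunk two..."]  # e.g. from the chunker above
    emb = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
    store = Chroma.from_texts(chunks, embedding=emb, persist_directory="chroma_db")

    hits = store.similarity_search("example question", k=2)
    context = "\n".join(doc.page_content for doc in hits)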

Web Browsing Agent

The Web Browsing Agent fetches real-time information from the web by scraping web pages. The agent can be used to get live data on current events, statistics, and more.
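
As a rough illustration, the snippet below fetches a page and extracts its visible text with requests and BeautifulSoup, a common stack for this task; the notebooks may use a different toolchain.

    # Simple page-fetch step for a browsing agent (assumed toolchain).
    import requests
    from bs4 import BeautifulSoup

    def fetch_page_text(url, timeout=10):
        """Download a page and return its visible text for the LLM."""
        resp = requests.get(url, timeout=timeout,
                            headers={"User-Agent": "rag-agent/0.1"})
        resp.raise_for_status()
        soup = BeautifulSoup(resp.text, "html.parser")
        for tag in soup(["script", "style"]):
            tag.decompose()  # drop non-content elements
        return " ".join(soup.get_text(separator=" ").split())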

Doc Search Agent

Deep Search Agent

Results

The system successfully processes predefined questions and generates responses based on the relevant document context. Additionally, the web-browsing agent retrieves live data for real-time questions, providing a comprehensive, multi-source approach to answering queries.

Example:

[Screenshot: sample query responses generated by the system, captured 2025-02-24]

The system demonstrates effective integration of multiple techniques to solve complex QA tasks.

License

This project is licensed under the MIT License.
