ArtisticProgramming/BookRetrievalAI

📚 BookRetrievalAI – Flexible .NET RAG Chatbot System with Azure OpenAI, Ollama & Qdrant Vector DB

A modular Retrieval-Augmented Generation (RAG) system built with .NET 9, powered by Semantic Kernel, supporting:

  • ☁️ Azure OpenAI
  • 🖥 Local LLMs via Ollama
  • 🧠 Qdrant Vector Database (Docker)
  • 🔄 Configuration-based provider switching
  • 🔧 Custom local model & embedding selection

🚀 What This Project Does

This project implements a complete RAG pipeline:

  1. Parse the book summaries dataset
  2. Chunk the content into smaller segments
  3. Generate embeddings for each chunk
  4. Store the vectors in Qdrant
  5. Retrieve the chunks most relevant to a question
  6. Build a contextual prompt from them
  7. Generate the final answer using the selected LLM provider
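
The chunking step (step 2) can be sketched as follows. This is an illustrative Python sketch, not the project's C# code: the function name `chunk_text`, the word-based splitting, and the default sizes are all assumptions made for the example.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.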

You can switch between Azure OpenAI and local Ollama models without changing code; only the configuration changes.


🧠 What is RAG?

Retrieval-Augmented Generation (RAG) improves LLM responses by:

  • Searching a vector database for relevant information
  • Injecting that context into the prompt
  • Generating grounded, data-aware answers

Instead of relying only on model training data, RAG uses your own dataset.
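
To make the three steps concrete, here is a small Python sketch of retrieval and prompt injection over toy in-memory vectors. It stands in for what the vector database does and is not the project's implementation: `cosine`, `top_k`, `build_prompt`, and the prompt wording are hypothetical.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], indexed: list[tuple[list[float], str]], k: int = 2) -> list[str]:
    """Return the texts of the k stored chunks most similar to the query vector."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]

def build_prompt(question: str, chunks: list[str]) -> str:
    """Inject the retrieved chunks into the prompt ahead of the question."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In the real system the embedding model produces the vectors and Qdrant performs the similarity search; the sketch only shows the shape of the data flow.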


🏗 Architecture Overview

User Question
      ↓
Embedding Model
      ↓
Qdrant Vector Search
      ↓
Context Builder
      ↓
Prompt Builder
      ↓
Chat Model (Azure or Ollama)
      ↓
Final Response

📦 Technologies Used

🧠 Semantic Kernel

Used for:

  • Chat completion
  • Embedding generation
  • Prompt orchestration
  • Multi-provider abstraction

☁️ Azure OpenAI

Default configuration:

  • Chat Model: gpt-4o-mini
  • Embedding Model: text-embedding-3-small

⚡ You can change deployment names in appsettings.json to use any Azure deployment you create.


🖥 Ollama (Local Models)

Default configuration:

  • Chat Model: qwen2.5:3b
  • Embedding Model: nomic-embed-text

🔧 You Can Use ANY Local Model

This system is not limited to the default models.

You can use any chat model or embedding model supported by Ollama.

Simply update:

"ChatModel": "your-local-chat-model",
"EmbeddingModel": "your-local-embedding-model"

As long as the model exists in Ollama, the system can use it.
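
One way to confirm that a model exists locally is to query Ollama's `/api/tags` endpoint, which lists installed models. The Python sketch below assumes the default endpoint and that each entry carries a `name` field; the helper names are hypothetical and the project itself may not perform this check.

```python
import json
import urllib.request

def normalize(model: str) -> str:
    """Ollama lists untagged models with an explicit ':latest' tag."""
    return model if ":" in model else f"{model}:latest"

def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names from a parsed /api/tags response."""
    return {entry["name"] for entry in tags_payload.get("models", [])}

def is_installed(model: str, tags_payload: dict) -> bool:
    return normalize(model) in installed_models(tags_payload)

def fetch_tags(endpoint: str = "http://localhost:11434") -> dict:
    """Fetch the installed-model list from a running Ollama server."""
    with urllib.request.urlopen(f"{endpoint}/api/tags", timeout=5) as response:
        return json.load(response)
```

Calling `is_installed("nomic-embed-text", fetch_tags())` before indexing gives an early, readable failure instead of an error partway through embedding generation.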


🧠 Qdrant

Vector database used to:

  • Store embeddings
  • Perform similarity search
  • Retrieve relevant chunks

Runs locally via Docker.


⚙️ Setup Guide


1️⃣ Install Qdrant (Required)

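Run using Docker. The command below publishes the container's port 6333 (Qdrant's default HTTP port) on host port 6334: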

docker run -p 6334:6333 qdrant/qdrant

Qdrant will be available at:

http://localhost:6334
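
To verify that Qdrant is reachable, you can list its collections over HTTP. This Python sketch assumes the address above serves Qdrant's REST API and that `GET /collections` returns the names under `result.collections`; the helper names are hypothetical.

```python
import json
import urllib.request

def collection_names(payload: dict) -> list[str]:
    """Extract collection names from a parsed GET /collections response."""
    return [c["name"] for c in payload.get("result", {}).get("collections", [])]

def fetch_collections(endpoint: str = "http://localhost:6334") -> dict:
    """Query a running Qdrant instance for its collections."""
    with urllib.request.urlopen(f"{endpoint}/collections", timeout=5) as response:
        return json.load(response)
```

On a fresh container `collection_names(fetch_collections())` should return an empty list; a connection error means the container is not running or the port mapping differs.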

Your config:

"QdrantEndpoint": "http://localhost:6334"

2️⃣ Install Ollama for Local LLM

Download from:

https://ollama.com

Pull required models:

ollama pull qwen2.5:3b
ollama pull nomic-embed-text

Start Ollama:

ollama serve

Default endpoint:

http://localhost:11434

🔄 Configuration Guide

All configuration is controlled via appsettings.json.


☁️ Using Azure OpenAI

"AzureOpenAI": {
  "Endpoint": "https://your-endpoint.openai.azure.com/",
  "ApiKey": "YOUR_API_KEY",
  "ChatDeployment": "gpt-4o-mini",
  "EmbeddingDeployment": "text-embedding-3-small",
  "collectionName": "books",
  "Enabled": true
}

Disable Ollama:

"Ollama": {
  "Enabled": false
}

🖥 Using Ollama (Local Mode)

"Ollama": {
  "Endpoint": "http://localhost:11434",
  "ChatModel": "your-model",
  "EmbeddingModel": "your-embedding-model",
  "collectionName": "booksWithOllama",
  "Enabled": true
}

Disable Azure:

"AzureOpenAI": {
  "Enabled": false
}
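
The switching logic implied by the `Enabled` flags can be sketched in Python as follows. How the project actually resolves the flags is not documented here, so treating "exactly one provider enabled" as the only valid state is an assumption, and `select_provider` is a hypothetical name.

```python
def select_provider(settings: dict) -> str:
    """Return the name of the single provider whose Enabled flag is true."""
    enabled = [
        name
        for name in ("AzureOpenAI", "Ollama")
        if settings.get(name, {}).get("Enabled")
    ]
    if len(enabled) != 1:
        # Assumption: zero or two enabled providers is treated as a config error.
        raise ValueError(f"expected exactly one enabled provider, found {enabled}")
    return enabled[0]
```

Failing fast on an ambiguous configuration avoids indexing into one provider's collection and querying with the other provider's embeddings.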

📊 Dataset

Default dataset file:

dataset/booksummaries.txt

Important

The included dataset contains 100 book summaries. It is intentionally small for:

  • Fast testing
  • Quick indexing
  • Development purposes

For high-scale RAG testing, you can download the full CMU Book Summary Dataset:

https://www.kaggle.com/datasets/ymaricar/cmu-book-summary-dataset

This dataset contains thousands of book summaries and is ideal for:

  • Performance testing
  • Large vector indexing
  • Real-world RAG benchmarking
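
Parsing a dataset line can be sketched as below. The sketch assumes the CMU layout of seven tab-separated fields (Wikipedia ID, Freebase ID, title, author, publication date, genres as a JSON object, plot summary) and that the bundled file follows it; verify this against your copy. `BookSummary` and `parse_line` are hypothetical names, written in Python rather than the project's C#.

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class BookSummary:
    title: str
    author: str
    genres: list[str]
    summary: str

def parse_line(line: str) -> Optional[BookSummary]:
    """Parse one tab-separated record; return None if the field count is wrong."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        return None
    _wiki_id, _freebase_id, title, author, _date, genres_json, summary = fields
    genres: list[str] = []
    if genres_json:
        try:
            parsed = json.loads(genres_json)
            if isinstance(parsed, dict):
                genres = list(parsed.values())  # keep genre names, drop Freebase IDs
        except json.JSONDecodeError:
            pass  # leave genres empty when the field is malformed
    return BookSummary(title, author, genres, summary.strip())
```

Returning `None` for malformed lines lets the indexing loop skip them instead of aborting a long run.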

After downloading, update:

"DatasetFilePath": "your-new-dataset-path"

🛠 How to Run

  1. Start Qdrant (Docker)
  2. (Optional) Start Ollama
  3. Configure appsettings.json
  4. Run the project:
dotnet run

📌 Notes

  • Qdrant must be running before indexing
  • Ollama must be running if local mode is enabled
  • Azure requires a valid API key and valid deployment names
  • Collections are kept separate per embedding model
  • When changing embedding models, use a new collection name
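
The last two notes follow from the fact that different embedding models generally produce vectors of different sizes (for example, 1536 dimensions by default for text-embedding-3-small versus 768 for nomic-embed-text), so their vectors cannot share a collection. A hypothetical Python helper for deriving a per-model collection name and guarding against mixed dimensions might look like this; neither function is part of the project.

```python
import re

def collection_name(base: str, embedding_model: str) -> str:
    """Derive a collection name that is unique per embedding model."""
    slug = re.sub(r"[^a-z0-9]+", "_", embedding_model.lower()).strip("_")
    return f"{base}_{slug}"

def check_dimension(vector: list[float], expected: int) -> None:
    """Raise if a vector does not match the collection's configured size."""
    if len(vector) != expected:
        raise ValueError(f"vector has {len(vector)} dimensions, collection expects {expected}")
```

Deriving the name from the model keeps an existing index intact when you switch models, so you can switch back without re-indexing.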
