📚 BookRetrievalAI – Flexible .NET RAG Chatbot System with Azure OpenAI, Ollama & Qdrant Vector DB

A modular Retrieval-Augmented Generation (RAG) system built with .NET 9, powered by Semantic Kernel, supporting:

☁️ Azure OpenAI
🖥 Local LLMs via Ollama
🧠 Qdrant Vector Database (Docker)
🔄 Configuration-based provider switching
🔧 Custom local model & embedding selection

🚀 What This Project Does

This project implements a complete RAG pipeline:

Parse the book summaries dataset
Chunk the content into smaller segments
Generate embeddings for each chunk
Store the vectors in Qdrant
Retrieve the chunks most relevant to a question
Build a contextual prompt from those chunks
Generate the final answer using the selected LLM provider

You can switch between Azure OpenAI and local Ollama models by editing configuration only; no code changes are needed.
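
The chunking step in the pipeline above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's C# implementation; the README does not say how chunks are produced, so the word-window strategy and the `chunk_size` and `overlap` parameters are assumptions.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into word-based chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words, so text near a chunk
    boundary keeps some of its surrounding context.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Smaller chunks give more precise matches but less context per match; the overlap is a common way to soften that trade-off.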

🧠 What is RAG?

Retrieval-Augmented Generation (RAG) improves LLM responses by:

Searching a vector database for relevant information
Injecting that context into the prompt
Generating grounded, data-aware answers

Instead of relying only on the model's training data, RAG grounds its answers in your own dataset.
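
To make the search step concrete, here is a small in-memory sketch of similarity search (Python, illustrative only). In this project the search runs inside Qdrant, which uses an approximate index rather than a full scan; the cosine metric and the `top_k` helper are assumptions made for the example.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vector: list[float],
          indexed: list[tuple[str, list[float]]],
          k: int = 3) -> list[tuple[float, str]]:
    """Return the k (score, chunk) pairs most similar to the query vector.

    `indexed` is a list of (chunk_text, embedding) pairs.
    """
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

The chunks returned by such a search are what gets injected into the prompt.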

🏗 Architecture Overview

User Question
      ↓
Embedding Model
      ↓
Qdrant Vector Search
      ↓
Context Builder
      ↓
Prompt Builder
      ↓
Chat Model (Azure or Ollama)
      ↓
Final Response
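
The flow above can be sketched as a chain of function calls. This is an illustrative Python outline, not the project's Semantic Kernel code: `embed`, `search`, and `chat` are hypothetical callables standing in for the embedding model, the Qdrant search, and the chat model, and the prompt wording is an assumption.

```python
from typing import Callable

def build_prompt(question: str, chunks: list[str]) -> str:
    """Context Builder + Prompt Builder: number the chunks and wrap them in instructions."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer(question: str,
           embed: Callable[[str], list[float]],
           search: Callable[[list[float], int], list[str]],
           chat: Callable[[str], str],
           k: int = 3) -> str:
    query_vector = embed(question)            # Embedding Model
    chunks = search(query_vector, k)          # Qdrant Vector Search
    prompt = build_prompt(question, chunks)   # Context Builder + Prompt Builder
    return chat(prompt)                       # Chat Model (Azure or Ollama)
```

Because each stage is passed in as a function, swapping Azure OpenAI for Ollama only changes which `embed` and `chat` callables are supplied.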

📦 Technologies Used

🧠 Semantic Kernel

Used for:

Chat completion
Embedding generation
Prompt orchestration
Multi-provider abstraction

☁️ Azure OpenAI

Default configuration:

Chat Model: gpt-4o-mini
Embedding Model: text-embedding-3-small

⚡ You can change deployment names in appsettings.json to use any Azure deployment you create.

🖥 Ollama (Local Models)

Default configuration:

Chat Model: qwen2.5:3b
Embedding Model: nomic-embed-text

🔧 You Can Use ANY Local Model

This system is not limited to the default models.

You can use any chat model or embedding model supported by Ollama.

Simply update:

"ChatModel": "your-local-chat-model",
"EmbeddingModel": "your-local-embedding-model"

As long as the model has been pulled into your local Ollama instance (for example with ollama pull), the system can use it.
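
One way to confirm that a model exists in Ollama before pointing the configuration at it is to query Ollama's `/api/tags` endpoint, which lists locally available models. The sketch below (Python, illustrative) assumes the default endpoint; the helper names are hypothetical.

```python
import json
import urllib.request

def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names from an /api/tags response body."""
    return {model["name"] for model in tags_payload.get("models", [])}

def is_installed(model: str, tags_payload: dict) -> bool:
    """Check a configured model name; Ollama treats a name without a tag as ':latest'."""
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in installed_models(tags_payload)

def fetch_tags(endpoint: str = "http://localhost:11434") -> dict:
    """Call a running Ollama server and return the parsed /api/tags response."""
    with urllib.request.urlopen(f"{endpoint}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

For example, `is_installed("nomic-embed-text", fetch_tags())` returns `True` only after `ollama pull nomic-embed-text` has completed.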

🧠 Qdrant

Vector database used to:

Store embeddings
Perform similarity search
Retrieve relevant chunks

Runs locally via Docker.

⚙️ Setup Guide

1️⃣ Install Qdrant (Required)

Run using Docker:

docker run -p 6334:6333 qdrant/qdrant

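The -p 6334:6333 flag maps host port 6334 to Qdrant's REST port 6333 inside the container (the container's gRPC port 6334 is not published by this command), so Qdrant will be available at: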

http://localhost:6334
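
Once the container is up, a quick way to confirm the endpoint responds is to list collections over Qdrant's REST API (`GET /collections`). This is a hedged Python sketch that assumes the port mapping from the Docker command above; the helper names are hypothetical.

```python
import json
import urllib.request

def collection_names(payload: dict) -> list[str]:
    """Extract collection names from a GET /collections response body."""
    return [item["name"] for item in payload["result"]["collections"]]

def list_collections(endpoint: str = "http://localhost:6334") -> list[str]:
    """Ask a running Qdrant instance which collections exist."""
    with urllib.request.urlopen(f"{endpoint}/collections", timeout=5) as resp:
        return collection_names(json.load(resp))
```

A fresh container returns an empty list; after indexing, the configured collection name should appear in it.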

Your config:

"QdrantEndpoint": "http://localhost:6334"

2️⃣ Install Ollama for Local LLM

Download from:

https://ollama.com

Pull required models:

ollama pull qwen2.5:3b
ollama pull nomic-embed-text

Start Ollama:

ollama serve

Default endpoint:

http://localhost:11434
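
To verify that the embedding model answers on this endpoint, you can call Ollama's `/api/embeddings` route directly. The Python sketch below is illustrative and separate from the project's Semantic Kernel integration; the function names are hypothetical, and the defaults mirror the models pulled above.

```python
import json
import urllib.request

def embedding_request(text: str,
                      model: str = "nomic-embed-text",
                      endpoint: str = "http://localhost:11434") -> urllib.request.Request:
    """Build (but do not send) a POST request for Ollama's /api/embeddings route."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        f"{endpoint}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text: str, **kwargs) -> list[float]:
    """Send the request to a running Ollama server and return the embedding vector."""
    with urllib.request.urlopen(embedding_request(text, **kwargs), timeout=30) as resp:
        return json.load(resp)["embedding"]
```

The length of the returned list is the vector dimension that the Qdrant collection must match.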

🔄 Configuration Guide

All configuration is controlled via appsettings.json.

☁️ Using Azure OpenAI

"AzureOpenAI": {
  "Endpoint": "https://your-endpoint.openai.azure.com/",
  "ApiKey": "YOUR_API_KEY",
  "ChatDeployment": "gpt-4o-mini",
  "EmbeddingDeployment": "text-embedding-3-small",
  "collectionName": "books",
  "Enabled": true
}

Disable Ollama:

"Ollama": {
  "Enabled": false
}

🖥 Using Ollama (Local Mode)

"Ollama": {
  "Endpoint": "http://localhost:11434",
  "ChatModel": "your-model",
  "EmbeddingModel": "your-embedding-model",
  "collectionName": "booksWithOllama",
  "Enabled": true
}

Disable Azure:

"AzureOpenAI": {
  "Enabled": false
}
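
The switching logic implied by the two `Enabled` flags can be sketched as follows. This is an illustrative Python outline, not the project's C# code. The README does not say what happens when both providers (or neither) are enabled, so rejecting that case is an assumption, as is placing both sections at the top level of the settings file.

```python
import json

PROVIDERS = ("AzureOpenAI", "Ollama")

def select_provider(settings: dict) -> tuple[str, dict]:
    """Return (name, section) for the single provider whose Enabled flag is true."""
    enabled = [name for name in PROVIDERS
               if settings.get(name, {}).get("Enabled", False)]
    if len(enabled) != 1:
        raise ValueError(f"expected exactly one enabled provider, found {enabled}")
    name = enabled[0]
    return name, settings[name]

# Example settings mirroring the Ollama configuration shown above.
settings = json.loads("""
{
  "AzureOpenAI": { "Enabled": false },
  "Ollama": {
    "Endpoint": "http://localhost:11434",
    "ChatModel": "qwen2.5:3b",
    "EmbeddingModel": "nomic-embed-text",
    "collectionName": "booksWithOllama",
    "Enabled": true
  }
}
""")
name, section = select_provider(settings)
print(name, section["collectionName"])
```

Each provider section carries its own `collectionName`, so selecting a provider also selects the collection that was indexed with that provider's embedding model.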

📊 Dataset

Default dataset file:

dataset/booksummaries.txt
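
A minimal parsing sketch for this file, in Python for illustration. It assumes the bundled file uses the same layout as the CMU Book Summary Dataset: one book per line with seven tab-separated fields. The field names below are chosen for the example and do not come from the project.

```python
from typing import Iterable, Iterator

# Assumed column order of the CMU Book Summary Dataset.
FIELDS = [
    "wikipedia_id", "freebase_id", "title", "author",
    "publication_date", "genres", "summary",
]

def parse_summaries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per well-formed line, skipping rows without a summary."""
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != len(FIELDS):
            continue  # malformed row: wrong number of tab-separated fields
        record = dict(zip(FIELDS, parts))
        if record["summary"].strip():
            yield record

def load_summaries(path: str = "dataset/booksummaries.txt") -> list[dict]:
    """Read and parse the dataset file from disk."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_summaries(handle))
```

Each record's `summary` is the text that would then be chunked and embedded.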

Important

The included dataset contains 100 book summaries. It is intentionally small for:

Fast testing
Quick indexing
Development purposes

For larger-scale RAG testing, you can download the full CMU Book Summary Dataset:

https://www.kaggle.com/datasets/ymaricar/cmu-book-summary-dataset

This dataset contains thousands of book summaries and is ideal for:

Performance testing
Large vector indexing
Real-world RAG benchmarking

After downloading, update:

"DatasetFilePath": "your-new-dataset-path"

🛠 How to Run

Start Qdrant (Docker)
(Optional) Start Ollama, if you are using local mode
Configure appsettings.json
Run the project:

dotnet run

📌 Notes

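Qdrant must be running before indexing
Ollama must be running if local mode is enabled
Azure requires a valid API key and deployment names that exist in your Azure OpenAI resource
Collections are kept separate per embedding model, because vectors from different models are not comparable and often differ in dimension
When changing embedding models, use a new collection name and re-index the dataset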
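
The last two notes follow from how Qdrant collections are defined: each collection is created with a fixed vector size. As an illustration (Python, not the project's code, which presumably creates collections through its own services), this is roughly what a create-collection call looks like over Qdrant's REST API. The `Cosine` distance and the helper name are assumptions; the sizes in the usage note are the default output dimensions of the two default embedding models.

```python
import json
import urllib.request

def create_collection_request(name: str,
                              vector_size: int,
                              endpoint: str = "http://localhost:6334") -> urllib.request.Request:
    """Build (but do not send) a PUT /collections/{name} request for Qdrant."""
    body = json.dumps(
        {"vectors": {"size": vector_size, "distance": "Cosine"}}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{endpoint}/collections/{name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

With the defaults in this README, `books` would hold 1536-dimensional vectors from text-embedding-3-small and `booksWithOllama` 768-dimensional vectors from nomic-embed-text, which is why one collection cannot serve both.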
