DeskChat is a fully local, privacy-first desktop chatbot that can understand and answer questions about your own files.
It runs entirely on your machine and combines:
- A normal conversational chatbot
- Retrieval-Augmented Generation (RAG) over your documents
- A multi-agent reasoning pipeline (Answerer → Critic → Refiner)
- Optional free web search (DuckDuckGo)
- Built-in document tools (summaries, TOC, citation extraction)
- A clean web UI built with Streamlit
- Optional secure remote access via ngrok
Nothing is sent to any cloud AI service. All models run locally using Ollama.
- Chat like a normal AI assistant
- Upload PDFs, text, Word, CSV, JSON, HTML, images, and more
- Ask questions about your own documents
- Keep conversations local and private
For harder questions, DeskChat can use multiple internal agents:
- Answerer — produces a first draft answer
- Critic — reviews it and points out issues
- Refiner — improves it into a final answer
This gives you more structured, higher-quality responses when needed.
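The three roles above can be sketched as a chain of prompts. This is a minimal illustration, not the code in core/agents.py: `run_pipeline`, the `generate` callable, and the prompt wording are all assumptions, and `generate` stands in for whatever function sends a prompt to a local Ollama model.

```python
from typing import Callable

# Hypothetical signature: takes a prompt string, returns the model's reply.
Generate = Callable[[str], str]


def run_pipeline(question: str, generate: Generate) -> dict[str, str]:
    """Run Answerer -> Critic -> Refiner and return every intermediate text."""
    # Answerer: first draft.
    draft = generate(f"Answer the question.\n\nQuestion: {question}")
    # Critic: sees the question and the draft.
    critique = generate(
        "List factual errors, gaps, or unclear points in the draft.\n\n"
        f"Question: {question}\n\nDraft: {draft}"
    )
    # Refiner: sees the question, the draft, and the critique.
    final = generate(
        "Rewrite the draft so that it addresses the critique.\n\n"
        f"Question: {question}\n\nDraft: {draft}\n\nCritique: {critique}"
    )
    return {"draft": draft, "critique": critique, "final": final}
```

Each question costs three model calls in this sketch, which is one reason to reserve the pipeline for harder questions.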
- Generate academic summaries of uploaded papers
- Extract table-of-contents / section structure
- Extract cited authors and references
- Runs entirely on your machine
- No cloud AI APIs
- No telemetry or tracking
- Optional web search using DuckDuckGo
- Optional remote access via ngrok (with basic auth)
- Automatic cleanup of old data
You → Streamlit UI → Core Logic → Local Models (Ollama)
                         │
                         ├── Vector Store (Chroma)
                         ├── Document Loaders
                         ├── Multi-Agent Pipeline
                         └── Optional Web Search
Everything stays local unless you explicitly enable web search or remote access.
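To make the retrieval step concrete, here is a toy sketch of what a vector store does for RAG: rank stored chunks by similarity to the query vector, then put the best ones in front of the question. In DeskChat the search is handled by Chroma and the vectors come from an embedding model. The function names, the hand-written vectors, and the prompt wording below are assumptions for illustration only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the k chunk texts whose vectors are most similar to query_vec."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Put the retrieved chunks in front of the question."""
    joined = "\n---\n".join(context)
    return f"Use only this context:\n{joined}\n\nQuestion: {question}"


# Toy data: two-dimensional vectors stand in for real embeddings.
chunks = [("cats", [1.0, 0.0]), ("dogs", [0.0, 1.0]), ("kittens", [0.9, 0.1])]
prompt = build_prompt("What purrs?", top_k([1.0, 0.0], chunks))
```

A real store adds persistence and approximate search, but the ranking idea is the same.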
deskchat/
│
├── app.py # Streamlit entry point
├── config.py # Configuration & constants
├── requirements.txt
├── README.md
│
├── core/ # All non-UI logic
│ ├── housekeeping.py # Cleanup tasks
│ ├── session.py # Session & chat persistence
│ ├── loaders.py # File loading logic
│ ├── vectorstore.py # Indexing and retrieval
│ ├── web.py # DuckDuckGo search
│ ├── refiners.py # Summaries, TOC, authors
│ ├── agents.py # Multi-agent logic
│ ├── generation.py # Single-agent RAG / chat
│ ├── ngrok.py # Remote access helpers
│ └── utils.py # Utility helpers
│
├── ui/ # Streamlit UI components
│ ├── sidebar.py
│ ├── knowledge.py
│ └── chat.py
│
├── data/ # Uploaded files
└── storage/
    ├── chats/ # Saved chat histories
    └── indexes/ # Vector indexes
- Python 3.10+
- Ollama installed and running
- 16 GB RAM recommended (more for large models)
- GPU recommended but optional
pip install -r requirements.txt
Pull the models DeskChat uses:
ollama pull qwen3:8b
ollama pull qwen2.5:7b
ollama pull gemma2:9b
ollama pull starling-lm:7b-alpha
ollama pull mxbai-embed-large
Start the app:
streamlit run app.py
Then open:
http://localhost:8501
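If the app starts but a model is missing, a quick check against the local Ollama server can help. The sketch below is not part of DeskChat: the function names are hypothetical, and it assumes Ollama is listening on its default address (http://localhost:11434) and exposes its model list at /api/tags.

```python
import json
import urllib.request

# The models named in the pull commands for DeskChat.
REQUIRED = [
    "qwen3:8b",
    "qwen2.5:7b",
    "gemma2:9b",
    "starling-lm:7b-alpha",
    "mxbai-embed-large",
]


def normalize(name: str) -> str:
    """Ollama lists untagged models as name:latest, so add the tag if missing."""
    return name if ":" in name else f"{name}:latest"


def missing_models(required: list[str], installed: list[str]) -> list[str]:
    """Return the required models that are not in the installed list."""
    have = {normalize(n) for n in installed}
    return [m for m in required if normalize(m) not in have]


def installed_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask the local Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return [m["name"] for m in json.load(resp)["models"]]
```

With Ollama running, `missing_models(REQUIRED, installed_models())` returns the names that still need an `ollama pull`.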
If you want to access DeskChat from your phone or another device:
ngrok config add-authtoken YOUR_TOKEN
Then enable remote access inside the sidebar UI.
Basic authentication is enabled by default, so the public ngrok URL is not open to anyone who finds it.
- Drag & drop files into the upload area
- Click Build / Rebuild index
- RAG — uses your documents as context
- Plain Chat — normal chatbot, no documents
- Multi-Agent — Answerer + Critic + Refiner pipeline
For each uploaded file you can:
- Generate a summary
- Generate a table of contents
- Extract cited authors
By default:
- Chats and indexes older than 10 days are deleted automatically
- You can change this in config.py
| Data type | Leaves your machine? |
|---|---|
| Chat messages | No |
| Documents | No |
| Embeddings | No |
| Models | No |
| Web queries | Only if enabled |
| Remote access | Only if enabled |
- DeskChat does not include content moderation unless you add it.
- Output quality and bias depend on the underlying models.
- Intended for personal use, research, and experimentation.
Run with live reload:
streamlit run app.py --server.runOnSave true
Licensed under MIT: do whatever you want with it.