Semantic search for Obsidian vaults using LanceDB and cloud or local embedding models
Updated Jun 15, 2026 - Python
A Python project that deploys a local RAG chatbot using the Ollama API. It refines answers with Deep Research from external websites and uses both embedding and LLM models.
One brain. Every AI agent. Nothing forgotten. — Self-hosted memory layer via MCP + Postgres + pgvector
MCP RAG server — local embeddings, your docs never leave your machine. Private knowledge base + web search for Claude, Cursor, and Ollama. Drop your docs, connect your AI client, done.
Turn your voice into intelligent, linked notes inside Obsidian
A Python project that deploys a local RAG chatbot using the Ollama API. It refines answers with an internal RAG knowledge base and uses both embedding and LLM models.
A Python project that deploys a local RAG chatbot using the Ollama API and the vLLM API. It refines answers with an internal RAG knowledge base, using both embedding and rerank models to improve the accuracy of the context provided to the LLM.
Semantic code search for VS Code, powered by NightOwl-CodeEmbedding — my own ModernBERT Bi-Encoder trained from scratch. Codex/MCP ready!!
MCP server that runs local LLMs (with full access to MCP tools included). Callable by Python to chain MCP tools with local intelligence.
A FastAPI server that provides local text and multi-modal embeddings using LlamaIndex Hugging Face Embedding
Sandboxed local AI (openclaw compatible) assistants and inference orchestrator
A lightweight Retrieval-Augmented Generation (RAG) agent powered by Groq AI and local embeddings, built to process and understand text data efficiently. It retrieves relevant context from your own files and generates accurate, natural-language responses, all while keeping your data private and running locally.
Memory-as-a-Service for AI Agents & LLMs. Add persistent memory, pgvector-based semantic search, and automatic semantic deduplication with 3 simple REST API endpoints. Comes with an LRU embedding cache and a developer analytics dashboard.
claude-router is a local prompt router that picks the right Claude model tier and prepends the right scaffold using local embeddings before you call the API. A deterministic routing layer for eval, research, content, and review prompts that helps teams stop overspending on Sonnet and Opus when Haiku plus structure is enough.
Offline Express.js QA API using Ollama. Parse PDFs, embed locally, search and chat with your private docs — no cloud needed.
FastAPI | Postgres | Sentence Transformer | Local Embeddings saved to PGVector | JWT AUTH
Memory traces for AI agents - Self-improving memory system with quality control and drift detection