ThaiLearnCoding/SmartHouseBot

SmartHouseBot

Motivation

Smart homes need a voice assistant that is fast, private, and reliable. Cloud-only assistants add latency, privacy risk, and cost. This project provides a local-first baseline that can control devices, report telemetry, and respond naturally in Vietnamese while keeping the system safe and auditable.

Problem Statement

We need a production-ready baseline that:

  • Understands Vietnamese voice commands reliably.
  • Responds naturally with low latency.
  • Executes device control safely and predictably.
  • Works locally by default for privacy and cost control.
  • Scales from a demo to a real home deployment.

Solution Overview

SmartHouseBot is a full-stack, local-first smart home assistant with streaming voice responses. It combines on-device speech-to-text (PhoWhisper), a local LLM (Ollama) for intent and natural response generation, and local text-to-speech (Piper). The backend enforces strict validation and logs actions for auditing.

Technical Architecture

System Flow

  1. Audio input is recorded in the browser and sent to the backend.
  2. STT transcribes audio to text using PhoWhisper.
  3. LLM intent classification maps the command to a structured intent.
  4. Validation and safety guardrails are applied (values are clamped, invalid actions are rejected).
  5. Device control executes via CoreIoT RPC (if needed).
  6. Natural response is generated by the LLM and streamed token-by-token.
  7. TTS generates audio, which is streamed in chunks for playback.
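
The seven steps can be read as one function that wires the stages together. The following is a minimal sketch, not the project's actual code: `Intent`, `run_voice_turn`, and every callable parameter are hypothetical names, and the TTS step is simplified to a single audio event instead of a chunked stream.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class Intent:
    """Hypothetical structured intent; the real schema is not shown in this README."""
    action: str
    value: Optional[int] = None


def run_voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    classify: Callable[[str], Intent],
    validate: Callable[[Intent], Optional[Intent]],
    execute: Callable[[Intent], None],
    respond: Callable[[str, Optional[Intent]], Iterator[str]],
    synthesize: Callable[[str], bytes],
) -> Iterator[tuple[str, object]]:
    """Yield ("token", str) events, then one ("audio", bytes) event."""
    text = transcribe(audio)                # step 2: speech-to-text
    intent = validate(classify(text))       # steps 3-4: intent, then guardrails
    if intent is not None:
        execute(intent)                     # step 5: device control, only if valid
    tokens = []
    for token in respond(text, intent):     # step 6: stream response tokens
        tokens.append(token)
        yield ("token", token)
    # Step 7, simplified: synthesize the whole response as one audio event.
    yield ("audio", synthesize("".join(tokens)))
```

Passing the stages in as callables keeps the ordering visible and lets each stage be replaced by a stub in tests. An invalid intent skips the device call but still produces a spoken response.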

Core Components

Frontend

  • React 19 + Vite
  • Zustand state management
  • Streaming playback with Web Audio

Backend

  • FastAPI application
  • Voice pipeline: STT -> LLM intent -> device action -> LLM response -> TTS
  • Audit log for intent resolution and actions

Models

  • STT: PhoWhisper via transformers
  • LLM: Ollama (local)
  • TTS: Piper

Protocols and Interfaces

  • REST for telemetry, devices, and health
  • WebSocket for streaming assistant tokens and audio chunks
  • CoreIoT RPC for device control
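
To make the WebSocket streaming concrete, here is one way token and audio messages could be framed on a single socket. This README does not document the wire format, so the JSON shape below (`type`, `text`, `seq`, `data`) and the function names are assumptions for illustration only.

```python
import base64
import json


def encode_token(text: str) -> str:
    """Frame one response token as a JSON text message (assumed shape)."""
    return json.dumps({"type": "token", "text": text}, ensure_ascii=False)


def encode_audio_chunk(chunk: bytes, seq: int) -> str:
    """Frame one audio chunk; base64 keeps binary data inside JSON."""
    data = base64.b64encode(chunk).decode("ascii")
    return json.dumps({"type": "audio", "seq": seq, "data": data})


def decode_message(raw: str) -> tuple[str, object]:
    """Return ("token", str) or ("audio", bytes) for a received message."""
    msg = json.loads(raw)
    if msg["type"] == "token":
        return ("token", msg["text"])
    if msg["type"] == "audio":
        return ("audio", base64.b64decode(msg["data"]))
    raise ValueError(f"unknown message type: {msg['type']!r}")
```

A `type` field lets the client route tokens to the transcript view and audio chunks to the playback queue. Sending audio as binary WebSocket frames would avoid the base64 overhead, at the cost of a second framing convention.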

API Endpoints

Health

  • GET /api/health

Devices

  • GET /api/devices/status
  • POST /api/devices/led
  • POST /api/devices/servo

Telemetry

  • GET /api/telemetry/latest
  • GET /api/telemetry/history?range_hours=24

Voice

  • POST /api/voice/text-turn
  • POST /api/voice/audio-turn
  • POST /api/voice/transcribe
  • WS /api/voice/stream
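
The GET endpoints can be exercised with a small standard-library client. This is a sketch under assumptions: it targets the development address from the Run step, `build_url` and `get_json` are hypothetical helper names, and the response bodies are assumed to be JSON because their shapes are not documented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Development address listed under "Run"; change it for other deployments.
BASE_URL = "http://localhost:8000"


def build_url(path: str, **params: object) -> str:
    """Join the base URL, an endpoint path, and optional query parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}{path}{query}"


def get_json(path: str, timeout: float = 5.0, **params: object) -> object:
    """GET an endpoint and decode the body, assuming it is JSON."""
    with urlopen(build_url(path, **params), timeout=timeout) as resp:
        return json.load(resp)
```

With the backend running, `get_json("/api/health")` checks liveness and `get_json("/api/telemetry/history", range_hours=24)` requests the last 24 hours of telemetry.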

Repository Layout

SmartHouseBot/
  assets/
    models/
  backend/
    app/
      clients/
      controllers/
      core/
      middleware/
      routers/
      schemas/
      services/
      main.py
  frontend/
    src/
      components/
      lib/
      pages/
      services/
      store/
  scripts/
  package.json
  requirements.txt
  .env.example

Setup From Scratch

1) Prerequisites

  • Python 3.11+
  • Node.js 18+
  • FFmpeg
  • Ollama (for local LLM)

2) Clone and install

git clone <your-repo-url>
cd SmartHouseBot
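python -m venv .venv
# Activate the virtual environment before installing Python packages:
#   Windows (PowerShell): .venv\Scripts\Activate.ps1
#   macOS/Linux:          source .venv/bin/activate
pip install -r requirements.txt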
npm install
cd frontend
npm install
cd ..

3) Install FFmpeg (Windows)

./scripts/install_ffmpeg.ps1

Verify:

ffmpeg -version

4) Install and setup Ollama

  1. Install from https://ollama.com/download
  2. Pull a model:
ollama pull qwen2.5:3b-instruct

5) Configure environment

cp .env.example .env

Required:

  • COREIOT_EMAIL
  • COREIOT_PASSWORD
  • COREIOT_DEVICE_ID
  • PIPER_MODEL=assets/models/vi_VN-vais1000-medium.onnx

Common optional:

  • PHO_WHISPER_MODEL
  • PHO_WHISPER_DEVICE
  • HF_HOME
  • HF_HUB_OFFLINE
  • LLM_ENABLED=true
  • OLLAMA_MODEL=qwen2.5:3b-instruct
  • LLM_INTENT_ENABLED=true
  • AUDIT_LOG_ENABLED=true
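
A startup check that fails fast on missing required variables might look like the sketch below. The `Settings` class, `load_settings`, and the default values for the optional variables are assumptions rather than the backend's actual configuration code, and the sketch reads the process environment only; it does not parse the `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("COREIOT_EMAIL", "COREIOT_PASSWORD", "COREIOT_DEVICE_ID", "PIPER_MODEL")


def parse_bool(value: str) -> bool:
    """Treat common truthy spellings as True, everything else as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    coreiot_email: str
    coreiot_password: str
    coreiot_device_id: str
    piper_model: str
    llm_enabled: bool
    ollama_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, or raise if incomplete."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return Settings(
        coreiot_email=env["COREIOT_EMAIL"],
        coreiot_password=env["COREIOT_PASSWORD"],
        coreiot_device_id=env["COREIOT_DEVICE_ID"],
        piper_model=env["PIPER_MODEL"],
        # Defaults below are assumed, mirroring the example values listed above.
        llm_enabled=parse_bool(env.get("LLM_ENABLED", "true")),
        ollama_model=env.get("OLLAMA_MODEL", "qwen2.5:3b-instruct"),
    )
```

Taking the environment as a parameter keeps the function testable with a plain dictionary.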

6) Run

npm run dev

Starts:

  • FastAPI on http://localhost:8000
  • Vite on http://localhost:5173

Safety and Reliability

  • Intent validation and clamping for device control
  • LLM JSON repair and fallback to rule-based parsing
  • Audit log (backend/logs/audit.jsonl)
  • Rate limiting for voice and device endpoints
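
The first two safeguards can be sketched together: parse the LLM output, repair it if the JSON is wrapped in extra text, fall back to keyword rules, then validate and clamp. Everything here is illustrative, not the project's implementation: the intent fields (`device`, `on`, `angle`), the 0 to 180 degree servo range, and the Vietnamese keywords are all assumptions.

```python
import json
import math
import re
import unicodedata
from typing import Optional

SERVO_MIN, SERVO_MAX = 0, 180  # assumed servo range in degrees


def parse_llm_intent(raw: str) -> Optional[dict]:
    """Try strict JSON first, then the outermost {...} span as a simple repair."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if match is None:
            return None
        try:
            parsed = json.loads(match.group(0))
        except json.JSONDecodeError:
            return None
    return parsed if isinstance(parsed, dict) else None


def rule_based_intent(text: str) -> Optional[dict]:
    """Keyword fallback; 'đèn' (light) and 'bật' (turn on) are example keywords."""
    lowered = unicodedata.normalize("NFC", text).lower()
    if unicodedata.normalize("NFC", "đèn") in lowered:
        return {"device": "led", "on": unicodedata.normalize("NFC", "bật") in lowered}
    return None


def validate_intent(intent: Optional[dict]) -> Optional[dict]:
    """Return a cleaned intent, or None if the action is not allowed."""
    if intent is None:
        return None
    device = intent.get("device")
    if device == "led" and isinstance(intent.get("on"), bool):
        return {"device": "led", "on": intent["on"]}
    angle = intent.get("angle")
    is_number = isinstance(angle, (int, float)) and not isinstance(angle, bool)
    if device == "servo" and is_number and math.isfinite(angle):
        clamped = max(SERVO_MIN, min(SERVO_MAX, int(angle)))
        return {"device": "servo", "angle": clamped}
    return None


def resolve_intent(llm_output: str, transcript: str) -> Optional[dict]:
    """LLM result if parseable, otherwise the rule-based result; always validated."""
    return validate_intent(parse_llm_intent(llm_output) or rule_based_intent(transcript))
```

Validation rebuilds the intent from known fields instead of passing the LLM's dictionary through, so unexpected keys never reach the device layer. Under these assumptions, an out-of-range servo angle is clamped while an unknown device is rejected outright.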

Operational Notes

  • STT and TTS are local for privacy
  • WebSocket streams tokens and audio for low latency
  • Offline mode supported with HF_HUB_OFFLINE + cached models

Testing

pytest backend/tests/test_voice_service.py

Suggested Next Steps

  • Add log rotation for audit logs
  • Add user-level access control for device actions
  • Add per-device permission policies

About

VoiceBot + Chatbot for Smart House Application - Control IoT devices seamlessly
