sumerc/zee

zee

Voice transcription that stays out of your way.
Supports Groq, OpenAI, Mistral, ElevenLabs and Deepgram models.
Push-to-talk, tap-to-toggle, or real-time streaming. Pure Go. Sub-second fast.

Go 1.24 macOS From the river to the sea, Palestine will be free

Highlights

  • System tray app — lives in the menu bar. Switch microphones, transcription providers, and languages from the tray menu. Dynamic icons show recording and warning states.
  • Two recording modes — push-to-talk (hold hotkey) or tap-to-toggle (tap to start/stop).
  • Real-time streaming — when a streaming-capable model is selected (e.g. Deepgram Nova-3), words appear as you speak and auto-paste into the focused window incrementally.
  • Fast batch mode — HTTP keep-alive, TLS connection reuse, pre-warmed connections, streaming encoder runs during recording (not after). Typical key-release to clipboard: under 500ms.
  • Auto-paste — transcribed text goes straight to clipboard and pastes into the active window. In streaming mode, each new phrase pastes as it arrives.
  • Silence detection — voice activity detection (VAD) warns when no speech is heard. In streaming mode, recording auto-closes after 30 seconds of silence.
  • Pure Go encoding — MP3 and FLAC encoders, no CGO. Three formats: mp3@16 (smallest), mp3@64 (balanced), flac (lossless).
  • Multiple providers — Groq, OpenAI, Mistral, ElevenLabs, and Deepgram, switchable from the tray menu at runtime.
  • 36 languages — select the transcription language from the tray menu or via the -lang flag.
  • Cross-platform — minimal dependencies, pure Go where possible.
    • macOS
    • Linux
    • Windows

Install

One-liner (recommended)

curl -fsSL https://raw.githubusercontent.com/sumerc/zee/main/install.sh | bash

Downloads the latest DMG, verifies its SHA256 against checksums.txt, copies Zee.app to /Applications, and clears the quarantine attribute. To pin a version, set VERSION on the bash side of the pipe: replace the trailing bash with VERSION=vX.Y.Z bash.
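
For readers who want to repeat the checksum step by hand, here is a minimal Go sketch of that kind of verification. It assumes checksums.txt uses the common sha256sum layout (hex digest, whitespace, file name). That layout, the helper names expectedSum and verify, and the file name Zee-example.dmg are assumptions, and install.sh itself may do this differently.

```go
package main

import (
	"bufio"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// expectedSum looks up name in checksum text with one
// "<hex digest>  <filename>" entry per line (assumed layout).
func expectedSum(checksums, name string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(checksums))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && strings.TrimPrefix(fields[1], "*") == name {
			return strings.ToLower(fields[0]), true
		}
	}
	return "", false
}

// verify reports whether data hashes to the digest listed for name.
func verify(data []byte, checksums, name string) bool {
	want, ok := expectedSum(checksums, name)
	if !ok {
		return false
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]) == want
}

func main() {
	data := []byte("example payload") // stands in for the downloaded DMG bytes
	sum := sha256.Sum256(data)
	checksums := hex.EncodeToString(sum[:]) + "  Zee-example.dmg\n"

	fmt.Println(verify(data, checksums, "Zee-example.dmg"))               // true
	fmt.Println(verify([]byte("tampered"), checksums, "Zee-example.dmg")) // false
}
```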

Manual DMG

  1. Download Zee-<version>.dmg from the latest release
  2. Open the DMG and drag Zee.app to Applications
  3. Clear quarantine: xattr -cr /Applications/Zee.app

CLI binary

For terminal usage:

# Apple Silicon
curl -L https://github.com/sumerc/zee/releases/latest/download/zee_darwin_arm64.tar.gz | tar xz

# Intel
curl -L https://github.com/sumerc/zee/releases/latest/download/zee_darwin_amd64.tar.gz | tar xz

# Run
GROQ_API_KEY=xxx ./zee              # Groq Whisper
DEEPGRAM_API_KEY=xxx ./zee          # Deepgram (streaming auto-enabled when a streaming model is selected from the tray)
./zee -debug-transcribe             # include transcription text logs

Note: When running from a terminal, macOS permissions (Microphone, Accessibility) are granted to the terminal app (e.g. Ghostty, iTerm2, Terminal), not to zee itself.

Build from source

git clone https://github.com/sumerc/zee && cd zee
make build        # CLI binary
make app          # macOS DMG

Usage

Set at least one API key, then run zee:

export GROQ_API_KEY=your_key       # batch mode (Groq Whisper)
export OPENAI_API_KEY=your_key     # batch mode (OpenAI Whisper)
export DEEPGRAM_API_KEY=your_key   # streaming mode (Deepgram)
export MISTRAL_API_KEY=your_key    # batch mode (Mistral Voxtral)
export ELEVENLABS_API_KEY=your_key # batch mode (ElevenLabs Scribe)
zee                                # starts in menu bar, hold Ctrl+Shift+Space to record

Note: export only works in the current terminal session. To make API keys available to Zee.app when launched from Spotlight or Applications, use launchctl:

launchctl setenv GROQ_API_KEY your_key

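Add this line to your ~/.zshrc so it is re-applied whenever you open a new terminal session (zsh reads ~/.zshrc for each interactive shell, not only at login).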
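
When it is unclear whether a key is reaching a process, a few lines of Go can list which of the variables above are visible. This is an illustrative helper, not part of zee: configuredKeys is a hypothetical name, and the program only reports the environment of the process that runs it.

```go
package main

import (
	"fmt"
	"os"
)

// keyVars lists the API key variables named in this README.
var keyVars = []string{
	"GROQ_API_KEY",
	"OPENAI_API_KEY",
	"DEEPGRAM_API_KEY",
	"MISTRAL_API_KEY",
	"ELEVENLABS_API_KEY",
}

// configuredKeys returns the names of the variables for which lookup
// yields a non-empty value. lookup is usually os.Getenv.
func configuredKeys(lookup func(string) string) []string {
	var found []string
	for _, name := range keyVars {
		if lookup(name) != "" {
			found = append(found, name)
		}
	}
	return found
}

func main() {
	found := configuredKeys(os.Getenv)
	if len(found) == 0 {
		fmt.Println("no API keys visible to this process")
		return
	}
	for _, name := range found {
		fmt.Println("set:", name) // print the name only, never the key itself
	}
}
```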

zee runs as a system tray app in the menu bar. Hold Ctrl+Shift+Space to record, release to transcribe. Result auto-pastes into the focused window.

Use the tray menu to switch microphones, providers, and languages — or use -setup for initial device selection.

macOS Permissions

On first run, macOS will prompt for permissions:

  1. Microphone — Required for audio recording. System Settings → Privacy & Security → Microphone.

  2. Accessibility — Required for global hotkey and auto-paste. System Settings → Privacy & Security → Accessibility.

If permissions aren't granted, zee will fail silently or the hotkey won't register. Run with -doctor to diagnose permission issues.

Testing

make test                                      # unit tests
make test-integration                          # integration tests (builds binary, requires GROQ_API_KEY)
make integration-test WAV=test/data/short.wav  # single-file integration test (requires GROQ_API_KEY)
make benchmark WAV=file.wav RUNS=5             # multiple runs for timing

Flags

Flag               Default      Description
-format            mp3@16       Audio format: mp3@16, mp3@64, or flac
-autopaste         true         Auto-paste into focused window
-setup             false        Select microphone device
-device            (default)    Use named microphone device
-lang              en           Language code (e.g., en, es, fr)
-debug             true         Enable diagnostic logging
-debug-transcribe  false        Enable transcription text logging
-doctor            false        Run system diagnostics and exit
-logpath           OS-specific  Log directory (use ./ for current dir)
-hints             -            Vocabulary hints for transcription (comma-separated)
-transcribe        -            Audio file to transcribe and exit
-benchmark         -            WAV file for benchmarking
-runs              3            Benchmark iterations
-version           false        Print version and exit
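
The single-dash flags above match the style of Go's standard flag package. The sketch below declares a subset of them with the defaults from the table and validates -format. Whether zee parses its flags this way is an assumption, and the options struct and parseOptions are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"
)

// options holds a subset of the flags listed in the table.
type options struct {
	format    string
	autopaste bool
	lang      string
	hints     []string
	runs      int
}

// parseOptions parses args using the defaults from the table and
// rejects audio formats other than the three documented ones.
func parseOptions(args []string) (options, error) {
	var o options
	var hints string
	fs := flag.NewFlagSet("zee", flag.ContinueOnError)
	fs.StringVar(&o.format, "format", "mp3@16", "audio format: mp3@16, mp3@64, or flac")
	fs.BoolVar(&o.autopaste, "autopaste", true, "auto-paste into focused window")
	fs.StringVar(&o.lang, "lang", "en", "language code")
	fs.StringVar(&hints, "hints", "", "comma-separated vocabulary hints")
	fs.IntVar(&o.runs, "runs", 3, "benchmark iterations")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	switch o.format {
	case "mp3@16", "mp3@64", "flac":
	default:
		return o, fmt.Errorf("unknown format %q", o.format)
	}
	// Split the hints on commas and drop empty entries.
	for _, h := range strings.Split(hints, ",") {
		if h = strings.TrimSpace(h); h != "" {
			o.hints = append(o.hints, h)
		}
	}
	return o, nil
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

Boolean flags in this style are turned off with the -name=false form, for example -autopaste=false.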

Environment

Variable                 Description
ZEE_LOG_PATH             Log directory override
ZEE_PPROF                pprof server address (e.g., :6060)
ZEE_CRASH=1              Trigger synthetic crash for crash-log testing
ZEE_LONGPRESS_DURATION   Hybrid hotkey long-press threshold (e.g., 350ms)
ZEE_SAVE_LAST_AUDIO=1    Enable tray action to save the last recording sample

About

Started as a vibe-coding project but turned into a standalone app I use daily for all my speech-to-text. Built with AI, love, and care — the kind of polish you get when you actually use the thing you're building.
