rxtech-lab/podcast-bot

debate-bot

A multi-agent live show generator that turns Markdown scripts into broadcast-style video + audio. Multiple LLM "agents" play out debates, lateral-thinking puzzles (海龜湯, "turtle soup"), panel discussions, and TV series episodes, narrated with TTS and rendered into a TV-channel UI (HLS) or a downloadable MP4.

It ships as a single Go binary that embeds a React (Vite) single-page app and orchestrates the LLM, TTS, image, and music providers behind the scenes.

Features

  • Multiple content types: debate, situation-puzzle, discussion, series. Each topic.md declares its type in front-matter.
  • Two server modes:
    • stream (default) — airs every queued topic over per-channel HLS video + MP3 audio, with a TV-tuner web UI. New .md files dropped into a watched folder are picked up live, no restart needed.
    • video — no channels; the browser uploads a script.md (and, for series, a zip of prior generations) and the server renders a downloadable .mp4.
  • Audio-only feed — render a podcast-style .mp3 (mixed TTS + music bed) plus a subtitles.vtt sidecar, skipping all image/video generation. Opt in per job (audio_only) or force it server-wide with the --audio flag.
  • Pluggable providers — OpenAI-compatible chat endpoint, Azure / ElevenLabs TTS, Gemini (Lyria music + scene image generation).
  • MCP tools — optional mcp.json lets agents call external Model Context Protocol tools.

Requirements

| Tool | Version | Why |
| --- | --- | --- |
| Go | 1.25+ | builds the backend (uses CGO for the SQLite driver) |
| ffmpeg + ffplay | recent | live-stream pacing, audio concat, playback (both must be on PATH) |
| bun | latest | installs & builds the React frontend |
| C toolchain (gcc/clang) | | required because mattn/go-sqlite3 is a CGO package |

API credentials (see Environment) for your chat / TTS / image / music providers.

Setup

1. Clone & install toolchains

git clone https://github.com/sirily11/debate-bot.git
cd debate-bot

# macOS
brew install go ffmpeg oven-sh/bun/bun

# Debian/Ubuntu
sudo apt-get install -y golang ffmpeg build-essential
curl -fsSL https://bun.sh/install | bash

2. Configure environment

Copy the provided .env and fill in your keys:

cp .env .env.local   # or edit .env in place

.env is loaded automatically at startup (it takes precedence over your shell env).
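
The precedence rule above can be pictured as a two-step merge: start from the shell environment, then write the .env values on top. The following is a minimal sketch under that assumption, not the project's actual loader; parseDotenv and mergeEnv are hypothetical names, and the quoting rules are a simplification.

```go
package main

import (
	"fmt"
	"strings"
)

// parseDotenv reads KEY=VALUE lines, skipping blanks and # comments.
// Surrounding double quotes on a value are stripped (simplified rule).
func parseDotenv(content string) map[string]string {
	vars := make(map[string]string)
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		value = strings.TrimSpace(value)
		if len(value) >= 2 && strings.HasPrefix(value, `"`) && strings.HasSuffix(value, `"`) {
			value = value[1 : len(value)-1]
		}
		vars[strings.TrimSpace(key)] = value
	}
	return vars
}

// mergeEnv copies the shell values first, then writes the .env values on
// top, so .env wins whenever both define the same key.
func mergeEnv(shell, dotenv map[string]string) map[string]string {
	merged := make(map[string]string, len(shell)+len(dotenv))
	for k, v := range shell {
		merged[k] = v
	}
	for k, v := range dotenv {
		merged[k] = v
	}
	return merged
}

func main() {
	shell := map[string]string{"HOST_MODEL": "from-shell", "OUT_DIR": "./out"}
	dotenv := parseDotenv("# local overrides\nHOST_MODEL=\"from-dotenv\"\n")
	env := mergeEnv(shell, dotenv)
	fmt.Println(env["HOST_MODEL"], env["OUT_DIR"]) // from-dotenv ./out
}
```

In practice this means a stale value left in .env silently overrides anything exported in the shell, which is worth checking first when a changed key does not seem to take effect.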

3. Build

make build      # builds the frontend (bun) then the Go binary into ./bin/debate-bot

Or build the pieces individually:

make frontend   # bun install && bun run build  -> internal/server/web-dist
make backend    # go build -> bin/debate-bot

Environment

Required vars (validated at startup — the process refuses to boot if any are missing):

| Var | Required | Description |
| --- | --- | --- |
| OPENAI_BASE_URL | | OpenAI-compatible chat endpoint shared by host + agents |
| OPENAI_API_KEY | | API key for the chat endpoint |
| HOST_MODEL | | model id used by the host/moderator agent |
| COMPRESSION_MODEL | | model used to compress per-agent memory when it grows |
| GEMINI_API_KEY | | drives Lyria music + Gemini scene image generation |
| COMPRESSION_BASE_URL | | defaults to OPENAI_BASE_URL |
| COMPRESSION_API_KEY | | defaults to OPENAI_API_KEY |
| SCENE_PLANNER_MODEL | | model for the visual-director pass; defaults to HOST_MODEL |
| LLM_INPUT_COST_PER_MILLION | | optional input-token price used when the provider does not return cost usage |
| LLM_OUTPUT_COST_PER_MILLION | | optional output-token price used when the provider does not return cost usage |
| AZURE_SPEECH_KEY / AZURE_SPEECH_REGION | when tts_provider: azure | Azure Speech credentials |
| ELEVENLABS_API_KEY | when tts_provider: eleven | ElevenLabs credentials |
| CLOUDFLARE_ACCOUNT_ID / CLOUDFLARE_API_TOKEN | for summary PDF export | Cloudflare Browser Rendering credentials (token needs "Browser Rendering - Edit") used to render a podcast summary into a downloadable PDF; empty returns 503 from GET /api/discussions/{id}/summary/pdf |
| OUT_DIR | | output root for audio/video/transcripts (default ./out) |
| SERIES_ROOT | | cross-run archive root for series episodes (default OUT_DIR) |
| APP_PASSWORD | | if set, gate the web UI + API behind this password (same as --password) |
| REVENUECAT_WEBHOOK_AUTH | for points purchases | shared secret expected in Authorization on POST /api/revenuecat/webhook; empty disables purchase credits |
| POINTS_COST_LEVERAGE | | multiplier over the points sale rate used for usage charges; default 3 |
| POINTS_PER_USD_COST | | exact raw points-per-provider-dollar override; bypasses POINTS_COST_LEVERAGE when set |
| POINTS_PRODUCT_GRANTS | for points purchases | comma-separated RevenueCat product-id to point grants, e.g. points_1000:1000,points_5000:5000 |
| POINTS_SIGNUP_GRANT | | optional starter balance granted once per signed-in user |

Provider-specific TTS keys are only required when a topic.md selects that provider.
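
As a rough illustration of the startup check and the "defaults to" rows, here is a minimal sketch. The requiredVars list is an assumption made for the example (the authoritative set is whatever the server validates), and missingVars and withDefault are hypothetical helper names, not functions from this repository.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredVars is an ASSUMED list for illustration only; the server's own
// startup validation is the source of truth.
var requiredVars = []string{
	"OPENAI_BASE_URL", "OPENAI_API_KEY", "HOST_MODEL", "COMPRESSION_MODEL", "GEMINI_API_KEY",
}

// missingVars returns every name in required whose looked-up value is empty.
func missingVars(required []string, lookup func(string) string) []string {
	var missing []string
	for _, name := range required {
		if strings.TrimSpace(lookup(name)) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

// withDefault returns the value of name, or the value of fallback when name
// is unset, mirroring rows such as "COMPRESSION_BASE_URL defaults to
// OPENAI_BASE_URL".
func withDefault(lookup func(string) string, name, fallback string) string {
	if v := lookup(name); v != "" {
		return v
	}
	return lookup(fallback)
}

func main() {
	// A fake environment keeps the example deterministic.
	env := map[string]string{
		"OPENAI_BASE_URL": "https://llm.example.test/v1",
		"OPENAI_API_KEY":  "sk-example",
		"HOST_MODEL":      "gpt-4o",
	}
	lookup := func(name string) string { return env[name] }

	fmt.Println("missing:", missingVars(requiredVars, lookup))
	fmt.Println("compression endpoint:", withDefault(lookup, "COMPRESSION_BASE_URL", "OPENAI_BASE_URL"))
}
```

Passing os.Getenv as the lookup function would run the same check against the real process environment.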

Points purchases with RevenueCat

The iOS app uses RevenueCat for paywalls, but the server owns the points balance. After a successful purchase, the app polls GET /api/points/balance; points are credited only when RevenueCat posts a webhook to POST /api/revenuecat/webhook.

Configure every top-up product explicitly. The server does not infer points from the App Store price:

REVENUECAT_WEBHOOK_AUTH=change-me
POINTS_PRODUCT_GRANTS="points_1000:1000,points_5000:5000,points_10000:10000"

Use the exact RevenueCat product_id values. For example, if the products are app.rxlab.debatebot.points1000, app.rxlab.debatebot.points5000, and app.rxlab.debatebot.points10000, configure:

POINTS_PRODUCT_GRANTS="app.rxlab.debatebot.points1000:1000,app.rxlab.debatebot.points5000:5000,app.rxlab.debatebot.points10000:10000"

Built-in defaults exist for early testing (consumable:1000, monthly:6667, yearly:0), but production top-ups should be listed with their real product ids. If a purchase webhook arrives for an unknown product id, the server rejects it with 400 {"error":"invalid_product_id"} instead of silently crediting 0 points.
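
To make the grant format concrete, the sketch below parses a POINTS_PRODUCT_GRANTS value into a lookup table and rejects unknown product ids rather than crediting 0 points. It is an illustrative reading of the format described above, not the server's parser; parseGrants and pointsFor are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

var errInvalidProduct = errors.New("invalid_product_id")

// parseGrants turns "product_id:points,product_id:points" into a map.
func parseGrants(spec string) (map[string]int, error) {
	grants := make(map[string]int)
	for _, pair := range strings.Split(spec, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		// Split on the last colon so the id keeps any dots or earlier colons.
		i := strings.LastIndex(pair, ":")
		if i <= 0 || i == len(pair)-1 {
			return nil, fmt.Errorf("malformed grant %q", pair)
		}
		points, err := strconv.Atoi(pair[i+1:])
		if err != nil || points < 0 {
			return nil, fmt.Errorf("invalid point value in %q", pair)
		}
		grants[pair[:i]] = points
	}
	return grants, nil
}

// pointsFor returns an error for products that are not configured instead
// of silently returning 0.
func pointsFor(grants map[string]int, productID string) (int, error) {
	points, ok := grants[productID]
	if !ok {
		return 0, errInvalidProduct
	}
	return points, nil
}

func main() {
	grants, err := parseGrants("app.rxlab.debatebot.points1000:1000,app.rxlab.debatebot.points5000:5000")
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(pointsFor(grants, "app.rxlab.debatebot.points1000")) // 1000 <nil>
	fmt.Println(pointsFor(grants, "points_1000"))                    // 0 invalid_product_id
}
```

The second lookup shows why the exact RevenueCat product_id matters: a short alias such as points_1000 is not matched against the fully qualified id.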

For iOS local development, copy iOS/Config/Secrets.xcconfig.example to iOS/Config/Secrets.xcconfig and set:

REVENUECAT_API_KEY = your-revenuecat-development-public-sdk-key
REVENUECAT_API_KEY_PROD = your-revenuecat-production-public-sdk-key

Debug uses REVENUECAT_API_KEY. Release maps the app's RevenueCatAPIKey Info.plist value to REVENUECAT_API_KEY_PROD.

To test points locally, RevenueCat must be able to reach your server. Either expose the local engine with a tunnel and use that public URL as the RevenueCat webhook URL, or simulate the webhook yourself:

curl -X POST http://localhost:8000/api/revenuecat/webhook \
  -H 'Authorization: change-me' \
  -H 'Content-Type: application/json' \
  --data '{"event":{"id":"evt-local-1","type":"INITIAL_PURCHASE","app_user_id":"OAUTH_SUBJECT","product_id":"points_1000"}}'

app_user_id must be the signed-in OAuth subject without the oauth: prefix; the server credits oauth:<app_user_id>. The app should fetch the balance at least once before a purchase so the backend has registered that user locally; otherwise the webhook returns 400 {"error":"invalid_user_id"}. Use a fresh webhook event.id for each manual test because RevenueCat events are idempotent; replaying the same id returns the unchanged balance with credited:0 and duplicate:true.
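
The behaviour described above (prefixing the subject, rejecting unregistered users, and treating a replayed event id as a no-op) can be modelled with a small in-memory ledger. This is a sketch of the semantics only: the type and method names are hypothetical, the real server persists its state, and the order in which it performs the checks is not documented here.

```go
package main

import (
	"errors"
	"fmt"
)

var errInvalidUser = errors.New("invalid_user_id")

// creditResult mirrors the fields mentioned above: balance, credited, duplicate.
type creditResult struct {
	Balance   int
	Credited  int
	Duplicate bool
}

// ledger is an in-memory stand-in for the server's points store.
type ledger struct {
	balances map[string]int  // keyed by "oauth:<subject>"
	seen     map[string]bool // webhook event ids already processed
}

func newLedger() *ledger {
	return &ledger{balances: map[string]int{}, seen: map[string]bool{}}
}

// register models the first balance fetch, which makes the user known locally.
func (l *ledger) register(subject string) {
	key := "oauth:" + subject
	if _, ok := l.balances[key]; !ok {
		l.balances[key] = 0
	}
}

// credit applies a purchase once per event id. appUserID is the bare OAuth
// subject; the "oauth:" prefix is added here.
func (l *ledger) credit(eventID, appUserID string, points int) (creditResult, error) {
	key := "oauth:" + appUserID
	balance, ok := l.balances[key]
	if !ok {
		return creditResult{}, errInvalidUser
	}
	if l.seen[eventID] {
		return creditResult{Balance: balance, Duplicate: true}, nil
	}
	l.seen[eventID] = true
	l.balances[key] = balance + points
	return creditResult{Balance: balance + points, Credited: points}, nil
}

func main() {
	l := newLedger()
	l.register("alice")
	first, _ := l.credit("evt-local-1", "alice", 1000)
	replay, _ := l.credit("evt-local-1", "alice", 1000)
	fmt.Printf("%+v\n%+v\n", first, replay)
}
```

Note that passing an id that already carries the oauth: prefix would be looked up as oauth:oauth:<subject> in this model and fail with invalid_user_id, which matches the requirement to send the bare subject.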

Running

Stream mode (TV channels) — default

./bin/debate-bot server \
  --channel ./channels/channels.json \
  --content "./topics/*.md" \
  --addr :3000

Then open http://localhost:3000. Each topic.md front-matter must declare a channel that matches an id in channels.json. The directory behind --content is auto-watched: drop a new .md in and it airs without a restart.
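
The channel-matching rule can be checked before starting the server. The sketch below assumes channels.json is a JSON array of {id, number, title} objects as described for --channel; the Go field types (string id, integer number) and the helper names are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Channel mirrors the {id, number, title} shape. The field types are assumed.
type Channel struct {
	ID     string `json:"id"`
	Number int    `json:"number"`
	Title  string `json:"title"`
}

// parseChannels decodes the registry and indexes it by id.
func parseChannels(data []byte) (map[string]Channel, error) {
	var list []Channel
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, fmt.Errorf("parse channels: %w", err)
	}
	byID := make(map[string]Channel, len(list))
	for _, c := range list {
		if _, dup := byID[c.ID]; dup {
			return nil, fmt.Errorf("duplicate channel id %q", c.ID)
		}
		byID[c.ID] = c
	}
	return byID, nil
}

// checkTopicChannel reports an error when a topic's channel has no matching id.
func checkTopicChannel(channels map[string]Channel, topicChannel string) error {
	if _, ok := channels[topicChannel]; !ok {
		return fmt.Errorf("topic channel %q is not in the channel registry", topicChannel)
	}
	return nil
}

func main() {
	registry := []byte(`[{"id":"tech","number":1,"title":"Tech"}]`)
	channels, err := parseChannels(registry)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(checkTopicChannel(channels, "tech"))   // <nil>
	fmt.Println(checkTopicChannel(channels, "sports")) // error: not in the registry
}
```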

Flags:

| Flag | Default | Description |
| --- | --- | --- |
| --content | | path or glob to topic .md file(s); repeatable |
| --channel | ./channels.json | channel registry ({id, number, title} array) |
| --mcp | | optional mcp.json for MCP tools |
| --out | $OUT_DIR | output directory override |
| --addr | :3000 | HTTP listen address |
| --password | $APP_PASSWORD | gate the web UI + API behind a password (see below) |

Password protection

Start the server with --password (or set APP_PASSWORD) to require a login:

./bin/debate-bot server --content "./topics/*.md" --password "hunter2"
# or
APP_PASSWORD=hunter2 ./bin/debate-bot server --content "./topics/*.md"

When a password is set, the SPA shows a login screen and every /api/* route returns 401 until the browser signs in. Authentication is a cookie set by POST /api/login, so the SSE event stream and HLS audio/video keep working automatically. Omit the flag (the default) to leave the server open.

Both --mode stream and --mode video honour the password.
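
For readers who want a mental model of the gating, here is a minimal sketch of cookie-based protection for /api/* routes. It is not the server's implementation: the cookie name, the idea of comparing against a single opaque token, and the gate and statusFor helpers are all assumptions made for the example.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const cookieName = "session" // hypothetical cookie name

// okHandler stands in for the real API and always answers 200.
var okHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
})

// gate rejects /api/* requests that lack a valid session cookie. /api/login
// stays reachable so the browser can sign in. An empty token leaves the
// server open, matching the no-password default.
func gate(next http.Handler, token string) http.Handler {
	if token == "" {
		return next
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if strings.HasPrefix(r.URL.Path, "/api/") && r.URL.Path != "/api/login" {
			c, err := r.Cookie(cookieName)
			if err != nil || subtle.ConstantTimeCompare([]byte(c.Value), []byte(token)) != 1 {
				http.Error(w, "unauthorized", http.StatusUnauthorized)
				return
			}
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends an in-memory GET request, optionally with a session cookie,
// and returns the response status code.
func statusFor(h http.Handler, path, cookie string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	if cookie != "" {
		req.AddCookie(&http.Cookie{Name: cookieName, Value: cookie})
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	h := gate(okHandler, "token-123")
	fmt.Println(statusFor(h, "/api/config", ""))          // 401
	fmt.Println(statusFor(h, "/api/config", "token-123")) // 200
}
```

Because the browser attaches the cookie to every same-origin request, long-lived connections such as the SSE stream need no extra authentication step in this model.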

Video mode (upload → MP4)

./bin/debate-bot server --mode video --addr :3000 --max-concurrency 2

Open the web UI and upload a script.md; the server renders an MP4 you can download. --max-concurrency caps simultaneous renders.

Video/dashboard-mode flags:

| Flag | Default | Description |
| --- | --- | --- |
| --mode | stream | stream, video, or dashboard |
| --max-concurrency | 2 | cap on simultaneous renders |
| --audio | false | force every job to render as an audio-only feed (see below) |
| --addr | :3000 | HTTP listen address |
| --password | $APP_PASSWORD | gate the web UI + API behind a password |

Audio-only feed (MP3 + subtitles, no images)

Skip all image/video generation and produce a downloadable .mp3 (mixed TTS + music bed) with a subtitles.vtt sidecar — useful for podcast-style output and far cheaper/faster since no scene images are generated. Works for debate, discussion, and series topics.

Two ways to enable it:

  • Per job — submit with audio_only set: the multipart form field audio_only=true, or videoConfig.audio_only: true on POST /api/jobs/json.

  • Server-wide — start with --audio to force every job onto the audio feed regardless of the request:

    ./bin/debate-bot server --mode video --audio --addr :3000

    GET /api/config reports force_audio: true so a frontend can hide video-only controls.

Download the artefacts from GET /api/jobs/{id}/audio (the .mp3) and GET /api/jobs/{id}/subtitles (the .vtt).
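
The interaction between the per-job field and the server-wide flag reduces to a logical OR. The sketch below decodes only the videoConfig.audio_only field of a JSON job request and combines it with a forceAudio value standing in for --audio; the struct and function names are hypothetical, and the rest of the request body is deliberately left out because its shape is not described here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jobRequest models only the field discussed above; any other fields in the
// real request body are ignored by this sketch.
type jobRequest struct {
	VideoConfig struct {
		AudioOnly bool `json:"audio_only"`
	} `json:"videoConfig"`
}

// resolveAudioOnly reports whether a job should render as an audio-only
// feed: either the request opts in, or the server forces it.
func resolveAudioOnly(body []byte, forceAudio bool) (bool, error) {
	var req jobRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return false, fmt.Errorf("decode job request: %w", err)
	}
	return forceAudio || req.VideoConfig.AudioOnly, nil
}

func main() {
	optIn := []byte(`{"videoConfig":{"audio_only":true}}`)
	plain := []byte(`{"videoConfig":{}}`)

	fmt.Println(resolveAudioOnly(optIn, false)) // true <nil>
	fmt.Println(resolveAudioOnly(plain, false)) // false <nil>
	fmt.Println(resolveAudioOnly(plain, true))  // true <nil>
}
```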

For S3-compatible storage, set the engine environment:

S3_BUCKET=your-bucket
S3_REGION=auto
S3_ENDPOINT=https://<account-id>.r2.cloudflarestorage.com
S3_PREFIX=podcasts
S3_ACCESS_KEY_ID=your-access-key
S3_SECRET_ACCESS_KEY=your-secret-key
S3_DOWNLOAD_BASE_URL=https://media.example.com

S3_ENDPOINT supports R2/MinIO/custom S3 APIs. S3_DOWNLOAD_BASE_URL is optional; when set, download URLs use that public/custom domain. When omitted, the engine returns presigned S3 URLs. S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY configure explicit S3/R2 credentials; if they are empty, the AWS SDK falls back to its standard credential chain. With S3 enabled, audio-only jobs must upload the final MP3 successfully and the job download source is the S3 object, not the local staging file.
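
The choice between a public URL and a presigned URL can be sketched as a single branch on S3_DOWNLOAD_BASE_URL. In the example below, the object key layout (S3_PREFIX joined with a per-job key) is an assumption, downloadURL is a hypothetical helper, and the presigning path is left to the S3 client rather than implemented.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// downloadURL returns the public URL for an object when a download base URL
// is configured. ok is false when baseURL is empty, in which case the caller
// would ask the S3 client for a presigned URL instead.
func downloadURL(baseURL, prefix, key string) (url string, ok bool) {
	if baseURL == "" {
		return "", false
	}
	// Assumed layout: <prefix>/<key>, with duplicate slashes cleaned up.
	objectKey := path.Join(prefix, key)
	return strings.TrimRight(baseURL, "/") + "/" + objectKey, true
}

func main() {
	url, ok := downloadURL("https://media.example.com", "podcasts", "job-1/audio.mp3")
	fmt.Println(url, ok) // https://media.example.com/podcasts/job-1/audio.mp3 true

	_, ok = downloadURL("", "podcasts", "job-1/audio.mp3")
	fmt.Println(ok) // false: fall back to a presigned URL
}
```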

Dev (hot-reload frontend)

make dev   # Vite on :5173 (proxies /api), Go server on :8080

Topic format

Each topic.md is YAML front-matter + Markdown body. Minimal debate example (see examples/topic.md and examples/discussion.md for full samples):

---
title: "AI 是否會取代程序員"   # "Will AI replace programmers?"
type: debate          # debate | situation-puzzle | discussion | series
language: zh-CN
channel: tech         # must match an id in channels.json
total_minutes: 30
segment_max_seconds: 60
affirmative:
  - { name: "Linda", model: "gpt-4o" }
negative:
  - { name: "Alice", model: "gpt-4o" }
judge: { model: "gpt-4o" }
---

## Background
...
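
To show how such a file divides into its two parts, the sketch below splits a topic document at the --- delimiters and reads simple top-level scalar keys such as type and channel. It uses only the standard library and a deliberately naive line scan; a real parser would use a YAML library, and both helper names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitFrontMatter separates the YAML front-matter from the Markdown body.
func splitFrontMatter(doc string) (front, body string, err error) {
	doc = strings.ReplaceAll(doc, "\r\n", "\n")
	if !strings.HasPrefix(doc, "---\n") {
		return "", "", errors.New("missing opening --- delimiter")
	}
	rest := doc[len("---\n"):]
	end := strings.Index(rest, "\n---\n")
	if end < 0 {
		return "", "", errors.New("missing closing --- delimiter")
	}
	body = strings.TrimLeft(rest[end+len("\n---\n"):], "\n")
	return rest[:end], body, nil
}

// scalarField returns the value of a top-level "key: value" line, dropping a
// trailing " # comment" and surrounding double quotes. Naive by design: it
// does not handle nested keys, lists, or '#' inside quoted strings.
func scalarField(front, key string) string {
	for _, line := range strings.Split(front, "\n") {
		value, ok := strings.CutPrefix(line, key+":")
		if !ok {
			continue
		}
		if i := strings.Index(value, " #"); i >= 0 {
			value = value[:i]
		}
		return strings.Trim(strings.TrimSpace(value), `"`)
	}
	return ""
}

func main() {
	doc := "---\ntitle: \"Example\"\ntype: debate   # debate | discussion\nchannel: tech\n---\n\n## Background\n...\n"
	front, body, err := splitFrontMatter(doc)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(scalarField(front, "type"), scalarField(front, "channel"))
	fmt.Print(body)
}
```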

Make targets

| Target | Description |
| --- | --- |
| make build | full production build (frontend + backend) |
| make frontend / make backend | build one half |
| make run | build then run the server |
| make dev | Vite + Go in parallel for development |
| make gen-assets | regenerate the embedded TV-studio background plates |
| make series-smoke / make series-recap-smoke | end-to-end series smoke tests |
| make tidy | go mod tidy + bun install |
| make clean | remove build artifacts |

Docker

Build the image and run the server (stream mode):

docker build -t debate-bot .

docker run --rm -p 3000:3000 \
  --env-file .env \
  -v "$PWD/channels:/app/channels" \
  -v "$PWD/topics:/app/topics" \
  -v "$PWD/out:/app/out" \
  debate-bot \
  server --channel ./channels/channels.json --content "./topics/*.md" --addr :3000

For video mode, override the command:

docker run --rm -p 3000:3000 --env-file .env -v "$PWD/out:/app/out" \
  debate-bot server --mode video --addr :3000

The image bundles ffmpeg/ffplay and the compiled binary with the embedded web UI. Mount channels/, your topics folder, and out/ so config and generated media persist outside the container.

Output

Each run writes to OUT_DIR/session-<timestamp>/ — per-channel HLS segments, the stitched debate.mp3, transcript.txt, per-agent memory/, and run.log. Series episodes also archive into SERIES_ROOT/tv-series/<show>/... for cross-episode recaps.

Video-mode jobs land under OUT_DIR/session-<timestamp>/jobs/<jobID>/: video.mp4 (or audio.mp3 for an audio-only feed), the subtitles.vtt sidecar, and the per-turn audio.
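
For scripts that collect results, the job layout above maps onto a small path helper. This is a sketch: jobArtifacts is a hypothetical function, and the session directory name is passed in whole because the timestamp format is not specified here.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// jobArtifacts returns the expected media and subtitle paths for a
// video-mode job. session is the full "session-<timestamp>" directory name.
func jobArtifacts(outDir, session, jobID string, audioOnly bool) (media, subtitles string) {
	dir := filepath.Join(outDir, session, "jobs", jobID)
	name := "video.mp4"
	if audioOnly {
		name = "audio.mp3" // audio-only feeds produce an MP3 instead of an MP4
	}
	return filepath.Join(dir, name), filepath.Join(dir, "subtitles.vtt")
}

func main() {
	// "session-example" and "job-1" are placeholder names.
	media, subs := jobArtifacts("./out", "session-example", "job-1", false)
	fmt.Println(media)
	fmt.Println(subs)
}
```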
