AI storytelling companion for children, designed with neurodiversity in mind — powered by Gemini Live API, Google ADK, and Google Cloud.
A child speaks a topic — WonderTale researches real facts, weaves a personalized adventure, illustrates it, and narrates with expressive voices, all in real-time.
Built for the Gemini Live Agent Challenge on Devpost.
Fastest path: You only need a Google AI API key to run the full experience locally. No database, no cloud storage, no billing plan required — the app gracefully degrades when optional services are absent.
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11+ | python --version to verify |
| Node.js | 18+ | node --version to verify |
| Google AI API Key | — | Free tier at aistudio.google.com/apikey |
Optional (not needed for local demo):
- PostgreSQL — enables story library persistence (without it, stories live in browser localStorage)
- Cloudflare R2 — enables CDN image hosting (without it, illustrations are sent inline as base64)
- Blaze billing plan — enables real Gemini image generation (without it, set MOCK_IMAGES=true for colorful placeholder illustrations)
cd WonderTale
# Create and activate virtual environment
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Configure environment
cp .env.example .env

Edit .env and set these two values:
GOOGLE_API_KEY=your-key-here
MOCK_IMAGES=true # Skip Blaze billing; remove this line if you have a paid API key

Start the server:

uvicorn main:app --reload --port 8000

You should see:
WonderTale server starting [log_level=INFO]
Initializing database...
Database disabled (no DATABASE_URL configured) — using localStorage-only mode
INFO: Uvicorn running on http://0.0.0.0:8000
In a second terminal:
cd WonderTale/frontend
# Install dependencies
npm install
# Start dev server
npm run dev

You should see:
VITE v7.x.x ready in Xms
➜ Local: http://localhost:5173/
- Open http://localhost:5173 in Chrome (recommended for Web Audio API support).
Note on UX: WonderTale was designed exclusively for mobile screens. If viewing on a desktop browser, please use Responsive/Device Design Mode (F12) and set it to a mobile device (like an iPhone or Pixel), or resize your browser window to a narrow portrait aspect ratio. Alternatively, build and test the Android app via Capacitor.
- Complete onboarding — enter a name, age, interests, accessibility preferences, and companion name
- On the home screen, tap a suggestion or type your own topic to start a text-only session. Starting a session from anywhere else (like the mic button) will start a Live Audio session.
- Watch the story flow:
- Researching indicator appears → Research Agent queries Google Search for real facts
- Writing indicator → Story Architect weaves facts into a personalized adventure
- Illustration fades in as the background → AI-generated (or placeholder) scene art
- Text appears with word-by-word reveal → accessibility-formatted story paragraphs
- Choices appear at the bottom → two AI-generated narrative branches
- Discover panel (tap the book icon) → recap facts + comprehension quiz
- Pick a choice to continue to the next chapter — the cycle repeats
To enable live voice conversation with the Gemini Live API:
AUDIO_MODE=true

This uses gemini-2.5-flash-native-audio-preview for bidirectional audio streaming. The child speaks naturally, the AI narrates with the Aoede voice, and barge-in (interruption) is supported. Requires a microphone and Chrome.
| Issue | Fix |
|---|---|
| GOOGLE_API_KEY error on startup | Verify your key at aistudio.google.com/apikey |
| No illustrations appearing | Set MOCK_IMAGES=true in .env if you don't have a Blaze billing plan |
| WebSocket connection refused | Ensure the backend is running on port 8000; check frontend/.env has VITE_WS_URL="ws://localhost:8000/ws/session" |
| ModuleNotFoundError | Ensure virtualenv is activated: which python should point to .venv/ |
| Port 8000 already in use | uvicorn main:app --reload --port 8001 and update frontend/.env accordingly |
| Database warnings | Expected if DATABASE_URL is not set — the app falls back to localStorage mode |
| Variable | Default | Description |
|---|---|---|
| GOOGLE_API_KEY | — | Gemini API key (required) |
| AUDIO_MODE | false | true = native Gemini Live audio; false = text debug mode |
| IMAGE_MODEL | gemini-2.5-flash-image | Image generation model |
| MOCK_IMAGES | false | true = colourful placeholder PNGs, no API call |
| LOG_LEVEL | INFO | Python logging level (DEBUG, INFO, WARNING, ERROR) |
| DATABASE_URL | — | PostgreSQL asyncpg URL; if unset, the app uses localStorage only |
| R2_ACCOUNT_ID | — | Cloudflare R2 account ID |
| R2_ACCESS_KEY_ID | — | Cloudflare R2 access key |
| R2_SECRET_ACCESS_KEY | — | Cloudflare R2 secret key |
| R2_BUCKET_NAME | wondertale-assets | R2 bucket name |
| R2_PUBLIC_DOMAIN | — | Public CDN domain for R2 images |
| R2_ENDPOINT_URL | — | R2 S3-compatible endpoint URL |
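
The variables above could be read into a typed settings object at startup. The following is a minimal sketch, not the project's actual loader: the `Settings` dataclass, `load_settings`, and `_as_bool` are hypothetical names, and only a subset of the variables is shown. Defaults mirror the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret common truthy strings; fall back to the default when unset."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    # Hypothetical container; field names follow the variable names in the table.
    google_api_key: str
    audio_mode: bool
    image_model: str
    mock_images: bool
    log_level: str
    database_url: Optional[str]
    r2_bucket_name: str

    @property
    def database_enabled(self) -> bool:
        # Without DATABASE_URL the app is described as localStorage-only.
        return bool(self.database_url)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key:
        # GOOGLE_API_KEY is the only variable the table marks as required.
        raise RuntimeError("GOOGLE_API_KEY is required")
    return Settings(
        google_api_key=key,
        audio_mode=_as_bool(env.get("AUDIO_MODE")),
        image_model=env.get("IMAGE_MODEL", "gemini-2.5-flash-image"),
        mock_images=_as_bool(env.get("MOCK_IMAGES")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        database_url=env.get("DATABASE_URL") or None,
        r2_bucket_name=env.get("R2_BUCKET_NAME", "wondertale-assets"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"GOOGLE_API_KEY": "k", "MOCK_IMAGES": "true"})`.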
WonderTale/
├── main.py # FastAPI entry point, lifespan, router includes
├── requirements.txt
├── alembic.ini # Alembic migration config
├── .env.example
│
├── core/ # Agent definitions + shared singletons
│ ├── agent.py # Root orchestrator agents (voice + text modes)
│ └── services.py # ADK Runner + SessionService singletons
│
├── agents/ # Sub-agent definitions
│ ├── research.py # Research Agent (gemini-2.5-flash + Google Search)
│ ├── story.py # Story Architect (gemini-2.5-flash)
│ ├── choices.py # Story Choices Agent — 2 narrative branches
│ └── quiz.py # Quiz Agent — comprehension questions
│
├── tools/ # FunctionTools called by the Orchestrator
│ ├── research_tool.py # research_topic()
│ ├── story_tool.py # generate_story() — runs pipeline + launches background tasks
│ ├── illustration_tool.py # generate_and_queue_illustration() (background)
│ ├── choices_tool.py # generate_story_choices() (background, 3s delay)
│ └── quiz_tool.py # generate_quiz() (background, 6s delay)
│
├── sessions/ # Per-connection session handlers
│ ├── audio.py # Audio mode: run_live bidi + 3-coroutine model
│ ├── text.py # Text mode: run_async + drain_loop
│ ├── helpers.py # apply_profile() shared helper
│ ├── context.py # ContextVars: session_id, user_id, story_id
│ └── media_queue.py # Per-session asyncio.Queue (illustrations, text, choices, quiz)
│
├── routes/ # FastAPI routers
│ ├── health.py # GET /health
│ ├── websocket.py # WS /ws/session (subscription-aware dispatch)
│ ├── profiles.py # GET/PUT /api/profiles/{user_id}
│ ├── stories.py # GET/DELETE /api/stories/{user_id}[/{story_id}]
│ └── subscriptions.py # GET/POST /api/subscriptions/{user_id}[/activate|/cancel]
│
├── subscriptions/ # Subscription system
│ ├── models.py # SubscriptionTier, SubscriptionStatus, TIER_CONFIG, ORM model
│ ├── crud.py # DB operations: get, create, update, increment
│ └── service.py # SubscriptionService: limits, enforcement, serialisation
│
├── db/ # Database layer
│ ├── __init__.py # Async engine, session factory, graceful degradation
│ ├── models.py # ORM models: Profile, Story, StorySegment, Illustration, …
│ ├── crud.py # CRUD helpers for all tables
│ └── migrations/ # Alembic migration scripts
│
├── storage/
│ └── r2.py # Cloudflare R2 client (gracefully disabled if unconfigured)
│
└── frontend/ # React 19 + TypeScript + Vite 7 + TailwindCSS v4
└── src/
├── App.tsx # Provider hierarchy root
├── AppRouter.tsx # Screen routing with Framer Motion transitions
├── context/ # AppContext, AuthContext, SessionContext,
│ # ThemeContext, SubscriptionContext
├── screens/ # All screens (onboarding, home, story, library,
│ # parent dashboard, subscription)
├── components/ # Shared + story-specific UI components
├── hooks/ # useWebSocket, useAudioCapture, useAudioPlayback
├── lib/ # api.ts, subscriptionApi.ts, auth.ts, storage.ts
└── types/ # index.ts, subscription.ts
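
The tree above notes that generate_story() launches background tasks, with choices delayed by 3 s and the quiz by 6 s. One way to stagger such work with asyncio is sketched below. The helper names (`launch_background`, `_delayed`, `demo`) are hypothetical, the illustration delay of 0 is an assumption, and the `scale` parameter exists only to keep the demo fast.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Tuple

CHOICES_DELAY_S = 3.0  # from the choices_tool.py comment in the tree
QUIZ_DELAY_S = 6.0     # from the quiz_tool.py comment in the tree

Job = Callable[[], Awaitable[None]]


async def _delayed(delay: float, job: Job) -> None:
    """Wait, then run the job; the caller is never blocked by this."""
    await asyncio.sleep(delay)
    await job()


def launch_background(jobs: Iterable[Tuple[float, Job]], scale: float = 1.0) -> List[asyncio.Task]:
    """Schedule each (delay, job) pair as its own task and return immediately."""
    return [asyncio.create_task(_delayed(delay * scale, job)) for delay, job in jobs]


async def demo(scale: float = 0.01) -> List[str]:
    order: List[str] = []

    def recorder(name: str) -> Job:
        async def job() -> None:
            order.append(name)
        return job

    tasks = launch_background(
        [
            (0.0, recorder("illustration")),  # assumed: no delay
            (CHOICES_DELAY_S, recorder("choices")),
            (QUIZ_DELAY_S, recorder("quiz")),
        ],
        scale=scale,
    )
    order.append("story_returned")  # the story text is available before any job runs
    await asyncio.gather(*tasks)
    return order


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the returned task list lets the caller await or cancel the jobs when the session ends.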
- Bidirectional audio streaming via Gemini Live API
- Aoede voice — warm, clear, child-friendly
- Context window compression for sessions beyond 10 minutes
- Session resumption on disconnect (token valid ~2 hours)
- Barge-in support — children can interrupt naturally
- ?mode=audio|text per-connection override
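
Session resumption implies that a client keeps the latest resumption token and decides on reconnect whether it is still worth sending. A minimal sketch of that bookkeeping follows, shown in Python for consistency with the backend. `SavedToken`, `save_token`, and `usable_token` are hypothetical names, and the two-hour lifetime is taken from the approximate figure in the list above rather than from a documented constant.

```python
import time
from dataclasses import dataclass
from typing import Optional

RESUME_TOKEN_TTL_S = 2 * 60 * 60  # "~2 hours" per the feature list; approximate


@dataclass(frozen=True)
class SavedToken:
    token: str
    saved_at: float  # seconds since the epoch


def save_token(token: str, now: Optional[float] = None) -> SavedToken:
    """Record the token together with the time it was received."""
    return SavedToken(token, time.time() if now is None else now)


def usable_token(
    saved: Optional[SavedToken],
    now: Optional[float] = None,
    ttl_s: float = RESUME_TOKEN_TTL_S,
) -> Optional[str]:
    """Return the token if it is younger than the TTL, otherwise None."""
    if saved is None:
        return None
    current = time.time() if now is None else now
    return saved.token if current - saved.saved_at < ttl_s else None
```

When `usable_token` returns None, the client would simply connect without a resume token and start a fresh session.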
- Full onboarding: name, age, interests, companion name
- Profile injected into every story generation prompt
- Every story makes the child the hero
- Google Search grounding for factual accuracy
- Custom illustration generated per story scene via Gemini image generation
- Illustrations delivered via side-channel asyncio.Queue, so narration is never blocked
- Fade-in transitions with scene progress indicator
- MOCK_IMAGES=true for local dev without a Blaze API key
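
The side-channel idea can be illustrated with a small asyncio sketch: slow illustration jobs push results onto a queue, and the narration loop drains whatever is ready without ever waiting on it. This is an assumption-laden illustration, not the code in sessions/media_queue.py; `generate_illustration`, `drain`, and `run_session` are hypothetical names, and the sleep stands in for a real image-generation call.

```python
import asyncio
from typing import List


async def generate_illustration(queue: asyncio.Queue, index: int) -> None:
    await asyncio.sleep(0.02)  # stand-in for a slow image-generation call
    await queue.put(f"illustration:{index}")


def drain(queue: asyncio.Queue) -> List[str]:
    """Collect everything currently queued without blocking."""
    items: List[str] = []
    while True:
        try:
            items.append(queue.get_nowait())
        except asyncio.QueueEmpty:
            return items


async def run_session(segments: List[str]) -> List[str]:
    queue: asyncio.Queue = asyncio.Queue()
    events: List[str] = []
    # Illustrations start in the background, one per segment.
    tasks = [
        asyncio.create_task(generate_illustration(queue, i))
        for i in range(len(segments))
    ]
    for text in segments:
        events.append(f"narration:{text}")  # narration goes out immediately
        events.extend(drain(queue))         # forward any media that is ready
        await asyncio.sleep(0)              # yield so background tasks can progress
    await asyncio.gather(*tasks)
    events.extend(drain(queue))             # flush media that finished late
    return events


if __name__ == "__main__":
    print(asyncio.run(run_session(["a", "b"])))
```

Because the narration loop only calls `get_nowait`, a slow or failed illustration can delay the picture but never the story text or audio.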
- Story Choices — two AI-generated narrative branches to steer the adventure
- Wand / Interruption — tap during narration to change direction
- Discover Panel — recap facts, view research, and take a comprehension quiz
- Dyslexia mode — OpenDyslexic font, increased letter/word spacing, relaxed line height
- ADHD pacing — short segments, animated progress indicator
- Autism structure — predictable narrative scaffolding, emotion labels
- Full audio narration + image alt-text for visual impairment
- Parent Dashboard for managing all accessibility settings
- Completed stories persisted to PostgreSQL with illustrations stored on Cloudflare R2
- Library screen with replay — re-read any past story without a new AI session
- Story details: title, summary, themes, research facts, quiz, choices
| Tier | Price | Audio Mode | Illustrations / Story | Stories / Day |
|---|---|---|---|---|
| Basic | $5 / mo | Text only | 1 | 5 |
| Plus | $20 / mo | Gemini Live audio | 3 | 5 |
- Free 30-day trial activated on first subscription (no payment gateway required)
- Auto-creates a Plus trial for new users on first WebSocket connect
- Tier enforcement at the WebSocket connection layer (audio downgrade) and story tool layer (daily limit, illustration cap)
- Subscription management in Parent Dashboard → Manage Subscription
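
The tier table and the enforcement points above can be sketched as a small lookup plus two checks. `SubscriptionTier` and `TIER_CONFIG` are names that appear in the project tree, but their shape here is an assumption, as are `TierLimits`, `resolve_mode`, and `check_story_limit`. Limits are copied from the tier table; the `type` key in the returned message is assumed by analogy with the client message format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class SubscriptionTier(str, Enum):
    BASIC = "basic"
    PLUS = "plus"


@dataclass(frozen=True)
class TierLimits:
    audio_mode_allowed: bool
    max_illustrations_per_story: int
    max_stories_per_day: int


# Values taken from the tier table; the structure itself is hypothetical.
TIER_CONFIG: Dict[SubscriptionTier, TierLimits] = {
    SubscriptionTier.BASIC: TierLimits(False, 1, 5),
    SubscriptionTier.PLUS: TierLimits(True, 3, 5),
}


def resolve_mode(tier: SubscriptionTier, requested: str) -> str:
    """Connection layer: downgrade audio to text when the tier lacks audio."""
    if requested == "audio" and not TIER_CONFIG[tier].audio_mode_allowed:
        return "text"
    return requested


def check_story_limit(tier: SubscriptionTier, stories_used_today: int) -> Optional[dict]:
    """Story tool layer: return a limit_reached message, or None if allowed."""
    if stories_used_today >= TIER_CONFIG[tier].max_stories_per_day:
        return {
            "type": "limit_reached",
            "limit_type": "stories",
            "message": "Daily story limit reached",
        }
    return None
```

For example, `resolve_mode(SubscriptionTier.BASIC, "audio")` yields `"text"`, while a Plus user keeps `"audio"`.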
| Format | Content |
|---|---|
| Query param | user_id=<uuid>, mode=audio\|text, resume_token=<token> |
| Binary | Raw PCM audio (16kHz, 16-bit, mono) |
| JSON | { type: "text", text: "..." } |
| JSON | { type: "profile", profile: { name, age, interests, accessibility }, is_resume? } |
| JSON | { type: "story_resume", story_id, chapters_done, total_segments } — restore chapter tracker on resume |
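
The client-side formats above can be exercised with two small helpers: one that builds the connection URL from the query parameters, and one that classifies a frame as audio or as one of the JSON message types. Both `build_session_url` and `classify_frame` are hypothetical names and are not taken from the codebase. The duration calculation relies only on the stated input format (16 kHz, 16-bit, mono, so 2 bytes per sample).

```python
import json
from typing import Optional, Union
from urllib.parse import urlencode

CLIENT_JSON_TYPES = {"text", "profile", "story_resume"}
INPUT_SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit mono


def build_session_url(
    base: str,
    user_id: str,
    mode: Optional[str] = None,
    resume_token: Optional[str] = None,
) -> str:
    """Append the documented query parameters to the WebSocket base URL."""
    params = {"user_id": user_id}
    if mode:
        params["mode"] = mode
    if resume_token:
        params["resume_token"] = resume_token
    return f"{base}?{urlencode(params)}"


def classify_frame(frame: Union[bytes, str]) -> dict:
    """Binary frames are PCM audio; text frames are JSON with a known type."""
    if isinstance(frame, (bytes, bytearray)):
        samples = len(frame) // BYTES_PER_SAMPLE
        return {"kind": "audio", "duration_ms": samples * 1000 / INPUT_SAMPLE_RATE_HZ}
    message = json.loads(frame)
    kind = message.get("type")
    if kind not in CLIENT_JSON_TYPES:
        raise ValueError(f"unknown client message type: {kind!r}")
    return {"kind": kind, "payload": message}
```

A 640-byte binary frame therefore represents 320 samples, or 20 ms of microphone audio.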
| Type | Content |
|---|---|
| Binary | Raw PCM audio (24kHz, 16-bit, mono) |
| thinking | Agent is processing |
| tool_call | { name }: tool invocation started |
| tool_result | { name }: tool completed |
| transcription | { text }: agent speech transcript (finished utterances only; no partials) |
| turn_complete | Agent turn finished |
| interrupted | Barge-in detected; audio cancelled |
| session_resumption | { token }: save for reconnect |
| illustration | { data?, url?, alt, segment_index, total_segments } |
| accessibility_text | { text, segment_index, total_segments } |
| story_choices | { choices: [{ label, description, story_direction }] } |
| quiz_data | { questions: [{ question, options, answer_idx, hint }] } |
| subscription_info | { tier, status, audio_mode_allowed, max_illustrations_per_story, stories_used_today, max_stories_per_day, trial_end, days_left, is_active } |
| subscription_expired | Trial or subscription is no longer active |
| limit_reached | { limit_type: "stories"\|"illustrations", message } |
| error | { message } |
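
The optional `data?` and `url?` fields of the illustration message match the fallback described earlier: a CDN URL when Cloudflare R2 is configured, inline base64 otherwise. A minimal sketch of building such a message follows. `build_illustration_message` is a hypothetical helper, and both the `type` key and the plain (unprefixed) base64 encoding of `data` are assumptions.

```python
import base64
from typing import Optional


def build_illustration_message(
    image_bytes: bytes,
    alt: str,
    segment_index: int,
    total_segments: int,
    public_url: Optional[str] = None,
) -> dict:
    """Prefer a hosted URL; fall back to inline base64 when none is available."""
    message = {
        "type": "illustration",
        "alt": alt,
        "segment_index": segment_index,
        "total_segments": total_segments,
    }
    if public_url:
        message["url"] = public_url
    else:
        # Assumed encoding: standard base64 of the raw image bytes.
        message["data"] = base64.b64encode(image_bytes).decode("ascii")
    return message
```

A client can then branch on whichever of the two fields is present, using `alt` as the image description in both cases.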
| Method | Path | Description |
|---|---|---|
| GET | /health | Server health check |
| GET | /api/profiles/{user_id} | Fetch child profile |
| PUT | /api/profiles/{user_id} | Create / update child profile |
| GET | /api/stories/{user_id} | List completed stories |
| GET | /api/stories/{user_id}/{story_id} | Full story detail (segments, illustrations, quiz, choices) |
| DELETE | /api/stories/{story_id} | Delete a story |
| GET | /api/subscriptions/{user_id} | Current subscription status |
| POST | /api/subscriptions/{user_id}/activate | Activate / change trial tier; body: { tier: "basic"\|"plus" } |
| POST | /api/subscriptions/{user_id}/cancel | Cancel subscription |
| Layer | Choice |
|---|---|
| Backend language | Python 3.11+ |
| Web framework | FastAPI |
| Agent framework | Google ADK |
| AI — Voice | gemini-2.5-flash-native-audio-preview-12-2025 |
| AI — Text / Story | gemini-2.5-flash |
| AI — Images | gemini-2.5-flash-image |
| Database | PostgreSQL (asyncpg + SQLAlchemy async) |
| Migrations | Alembic |
| Image storage | Cloudflare R2 (boto3 S3-compatible) |
| Frontend | React 19 + TypeScript + Vite 7 (TODO: port to React Native / Expo) |
| Styling | TailwindCSS v4 |
| Animations | Framer Motion |
| Hosting | Google Cloud Run |
Built for the Gemini Live Agent Challenge hackathon.