✨ WonderTale

AI storytelling companion for children, designed with neurodiversity in mind — powered by Gemini Live API, Google ADK, and Google Cloud.

A child speaks a topic — WonderTale researches real facts, weaves a personalized adventure, illustrates it, and narrates with expressive voices, all in real-time.

Built for the Gemini Live Agent Challenge on Devpost.

Architecture

Quick Start

Fastest path: You only need a Google AI API key to run the full experience locally. No database, no cloud storage, no billing plan required — the app gracefully degrades when optional services are absent.

Prerequisites

Requirement	Version	Notes
Python	3.11+	`python --version` to verify
Node.js	18+	`node --version` to verify
Google AI API Key	—	Free tier at aistudio.google.com/apikey

Optional (not needed for local demo):

PostgreSQL — enables story library persistence (without it, stories live in browser localStorage)
Cloudflare R2 — enables CDN image hosting (without it, illustrations are sent inline as base64)
Blaze billing plan — enables real Gemini image generation (without it, set MOCK_IMAGES=true for colorful placeholder illustrations)

Step 1 — Backend

cd WonderTale

# Create and activate virtual environment
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env

Edit .env — set these two values:

GOOGLE_API_KEY=your-key-here
MOCK_IMAGES=true          # Skip Blaze billing; remove this line if you have a paid API key

Start the server:

uvicorn main:app --reload --port 8000

You should see:

WonderTale server starting [log_level=INFO]
Initializing database...
Database disabled (no DATABASE_URL configured) — using localStorage-only mode
INFO:     Uvicorn running on http://0.0.0.0:8000

Step 2 — Frontend

In a second terminal:

cd WonderTale/frontend

# Install dependencies
npm install

# Start dev server
npm run dev

You should see:

VITE v7.x.x  ready in Xms
➜  Local:   http://localhost:5173/

Step 3 — Try It

Open http://localhost:5173 in Chrome (recommended for Web Audio API support).

Note on UX: WonderTale was designed exclusively for mobile screens. On a desktop browser, open DevTools (F12), switch to responsive/device mode, and pick a mobile device (such as an iPhone or Pixel), or resize the browser window to a narrow portrait aspect ratio. Alternatively, build and test the Android app via Capacitor.

1. Complete onboarding: enter a name, age, interests, accessibility preferences, and companion name.
2. On the home screen, tap a suggestion or type your own topic to start a text-only session. Starting a session any other way (for example, with the mic button) starts a Live Audio session.
3. Watch the story flow:
   - Researching indicator appears → Research Agent queries Google Search for real facts
   - Writing indicator → Story Architect weaves facts into a personalized adventure
   - Illustration fades in as the background → AI-generated (or placeholder) scene art
   - Text appears with word-by-word reveal → accessibility-formatted story paragraphs
   - Choices appear at the bottom → two AI-generated narrative branches
   - Discover panel (tap the book icon) → recap facts + comprehension quiz
4. Pick a choice to continue to the next chapter; the cycle repeats.

Audio Mode (Paid API Key Required)

To enable live voice conversation with the Gemini Live API, set the following in .env:

AUDIO_MODE=true

This uses gemini-2.5-flash-native-audio-preview-12-2025 for bidirectional audio streaming. The child speaks naturally, the AI narrates with the Aoede voice, and barge-in (interruption) is supported. Requires a microphone and Chrome.
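
The WebSocket Protocol section lists the audio frames as raw PCM, 16 kHz from the client and 24 kHz from the server, both 16-bit mono. The following is a minimal sketch of the buffer-size arithmetic those formats imply. The function names and the 100 ms chunk length are illustrative assumptions, not values taken from the WonderTale code.

```python
BYTES_PER_SAMPLE = 2  # 16-bit PCM
CHANNELS = 1          # mono

MIC_RATE_HZ = 16_000       # client -> server, per the protocol table
PLAYBACK_RATE_HZ = 24_000  # server -> client, per the protocol table


def pcm_bytes(sample_rate_hz: int, duration_ms: int) -> int:
    """Byte size of a raw 16-bit mono PCM buffer lasting duration_ms."""
    samples = sample_rate_hz * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS


def pcm_duration_ms(sample_rate_hz: int, num_bytes: int) -> float:
    """Playback duration in milliseconds of a raw 16-bit mono PCM buffer."""
    samples = num_bytes / (BYTES_PER_SAMPLE * CHANNELS)
    return samples * 1000 / sample_rate_hz


if __name__ == "__main__":
    # Hypothetical 100 ms chunks in each direction.
    print(pcm_bytes(MIC_RATE_HZ, 100))       # 3200
    print(pcm_bytes(PLAYBACK_RATE_HZ, 100))  # 4800
```

Because the two directions use different sample rates, a capture buffer cannot be played back unchanged without resampling.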

Troubleshooting

Issue	Fix
`GOOGLE_API_KEY` error on startup	Verify your key at aistudio.google.com/apikey
No illustrations appearing	Set `MOCK_IMAGES=true` in `.env` if you don't have a Blaze billing plan
WebSocket connection refused	Ensure the backend is running on port 8000; check `frontend/.env` has `VITE_WS_URL="ws://localhost:8000/ws/session"`
`ModuleNotFoundError`	Ensure the virtualenv is activated: `which python` (macOS/Linux) or `where python` (Windows) should point to `.venv/`
Port 8000 already in use	`uvicorn main:app --reload --port 8001` and update `frontend/.env` accordingly
Database warnings	Expected if `DATABASE_URL` is not set — the app falls back to localStorage mode

Environment Variables

Variable	Default	Description
`GOOGLE_API_KEY`	—	Gemini API key (required)
`AUDIO_MODE`	`false`	`true` = native Gemini Live audio; `false` = text debug mode
`IMAGE_MODEL`	`gemini-2.5-flash-image`	Image generation model
`MOCK_IMAGES`	`false`	`true` = colorful placeholder PNGs, no API call
`LOG_LEVEL`	`INFO`	Python logging level (`DEBUG`, `INFO`, `WARNING`, `ERROR`)
`DATABASE_URL`	—	PostgreSQL asyncpg URL — if unset, app uses localStorage only
`R2_ACCOUNT_ID`	—	Cloudflare R2 account ID
`R2_ACCESS_KEY_ID`	—	Cloudflare R2 access key
`R2_SECRET_ACCESS_KEY`	—	Cloudflare R2 secret key
`R2_BUCKET_NAME`	`wondertale-assets`	R2 bucket name
`R2_PUBLIC_DOMAIN`	—	Public CDN domain for R2 images
`R2_ENDPOINT_URL`	—	R2 S3-compatible endpoint URL
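
As an illustration of how the defaults in this table and the graceful-degradation behavior could fit together, here is a minimal sketch of a settings loader. It is not the project's actual configuration code: `Settings`, `load_settings`, `as_bool`, and the rule that all five listed R2 variables must be present are assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: R2 is treated as configured only when all of these are set.
R2_REQUIRED = (
    "R2_ACCOUNT_ID",
    "R2_ACCESS_KEY_ID",
    "R2_SECRET_ACCESS_KEY",
    "R2_PUBLIC_DOMAIN",
    "R2_ENDPOINT_URL",
)


def as_bool(value: Optional[str], default: bool = False) -> bool:
    """Parse common truthy strings; return the default when the variable is unset."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    audio_mode: bool
    image_model: str
    mock_images: bool
    log_level: str
    database_url: Optional[str]
    r2_bucket_name: str
    r2_enabled: bool

    @property
    def database_enabled(self) -> bool:
        # Without DATABASE_URL the app runs in localStorage-only mode.
        return bool(self.database_url)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (os.environ by default), applying table defaults."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GOOGLE_API_KEY is required")
    return Settings(
        google_api_key=api_key,
        audio_mode=as_bool(env.get("AUDIO_MODE")),
        image_model=env.get("IMAGE_MODEL", "gemini-2.5-flash-image"),
        mock_images=as_bool(env.get("MOCK_IMAGES")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        database_url=env.get("DATABASE_URL") or None,
        r2_bucket_name=env.get("R2_BUCKET_NAME", "wondertale-assets"),
        r2_enabled=all(env.get(name) for name in R2_REQUIRED),
    )
```

Passing a plain dict, for example `load_settings({"GOOGLE_API_KEY": "k", "MOCK_IMAGES": "true"})`, makes the fallback behavior easy to check without touching the real environment.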

Project Structure

WonderTale/
├── main.py                      # FastAPI entry point, lifespan, router includes
├── requirements.txt
├── alembic.ini                  # Alembic migration config
├── .env.example
│
├── core/                        # Agent definitions + shared singletons
│   ├── agent.py                 #   Root orchestrator agents (voice + text modes)
│   └── services.py              #   ADK Runner + SessionService singletons
│
├── agents/                      # Sub-agent definitions
│   ├── research.py              #   Research Agent (gemini-2.5-flash + Google Search)
│   ├── story.py                 #   Story Architect (gemini-2.5-flash)
│   ├── choices.py               #   Story Choices Agent — 2 narrative branches
│   └── quiz.py                  #   Quiz Agent — comprehension questions
│
├── tools/                       # FunctionTools called by the Orchestrator
│   ├── research_tool.py         #   research_topic()
│   ├── story_tool.py            #   generate_story() — runs pipeline + launches background tasks
│   ├── illustration_tool.py     #   generate_and_queue_illustration() (background)
│   ├── choices_tool.py          #   generate_story_choices() (background, 3s delay)
│   └── quiz_tool.py             #   generate_quiz() (background, 6s delay)
│
├── sessions/                    # Per-connection session handlers
│   ├── audio.py                 #   Audio mode: run_live bidi + 3-coroutine model
│   ├── text.py                  #   Text mode: run_async + drain_loop
│   ├── helpers.py               #   apply_profile() shared helper
│   ├── context.py               #   ContextVars: session_id, user_id, story_id
│   └── media_queue.py           #   Per-session asyncio.Queue (illustrations, text, choices, quiz)
│
├── routes/                      # FastAPI routers
│   ├── health.py                #   GET /health
│   ├── websocket.py             #   WS /ws/session (subscription-aware dispatch)
│   ├── profiles.py              #   GET/PUT /api/profiles/{user_id}
│   ├── stories.py               #   GET/DELETE /api/stories/{user_id}[/{story_id}]
│   └── subscriptions.py         #   GET/POST /api/subscriptions/{user_id}[/activate|/cancel]
│
├── subscriptions/               # Subscription system
│   ├── models.py                #   SubscriptionTier, SubscriptionStatus, TIER_CONFIG, ORM model
│   ├── crud.py                  #   DB operations: get, create, update, increment
│   └── service.py               #   SubscriptionService: limits, enforcement, serialisation
│
├── db/                          # Database layer
│   ├── __init__.py              #   Async engine, session factory, graceful degradation
│   ├── models.py                #   ORM models: Profile, Story, StorySegment, Illustration, …
│   ├── crud.py                  #   CRUD helpers for all tables
│   └── migrations/              #   Alembic migration scripts
│
├── storage/
│   └── r2.py                    # Cloudflare R2 client (gracefully disabled if unconfigured)
│
└── frontend/                    # React 19 + TypeScript + Vite 7 + TailwindCSS v4
    └── src/
        ├── App.tsx              #   Provider hierarchy root
        ├── AppRouter.tsx        #   Screen routing with Framer Motion transitions
        ├── context/             #   AppContext, AuthContext, SessionContext,
        │                        #   ThemeContext, SubscriptionContext
        ├── screens/             #   All screens (onboarding, home, story, library,
        │                        #   parent dashboard, subscription)
        ├── components/          #   Shared + story-specific UI components
        ├── hooks/               #   useWebSocket, useAudioCapture, useAudioPlayback
        ├── lib/                 #   api.ts, subscriptionApi.ts, auth.ts, storage.ts
        └── types/               #   index.ts, subscription.ts

Key Features

Voice-First

Bidirectional audio streaming via Gemini Live API
Aoede voice — warm, clear, child-friendly
Context window compression for sessions beyond 10 minutes
Session resumption on disconnect (token valid ~2 hours)
Barge-in support — children can interrupt naturally
?mode=audio|text per-connection override

Personalized

Full onboarding: name, age, interests, companion name
Profile injected into every story generation prompt
Every story makes the child the hero
Google Search grounding for factual accuracy
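
To make "profile injected into every story generation prompt" concrete, here is a hypothetical sketch of prompt assembly. `ChildProfile`, `build_story_prompt`, and the prompt wording are all invented for illustration; the real prompts used by the Story Architect are not shown in this README.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChildProfile:
    """Hypothetical subset of the onboarding fields listed above."""
    name: str
    age: int
    companion_name: str
    interests: List[str] = field(default_factory=list)


def build_story_prompt(profile: ChildProfile, topic: str, facts: List[str]) -> str:
    """Compose a prompt that casts the child as the hero and pins the story to researched facts."""
    interests = ", ".join(profile.interests) or "none given"
    fact_lines = "\n".join(f"- {fact}" for fact in facts) or "- (no facts found)"
    return (
        f"Write a story about {topic} for {profile.name}, age {profile.age}.\n"
        f"{profile.name} is the hero, joined by a companion named {profile.companion_name}.\n"
        f"Interests to weave in: {interests}.\n"
        f"Use only these researched facts:\n{fact_lines}"
    )


if __name__ == "__main__":
    maya = ChildProfile(name="Maya", age=7, companion_name="Pip", interests=["space"])
    print(build_story_prompt(maya, "volcanoes", ["Lava is molten rock."]))
```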

Multimodal

Custom illustration generated per story scene via Gemini image generation
Illustrations delivered via side-channel asyncio.Queue — narration never blocked
Fade-in transitions with scene progress indicator
MOCK_IMAGES=true for local dev without a Blaze API key
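
The side-channel idea can be sketched with plain asyncio: slow illustration work runs in background tasks and reports through a queue, so story text is sent without waiting for images. This is a simplified stand-in, not the code in `sessions/media_queue.py`; the function names, the 0.05 s simulated latency, and the reduced message fields are assumptions.

```python
import asyncio
from typing import Any, Dict, List


async def generate_illustration(queue: asyncio.Queue, segment_index: int) -> None:
    """Stand-in for a slow image call; pushes its result onto the side channel."""
    await asyncio.sleep(0.05)  # simulated image-generation latency
    await queue.put({"type": "illustration", "segment_index": segment_index, "alt": "placeholder"})


async def narrate(segments: List[str], queue: asyncio.Queue, sent: List[Dict[str, Any]]) -> List[asyncio.Task]:
    """Emit each text segment immediately while illustrations render in the background."""
    tasks = []
    for i, text in enumerate(segments):
        tasks.append(asyncio.create_task(generate_illustration(queue, i)))
        sent.append({"type": "accessibility_text", "text": text, "segment_index": i})
        await asyncio.sleep(0)  # yield to the event loop; never waits on the image
    return tasks


async def drain(queue: asyncio.Queue, sent: List[Dict[str, Any]], expected: int) -> None:
    """Forward queued media to the client as it becomes available."""
    for _ in range(expected):
        sent.append(await queue.get())


async def main() -> List[Dict[str, Any]]:
    queue: asyncio.Queue = asyncio.Queue()
    sent: List[Dict[str, Any]] = []  # stands in for the WebSocket send calls
    segments = ["Once upon a time...", "The end."]
    drainer = asyncio.create_task(drain(queue, sent, len(segments)))
    tasks = await narrate(segments, queue, sent)
    await asyncio.gather(*tasks, drainer)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(main()):
        print(message["type"], message["segment_index"])
```

In this sketch both text segments are recorded before either illustration arrives, which is the property the queue is meant to provide.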

Interactive

Story Choices — two AI-generated narrative branches to steer the adventure
Wand / Interruption — tap during narration to change direction
Discover Panel — recap facts, view research, and take a comprehension quiz

Accessible

Dyslexia mode — OpenDyslexic font, increased letter/word spacing, relaxed line height
ADHD pacing — short segments, animated progress indicator
Autism structure — predictable narrative scaffolding, emotion labels
Full audio narration + image alt-text for visual impairment
Parent Dashboard for managing all accessibility settings

Story Library

Completed stories persisted to PostgreSQL with illustrations stored on Cloudflare R2
Library screen with replay — re-read any past story without a new AI session
Story details: title, summary, themes, research facts, quiz, choices

Subscription System

Tier	Price	Audio Mode	Illustrations / Story	Stories / Day
Basic	$5 / mo	Text only	1	5
Plus	$20 / mo	Gemini Live audio	3	5

Free 30-day trial activated on first subscription (no payment gateway required)
Auto-creates a Plus trial for new users on first WebSocket connect
Tier enforcement at the WebSocket connection layer (audio downgrade) and story tool layer (daily limit, illustration cap)
Subscription management in Parent Dashboard → Manage Subscription
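
A minimal sketch of the two enforcement points described above, using the limits from the tier table. The names `Tier`, `TierLimits`, `TIER_LIMITS`, `resolve_mode`, and `check_story_limit` are hypothetical and are not the contents of `subscriptions/models.py` or `subscriptions/service.py`; the `limit_reached` message text is also invented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Tier(str, Enum):
    BASIC = "basic"
    PLUS = "plus"


@dataclass(frozen=True)
class TierLimits:
    audio_mode_allowed: bool
    max_illustrations_per_story: int
    max_stories_per_day: int


# Values copied from the tier table above.
TIER_LIMITS: Dict[Tier, TierLimits] = {
    Tier.BASIC: TierLimits(audio_mode_allowed=False, max_illustrations_per_story=1, max_stories_per_day=5),
    Tier.PLUS: TierLimits(audio_mode_allowed=True, max_illustrations_per_story=3, max_stories_per_day=5),
}


def resolve_mode(tier: Tier, requested_mode: str) -> str:
    """Connection layer: downgrade audio to text when the tier does not allow audio."""
    if requested_mode == "audio" and not TIER_LIMITS[tier].audio_mode_allowed:
        return "text"
    return requested_mode


def check_story_limit(tier: Tier, stories_used_today: int) -> Optional[dict]:
    """Story tool layer: return a limit_reached payload when the daily cap is hit, else None."""
    if stories_used_today >= TIER_LIMITS[tier].max_stories_per_day:
        return {
            "type": "limit_reached",
            "limit_type": "stories",
            "message": "Daily story limit reached.",  # hypothetical wording
        }
    return None


if __name__ == "__main__":
    print(resolve_mode(Tier.BASIC, "audio"))  # text
    print(resolve_mode(Tier.PLUS, "audio"))   # audio
    print(check_story_limit(Tier.PLUS, 4))    # None
```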

WebSocket Protocol

Client → Server

Format	Content
Query param	`user_id=<uuid>`, `mode=audio\|text`, `resume_token=<token>`
Binary	Raw PCM audio (16kHz, 16-bit, mono)
JSON	`{ type: "text", text: "..." }`
JSON	`{ type: "profile", profile: { name, age, interests, accessibility }, is_resume? }`
JSON	`{ type: "story_resume", story_id, chapters_done, total_segments }` — restore chapter tracker on resume
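
The client side of this table can be sketched as a few small builders. The helper names are hypothetical, and the shape of the `accessibility` value is an assumption (it is passed through as an opaque dict); only the query parameters and message fields come from the table above.

```python
import json
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode


def build_session_url(base_url: str, user_id: str,
                      mode: Optional[str] = None,
                      resume_token: Optional[str] = None) -> str:
    """Append the documented query parameters to the WebSocket URL."""
    params = {"user_id": user_id}
    if mode is not None:
        if mode not in ("audio", "text"):
            raise ValueError("mode must be 'audio' or 'text'")
        params["mode"] = mode
    if resume_token:
        params["resume_token"] = resume_token
    return f"{base_url}?{urlencode(params)}"


def text_message(text: str) -> str:
    """JSON frame carrying typed user input."""
    return json.dumps({"type": "text", "text": text})


def profile_message(name: str, age: int, interests: List[str],
                    accessibility: Dict[str, Any], is_resume: bool = False) -> str:
    """JSON frame carrying the child profile; is_resume is sent only when true."""
    message: Dict[str, Any] = {
        "type": "profile",
        "profile": {"name": name, "age": age, "interests": interests, "accessibility": accessibility},
    }
    if is_resume:
        message["is_resume"] = True
    return json.dumps(message)


if __name__ == "__main__":
    print(build_session_url("ws://localhost:8000/ws/session", "abc", mode="text"))
    # ws://localhost:8000/ws/session?user_id=abc&mode=text
    print(text_message("Tell me about volcanoes"))
```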

Server → Client

Type	Content
Binary	Raw PCM audio (24kHz, 16-bit, mono)
`thinking`	Agent is processing
`tool_call`	`{ name }` — tool invocation started
`tool_result`	`{ name }` — tool completed
`transcription`	`{ text }` — agent speech transcript (finished utterances only; no partials)
`turn_complete`	Agent turn finished
`interrupted`	Barge-in detected; audio cancelled
`session_resumption`	`{ token }` — save for reconnect
`illustration`	`{ data?, url?, alt, segment_index, total_segments }`
`accessibility_text`	`{ text, segment_index, total_segments }`
`story_choices`	`{ choices: [{ label, description, story_direction }] }`
`quiz_data`	`{ questions: [{ question, options, answer_idx, hint }] }`
`subscription_info`	`{ tier, status, audio_mode_allowed, max_illustrations_per_story, stories_used_today, max_stories_per_day, trial_end, days_left, is_active }`
`subscription_expired`	Trial or subscription is no longer active
`limit_reached`	`{ limit_type: "stories"\|"illustrations", message }`
`error`	`{ message }`
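
One way a client might route these frames is a small dispatcher: binary frames go to audio playback, JSON frames go to a handler keyed by message type. This sketch assumes each JSON message carries its name in a `type` field with the listed content fields alongside it, mirroring the client messages; the README does not state the exact envelope, and `MessageDispatcher` is a hypothetical name.

```python
import json
from typing import Any, Callable, Dict, List, Union

Handler = Callable[[Dict[str, Any]], None]


class MessageDispatcher:
    """Route binary frames to an audio callback and JSON frames to per-type handlers."""

    def __init__(self, on_audio: Callable[[bytes], None]) -> None:
        self._on_audio = on_audio
        self._handlers: Dict[str, Handler] = {}
        self.unhandled: List[Dict[str, Any]] = []  # messages with no registered handler

    def on(self, msg_type: str, handler: Handler) -> None:
        self._handlers[msg_type] = handler

    def dispatch(self, frame: Union[bytes, bytearray, str]) -> None:
        if isinstance(frame, (bytes, bytearray)):
            self._on_audio(bytes(frame))  # raw PCM, 24 kHz 16-bit mono
            return
        message = json.loads(frame)
        handler = self._handlers.get(message.get("type"))
        if handler is None:
            self.unhandled.append(message)
        else:
            handler(message)


if __name__ == "__main__":
    received = []
    dispatcher = MessageDispatcher(on_audio=lambda pcm: received.append(("audio", len(pcm))))
    dispatcher.on("transcription", lambda m: received.append(("transcription", m["text"])))
    dispatcher.dispatch(b"\x00\x00" * 4)
    dispatcher.dispatch('{"type": "transcription", "text": "Hello!"}')
    dispatcher.dispatch('{"type": "turn_complete"}')
    print(received)              # [('audio', 8), ('transcription', 'Hello!')]
    print(dispatcher.unhandled)  # [{'type': 'turn_complete'}]
```

Collecting unhandled messages rather than raising keeps an older client working if the server adds new message types.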

REST API

Method	Path	Description
GET	`/health`	Server health check
GET	`/api/profiles/{user_id}`	Fetch child profile
PUT	`/api/profiles/{user_id}`	Create / update child profile
GET	`/api/stories/{user_id}`	List completed stories
GET	`/api/stories/{user_id}/{story_id}`	Full story detail (segments, illustrations, quiz, choices)
DELETE	`/api/stories/{story_id}`	Delete a story
GET	`/api/subscriptions/{user_id}`	Current subscription status
POST	`/api/subscriptions/{user_id}/activate`	Activate / change trial tier — body: `{ tier: "basic"\|"plus" }`
POST	`/api/subscriptions/{user_id}/cancel`	Cancel subscription
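
A short sketch of calling the subscription endpoints with only the standard library. The helper names are hypothetical, and the base URL assumes the local server from Quick Start; the sketch only builds the requests, so a running backend is needed to actually send them with `urllib.request.urlopen`.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: local dev server from Quick Start


def subscription_status_request(user_id: str, base_url: str = BASE_URL) -> Request:
    """GET /api/subscriptions/{user_id}"""
    return Request(f"{base_url}/api/subscriptions/{user_id}", method="GET")


def activate_request(user_id: str, tier: str, base_url: str = BASE_URL) -> Request:
    """POST /api/subscriptions/{user_id}/activate with a JSON body of {"tier": ...}."""
    if tier not in ("basic", "plus"):
        raise ValueError("tier must be 'basic' or 'plus'")
    body = json.dumps({"tier": tier}).encode("utf-8")
    return Request(
        f"{base_url}/api/subscriptions/{user_id}/activate",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    request = activate_request("abc", "plus")
    print(request.get_method(), request.full_url)
    # POST http://localhost:8000/api/subscriptions/abc/activate
```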

Tech Stack

Layer	Choice
Backend language	Python 3.11+
Web framework	FastAPI
Agent framework	Google ADK
AI — Voice	`gemini-2.5-flash-native-audio-preview-12-2025`
AI — Text / Story	`gemini-2.5-flash`
AI — Images	`gemini-2.5-flash-image`
Database	PostgreSQL (asyncpg + SQLAlchemy async)
Migrations	Alembic
Image storage	Cloudflare R2 (boto3 S3-compatible)
Frontend	React 19 + TypeScript + Vite 7 (TODO: port to React Native / Expo)
Styling	TailwindCSS v4
Animations	Framer Motion
Hosting	Google Cloud Run

License

Built for the Gemini Live Agent Challenge hackathon.
