Hands-on workshop scaffold for teaching Apache Airflow 3.2 through a fictional BookOps data platform.
IMPORTANT: Please complete the following setup before the tutorial. The setup pulls Docker images and installs packages -- do it at home on fast Wi-Fi.
- Python 3.12
- Docker Desktop with at least 4 GB memory
If your local system doesn't allow you to install things, you can use GitHub Codespaces. Click on Code (top right) -> Codespaces -> Create new codespace on main.
If you use Codespaces instead of local, your Airflow URL will look like:
https://humble-enigma-4xvq5wgx66cqppv-8080.app.github.dev/
Replace humble-enigma-4xvq5wgx66cqppv with your unique codespace name.
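The URL pattern above can be sketched as a small helper. This is a minimal illustration that assumes the `<codespace-name>-<port>.app.github.dev` pattern shown in the example URL; `codespace_url` is a hypothetical name and not part of the workshop repo.

```python
def codespace_url(codespace_name: str, port: int = 8080) -> str:
    """Build the browser URL for a port forwarded from a codespace.

    Assumes the <name>-<port>.app.github.dev pattern from the example above.
    """
    if not codespace_name:
        raise ValueError("codespace_name must not be empty")
    return f"https://{codespace_name}-{port}.app.github.dev/"


# Example using the placeholder codespace name from the text above.
print(codespace_url("humble-enigma-4xvq5wgx66cqppv"))
```

Inside a codespace, the name is normally available in the `CODESPACE_NAME` environment variable, so `codespace_url(os.environ["CODESPACE_NAME"])` should give the same result without copying the name by hand.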
Ports:
- 8080 for Airflow
- 5432 for Postgres
- 8501 for the analytics app
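A TCP connect test is a quick way to see whether these ports are reachable, or already taken by another process before you start. The sketch below assumes the services listen on localhost; `port_is_open` and `port_report` are hypothetical helper names, not part of the workshop repo.

```python
import socket

# Ports listed above; the labels are only used for display.
WORKSHOP_PORTS = {8080: "Airflow", 5432: "Postgres", 8501: "analytics app"}


def port_is_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def port_report() -> dict[str, bool]:
    """Map each workshop service label to whether its port is reachable."""
    return {label: port_is_open(port) for port, label in WORKSHOP_PORTS.items()}
```

Calling `port_report()` before starting anything should show every port as `False`. A `True` at that point suggests another program is already using the port.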

git clone git@github.com:thelearningdev/pyconus-2026-apache-airflow-tutorial.git
cd pyconus-2026-apache-airflow-tutorial
python3 -m venv .venv
source .venv/bin/activate
export AIRFLOW_HOME=$PWD # important: otherwise Airflow uses ~/airflow as its home folder
pip install --upgrade pip
pip install -r requirements.txt
./scripts/start_airflow_standalone.sh
Login credentials are generated on first run -- check simple_auth_manager_passwords.json.generated.
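If you prefer to read the generated credentials from a script, a minimal sketch follows. It assumes the file sits in your `AIRFLOW_HOME` folder and contains a single JSON object mapping usernames to passwords; check the file itself if the layout differs. `read_generated_credentials` is a hypothetical helper name.

```python
import json
from pathlib import Path

# File name taken from the setup notes above.
PASSWORD_FILE = "simple_auth_manager_passwords.json.generated"


def read_generated_credentials(airflow_home: str) -> dict[str, str]:
    """Load username -> password pairs from the generated password file.

    Assumption: the file is one JSON object such as {"admin": "..."}.
    """
    path = Path(airflow_home) / PASSWORD_FILE
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError(f"unexpected layout in {path}")
    return {str(user): str(password) for user, password in data.items()}
```

For a local install, `read_generated_credentials(os.environ["AIRFLOW_HOME"])` would be the typical call.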
Before running bookshop DAGs, add the bookshop_postgres connection in Airflow UI (Admin > Connections):
- Conn ID: bookshop_postgres
- Conn Type: Postgres
- Host: localhost, Database: bookops, Login: airflow, Password: airflow, Port: 5432
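The same settings can be written as a connection URI. Airflow can also read connections from environment variables named `AIRFLOW_CONN_<CONN_ID>`. The workshop setup does not say whether it relies on that mechanism, so treat the sketch below as an optional alternative to the UI steps. The helper names are hypothetical.

```python
from urllib.parse import quote


def airflow_conn_uri(conn_type: str, login: str, password: str,
                     host: str, port: int, database: str) -> str:
    """Build an Airflow-style connection URI, percent-encoding credentials."""
    return (
        f"{conn_type}://{quote(login, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


def conn_env_var(conn_id: str) -> str:
    """Name of the environment variable Airflow checks for a connection ID."""
    return f"AIRFLOW_CONN_{conn_id.upper()}"


# Values from the connection settings listed above.
uri = airflow_conn_uri("postgres", "airflow", "airflow", "localhost", 5432, "bookops")
print(f"export {conn_env_var('bookshop_postgres')}='{uri}'")
```

Connections defined through environment variables are not listed under Admin > Connections, so the UI route is easier to inspect during the workshop.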
docker compose up --build
Open http://localhost:8080 and sign in with the credentials from simple_auth_manager_passwords.json.generated.
In a new terminal:
docker compose exec airflow /bin/bash
Then inside the shell:
airflow dags list
You won't see any DAGs yet, but by the end of the workshop you will have several.
Connect with a client like DBeaver or pgAdmin using:
postgresql://airflow:airflow@localhost:5432/airflow # Airflow metadata
postgresql://airflow:airflow@localhost:5432/bookops # Workshop data
Open http://localhost:8501/ (Docker only). You will see the BookShop Pipeline Dashboard. Errors are expected until you run the pipeline.
End of Setup. The rest we do at the workshop.
Exercises are in the exercises/ folder, one file per exercise (00-10 + reference).
Concepts track (00-09) -- Airflow mechanics in isolation, no database:
| Exercise | Topic |
|---|---|
| 00 | DAG anatomy, @task, >> |
| 01 | Task dependencies |
| 02 | Branching, trigger_rule |
| 03 | Task context |
| 04 | XCom push/pull |
| 05 | Retries and timeouts |
| 06 | Connections and Hooks |
| 07 | Sensors |
| 08 | Scheduling, catchup, {{ ds }} |
| 09 | Assets |
Bookshop track (10-13) -- full ETL pipeline against Postgres:
| Exercise | Airflow Topics | DE Topics |
|---|---|---|
| 10 | PostgresHook, idempotency | CSV ingestion, ON CONFLICT upsert |
| 11 | schedule, catchup, {{ ds }}, XCom | Incremental loads, backfill |
| 12 | @task.branch, trigger_rule, Assets | Data quality, quarantine pattern |
| 13 | Dynamic task mapping, Assets | Parallel aggregation, reporting mart |
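To make the idempotency idea from exercise 10 concrete, here is a minimal sketch of an `ON CONFLICT` upsert. It uses Python's built-in SQLite instead of the workshop's Postgres so it runs without any services. The `books` table and its columns are hypothetical, and a Postgres driver would use `%s` placeholders rather than `?`.

```python
import sqlite3


def upsert_books(conn: sqlite3.Connection, rows: list[tuple[str, str, float]]) -> None:
    """Insert rows keyed by isbn; on a key clash, update the existing row."""
    conn.executemany(
        """
        INSERT INTO books (isbn, title, price)
        VALUES (?, ?, ?)
        ON CONFLICT (isbn) DO UPDATE SET
            title = excluded.title,
            price = excluded.price
        """,
        rows,
    )
    conn.commit()


def demo() -> list[tuple[str, str, float]]:
    """Load the same batch twice, then change one price, and return the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT, price REAL)")
    batch = [("978-0", "Dune", 9.99), ("978-1", "Emma", 7.5)]
    upsert_books(conn, batch)
    upsert_books(conn, batch)  # re-running the load must not create duplicates
    upsert_books(conn, [("978-0", "Dune", 11.0)])  # a changed row overwrites the old one
    rows = conn.execute("SELECT isbn, title, price FROM books ORDER BY isbn").fetchall()
    conn.close()
    return rows
```

Because a re-run leaves the table in the same state, a retried or backfilled task can load the same file again safely.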
Starter files are in dagscode/ with _starter.py and _solution.py variants. Copy the starter you're working on into dags/ for Airflow to pick it up.
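The copy step can be scripted if you prefer. The sketch below assumes it runs from the repository root with the `dagscode/` and `dags/` folders described above. `copy_starter` and the file name in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def copy_starter(starter_name: str, repo_root: str = ".") -> Path:
    """Copy one starter file from dagscode/ into dags/ and return the new path."""
    root = Path(repo_root)
    src = root / "dagscode" / starter_name
    dst_dir = root / "dags"
    if not src.is_file():
        raise FileNotFoundError(f"starter not found: {src}")
    dst_dir.mkdir(exist_ok=True)  # create dags/ if it does not exist yet
    return Path(shutil.copy2(src, dst_dir / src.name))
```

For example, `copy_starter("some_exercise_starter.py")` would place that file in `dags/`, where Airflow should pick it up the next time it parses the folder.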
Reset Airflow DB if things get into a bad state:
docker compose exec airflow /bin/bash
airflow db reset -y
airflow db migrate
airflow dags reserialize
If tasks are not scheduling or running, restart the Airflow terminal or Docker container.

