Skip to content

Anaconda-Labs/building-intelligent-apps-with-anaconda

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

124 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🚀 Building Intelligent Apps with Anaconda

Mission status: Ready for lift-off

A hands-on curriculum for building production-grade intelligent applications with the Anaconda ecosystem. Every module is a self-contained < 7-minute narrated demo with pre-run outputs or script run_demo.sh — designed to show the code and the decisions, not walk you through setup.

Our data is real: WASP-18 b, a hot Jupiter exoplanet caught transiting its star by NASA's TESS telescope. We process its light curve, build agents to reason about it, deploy those agents to production, secure the supply chain, and ship the result — to a browser tab, to a native app, to an air-gapped server 800km away. Same data. Every module.

(Looking for conda-forge optimized environments? check out the conda-forge branch of this repository.)


🌌 Mission arc:

                     ┌─────────────────────────────────────────────┐
                     │         THE WASP-18 b PIPELINE              │
                     │  a hot Jupiter 1,300 light-years from home  │
                     └─────────────────────────────────────────────┘

 PRE-LAUNCH         -1   MCP orientation        ~5 mins            talk to Anaconda tools before liftoff - requires Claude Desktop
 ────────────────────────────────────────────────────────────────────────────────
 CORE STACK         00   Foundation*            ~1-3 mins          conda — the launch pad - build script
                    01   Data sources           ~1-3 mins          TESS photons → Python → ValidationReport - prerun notebook
                    02   Your first agent*      ~3 mins            one agent, one tool, one answer - build script
                    03   Multi-agent            ~5 mins            crew of agents, Metaflow orchestration - build script
 ────────────────────────────────────────────────────────────────────────────────
 DEEP SPACE         04   Deployment*            ~3 mins            swap the LLM endpoint, keep the agents - prerun notebook
                    05   GPU acceleration*      UNKNOWN            (experimental) Nemotron on NVIDIA iron, 47× faster - prerun notebook
                    06   App architecture       ~5-7 mins          harness, evals, vector memory, cards - build script
                    07   Mission critical       ~5 mins            CVEs, conda-lockfiles, air-gap, AIBOM - prerun notebook
 ────────────────────────────────────────────────────────────────────────────────
 EXTRAVEHICULAR     08   Native apps*           UNKNOWN            (experimental) PyScript (browser) + BeeWare (native) - build script
                    09   Web app                ~1-3 mins          Panel app (browser) - prerun notebook
 ────────────────────────────────────────────────────────────────────────────────
 MISSION CONTROL    --   Example environment                       reference environment that would work for all of the modules
                    --   README                                    **YOURE HERE**

All modules are optimized for GitHub codespaces for ease. Modules that cannot be completed in GH codespaces: *section #00 - mcp your environment requires Claude Desktop *section 02 - your first agent, option B requires Anaconda Desktop *section 04 - deployment and inference, option A requires Anaconda Desktop *section 05 - gpu acceleration, requires Brev (paid service) *section 08 - native applications, option B: BeeWare builds native applications

The payload — ingestion.py and ValidationReport — never changes. What changes is where it runs and what reasons about it.


📡 Modules

-1 — MCP: Pre-flight Checklist

Make sure ground control can hear you. Time: ~5 minutes

The Model Context Protocol is how AI assistants talk to Anaconda tools. Before the first conda install, verify your MCP setup so Claude Desktop can manage environments, query packages, and inspect CVEs on your behalf.

Tools: anaconda-mcp, Claude Desktop


00 — Foundation: The Launch Pad

You can't reach orbit without a stable platform. Time: ~1-3 minutes

Every agent, every pipeline step, every GPU kernel in this curriculum runs inside a conda environment. This module makes that concrete: why conda, how environment isolation works, and the tools that turn a fleeting pip install into a reproducible, lockable, shippable artifact.

Tools: conda, conda-pypi (experimental), conda-forge, Anaconda Distribution


01 — Data Sources: First Contact

Raw photons from 1,300 light-years away, cleaned up and ready for agents. Time: ~1-3 minutes

Built on Daina Bouquin's polars_demo — a real TESS phase-folded light curve of WASP-18 b, a hot Jupiter completing an orbit every 22 hours. We extend it into a production-ready pipeline: schema enforcement, Pydantic validation, IsolationForest anomaly detection, and the ValidationReport that every subsequent module consumes.

PHASE | LC_DETREND | MODEL_INIT
──────────────────────────────
The three columns that travel through the entire curriculum.
  • ingestion.pyload_lightcurve(), validate_lightcurve(), schema enforcement
  • ValidationReport — typed Pydantic output, JSON-serialisable, agent-ready
  • IsolationForest — transit anomaly detection without labelled data
  • agent_context — the structured payload that becomes Module 02's agent input

Tools: Polars, scikit-learn, Pydantic, ingestion.py


02 — Your First Agent: One Crew Member

A single agent, two tools, a classification. Time: ~3 minutes

ingestion.py functions become LangGraph tools. One agent calls load_lightcurve, passes the result to validate_lightcurve, reasons over the ValidationReport, and returns a structured transit classification. Claude Haiku is the default crew member — swap the base_url for AI Navigator or Anaconda Desktop to fly offline. Demo can be completed without model access, but stops at what will be served to the model.

Default LLM: claude-haiku-4-5-20251001 via Anthropic API, or AI Navigator local server.

Tools: LangGraph, Anthropic or openai client


03 — Multi-Agent Architecture: Assemble the Crew

Two agents, one supervisor, foreach parallelism across 50 targets. Time: ~5 minutes

DataAgent and AnalysisAgent fly in formation, coordinated by a LangGraph supervisor. Metaflow wraps the whole operation as a FlowSpec — each agent role gets its own isolated, lockable environment. A dependency conflict between Polars and LangGraph is structurally impossible.

start → ingest (polars, scikit-learn)
      → analyze (openai, langgraph)
      → join → end

Tools: LangGraph, Metaflow 2.18+, FlowSpec


04 — Deployment and Inference: Mission Control Endpoints

Three LLM targets, one agent interface, zero code changes. Time: ~3 minutes

The agents from Module 03 call an LLM via the openai client. That client points at a URL. This module shows what lives at the URL — and proves that swapping it is a one-line env var change.

Target base_url Best for
AI Navigator http://localhost:8080/v1 Local dev, no API key
vLLM (self-hosted) http://server:8000/v1 Production GPU, full control
Anaconda Platform $MODEL_SERVER_BASE_URL Enterprise, governed, AIBOM

Anaconda Platform adds: Model Catalog with HellaSwag/WinoGrande/TruthfulQA benchmarks, downloadable AIBOM (CycloneDX JSON), Responsible AI scoring via Gray Swan, Model Governance for org-wide policy.

Supporting docs in deploy/: ai-navigator.md, vllm.md, anaconda-platform.md

Tools: vLLM, Anaconda Platform Model Servers, inference_client.py


05 — GPU-Accelerated Intelligence: Afterburners

Same pipeline. NVIDIA iron. 47× faster feature engineering. Time: UNKNOWN - experimental

The Module 03 flow gets a CUDA upgrade. compute_features moves from Polars CPU rolling windows to a CUDA Python 1.0 kernel. The LLM switches from Claude Haiku to Nemotron 3 Nano on vLLM via Brev. The agents don't know any of this happened.

Module 03                    Module 05
─────────────────────────    ────────────────────────────────
@conda per step   same  →    + nvidia channel, cuda-python
ingestion.py      same  →    same functions
Claude Haiku            →    Nemotron 3 Nano (BF16) on vLLM
CPU rolling windows     →    CUDA Python kernel
no sandbox              →    NemoClaw security layer (alpha)

What each NVIDIA tool actually is (not Python imports):

  • Brev — CLI to provision an L40S in ~3 minutes: brev create
  • CUDA Python 1.0from cuda.core.experimental import Device — direct kernel access
  • Nemotron — HuggingFace model, served via vLLM, called via openai client
  • NemoClaw — TypeScript CLI + Python blueprint, sandboxed agent runtime (alpha)

conda-pypi note: vLLM is PyPI-only. The pip: section in environment.yml is the current pragmatic path. conda-pypi (experimental, Q1 2026) is the safer long-term approach — converts wheels to .conda format, integrates with the solver. Track: conda/conda-pypi

Benchmarks (50 light curves): 47.5× feature engineering speedup · 4.8× end-to-end

Tools: Brev, CUDA Python 1.0, vLLM, Nemotron 3 Nano (BF16), NemoClaw


06 — App Architecture: Mission Hardening

The pipeline that keeps flying when things go wrong. Time: ~5-7 minutes

Module 03's flow works on good data with a responsive LLM. This module adds four additive patterns that keep it running in production:

Pattern              Metaflow tool         What it solves
─────────────────    ──────────────────    ────────────────────────────────────────
Graceful degradation @catch                One bad target doesn't abort the mission
Eval-as-CI           evaluate step         Assertions run every execution, fail loud
Observability        @card                 HTML reports per target + per run
Agent memory         DuckDB vector store   Past results injected as context at inference

The evaluate step runs assertion functions from evals/assertions.py — plain Python, no Metaflow dependency, testable with pytest. Critical failures raise AssertionError. The @card on end is the single view you check after every production run.

The DuckDB memory store (from the vector DB comparison: pgvector / MongoDB Atlas / Neo4j / DuckDB — embedded, portable, conda-pack-able) gives agents memory across runs: past ValidationReport results retrieved by cosine similarity and injected into the system prompt.

Tools: LangGraph, Pydantic, DuckDB, Metaflow: FlowSpec, @catch, @card, @conda, @retry


07 — Mission-Critical Infrastructure: No Failures Tolerated"

Prove the environment is safe before it flies.

Supply chain security isn't a feature you add at the end. It's the infrastructure the pipeline runs on. Five layers, zero pipeline code changes:

Layer    Tool                               The question it answers
───────  ─────────────────────────────────  ─────────────────────────────────────────
Lock     conda-lockfiles                    Is this environment bit-for-bit reproducible?
Scan     anaconda-audit                     Does it contain known vulnerabilities?
Gate     Anaconda Platform policy filter    Did anything vulnerable get in upstream?
Pack     conda-pack                         Can we deploy without internet access?
Verify   AIBOM + SHA-256                    Is the model file what we think it is?
  • anaconda-audit scan --name app-architecture — CVE scan against NVD/NIST, Anaconda-curated statuses (Active / Cleared / Mitigated / Disputed)
  • Policy filters block packages with CVE score ≥ 7 or Active status before they reach your channel
  • conda-lockfiles turns a floating environment.yml into a pinned deployment contract
  • conda-pack ships the entire environment as a relocatable tarball — Python, CUDA binaries, DuckDB memory store — to a machine with no conda, no internet
  • AIBOM (CycloneDX JSON from Anaconda Platform Model Catalog) includes SHA-256 checksums, benchmark scores, ethical considerations, software dependencies

CI script: lock_and_scan.sh — lock → scan → gate, exits 1 on critical CVEs. Deploy script: pack_and_ship.sh — AIBOM verify → conda-pack → scp to target.

Tools: anaconda-audit, conda-lockfiles, conda-pack, Anaconda Platform, verify_aibom.py


08 — Native Apps: Escape Velocity

The same pipeline, delivered everywhere Python runs.

An addendum. Not required. The answer to: after all that, where else can this fly?

Two options, same five exoplanet targets, same IsolationForest pipeline, same ValidationReport schema:

Option A — PyScript (Python in the browser, no server)

Launched by Anaconda at PyCon US 2022. Pyodide loads a full CPython interpreter into your browser tab via WebAssembly. One HTML file. Select a target → Run Analysis → validation report + three matplotlib charts render in the page. numpy, pandas, matplotlib, scikit-learn all bundled in Pyodide.

cd 08-native-apps && python -m http.server 8080
# open http://localhost:8080 — that's it

Option B — BeeWare (Python as a native OS app)

Funded by Anaconda. Briefcase packages the same Python source into native apps for every platform. Toga maps Python widgets to native OS controls (NSTableView on macOS, GtkTreeView on Linux, ListView on Windows). One pyproject.toml, six targets.

cd 08-native-apps
pip install briefcase
briefcase dev        # opens a native window immediately
briefcase package    # → .dmg / .msi / AppImage / .ipa / .aab

See BUILDING.md for the complete per-platform build guide including signing, distribution, and the full Briefcase command lifecycle.

Tools: PyScript (Pyodide/WASM), BeeWare (Briefcase + Toga)


09 — Web App: Ground Control

The same pipeline, served as an interactive web app. Time: ~1-3 minutes

An addendum. The answer to: how do you put this in front of a non-Python user?

The Module 01 IsolationForest pipeline becomes a Panel app served by a Bokeh server. Same five exoplanet targets, same ValidationReport schema — what changes is the delivery layer. Reactive widgets stream updates to the browser over a websocket as the pipeline runs. No page reloads, no CSV, no setup for the end user.

conda env create -f 09-web-app/environment.yml
conda activate panel-app
panel serve app.py --show
# Opens http://localhost:5006/app

The app surfaces the pipeline as four tabs per run: a live pipeline log, the full ValidationReport as formatted stats, a sortable Tabulator grid of the top 50 anomalous points, and a phase-folded lightcurve plot with anomalies highlighted and the IsolationForest score panel below.

panel-material-ui provides the Material Design sidebar, tabs, and buttons. hvplot turns the pipeline's pandas DataFrames into HoloViews plots in a single method call.

Tools: Panel, HoloViews, hvplot, panel-material-ui, param


🛠️ Core tools

Tool Role First seen
conda / conda-forge Environment management + package distribution 00
conda-lockfiles Reproducible cross-platform lock files 00, 07
conda-pack Portable environment tarballs for air-gap deployment 07, 08
conda-pypi Safer PyPI wheel integration (experimental, Q1 2026) 05
anaconda-audit CVE scanning against NVD/NIST 07
Anaconda Platform Model Catalog, Governance, CVE policy, Model Servers 04, 07
Anaconda MCP Server MCP tool exposure for Anaconda ecosystem -1, 02
Polars Fast DataFrame manipulation 01
Pydantic Typed, validated, JSON-serialisable pipeline outputs 01
Metaflow 2.18+ ML/AI workflow orchestration with per-step @conda 03
LangGraph Agent loop and multi-agent coordination 02, 03
vLLM OpenAI-compatible self-hosted GPU inference server 04, 05
CUDA Python 1.0 Direct CUDA kernel access from Python 05
Brev On-demand GPU instance provisioning 05
Nemotron NVIDIA open-weight model family (via HuggingFace + vLLM) 05
NemoClaw Sandboxed agent runtime (NVIDIA, alpha CLI) 05
DuckDB Embedded vector store for agent memory 06
PyScript Python in the browser via WebAssembly 08
BeeWare Native mobile/desktop apps in Python 08
Panel / HoloViz Reactive web app framework on top of Bokeh 09

⚙️ Optional tools

Tool Role
LangChain Broader LLM tooling ecosystem (02)
FastMCP MCP server construction
pixi Alternative conda environment manager
Numba JIT-compiled Python for GPU/CPU acceleration
RAPIDS GPU-accelerated data science (cuDF, cuML)
Outerbounds Managed Metaflow (Argo + metadata service + UI)

🎯 Learning Objectives

By the end of this curriculum, you will be able to:

  • Create, lock, and reproduce conda environments across platforms using conda-lockfiles and conda-forge
  • Audit a dependency graph for known CVEs using anaconda-audit and interpret Anaconda-curated vulnerability statuses
  • Package a complete Python environment, including CUDA binaries and embedded databases, as a relocatable tarball for air-gapped deployment
  • Build a production-grade ingestion pipeline with schema enforcement, Pydantic validation, and unsupervised anomaly detection
  • Produce structured, JSON-serializable pipeline outputs that downstream agents can consume without additional parsing
  • Implement a single-agent tool loop using LangGraph and the OpenAI-compatible client interface
  • Compose a multi-agent system with a supervisor, role-separated agents, and per-step isolated environments using Metaflow @conda decorators
  • Swap LLM backends (Anthropic API, AI Navigator, vLLM, Anaconda Platform) by changing a single environment variable, with no changes to agent logic
  • Add graceful degradation, eval-as-CI, per-run observability cards, and cross-run vector memory to an existing Metaflow pipeline
  • Verify model provenance using a CycloneDX AIBOM and SHA-256 checksums before deployment
  • Run the same Python analysis pipeline in a browser tab via PyScript and as a native desktop application via BeeWare Briefcase
  • Serve a validated pipeline as an interactive Panel web app with reactive widgets and live plot updates

🌠 Prerequisites

# Anaconda Distribution or Miniconda
# https://www.anaconda.com/download

conda --version   # 26.5.x or later

# Free Anaconda account — required for AI Navigator and Platform features
# https://anaconda.com

Each module has its own environment.yml. Start with -1-mcp-your-environment and work in order — every module builds on the data and patterns from the one before it.


🔭 The data

Every module uses the WASP-18 b phase-folded light curve from NASA's TESS mission, originally prepared by Daina Bouquin for polars_demo.

WASP-18 b is a real exoplanet — a hot Jupiter 10× the mass of Jupiter, completing a full orbit every 22.6 hours. TESS measured its host star's brightness dropping by ~1% each time the planet crossed in front of it. Those brightness measurements are our CSV: 1,800 rows, three columns, one recurring signal buried in noise.

It's a good dataset for this curriculum because it has a real anomaly (the transit dip), realistic noise, and enough physical context that the agent's reasoning outputs make intuitive sense. You don't need to know astrophysics — but if you look it up, the numbers check out.

wasp18b_lightcurve.csv is bundled in each module directory that uses it.


🛸 About

Anaconda demos for PyCon US 2026, Long Beach, May 14–22.

Built with the Anaconda ecosystem for 50M+ Python users.
The pipeline never changes. Everything else does.

For questions contact @dawnwages Target Audience: Software Engineers, AI Developers, ML Engineers, Data Scientists Resource Type: Show Metrics: Stars, forks, completions

🐛 See a problem?

Something broken, unclear, or out of date? Open an issue — bug reports and improvement suggestions both welcome.

License

MIT License - see LICENSE file for details.

About

10-part module to build AI-native apps. From data analysis to multi-agent model harness. For curious learners and serious AI builders.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors