Closed feedback loops (observe, act, evaluate, update, repeat), made structured, mathematical, benchmarkable, and fully engineerable.
Read the Manifesto · Explore Patterns · Run the Stack · Onboarding Paths
| Era | Focus | Optimized Unit | Cognitive Ceiling |
|---|---|---|---|
| 2020–2023 | Prompt Engineering | Single turn, in-context cues | No closure, state loss |
| 2023–2024 | Context Engineering | Static retrieval-augmented memory | Unchanged parameters, no iteration |
| 2024–2025 | Agent Engineering | Autonomous delegation & tools | No systemic evaluation, feedback-blind |
| 2025+ | Loop Engineering | Closed dynamical feedback loops | Unbounded, self-directed systems |
The Hierarchy of Optimization:
- Prompt engineering optimizes a single interaction.
- Agent engineering optimizes an autonomous actor.
- Loop engineering optimizes the entire closed system to get better over time through feedback.
Prompt engineering optimizes a turn. Agent engineering optimizes an actor. Loop engineering optimizes the whole closed system: cheaper to run, faster to ship, impossible to hand-wave, and built to get better every iteration.
| Benefit | What you get | Why teams care |
|---|---|---|
| Lean context | Combine + minify + budget specs down to 34% of raw YAML | More room for actual work in the window, not boilerplate |
| Minutes, not weeks | Golden path → valid LSS → scored loop in ~15 minutes | Stop reinventing loop config every sprint |
| Zero-dollar CI | SimEnv + ReplayEnv run 545 LoopNet trajectories with $0 API spend | Catch regressions before they hit prod invoices |
| Shared failure language | `fail.*` taxonomy across data, runtime, and bench | Post-mortems that actually transfer between teams |
| Public receipts | LoopBench 19 tasks · 4 suites · LES-ranked leaderboard | "It worked in the demo" is no longer a career strategy |
| Production visibility | LTF traces ~70% leaner than raw chat dumps | SREs see iteration quality, not megabytes of prompts |
| One spec layer | Pin `lss@1.1.0` once → LoopGym, LoopBench, LoopNet agree | Zero schema drift across five repos |
| Harness freedom | Claude Code, Cursor, LangGraph, CrewAI, Codex, Aider… | Keep your agent stack, add closure on top |
The pitch in one line: ship loops that cost less per turn, score on a leaderboard, replay for free, fail with names, and compound improvement, rather than another prompt doc lost in Notion.
Ship one flat spec instead of dragging multiple YAML files into context. Measured with le-loopforge 0.5.0 on the research → code → debug library trio.
| Path | Command | Tokens | vs baseline |
|---|---|---|---|
| Separate library YAMLs | load 3 files into context | 3,255 | 100% |
| Flat combine | `loop combine --library research-agent,coding-agent,autonomous-debugger` | 2,750 | 84% |
| LSS-min JSON | `loopctl spec minify combined.yaml` | 1,414 | 43% |
| Budgeted combine | `loop quick --max-tokens 1200 --library …` | 1,101 | 34% |
Same LSS structure. Same evaluators. Same termination contracts. Just less noise between your agent and the job.
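The "vs baseline" column is each path's token count divided by the 3,255-token baseline. A small sketch to recompute it from the table's own numbers; the helper name `percent_of_baseline` is illustrative and not part of any Loop Engineering tool:

```python
# Token counts copied from the table above (as measured with le-loopforge 0.5.0).
TOKEN_COUNTS = {
    "Separate library YAMLs": 3255,
    "Flat combine": 2750,
    "LSS-min JSON": 1414,
    "Budgeted combine": 1101,
}


def percent_of_baseline(tokens: int, baseline: int) -> int:
    """Return `tokens` as a whole-number percentage of `baseline`."""
    return round(100 * tokens / baseline)


baseline = TOKEN_COUNTS["Separate library YAMLs"]
for path, tokens in TOKEN_COUNTS.items():
    share = percent_of_baseline(tokens, baseline)
    print(f"{path:<24} {tokens:>5,} tokens  {share:>3}% of baseline")
```

Swapping in your own measured counts gives the same comparison for a different library combination.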
Install the entire Loop Engineering toolchain with a single command.
```bash
pip install "le-loop-stack>=0.4.0"
```

```bash
# Scaffold a loop spec from an English intent
loopforge intent "Create a code-repair loop with a test-runner evaluator" -o mapped.yaml --suggest-level
# Minify it into LSS-min JSON (saves 40–60% of prompt context space)
loopctl spec minify mapped.yaml --json
# Estimate tokens & score its structural LES
loopctl score --spec mapped.yaml --json
```

| Pillar | Focus Area | Key Artifacts |
|---|---|---|
| Theory | Foundational conceptual rigor | 13 Fundamentals · 6-Level Taxonomy · 14 Design Patterns |
| Method | Closed-loop lifecycle governance | D-D-M-I-S Framework (Design, Diagnose, Measure, Improve, Scale) |
| Standards | Interoperable specification models | LSS 1.1 (Composition blocks) · LES 1.0 (Loop Effectiveness Score) |
| Evidence | Real-world validation & history | Case Studies (AlphaGo, Toyota TPS, PR pipelines, coding agents) |
| Runtime | Execution, scoring, and benchmarks | Dataset registries, replay sandboxes, and the public scorecard |
This repository serves as the narrative and theoretical home for the loop engineering movement. Machine-readable specifications and governance rules live in the canonical Loop Core Engineering repository.
Everything below is live, synchronized, and published across GitHub and PyPI. Version registry: ECOSYSTEM_VERSIONS.md.
```mermaid
flowchart TD
    classDef primary fill:#18181b,stroke:#27272a,stroke-width:2px,color:#ffffff;
    classDef highlight fill:#f4f4f5,stroke:#18181b,stroke-width:2px,color:#18181b;
    classDef standard fill:#ffffff,stroke:#e4e4e7,stroke-width:1.5px,color:#18181b;
    DOCS[["Loop Engineering<br/>(You are here)<br/>Manifesto · Patterns · Case Studies"]]:::primary
    FORGE["LoopForge<br/>pip install le-loopforge"]:::standard
    CTL["loopctl CLI<br/>pip install le-loopctl"]:::standard
    CORE[["Loop Core Engineering<br/>LSS Spec · LES Spec · Validators"]]:::highlight
    NET[("LoopNet v0.2<br/>545 trajectories")]:::standard
    GYM["LoopGym<br/>pip install loopgym"]:::standard
    BENCH["LoopBench<br/>pip install loopbench"]:::standard
    DOCS --> FORGE
    FORGE --> CTL
    FORGE --> CORE
    CORE --> NET
    CORE --> GYM
    NET --> GYM
    GYM --> BENCH
    CORE --> BENCH
    FORGE --> GYM
```
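
One way to read the flowchart is as a dependency graph in which each arrow means "feeds into". That reading is an assumption, since the diagram does not define its arrows. Under it, the sketch below uses Python's standard-library `graphlib` to list the components upstream-first. The edge list is copied from the diagram, and `upstream_first` is a hypothetical helper, not part of the toolchain.

```python
from graphlib import TopologicalSorter

# Edges copied from the flowchart above as (source, target) pairs.
EDGES = [
    ("DOCS", "FORGE"), ("FORGE", "CTL"), ("FORGE", "CORE"),
    ("CORE", "NET"), ("CORE", "GYM"), ("NET", "GYM"),
    ("GYM", "BENCH"), ("CORE", "BENCH"), ("FORGE", "GYM"),
]


def upstream_first(edges):
    """Return the nodes ordered so every source precedes its targets."""
    predecessors = {}
    for source, target in edges:
        predecessors.setdefault(source, set())
        predecessors.setdefault(target, set()).add(source)
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    return list(TopologicalSorter(predecessors).static_order())


order = upstream_first(EDGES)
print(" -> ".join(order))
```

Several orderings are valid (for example, `CTL` can appear anywhere after `FORGE`), so the printed sequence is one consistent order rather than the only one. `static_order()` raises `graphlib.CycleError` if an added edge ever makes the graph cyclic.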
| Repository | Focus | Purpose & Links |
|---|---|---|
| LoopForge | Creation | Scaffold valid LSS specs from patterns · `loopforge/` · `pip install le-loopforge` · `loopctl` · Golden Path |
| Loop Core Engineering | Specs & Governance | The constitutional foundation, schemas, and validators · GitHub |
| LoopNet | Dataset | Ground truth loop executions and trajectories · GitHub · Hugging Face |
| LoopGym | Runtime | Sandboxed simulation environment to run and replay loops · GitHub · `pip install loopgym` |
| LoopBench | Benchmarks | Continuous, public community scoreboard · GitHub · `pip install loopbench` |
- Complete Install Map: ECOSYSTEM.md
- Ecosystem Governance: CANONICAL-SOURCE.md
- PyPI Registry Naming Rules: PYPI_NAMING.md
Every loop is structured as a closed dynamical system:
```text
Observe
   │
   ▼
Decide
   │
   ▼
Act
   │
   ▼
Evaluate
   │
   ▼
Update State
   │
   └──────────(repeat)──────────► [Observe]
```
Mathematically, a loop is formalized as the tuple:

$$(\mathbf{S}, \mathbf{A}, \mathbf{O}, \mathbf{T}, \mathbf{E}, \mathbf{M}, \mathbf{\tau})$$

Where:
- $\mathbf{S}$: State space of the system
- $\mathbf{A}$: Action space of the loop workers
- $\mathbf{O}$: Observation space (feedback signals)
- $\mathbf{T}$: Transition functions ($S \times A \to S$)
- $\mathbf{E}$: Evaluator models (generate scores & rewards)
- $\mathbf{M}$: Memory representation (episodic & parameter state)
- $\mathbf{\tau}$: Termination conditions & criteria
Detailed breakdown: What is a loop?
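
To make the components above concrete, here is a minimal sketch of a closed loop in plain Python. It is a toy illustration, not the LoopGym runtime: the `Loop` class, its field names, and the numeric "reach a target value" task are all assumptions chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Loop:
    """Toy container for the components S, A, O, T, E, M, tau (names are illustrative)."""
    observe: Callable[[float], float]              # O: state -> observation
    decide: Callable[[float, List[float]], float]  # picks an action in A from observation + memory
    transition: Callable[[float, float], float]    # T: S x A -> S
    evaluate: Callable[[float], float]             # E: state -> score (higher is better)
    terminate: Callable[[float, int], bool]        # tau: (score, iteration) -> stop?
    memory: List[float] = field(default_factory=list)  # M: history of scores

    def run(self, state: float) -> float:
        iteration = 0
        while True:
            observation = self.observe(state)                # Observe
            action = self.decide(observation, self.memory)   # Decide
            state = self.transition(state, action)           # Act
            score = self.evaluate(state)                     # Evaluate
            self.memory.append(score)                        # Update state
            iteration += 1
            if self.terminate(score, iteration):
                return state


TARGET = 10.0
loop = Loop(
    observe=lambda s: TARGET - s,          # feedback signal: signed error
    decide=lambda obs, mem: 0.5 * obs,     # correct half of the observed error
    transition=lambda s, a: s + a,
    evaluate=lambda s: -abs(TARGET - s),
    terminate=lambda score, i: score > -0.01 or i >= 50,
)
final_state = loop.run(state=0.0)
print(f"stopped after {len(loop.memory)} iterations at state {final_state}")
```

Because each action is computed from the latest observation, the error halves every iteration and the score history in `memory` improves monotonically, which is the "closure" the formal definition describes.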
LSS provides a declarative, machine-readable format to define the architecture, inputs, and constraints of any loop.
```yaml
loop_name: code-repair-loop
version: "1.1"
objective: "Fix failing tests with minimal diff"
workers:
  - role: implementer
evaluators:
  - type: test_suite
termination_conditions:
  - type: all_tests_pass
  - type: max_iterations
    value: 10
```

You do not need to replace your existing agent stack. Map your existing agent loop, monitor its trajectories, and benchmark its performance in minutes.
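
As a rough illustration of what mapping an existing agent onto the example spec above could involve, the sketch below wraps a hypothetical `agent_step` callable in the spec's two termination conditions and records a simple trajectory. The semantics are assumed (any satisfied condition stops the loop), and neither `should_stop` nor `run_existing_agent` is an API of LoopGym or `loopctl`; the integration guides describe the supported path.

```python
from typing import Any, Callable, Dict, List

# Mirrors the termination_conditions of the example spec above as a plain dict.
SPEC: Dict[str, Any] = {
    "loop_name": "code-repair-loop",
    "termination_conditions": [
        {"type": "all_tests_pass"},
        {"type": "max_iterations", "value": 10},
    ],
}


def should_stop(spec: Dict[str, Any], iteration: int, tests_passed: bool) -> bool:
    """Assumed semantics: stop as soon as any termination condition holds."""
    for condition in spec["termination_conditions"]:
        if condition["type"] == "all_tests_pass" and tests_passed:
            return True
        if condition["type"] == "max_iterations" and iteration >= condition["value"]:
            return True
    return False


def run_existing_agent(
    agent_step: Callable[[int], bool], spec: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Drive a hypothetical agent_step(iteration) -> tests_passed until the spec says stop."""
    trajectory: List[Dict[str, Any]] = []
    iteration = 0
    while True:
        iteration += 1
        tests_passed = agent_step(iteration)
        trajectory.append({"iteration": iteration, "tests_passed": tests_passed})
        if should_stop(spec, iteration, tests_passed):
            return trajectory


# Stub agents standing in for a real harness.
fixed_on_third_try = run_existing_agent(lambda i: i >= 3, SPEC)
never_fixed = run_existing_agent(lambda i: False, SPEC)
print(len(fixed_on_third_try), len(never_fixed))
```

The existing agent is untouched; only the outer loop and the recorded trajectory are new, which is the sense in which closure is added on top of a harness.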
| Harness / Platform | Integration Guide | Target Framework |
|---|---|---|
| Claude Code | integrate/CLAUDE_CODE.md | Anthropic CLI agent |
| OpenAI Codex | integrate/CODEX.md | Codex code models |
| LangGraph | examples/integrate-langgraph/ | LangChain Graphs |
| CrewAI | examples/integrate-crewai/ | Role-playing Multi-agent swarms |
| Cursor | integrate/CURSOR.md | Cursor IDE Composer & Agent |
| OpenAI Agents SDK | integrate/OPENAI_AGENTS.md | OpenAI Swarm/Agents |
| Aider | integrate/AIDER.md | CLI git-integrated coding agent |
| Gemini CLI | integrate/GEMINI_CLI.md | Google Generative AI |
| Profile | Recommended Onboarding Path | Expected Time |
|---|---|---|
| The Theorist | Manifesto → Fundamentals | ~2 hours |
| The Builder | Golden Path v6 → `pip install le-loop-stack` → Integration Hub | ~15 min |
| The Practitioner | Loop Playground → Live Leaderboard | ~30 min |
| The Researcher | Paper Series → LoopNet v0.2 → Case Studies | ~1 day |
| The Architect | D-D-M-I-S Framework → LES scoring | ~2 hours |
| Path | Purpose | Key Artifacts |
|---|---|---|
| `manifesto/` | Founding Principles | The philosophy and paradigm of loop engineering |
| `fundamentals/` | Core Theory | 13-topic detailed theoretical foundation of self-improving systems |
| `taxonomy/` | Classification | Six-level loop classification taxonomy |
| `patterns/` | Design Patterns | 14 engineering patterns described as reusable LSS specs |
| `framework/` | Methodology | D-D-M-I-S procedural guide for building and deploying loops |
| `case-studies/` | Historical Evidence | Analyses of AlphaGo, Toyota TPS, GitHub PR engines, and coding loops |
| `loop-library/` | Spec Library | Production-grade reference loop YAML files |
| `loopforge/` | Creation Tools | Interactive scaffolding tools to map intents to LSS specs |
| `implementations/` | Code Examples | Minimal reference implementations in Python, LangGraph, and CrewAI |
| `research/` | Research Frontier | Active open problems, roadmaps, and paper series |
A preview of pre-declared loops available in loop-library/:
| Reference Spec | Level | Intent / Target Use Case |
|---|---|---|
| Research Agent | Level 2 | Literature review & multi-source synthesis |
| Coding Agent | Level 3 | Autonomous software feature implementation |
| Autonomous Debugger | Level 3 | Test-driven localized software repair |
| Code → Debug (nested) | Level 4 | Coding loop with nested recursive debugging |
| Scenario Swarm (parallel) | Level 4 | SWARM decision rehearsal: 3 parallel perspectives with a unified merged forecast |
| Startup Validator | Level 2 | PMF hypothesis verification and fast lean iterations |
Browse the Full Spec Library · Master Checklist · Next Steps
Unified tools to speed up loop design, execution, validation, and benchmarking.
| Tool | Purpose | Source / Usage |
|---|---|---|
| `loopctl` | Unified CLI tool | `tools/loopctl.py` · Validate, score, level, and diagram LSS specs |
| `loopforge` | Spec generator | `loopforge/` · Scaffold complete LSS YAML files from text-based intents |
| `loop_validator` | Schema validator | `tools/loop_validator.py` · Local LSS schema verification |
| `daily_checkin` | Automated reporter | `scripts/daily_checkin.py` · Continuous deployment checks |
| `loop_diagram_generator` | Visualizer | `tools/loop_diagram_generator.py` · Auto-generate clean Mermaid diagrams from LSS YAML |
We welcome contributions to LSS specs, new agent harnesses, case studies, benchmarks, and core tooling.
- Loop Playground: Create and test your first loop in the sandbox.
- Community Spotlight: Highlighted community loops and implementations.
- Reproduction Challenge: Replicate verified benchmark scores.
- Contributor Guidelines · Governance Model · Reproduction Manual
```bibtex
@misc{loop-engineering-2026,
  title={Loop Engineering: The Discipline of Self-Improving Systems},
  author={Loop Engineering Community},
  year={2026},
  url={https://github.com/KanakMalpani/Loop-Engineering}
}
```

Feedback is the fundamental unit of intelligence.
Loop Engineering makes it engineerable.
MIT License

