Stress-test constitutions with AI-powered agentic politicians before trying them out on a real nation!
constitution-sim is a research-grade multi-agent AI simulator. You give it
a constitution and a scenario; it spins up an LLM-powered agent for each
political role (Executive, Legislature, Judiciary, Media, Bureaucracy)
and lets them act under the rules you wrote, turn by turn. Every action
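is checked by a rules engine, every event is logged, and every run is
seeded: heuristic-mode runs are byte-for-byte reproducible, while LLM-mode
runs are reproducible only up to provider variance.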
Politicians are not utility-maximisers reading from a spec — they deliberate, bargain, posture, and reach for legitimacy. The interesting question is how the rules of a constitution shape that behaviour. So the agents here are LLMs (OpenAI / Anthropic) instructed with a role-specific persona, the constitution they live under, their own goals and utility weights, and a memory of their own recent decisions. They never get to mutate the world directly — every move passes the typed rules engine first.
A deterministic heuristic agent is still available as a no-LLM fallback, so the project also runs offline / in CI / with zero API keys.
- AI cognition is the default. When OPENAI_API_KEY or ANTHROPIC_API_KEY is in the environment, constitution-sim run uses LLM-powered agents out of the box. With no key, it falls back to a deterministic heuristic: same CLI, same outputs, no setup required.
- Role-specific personas. Each role (Executive, Legislature, Judiciary, Media, Bureaucracy) gets its own LLM system prompt. The Executive is ambitious; the Judiciary is reactive; the Media chases a narrative; the Bureaucracy implements steadily.
- Agent memory & shared history. Each agent remembers its own recent decisions and can see a public history of what other actors just did (if the constitution allows).
- Inter-agent deliberation. Each turn features a deliberation phase where agents can negotiate, threaten, or signal intent by sending messages to each other's inboxes.
- Schema-driven constitutions. Strict Pydantic v2 models; YAML in, typed objects out. Constitutions can enforce communication limits (e.g. authoritarian gag orders).
- Rules engine is source of truth. Agents propose typed actions; the engine accepts or rejects with a reason. The LLM cannot mutate state directly.
- Partial observability. Each role gets a state view filtered by its observation_limits.
- Institutional metrics. Power concentration, deadlock, trust volatility, legitimacy, corruption pressure, emergency-power drift.
- Repeated-run evaluation harness. Multi-seed runs with pandas / matplotlib output.
- Deterministic when seeded (heuristic mode is byte-for-byte reproducible; LLM mode is reproducible up to provider variance).
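To make the "rules engine is source of truth" point in the list above concrete, here is a minimal sketch of the propose-then-validate flow. It uses stdlib dataclasses instead of the project's Pydantic models so that it runs with no dependencies. `Action`, `Verdict`, `TinyRulesEngine`, `submit`, and the action names are hypothetical and are not taken from the codebase.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Action:
    """A typed action proposed by an agent (hypothetical shape)."""
    actor: str
    kind: str
    target: Optional[str] = None


@dataclass(frozen=True)
class Verdict:
    """The engine's decision plus a human-readable reason for the log."""
    accepted: bool
    reason: str


@dataclass
class TinyRulesEngine:
    """Toy stand-in for a rules engine; it alone owns and mutates state."""
    permissions: dict[str, set[str]]
    pending_bills: set[str] = field(default_factory=set)

    def submit(self, action: Action) -> Verdict:
        # 1. Permission check: may this role take this kind of action at all?
        if action.kind not in self.permissions.get(action.actor, set()):
            return Verdict(False, f"{action.actor} may not {action.kind}")
        # 2. State-level legality check: you cannot vote on a non-existent bill.
        if action.kind == "vote" and action.target not in self.pending_bills:
            return Verdict(False, f"no pending bill {action.target!r}")
        # 3. Only an accepted action is allowed to change state.
        if action.kind == "propose_bill" and action.target is not None:
            self.pending_bills.add(action.target)
        return Verdict(True, "ok")


engine = TinyRulesEngine(
    permissions={"Legislature": {"propose_bill", "vote"}, "Executive": {"veto"}}
)
# Rejected: the bill does not exist yet.
early_vote = engine.submit(Action("Legislature", "vote", "bill-1"))
# Accepted: the Legislature holds the propose_bill permission.
proposal = engine.submit(Action("Legislature", "propose_bill", "bill-1"))
# Rejected: the Executive lacks the propose_bill permission.
overreach = engine.submit(Action("Executive", "propose_bill", "bill-2"))
```

Because agents only ever hand an `Action` to `submit`, a rejected proposal leaves the state untouched and still yields a reason that can be written to the event log.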
- Python 3.10+ (target: 3.14)
- pydantic >= 2, PyYAML, pandas, matplotlib, seaborn
- For AI cognition: openai (and/or anthropic)
git clone https://github.com/arianXdev/constitution-sim.git
cd constitution-sim
pip install -e ".[dev,llm]" # core + tests + LLM SDKs (recommended)
# or, no-LLM-only install:
pip install -e ".[dev]"
This exposes a constitution-sim console entry point.
export OPENAI_API_KEY=sk-...
constitution-sim run \
--constitution constitutions/advanced_constitution.yaml \
--scenario constitutions/scenario.yaml \
--turns 20 --seed 42 \
--log /tmp/cs/events.jsonl \
--metrics-out /tmp/cs/metrics.csv
That's it. The default --agent-type auto notices the key, spins up
LLM-powered Executive / Legislature / Judiciary / Media / Bureaucracy
agents, and runs the simulation. You'll see a one-liner telling you
which provider was picked.
Want to force a provider explicitly?
constitution-sim run --agent-type openai --model gpt-4o-mini ...
constitution-sim run --agent-type anthropic --model claude-sonnet-4-5 ...
Want deterministic, no-API runs (for tests / reproducibility)?
constitution-sim run --agent-type heuristic ...
# 1. Validate a constitution YAML against the schema.
constitution-sim validate --constitution constitutions/advanced_constitution.yaml
# 2. Run a simulation (single seed or multi-seed evaluation).
constitution-sim run \
--constitution constitutions/advanced_constitution.yaml \
--scenario constitutions/scenario.yaml \
--turns 30 --runs 5 --seed 42 \
--log /tmp/cs/events.jsonl \
--metrics-out /tmp/cs/metrics.csv \
--plot-dir /tmp/cs/plots
# 3. Replay a recorded event log (structured summary, not re-execution).
constitution-sim replay --log /tmp/cs/eval_logs/run_0_events.jsonl --show-first 5
# 4. Compare two evaluations (e.g. two constitutions).
constitution-sim compare --a /tmp/cs/metrics_A.csv --b /tmp/cs/metrics_B.csv
For each turn, the LLM agent is prompted with:
- A role-specific persona (Executive / Legislature / …).
- The constitution's name, description, and the list of other roles.
- Its own declared goals and utility weights (from the YAML).
- A partial state view filtered by its observation_limits.
- Public political history: recent public actions taken by all actors.
- Inbox messages: any negotiation/signals received during the turn's deliberation phase.
- A short memory of its own recent decisions (and whether they were legal).
- The exact set of typed actions it's allowed to return.
It replies with one JSON object describing a single action. If the LLM returns malformed JSON or an action outside its permission set, the agent silently falls back to the deterministic heuristic policy — the simulator never breaks.
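The fallback behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the function name `choose_action`, the `"kind"` key, and the shape of the fallback callable are all hypothetical.

```python
import json
from typing import Callable


def choose_action(
    raw_reply: str,
    allowed_kinds: set[str],
    fallback: Callable[[], dict],
) -> dict:
    """Return the LLM's action if it is usable, else the heuristic's action."""
    try:
        action = json.loads(raw_reply)
    except json.JSONDecodeError:
        # Malformed JSON: defer to the deterministic policy.
        return fallback()
    # The reply must be a single JSON object, not a list, string, or number.
    if not isinstance(action, dict):
        return fallback()
    # The action kind must be a string inside the role's permission set.
    kind = action.get("kind")
    if not isinstance(kind, str) or kind not in allowed_kinds:
        return fallback()
    return action


def heuristic_policy() -> dict:
    """Hypothetical stand-in for the deterministic heuristic agent."""
    return {"kind": "wait"}


good = choose_action('{"kind": "veto"}', {"veto", "wait"}, heuristic_policy)
garbled = choose_action("Sure! Here is my move:", {"veto", "wait"}, heuristic_policy)
forbidden = choose_action('{"kind": "declare_emergency"}', {"veto", "wait"}, heuristic_policy)
```

In this sketch `good` is the parsed LLM action, while `garbled` and `forbidden` both come from `heuristic_policy`. Whatever this function returns would still go through the rules engine before it touches any state.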
src/constitution_sim/
  models/     Pydantic schemas: Constitution, Role, Rule, WorldState, actions
  core/       SimulationEngine, RulesEngine, Scheduler, EventLogger
  agents/     BaseAgent, DeterministicHeuristicAgent, LLMAgent, providers
  scenarios/  Shock model + ScenarioEngine
  analysis/   MetricsCollector, Evaluator, plot
  app/        CLI (validate / run / replay / compare)
constitutions/
  simple_constitution.yaml
  advanced_constitution.yaml
  strong_executive_constitution.yaml
  scenario.yaml
docs/
  architecture.md
  tutorial.md
tests/
pytest -q
All tests should pass. tests/test_determinism.py explicitly asserts
that two heuristic-mode runs with the same seed produce byte-identical
event logs. tests/test_llm_agent.py::test_live_openai_smoke runs a
real LLM round-trip when OPENAI_API_KEY is set, and is automatically
skipped otherwise.
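The kind of property that a determinism test checks can be illustrated with a toy stand-in. The sketch below is not the project's test or simulator: `run_toy_simulation` and `log_digest` are hypothetical names, and the event fields are invented for illustration. It only shows why a private seeded RNG plus stable serialisation yields byte-identical JSONL logs.

```python
import hashlib
import json
import random

ROLES = ["Executive", "Legislature", "Judiciary", "Media", "Bureaucracy"]


def run_toy_simulation(seed: int, turns: int) -> bytes:
    """Produce a JSONL event log as bytes from a seeded toy simulation."""
    # A private Random instance avoids depending on global RNG state.
    rng = random.Random(seed)
    lines = []
    for turn in range(turns):
        event = {"turn": turn, "actor": rng.choice(ROLES), "roll": rng.random()}
        # sort_keys keeps the serialised key order stable across runs.
        lines.append(json.dumps(event, sort_keys=True))
    return ("\n".join(lines) + "\n").encode("utf-8")


def log_digest(log: bytes) -> str:
    """Hash the raw bytes so two logs can be compared cheaply."""
    return hashlib.sha256(log).hexdigest()


first = run_toy_simulation(seed=42, turns=20)
second = run_toy_simulation(seed=42, turns=20)
same_seed_matches = log_digest(first) == log_digest(second)
```

Comparing raw bytes (or their digests) is stricter than comparing parsed events, since it also catches differences in key order or float formatting.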
Compare a balanced constitution against a strong-executive one (3 runs ×
12 turns, seed 11). The strong-executive YAML pushes power_concentration
from ~0.47 to ~0.92 and adds illegal-action attempts to the log: laws
written by one actor, judiciary unable to push back. That's the
framework working as intended — see docs/tutorial.md for a walkthrough.
- WorldState is the single canonical truth; agents only ever see a StateView.
- Every action attempt is recorded in the JSONL event log, including the rules-engine reason for any rejection.
- Role.observation_limits lets the constitution define what each role can see (e.g. the Bureaucracy doesn't see pending bills in advanced_constitution.yaml).
- Role.utility_weights drives heuristic voting and is surfaced to LLM agents in their prompt as part of the persona.
- RulesEngine does both permission checks AND state-level legality checks (you can't vote on a non-existent bill, you can't declare emergency powers if the constitution doesn't allow them).
See docs/architecture.md for the full design
and docs/tutorial.md for an end-to-end "use it
like I'm 10" walkthrough.
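As a rough picture of the WorldState / StateView split in the design notes above, the sketch below filters a plain dict into a per-role view. It assumes, purely for illustration, that observation limits are a set of hidden field names; the real schema for observation_limits may differ, and `make_state_view` and the field names are hypothetical.

```python
import copy
from typing import Any, Mapping


def make_state_view(
    world_state: Mapping[str, Any],
    hidden_fields: set[str],
) -> dict[str, Any]:
    """Return a filtered, independent copy of the state for one role."""
    return {
        # Deep-copy each value so mutating the view cannot touch the world.
        key: copy.deepcopy(value)
        for key, value in world_state.items()
        if key not in hidden_fields
    }


world_state = {
    "pending_bills": ["bill-1"],
    "public_trust": 0.6,
    "emergency_active": False,
}
# Assumed limit: the Bureaucracy does not see pending bills.
bureaucracy_view = make_state_view(world_state, hidden_fields={"pending_bills"})
```

Handing agents a copy rather than a reference is one simple way to guarantee that only the rules engine can change the canonical state.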
This is an MVP, not a finished research instrument. The following are explicit non-goals at this stage:
- Persistent economic/demographic simulation (state variables are scalars, not vector economies).
- Fine-tuned LLMs or RL self-play.