thunkpunks/Tracebench

Tracebench

Observability and evidence tooling for generative systems.

Tracebench helps developers collect, compare, replay, and validate traces using artifact signatures, clustering, and validation packets.

Observation before interpretation.

What Tracebench is

Tracebench is an open observability bench for generative systems.

It is designed for workflows where a system produces traces, artifacts, signatures, or run records and the user needs to ask:

  • What happened?
  • Can this behavior be compared across runs?
  • Does a pattern recur under replay or perturbation?
  • What evidence can be packaged for review?
  • What should not be claimed from the evidence?

Tracebench focuses on observable structure and evidence packaging.

What Tracebench is not

Tracebench does not infer meaning, intent, diagnosis, causality, governance status, or admissibility.

Tracebench does not issue receipts.

Validation packets are evidence bundles. Validation packets are not receipts.
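
To make the distinction concrete, here is a minimal sketch of what an evidence bundle could look like. It is not Tracebench's actual packet schema: the `ValidationPacket` class, its field names, and the `kind` value are all hypothetical, chosen only to show a packet carrying its observations alongside an explicit list of things it does not claim.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ValidationPacket:
    """Hypothetical evidence bundle (illustrative only, not the Tracebench schema)."""

    packet_id: str
    run_ids: tuple
    observations: tuple
    # Explicit non-claims, mirroring the list in "What Tracebench is not".
    non_claims: tuple = (
        "meaning",
        "intent",
        "diagnosis",
        "causality",
        "governance status",
        "admissibility",
    )
    # Assumed label: the packet describes itself as evidence, never as a receipt.
    kind: str = "evidence_bundle"

    def to_json(self) -> str:
        """Serialize the packet for human review."""
        return json.dumps(asdict(self), sort_keys=True, indent=2)


packet = ValidationPacket(
    packet_id="packet-001",
    run_ids=("run-1", "run-2"),
    observations=("run-1 and run-2 share the same event signature",),
)
print(packet.to_json())
```

Under these assumptions, the reviewer receives the observations together with the limits on what may be concluded from them.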

Core workflow

trace-like input
  -> artifact
  -> signature
  -> retrieval / comparison
  -> clustering
  -> validation packet
  -> human review

The public reference release uses synthetic fixtures and generic artifact examples. Domain adapters, production telemetry integrations, private corpora, and deployment-specific provenance mappings are intentionally outside this public release.
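
The first stages of the workflow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Tracebench API: the function names (`make_artifact`, `signature`, `jaccard`, `cluster_by_signature`), the trace shape, and the choice of SHA-256 and Jaccard similarity are all hypothetical stand-ins, and the traces are synthetic.

```python
import hashlib
import json
from collections import defaultdict


def make_artifact(trace: dict) -> dict:
    """Reduce a trace-like input to its observable fields (hypothetical shape)."""
    return {"run_id": trace["run_id"], "events": list(trace["events"])}


def signature(artifact: dict) -> str:
    """Hash the ordered event sequence; run_id is excluded so identical runs match."""
    payload = json.dumps(artifact["events"]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def jaccard(a: dict, b: dict) -> float:
    """Compare two artifacts by the overlap of their event sets."""
    events_a, events_b = set(a["events"]), set(b["events"])
    if not events_a and not events_b:
        return 1.0
    return len(events_a & events_b) / len(events_a | events_b)


def cluster_by_signature(artifacts: list) -> dict:
    """Group run ids whose artifacts share an exact signature."""
    clusters = defaultdict(list)
    for artifact in artifacts:
        clusters[signature(artifact)].append(artifact["run_id"])
    return dict(clusters)


# Synthetic traces, in the spirit of the public fixtures.
traces = [
    {"run_id": "run-1", "events": ["load", "generate", "emit"]},
    {"run_id": "run-2", "events": ["load", "generate", "emit"]},
    {"run_id": "run-3", "events": ["load", "retry", "generate", "emit"]},
]

artifacts = [make_artifact(t) for t in traces]
clusters = cluster_by_signature(artifacts)
similarity = jaccard(artifacts[0], artifacts[2])

print(sorted(clusters.values()))  # [['run-1', 'run-2'], ['run-3']]
print(similarity)                 # 0.75
```

The sketch reports only structural facts: which runs share a signature and how much two runs overlap. Why `run-3` retried is left to human review.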

Relationship to the Constitutional Runtime Substrate

Tracebench is an observational substrate.

The Constitutional Runtime Substrate is an admissibility substrate.

Tracebench asks:

What happened?

The Constitutional Runtime Substrate asks:

What may be committed?

Tracebench can be used independently. Teams that need reviewable commitment workflows may pair Tracebench with downstream admissibility systems such as the Constitutional Runtime Substrate.

Observation comes first. Commitment comes second. Neither replaces the other.

Install for development

python -m pip install -e .
python -m pytest -q

Expected result for this release:

19 passed

License

Apache License 2.0.
