[Engine] Built-in cache step type — replace 5-step cache pattern #81

Description

@ameet

Problem

Any flow that caches LLM or API results currently requires a 5-step pattern:

  1. buildCacheKey — deterministic key from inputs
  2. readCacheFile — file-read with onError: continue
  3. checkCache — evaluate hit/miss/expired (age-based TTL)
  4. buildCacheEntry — quality gate + serialize (after computation)
  5. writeCacheFile — persist to disk (guarded on truthy output)

This pattern must be implemented identically in every cached flow. In a codebase of 90+ flows, 12 flows use it, so 60 steps (12 × 5) exist only to handle caching.
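To make the duplicated logic concrete, here is a minimal Python sketch of steps 1 and 3 as a flow might hand-roll them today. It is illustrative only: the function names mirror the step ids above, but the `writtenAt` field, the SHA-256 hashing, and the millisecond TTL are assumptions, not the engine's actual cache format.

```python
import hashlib
import json
import time


def build_cache_key(inputs: dict) -> str:
    """Step 1: derive a deterministic key from the flow inputs."""
    # sort_keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_cache(entry, ttl_ms, now_ms=None):
    """Step 3: classify a cache read as 'hit', 'miss', or 'expired'."""
    # A failed or corrupt read (step 2 with onError: continue) yields no usable entry.
    if not isinstance(entry, dict) or "writtenAt" not in entry:
        return "miss"
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    age_ms = now_ms - entry["writtenAt"]  # hypothetical timestamp field
    return "hit" if age_ms <= ttl_ms else "expired"
```

Every cached flow carries its own copy of logic like this, which is where the drift described below comes from.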

Brittleness

The pattern has 4 known failure modes that teams build regression tests to catch:

  1. Step ordering bug: buildCacheEntry references $.steps.buildResult but is declared before it. If execution order changes, the cache writes stale data.
  2. Null guard missing: writeCacheFile must check that $.steps.buildCacheEntry.output is truthy. Without this guard, it writes null or undefined to the cache file.
  3. onError missing: readCacheFile, buildCacheEntry, and writeCacheFile all need onError: continue. Missing any one causes the flow to fail on cache corruption.
  4. TTL inconsistency: each flow hardcodes its own TTL calculation, so TTL logic can easily differ across flows.
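The regression tests mentioned above might look roughly like the following lint, sketched in Python. It covers failure modes 1 and 3 only, and it assumes that a flow is a list of step objects, that `onError` is a top-level field on each step, and that references appear as literal `$.steps.<id>` text. None of this is confirmed by the engine's schema.

```python
import json
import re

# Matches references such as $.steps.buildResult anywhere in a step definition.
STEP_REF = re.compile(r"\$\.steps\.([A-Za-z_][A-Za-z0-9_]*)")
NEEDS_ON_ERROR = {"readCacheFile", "buildCacheEntry", "writeCacheFile"}


def lint_cache_pattern(steps):
    """Return a list of human-readable problems found in a list of step dicts."""
    problems = []
    position = {step["id"]: i for i, step in enumerate(steps)}
    for i, step in enumerate(steps):
        # Failure mode 1: a step references a step that is declared later.
        for ref in STEP_REF.findall(json.dumps(step)):
            if ref in position and position[ref] > i:
                problems.append(f"{step['id']} references {ref} before it is declared")
        # Failure mode 3: cache I/O steps must tolerate errors.
        if step["id"] in NEEDS_ON_ERROR and step.get("onError") != "continue":
            problems.append(f"{step['id']} is missing onError: continue")
    return problems
```

A check like this has to be maintained alongside every cached flow, which is part of the cost the proposal below tries to remove.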

Proposal

Add a first-class cache step type:

{
  "id": "cached",
  "type": "cache",
  "cache": {
    "namespace": "company-research",
    "key": "{{$.input.company}}-{{$.input.stage}}",
    "ttl": "30d",
    "qualityGate": {
      "minLength": 100,
      "requiredFields": ["data", "score"]
    },
    "steps": [
      { "id": "doExpensiveWork", "type": "bash", "bash": { "command": "..." } },
      { "id": "buildResult", "type": "code", "code": { "source": "..." } }
    ]
  }
}

Semantics:

  • On cache hit (key exists, not expired, passes quality gate): skip inner steps, return cached data
  • On cache miss/expired: execute inner steps, validate output against quality gate, write to cache
  • Cache key is deterministic from the template expression
  • TTL supports human-readable durations: 3d, 30d, 1h
  • Quality gate is optional; if present, prevents caching bad results
  • All file I/O and error handling is managed by the engine
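One possible reading of these semantics, sketched in Python rather than as the engine's real implementation. The on-disk layout (`<cache_dir>/<namespace>/<key>.json`), the `writtenAt` field, the seconds-based TTL argument, and measuring `minLength` against the serialized JSON are all assumptions made for illustration. The sketch also assumes the rendered key is safe to use as a file name.

```python
import json
import time
from pathlib import Path


def passes_quality_gate(data, gate):
    """Return True when data satisfies the optional quality gate."""
    if not gate:
        return True
    if not isinstance(data, dict):
        return False
    if any(field not in data for field in gate.get("requiredFields", [])):
        return False
    # Assumption: minLength is measured on the serialized JSON text.
    return len(json.dumps(data)) >= gate.get("minLength", 0)


def run_cache_step(cache_dir, namespace, key, ttl_seconds, gate, run_inner_steps, now=None):
    """Return (data, status) where status is 'hit' or 'miss'."""
    now = time.time() if now is None else now
    path = Path(cache_dir) / namespace / f"{key}.json"
    try:
        entry = json.loads(path.read_text(encoding="utf-8"))
        fresh = now - entry["writtenAt"] <= ttl_seconds
        if fresh and passes_quality_gate(entry["data"], gate):
            return entry["data"], "hit"
    except (OSError, ValueError, KeyError, TypeError):
        pass  # Missing or corrupt cache file: treat as a miss.

    data = run_inner_steps()
    if passes_quality_gate(data, gate):
        try:
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(json.dumps({"writtenAt": now, "data": data}), encoding="utf-8")
        except OSError:
            pass  # A failed cache write must not fail the flow.
    return data, "miss"
```

In this sketch the null guard, the error handling, and the ordering of compute and write live in one place, so individual flows can no longer get them wrong.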

Benefits

  • 12 flows × 5 steps = 60 steps reduced to 12 cache blocks
  • Eliminates 4 known brittleness patterns
  • Standardizes TTL format (no more 30 * 24 * 60 * 60 * 1000 in code)
  • Quality gates become declarative, not imperative
  • Cache invalidation becomes a platform feature (one cache clear --namespace X)
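As an illustration of the standardized TTL format, a duration parser could be as small as the Python sketch below. Only the `d` and `h` units appear in this proposal. The `s` and `m` units, the function name, and the choice to return milliseconds are assumptions.

```python
import re

# Milliseconds per unit; "s" and "m" are assumed extensions of the proposal.
_UNIT_MS = {"s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
_TTL = re.compile(r"(\d+)([smhd])")


def parse_ttl_ms(ttl: str) -> int:
    """Convert a duration such as '30d' or '1h' into milliseconds."""
    match = _TTL.fullmatch(ttl.strip())
    if match is None:
        raise ValueError(f"unsupported TTL: {ttl!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_MS[unit]
```

With a helper like this, `"30d"` in the step config replaces the hand-written `30 * 24 * 60 * 60 * 1000` and every flow resolves durations the same way.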
