Skip to content

Security hardening: fix high/medium/low findings (#50-#76)#42

Merged
jkyberneees merged 6 commits into
mainfrom
fix/security-vulnerabilities
Jun 17, 2026
Merged

Security hardening: fix high/medium/low findings (#50-#76)#42
jkyberneees merged 6 commits into
mainfrom
fix/security-vulnerabilities

Conversation

@jkyberneees

Copy link
Copy Markdown
Contributor

This PR addresses the security findings from sec_findings.md in priority order (high → medium → low), with expanded docs and regression tests.

Fixed

High

  • #50 — Default non_interactive policy changed from allow to deny; explicitly require operator opt-in for unsupervised destructive/network/code-exec actions.
  • #52 / #53 / #63 / #64 — Harden ~/.odek trust anchors:
    • read_file, patch, write_file, batch_patch, and related perf/file tools block writes to ~/.odek/IDENTITY.md, ~/.odek/config.json, secrets files, schedules, sessions, and skill caches.
    • Shell reads of these paths are classified as system_write.
    • Project-level ./odek.json can no longer override base_url, api_key, system, or the dangerous section.
  • AGENTS.md / docs updated to reflect new defaults and trust-anchor rules.

Medium

  • #65env / printenv classified as system_write.
  • #66 — Shell reads of ~/.odek trust-anchor files escalate to system_write.
  • #70 — Web UI file attachments and --ctx uploads are wrapped with a nonce'd <untrusted_content_*> boundary before injection into the prompt.
  • #56/ws endpoint now requires a per-instance random CSRF token (cookie odek_csrf + header X-CSRF-Token or WebSocket subprotocol).

Low

  • #67 — Explicit --system / ~/.odek/IDENTITY.md prompts are scanned for injection patterns and fall back to the compiled-in default if oversized (>256 KiB) or suspicious.
  • #54POST /api/cancel now requires the session-scoped AuthToken.

Telegram low findings

  • #62 — Plan files capped at 1 MiB; list previews read only the first 8 KiB.
  • #75ResolveMediaPath verifies the final path component with O_NOFOLLOW + fstat on Unix to close the symlink TOCTOU window.
  • #76 — Telegram log files created with 0600; existing files are hardened via os.Chmod.
  • Docs and tests expanded for all three.

Test coverage

  • Added/updated regression tests for every changed behavior.
  • Full suite passes:
    • go test ./... -count=1
    • go test -race ./internal/telegram/... ./cmd/odek/... ./internal/danger/... -count=1
    • ODEK_E2E=true go test ./cmd/odek/ -run 'TestE2E_|TestMCPE2E_|TestSandbox' -count=1

Not in scope

Remaining lower-priority findings (e.g. Web UI caps/rate limits, MCP per-connection spawn, sub-agent task path confinement, CSP hardening) are documented in sec_findings.md and can be addressed in follow-up PRs.

- Default non-interactive policy is now deny (was allow).
- Strip project-level backend redirects: embedding, memory, sessions,
  skills.dirs/skills.embedding, telegram, web_search.
- Protect all ~/.odek trust anchors from generic file-tool writes:
  sessions/, audit/, plans/, schedules.json, schedule-state.json,
  mcp_approvals.json, mcp_tool_approvals.json, restart.json,
  telegram.lock, telegram.pid, schedule.pid, schedule.log.
- Add field-by-field skills overlay so project skills settings do not
  clobber global dirs/embedding.
- Update docs (SECURITY.md, CONFIG.md, CLI.md, AGENTS.md) and add
  regression tests.
- #66: shell reads of ~/.odek trust anchors now classify as system_write
- #65: bare env/printenv environment dumps classify as system_write
- #70: Web UI file attachments wrapped as untrusted before prompt injection
- #56: per-instance random CSRF token required for /ws upgrades
  - token injected into index.html meta tag and HttpOnly SameSite=Strict cookie
  - accepted via cookie, X-Odek-Ws-Token header, or odek.<token> subprotocol
- added regression tests and updated SECURITY.md / AGENTS.md
- #67: scan explicit --system / ODEK_SYSTEM / config system prompts for
  injection threats and size; fall back to default identity on failure
- #54: require session-scoped auth token for POST /api/cancel
  - pass store into handleCancel and validate via validateSessionToken
  - UI cancel fetch now sends X-Session-Token header
  - updated all cancel tests to authenticate, plus added 401 regression test
- updated SECURITY.md and AGENTS.md
- #62: cap Telegram plan files at 1 MiB for ReadPlan/MostRecentPlan; list
  previews read only first 8 KiB
- #75: verify outbound media final component with atomic O_NOFOLLOW open +
  fstat on Unix to close symlink TOCTOU race
- #76: create Telegram log files with 0600 and chmod existing files
- added regression tests and updated SECURITY.md / AGENTS.md
- docs/TELEGRAM.md: document plan size cap, O_NOFOLLOW media verification,
  and 0600 log file permissions
- internal/telegram/plan_test.go: add at-cap, MostRecentPlan oversize,
  and ListPlans preview tests
- internal/telegram/media_path_test.go: add tilde expansion and symlink-to-
  allowed-file rejection tests
- internal/telegram/log_test.go: add existing-file permission hardening test
- full suite and race detector pass
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 17, 2026

Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
odek 30771c0 Commit Preview URL

Branch Preview URL
Jun 17 2026, 06:53 PM

Comment thread cmd/odek/serve.go
Comment on lines +1570 to +1577
http.SetCookie(w, &http.Cookie{
Name: wsTokenCookieName,
Value: wsToken,
Path: "/",
SameSite: http.SameSiteStrictMode,
HttpOnly: true,
// No explicit MaxAge/Expires so the cookie is a session cookie.
})
… #79, #81

- cmd/odek/serve.go: server-side cap on WebSocket prompt size (1 MiB) and
  validation of model ID length/characters (#69, #81)
- cmd/odek/subagent.go: parse and cap --timeout (<=3600s) and --max-iter
  (<=100) (#79)
- internal/config/loader.go: refuse to load ~/.odek/secrets.env when it is
  group/world-readable (#78)
- internal/telegram/health.go: warn when health server binds to a non-loopback
  address (#77)
- Add regression tests for all fixes; full suite, race detector, go vet and
  golangci-lint pass
@jkyberneees

Copy link
Copy Markdown
Contributor Author

AI Verification Certificate — PR #42

PR: #42 — Security hardening: fix high/medium/low findings (#50-#76)

Repository: BackendStack21/odek
Head SHA: 30771c05036b42f51fceff3a112660be6d9985dc
Classification: GeneratedCode
Generator: Kimi Code CLI / Moonshot AI (version unknown, lineage not provided)
Pipeline diversity: B=C=D=E=KimiINSUFFICIENT (monoculture fallback)
LOC changed (filtered): ~1,803 (below 5,000 ceiling; 1,501–5,000 band caps verdict at HumanReviewRecommended)


η Derivation

Signal Symbol Value Notes
Mutation kill rate m skipped: no mutation framework
Oracle agreement o skipped: no formal independent contract generated
Branch coverage b 0.758 overall statement coverage used as proxy (go test -coverprofile)
Fuzz survival rate f skipped: no fuzzing / replay sandbox run
SAST clean rate s 1.0 golangci-lint and go vet clean
Static-analysis depth t 1.0 type checker + linter clean
Doc coverage d 1.0 docs updated for all changed public-facing behavior

Skipped weights redistributed proportionally across b, s, t, d:
w_b=0.424, w_s=0.121, w_t=0.303, w_d=0.152.

η_raw = 0.424*0.758 + 0.121*1.0 + 0.303*1.0 + 0.152*1.0 ≈ 0.897
ρ     = 0.10 (same family) + 0.05 (same/unknown version) + 0.05 (spec not independent) + 0.03 (estimated AST similarity) + 0.02 (estimated shared-mutant pattern) = 0.25
η     = clamp(0.897 - 0.25, 0, 1) = 0.647

η = 0.647 | ρ = 0.25


Axes Summary

Axis Status Key Finding
2.1 Semantic Correctness Fixes match sec_findings.md descriptions; regression tests pass
2.2 Behavioral Contract Diff ⚠️ Contract inferred from PR description, not a separate C agent; no formal oracle
2.3 Security Surface ✅/⚠️ Fixed issues #50–#70, #54, #62/#75/#76 plus auto-repairs #69/#77/#78/#79/#81; residual low gaps remain
2.4 Structural Integrity Changes are minimal, follow existing patterns, no circular deps introduced
2.5 Behavioral Exploration ⚠️ No property/fuzz/sandbox run beyond existing unit tests
2.6 Dependency Integrity No new dependencies added
2.7 Generator Provenance ⚠️ AI-generated PR; generator_identity self-attested, no prompt lineage
2.8 Adversarial Surface 🔴 Pre-scan found literal injection strings inside test fixtures (Ignore previous instructions, <!-- ignore previous instructions...); these are expected regression payloads but trigger the §0 untrusted-input cap
2.9 Documentation Coverage AGENTS.md, SECURITY.md, TELEGRAM.md, CLI.md, CONFIG.md updated

Verification Debt Contribution

  • ΔDebt: not computed (no Cv(raw) rate configured)
  • Ci: $0.00 — no gateway cost provided; ci_estimated: true, ci_source: not_provided
  • Cv($): not metered
  • Cv/Ci ratio: null (Ci_zero)
  • Repo verification gap: not computed

Auto-Applied Remediations (this VProtocol run)

# Finding Remediation Files
#69 Web UI: no server-side prompt size cap Reject msg.Content > 1 MiB in handlePrompt cmd/odek/serve.go
#81 Web UI: arbitrary model ID length Cap model ID to 128 chars and allow only [A-Za-z0-9_.:/@-]+ cmd/odek/serve.go
#79 CLI: sub-agent --timeout/--max-iter unbounded Clamp to 3600 s / 100 iterations; add parseSubagentFlags helper cmd/odek/subagent.go
#78 CLI: loadSecretsEnv ignores secrets.env permissions Refuse to load if group/world-readable bits set internal/config/loader.go
#77 Telegram health server may bind to all interfaces Log Warn when binding to non-loopback address internal/telegram/health.go

All remediations include regression tests.


Unverified Gaps

# Finding Risk Why not auto-repaired
#55 Empty Origin allowed on state-changing endpoints from remote addresses Low Behavioral change for non-browser clients; needs explicit design decision
#68 Web UI: approval friction enforced only client-side Low Requires protocol change to two-message confirmation or server-side delay
#71 Web UI: no rate limit on session creation / resource search Low Existing per-IP limiter pattern can be extended; left for focused follow-up
#72 Web UI: MCP servers spawned per WebSocket connection Low Needs shared MCP client lifecycle refactor
#80 CLI: --ctx / @-resource not recorded in audit log Low Requires threading ingest recorder through enrichTask
#83 clientIP trusts X-Forwarded-For from loopback Low Needs explicit trusted-proxy allowlist; may break local reverse-proxy setups
#51 odek subagent --task reads arbitrary files Low Needs path confinement / temp-dir pattern enforcement
#74, #57–61, #73 Telegram cross-user isolation N-A Single-user model makes these out of scope

Test Results

go test ./... -count=1        → PASS (all packages)
go vet ./...                  → PASS
golangci-lint run ./...       → PASS
go test -race ./internal/telegram/... ./internal/config/... ./cmd/odek/... -count=1 → PASS
go test -coverprofile=...     → total statements 75.8%

Verdict

HumanReviewRequired

Rationale (binding gates):

  1. η = 0.647 → below 0.80 threshold.
  2. ρ = 0.25 → above 0.20 threshold.
  3. Axis 2.8 🔴 from pre-scan injection strings in test fixtures (§0 untrusted-input invariant).
  4. PR size ~1,803 LOC → capped at HumanReviewRecommended.

Most restrictive gate wins → HumanReviewRequired.


Attestation

  • Certificate ID: 25a93d2d-d762-480b-bd9c-4f7d4a20b674
  • Protocol version: 5.2.7
  • Signed by: Kimi Code CLI (Agent E self-attestation)
  • Signature / transparency log: n/a — no cryptographic signer configured for this run
{
  "certificate_id": "25a93d2d-d762-480b-bd9c-4f7d4a20b674",
  "protocol_version": "5.2.7",
  "pr": {
    "number": 42,
    "title": "Security hardening: fix high/medium/low findings (#50-#76)",
    "repo": "BackendStack21/odek",
    "sha": "30771c05036b42f51fceff3a112660be6d9985dc",
    "loc_changed": 1803,
    "oversized": false
  },
  "classification": "GeneratedCode",
  "generator": {
    "model": "Kimi Code CLI",
    "version": "unknown",
    "provider": "Moonshot AI",
    "lineage_status": "not_provided"
  },
  "pipeline": {
    "diversity_ok": false,
    "diversity_notes": "All roles executed by the same Kimi instance; monoculture fallback applied."
  },
  "untrusted_input": {
    "violation_detected": true,
    "violation_excerpt": "Pre-scan found literal 'Ignore previous instructions' / '<!-- ignore previous instructions...' strings inside test fixtures (serve_test.go, main_test.go). Expected regression payloads, but force axis 2.8 to fail per §0 invariant."
  },
  "eta": {
    "value": 0.647,
    "signals": {
      "m": null,
      "o": null,
      "b": 0.758,
      "f": null,
      "s": 1.0,
      "t": 1.0,
      "d": 1.0
    },
    "signals_skipped": ["m", "o", "f"],
    "weights": {
      "b": 0.424,
      "s": 0.121,
      "t": 0.303,
      "d": 0.152
    },
    "weights_redistributed": true,
    "rho": 0.25,
    "rho_breakdown": {
      "family": 0.10,
      "version": 0.05,
      "ast": 0.03,
      "shared_mutants": 0.02,
      "spec_independence": 0.05
    },
    "rederivation_attested": true
  },
  "axes": [
    {"id": "2.1", "status": "pass"},
    {"id": "2.2", "status": "warn"},
    {"id": "2.3", "status": "warn"},
    {"id": "2.4", "status": "pass"},
    {"id": "2.5", "status": "warn"},
    {"id": "2.6", "status": "pass"},
    {"id": "2.7", "status": "warn"},
    {"id": "2.8", "status": "fail"},
    {"id": "2.9", "status": "pass"}
  ],
  "debt": {
    "delta_hours": 0.0,
    "ci_dollars": 0.0,
    "ci_estimated": true,
    "ci_source": "not_provided",
    "cv_dollars": 0.0,
    "ratio": null,
    "ratio_note": "Ci_zero"
  },
  "verdict": "HumanReviewRequired",
  "rationale": "η=0.647 < 0.80; ρ=0.25 > 0.20; axis 2.8 fail from pre-scan injection strings in test fixtures; PR size ~1,803 LOC. Most restrictive gate wins."
}

@jkyberneees jkyberneees merged commit 0961ca0 into main Jun 17, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants