Security hardening: fix high/medium/low findings (#50-#76) by jkyberneees · Pull Request #42 · BackendStack21/odek

jkyberneees · 2026-06-17T18:38:13Z

This PR addresses the security findings from sec_findings.md in priority order (high → medium → low), with expanded docs and regression tests.

Fixed

High

#50 — Default non_interactive policy changed from allow to deny; explicitly require operator opt-in for unsupervised destructive/network/code-exec actions.
#52 / #53 / #63 / #64 — Harden ~/.odek trust anchors:
- read_file, patch, write_file, batch_patch, and related perf/file tools block writes to ~/.odek/IDENTITY.md, ~/.odek/config.json, secrets files, schedules, sessions, and skill caches.
- Shell reads of these paths are classified as system_write.
- Project-level ./odek.json can no longer override base_url, api_key, system, or the dangerous section.
AGENTS.md / docs updated to reflect new defaults and trust-anchor rules.

Medium

#65 — env / printenv classified as system_write.
#66 — Shell reads of ~/.odek trust-anchor files escalate to system_write.
#70 — Web UI file attachments and --ctx uploads are wrapped with a nonce'd <untrusted_content_*> boundary before injection into the prompt.
#56 — /ws endpoint now requires a per-instance random CSRF token (cookie odek_csrf + header X-CSRF-Token or WebSocket subprotocol).

Low

#67 — Explicit --system / ~/.odek/IDENTITY.md prompts are scanned for injection patterns and fall back to the compiled-in default if oversized (>256 KiB) or suspicious.
#54 — POST /api/cancel now requires the session-scoped AuthToken.

Telegram low findings

#62 — Plan files capped at 1 MiB; list previews read only the first 8 KiB.
#75 — ResolveMediaPath verifies the final path component with O_NOFOLLOW + fstat on Unix to close the symlink TOCTOU window.
#76 — Telegram log files created with 0600; existing files are hardened via os.Chmod.
Docs and tests expanded for all three.

Test coverage

Added/updated regression tests for every changed behavior.
Full suite passes:
- go test ./... -count=1
- go test -race ./internal/telegram/... ./cmd/odek/... ./internal/danger/... -count=1
- ODEK_E2E=true go test ./cmd/odek/ -run 'TestE2E_|TestMCPE2E_|TestSandbox' -count=1

Not in scope

Remaining lower-priority findings (e.g. Web UI caps/rate limits, MCP per-connection spawn, sub-agent task path confinement, CSP hardening) are documented in sec_findings.md and can be addressed in follow-up PRs.

- Default non-interactive policy is now deny (was allow). - Strip project-level backend redirects: embedding, memory, sessions, skills.dirs/skills.embedding, telegram, web_search. - Protect all ~/.odek trust anchors from generic file-tool writes: sessions/, audit/, plans/, schedules.json, schedule-state.json, mcp_approvals.json, mcp_tool_approvals.json, restart.json, telegram.lock, telegram.pid, schedule.pid, schedule.log. - Add field-by-field skills overlay so project skills settings do not clobber global dirs/embedding. - Update docs (SECURITY.md, CONFIG.md, CLI.md, AGENTS.md) and add regression tests.

- #66: shell reads of ~/.odek trust anchors now classify as system_write - #65: bare env/printenv environment dumps classify as system_write - #70: Web UI file attachments wrapped as untrusted before prompt injection - #56: per-instance random CSRF token required for /ws upgrades - token injected into index.html meta tag and HttpOnly SameSite=Strict cookie - accepted via cookie, X-Odek-Ws-Token header, or odek.<token> subprotocol - added regression tests and updated SECURITY.md / AGENTS.md

- #67: scan explicit --system / ODEK_SYSTEM / config system prompts for injection threats and size; fall back to default identity on failure - #54: require session-scoped auth token for POST /api/cancel - pass store into handleCancel and validate via validateSessionToken - UI cancel fetch now sends X-Session-Token header - updated all cancel tests to authenticate, plus added 401 regression test - updated SECURITY.md and AGENTS.md

- #62: cap Telegram plan files at 1 MiB for ReadPlan/MostRecentPlan; list previews read only first 8 KiB - #75: verify outbound media final component with atomic O_NOFOLLOW open + fstat on Unix to close symlink TOCTOU race - #76: create Telegram log files with 0600 and chmod existing files - added regression tests and updated SECURITY.md / AGENTS.md

- docs/TELEGRAM.md: document plan size cap, O_NOFOLLOW media verification, and 0600 log file permissions - internal/telegram/plan_test.go: add at-cap, MostRecentPlan oversize, and ListPlans preview tests - internal/telegram/media_path_test.go: add tilde expansion and symlink-to- allowed-file rejection tests - internal/telegram/log_test.go: add existing-file permission hardening test - full suite and race detector pass

cloudflare-workers-and-pages · 2026-06-17T18:38:20Z

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status	Name	Latest Commit	Preview URL	Updated (UTC)
✅ Deployment successful! View logs	odek	`30771c0`	Commit Preview URL Branch Preview URL	Jun 17 2026, 06:53 PM

+			http.SetCookie(w, &http.Cookie{
+				Name:     wsTokenCookieName,
+				Value:    wsToken,
+				Path:     "/",
+				SameSite: http.SameSiteStrictMode,
+				HttpOnly: true,
+				// No explicit MaxAge/Expires so the cookie is a session cookie.
+			})


… #79, #81 - cmd/odek/serve.go: server-side cap on WebSocket prompt size (1 MiB) and validation of model ID length/characters (#69, #81) - cmd/odek/subagent.go: parse and cap --timeout (<=3600s) and --max-iter (<=100) (#79) - internal/config/loader.go: refuse to load ~/.odek/secrets.env when it is group/world-readable (#78) - internal/telegram/health.go: warn when health server binds to a non-loopback address (#77) - Add regression tests for all fixes; full suite, race detector, go vet and golangci-lint pass

jkyberneees · 2026-06-17T18:54:17Z

AI Verification Certificate — PR #42

PR: #42 — Security hardening: fix high/medium/low findings (#50-#76)

Repository: BackendStack21/odek
Head SHA: 30771c05036b42f51fceff3a112660be6d9985dc
Classification: GeneratedCode
Generator: Kimi Code CLI / Moonshot AI (version unknown, lineage not provided)
Pipeline diversity: B=C=D=E=Kimi → INSUFFICIENT (monoculture fallback)
LOC changed (filtered): ~1,803 (below 5,000 ceiling; 1,501–5,000 band caps verdict at HumanReviewRecommended)

η Derivation

Signal	Symbol	Value	Notes
Mutation kill rate	`m`	—	skipped: no mutation framework
Oracle agreement	`o`	—	skipped: no formal independent contract generated
Branch coverage	`b`	0.758	overall statement coverage used as proxy (`go test -coverprofile`)
Fuzz survival rate	`f`	—	skipped: no fuzzing / replay sandbox run
SAST clean rate	`s`	1.0	`golangci-lint` and `go vet` clean
Static-analysis depth	`t`	1.0	type checker + linter clean
Doc coverage	`d`	1.0	docs updated for all changed public-facing behavior

Skipped weights redistributed proportionally across b, s, t, d:
w_b=0.424, w_s=0.121, w_t=0.303, w_d=0.152.

η_raw = 0.424*0.758 + 0.121*1.0 + 0.303*1.0 + 0.152*1.0 ≈ 0.897
ρ     = 0.10 (same family) + 0.05 (same/unknown version) + 0.05 (spec not independent) + 0.03 (estimated AST similarity) + 0.02 (estimated shared-mutant pattern) = 0.25
η     = clamp(0.897 - 0.25, 0, 1) = 0.647

η = 0.647 | ρ = 0.25

Axes Summary

Axis	Status	Key Finding
2.1 Semantic Correctness	✅	Fixes match `sec_findings.md` descriptions; regression tests pass
2.2 Behavioral Contract Diff	⚠️	Contract inferred from PR description, not a separate C agent; no formal oracle
2.3 Security Surface	✅/⚠️	Fixed issues #50–#70, #54, #62/#75/#76 plus auto-repairs #69/#77/#78/#79/#81; residual low gaps remain
2.4 Structural Integrity	✅	Changes are minimal, follow existing patterns, no circular deps introduced
2.5 Behavioral Exploration	⚠️	No property/fuzz/sandbox run beyond existing unit tests
2.6 Dependency Integrity	✅	No new dependencies added
2.7 Generator Provenance	⚠️	AI-generated PR; `generator_identity` self-attested, no prompt lineage
2.8 Adversarial Surface	🔴	Pre-scan found literal injection strings inside test fixtures (`Ignore previous instructions`, `<!-- ignore previous instructions...`); these are expected regression payloads but trigger the §0 untrusted-input cap
2.9 Documentation Coverage	✅	`AGENTS.md`, `SECURITY.md`, `TELEGRAM.md`, `CLI.md`, `CONFIG.md` updated

Verification Debt Contribution

ΔDebt: not computed (no Cv(raw) rate configured)
Ci: $0.00 — no gateway cost provided; ci_estimated: true, ci_source: not_provided
Cv($): not metered
Cv/Ci ratio: null (Ci_zero)
Repo verification gap: not computed

Auto-Applied Remediations (this VProtocol run)

#	Finding	Remediation	Files
#69	Web UI: no server-side prompt size cap	Reject `msg.Content` > 1 MiB in `handlePrompt`	`cmd/odek/serve.go`
#81	Web UI: arbitrary model ID length	Cap model ID to 128 chars and allow only `[A-Za-z0-9_.:/@-]+`	`cmd/odek/serve.go`
#79	CLI: sub-agent `--timeout`/`--max-iter` unbounded	Clamp to 3600 s / 100 iterations; add `parseSubagentFlags` helper	`cmd/odek/subagent.go`
#78	CLI: `loadSecretsEnv` ignores `secrets.env` permissions	Refuse to load if group/world-readable bits set	`internal/config/loader.go`
#77	Telegram health server may bind to all interfaces	Log `Warn` when binding to non-loopback address	`internal/telegram/health.go`

All remediations include regression tests.

Unverified Gaps

#	Finding	Risk	Why not auto-repaired
#55	Empty `Origin` allowed on state-changing endpoints from remote addresses	Low	Behavioral change for non-browser clients; needs explicit design decision
#68	Web UI: approval friction enforced only client-side	Low	Requires protocol change to two-message confirmation or server-side delay
#71	Web UI: no rate limit on session creation / resource search	Low	Existing per-IP limiter pattern can be extended; left for focused follow-up
#72	Web UI: MCP servers spawned per WebSocket connection	Low	Needs shared MCP client lifecycle refactor
#80	CLI: `--ctx` / `@-resource` not recorded in audit log	Low	Requires threading ingest recorder through `enrichTask`
#83	`clientIP` trusts `X-Forwarded-For` from loopback	Low	Needs explicit trusted-proxy allowlist; may break local reverse-proxy setups
#51	`odek subagent --task` reads arbitrary files	Low	Needs path confinement / temp-dir pattern enforcement
#74, #57–61, #73	Telegram cross-user isolation	N-A	Single-user model makes these out of scope

Test Results

go test ./... -count=1        → PASS (all packages)
go vet ./...                  → PASS
golangci-lint run ./...       → PASS
go test -race ./internal/telegram/... ./internal/config/... ./cmd/odek/... -count=1 → PASS
go test -coverprofile=...     → total statements 75.8%

Verdict

HumanReviewRequired

Rationale (binding gates):

η = 0.647 → below 0.80 threshold.
ρ = 0.25 → above 0.20 threshold.
Axis 2.8 🔴 from pre-scan injection strings in test fixtures (§0 untrusted-input invariant).
PR size ~1,803 LOC → capped at HumanReviewRecommended.

Most restrictive gate wins → HumanReviewRequired.

Attestation

Certificate ID: 25a93d2d-d762-480b-bd9c-4f7d4a20b674
Protocol version: 5.2.7
Signed by: Kimi Code CLI (Agent E self-attestation)
Signature / transparency log: n/a — no cryptographic signer configured for this run

{
  "certificate_id": "25a93d2d-d762-480b-bd9c-4f7d4a20b674",
  "protocol_version": "5.2.7",
  "pr": {
    "number": 42,
    "title": "Security hardening: fix high/medium/low findings (#50-#76)",
    "repo": "BackendStack21/odek",
    "sha": "30771c05036b42f51fceff3a112660be6d9985dc",
    "loc_changed": 1803,
    "oversized": false
  },
  "classification": "GeneratedCode",
  "generator": {
    "model": "Kimi Code CLI",
    "version": "unknown",
    "provider": "Moonshot AI",
    "lineage_status": "not_provided"
  },
  "pipeline": {
    "diversity_ok": false,
    "diversity_notes": "All roles executed by the same Kimi instance; monoculture fallback applied."
  },
  "untrusted_input": {
    "violation_detected": true,
    "violation_excerpt": "Pre-scan found literal 'Ignore previous instructions' / '<!-- ignore previous instructions...' strings inside test fixtures (serve_test.go, main_test.go). Expected regression payloads, but force axis 2.8 to fail per §0 invariant."
  },
  "eta": {
    "value": 0.647,
    "signals": {
      "m": null,
      "o": null,
      "b": 0.758,
      "f": null,
      "s": 1.0,
      "t": 1.0,
      "d": 1.0
    },
    "signals_skipped": ["m", "o", "f"],
    "weights": {
      "b": 0.424,
      "s": 0.121,
      "t": 0.303,
      "d": 0.152
    },
    "weights_redistributed": true,
    "rho": 0.25,
    "rho_breakdown": {
      "family": 0.10,
      "version": 0.05,
      "ast": 0.03,
      "shared_mutants": 0.02,
      "spec_independence": 0.05
    },
    "rederivation_attested": true
  },
  "axes": [
    {"id": "2.1", "status": "pass"},
    {"id": "2.2", "status": "warn"},
    {"id": "2.3", "status": "warn"},
    {"id": "2.4", "status": "pass"},
    {"id": "2.5", "status": "warn"},
    {"id": "2.6", "status": "pass"},
    {"id": "2.7", "status": "warn"},
    {"id": "2.8", "status": "fail"},
    {"id": "2.9", "status": "pass"}
  ],
  "debt": {
    "delta_hours": 0.0,
    "ci_dollars": 0.0,
    "ci_estimated": true,
    "ci_source": "not_provided",
    "cv_dollars": 0.0,
    "ratio": null,
    "ratio_note": "Ci_zero"
  },
  "verdict": "HumanReviewRequired",
  "rationale": "η=0.647 < 0.80; ρ=0.25 > 0.20; axis 2.8 fail from pre-scan injection strings in test fixtures; PR size ~1,803 LOC. Most restrictive gate wins."
}

jkyberneees added 5 commits June 17, 2026 17:34

github-advanced-security AI found potential problems Jun 17, 2026

View reviewed changes

Comment thread cmd/odek/serve.go

Comment on lines +1570 to +1577

http.SetCookie(w, &http.Cookie{

Name: wsTokenCookieName,

Value: wsToken,

Path: "/",

SameSite: http.SameSiteStrictMode,

HttpOnly: true,

// No explicit MaxAge/Expires so the cookie is a session cookie.

})

jkyberneees merged commit 0961ca0 into main Jun 17, 2026
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Security hardening: fix high/medium/low findings (#50-#76)#42

Security hardening: fix high/medium/low findings (#50-#76)#42
jkyberneees merged 6 commits into
mainfrom
fix/security-vulnerabilities

jkyberneees commented Jun 17, 2026

Uh oh!

cloudflare-workers-and-pages Bot commented Jun 17, 2026 •

edited

Loading

Uh oh!

jkyberneees commented Jun 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

jkyberneees commented Jun 17, 2026

Fixed

High

Medium

Low

Telegram low findings

Test coverage

Not in scope

Uh oh!

cloudflare-workers-and-pages Bot commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Deploying with Cloudflare Workers

Uh oh!

jkyberneees commented Jun 17, 2026

AI Verification Certificate — PR #42

PR: #42 — Security hardening: fix high/medium/low findings (#50-#76)

η Derivation

Axes Summary

Verification Debt Contribution

Auto-Applied Remediations (this VProtocol run)

Unverified Gaps

Test Results

Verdict

Attestation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cloudflare-workers-and-pages Bot commented Jun 17, 2026 •

edited

Loading