Follow-up from PR #216 (SEC-006 / #120) round 4 review.
The orchestrator's /metrics endpoint is now gated by ORCHESTRATOR_SECRET + orchestratorAuthMiddleware (which accepts both X-Orchestrator-Secret and Authorization: Bearer headers). The Prometheus scrape config (deployments/prometheus/prometheus.yml) has commented-out authorization: blocks ready to enable.
What's missing for a production rollout:
- K8s Secret manifest for orchestrator-secret (not yet wired into any K8s manifest today; ORCHESTRATOR_SECRET is read from env var, but the Secret doesn't exist in deployments/k8s/).
- Prometheus pod volumeMount to mount the secret at /etc/prometheus/secrets/orchestrator-secret so Prometheus can use the credentials_file stanza.
- Deployment env-var wiring to inject ORCHESTRATOR_SECRET from the K8s Secret into the orchestrator pod.
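As a starting point, the three missing pieces listed above might look roughly like the sketch below. This is not taken from the repo: the Secret key name (secret), the container names, and the absence of a namespace are all assumptions, and the two Deployment documents are fragments to merge into the existing manifests rather than complete resources.

```yaml
# 1. Secret manifest (hypothetical; would live under deployments/k8s/).
apiVersion: v1
kind: Secret
metadata:
  name: orchestrator-secret
type: Opaque
stringData:
  # Placeholder only. Do not commit the real value.
  secret: "<replace-me>"
---
# 2. Fragment for the Prometheus Deployment pod spec (container name assumed).
# Projects the Secret key as the file /etc/prometheus/secrets/orchestrator-secret.
spec:
  template:
    spec:
      containers:
        - name: prometheus
          volumeMounts:
            - name: orchestrator-secret
              mountPath: /etc/prometheus/secrets
              readOnly: true
      volumes:
        - name: orchestrator-secret
          secret:
            secretName: orchestrator-secret
            items:
              - key: secret
                path: orchestrator-secret
---
# 3. Fragment for the orchestrator Deployment pod spec (container name assumed).
# Per the rollout order, apply this one last.
spec:
  template:
    spec:
      containers:
        - name: orchestrator
          env:
            - name: ORCHESTRATOR_SECRET
              valueFrom:
                secretKeyRef:
                  name: orchestrator-secret
                  key: secret
```

Using one Secret for both consumers keeps the value Prometheus sends and the value the orchestrator expects from drifting apart.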
Rollout order (documented in prometheus.yml):
1. Mount the secret into the Prometheus pod
2. Uncomment the authorization block in the orchestrator scrape job
3. Roll out the new prometheus-config ConfigMap and restart Prometheus
4. ONLY THEN set ORCHESTRATOR_SECRET on the orchestrator deployment
If step 4 happens before steps 1-3, every Prometheus scrape returns 401 and the target silently goes DOWN in Grafana — no agent counts, queue depths, or error rates.
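For the step that uncomments the authorization block, the enabled scrape job would presumably resemble the sketch below. The job name and target address are placeholders, not values copied from deployments/prometheus/prometheus.yml; only the credentials_file path and the /metrics endpoint come from this issue.

```yaml
scrape_configs:
  - job_name: orchestrator               # assumed job name
    metrics_path: /metrics
    authorization:
      type: Bearer                       # sent as "Authorization: Bearer <file contents>"
      credentials_file: /etc/prometheus/secrets/orchestrator-secret
    static_configs:
      - targets: ["orchestrator:8080"]   # hypothetical target address
```

Since orchestratorAuthMiddleware accepts Authorization: Bearer, this form should work without any custom header configuration on the Prometheus side.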
References
- deployments/prometheus/prometheus.yml (rollout-sequence comment at top)
- cmd/orchestrator/http.go (orchestratorAuthMiddleware)