Custom background tasks: observability #799

@2chanhaeng

Third sub-issue of #206.

Background

The core API sub-issue implements task dispatch behavior and logging, with the worker inheriting the generic queue span. This sub-issue layers task-specific telemetry on top, reusing the queue-task metric pattern introduced in #759 and mirroring the existing http_signatures.failure_reason enum pattern in metrics.ts. Splitting it out keeps the core PR small and gives the telemetry surface (span name, attributes, bounded enum, cardinality argument) its own reviewable boundary.

Public surface

Metrics

  • Extend QueueTaskRole from "fanout" | "outbox" | "inbox" to include "task". The type is exported and consumed by OTel attribute builders, so every call site that switches on the role must be reviewed.
  • Add taskName?: string to QueueTaskCommonAttributes, emitted as the fedify.task.name attribute in buildQueueTaskAttributes.
  • Add a bounded enum QueueTaskFailureReason = "deserialization" | "validation" | "unknown_task" | "handler", mirroring HttpSignatureMetricFailureReason. This initial four-value set may be refined in later revisions, as long as it stays a small bounded set.
  • Extend recordQueueTaskOutcome with an optional trailing parameter failureReason?: QueueTaskFailureReason (non-breaking). Emit fedify.task.failure_reason on the failure counter and duration histogram only when failureReason is set and result === "failed".
  • recordQueueTaskEnqueued records role: "task" at the enqueue site.
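As a rough illustration of the attribute rules in the list above, the sketch below adds the two task-specific attributes to an already-built attribute bag. It is not the actual metrics.ts code: withTaskAttributes is a hypothetical helper, the Attributes alias is a stand-in for the OTel attribute type, and result is typed as a plain string because "failed" is the only result value this issue names.

```typescript
// Type names come from this issue; their real definitions in metrics.ts may differ.
type QueueTaskRole = "fanout" | "outbox" | "inbox" | "task";
type QueueTaskFailureReason =
  | "deserialization"
  | "validation"
  | "unknown_task"
  | "handler";

// Stand-in for the OTel attribute bag (assumption).
type Attributes = Record<string, string | number>;

// Hypothetical helper: layers the task-specific attributes on top of
// whatever attributes the existing builder has already produced.
function withTaskAttributes(
  base: Attributes,
  result: string,
  taskName?: string,
  failureReason?: QueueTaskFailureReason,
): Attributes {
  const attrs: Attributes = { ...base };
  if (taskName != null) {
    attrs["fedify.task.name"] = taskName;
  }
  // The reason is emitted only when it is set AND the outcome is a failure.
  if (failureReason != null && result === "failed") {
    attrs["fedify.task.failure_reason"] = failureReason;
  }
  return attrs;
}

// The widened role union accepts the new member.
const role: QueueTaskRole = "task";
```

Keeping failureReason optional and trailing means existing callers that pass no reason keep compiling and keep emitting the same attributes as before.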

The four failure_reason values map to the core dispatch decision points: "deserialization" (a parse failure on the wire string), "validation" (a schema rejection of the parsed value), "unknown_task" (an unregistered taskName), "handler" (a throw inside the handler).
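The mapping from decision points to reasons could look roughly like the sketch below. Everything except the four reason strings is an assumption made for illustration: the JSON wire format with taskName and payload fields, the TaskDefinition shape, the synchronous handler, and the order in which the registry lookup and the validation run. The real dispatch logic lives in the core sub-issue.

```typescript
type QueueTaskFailureReason =
  | "deserialization"
  | "validation"
  | "unknown_task"
  | "handler";

// Hypothetical registry entry: a schema check plus a handler.
interface TaskDefinition {
  validate(payload: unknown): boolean;
  handle(payload: unknown): void; // real handlers are likely async
}

// Returns the failure reason, or undefined when the task ran successfully.
function classifyTaskRun(
  wire: string,
  registry: Map<string, TaskDefinition>,
): QueueTaskFailureReason | undefined {
  let message: unknown;
  try {
    message = JSON.parse(wire); // assumed wire format: JSON
  } catch {
    return "deserialization"; // parse failure on the wire string
  }
  if (typeof message !== "object" || message === null) {
    return "deserialization";
  }
  const { taskName, payload } = message as {
    taskName?: unknown;
    payload?: unknown;
  };
  const task = typeof taskName === "string"
    ? registry.get(taskName)
    : undefined;
  if (task == null) return "unknown_task"; // unregistered taskName
  if (!task.validate(payload)) return "validation"; // schema rejection
  try {
    task.handle(payload);
  } catch {
    return "handler"; // throw inside the handler
  }
  return undefined;
}
```

Because every exit path returns one of four literals, the value can be passed straight to the metric and span attributes without any free-form error text leaking into telemetry.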

Spans

  • Span name fedify.task. It is namespaced under fedify. rather than activitypub. because tasks are not part of ActivityPub; otherwise it parallels activitypub.inbox, activitypub.outbox, and activitypub.fanout.
  • Attributes: fedify.task.name, fedify.task.attempt, and fedify.task.failure_reason on terminal failures.
  • Parent context propagates from the enqueue site through TaskMessage.traceContext (the field already exists from the core sub-issue).
  • fedify.queue.backend reports resolvedQueue (the queue actually used after routing, possibly outboxQueue under the "fallback" mode) rather than unconditionally this.taskQueue, so the metric stays accurate regardless of fallback.
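A minimal sketch of the span attribute set and the resolved-queue rule described in the list above. The attribute keys come from this issue; the rest is assumed for illustration: the Queue stand-in with a backend label, the resolveTaskQueue and buildTaskSpanAttributes helpers, and the reading of "fallback" as "use outboxQueue when no dedicated task queue is configured".

```typescript
type QueueTaskFailureReason =
  | "deserialization"
  | "validation"
  | "unknown_task"
  | "handler";

// Stand-in for a message queue; the `backend` label is an assumption.
interface Queue {
  readonly backend: string;
}

// Hypothetical routing rule: prefer the dedicated task queue, and use the
// outbox queue only when the mode is "fallback" (the only mode this issue names).
function resolveTaskQueue(
  taskQueue: Queue | undefined,
  outboxQueue: Queue | undefined,
  mode: string,
): Queue | undefined {
  if (taskQueue != null) return taskQueue;
  return mode === "fallback" ? outboxQueue : undefined;
}

// Hypothetical builder for the fedify.task span attributes.
function buildTaskSpanAttributes(options: {
  taskName: string;
  attempt: number;
  resolvedQueue: Queue;
  failureReason?: QueueTaskFailureReason; // set only on terminal failures
}): Record<string, string | number> {
  const attrs: Record<string, string | number> = {
    "fedify.task.name": options.taskName,
    "fedify.task.attempt": options.attempt,
    // Report the queue that actually carried the message.
    "fedify.queue.backend": options.resolvedQueue.backend,
  };
  if (options.failureReason != null) {
    attrs["fedify.task.failure_reason"] = options.failureReason;
  }
  return attrs;
}
```

Passing the resolved queue into the builder, instead of reading this.taskQueue inside it, is what keeps the backend attribute correct in the fallback case.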

Implementation note

This sub-issue wires attributes onto the decision points the core sub-issue already implements in #listenTaskMessage and #enqueueTasks; it should not change task behavior (drop/retry semantics stay as shipped by #1). The QueueTaskFailureReason enum and recordQueueTaskOutcome extension may technically land in the foundation commit, but they are owned and exercised here.

Cardinality

Bounded: task names are a registered, known-at-startup set (never derived from message content), and failure_reason is a small bounded enum (currently four values, open to later refinement). Combined cardinality is at most |taskName| × |failure_reason| × |queue.backend|, which stays within OTel attribute safety and matches the containment already used for activitypub.collection.dispatcher.
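To make the bound concrete, the sketch below multiplies the sizes of the three attribute value sets. The task names and backend labels are made-up placeholders, and the helper failureSeriesUpperBound is hypothetical; it counts only the three attributes named in this section, so any other attribute on the same instrument multiplies the figure further.

```typescript
const FAILURE_REASONS = [
  "deserialization",
  "validation",
  "unknown_task",
  "handler",
] as const;

// Upper bound on distinct failure series for the three task attributes.
function failureSeriesUpperBound(
  taskNames: readonly string[],
  backends: readonly string[],
): number {
  // Sets guard against duplicates inflating the count.
  const names = new Set(taskNames).size;
  const queues = new Set(backends).size;
  return names * FAILURE_REASONS.length * queues;
}

// Placeholder values: 3 registered tasks, 2 queue backends.
const bound = failureSeriesUpperBound(
  ["send-digest", "purge-cache", "reindex"],
  ["backend-a", "backend-b"],
);
console.log(bound); // 3 * 4 * 2 = 24
```

The bound grows only when an application registers more tasks or configures more queue backends, both of which are fixed at startup.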

Out of scope

  • A management UI / inspection RPC.
  • Per-task custom metric attributes beyond taskName (would risk unbounded cardinality).

Acceptance criteria

  • A task span is named fedify.task and carries fedify.task.name and fedify.task.attempt.
  • Parent context is inherited from the enqueue site via traceContext.
  • Each failure path records the correct fedify.task.failure_reason (deserialization / validation / unknown_task / handler).
  • fedify.queue.backend reflects the resolved queue, including the outboxQueue fallback case.
  • recordQueueTaskEnqueued / recordQueueTaskOutcome carry role: "task".
  • Telemetry assertions use TestSpanExporter / createTestTracerProvider from @fedify/fixture; tests pass under Deno, Node.js, and Bun.
  • docs/manual/tasks.md observability section added; CHANGES.md updated; AI usage disclosed per AI_POLICY.md.
