feat(open_feature): FFE APM feature-flag span enrichment (experimental, gated) #5910
leoromanovsky wants to merge 10 commits into
… gate
- Bind ddog_ffe_assignment_get_serial_id in feature_flags.c, converting the ddog_Option_I32 tagged union (SOME -> Integer, NONE -> nil)
- Register serial_id on the ResolutionDetails C class; add :serial_id to the Ruby ResolutionDetails Struct (nil in build_error)
- Thread __dd_split_serial_id + __dd_do_log into build_flag_metadata, read off the Struct (flag_metadata native path disabled, FFL-1450)
- Add span_enrichment_enabled gate (DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED), distinct from the provider gate, default off; register the env var
- RBS for serial_id on both ResolutionDetails classes; C-binding spec asserts the NONE->nil path against real libdatadog
…lator, capture, write
- Codec golden vector ZAgUAg== + empty + dedupe/sort + round-trip + SHA256
- Accumulator limits 200/10/20/5/64, JSON object shapes, first-wins, truncation
- finally capture: serial_id, subject (do_log+targeting_key), runtime-default via missing variant, no-active-root, error isolation
- root-span write integration (ffe_* on finish + state cleanup) and shutdown
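The codec cases this commit lists (golden vector, empty input, dedupe/sort, round-trip) can be illustrated with a minimal sketch. It assumes the encoding described elsewhere in this PR: sorted, deduplicated serial ids stored as ULEB128 deltas and then base64-encoded. The module and method names (`FfeCodecSketch`, `encode_ids`, `decode_ids`) are hypothetical and are not the PR's actual Codec API.

```ruby
# Hypothetical sketch of a ULEB128 delta-varint + base64 codec; names are illustrative only.
module FfeCodecSketch
  module_function

  # ULEB128: 7 payload bits per byte; the high bit is set on every byte except the last.
  def encode_varint(value, buffer)
    loop do
      byte = value & 0x7f
      value >>= 7
      if value.zero?
        buffer << byte
        return buffer
      end
      buffer << (byte | 0x80)
    end
  end

  # Dedupe and sort the ids, then store each id as the delta from its predecessor.
  def encode_ids(ids)
    buffer = String.new(encoding: Encoding::BINARY) # raw bytes, not UTF-8
    previous = 0
    ids.uniq.sort.each do |id|
      encode_varint(id - previous, buffer)
      previous = id
    end
    [buffer].pack('m0') # strict base64, no line breaks
  end

  # Inverse of encode_ids: base64-decode, read varints, and sum the deltas.
  def decode_ids(encoded)
    ids = []
    current = 0
    shift = 0
    previous = 0
    encoded.unpack1('m0').each_byte do |byte|
      current |= (byte & 0x7f) << shift
      if (byte & 0x80).zero?
        previous += current
        ids << previous
        current = 0
        shift = 0
      else
        shift += 7
      end
    end
    ids
  end
end
```

With this sketch, `FfeCodecSketch.encode_ids([100, 108, 128, 130])` yields the golden vector `ZAgUAg==` (deltas 100, 8, 20, 2), and an empty list encodes to an empty string.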
…re hook
- Add SpanEnrichmentHook with embedded Codec (ULEB128 delta-varint + base64, SHA256), Accumulator (limits 200/10/20/5/64, dedupe, JSON object shapes, first-wins, 64-char truncation), and per-root AccumulatorStore
- finally hook captures serial_id/subject/runtime-default (missing-variant detection) keyed to the active root span; error-isolated (never raises)
- Write ffe_flags_enc (bare base64), ffe_subjects_enc + ffe_runtime_defaults (JSON objects) on the local root via span_before_finish, then delete state
- Gated construction in component (DG-005: nothing built when off); symmetric teardown via shutdown on component.shutdown!
- provider.hooks returns [flag_eval, span_enrichment].compact
- RBS for the hook + settings gate + component accessor; vendor OF SDK RBS gains value/evaluation_context; Steepfile loads base64; steep + standard clean
- Spec covers all 7 required L0 cases incl. gate-off control + codec golden
…anup (CR-01)
- Reproduces CR-01: a child span finishing before the local root span triggers the span_before_finish handler, which (pre-fix) unconditionally deletes the per-trace accumulator in its ensure block, so the root emits no ffe_* tags.
- New spec builds a trace with a nested child that finishes first, evaluates a flag (serial id + subject), then finishes the root and asserts the root carries ffe_flags_enc / ffe_subjects_enc and state is cleaned up once.
- Companion spec: a flag evaluated after a child has already finished still reaches the root write.
- Fails against current code (root ffe_flags_enc == nil).
…(CR-01)
- write_tags_on_root subscribes to span_before_finish, which fires for every span in the trace. The cleanup (@store.delete / @subscribed.delete) was in an unconditional ensure block, so it ran on every span finish, including child finishes that hit the early non-root return.
- In any nested trace the child finishes before the local root, so the first child finish wiped the per-trace accumulator before the root could write its tags, leaving the root with no ffe_* tags.
- Move the delete inside the root-only branch (guarded by the root-span check), mirroring the Node reference's spanStates.delete(span) and the Python sibling's _on_span_finish pop: cleanup happens exactly once, when the local root span finishes, and a child finish never touches state.
- Contract/codec/C-binding/limits/gate-off behavior unchanged; signatures unchanged (RBS/steep clean).
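The shape of the CR-01 fix can be sketched in isolation. The snippet below is a simplified model, not the PR's code: `SketchSpan`, `RootOnlyWriter`, and the trace-id keyed Hash are hypothetical stand-ins for the tracer's span, the hook, and its per-trace store. It only shows that the delete sits inside the root-only branch, so a child finish leaves state untouched.

```ruby
# Hypothetical stand-in for a span; local_root is true only for the trace's local root.
SketchSpan = Struct.new(:name, :local_root, :tags)

class RootOnlyWriter
  def initialize
    @store = {} # trace id => accumulated tags (assumed shape, for illustration)
  end

  def record(trace_id, key, value)
    (@store[trace_id] ||= {})[key] = value
  end

  # Invoked for every finishing span in the trace, children included.
  def on_span_before_finish(trace_id, span)
    return unless span.local_root # child finish: do not touch per-trace state

    state = @store.delete(trace_id) # cleanup runs exactly once, at root finish
    span.tags.merge!(state) if state
  end

  def pending?(trace_id)
    @store.key?(trace_id)
  end
end

writer = RootOnlyWriter.new
root = SketchSpan.new('root', true, {})
child = SketchSpan.new('child', false, {})

writer.record(:trace_1, 'ffe_flags_enc', 'ZAgUAg==')
writer.on_span_before_finish(:trace_1, child) # state survives the child finish
writer.on_span_before_finish(:trace_1, root)  # root receives the tags, state is dropped
```

Had the delete lived in an unconditional `ensure`, the child finish in this example would have emptied the store before the root could read it.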
encode_varint / encode_delta_varint built their byte buffers as UTF-8 strings, so String#<< with an Integer >= 0x80 appended a Unicode codepoint (a 2-byte UTF-8 sequence) instead of a raw byte. This corrupted every serial id whose varint contains a continuation byte: serial 2312 (bytes 88 12) was emitted as C2 88 12, decoding to 296002. ffe_flags_enc / ffe_subjects_enc were wrong for any serial id >= 128. Use binary (ASCII-8BIT) buffers so << appends raw bytes. Add a regression spec covering continuation-byte serial ids (the existing golden vector's deltas are all < 128, so it never exercised this path).
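The failure mode described above is easy to reproduce outside the gem. The following sketch appends the two ULEB128 bytes of serial id 2312 to a UTF-8 buffer and to a binary buffer; `decode_varint` is a hypothetical helper written for this illustration, and it assumes the whole byte array holds a single varint.

```ruby
utf8_buffer = +''         # unfrozen UTF-8 string, like the pre-fix buffer
binary_buffer = (+'').b   # ASCII-8BIT copy, like the fixed buffer

[0x88, 0x12].each do |byte| # ULEB128 bytes of serial id 2312
  utf8_buffer << byte       # 0x88 is appended as codepoint U+0088 (bytes C2 88)
  binary_buffer << byte     # 0x88 is appended as one raw byte
end

# Hypothetical helper: treats the whole array as one ULEB128 value.
def decode_varint(bytes)
  bytes.each_with_index.sum { |byte, index| (byte & 0x7f) << (7 * index) }
end

corrupted = decode_varint(utf8_buffer.bytes)   # bytes C2 88 12
correct = decode_varint(binary_buffer.bytes)   # bytes 88 12
```

Here `corrupted` is 296002 and `correct` is 2312, matching the values reported in the commit message.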
👋 Hey @DataDog/ruby-guild, please fill in the "Change log entry" section in the pull request description.
If changes need to be present in CHANGELOG.md, you can state it this way:
**Change log entry**
Yes. A brief summary to be placed into the CHANGELOG.md
(possible answers: Yes/Yep/Yeah)
Or you can opt out like this:
**Change log entry**
None.
(possible answers: No/Nope/None)
Visited at: 2026-06-17 12:30:50 UTC
Typing analysis
Note: Ignored files are excluded from the next sections.
Benchmarks
Benchmark execution time: 2026-06-17 23:33:15
Comparing candidate commit 89b7543 in PR branch
Found 0 performance improvements and 0 performance regressions! Performance is the same for 48 metrics, 1 unstable metrics.
…ync + weak lifecycle
Address codex review blockers on the FFE APM span-enrichment hook. The supported OpenFeature Ruby SDKs (>= 0.3.1) do not dispatch provider or client hooks at all (Client#fetch_details calls the provider and never invokes a hook), so the previous design (gated on OpenFeature::SDK::Hooks::Hook being defined and relying on #finally) never constructed the hook and silently emitted no ffe_* tags even with DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED=true.
Changes:
- Dispatch enrichment directly from Provider#evaluate via a new SpanEnrichmentHook#capture(flag_key:, variant:, value:, serial_id:, do_log:, targeting_key:) primitive, wrapped never-throw. Works on every supported SDK. SpanEnrichmentHook.available? now returns true unconditionally; #finally is kept as an idempotent delegate for any future SDK that does dispatch hooks.
- Guard all store/subscription/accumulator access with a Mutex; take the finish-time tag snapshot under the lock so concurrent evaluations cannot race fetch-or-create, duplicate subscriptions, or mutate-while-encoding.
- Key per-trace state in an ObjectSpace::WeakMap so an abandoned trace (root span never finishes) cannot pin its accumulator; the span_before_finish subscription closure captures the accumulator to keep it alive exactly for the trace's lifetime, and both are collected together when the trace is GC'd.
- Update RBS sigs for the new capture/tuple-returning store/weak backing.
Gate-off behavior is unchanged: the hook is still only constructed when span_enrichment_enabled is true (no idle per-span overhead, DG-005). The frozen wire contract (tag names, ULEB128 delta-varint binary buffer, limits, SHA256 subjects, JSON runtime defaults) is unchanged.
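The locking and weak-keying described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the PR's AccumulatorStore: `SketchRootSpan`, `WeakStoreSketch`, and the `finish_callbacks` array are hypothetical. It relies on the documented behavior that `ObjectSpace::WeakMap` holds both keys and values weakly, which is why a closure attached to the root keeps the accumulator alive for as long as the root itself is reachable.

```ruby
# Hypothetical root-span stand-in; it owns the callbacks that would run when it finishes.
SketchRootSpan = Struct.new(:name, :finish_callbacks)

class WeakStoreSketch
  def initialize
    @mutex = Mutex.new
    @by_root = ObjectSpace::WeakMap.new # root span => array of serial ids
  end

  # Fetch-or-create and append happen under one lock, so concurrent
  # evaluations cannot create two accumulators or lose an id.
  def add(root, serial_id)
    @mutex.synchronize do
      ids = @by_root[root] || create(root)
      ids << serial_id unless ids.include?(serial_id)
    end
  end

  # Snapshot under the same lock so encoding never sees a half-updated array.
  def snapshot(root)
    @mutex.synchronize { (@by_root[root] || []).sort }
  end

  private

  # Only called while the mutex is held.
  def create(root)
    ids = []
    @by_root[root] = ids
    # The WeakMap alone would not keep ids alive; the closure stored on the root does.
    root.finish_callbacks << -> { ids }
    ids
  end
end
```

Because the only strong reference to the accumulator hangs off the root span, an abandoned root and its accumulator become collectable together, which is the property the commit message describes.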
…eature client
Add a provider/client spec that drives the production path OpenFeature::SDK.build_client -> Datadog::OpenFeature::Provider#fetch_* -> enrich_span -> ffe_* tags on the local root span, stubbing only the native EvaluationEngine and the active-trace seam (no native ext / libdatadog / Docker).
Covers the highest-risk behavior the unit specs cannot: that real Ruby evaluations actually trigger enrichment (the SDK does not dispatch hooks), asserting:
- gate ON: ffe_flags_enc after a real string evaluation,
- ffe_subjects_enc only when do_log is authorized and a targeting key is present,
- runtime-default capture via a missing variant,
- child-span evaluations aggregate onto the one local root,
- concurrent evaluations on the same root lose no serial ids (Mutex),
- enrichment still works after provider reconfiguration,
- gate OFF: no ffe_* tags, fully inert.
Passes on both the openfeature-min (0.3.1) and openfeature-latest (0.4.1) appraisals.
… SDK >= 0.6
The end-to-end span-enrichment spec failed under openfeature-sdk 0.6.x (Ruby 3.4 CI lane) for two version-specific reasons:
1. 0.6+ dispatches provider hooks during Client#fetch_details, calling Provider#hooks -> Component#flag_eval_hook. The Component instance_double did not allow :flag_eval_hook, raising MockExpectationError. Stub it (nil -> compacted) on both doubles.
2. 0.6+ sets the provider asynchronously (spawns a background init thread), which the leaked-thread detector flagged. Use set_provider_and_wait when available (synchronous, no thread); older SDKs already set synchronously.
Also corrected comments that claimed supported SDKs never dispatch hooks (true for < 0.6, false for >= 0.6; the direct dispatch + finally hook are idempotent via the deduping accumulators). No production behavior change.
feat(open_feature): FFE APM feature-flag span enrichment
Summary
Adds Feature Flag Events (FFE) span enrichment to the OpenFeature integration. When feature
flags are evaluated, the evaluation metadata is attached to the root APM span so APM customers
can filter traces and errors by active flag variant, and the FFE/Experimentation platform can
correlate spans with experiments. The wire format matches the merged reference implementation
(dd-trace-js#8343) so backend/Trino decode is identical.
How it works
… ffe_* tags.
Configuration
Opt-in, off by default: set DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED=true.
This is distinct from DD_EXPERIMENTAL_FLAGGING_PROVIDER_ENABLED.
Span tags added
- ffe_flags_enc
- ffe_subjects_enc (when __dd_do_log=true): { sha256(key): encodedIds }
- ffe_runtime_defaults: { flagKey: value }
Limits: 200 serial IDs, 10 subjects, 20 experiments/subject, 5 runtime defaults, 64 chars/runtime-default value (UTF-8-safe truncation).
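To make the limits above concrete, here is a small sketch of an accumulator for subjects and runtime defaults. It is illustrative only: the class name, the method names, the hex form of the SHA256 key, stringifying values with `to_s`, and silently dropping entries past a limit are all assumptions of this sketch rather than confirmed details of the PR. First-wins and codepoint-safe truncation follow the behavior the PR describes.

```ruby
require 'digest'
require 'json'

# Hypothetical accumulator sketch; names and over-limit behavior are assumptions.
class AccumulatorSketch
  MAX_SUBJECTS = 10
  MAX_IDS_PER_SUBJECT = 20
  MAX_RUNTIME_DEFAULTS = 5
  MAX_VALUE_CHARS = 64

  attr_reader :subjects, :runtime_defaults

  def initialize
    @subjects = {}         # sha256(targeting key) => serial ids
    @runtime_defaults = {} # flag key => truncated value
  end

  def add_subject(targeting_key, serial_id)
    hashed = Digest::SHA256.hexdigest(targeting_key) # hex digest is an assumption
    return if !@subjects.key?(hashed) && @subjects.size >= MAX_SUBJECTS

    ids = (@subjects[hashed] ||= [])
    return if ids.include?(serial_id) || ids.size >= MAX_IDS_PER_SUBJECT

    ids << serial_id
  end

  def add_runtime_default(flag_key, value)
    return if @runtime_defaults.key?(flag_key)              # first value wins
    return if @runtime_defaults.size >= MAX_RUNTIME_DEFAULTS

    # String#[] on a UTF-8 string slices by character, so no codepoint is split.
    @runtime_defaults[flag_key] = value.to_s[0, MAX_VALUE_CHARS]
  end

  # JSON.generate leaves non-ASCII characters as raw UTF-8 by default.
  def runtime_defaults_json
    JSON.generate(@runtime_defaults)
  end
end
```

For example, adding a 70-character value such as `'é' * 70` under one flag key stores exactly 64 characters with a valid encoding, and a second value for the same key is ignored.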
Changes
- DD_EXPERIMENTAL_FLAGGING_PROVIDER_SPAN_ENRICHMENT_ENABLED (open_feature/configuration.rb, supported_configurations), off by default; bind the split serial_id over FFI (ext/libdatadog_api/feature_flags.c).
- open_feature/hooks/span_enrichment_hook.rb: ULEB128 / delta-varint serial IDs, SHA256-hashed subject keys; accumulator + capture hook write ffe_* tags to the local root span; provider/component wiring.
Decisions
- encode_varint / encode_delta_varint originally built their byte buffer as a UTF-8 string (+''), so String#<< with an integer ≥ 0x80 appended a 2-byte Unicode codepoint, corrupting every serial ID ≥ 128 (serial 2312 decoded to 296002). Fixed to binary buffers ((+'').b), with a regression spec.
- … ffe_* tags.
- ffe_* are bare tag names on span meta (not _dd.-prefixed); subject keys are SHA256 hashes emitted only when logging is authorized.
Validation
FFE dogfooding app
Validated live against the ffe-dogfooding app via a trace-intake tee-proxy that captures the raw /v0.4/traces payload and decodes the ffe_* tags. Flag ffe-dogfooding-string-flag (serial 2312):
- ffe_flags_enc decoding to serial [2312] plus a SHA256-hashed ffe_subjects_enc → [2312] (confirming the binary-buffer varint fix; previously this emitted the corrupted 296002).
- … ffe_* tags.
Local system-tests run
Ran the frozen system-tests parametric suite (tests/parametric/test_ffe/test_span_enrichment.py, unchanged) against this branch's gem (datadog-2.36.0.dev, libdatadog-33.0.0.1.0).
All 18 cases pass, covering the delta-varint codec (serial 2312 → bytes 88 12), multi-flag/multi-subject aggregation onto one root span, child-span → root propagation, the 200/10/20/5/64 limits, and SHA256 subject keys gated on __dd_do_log. No SDK source changes beyond the varint fix were required. The system-tests enablement (parametric server.rb + manifests/ruby.yml) is a separate draft PR against DataDog/system-tests.
Full dogfooding matrix + system-tests (2026-06-17)
Re-validated end-to-end through the real OpenFeature client (gem built from this branch) behind the trace-intake tee-proxy, decoding ffe_* with scripts/decode_ffe_span_tags.py (root span scenario, service ffe-dogfooding-ruby):
- ffe_flags_enc → [2312]; ffe_subjects_enc = {sha256(targeting key): ids} only when do_log
- … ffe_* tags; no hook constructed
- ffe_flags_enc = [829,1442,2311,2312], nothing overwritten
- ffe_* on the local root only; the child carries none
- ffe_runtime_defaults raw UTF-8 (héllo-wörld-☃-日本語-Ω, こんにちは, 🎉), valid JSON, codepoint-safe 64-char truncation (binary varint buffers encode serial ids ≥128 correctly)
- ZAgUAg== → [100,108,128,130]
Concurrency / lifecycle / reconfiguration are covered by the Mutex-guarded accumulator, ObjectSpace::WeakMap per-trace keying, and provider-path dispatch in this PR.
System-tests: 18 passed (TEST_LIBRARY=ruby ./run.sh PARAMETRIC -k span_enrichment, library ruby@2.36.0.dev). No SDK source change was required to pass; the only change needed was in the system-tests Ruby parametric harness (re-activate the test's root trace operation per /ffe/evaluate so evaluations aggregate onto the test root rather than forking a per-call trace).