EVCacheValue: opt-in compact binary serialization with backwards-compatible reads #196

Open
joegoogle123 wants to merge 1 commit into sync-getbulk-mixed-keys from evcache-value-binary-serde

@joegoogle123 joegoogle123 commented Jun 4, 2026

Summary

Hashed-key values are wrapped in an EVCacheValue envelope (key, value, flags, ttl, createTime) that is currently serialized with Java ObjectOutputStream, adding ~50–80 bytes of structural overhead per item. This adds a compact, length-prefixed binary format for the envelope while remaining fully backwards-compatible on reads.

📊 See benchmark.md for live A/B measurements — under realistic secure configuration (use.secure=true + cluster evcacheAuthZEnforce=true) the binary envelope saves 317 bytes / 63.4% per item. No write failures, no read-latency regression in either security mode.

What changed

  • New EVCacheValueSerde class (com.netflix.evcache.pool) — public-final-non-instantiable codec, owns the wire format and all error handling:
    • static byte[] serialize(EVCacheValue) — length-prefixed binary layout: [magic 0x0C][reserved 0x00][int keyLen][key UTF-8][int valLen][value][int flags][long ttl][long createTime].
    • static EVCacheValue deserialize(byte[]) — bounds-checks length prefixes before allocating. On any corruption or unexpected exception it logs a warning that names the failing field and includes a truncated hex dump (via Apache Commons Codec Hex.encodeHexString, capped at 1024 bytes), then returns null. This matches BaseSerializingTranscoder's resilience contract (corruption → cache miss → caller refills from source of truth), so a single corrupt entry never crashes a get / getBulk / async pipeline.
    • static boolean isBinaryFormat(byte[]) — exposed for the dispatcher.
  • EVCacheTranscoder becomes a thin dispatcher (no try/catch):
    • serialize: gates on useBinarySerialization && o instanceof EVCacheValue → EVCacheValueSerde.serialize; else super.serialize (Java).
    • deserialize: dispatches on EVCacheValueSerde.isBinaryFormat → EVCacheValueSerde.deserialize; else super.deserialize.
  • EVCacheValue stays a pure POJO (codec moved out; constructor unchanged from pre-PR).
  • EVCacheImpl reads a Feature Property at client construction and injects it into the (immutable) envelope transcoder.
  • Reads auto-detect format by the leading byte (0xAC 0xED = legacy Java, 0x0C = binary), so a new client decodes existing cache entries unchanged.
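
The layout listed above can be sketched with a plain java.nio.ByteBuffer. This is a minimal illustration, not the PR's EVCacheValueSerde: the class name EnvelopeLayoutSketch and its parameter list are hypothetical, and big-endian byte order (the ByteBuffer default) is an assumption, since the description does not state the byte order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the wire layout; not the PR's EVCacheValueSerde.
final class EnvelopeLayoutSketch {
    static final byte MAGIC = 0x0C;    // leading byte that marks the binary format
    static final byte RESERVED = 0x00; // byte index 1, always 0x00 today

    // [magic][reserved][int keyLen][key UTF-8][int valLen][value][int flags][long ttl][long createTime]
    static byte[] serialize(String key, byte[] value, int flags, long ttl, long createTime) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int size = 2 + Integer.BYTES + keyBytes.length
                + Integer.BYTES + value.length
                + Integer.BYTES + Long.BYTES + Long.BYTES;
        ByteBuffer buf = ByteBuffer.allocate(size); // assumption: big-endian (ByteBuffer default)
        buf.put(MAGIC).put(RESERVED);
        buf.putInt(keyBytes.length).put(keyBytes);
        buf.putInt(value.length).put(value);
        buf.putInt(flags).putLong(ttl).putLong(createTime);
        return buf.array();
    }
}
```

With 4-byte ints and 8-byte longs, this layout carries 30 bytes of fixed overhead on top of the key and value bytes.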

Format-flag decision (reuse SERIALIZED + magic byte, not a fresh flag)

The binary envelope keeps the existing SERIALIZED flag and is disambiguated from Java by the leading byte, rather than allocating a new CachedData flag. Rationale:

  • SERIALIZED semantically still means "serialized object → deserialize()"; the codec choice (Java vs binary) lives inside deserialize(). No flag constant is reassigned or repurposed, and decode() branch order is untouched.
  • Consumers that route on SERIALIZED (e.g. the admin inspector, cache-warmer) keep working without a new flag constant to propagate.
  • An old reader that hits binary bytes under SERIALIZED throws StreamCorruptedException (fails loud) rather than silently decoding garbage — which a fresh low-byte flag would cause (decodeString) on old readers.
  • Invariant (documented in EVCacheValueSerde Javadoc): SERIALIZED payloads are self-describing by leading byte; a future third format must use a distinct non-colliding magic + the reserved version byte.
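
The leading-byte disambiguation described above can be sketched as follows. The class and enum names are hypothetical; the only facts relied on are the magic values quoted in this PR (0x0C for binary) and the Java serialization stream header, which begins with 0xAC 0xED.

```java
// Hypothetical format sniffing under the SERIALIZED flag; names are illustrative only.
final class FormatSniffSketch {
    enum Format { BINARY, JAVA, UNKNOWN }

    static Format detect(byte[] data) {
        if (data == null || data.length == 0) {
            return Format.UNKNOWN;
        }
        if (data[0] == (byte) 0x0C) {
            return Format.BINARY; // compact envelope magic
        }
        // Java ObjectOutputStream output starts with STREAM_MAGIC (0xACED).
        if (data.length >= 2 && data[0] == (byte) 0xAC && data[1] == (byte) 0xED) {
            return Format.JAVA;
        }
        return Format.UNKNOWN;
    }
}
```

Because 0x0C and 0xAC differ in the first byte, a single-byte check is enough to separate the two formats; a third format would need its own non-colliding leading byte.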

Reserved version byte

Byte index 1 of the binary payload is reserved (always 0x00 today). Readers read and ignore it; it is not validated. The reason is forward compatibility without an emergency reader rollout: if today's readers rejected any non-zero version, introducing a v2 would require shipping reader support fleet-wide before any writer could emit a v2 byte, and a single misconfigured writer would crash all readers. Because today's readers accept any value, future readers can branch on this byte to introduce breaking format changes in a backwards-compatible way.
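
A minimal sketch of the read-and-ignore behaviour, assuming the header is exactly the two bytes described in this PR. The class and method names are hypothetical; reservedByte only shows where a later reader could branch.

```java
// Hypothetical header handling; not the PR's reader.
final class ReservedByteSketch {
    static final byte MAGIC = 0x0C;

    // Accepts any payload carrying the binary magic, whatever byte 1 holds.
    static boolean accepts(byte[] payload) {
        return payload != null && payload.length >= 2 && payload[0] == MAGIC;
    }

    // Exposes byte 1 as an unsigned value so a future reader could branch on it.
    static int reservedByte(byte[] payload) {
        return payload[1] & 0xFF;
    }
}
```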

Feature Property (rollout gate)

  • <appName>.envelope.binary.serialization.enabled (global fallback evcache.envelope.binary.serialization.enabled), default false.
  • "Envelope" matches the codebase's existing term for the hashed-key EVCacheValue wrapping (envelopeTranscoder in EVCacheMemcachedClient).
  • Read once at client construction and injected into the immutable transcoder ⇒ deploy/restart required to take effect; this is NOT a live runtime toggle. Flip the property, then redeploy the consuming app.
  • Default-off means production keeps writing Java; reads auto-detect both formats. Roll out reader-first: ship this change everywhere — including the admin inspector and cache-warmer — before enabling the FP for any writer.
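
The lookup order and the read-once behaviour can be illustrated with java.util.Properties standing in for the property repository. This is an assumption-laden sketch: EnvelopeFlagSketch is a hypothetical class, and the real client resolves the value through its own property repository rather than Properties.

```java
import java.util.Properties;

// Stand-in for the client's property lookup; java.util.Properties is used only for illustration.
final class EnvelopeFlagSketch {
    private final boolean useBinarySerialization; // fixed for the lifetime of this instance

    EnvelopeFlagSketch(String appName, Properties props) {
        // App-scoped key first, then the global fallback, then the default (false).
        String appScoped = props.getProperty(appName + ".envelope.binary.serialization.enabled");
        String global = props.getProperty("evcache.envelope.binary.serialization.enabled");
        String raw = appScoped != null ? appScoped : global;
        this.useBinarySerialization = raw != null && Boolean.parseBoolean(raw);
    }

    boolean useBinarySerialization() {
        return useBinarySerialization;
    }
}
```

Changing the properties after construction does not affect an existing instance, which mirrors why flipping the property requires a redeploy.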

Compatibility

  • A client with this change decodes existing Java-serialized values unchanged (dual-format read).
  • With the FP off (default), wire output is byte-identical to today.
  • Corrupt binary payloads degrade to cache miss (null), matching the existing Java path. A single corrupt entry never crashes the caller.
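
The "corrupt payload degrades to null" behaviour can be sketched as a bounds-checked reader. This is a hedged illustration under the layout given in this PR: EnvelopeDecodeSketch and its Decoded holder are hypothetical, big-endian order is assumed, and the logging the PR describes is omitted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical bounds-checked reader; returns null instead of throwing on bad input.
final class EnvelopeDecodeSketch {
    static final class Decoded {
        final String key; final byte[] value; final int flags; final long ttl; final long createTime;
        Decoded(String key, byte[] value, int flags, long ttl, long createTime) {
            this.key = key; this.value = value; this.flags = flags; this.ttl = ttl; this.createTime = createTime;
        }
    }

    static Decoded deserialize(byte[] data) {
        try {
            if (data == null || data.length < 2 || data[0] != 0x0C) return null;
            ByteBuffer buf = ByteBuffer.wrap(data);
            buf.position(2); // skip magic and reserved byte
            byte[] key = readLengthPrefixed(buf);
            if (key == null) return null;
            byte[] value = readLengthPrefixed(buf);
            if (value == null) return null;
            if (buf.remaining() < Integer.BYTES + 2 * Long.BYTES) return null;
            return new Decoded(new String(key, StandardCharsets.UTF_8), value,
                    buf.getInt(), buf.getLong(), buf.getLong());
        } catch (RuntimeException e) {
            return null; // the PR's codec would also log the failing field here
        }
    }

    // Validates the length prefix against the remaining bytes before allocating.
    private static byte[] readLengthPrefixed(ByteBuffer buf) {
        if (buf.remaining() < Integer.BYTES) return null;
        int len = buf.getInt();
        if (len < 0 || len > buf.remaining()) return null;
        byte[] out = new byte[len];
        buf.get(out);
        return out;
    }
}
```

Checking the length against buf.remaining() before allocating is what keeps a bogus oversize keyLen from triggering a large allocation.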

Benchmarks

Full results in benchmark.md. Four independent measurements:

| Setup | Per-item delta (OOS − binary) |
| --- | --- |
| EVCacheValueSerdeTest.measureBinaryVsJavaOverhead (unit) | 141.0 bytes |
| EvCacheBaselineOverheadSmokeTest (consumer app → live cluster) | 141.0 bytes |
| Live ndbench A/B, insecure (use.secure=false) | 162.3 bytes (−42.5%) |
| Live ndbench A/B, secure (mTLS + authz enforced) | 317.3 bytes (−63.4%) |

Secure mode shows ~2× larger savings because the auth/identity fields in EVCacheValue blow up the OOS class-descriptor cost. Latency was within noise in both directions in both modes; no write failures and no read regression.
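
The kind of per-item comparison behind these numbers can be approximated offline. The sketch below is not the PR's measureBinaryVsJavaOverhead test: SampleEnvelope is a hypothetical stand-in whose fields follow the PR summary, and the sizes it produces will not match the table, because the real EVCacheValue class and its descriptor differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical size comparison: Java serialization vs. the length-prefixed layout.
final class OverheadSketch {
    static final class SampleEnvelope implements Serializable {
        private static final long serialVersionUID = 1L;
        final String key; final byte[] value; final int flags; final long ttl; final long createTime;
        SampleEnvelope(String key, byte[] value, int flags, long ttl, long createTime) {
            this.key = key; this.value = value; this.flags = flags; this.ttl = ttl; this.createTime = createTime;
        }
    }

    // Size of the envelope when written with ObjectOutputStream.
    static int javaSize(SampleEnvelope e) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(e);
            oos.flush();
            return bos.size();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Size under the layout in this PR: 2 header bytes, two int prefixes, int flags, two longs.
    static int binarySize(SampleEnvelope e) {
        return 2 + Integer.BYTES + e.key.getBytes(StandardCharsets.UTF_8).length
                + Integer.BYTES + e.value.length
                + Integer.BYTES + Long.BYTES + Long.BYTES;
    }
}
```

Calling javaSize(e) - binarySize(e) on a sample envelope gives a per-item delta for the stand-in class; most of it comes from the stream header and class descriptor that ObjectOutputStream writes.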

Testing

EVCacheValueSerdeTest: 17 cases via the public EVCacheTranscoder.encode/decode API:

  • Binary round-trip across edge cases (empty / unicode key / large 2 MB value / negative flags / zero ttl / negative & MAX createTime / MIN flags)
  • Transcoder wire shape (binary flag on): SERIALIZED flag set, leading byte 0x0C, reserved byte 0x00
  • Default transcoder (flag OFF) writes Java (0xAC 0xED) but reads both formats
  • Backwards-compat: legacy ObjectOutputStream-serialized envelope still decodes
  • Non-EVCacheValue passthrough (ArrayList stays on the Java path even with binary flag on)
  • Size win (binary is ~4.2× smaller for typical small items)
  • Malformed binary: truncated, bogus oversize keyLen, negative keyLen all decode to null (logged with field + hex dump)

Plus two new measurement tests that produce the benchmark numbers:

  • measureBinaryVsJavaOverhead — prints raw + gzipped envelope sizes for binary vs OOS across 7 key/value shapes
  • measureOnRealMemcachedNode — env-gated (MEMCACHED_HOST=...) writes N items in each format to a real memcached node and reads back the STAT bytes/STAT curr_items delta

Full evcache-core suite (./gradlew :evcache-core:test): 28/28 green (EVCacheValueSerdeTest 17, NodeLocatorLookupTest 3, MockEVCacheTest 7, plus runtime tests in other modules).

Chunked-payload integration is not covered by an automated test in this PR — chunking lives in EVCacheClient.createChunks/assembleChunks, which are content-opaque (byte copy + CRC + manifest) and require a live client to exercise. The binary format introduces no new chunking risk by construction: assembleChunks reassembles bytes byte-for-byte and CRC-checks them against the manifest before handing the result to the transcoder.
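
The content-opaque argument can be illustrated with a generic split / reassemble / CRC check. This sketch is not EVCacheClient.createChunks or assembleChunks: the class, the method names, and the use of a single java.util.zip.CRC32 over the whole payload are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical content-opaque chunking; it never inspects the payload format.
final class ChunkSketch {
    static long crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    // Splits into fixed-size chunks; chunkSize must be positive.
    static List<byte[]> split(byte[] data, int chunkSize) {
        List<byte[]> chunks = new ArrayList<>();
        for (int off = 0; off < data.length; off += chunkSize) {
            chunks.add(Arrays.copyOfRange(data, off, Math.min(data.length, off + chunkSize)));
        }
        return chunks;
    }

    // Concatenates the chunks and returns null if the checksum does not match.
    static byte[] assemble(List<byte[]> chunks, long expectedCrc) {
        int total = 0;
        for (byte[] c : chunks) total += c.length;
        byte[] out = new byte[total];
        int pos = 0;
        for (byte[] c : chunks) {
            System.arraycopy(c, 0, out, pos, c.length);
            pos += c.length;
        }
        return crc(out) == expectedCrc ? out : null;
    }
}
```

Since the bytes handed to the transcoder are identical to the bytes that were split, the leading 0x0C or 0xAC 0xED survives chunking unchanged.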

🤖 Generated with Claude Code

@joegoogle123 joegoogle123 force-pushed the evcache-value-binary-serde branch 15 times, most recently from d443e30 to 3892873 on June 4, 2026 22:22
@joegoogle123 joegoogle123 force-pushed the evcache-value-binary-serde branch 10 times, most recently from d46bf97 to 3444f24 on June 9, 2026 14:47
@joegoogle123 joegoogle123 requested a review from bihaoxwork June 9, 2026 14:48
Comment thread evcache-core/src/main/java/com/netflix/evcache/EVCacheImpl.java Outdated
Introduces a length-prefixed binary envelope for EVCacheValue, which EVCache
wraps around values when the canonical key is hashed (see
EVCacheImpl.getEVCacheKey). Compared to the default Java ObjectOutputStream
encoding it is materially smaller on the wire and avoids the reflective
decode path.

Wire format:
  [magic 0x0C][reserved 0x00]
  [int keyLen][key UTF-8 bytes]
  [int valLen][value bytes]
  [int flags][long ttl][long createTime]
  [...optional extension fields appended by newer writers...]

- Opt-in per app via FastProperty <app>.envelope.binary.serialization.enabled
  (default false). Existing Java-serialized items still decode -- the reader
  is dual-format, so there is no wire break for clusters with in-flight
  cached items.
- Forward-compat for additive optional fields: append at the end, gate with
  buffer.hasRemaining() in the reader, supply a graceful default when absent.
- Breaking changes route through the reserved/version byte at byte 1 with
  reader-before-writer rollout (see class javadoc).
- Bounds-checked length prefixes return null on bogus input, matching
  BaseSerializingTranscoder's resilience contract.

Tests cover binary round-trip across empty/large/unicode/extremes,
dual-format read, transcoder routing, malformed-input handling, and a
pinned v0 byte array trip-wire so future required-field adds can't be
missed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
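
The additive-field rule from the commit message can be sketched as follows. The trailing int and its default are hypothetical, since the PR defines no extension field yet. The sketch checks remaining() >= Integer.BYTES rather than hasRemaining(), so a partially written trailer also falls back to the default; that is a stricter variant and not necessarily what the PR's reader does.

```java
import java.nio.ByteBuffer;

// Hypothetical optional trailing field appended after createTime by a newer writer.
final class ExtensionFieldSketch {
    static final int DEFAULT_EXTENSION = 0; // graceful default when the field is absent

    // Expects a buffer positioned just past the required v0 fields.
    static int readOptionalTrailer(ByteBuffer afterRequiredFields) {
        return afterRequiredFields.remaining() >= Integer.BYTES
                ? afterRequiredFields.getInt()
                : DEFAULT_EXTENSION;
    }
}
```

An older writer's payload simply ends after createTime, so the reader returns the default; a newer writer's payload carries four more bytes that the same reader picks up.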
@joegoogle123 joegoogle123 force-pushed the evcache-value-binary-serde branch from 75d8249 to 40446ae on June 23, 2026 19:57
@joegoogle123 joegoogle123 changed the base branch from master to sync-getbulk-mixed-keys June 23, 2026 19:57
final boolean useBinarySerialization = propertyRepository.get(_appName + ".envelope.binary.serialization.enabled", Boolean.class)
        .orElseGet("evcache.envelope.binary.serialization.enabled").orElse(false).get();
final int maxValueSize = propertyRepository.get("default.evcache.max.data.size", Integer.class).orElse(20 * 1024 * 1024).get();
this.evcacheValueTranscoder = new EVCacheTranscoder(maxValueSize, ENVELOPE_COMPRESSION_DISABLED, useBinarySerialization);

Contributor

is it intentional to change the compression threshold from Integer.MAX_VALUE to default.evcache.max.data.size?

@shy-1234 shy-1234 Jun 29, 2026

Contributor

sorry let me rephrase. I mean the missing evcacheValueTranscoder.setCompressionThreshold(Integer.MAX_VALUE);

it might be by mistake in the past since I also saw default.evcache.compression.threshold in the EVCacheTranscoder constructor. But just wanted to raise that after this change, there can be a behavior difference to the existing clusters. (changing from INT_MAX to the default 120)

Author

setCompressionThreshold is called in the constructor. I think this should preserve the existing behavior

this.evcacheValueTranscoder = new EVCacheTranscoder(maxValueSize, ENVELOPE_COMPRESSION_DISABLED, useBinarySerialization);

Can you elaborate on how we could erroneously go from Integer.MAX_VALUE?

Contributor

Here is the code

Contributor

I might have mis-read the code... One last nit: the naming ENVELOPE_COMPRESSION_DISABLED seems a bit misleading...? (sounds like a boolean)

evcacheValueTranscoder.setCompressionThreshold(Integer.MAX_VALUE);
// Whether the EVCacheValue envelope (hashed keys) is written using the compact binary format
// instead of native Java serialization.
final boolean useBinarySerialization = propertyRepository.get(_appName + ".envelope.binary.serialization.enabled", Boolean.class)

@shy-1234 shy-1234 Jun 29, 2026

Contributor

nit: should we get this property also in EVCacheTranscoder like the other 2 properties there?

@shy-1234 shy-1234 left a comment

Contributor

maybe just give a second look to the compression threshold thing. Overall LGTM!

3 participants