SplotyCode/FaaastJSON

FaaastJSON

Note

Old project, not maintained anymore. It ran in production from 2022 to 2025.

The benchmarks are from 2022; see the pom.xml files for the library versions used.

With the latest native vector improvements in Java (Project Panama), we could try a SIMD approach like the one in Daniel Lemire's well-known talk "Parsing JSON Really Quickly".

A schema-specific JSON binding library for the JVM. Instead of a generic tokenizer and a tree model, it generates a dedicated reader and writer for each type and lets you trade away the checks you don't need. Because each type gets its own generated codec, C2 can compile and inline that type's read/write path on its own, which makes the branches that remain more predictable.

Read and write throughput, small record

When reading and writing small records it is faster than Jackson, Gson, fastjson and dsl-json, and you can go further by telling it what you already know about the JSON.

Configure guarantees

Declare what you already know, and the generated codec drops the matching work.

// read fields by position, no key matching
FaaastJson.builder(User.class).assumeCanonicalShape().compile();

// plain scan to the quote, no escape/UTF-8 handling
FaaastJson.builder(Profile.class).assumeAsciiStrings().compile();

FaaastJson.builder(User.class)
    .optimistic()                  // preset as a baseline
    .assumeAsciiStrings()          // plain scan to the quote, no escape/UTF-8 handling (same as trust(Axis.STRING_ESCAPES, Axis.UTF8))
    .require(FIELD_ORDER)          // verify the assumption, throw if it's wrong
    .check(NUMBER_FORMAT)          // validate fully
    .compile();
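To make "read fields by position, no key matching" concrete, here is a minimal hand-written sketch of the idea for a record shaped like {"id":42,"qty":3}. It is not FaaastJSON's generated code: the class PositionalReader, its methods, and the record shape are hypothetical, and it assumes compact input with non-negative integers. The trusting variant jumps over the key bytes by length; the requiring variant compares them and throws on a mismatch.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of positional field reads; not FaaastJSON's generated code. */
final class PositionalReader {
    // Key bytes as they appear in compact, canonically ordered input.
    private static final byte[] OPEN_ID = "{\"id\":".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] THEN_QTY = ",\"qty\":".getBytes(StandardCharsets.US_ASCII);

    private final byte[] in;
    private final boolean verify;
    private int pos;

    private PositionalReader(byte[] in, boolean verify) {
        this.in = in;
        this.verify = verify;
    }

    /** Trusts the shape: key bytes are jumped over without being compared. */
    static long[] readTrusting(byte[] in) {
        return new PositionalReader(in, false).read();
    }

    /** Verifies the shape: key bytes are compared and a mismatch throws. */
    static long[] readRequiring(byte[] in) {
        return new PositionalReader(in, true).read();
    }

    private long[] read() {
        key(OPEN_ID);
        long id = number();
        key(THEN_QTY);
        long qty = number();
        return new long[] {id, qty};
    }

    private void key(byte[] expected) {
        if (verify) {
            if (pos + expected.length > in.length) {
                throw new IllegalArgumentException("input too short at " + pos);
            }
            for (int i = 0; i < expected.length; i++) {
                if (in[pos + i] != expected[i]) {
                    throw new IllegalArgumentException("unexpected key at " + pos);
                }
            }
        }
        pos += expected.length; // the position is known up front, so no key lookup is needed
    }

    private long number() {
        long value = 0;
        while (pos < in.length && in[pos] >= '0' && in[pos] <= '9') {
            value = value * 10 + (in[pos] - '0');
            pos++;
        }
        return value;
    }
}
```

A generated codec would presumably emit two separate code paths rather than branch on a flag; the flag only keeps the sketch short. On reordered input the trusting variant silently returns garbage, which is the price of skipping the comparison.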

What each level does:

  • CHECK: fully validate. The strict default.
  • ASSUME_WITH_FALLBACK: try the fast path and fall back to the safe one when the input doesn't match. This can be a lot slower than CHECK if the fallback is taken often.
  • REQUIRE: take the fast path but verify the assumption, and throw if it's wrong.
  • TRUST: skip the check entirely. This is a promise, so the result for wrong input is undefined for that axis.
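The four levels can be pictured on a single axis, here "strings are plain ASCII without escapes". The sketch below is hand-written for illustration and is not the library's generated code; the class StringScanLevels and its method names are hypothetical. Each method reads a string body starting just after the opening quote, and the check variant is a simplified slow path (it decodes UTF-8 and escapes but is not a full validator).

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the four levels on the string axis. */
final class StringScanLevels {

    /** TRUST: scan to the quote and nothing else. Wrong input gives wrong output. */
    static String trust(byte[] in, int start) {
        int end = start;
        while (in[end] != '"') end++;
        return ascii(in, start, end);
    }

    /** REQUIRE: same scan, but throw when the assumption does not hold. */
    static String require(byte[] in, int start) {
        int end = start;
        while (in[end] != '"') {
            if (in[end] == '\\' || in[end] < 0) { // negative byte means non-ASCII
                throw new IllegalArgumentException("escape or non-ASCII byte at " + end);
            }
            end++;
        }
        return ascii(in, start, end);
    }

    /** ASSUME_WITH_FALLBACK: same scan, but restart on the slow path on a miss. */
    static String assumeWithFallback(byte[] in, int start) {
        int end = start;
        while (in[end] != '"') {
            if (in[end] == '\\' || in[end] < 0) {
                return check(in, start); // work done so far is thrown away
            }
            end++;
        }
        return ascii(in, start, end);
    }

    /** CHECK-like slow path: honours escapes and decodes UTF-8 (simplified). */
    static String check(byte[] in, int start) {
        int end = start;
        while (in[end] != '"') end += (in[end] == '\\') ? 2 : 1;
        String raw = new String(in, start, end - start, StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c != '\\') { out.append(c); continue; }
            char e = raw.charAt(++i);
            switch (e) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case 'u': // four hex digits follow
                    out.append((char) Integer.parseInt(raw.substring(i + 1, i + 5), 16));
                    i += 4;
                    break;
                default: out.append(e); // quote, backslash, slash
            }
        }
        return out.toString();
    }

    private static String ascii(byte[] in, int start, int end) {
        // ISO-8859-1 maps each byte to one char, which equals ASCII for ASCII input.
        return new String(in, start, end - start, StandardCharsets.ISO_8859_1);
    }
}
```

The fallback variant shows why frequent misses hurt: every miss pays for the abandoned fast scan plus the full slow path.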

A quick feature tour

Codecs — one per type; build once, reuse (thread-safe):

JsonCodec<User> codec = FaaastJson.builder(User.class).compile();
byte[] json = codec.write(user);              // or write(user, buffer, off) straight into your byte[]
User u = codec.read(json);
codec.readInto(json, 0, json.length, u);      // reuse the whole object graph with zero allocation

JSON walker

JsonExtract doc = JsonExtract.over(json);
String       city   = doc.getString("$.address.city");
List<String> firsts = doc.getStrings("$.persons[*].name.first");   // one from every person
int[]        ages   = doc.getInts("$.persons[*].age");

Or stream a whole document with a visitor that auto-skips the subtrees you never read:

JsonWalker.walk(json).asObject(o -> o
    .field("id",    f -> order.id = f.asLong())
    .field("lines", f -> f.asArray(line -> line.asObject(l -> l
        .field("sku", s -> skus.add(s.asString()))))));   // "debug" etc. left unread are skipped, not parsed
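Skipping an unread subtree without parsing it can be done by tracking only nesting depth and string boundaries. The following is a minimal sketch of that technique, not the walker's actual implementation; the class SubtreeSkip and its methods are hypothetical, and it assumes well-formed input (malformed input may run off the end of the array).

```java
/** Hypothetical sketch: find the end of a JSON value without decoding it. */
final class SubtreeSkip {

    /** Returns the index just past the value that starts at pos. */
    static int skipValue(byte[] in, int pos) {
        byte first = in[pos];
        if (first == '"') {
            return skipString(in, pos);
        }
        if (first == '{' || first == '[') {
            int depth = 0;
            int i = pos;
            while (true) {
                byte b = in[i];
                if (b == '"') {
                    // Brackets inside strings must not affect the depth.
                    i = skipString(in, i);
                    continue;
                }
                if (b == '{' || b == '[') {
                    depth++;
                } else if (b == '}' || b == ']') {
                    depth--;
                    if (depth == 0) return i + 1;
                }
                i++;
            }
        }
        // Number, true, false or null: runs until a delimiter or whitespace.
        int i = pos;
        while (i < in.length && in[i] != ',' && in[i] != '}' && in[i] != ']' && in[i] > ' ') {
            i++;
        }
        return i;
    }

    /** pos points at the opening quote; returns the index after the closing quote. */
    static int skipString(byte[] in, int pos) {
        int i = pos + 1;
        while (in[i] != '"') {
            if (in[i] == '\\') i++; // step over the escaped character
            i++;
        }
        return i + 1;
    }
}
```

No numbers are converted and no strings are decoded along the way, which is where the saving comes from.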

Lazy objects: declare an interface and you get a parse-on-access view that decodes only what you touch. Currently these views are read-only (but you can write them back).

interface Order { long id(); Customer customer(); List<Line> lines(); }
interface Line  { String sku(); int qty(); }

Order o = FaaastJson.builder(Order.class).compile().read(json);
String sku = o.lines().get(0).sku();   // only this path is decoded; customer() stays untouched bytes

Polymorphic types

FaaastJson.builder(Shape.class).discriminator("kind")
    .subtype("circle", Circle.class).subtype("square", Square.class).compile();

Adapters map types you can't annotate (e.g. UUID, Instant) to a JSON value.
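As a rough picture of what such an adapter does, the sketch below maps UUID and Instant to and from their string forms using only the JDK. The interface ValueAdapter and the class Adapters are hypothetical names chosen for illustration; the library's real adapter interface and the signature of adapter(...) are not shown in this README and may differ.

```java
import java.time.Instant;
import java.util.UUID;

/** Hypothetical adapter shape; the library's real interface may differ. */
interface ValueAdapter<T> {
    /** The text that would be written as a JSON string value. */
    String toJsonString(T value);

    /** Rebuilds the value from the text of a JSON string value. */
    T fromJsonString(String text);
}

final class Adapters {
    static final ValueAdapter<UUID> UUID_ADAPTER = new ValueAdapter<UUID>() {
        @Override public String toJsonString(UUID value) { return value.toString(); }
        @Override public UUID fromJsonString(String text) { return UUID.fromString(text); }
    };

    static final ValueAdapter<Instant> INSTANT_ADAPTER = new ValueAdapter<Instant>() {
        // Instant.toString() emits ISO-8601, which Instant.parse() reads back.
        @Override public String toJsonString(Instant value) { return value.toString(); }
        @Override public Instant fromJsonString(String text) { return Instant.parse(text); }
    };
}
```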

Compile-time codecs: @GenerateCodec emits the codec at build time with no runtime ASM or Unsafe, so it also works under GraalVM native image.

Lazy strings: a CharSequence field that points at the original bytes and decodes on demand.
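The lazy-string idea can be sketched as a CharSequence that keeps a reference to the source bytes plus an offset and length, and only builds a String on first use. This is an illustrative sketch under the assumption that the range contains no escapes; the class LazyUtf8String is hypothetical and not the library's type.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of a decode-on-demand string view over raw JSON bytes. */
final class LazyUtf8String implements CharSequence {
    private final byte[] source; // the original document, not copied
    private final int offset;    // first byte of the string body
    private final int length;    // number of bytes in the string body
    private String decoded;      // filled in on first access

    LazyUtf8String(byte[] source, int offset, int length) {
        this.source = source;
        this.offset = offset;
        this.length = length;
    }

    private String decode() {
        if (decoded == null) {
            decoded = new String(source, offset, length, StandardCharsets.UTF_8);
        }
        return decoded;
    }

    /** True once the bytes have been decoded; useful to observe the laziness. */
    boolean isDecoded() {
        return decoded != null;
    }

    @Override public int length() { return decode().length(); }
    @Override public char charAt(int index) { return decode().charAt(index); }
    @Override public CharSequence subSequence(int start, int end) { return decode().subSequence(start, end); }
    @Override public String toString() { return decode(); }
}
```

A field that is parsed but never read costs only the three stored values; the trade-off is that the view keeps the whole source array reachable.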

Annotations are optional

If you own the type you can tune it with annotations: @JsonName, @JsonIgnore, @JsonOrder, @JsonField, @JsonCreator, @FixedWidth, and @GenerateCodec. But you never need them. For a foreign class you can't change, configure it on the builder instead: adapter(...) to map a whole type, forType(...) / forField(...) for per-type and per-field options, and subtype(...) for polymorphism.

Contribute

A plan IR describes how to encode and decode a type; three backends consume it: a reflective fallback, a runtime ASM generator, and a compile-time annotation processor. The generated reader keeps parser state in locals over the raw byte[], and the writer can size the output exactly, e.g. for writing straight into an array.
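Exact output sizing is possible when the schema is known: the key bytes are constants, so only the variable parts need measuring. The sketch below shows the idea for a record shaped like {"id":1234,"qty":7}. It is hand-written for illustration, not generated output; the class ExactSizeWriter and the record shape are hypothetical, and it handles non-negative integers only.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: compute the exact size, then write straight into a byte[]. */
final class ExactSizeWriter {
    private static final byte[] OPEN_ID = "{\"id\":".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] THEN_QTY = ",\"qty\":".getBytes(StandardCharsets.US_ASCII);

    /** Exact number of bytes write(...) will produce for these values. */
    static int sizeOf(long id, long qty) {
        return OPEN_ID.length + digits(id) + THEN_QTY.length + digits(qty) + 1; // +1 for '}'
    }

    /** Writes the record at out[off..] and returns the index after the last byte. */
    static int write(long id, long qty, byte[] out, int off) {
        off = copy(OPEN_ID, out, off);
        off = writeDigits(id, out, off);
        off = copy(THEN_QTY, out, off);
        off = writeDigits(qty, out, off);
        out[off] = '}';
        return off + 1;
    }

    private static int digits(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("sketch handles non-negative values only");
        }
        int count = 1;
        while (value >= 10) {
            value /= 10;
            count++;
        }
        return count;
    }

    private static int writeDigits(long value, byte[] out, int off) {
        int end = off + digits(value);
        for (int i = end - 1; i >= off; i--) { // fill from the least significant digit
            out[i] = (byte) ('0' + (value % 10));
            value /= 10;
        }
        return end;
    }

    private static int copy(byte[] src, byte[] out, int off) {
        System.arraycopy(src, 0, out, off, src.length);
        return off + src.length;
    }
}
```

A caller can allocate new byte[sizeOf(id, qty)] once and write into it with no growth or trimming afterwards.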

Two build-time guards keep the generated and hot code honest:

  • Hot-path size: each kernel is pinned to its compiled bytecode size (@HotPath(bytecodeSize = N)). HotPathBudgetTest recomputes every one and fails on any change, so a change that could push a method out of HotSpot's inlining range shows up in review.
  • Golden bytecode: generated codecs are disassembled to readable mnemonics and diffed against a committed *.bc.txt, so a PR shows exactly which instructions changed (-Dgolden.update=true to accept).

Build it (Java 8, Maven):

mvn clean install
java -jar faaastjson-benchmarks/target/benchmarks.jar LibraryShootoutBenchmark

Requirements

Java 8+. Maven.

About

A binding library for the JVM that generates bytecode for a dedicated reader and writer for each type
