Port JSON handling from Boost.JSON to simdjson#309
Conversation
Replace boost::json across the viewer with simdjson and drop the boost-json vcpkg dependency entirely. - llsdjson: LlsdFromJson converts from simdjson::dom::element; LlsdFromJsonString parses text, with a padded_string overload that parses in place with no internal copy; LlsdToJson serializes LLSD straight to compact JSON text via simdjson's string_builder. Non-finite reals serialize as null; object keys now emit in sorted (LLSD map) order rather than insertion order. - llcorehttputil: JSON response handlers read the body into a padded_string (single full-body copy, no intermediate DOM); postJson/putJson serialize without building a DOM. - lljsonrpcws/llwebsocketmgr: send pre-serialized strings; drop the boost::json::value sendMessage overload. - lltranslate, llfloaterpreference, llappviewerwin32: extract fields directly from simdjson::dom (at_pointer mirrors the old find_pointer paths) with no tree materialization. llvelopack rewrites the release feed as a streaming dom walk + re-emit. - llvoicewebrtc: outgoing data-channel messages built with string_builder helpers; incoming voice data parsed with a reused dom parser. - gltf: Value is now simdjson::dom::element for reads; serialization streams through a new comma-managing JsonWriter over string_builder (LLSD remains forbidden in LL::GLTF); asset extras preserved as minified JSON text; parse failures warn and fail instead of throwing. - llglslshader/llviewerdisplay: manual shader profile dump builds LLSD; 64-bit sample counters stored as Real to avoid S32 overflow. Adds llcommon/tests/llsdjson_test.cpp (13 cases: round-trips, escaping, real precision, non-finite handling, hostile input, padded in-place parse). Adds simdjson to vcpkg.json/cmake and the license notices; removes the json component from Boost.cmake. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (5)
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis pull request replaces Boost.JSON with simdjson for JSON parsing and introduces a new JsonWriter class for GLTF serialization. It updates LLSD↔JSON conversion APIs, refactors all JSON consumers across HTTP, RPC, shader profiling, translation services, and WebRTC voice handling to use simdjson, and adds comprehensive test coverage. ChangesBoost.JSON to simdjson migration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
indra/newview/gltf/README.md (1)
34-41:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd a language tag to the fenced code block.
Line 34 starts a bare fenced block, which trips markdownlint MD040. Tag it as
cppto keep the docs lint-clean.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indra/newview/gltf/README.md` around lines 34 - 41, The fenced code block containing the C++ declarations (serialize(JsonWriter& obj) const and const Primitive& operator=(const Value& src)) needs a language tag to satisfy markdownlint; change the opening fence from ``` to ```cpp so the block is tagged as C++ and MD040 is resolved.Source: Linters/SAST tools
indra/newview/gltf/asset.cpp (1)
851-887:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReset the destination
Assetbefore these selectivecopy(...)calls.Line 854 only overwrites members that exist in the parsed document. Reusing one
Assetinstance for a second load will keep stalemScenes,mBuffers,mExtras, etc. whenever the new file omits those sections, which contradicts the load contract that prior contents are discarded.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indra/newview/gltf/asset.cpp` around lines 851 - 887, The assignment operator Asset::operator=(const Value& src) selectively copies fields but fails to clear the destination first, so missing keys in src leave stale state (mScenes, mBuffers, mExtras, etc.); fix by resetting all Asset members to their default/empty values at the start of operator= (e.g. reset mVersion, mMinVersion, mGenerator, mCopyright, mExtras and clear all container members like mScenes, mNodes, mMeshes, mMaterials, mBuffers, mBufferViews, mTextures, mSamplers, mImages, mAccessors, mAnimations, mSkins, mExtensionsUsed, mExtensionsRequired) before performing the selective copy so prior contents are discarded when keys are absent.indra/newview/llfloaterpreference.cpp (1)
801-808:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate and constrain skin folder names before building filesystem paths.
Line 801 uses manifest-controlled
namedirectly to buildpathname(Lines 802-808). In this ZIP-install path, that enables traversal-style names (../, separators, absolute prefixes) to target locations outside the skins directory. Restrict to a safe basename policy and verify the canonicalized target remains under the skins root beforemkdir/extract.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indra/newview/llfloaterpreference.cpp` around lines 801 - 808, manifest_member() currently returns a manifest-controlled name used directly to build pathname via gDirUtilp->add and passed to LLFile::mkdir/LLFile::isdir; sanitize this by rejecting or canonicalizing dangerous inputs (remove separators, strip leading "/" or drive letters, collapse ".." components) and enforce a safe basename policy (allow only alphanumerics, hyphen, underscore). After constructing the candidate pathname, resolve its canonical/absolute form and verify it is a descendant of the skins root directory (the original gDirUtilp->getOSUserAppDir() + "skins") before calling LLFile::mkdir or performing extraction; if validation fails, handle as an error. Ensure checks are applied wherever name, pathname, gDirUtilp->add, LLFile::isdir, or LLFile::mkdir are used in this flow.
🧹 Nitpick comments (1)
indra/llcommon/tests/llsdjson_test.cpp (1)
241-299: ⚡ Quick winAdd a regression for a valid-but-unrepresentable integer literal.
The failure cases here cover invalid JSON, but they don't pin what should happen when the JSON is syntactically valid and the numeric literal exceeds LLSD's representable range. A targeted case would catch the current
BIGINTpath accepting the parse while materializingTypeUndefined.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@indra/llcommon/tests/llsdjson_test.cpp` around lines 241 - 299, Add a regression that covers a syntactically valid but unrepresentable integer literal: in the llsdjson_object tests (e.g. near llsdjson_object::test<11> or as a new small test) call LlsdFromJsonString with a JSON top-level big integer like 9223372036854775808 (or an even larger decimal) into an LLSD out and assert the parser does not accept it (bool ok is false), errmsg is set (errmsg non-empty) and out.type() equals LLSD::TypeUndefined; use the existing symbols LlsdFromJsonString, LLSD, and LLSD::TypeUndefined to locate and implement the check so future regressions on the BIGINT path are caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@indra/llcommon/llsdjson.cpp`:
- Around line 57-66: The BIGINT branch currently calls val.get_double() and if
that fails leaves result undefined and LlsdFromJsonString() returning success;
change this so a failed get_double() is handled explicitly: inside the
simdjson::dom::element_type::BIGINT case (the block with val.get_double() and
LLSD(bigval)), check the return value and if it is not simdjson::SUCCESS either
(a) propagate the conversion failure out of LlsdFromJsonString() by setting
result to an explicit undefined/empty LLSD and returning false, or (b) choose an
explicit fallback (e.g., convert/store the integer as a string or use
int64/uint64 via val.get_int64()/val.get_uint64() and assign that to result) —
do not leave result uninitialized or let the function return success on
conversion failure. Ensure the code paths reference the same function names
(val.get_double(), LlsdFromJsonString(), and LLSD) so reviewers can find the
change.
In `@indra/newview/gltf/asset.cpp`:
- Around line 946-953: The save() routine currently writes the JSON buffer to an
llofstream without verifying the stream state; update save() to check that the
file was successfully opened (e.g., test file.is_open()) and that the write
completed (check stream state via file.good()/!file.fail() or flush+check) after
file.write(buffer.c_str(), buffer.size()), and return false on any failure
instead of always returning true; locate the code around the
JsonWriter/llofstream usage in save() (JsonWriter writer, serialize(writer),
std::string buffer, llofstream file) and add these checks and appropriate error
handling/return value.
- Around line 856-869: The asset loader preserves asset.extras into mExtras and
copyright into mCopyright but the asset writer (the code emitting the asset
block that currently only writes version, minVersion and generator) omits these
fields; update the asset-serialization code to also output "copyright" (from
mCopyright) and "extras" (from mExtras). When writing extras, emit the stored
minified JSON subtree rather than double-quoting it — i.e., if the writer
supports injecting raw JSON, insert mExtras as raw JSON; otherwise parse mExtras
into a DOM/value and serialize that subtree so the original extras structure is
preserved. Ensure you reference the same member names mExtras and mCopyright so
the saved JSON round-trips the original asset block.
In `@indra/newview/gltf/buffer_util.h`:
- Around line 514-524: The copy overload that fills an std::unordered_map
(copy(const Value& src, std::unordered_map<std::string, T>& dst)) is ignoring
the boolean result of the recursive copy<T> call and is inserting
default-constructed entries on failure; change the logic to first attempt copy
into a temporary T, and only emplace/move it into dst if copy<T>(member.value,
tmp) returns true, otherwise return false (propagate failures). Apply the same
fix to the other map/array-copy sites that call copy(...) for nested elements
(the other copy overloads around the same block) so they don't report success
when nested conversion fails, and update Material::AlphaMode parsing to return
false when the source is not a string rather than leaving the enum default and
returning true. Ensure all places that call copy(...) check and propagate the
boolean result instead of discarding it.
- Around line 616-635: copy_extensions currently loops over an index but always
calls _copy_extension(ext_obj, args...) with the full pack, so variadic pairs
are never advanced; change copy_extensions (the function template handling
Types... args) to recursively consume the argument pack like _write_extension
does: call the two-argument _copy_extension for the first (member, dst) pair and
then invoke copy_extensions(...) on the remaining pack, aggregating success
(e.g., success = _copy_extension(...) || copy_extensions(...)); ensure the base
case handles zero args and that you reference copy_extensions and
_copy_extension (and mirror the recursion style used in _write_extension) so
each pair is actually processed rather than reusing the full pack.
In `@indra/newview/lltranslate.cpp`:
- Around line 446-447: LLGoogleTranslationHandler::parseErrorResponse is using
the wrong JSON paths; update the simdjson pointer lookups to read
"/error/message" and "/error/code" instead of "/data/message" and "/data/code",
keep the same simdjson::SUCCESS checks, populate the existing message and code
variables on success, and preserve current fallback behavior to the raw
error_body when parsing fails.
In `@indra/newview/llvelopack.cpp`:
- Around line 98-123: The feed rewrite currently leaves the original JSON on
failure which allows absolute FileName values and stale sAssetUrlMap entries to
persist; in rewrite_asset_urls (and the similar block at 173-192) clear or
rebuild sAssetUrlMap for this feed before processing, and if a FileName member
is detected but the rewrite to a basename fails, abort and propagate an error
from rewrite_release_feed instead of returning the untouched JSON so callers
(e.g., velopack_download_pending_update) never see absolute FileName values;
alternatively, populate sAssetUrlMap entries mapping the absolute URL to its
basename via extract_basename so the downloader can resolve them, but do not
reuse previous global entries — ensure the map is per-feed or cleared before
each rewrite and make rewrite_release_feed return failure on rewrite errors.
In `@indra/newview/llvoicewebrtc.cpp`:
- Around line 3265-3267: The persisted replay path uses hardcoded 200 when
restoring per-user gain (in the user_gain.emplace_back call after
LLSpeakerVolumeStorage::getInstance()->getSpeakerVolume), which mismatches the
live slider path that uses PEER_GAIN_CONVERSION_FACTOR in
LLVoiceWebRTCConnection::setUserVolume(); change the restoration to use the same
PEER_GAIN_CONVERSION_FACTOR constant instead of 200 so stored volumes and live
slider changes produce identical gains.
---
Outside diff comments:
In `@indra/newview/gltf/asset.cpp`:
- Around line 851-887: The assignment operator Asset::operator=(const Value&
src) selectively copies fields but fails to clear the destination first, so
missing keys in src leave stale state (mScenes, mBuffers, mExtras, etc.); fix by
resetting all Asset members to their default/empty values at the start of
operator= (e.g. reset mVersion, mMinVersion, mGenerator, mCopyright, mExtras and
clear all container members like mScenes, mNodes, mMeshes, mMaterials, mBuffers,
mBufferViews, mTextures, mSamplers, mImages, mAccessors, mAnimations, mSkins,
mExtensionsUsed, mExtensionsRequired) before performing the selective copy so
prior contents are discarded when keys are absent.
In `@indra/newview/gltf/README.md`:
- Around line 34-41: The fenced code block containing the C++ declarations
(serialize(JsonWriter& obj) const and const Primitive& operator=(const Value&
src)) needs a language tag to satisfy markdownlint; change the opening fence
from ``` to ```cpp so the block is tagged as C++ and MD040 is resolved.
In `@indra/newview/llfloaterpreference.cpp`:
- Around line 801-808: manifest_member() currently returns a manifest-controlled
name used directly to build pathname via gDirUtilp->add and passed to
LLFile::mkdir/LLFile::isdir; sanitize this by rejecting or canonicalizing
dangerous inputs (remove separators, strip leading "/" or drive letters,
collapse ".." components) and enforce a safe basename policy (allow only
alphanumerics, hyphen, underscore). After constructing the candidate pathname,
resolve its canonical/absolute form and verify it is a descendant of the skins
root directory (the original gDirUtilp->getOSUserAppDir() + "skins") before
calling LLFile::mkdir or performing extraction; if validation fails, handle as
an error. Ensure checks are applied wherever name, pathname, gDirUtilp->add,
LLFile::isdir, or LLFile::mkdir are used in this flow.
---
Nitpick comments:
In `@indra/llcommon/tests/llsdjson_test.cpp`:
- Around line 241-299: Add a regression that covers a syntactically valid but
unrepresentable integer literal: in the llsdjson_object tests (e.g. near
llsdjson_object::test<11> or as a new small test) call LlsdFromJsonString with a
JSON top-level big integer like 9223372036854775808 (or an even larger decimal)
into an LLSD out and assert the parser does not accept it (bool ok is false),
errmsg is set (errmsg non-empty) and out.type() equals LLSD::TypeUndefined; use
the existing symbols LlsdFromJsonString, LLSD, and LLSD::TypeUndefined to locate
and implement the check so future regressions on the BIGINT path are caught.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a51b0081-097d-4b14-b27a-3074716eea8b
📒 Files selected for processing (37)
indra/CMakeLists.txtindra/cmake/Boost.cmakeindra/cmake/simdjson.cmakeindra/llcommon/CMakeLists.txtindra/llcommon/llsdjson.cppindra/llcommon/llsdjson.hindra/llcommon/tests/llsdjson_test.cppindra/llcorehttp/lljsonrpcws.cppindra/llcorehttp/llwebsocketmgr.cppindra/llcorehttp/llwebsocketmgr.hindra/llmessage/llcorehttputil.cppindra/llrender/llglslshader.cppindra/llrender/llglslshader.hindra/newview/gltf/README.mdindra/newview/gltf/accessor.cppindra/newview/gltf/accessor.hindra/newview/gltf/animation.cppindra/newview/gltf/animation.hindra/newview/gltf/asset.cppindra/newview/gltf/asset.hindra/newview/gltf/buffer_util.hindra/newview/gltf/common.hindra/newview/gltf/primitive.cppindra/newview/gltf/primitive.hindra/newview/licenses-linux.txtindra/newview/licenses-mac.txtindra/newview/licenses-win32.txtindra/newview/llappviewerwin32.cppindra/newview/llfloaterpreference.cppindra/newview/lltranslate.cppindra/newview/llvelopack.cppindra/newview/llviewerdisplay.cppindra/newview/llviewermenu.cppindra/newview/llviewerprecompiledheaders.hindra/newview/llvoicewebrtc.cppindra/newview/llvoicewebrtc.hindra/vcpkg.json
💤 Files with no reviewable changes (4)
- indra/newview/llvoicewebrtc.h
- indra/newview/llviewermenu.cpp
- indra/llcorehttp/llwebsocketmgr.cpp
- indra/llcorehttp/llwebsocketmgr.h
The Linux CI crash: range-for over get_object()/get_array() results called .value_unsafe() on the simdjson_result temporary, binding the loop range to a reference inside it. Range-for does not extend the inner temporary's lifetime (pre-C++23), so the handle dangled before begin() ran. MSVC's stack layout masked it; gcc -O2 segfaulted. Reproduced and verified the fix against simdjson 4.6.4 with gcc at -O2. Copy the dom::object/dom::array handle to a local before iterating (llsdjson.cpp object conversion, llvelopack feed rewrite). Review feedback (CodeRabbit): - Asset::serialize now emits copyright and the preserved extras subtree (raw JSON via new JsonWriter::rawValue). - Asset::save fails with a warning when the output file cannot be opened or written. - copy_extensions() recursively consumes its (member, dst) pairs, mirroring _write_extension, instead of the vestigial indexed loop. - Material::AlphaMode copy returns false on a non-string source. - LlsdFromJson BIGINT degrades to Real unconditionally (get_bigint digits via strtod when get_double cannot convert); note the branch is unreachable under the default parser configuration, which fails the parse with BIGINT_ERROR instead. - Google Translate error parsing reads /error/message and /error/code (the /data paths were inherited from the boost implementation but never matched the v2 REST error model). - llsdjson_test: regression for an oversized integer literal; README code fences tagged cpp. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Description
Replaces boost::json across the viewer with simdjson 4.6.4 and drops the
boost-jsonvcpkg dependency entirely. simdjson is already a transitive dependency of the pending fastgltf work, so the two efforts converge on one JSON library.Design: simdjson has no mutable DOM, so parse and serialize paths are split. Parsing uses
simdjson::domthrough its error-code interface; serialization streams JSON text throughsimdjson::builder::string_builder(shortest-round-trip float formatting, SIMD string escaping). No call site builds an intermediate tree it does not need.Per-area changes:
LlsdFromJson(dom::element),LlsdFromJsonString(...)— including apadded_stringoverload that parses in place with no internal copy — andLlsdToJson(LLSD) -> std::string.padded_string(single full-body copy, no intermediate DOM);postJsonAndSuspend/putJsonAndSuspendserialize without building a DOM.boost::json::valuesendMessage overload is removed.domfield extraction (at_pointermirrors the oldfind_pointerpaths) — no tree materialization. llvelopack rewrites the release feed as a streaming dom walk + re-emit.string_builderhelpers; incoming voice data parsed with a reused dom parser.Valueis nowsimdjson::dom::elementfor reads; serialization streams through a new comma-managingJsonWriteroverstring_builder. LLSD remains forbidden inLL::GLTF; asset extras are preserved as minified JSON text. README design notes updated.getViewerInfo()); 64-bit sample counters stored as Real to avoid S32 overflow.simdjsonadded to vcpkg.json +indra/cmake/simdjson.cmake(linked PUBLIC from llcommon);jsoncomponent removed from Boost.cmake; simdjson license added to the licenses files.Behavior changes reviewers should know:
null(JSON has no NaN/Infinity).Related Issues
Issue Link:
Checklist
Please ensure the following before requesting review:
Additional Notes
New test suite
indra/llcommon/tests/llsdjson_test.cpp(13 TUT cases: round-trips for every LLSD type, escaping/unicode, full-precision reals, non-finite handling, hostile input, padded in-place parse). llsd/llcorehttp/llmessage ctest suites pass.Runtime paths worth exercising in review: WebRTC voice (spatial updates, per-peer gain/mute, participant levels), GLTF mesh import (.gltf and .glb), Preferences > Skins (manifest parse + add-skin-from-zip), chat translation, Velopack update check, and a Develop-menu frame profile dump (output should be valid JSON).
🤖 Generated with Claude Code