feat: do another round of memory optimisations #316

Merged
vinitkumar merged 3 commits into master from refactor/memory-optimization-round-2 on Jun 9, 2026

@vinitkumar vinitkumar commented Jun 9, 2026

Owner

Summary by Sourcery

Stream XML serialization through a lightweight UTF-8 writer to reduce intermediate string allocations while preserving the public dicttoxml API.

New Features:

  • Introduce an internal _XMLWriter helper to accumulate UTF-8-encoded XML output as bytes.

Enhancements:

  • Add streaming variants of dict and list conversion helpers that append XML directly to the shared writer instead of building joined subtree strings.
  • Update dicttoxml XPath and normal code paths to use the streaming writer, reducing peak memory usage for large payloads.
  • Clarify architecture documentation to describe the new streaming writer and its role in normal and XPath serialization.
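The difference between the string-returning style and the append style can be sketched as follows. Both functions are hypothetical simplifications (no attributes, no type hints, list items always tagged `item`, dict keys assumed to be valid tag names) and are not the library's `convert*` or `_append_*` helpers.

```python
from xml.sax.saxutils import escape


def convert_joined(obj, tag: str) -> str:
    """String-returning style: every subtree is built and joined before use."""
    if isinstance(obj, dict):
        inner = "".join(convert_joined(v, k) for k, v in obj.items())
    elif isinstance(obj, list):
        inner = "".join(convert_joined(v, "item") for v in obj)
    else:
        inner = escape(str(obj))
    return f"<{tag}>{inner}</{tag}>"


def append_converted(obj, tag: str, out: bytearray) -> None:
    """Append style: fragments go straight into the shared byte buffer."""
    out.extend(f"<{tag}>".encode("utf-8"))
    if isinstance(obj, dict):
        for k, v in obj.items():
            append_converted(v, k, out)
    elif isinstance(obj, list):
        for v in obj:
            append_converted(v, "item", out)
    else:
        out.extend(escape(str(obj)).encode("utf-8"))
    out.extend(f"</{tag}>".encode("utf-8"))


data = {"a": 1, "b": [2, "x<y"]}
joined = convert_joined(data, "root")
buffer = bytearray()
append_converted(data, "root", buffer)
```

In the joined version each nesting level holds a complete copy of its subtree as a string until the parent finishes; in the append version only small per-element fragments exist outside the buffer.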

sourcery-ai Bot commented Jun 9, 2026

Contributor

Reviewer's Guide

Refactors the JSON-to-XML serializer to stream UTF-8 bytes through a new _XMLWriter and append-style helpers instead of building large intermediate strings. The change applies to both normal dicttoxml output and XPath 3.1 json-to-xml mode, and the architecture docs are updated accordingly.

File-Level Changes

Change:
Introduce a streaming UTF-8 writer and append-style conversion helpers so dicttoxml and XPath modes emit bytes incrementally instead of assembling large intermediate strings.

Details:
  • Add _XMLWriter helper that buffers UTF-8-encoded bytes and exposes write()/to_bytes().
  • Add _append_xpath31 that recursively serializes XPath 3.1 json-to-xml structures directly into an _XMLWriter, including optional namespace emission for top-level map/array or wrapped scalars.
  • Introduce _append_convert and its helpers (_append_convert_dict, _append_convert_list, _append_dict2xml_str, _append_list2xml_str, _append_rawitem) that mirror existing convert*/dict2xml_str/list2xml_str logic but write directly to an _XMLWriter instead of returning joined strings.
  • Refactor dicttoxml(xpath_format=True) to build the XPath document by streaming into _XMLWriter based on get_xpath31_tag_name(obj) instead of using convert_to_xpath31 and string replace tricks for xmlns injection.
  • Refactor dicttoxml normal path to construct the XML declaration and optional root wrapper by streaming into _XMLWriter, using _append_convert for the body, and returning output.to_bytes() for both rooted and rootless output.
  • Update architecture documentation to describe dicttoxml’s new streaming behavior via _XMLWriter and clarify that public helpers still return strings while library/CLI paths now write bytes incrementally.

Files:
  • json2xml/dicttoxml.py
  • lat.md/architecture.md
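The XPath 3.1 item above replaces post-hoc string replacement with namespace emission at write time. A rough sketch of that idea, assuming a plain `bytearray` as the sink: the function name `append_xpath31`, its `key` and `emit_ns` parameters, and the handling of top-level scalars are all hypothetical here, and only the element names and namespace URI follow the XPath 3.1 json-to-xml vocabulary.

```python
from xml.sax.saxutils import escape, quoteattr

XPATH_NS = "http://www.w3.org/2005/xpath-functions"


def append_xpath31(obj, out: bytearray, key=None, emit_ns: bool = False) -> None:
    """Hypothetical sketch: write one json-to-xml element into `out`."""
    if obj is None:
        tag, text = "null", None
    elif isinstance(obj, bool):  # bool must be tested before int
        tag, text = "boolean", "true" if obj else "false"
    elif isinstance(obj, (int, float)):
        tag, text = "number", str(obj)
    elif isinstance(obj, str):
        tag, text = "string", escape(obj)
    elif isinstance(obj, dict):
        tag, text = "map", None
    elif isinstance(obj, list):
        tag, text = "array", None
    else:
        raise TypeError(f"unsupported type: {type(obj).__name__}")

    # The namespace is written once, on the top-level element only,
    # so no later str.replace() over the finished document is needed.
    attrs = f' xmlns="{XPATH_NS}"' if emit_ns else ""
    if key is not None:
        attrs += f" key={quoteattr(str(key))}"

    if tag == "null":
        out.extend(f"<{tag}{attrs}/>".encode("utf-8"))
        return
    out.extend(f"<{tag}{attrs}>".encode("utf-8"))
    if tag == "map":
        for k, v in obj.items():
            append_xpath31(v, out, key=k)
    elif tag == "array":
        for v in obj:
            append_xpath31(v, out)
    else:
        out.extend(text.encode("utf-8"))
    out.extend(f"</{tag}>".encode("utf-8"))


doc = bytearray()
append_xpath31({"name": "A&B", "tags": [1, True, None]}, doc, emit_ns=True)
```

Because the caller decides `emit_ns` only for the outermost call, nested elements never carry the attribute and the document is scanned zero extra times.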

codecov Bot commented Jun 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (389eb56) to head (b06ca10).

Additional details and impacted files
@@            Coverage Diff            @@
##            master      #316   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            6         6           
  Lines          569       609   +40     
=========================================
+ Hits           569       609   +40     
Flag Coverage Δ
unittests 100.00% <100.00%> (ø)

@sourcery-ai sourcery-ai Bot left a comment

Contributor

Hey - I've left some high-level feedback:

  • The new streaming helpers (e.g. _append_convert, _append_convert_dict, _append_convert_list, _append_xpath31) duplicate a lot of the branching and type-specific logic from the existing convert_* functions; consider factoring shared cases into common helpers so the string-returning and streaming paths stay behaviorally aligned over time.
  • Several _append_* helpers mutate the attr dict in place when attr_type is true (e.g. attr["type"] = ... in _append_dict2xml_str and _append_list2xml_str); if the same attr instance can be reused across calls, it may be safer to copy or construct a fresh dict before mutation to avoid subtle attribute leakage between elements.
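The attribute-leak concern in the second point can be illustrated with a small, self-contained sketch. Both functions below are hypothetical stand-ins, not the actual `_append_dict2xml_str` or `_append_list2xml_str`, and they skip escaping for brevity.

```python
def element_mutating(tag: str, value, attr: dict, attr_type: bool = True) -> str:
    """Writes the type into the caller's dict, as the review describes."""
    if attr_type:
        attr["type"] = type(value).__name__  # in-place mutation of shared state
    attrs = "".join(f' {k}="{v}"' for k, v in attr.items())
    return f"<{tag}{attrs}>{value}</{tag}>"


def element_copying(tag: str, value, attr: dict, attr_type: bool = True) -> str:
    """Works on a shallow copy, so the caller's dict is left untouched."""
    local_attr = dict(attr)
    if attr_type:
        local_attr["type"] = type(value).__name__
    attrs = "".join(f' {k}="{v}"' for k, v in local_attr.items())
    return f"<{tag}{attrs}>{value}</{tag}>"


shared = {"id": "1"}
element_mutating("a", 1, shared)
# The earlier call left "type" behind, so it leaks into the next element.
leaked = element_mutating("b", "x", shared, attr_type=False)

fresh = {"id": "1"}
element_copying("a", 1, fresh)
clean = element_copying("b", "x", fresh, attr_type=False)
```

Whether the real helpers ever receive a reused `attr` instance depends on their callers, which is exactly the condition the review asks the author to check.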

@vinitkumar vinitkumar merged commit 9463457 into master Jun 9, 2026
48 checks passed