Add RPC auto resizing buffer memory control #17911

Open
jt2594838 wants to merge 9 commits into apache:master from Caideyipi:add_rpc_memory_control

@jt2594838

Contributor

Summary

  • add memory accounting for RPC AutoResizingBuffer allocation, growth, shrink, close, and reuse
  • wire auto_resizing_buffer_memory_proportion into node-commons memory management
  • disable AutoResizingBuffer memory control when the proportion is <= 0
  • add unit and integration coverage for allocation failure and request rejection paths
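
For readers unfamiliar with the approach, the accounting described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `BufferQuota` and `AccountedBuffer` are hypothetical names, and the sketch assumes that a limit of 0 or less disables the control, mirroring the behaviour described for `auto_resizing_buffer_memory_proportion`.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical shared quota; not an IoTDB class. */
final class BufferQuota {
  private final long limit; // assumption: limit <= 0 disables memory control
  private final AtomicLong used = new AtomicLong();

  BufferQuota(long limit) {
    this.limit = limit;
  }

  /** Reserves bytes atomically; returns false if the limit would be exceeded. */
  boolean tryReserve(long bytes) {
    if (limit <= 0) {
      return true; // control disabled: always allow and track nothing
    }
    while (true) {
      long current = used.get();
      if (current + bytes > limit) {
        return false;
      }
      if (used.compareAndSet(current, current + bytes)) {
        return true;
      }
    }
  }

  void release(long bytes) {
    if (limit > 0) {
      used.addAndGet(-bytes);
    }
  }

  long used() {
    return used.get();
  }
}

/** Hypothetical buffer that accounts allocation, growth, shrink, and close. */
final class AccountedBuffer implements AutoCloseable {
  private final BufferQuota quota;
  private byte[] array;
  private boolean closed;

  AccountedBuffer(BufferQuota quota, int initialCapacity) {
    if (!quota.tryReserve(initialCapacity)) {
      throw new IllegalStateException("buffer quota exhausted on allocation");
    }
    this.quota = quota;
    this.array = new byte[initialCapacity];
  }

  /** Grows or shrinks the buffer; growth is rejected when the quota is exhausted. */
  void resize(int newCapacity) {
    if (closed) {
      throw new IllegalStateException("buffer is closed");
    }
    int delta = newCapacity - array.length;
    if (delta > 0 && !quota.tryReserve(delta)) {
      throw new IllegalStateException("buffer quota exhausted on growth");
    }
    array = Arrays.copyOf(array, newCapacity);
    if (delta < 0) {
      quota.release(-delta); // shrinking returns the difference to the quota
    }
  }

  int capacity() {
    return array.length;
  }

  /** Idempotent: the accounted bytes are released exactly once. */
  @Override
  public void close() {
    if (closed) {
      return;
    }
    closed = true;
    quota.release(array.length);
    array = new byte[0];
  }
}
```

In this sketch a failed growth leaves the buffer at its previous capacity, so the caller can reject the request without corrupting the accounting.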

Tests

  • mvn spotless:apply -pl iotdb-core/node-commons
  • mvn test -pl iotdb-client/service-rpc -Dtest=AutoResizingBufferTest
  • mvn test -pl iotdb-core/node-commons -am -Dtest=CommonConfigTest -DfailIfNoTests=false -Dsurefire.failIfNoSpecifiedTests=false
  • mvn verify -DskipUTs -Drat.skip=true -Dit.test=IoTDBAutoResizingBufferMemoryIT#testGrowingRequestsAreRejectedWhenBufferMemoryIsExhausted -DfailIfNoTests=false -Dfailsafe.failIfNoSpecifiedTests=false -pl integration-test -am -P with-integration-tests

@Caideyipi

Collaborator

Thanks for adding the RPC AutoResizingBuffer memory control. I found a few issues that should be fixed before merge:

  1. P1: buffer memory accounting is not released when framed transports are closed. TElasticFramedTransport.close() currently only closes underlying (iotdb-client/service-rpc/src/main/java/org/apache/iotdb/rpc/TElasticFramedTransport.java:123), but the newly accounted readBuffer and writeBuffer are only released by AutoScalingBuffer*Transport.close(). The compressed transport also allocates writeCompressBuffer and readCompressBuffer (TCompressedElasticFramedTransport.java:41) with no close path. This means normal connection close leaks the accounted quota and will eventually reject later connections/requests.

  2. P1: constructor failure paths leak already allocated buffers. In TElasticFramedTransport (lines 94-99), readBuffer is allocated before writeBuffer; if the second allocation fails, the catch block wraps the exception but does not close the first buffer. TCompressedElasticFramedTransport has the same issue for the extra compression buffers. This can happen exactly when the memory quota is near exhaustion, so failed connection creation can consume quota permanently.

  3. P1: Utils.serializeTSStatus() creates an accounted temporary buffer and never closes it. iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/ratis/utils/Utils.java:192 creates an AutoScalingBufferWriteTransport, writes into it, and returns without releasing the transport. With this PR, each status serialization leaks TEMP_BUFFER_SIZE from the AutoResizingBuffer quota. Please use a try/finally or try-with-resources-style close path; if needed, copy the returned bytes before closing.

  4. P2: the ConfigNode RPC path appears to keep the default no-op memory control. The hook is registered only when MemoryConfig is initialized (iotdb-core/node-commons/src/main/java/org/apache/iotdb/commons/memory/MemoryConfig.java:33), but ConfigNodeRPCService also uses DeepCopyRpcTransportFactory (iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/service/thrift/ConfigNodeRPCService.java:79) and I could not find a MemoryConfig.global() / MemoryConfig.getInstance() initialization path under iotdb-core/confignode. If so, auto_resizing_buffer_memory_proportion does not control ConfigNode RPC buffers.

  5. P3: unrelated RAT exclusion. pom.xml adds AGENTS.md to the apache-rat exclude list, but this PR does not add that file. This is unrelated to the feature and relaxes license checking, so it should be removed from this PR.
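
To make items 1 and 2 concrete, here is a rough sketch of the pattern being asked for: a transport that owns two accounted buffers, closes the first if allocating the second fails, and releases both in `close()`. All class names (`SimpleQuota`, `QuotaBuffer`, `TwoBufferTransport`) are hypothetical stand-ins, not the real `TElasticFramedTransport` or `AutoResizingBuffer` code.

```java
/** Hypothetical quota that throws when a reservation does not fit. */
final class SimpleQuota {
  private final long limit;
  private long used;

  SimpleQuota(long limit) {
    this.limit = limit;
  }

  synchronized void reserve(long bytes) {
    if (used + bytes > limit) {
      throw new IllegalStateException("buffer quota exhausted");
    }
    used += bytes;
  }

  synchronized void release(long bytes) {
    used -= bytes;
  }

  synchronized long used() {
    return used;
  }
}

/** Hypothetical accounted buffer; close() is idempotent. */
final class QuotaBuffer implements AutoCloseable {
  private final SimpleQuota quota;
  private final int size;
  private boolean closed;

  QuotaBuffer(SimpleQuota quota, int size) {
    quota.reserve(size); // throws before any state is kept if the quota is full
    this.quota = quota;
    this.size = size;
  }

  @Override
  public void close() {
    if (!closed) {
      closed = true;
      quota.release(size);
    }
  }
}

/** Hypothetical transport owning a read buffer and a write buffer. */
final class TwoBufferTransport implements AutoCloseable {
  private final QuotaBuffer readBuffer;
  private final QuotaBuffer writeBuffer;

  TwoBufferTransport(SimpleQuota quota, int readSize, int writeSize) {
    QuotaBuffer read = new QuotaBuffer(quota, readSize);
    QuotaBuffer write;
    try {
      write = new QuotaBuffer(quota, writeSize);
    } catch (RuntimeException e) {
      read.close(); // item 2: do not leak the first buffer when the second fails
      throw e;
    }
    this.readBuffer = read;
    this.writeBuffer = write;
  }

  /** Item 1: closing the transport releases every accounted buffer. */
  @Override
  public void close() {
    try {
      readBuffer.close();
    } finally {
      writeBuffer.close(); // runs even if closing the read buffer throws
    }
  }
}
```

A transport with additional buffers, such as the compression buffers mentioned above, would need the same treatment for each extra allocation.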

I only ran a lightweight whitespace check locally: git diff --check review-pr-17911/base...review-pr-17911/head passed. I did not run the Maven tests.

@jt2594838

jt2594838 commented Jun 11, 2026

Contributor Author

Addressed the review comments in 9501c61631:

  1. TElasticFramedTransport.close() now releases the accounted read/write buffers, and compressed transports release their extra compression buffers as well.
  2. Constructor failure paths now close any buffers that were already allocated, covering both framed and compressed transports.
  3. Utils.serializeTSStatus() now copies the serialized bytes and closes the temporary AutoScalingBufferWriteTransport in a finally block.
  4. ConfigNode startup now initializes MemoryConfig, so the ConfigNode RPC path installs the AutoResizingBuffer memory-control hook.

Also added unit coverage for transport close/failure release paths in AutoResizingBufferTest.
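
For illustration only, the copy-then-close pattern from point 3 looks roughly like the sketch below. It uses try-with-resources rather than an explicit finally block, and `TempWriteBuffer` and `StatusSerializer` are hypothetical stand-ins for `AutoScalingBufferWriteTransport` and `Utils.serializeTSStatus()`; the static counter is an assumption used to make the accounting visible.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical accounted temporary buffer; not the real transport class. */
final class TempWriteBuffer implements AutoCloseable {
  /** Bytes currently accounted across all open temporary buffers. */
  static final AtomicLong OUTSTANDING = new AtomicLong();

  private byte[] array;
  private int position;

  TempWriteBuffer(int capacity) {
    array = new byte[capacity];
    OUTSTANDING.addAndGet(capacity);
  }

  void write(byte[] src) {
    if (position + src.length > array.length) {
      int newCapacity = Math.max(array.length * 2, position + src.length);
      OUTSTANDING.addAndGet(newCapacity - array.length); // account the growth
      array = Arrays.copyOf(array, newCapacity);
    }
    System.arraycopy(src, 0, array, position, src.length);
    position += src.length;
  }

  byte[] backingArray() {
    return array;
  }

  int position() {
    return position;
  }

  @Override
  public void close() {
    if (array != null) {
      OUTSTANDING.addAndGet(-array.length);
      array = null;
    }
  }
}

final class StatusSerializer {
  /** Serializes into a temporary buffer, copies the bytes out, then releases it. */
  static byte[] serialize(String status) {
    try (TempWriteBuffer buffer = new TempWriteBuffer(16)) {
      buffer.write(status.getBytes(StandardCharsets.UTF_8));
      // Copy before close: the returned array must not alias the released buffer.
      return Arrays.copyOf(buffer.backingArray(), buffer.position());
    }
  }
}
```

The return expression is evaluated before the resource is closed, so the caller receives an independent copy and the temporary buffer's accounted bytes are released on every path, including when `write` throws.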

Validation run:

  • mvn spotless:apply -pl iotdb-client/service-rpc,iotdb-core/consensus,iotdb-core/confignode
  • mvn test -pl iotdb-client/service-rpc -Dtest=AutoResizingBufferTest
  • mvn compile -pl iotdb-core/consensus,iotdb-core/confignode -am
  • git diff --check

2 participants