Bugfix refill#55
Conversation
📝 WalkthroughWalkthroughChanged Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(135,206,235,0.5)
participant Client
end
rect rgba(144,238,144,0.5)
participant Manager
end
rect rgba(255,182,193,0.5)
participant RDMA_Lib
end
Client->>Manager: send request / wait for reply
Manager->>RDMA_Lib: poll_wc() -> receives ibv_wc
Manager->>RDMA_Lib: receive_wcs().update_requests(-1)
Manager->>RDMA_Lib: receive_wcs().refill(threshold=-1)
RDMA_Lib-->>Manager: posts receives as needed
Manager-->>Client: propagate response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@rdmalib/lib/queue.cpp`:
- Around line 145-154: The replenish logic in RecvWorkCompletions::replenish
resets _requests to _rcv_buf_size even if post_batched_empty_recv(...) fails,
corrupting queue accounting; modify replenish to check the return/result of
post_batched_empty_recv(qty) and only set _requests = _rcv_buf_size when that
call succeeds, otherwise leave _requests unchanged (and propagate/log the
failure/return false) so receive-queue depletion is not hidden; locate this
behavior in RecvWorkCompletions::replenish and the helper
post_batched_empty_recv to add the success check and appropriate error handling.
In `@rfaas/lib/executor.cpp`:
- Line 41: The initializer that constructs _state uses a narrowing cast of
numcores to int16_t which can overflow; remove that cast and ensure the std::max
comparison is done in a sufficiently wide integer type (e.g. promote
dev.default_receive_buffer_size to int or use std::max<int>) so you compare two
ints and then add 1, then pass the result (converted if the _state constructor
expects a narrower type) — update the expression in the _state(...) initializer
in executor.cpp to perform the max in an int-sized type instead of casting
numcores to int16_t.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9d393705-3303-4e88-8d40-41614520face
📒 Files selected for processing (7)
rdmalib/include/rdmalib/queue.hpprdmalib/lib/queue.cpprdmalib/lib/rdmalib.cpprfaas/include/rfaas/executor.hpprfaas/lib/executor.cppserver/executor_manager/manager.cppserver/executor_manager/manager.hpp
| bool RecvWorkCompletions::replenish() | ||
| { | ||
| if (_requests < _rcv_buf_size) { | ||
| SPDLOG_DEBUG("Post {} requests to buffer at QP {}", _rcv_buf_size - _requests, fmt::ptr(this->qp())); | ||
| this->post_batched_empty_recv(_rcv_buf_size - _requests); | ||
| _requests = _rcv_buf_size; | ||
| return true; | ||
| } | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Don’t mark requests as replenished when posting fails.
post_batched_empty_recv(...) can fail, but _requests is still reset to full. That corrupts queue accounting and can hide receive-queue depletion.
🔧 Proposed fix
bool RecvWorkCompletions::replenish()
{
- if (_requests < _rcv_buf_size) {
- SPDLOG_DEBUG("Post {} requests to buffer at QP {}", _rcv_buf_size - _requests, fmt::ptr(this->qp()));
- this->post_batched_empty_recv(_rcv_buf_size - _requests);
- _requests = _rcv_buf_size;
- return true;
- }
- return false;
+ if(_requests < 0) {
+ _requests = 0;
+ }
+
+ if(_requests < _rcv_buf_size) {
+ const int missing = _rcv_buf_size - _requests;
+ SPDLOG_DEBUG("Post {} requests to buffer at QP {}", missing, fmt::ptr(this->qp()));
+ if(this->post_batched_empty_recv(missing) < 0) {
+ return false;
+ }
+ _requests = _rcv_buf_size;
+ return true;
+ }
+ return false;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bool RecvWorkCompletions::replenish() | |
| { | |
| if (_requests < _rcv_buf_size) { | |
| SPDLOG_DEBUG("Post {} requests to buffer at QP {}", _rcv_buf_size - _requests, fmt::ptr(this->qp())); | |
| this->post_batched_empty_recv(_rcv_buf_size - _requests); | |
| _requests = _rcv_buf_size; | |
| return true; | |
| } | |
| return false; | |
| } | |
| bool RecvWorkCompletions::replenish() | |
| { | |
| if(_requests < 0) { | |
| _requests = 0; | |
| } | |
| if(_requests < _rcv_buf_size) { | |
| const int missing = _rcv_buf_size - _requests; | |
| SPDLOG_DEBUG("Post {} requests to buffer at QP {}", missing, fmt::ptr(this->qp())); | |
| if(this->post_batched_empty_recv(missing) < 0) { | |
| return false; | |
| } | |
| _requests = _rcv_buf_size; | |
| return true; | |
| } | |
| return false; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rdmalib/lib/queue.cpp` around lines 145 - 154, The replenish logic in
RecvWorkCompletions::replenish resets _requests to _rcv_buf_size even if
post_batched_empty_recv(...) fails, corrupting queue accounting; modify
replenish to check the return/result of post_batched_empty_recv(qty) and only
set _requests = _rcv_buf_size when that call succeeds, otherwise leave _requests
unchanged (and propagate/log the failure/return false) so receive-queue
depletion is not hidden; locate this behavior in RecvWorkCompletions::replenish
and the helper post_batched_empty_recv to add the success check and appropriate
error handling.
|
|
||
| executor::executor(const std::string& address, int port, int numcores, int memory, int lease_id, device_data & dev): | ||
| _state(dev.ip_address, dev.port, dev.default_receive_buffer_size + 1), | ||
| _state(dev.ip_address, dev.port, std::max(dev.default_receive_buffer_size, (int16_t)numcores) + 1), |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect device_data definition and default_receive_buffer_size type.
rg -n "struct device_data|class device_data|default_receive_buffer_size" -C3 --type=cpp --type=hpp
# 2) Inspect executor constructor signature and usage of numcores.
rg -n "executor\\(const std::string& address, int port, int numcores" -C2 --type=cpp --type=hpp
rg -n "std::max\\(.*numcores|\\(int16_t\\)numcores|_numcores" -C2 --type=cpp --type=hpp
# 3) Check for any explicit input validation/range checks for numcores.
rg -nP "\\bnumcores\\b.*(<=|>=|<|>|assert|expect_|throw)" -C2 --type=cpp --type=hppRepository: spcl/rFaaS
Length of output: 81
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect device_data definition and default_receive_buffer_size type.
echo "=== Searching for device_data definition ==="
rg -n "struct device_data|class device_data|default_receive_buffer_size" -C3
# 2) Inspect executor constructor signature and usage of numcores.
echo "=== Searching for executor constructor ==="
rg -n "executor\(const std::string& address, int port, int numcores" -C2
echo "=== Searching for numcores usage with std::max or int16_t cast ==="
rg -n "std::max.*numcores|\(int16_t\)numcores|_numcores" -C2
# 3) Check for any explicit input validation/range checks for numcores.
echo "=== Searching for numcores range checks ==="
rg -n "numcores.*(<=|>=|<|>|assert)" -C2Repository: spcl/rFaaS
Length of output: 15755
Remove explicit narrowing cast from buffer-size calculation.
Casting numcores (int) to int16_t causes integer overflow if numcores exceeds 32,767, producing an undersized receive buffer.
💡 Proposed fix
- _state(dev.ip_address, dev.port, std::max(dev.default_receive_buffer_size, (int16_t)numcores) + 1),
+ _state(dev.ip_address, dev.port, std::max<int>(dev.default_receive_buffer_size, numcores) + 1),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| _state(dev.ip_address, dev.port, std::max(dev.default_receive_buffer_size, (int16_t)numcores) + 1), | |
| _state(dev.ip_address, dev.port, std::max<int>(dev.default_receive_buffer_size, numcores) + 1), |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@rfaas/lib/executor.cpp` at line 41, The initializer that constructs _state
uses a narrowing cast of numcores to int16_t which can overflow; remove that
cast and ensure the std::max comparison is done in a sufficiently wide integer
type (e.g. promote dev.default_receive_buffer_size to int or use std::max<int>)
so you compare two ints and then add 1, then pass the result (converted if the
_state constructor expects a narrower type) — update the expression in the
_state(...) initializer in executor.cpp to perform the max in an int-sized type
instead of casting numcores to int16_t.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
rfaas/include/rfaas/client.hpp (1)
45-51:⚠️ Potential issue | 🔴 CriticalIncorrect decrement value causes
_requeststracking inconsistency.
poll_wc()can return multiple completions (up to 32), as evidenced by the warning check on line 49. However,update_requests(-1)always decrements by 1, not by the actualresponse_count. This under-decrements_requestswhen N > 1 completions are returned, causing the refill logic to malfunction over time.The correct call should use
-response_count:🐛 Proposed fix
auto [responses, response_count] = _resource_mgr.connection().poll_wc(rdmalib::QueueType::RECV, true); - _resource_mgr.connection().receive_wcs().update_requests(-1); + _resource_mgr.connection().receive_wcs().update_requests(-response_count); _resource_mgr.connection().receive_wcs().refill();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@rfaas/include/rfaas/client.hpp` around lines 45 - 51, The decrement of the pending request counter is wrong: when poll_wc returns multiple completions you must decrement by the actual response_count instead of 1; update the call to receive_wcs().update_requests(...) to pass -response_count (using the response_count variable returned from poll_wc) so the _requests tracking stays consistent and refill() behaves correctly; locate this in the block that calls _resource_mgr.connection().poll_wc(...), _resource_mgr.connection().receive_wcs().update_requests(...), and _resource_mgr.connection().receive_wcs().refill().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@rfaas/include/rfaas/client.hpp`:
- Around line 45-51: The decrement of the pending request counter is wrong: when
poll_wc returns multiple completions you must decrement by the actual
response_count instead of 1; update the call to
receive_wcs().update_requests(...) to pass -response_count (using the
response_count variable returned from poll_wc) so the _requests tracking stays
consistent and refill() behaves correctly; locate this in the block that calls
_resource_mgr.connection().poll_wc(...),
_resource_mgr.connection().receive_wcs().update_requests(...), and
_resource_mgr.connection().receive_wcs().refill().
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 03633af2-75ae-445e-b9fa-97a0f7c4a45f
📒 Files selected for processing (1)
rfaas/include/rfaas/client.hpp
Summary by CodeRabbit
New Features
Bug Fixes
Chores