fix: eliminate stale or concurrent container interference by mkocher · Pull Request #223 · cloudfoundry/bpm-release

mkocher · 2026-06-23T23:39:14Z

The CI build was usually failing but intermittently passing after #219. This was because 1) the vm's /sys/fs/cgroup gets mapped into containers wholesale these days and 2) there were some stale containers on a worker which when present met the test criteria of a container named test-server with 743MB.

Replace the name-based cgroup directory search (findCgroupDir) with an exact-path approach: a new /self-cgroup-path endpoint reads the test-server's own /proc/self/cgroup and returns the cgroup v2 unified-mode path. That path is unique per Docker container ID, so it can never match a stale scope from a previous run or a scope owned by a concurrent build on the same worker. The privileged observer's /cgroup-limits endpoint now accepts cgroup-path and reads memory.max / pids.max directly from that path — no filesystem walk needed.

also fixed a problem where the new unit tests for the test server were hanging because of a flag parsing issue

coderabbitai · 2026-06-23T23:39:27Z

Warning

Review limit reached

@mkocher, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 42 minutes and 8 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more credits in the billing tab to continue.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a7aba736-61c8-4651-9e67-4a6acfaa0602

📥 Commits

Reviewing files that changed from the base of the PR and between 1c6375d and d5f8a72.

📒 Files selected for processing (4)

src/bpm/acceptance/bpm_acceptance_test.go
src/bpm/acceptance/fixtures/test-server/handlers/cgroup_limits.go
src/bpm/acceptance/fixtures/test-server/handlers/cgroup_limits_test.go
src/bpm/acceptance/fixtures/test-server/main.go

Walkthrough

The pull request replaces the previous cgroup directory discovery mechanism (a filesystem walk under /sys/fs/cgroup keyed on process/job name and container ID) with a two-step, path-based flow. A new SelfCgroupPath HTTP handler is added to the test server that reads /proc/self/cgroup, parses the cgroup v2 unified hierarchy entry (0::<path>), and returns the path as JSON. The CgroupLimits handler is rewritten to accept an explicit cgroup-path query parameter and read limits directly from that location under /sys/fs/cgroup. The acceptance test getCgroupLimits() function is updated to first fetch the cgroup path from the agent's new endpoint, then supply it URL-escaped to the privileged observer's /cgroup-limits endpoint. Unit tests for parseCgroupV2Path replace the removed findCgroupDir tests.

Suggested reviewers

mariash
a-hassanin
selzoc
aramprice

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly addresses the main problem solved by this PR: eliminating interference from stale or concurrent containers through a new exact-path approach to cgroup identification.
Description check	✅ Passed	The description clearly explains the problem (intermittent CI failures due to stale containers), the root cause, the solution (exact-path cgroup approach), and mentions the flag parsing fix.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix-cgroup-acceptance-test-scope-naming

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/bpm/acceptance/fixtures/test-server/handlers/cgroup_limits.go`:
- Around line 100-114: The `cgroup-path` query parameter is being concatenated
directly into the filesystem path without validation, creating a path traversal
vulnerability. Validate the `exactPath` variable by canonicalizing it using
filepath operations and enforcing that the resulting full path stays within the
`/sys/fs/cgroup` base directory before passing it to `readCgroupValue` function
calls for reading `memory.max` and `pids.max`. Return an appropriate error
response if the path attempts to escape the intended cgroup subtree.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3284a2f9-fff5-4e95-ab8c-6a2120362fe0

📥 Commits

Reviewing files that changed from the base of the PR and between da8e722 and 1c6375d.

📒 Files selected for processing (4)

src/bpm/acceptance/bpm_acceptance_test.go
src/bpm/acceptance/fixtures/test-server/handlers/cgroup_limits.go
src/bpm/acceptance/fixtures/test-server/handlers/cgroup_limits_test.go
src/bpm/acceptance/fixtures/test-server/main.go

Replace the name-based cgroup directory search (findCgroupDir) with an exact-path approach: a new /self-cgroup-path endpoint reads the test-server's own /proc/self/cgroup and returns the cgroup v2 unified-mode path. That path is unique per Docker container ID, so it can never match a stale scope from a previous run or a scope owned by a concurrent build on the same worker. The privileged observer's /cgroup-limits endpoint now accepts cgroup-path and reads memory.max / pids.max directly from that path — no filesystem walk needed. also fixed a problem where the unit tests were hanging because of a flag parsing issue

coderabbitai Bot requested changes Jun 23, 2026

View reviewed changes

Comment thread src/bpm/acceptance/fixtures/test-server/handlers/cgroup_limits.go

mkocher force-pushed the fix-cgroup-acceptance-test-scope-naming branch from 1c6375d to d5f8a72 Compare June 23, 2026 23:57

coderabbitai Bot approved these changes Jun 23, 2026

View reviewed changes

aramprice approved these changes Jun 24, 2026

View reviewed changes

mkocher merged commit 7866e93 into master Jun 24, 2026
5 checks passed

mkocher deleted the fix-cgroup-acceptance-test-scope-naming branch June 24, 2026 00:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: eliminate stale or concurrent container interference#223

fix: eliminate stale or concurrent container interference#223
mkocher merged 1 commit into
masterfrom
fix-cgroup-acceptance-test-scope-naming

mkocher commented Jun 23, 2026

Uh oh!

coderabbitai Bot commented Jun 23, 2026 •

edited

Loading

Review limit reached

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mkocher commented Jun 23, 2026

Uh oh!

coderabbitai Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented Jun 23, 2026 •

edited

Loading