Skip to content

[24.04_linux-nvidia-6.17-next] writeback: Avoid contention on wb->list_lock when switching inodes#468

Closed
sforshee wants to merge 2 commits into
24.04_linux-nvidia-6.17-nextfrom
6.17/writeback-fixes
Closed

[24.04_linux-nvidia-6.17-next] writeback: Avoid contention on wb->list_lock when switching inodes#468
sforshee wants to merge 2 commits into
24.04_linux-nvidia-6.17-nextfrom
6.17/writeback-fixes

Conversation

@sforshee

@sforshee sforshee commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Summary

Fix NVBug 6291984 for 24.04_linux-nvidia-6.17-next.

The observed symptoms are soft lockups, degraded performance, and NIC hangs. This is cause by a cgroup which exits while owning a large number of inodes, and all of these inodes need to be switched to another wb. This can lead to multiple workers trying to switch inodes, with many spinning on the same wb->list_lock while only one makes forward progress. The fix from upstream switches to use a single work item per wb, managing a queue of inode switch work items.

Fixes

Clean cherryy-picks of a fix from 6.19, plus a follow-on which fixes a use-after-free in the original fix.

e1b849cfa6b6 writeback: Avoid contention on wb->list_lock when switching inodes
6689f01d6740 writeback: Fix use after free in inode_switch_wbs_work_fn()

Testing

A test case was used to reproduce this issue on an unpatched kernel. The same reproducer on a patched kernel remains responsive and does not show soft lockups or NIC hangs.


PR: https://bugs.launchpad.net/bugs/2156922

@nirmoy nirmoy added the help wanted Extra attention is needed label Jun 16, 2026
@github-actions

github-actions Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

PR Validation Report

Patchscan ⚠️ Missing Fixes Detected

The following upstream fix commits appear to be missing:

Details
Checking 2 commits...
checking 296fb8e2e878c8221b382f50c9550cdd725af170 writeback: Fix use after free in inode_switch_wbs_work_fn()..... found upstream
checking a9004e9396c87918fd32aa36bdfbcc54077a7c12 writeback: Avoid contention on wb->list_lock when switching inodes..... found upstream
W: Found fix commit e90a6d668e26e00a72df2d09c173b563468f09c9 for e1b849cfa6b61f1c866a908c9e8dd9b5aaab820b
All fixes:
Fixes for a9004e9396c8 ("writeback: Avoid contention on wb->list_lock when switching inodes")
          e90a6d668e26 ("writeback: drop now-unnecessary rcu_barrier() in cgroup_writeback_umount()")

PR Lint ✅ All checks passed

Details
Checking 2 commits...

Cherry-pick digest:
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local        │ Referenced upstream / Patch subject                              │ Patch-ID   │ Subject │ SoB chain                 │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 296fb8e2e878 │ 6689f01d6740 writeback: Fix use after free in inode_switch_wbs_w │ match      │ match   │ preserved + sforshee adde │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ a9004e9396c8 │ e1b849cfa6b6 writeback: Avoid contention on wb->list_lock when s │ match      │ match   │ preserved + sforshee adde │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘

Lint: all checks passed.

@nirmoy

nirmoy commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

BaseOS Kernel Review

Summary

No issues found across the reviewed commits.

Findings: no problems found

Latest watcher review: open review

Kernel deb build: failed (failure log, build artifacts)

Head: 296fb8e2e878

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.

@sforshee

Copy link
Copy Markdown
Collaborator Author

It looks like I did miss the fix e90a6d668e26 ("writeback: drop now-unnecessary rcu_barrier() in cgroup_writeback_umount()"), though it is only removing a rcu_barrier() made unnecessary by the original fix, so it is not critical by any means.

@jamieNguyenNVIDIA

Copy link
Copy Markdown
Collaborator

Acked-by: Jamie Nguyen <jamien@nvidia.com>

@clsotog

clsotog commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Acked-by: Carol L Soto <csoto@nvidia.com>

@nvmochs

nvmochs commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

@sforshee No issues with the backports.

Acked-by: Matthew R. Ochs <mochs@nvidia.com>


I do have a github hygiene question. Is there a reason the source branch came from the NVIDIA github and not your personal github?

@ltrager ltrager left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acked-by: Lee Trager <ltrager@nvidia.com>

jankara added 2 commits June 16, 2026 19:05
BugLink: https://bugs.launchpad.net/bugs/2156922

There can be multiple inode switch works that are trying to switch
inodes to / from the same wb. This can happen in particular if some
cgroup exits which owns many (thousands) inodes and we need to switch
them all. In this case several inode_switch_wbs_work_fn() instances will
be just spinning on the same wb->list_lock while only one of them makes
forward progress. This wastes CPU cycles and quickly leads to softlockup
reports and unusable system.

Instead of running several inode_switch_wbs_work_fn() instances in
parallel switching to the same wb and contending on wb->list_lock, run
just one work item per wb and manage a queue of isw items switching to
this wb.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
(cherry picked from commit e1b849c)
Signed-off-by: Seth Forshee <sforshee@nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2156922

inode_switch_wbs_work_fn() has a loop like:

  wb_get(new_wb);
  while (1) {
    list = llist_del_all(&new_wb->switch_wbs_ctxs);
    /* Nothing to do? */
    if (!list)
      break;
    ... process the items ...
  }

Now adding of items to the list looks like:

wb_queue_isw()
  if (llist_add(&isw->list, &wb->switch_wbs_ctxs))
    queue_work(isw_wq, &wb->switch_work);

Because inode_switch_wbs_work_fn() loops when processing isw items, it
can happen that wb->switch_work is pending while wb->switch_wbs_ctxs is
empty. This is a problem because in that case wb can get freed (no isw
items -> no wb reference) while the work is still pending causing
use-after-free issues.

We cannot just fix this by cancelling work when freeing wb because that
could still trigger problematic 0 -> 1 transitions on wb refcount due to
wb_get() in inode_switch_wbs_work_fn(). It could be all handled with
more careful code but that seems unnecessarily complex so let's avoid
that until it is proven that the looping actually brings practical
benefit. Just remove the loop from inode_switch_wbs_work_fn() instead.
That way when wb_queue_isw() queues work, we are guaranteed we have
added the first item to wb->switch_wbs_ctxs and nobody is going to
remove it (and drop the wb reference it holds) until the queued work
runs.

Fixes: e1b849c ("writeback: Avoid contention on wb->list_lock when switching inodes")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260413093618.17244-2-jack@suse.cz
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
(cherry picked from commit 6689f01)
Signed-off-by: Seth Forshee <sforshee@nvidia.com>
@sforshee sforshee force-pushed the 6.17/writeback-fixes branch from 84a43a4 to 296fb8e Compare June 16, 2026 19:06
@sforshee

Copy link
Copy Markdown
Collaborator Author

I do have a github hygiene question. Is there a reason the source branch came from the NVIDIA github and not your personal github?

Um, that was a mistake cause by origin not pointing to the same place between different machines. I'll be more careful about that next time.

@sforshee

sforshee commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator Author

The last push was to add buglinks to the commit descriptions, no code changes.

@nvmochs

nvmochs commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

The last push was to add buglinks to the commit descriptions, no code changes.

To be clear, my comment in Slack was not about adding buglinks to the actual commits (Brad does that as part of his process), but to add a link in the PR description to the LP. I've gone ahead and added one to this PR as an example.

@nvmochs

nvmochs commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Merged, closing PR.

a53b6ede12ed (nnoble/nvidia-6.17-next) writeback: Fix use after free in inode_switch_wbs_work_fn()
6bdaca534b89 writeback: Avoid contention on wb->list_lock when switching inodes

@nvmochs nvmochs closed this Jun 16, 2026
@sforshee sforshee deleted the 6.17/writeback-fixes branch June 16, 2026 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants