Skip to content

archive: cache parent directory fd during tar extraction#26

Open
ctalledo wants to merge 4 commits into
moby:mainfrom
ctalledo:art-227-dirfd-cache
Open

archive: cache parent directory fd during tar extraction#26
ctalledo wants to merge 4 commits into
moby:mainfrom
ctalledo:art-227-dirfd-cache

Conversation

@ctalledo
Copy link
Copy Markdown

Summary

Depends on #25.

Each root.*(path) call re-walks every path component via doInRoot, costing 2D syscalls per call at path depth D. A typical file entry triggers ~5 root.* calls, so at depth 4 that is ~40 syscalls in path traversal alone.

This PR adds a dirCache that keeps one parent-directory fd open between entries. Consecutive entries in the same directory reuse the cached fd for all *at(2) operations, reducing path-traversal cost to ~5 syscalls per entry regardless of depth.

Test plan

  • go test ./... passes

@ctalledo ctalledo force-pushed the art-227-dirfd-cache branch 2 times, most recently from 0043278 to c70cd3d Compare May 21, 2026 23:15
ctalledo added 4 commits May 21, 2026 16:16
The Unpack and UnpackLayer functions validated tar entry paths using a
pure string check (filepath.Join + filepath.Rel). This does not follow
symlinks, so a malicious archive could plant a symlink pointing outside
the extraction root and then write files through it, bypassing the check.

On Linux this is mitigated by chrootarchive wrapping extraction in a
chroot(2) call. On platforms without chroot support (Windows) and for
callers that bypass chrootarchive (e.g. BuildKit ADD --unpack), the
extraction root is not enforced.

Introduce safeResolve (ported from containerd/continuity/fs.RootPath),
which walks each path component with os.Lstat and resolves symlinks
within the extraction root, bounding absolute targets and relative
targets that would escape root back inside it.

Apply safeResolve to:
- Unpack: main path computation and the deferred directory chtimes loop
- UnpackLayer: same
- createTarFile TypeLink: hardlink target resolution

Add a regression test for the symlink-chain bypass: a within-dest
symlink (go_up -> "..") used to redirect a second symlink outside dest
(escape -> "../victim"), which the static check missed.

Fixes ART-225.

Signed-off-by: Cesar Talledo <cesar.talledo@docker.com>
os.Root, used to bound all tar extraction operations within the
extraction root at the kernel level using openat(2) semantics, was
introduced in Go 1.24. Update go.mod and the CI test matrix to require
1.24 as the oldest supported version.

Signed-off-by: Cesar Talledo <cesar.talledo@docker.com>
Replace the path-string-based extraction with os.Root, which bounds all
file operations within the extraction root at the kernel level using
openat(2) semantics. This is the primary defence against tar
path-traversal attacks (ART-226).

Key changes:
- Unpack and UnpackLayer open an os.Root for dest at entry.
- createTarFile takes *os.Root and a root-relative name instead of an
  absolute path and extractDir.
- TypeDir, TypeReg, TypeLink, Lchown, Chmod, and Chtimes all go through
  root.* methods.
- For operations os.Root does not yet support (mknod, xattrs, lchtimes
  on symlinks), absPath is derived via safeResolve so paths remain
  bounded within the root.
- overlayWhiteoutConverter.ConvertRead makes direct unix.Setxattr and
  unix.Mknod syscalls and requires an absolute path; pass safeResolve'd
  absPath rather than the root-relative hdr.Name.
- safepath.go is retained: safeResolve is still used for absPath
  derivation.
- testBreakout updated to skip symlinks: correct under the os.Root
  model, where symlink nodes may have out-of-root targets but traversal
  through them via root.* is rejected by the kernel.

Signed-off-by: Cesar Talledo <cesar.talledo@docker.com>
Each root.*(path) call re-walks every path component via doInRoot,
costing 2D syscalls per call at path depth D. A typical file entry
triggers ~5 root.* calls, so at depth 4 that is ~40 syscalls in path
traversal alone.

Add a dirCache that keeps one parent-directory fd open between entries.
Consecutive entries in the same directory reuse the cached fd for all
*at(2) operations, reducing path-traversal cost to ~5 syscalls per
entry regardless of depth.

Signed-off-by: Cesar Talledo <cesar.talledo@docker.com>
@ctalledo ctalledo force-pushed the art-227-dirfd-cache branch from c70cd3d to 33dca5a Compare May 21, 2026 23:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant