utils/mount/mount_linux.go: replace 'mkdir -p' invocation with MkdirAll() standard Go function #1152

Open
timp87 wants to merge 2 commits into NetApp:master from timp87:mount_linux-no-mkdir-cmd

Conversation

@timp87 timp87 commented Jun 3, 2026

This makes it possible to use Trident on Talos, at least with NFS (#806).

mount.nfs* tools are already being added to the image, see https://github.com/NetApp/trident/blob/v26.02.1/Dockerfile#L13 for example.

So `mkdir` was the only reason Trident could not be used on Talos with NFS4.

praveene12 and others added 2 commits June 2, 2026 14:56
* Start CSI gRPC before node registration to fix registrar timeout

Root cause: node-driver-registrar (v2.15.0) has a hard ~30s gRPC connection
deadline. Previously, Activate() only created the CSI socket AFTER
nodeRegisterWithController() completed, which could take 38-70s on busy
clusters causing repeated CrashLoopBackOff.

Fix: Start the gRPC server immediately after creation, before node
registration retries. The socket is now available in <1s.

Safety:
- Node registration interceptor gates all Node data-path RPCs
  (stage/unstage/publish/unpublish/expand/stats) with codes.Unavailable
  until registration completes.
- Uses an allow-list (NodeGetInfo, NodeGetCapabilities only) so any
  future Node RPC is blocked by default pre-registration.
- Identity and Controller RPCs are never blocked.
- Added nil-safety to Deactivate() for early-shutdown scenarios.
- Fixed TestWaitQueueSize2 race condition in locks package.
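
The allow-list gating in the Safety notes can be sketched as a plain function over the gRPC full method name. This is a stdlib-only illustration, not the PR's interceptor: `gate`, `check`, and `errUnavailable` are hypothetical names, and `errUnavailable` stands in for a gRPC status carrying `codes.Unavailable`. The method names follow the CSI spec's `csi.v1` package.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync/atomic"
)

// errUnavailable stands in for a gRPC status with codes.Unavailable.
var errUnavailable = errors.New("unavailable: node not yet registered")

// nodeServicePrefix matches every RPC of the CSI Node service.
const nodeServicePrefix = "/csi.v1.Node/"

// allowedBeforeRegistration is the allow-list: any Node RPC not listed here,
// including ones added in the future, is blocked until registration completes.
var allowedBeforeRegistration = map[string]bool{
	"/csi.v1.Node/NodeGetInfo":         true,
	"/csi.v1.Node/NodeGetCapabilities": true,
}

// gate tracks whether node registration has completed.
type gate struct {
	registered atomic.Bool
}

// check returns nil if the call may proceed and errUnavailable otherwise.
func (g *gate) check(fullMethod string) error {
	if g.registered.Load() {
		return nil
	}
	// Identity and Controller RPCs are never blocked.
	if !strings.HasPrefix(fullMethod, nodeServicePrefix) {
		return nil
	}
	if allowedBeforeRegistration[fullMethod] {
		return nil
	}
	return errUnavailable
}

func main() {
	g := &gate{}
	for _, m := range []string{
		"/csi.v1.Identity/Probe",
		"/csi.v1.Node/NodeGetInfo",
		"/csi.v1.Node/NodeStageVolume",
	} {
		fmt.Println(m, "blocked before registration:", g.check(m) != nil)
	}
	g.registered.Store(true)
	fmt.Println("blocked after registration:", g.check("/csi.v1.Node/NodeStageVolume") != nil)
}
```

An allow-list fails closed: a Node RPC nobody remembered to classify is rejected by default, whereas a deny-list would let it through.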

Test coverage:
- Interceptor tests: identity allowed, all-in-one controller allowed,
  unstage/unpublish/unknown blocked, chain-order metrics bypass verified.
- Full frontend/csi package passes (35s, 180+ tests).
- Reproduced pre-fix unsafe behavior on commit 7af0d759e (NodeUnstage,
  NodeUnpublish, and future methods incorrectly allowed pre-registration).

Addresses review feedback from Clinton King and Andrew Kerr.

* Address review: add nodeRegistrationInterceptor only for roles that need it

Move the role-based decision for nodeRegistrationInterceptor from inside
the interceptor to the call site in Plugin.Activate(). The interceptor
is now passed as an extra interceptor to NewNonBlockingGRPCServer only
for CSINode and CSIAllInOne roles. CSIController never includes it in
its gRPC chain, eliminating the runtime role check.
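
A rough sketch of deciding at the call site rather than inside the interceptor. Interceptors are modelled as plain names so the example needs no gRPC dependency; the role constants, the `interceptorsFor` function, and the `"logging"` entry are hypothetical and only illustrate the shape of the change.

```go
package main

import "fmt"

// role is a simplified stand-in for the plugin's role setting.
type role string

const (
	roleController role = "controller"
	roleNode       role = "node"
	roleAllInOne   role = "allInOne"
)

// interceptorsFor builds the interceptor chain for a role. The registration
// gate is appended only for roles that serve Node RPCs, so the controller's
// chain never contains it and no runtime role check is needed later.
func interceptorsFor(r role) []string {
	chain := []string{"logging"} // assumed base interceptor, for illustration
	if r == roleNode || r == roleAllInOne {
		chain = append(chain, "nodeRegistration")
	}
	return chain
}

func main() {
	for _, r := range []role{roleController, roleNode, roleAllInOne} {
		fmt.Println(r, interceptorsFor(r))
	}
}
```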

Addresses clintonk review comment

* Address review feedback: nil-safe gRPC stop, test improvements, split locks fix

- Make GracefulStop()/Stop() nil-safe (no-op when server not yet initialized)
  to prevent panic if Deactivate() races with early Activate() (Copilot review)
- Lower interceptor rejection log from Warn to Debug to avoid log spam during
  kubelet retries while registration is slow (Copilot review)
- Strengthen interceptor test assertions from NotNil to Equal (jwebster7 nit)
- Remove fixed 50ms sleep in test, increase registerStarted timeout to 5s for
  CI robustness (Copilot review on flakiness)
- Fix MkdirTemp comment accuracy (Copilot review)
- Revert unrelated pkg/locks/locks_test.go changes (split to separate PR per
  Copilot review suggestion to keep change set focused)
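
The nil-safe stop from the first bullet above can be sketched as follows. This is an assumption-laden illustration, not the PR's code: `fakeGRPCServer` stands in for `*grpc.Server`, and `nonBlockingServer` is a hypothetical wrapper whose server field stays nil until `Start` runs.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeGRPCServer stands in for *grpc.Server so the sketch needs no external modules.
type fakeGRPCServer struct{ stopped bool }

func (f *fakeGRPCServer) GracefulStop() { f.stopped = true }

// nonBlockingServer is a hypothetical wrapper; server is nil until Start is called.
type nonBlockingServer struct {
	mu     sync.Mutex
	server *fakeGRPCServer
}

// Start initializes the underlying server.
func (s *nonBlockingServer) Start() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.server = &fakeGRPCServer{}
}

// GracefulStop is a no-op when the server has not been initialized yet,
// so an early shutdown cannot dereference a nil pointer.
func (s *nonBlockingServer) GracefulStop() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.server == nil {
		return
	}
	s.server.GracefulStop()
}

func main() {
	s := &nonBlockingServer{}
	s.GracefulStop() // safe: nothing to stop yet
	s.Start()
	s.GracefulStop()
	fmt.Println("stopped:", s.server.stopped)
}
```

The mutex is there because the scenario in question is a race between shutdown and startup; a bare nil check without synchronization would still be a data race.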

Signed-off-by: praveene12 <Praveen.E@netapp.com>

---------

Signed-off-by: praveene12 <Praveen.E@netapp.com>
…ll() standard Go function

This makes it possible to use Trident on Talos, at least with NFS (NetApp#806).

mount.nfs* tools are already being added to the image, see https://github.com/NetApp/trident/blob/v26.02.1/Dockerfile#L13 for example.

So `mkdir` was the only reason Trident could not be used on Talos with NFS4.
2 participants