fix(configurator): make env_params first-class to fix the trajectory cache key by rutayan-nv · Pull Request #901 · NVIDIA/cloudai

rutayan-nv · 2026-05-26T23:22:58Z

Summary

Turns the four TDD-red tests in #900 green and makes the underlying
design change required to fix domain randomization in cloudai.

Today, CloudAIGymEnv.get_cached_trajectory_result() keys the cache on
action only. An agent revisiting the same action under a different
env_params sample (e.g. drop_rate) gets the first trial's reward
served back from cache, silently corrupting the reward signal. The
parallel artifact (env.csv) lived in workload code and was skipped on
cache hits, so it was sparse and mis-aligned with trajectory.csv.

This PR moves env_params from a workload-private convention into
first-class cloudai schema so the same fix benefits every agent (PPO,
BO, GA, MAB) and every workload uniformly:

TestDefinition.env_params: dict[str, EnvParamSpec] — sibling to
cmd_args; not part of the agent's action space.
TestRun.current_env_params: dict[str, Any] — per-trial sample
attached by the env, visible to the cache key and to downstream
sinks.
TrajectoryEntry gains env_params. The cache key is now
(action, current_env_params).
CloudAIGymEnv builds an EnvParamsObserver whenever the
TestDefinition declares env_params. before_step() samples
deterministically by (seed, name, trial), stashes the sample on
test_run, and appends to env.csv before the cache lookup — so
env.csv is written on every trial (cache hit or miss) and aligns
1:1 with trajectory.csv on step.
Workloads without any [env_params.*] block pay zero overhead: no
observer, no env.csv, identical behaviour to today.

A follow-up PR in cloudaix will delete the now-redundant
env_params.py / env_params_sampler.py / env_params_sink.py
modules in domain_randomization/workloads/mrc_prt/ and route the
demo workload through the cloudai-owned schema.

Test plan

The 4 TDD-red tests added in [TDD-red] CloudAIGymEnv cache key must include env_params (drives domain-randomization correctness fix) #900 (test_cache_miss_when_env_params_differ,
test_step_reruns_workload_when_env_params_change, plus two control
tests) now pass.
9 new unit tests in tests/test_env_params.py cover EnvParamSpec
validation, EnvParamsSampler determinism + per-parameter
independence, CsvSink format + edge cases, and
EnvParamsObserver wiring.
2 new integration tests in tests/test_cloudaigym.py pin the
env.csv ↔ trajectory.csv step-alignment contract and the
zero-overhead claim for workloads without env_params.
Full cloudai test suite: 1479 passed, 4 skipped.
ruff check / ruff format / pyright clean on all touched files.

Stacking

Depends on #900 (the TDD-red tests). Either rebase #900 onto this PR
after merge, or merge them together. Both are stacked on #893
(rpatro/step-ownership-tdd).

Stack & rebase status (updated 2026-06-12)

Stack: main ← #893 ← #900 ← #901 (top)

Re-stacked on #900. Carries the env_params first-class fix + the cache-hit integration test. Folded in the increment_step() test adaptation — the env_params tests now let TestRun.increment_step() drive the trial counter, since #893's step-ownership change sits below this PR in the stack.

Diff vs [TDD-red] CloudAIGymEnv cache key must include env_params (drives domain-randomization correctness fix) #900: functional added: 225 · test added: 267
Gates: green (pytest 1637 passed · ruff · pyright 0 errors)
Depends on: [TDD-red] CloudAIGymEnv cache key must include env_params (drives domain-randomization correctness fix) #900 → [CLI/Core] Polymorphic agent dispatch + TestRun owns trial counter #893 (merge bottom-up). Fork-based PRs, so GitHub bases stay main.

coderabbitai · 2026-05-26T23:23:03Z

📝 Walkthrough

Walkthrough

This PR implements domain randomization (environment parameter sampling) for CloudAI DSE workloads, refactors agent execution to a delegated run() method, introduces a Gymnasium-compatible adapter, and updates trajectory caching and observation logic to account for sampled environment state.

Changes

Domain Randomization and Agent Execution Refactoring

Layer / File(s)	Summary
Gymnasium dependency `pyproject.toml`	Gymnasium ~1.2 is added to dev and rl optional dependencies.
Core TestRun and TestDefinition data model `src/cloudai/_core/test_scenario.py`, `src/cloudai/models/workload.py`	TestRun gains `current_env_params` field and `increment_step()` method to track environment state. TestDefinition gains `env_params: dict[str, EnvParamSpec]` field for declaring domain-randomized parameters per trial.
Agent run() method and handler delegation `src/cloudai/configurator/base_agent.py`, `src/cloudai/cli/handlers.py`	BaseAgent implements concrete `run()` method that resets env, iterates steps via select_action, executes env.step(), constructs feedback for update_policy, logs, and returns exit code. Handler delegates agent execution to `agent.run()` instead of manual iteration, aggregating return codes.
Environment parameter specification and sampling `src/cloudai/configurator/env_params.py`	New module adds EnvParamSpec (categorical values with optional weights), EnvParamsSampler (deterministic per-trial sampling), EnvParamsSink protocol and CsvSink (env.csv persistence), StepObserver protocol, and EnvParamsObserver (samples before step, persists via sink).
CloudAIGymEnv observer and env_params integration `src/cloudai/configurator/cloudai_gym.py`	TrajectoryEntry adds `env_params` field. CloudAIGymEnv builds observers on init, updates `define_observation_space()` to dynamic shape, invokes `observer.before_step()` before step and `observer.after_step()` after trajectory write. Cached and non-cached paths persist `env_params` with trajectory. `get_cached_trajectory_result()` matches cache by both action and env_params values (empty env_params for back-compat).
Gymnasium environment adapter `src/cloudai/configurator/gymnasium_adapter.py`, `src/cloudai/configurator/__init__.py`	New GymnasiumAdapter wraps BaseGym with lazy Gymnasium/NumPy imports, splits parameters into tunable/fixed, exposes tunables as Dict of Discrete action spaces, converts observations to float32 arrays, decodes actions from indices, steps with Gymnasium 5-tuple output, and synchronizes test_run.step (1-based). Exported via module `__all__`.
Test suite `tests/test_env_params.py`, `tests/test_gymnasium_adapter.py`, `tests/test_cloudaigym.py`, `tests/test_handlers.py`, `tests/test_test_scenario.py`	Validates EnvParamSpec constraints, EnvParamsSampler determinism, CsvSink CSV writing, EnvParamsObserver integration, GymnasiumAdapter action/observation spaces and step synchronization, error handling, CloudAIGymEnv step alignment and cache behavior with env_params (miss on differ, hit on match, back-compat absent), workload rerun on param change, env.csv/trajectory.csv alignment, observer execution on cache hit, and agent.run() delegation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

podkidyshev
srivatsankrishnan
jeffnvidia

🐰 With domain params now sampled per trial,
And agents delegating their runs with style,
The Gymnasium adapter joins the fold—
Environment randomization, a story to be told!

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: making env_params first-class to fix the trajectory cache key, which is the core objective of this PR.
Description check	✅ Passed	The description is highly detailed and directly related to the changeset, explaining the problem, solution, design changes, test coverage, and stacking dependencies.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

- Agents that set HAS_CUSTOM_TRAINING_LOOP = True drive their own training loop; handle_dse_job calls agent.train() and skips the per-step env.step loop. - New _run_custom_training_loop helper logs exceptions, returns a process-style exit code, and always invokes agent.shutdown() (when defined) in a finally block so resources are released on both success and failure paths. - CustomTrainingLoopAgent Protocol documents the opt-in contract for type checkers and IDEs.

Pyright rejected calling _run_custom_training_loop(agent, ...) because the plain bool predicate did not narrow agent's static type from BaseAgent to CustomTrainingLoopAgent. Return TypeGuard[CustomTrainingLoopAgent] from _has_custom_training_loop so the truthy branch in handle_dse_job sees the opted-in shape and the helper can call agent.train() directly.

If agent.shutdown() raised from the finally block, Python suppressed the earlier return 0/1 from agent.train() and propagated the exception, breaking the outer test-run loop in handle_dse_job (skipped remaining scenarios, failed to accumulate err |= rc). Wrap shutdown() in its own try/except, log via logging.exception, set rc = 1, and return rc after finally so the helper always honours the (int) -> int contract. Adds tests for shutdown-only failure and combined train+shutdown failure.

…mutator Previously, ``test_run.step`` had no clear owner: the dispatcher set it from outside, the adapter rewound it on ``reset()``, and other callers wrote to it ad hoc. In RLlib custom-loop runs this collapsed every trial onto ``step=1``, overwriting ``trajectory.csv`` and ``env.csv`` rows. Centralize the invariant: ``TestRun.increment_step()`` is the single named mutator, and ``CloudAIGymEnv.step()`` is its only caller. One ``env.step()`` call advances the trial counter by exactly one — independent of any episode or dispatcher concept above the gym env. Contract tests in ``TestIncrementStep`` cover the API; ``test_cloudaigym`` asserts ``step`` is advanced *before* ``output_path`` and trajectory rows are computed, so cached and live trials both record the post-increment value.

…orphism Earlier commits in this PR introduced ``HAS_CUSTOM_TRAINING_LOOP`` + a ``CustomTrainingLoopAgent`` Protocol + a TypeGuard helper + an ``if/else`` in ``handle_dse_job`` to switch between the cloudai step loop and an agent-owned ``train()`` loop. That is a type-tagged conditional dispatching on agent identity — the textbook signal to replace conditional with polymorphism (Fowler). Add a default ``BaseAgent.run() -> int`` that holds the step-loop body (``select_action`` / ``env.step`` / ``update_policy`` per trial). Agents that drive their own training (RLlib, etc.) override ``run()`` to delegate to whatever loop they own and return a process-style exit code. ``handle_dse_job`` collapses to ``err |= agent.run()`` — one line, no branching, no Protocol vocabulary. The handler no longer knows that "custom training loops" exist as a category; that's an agent implementation detail. Net: -89 lines on cloudai. Surface area shrinks (no Protocol, no TypeGuard, no flag). ``test_handlers`` replaces the 5 helper unit tests + 2 dispatcher integration tests with 2 polymorphic tests asserting ``handle_dse_job`` delegates to ``agent.run()`` and propagates its return code.

- GymnasiumAdapter wraps a CloudAI BaseGym as a gymnasium-compatible env: Dict of Discrete actions over the tunable params, float32 Box observation derived from define_observation_space(), gymnasium 5-tuple from step/step_raw. - Fixed (single-value) params are injected on every step so agents only ever see the tunable subset; step_raw bypasses index decoding and fixed injection for callers that already have a fully-formed parameter dict. - The adapter mirrors handle_dse_job's 1-based test_run.step so artifact paths match whether the env is driven by the DSE loop or an external training loop. - CloudAIGymEnv.define_observation_space() now returns one slot per agent metric (at least one), giving adapters the correct Box shape. - gymnasium is an optional dependency: lazy-imported inside the adapter and exposed via a new rl extra (also added to dev) pinned to ~=1.2.

decode_action used to accept partial action dicts and step_raw accepted arbitrary key sets, both silently propagating incomplete params to the underlying env. Validate up front: - decode_action requires keys exactly == self._tunable_params, and each discrete index must be in range. - step_raw requires keys exactly == self._tunable_params | self._fixed_params. A new _assert_keys helper raises ValueError with sorted missing / extra lists so callers see exactly what was wrong. Adds tests for missing key, unknown key, out-of-range index, missing fixed param, and unknown raw key paths.

CloudAIGymEnv.get_cached_trajectory_result(action) keys the trajectory cache on action alone. When a workload declares env_params (e.g. drop_rate) and the agent re-selects the same action under a different env_params sample, the cache returns a stale reward measured under a different env, silently invalidating any domain-randomization workflow. Four tests pin the contract; today two FAIL (bug-exposing), two PASS (back-compat sanity). All 27 pre-existing tests are untouched. FAIL test_cache_miss_when_env_params_differ FAIL test_step_reruns_workload_when_env_params_change PASS test_cache_hit_when_action_and_env_params_match PASS test_cache_hit_when_neither_has_env_params Driver for the follow-up PR that adds env_params as a first-class field on TestDefinition + TrajectoryEntry and rewrites the cache key as (action, env_params). Both unit and integration shapes are covered so the fix can be validated end-to-end through env.step().

Domain randomization broke under the trajectory cache because the cache key was action-only: an agent revisiting the same action under a different env_params sample (e.g. drop_rate) got the *first* trial's reward served back, silently corrupting the reward signal. env.csv was also workload-owned and skipped on cache hits, so it was sparse and mis-aligned with trajectory.csv. This change makes env_params first-class in cloudai so the design works the same way for every agent (PPO, BO, GA, MAB) and every workload: - TestDefinition.env_params: dict[str, EnvParamSpec] - sibling to cmd_args, not part of the action space. - TestRun.current_env_params: dict[str, Any] - per-trial sample attached by the env so the cache key and downstream sinks see it. - TrajectoryEntry gains env_params; the cache key is now (action, current_env_params). - CloudAIGymEnv builds an EnvParamsObserver when env_params is declared. before_step() samples deterministically by (seed, name, trial), stashes the sample on test_run, and appends to env.csv *before* the cache lookup - so env.csv is written on every trial (cache hit or miss) and aligns 1:1 with trajectory.csv by step. - Workloads with no [env_params.*] block pay zero overhead: no observer, no env.csv, identical behaviour to today. Tests: turns the four TDD-red tests from the parent PR green and adds nine unit tests for EnvParamSpec / EnvParamsSampler / CsvSink / EnvParamsObserver plus two integration tests pinning the env.csv ↔ trajectory.csv contract.

Closes the most important gap in the env_params test matrix: the cache-HIT path under an observer-driven TestDefinition.env_params. The previous tests covered each axis separately - cache key unit tests for env_params, a step()-level cache-miss integration with manually set current_env_params, and a step-alignment test where all trials were cache misses. None of them proved that a cache HIT still drives the observer (and therefore env.csv) - which is precisely the sparsity contract the fix was about. The new test: - declares env_params on the TestDefinition (real observer wired in), - pre-seeds the trajectory with an entry whose action AND deterministic env_params sample match what the observer will produce for step 1, - calls env.step() once and asserts: (a) runner.run is NOT called (cache short-circuit honored), (b) env.csv gains a row (observer fired before the cache lookup), (c) the new trajectory entry carries the per-trial env_params. This is the end-to-end proof that the env.csv sparsity bug cannot recur, and that env.csv ↔ trajectory.csv stay step-aligned on the cache-hit path too.

coderabbitai

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/cloudai/configurator/cloudai_gym.py (2)

143-171: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Keep trajectory.csv aligned on the constraint-failure path.

observer.before_step() has already persisted the sampled env row by Line 147, but the early return on Lines 169-171 skips write_trajectory(). Any trial rejected by constraint_check() now leaves an orphan env.csv row, which breaks the PR’s stated 1:1 env.csv ↔ trajectory.csv contract and makes per-trial artifacts inconsistent.

Suggested fix

         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 143 - 171, The early
return on constraint_check leaves an env row without a matching trajectory; to
fix, when test_run.test.constraint_check(test_run, runner.system) returns False,
write a TrajectoryEntry (use self.write_trajectory with step=self.test_run.step,
action=action, reward=self.rewards.constraint_failure, observation=[-1.0],
env_params=dict(self.test_run.current_env_params)) and call each
observer.after_step(self.test_run, observation, reward) before returning the
same quadruple (observation, reward, True, {}); this mirrors the cached-result
path and preserves the 1:1 env.csv ↔ trajectory.csv contract.

186-202: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Preserve current_env_params when recovering from runner failures.

On Line 189 the fallback rebuilds self.test_run from self.original_test_run, which resets current_env_params to its default. The trajectory row written on Line 202 then records {} even though env.csv already contains the sampled params, and the cached failure is stored under (action, {}) instead of (action, sampled_env_params). Repeating the same failing trial will miss the cache and produce disagreeing artifacts.

Suggested fix

         if self.runner.test_scenario.test_runs and self.runner.test_scenario.test_runs[0].output_path.exists():
             self.test_run = self.runner.test_scenario.test_runs[0]
         else:
-            self.test_run = copy.deepcopy(self.original_test_run)
+            self.test_run = copy.deepcopy(self.original_test_run).apply_params_set(action)
             self.test_run.step = new_tr.step
             self.test_run.output_path = new_tr.output_path
+            self.test_run.current_env_params = dict(new_tr.current_env_params)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 186 - 202, When
rebuilding self.test_run in the fallback branch (inside cloudai_gym.py) you must
preserve the sampled env params so the trajectory/cache keys match the actual
env.csv; after creating the deep-copied self.test_run from
self.original_test_run and after setting step/output_path, copy the sampled
params into self.test_run.current_env_params (e.g. assign a deepcopy of
self.original_test_run.current_env_params or the sampled params present on the
runner) before calling get_observation/compute_reward and write_trajectory; this
ensures dict(self.test_run.current_env_params) in the TrajectoryEntry reflects
the real sampled parameters.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 116-149: In run(), replace the eager f-string logging calls with
%-style deferred logging so formatting happens only if the message will be
emitted; specifically update the two logging.info calls inside the run method
(the "Running step ..." and "Step {step}: Observation..." messages) to use
logging.info("Running step %s (of %s) with action %s", step, self.max_steps,
action) and logging.info("Step %s: Observation: %s, Reward: %.4f", step, ...,
reward) respectively, ensuring you format the observation list without using
f-strings so the logging framework can lazily interpolate.

In `@src/cloudai/configurator/env_params.py`:
- Around line 1-2: The file header years in
src/cloudai/configurator/env_params.py are incorrect for the touched year;
update the two top lines (the SPDX copyright line and the subsequent Copyright
line) by replacing "2024-2026" with the exact year string the repo's header test
expects for this file (e.g., "2026") so the header matches
tests/test_check_copyright_headers.py.

In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 133-165: Add a public close(self) -> None method to
GymnasiumAdapter that safely delegates to the wrapped environment's close
method: check for an attribute (e.g. getattr(self._env, "close", None)), call it
if callable, and return None; do not assume inheritance from gymnasium.Env. Add
the method near other lifecycle helpers (e.g. alongside
_sync_underlying_step_counter, _step_with_params) so callers that call
env.close() won't raise AttributeError.
- Around line 77-80: GymnasiumAdapter.__init__ currently filters action space
lists into _tunable_params (len>1) and _fixed_params (len==1) but silently drops
any keys with empty lists; modify GymnasiumAdapter.__init__ to detect any entry
in the raw_action_space where len(v) == 0 and raise a ValueError identifying the
offending key(s) so callers (e.g., TestRun.param_space) fail fast; add a unit
test that constructs a TestRun or raw_action_space containing an empty list for
a cmd_args key and asserts that GymnasiumAdapter(...) raises ValueError with a
message mentioning the empty key(s).

In `@tests/test_env_params.py`:
- Around line 86-87: The test uses a broad pytest.raises(ValueError) when
calling sink.write(0, {"drop_rate": 0.0}); narrow it by adding a match parameter
to pytest.raises that matches the specific ValueError message expected (e.g.,
the message about invalid drop_rate or index) so the test asserts the exact
error condition raised by sink.write; update the pytest.raises invocation around
sink.write to include match="expected message substring" referencing the
sink.write call and pytest.raises.
- Around line 1-2: The copyright header in the file contains the range
"2025-2026" which fails CI; update the header lines (the SPDX line and the
Copyright line that currently reads "2025-2026 NVIDIA CORPORATION & AFFILIATES")
to use a single year "2026" so they match the expected format for new 2026
files.

In `@tests/test_handlers.py`:
- Around line 287-288: The override of select_action in tests/test_handlers.py
has the wrong signature; change the function declaration for select_action in
that file to match the base class signature BaseAgent.select_action(self,
observation: list[float] | None = None) (i.e., add the observation parameter
with the same type and default) and keep the existing body (raising
AssertionError) so pyright/CI type-checking passes while preserving the test
intent.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 143-171: The early return on constraint_check leaves an env row
without a matching trajectory; to fix, when
test_run.test.constraint_check(test_run, runner.system) returns False, write a
TrajectoryEntry (use self.write_trajectory with step=self.test_run.step,
action=action, reward=self.rewards.constraint_failure, observation=[-1.0],
env_params=dict(self.test_run.current_env_params)) and call each
observer.after_step(self.test_run, observation, reward) before returning the
same quadruple (observation, reward, True, {}); this mirrors the cached-result
path and preserves the 1:1 env.csv ↔ trajectory.csv contract.
- Around line 186-202: When rebuilding self.test_run in the fallback branch
(inside cloudai_gym.py) you must preserve the sampled env params so the
trajectory/cache keys match the actual env.csv; after creating the deep-copied
self.test_run from self.original_test_run and after setting step/output_path,
copy the sampled params into self.test_run.current_env_params (e.g. assign a
deepcopy of self.original_test_run.current_env_params or the sampled params
present on the runner) before calling get_observation/compute_reward and
write_trajectory; this ensures dict(self.test_run.current_env_params) in the
TrajectoryEntry reflects the real sampled parameters.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 0097c65c-5ab7-4bd2-8683-6e1ea26cc182

📥 Commits

Reviewing files that changed from the base of the PR and between 18ce72e and d076095.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (14)

pyproject.toml
src/cloudai/_core/test_scenario.py
src/cloudai/cli/handlers.py
src/cloudai/configurator/__init__.py
src/cloudai/configurator/base_agent.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/env_params.py
src/cloudai/configurator/gymnasium_adapter.py
src/cloudai/models/workload.py
tests/test_cloudaigym.py
tests/test_env_params.py
tests/test_gymnasium_adapter.py
tests/test_handlers.py
tests/test_test_scenario.py

coderabbitai · 2026-06-12T21:38:04Z

+    def run(self) -> int:
+        """
+        Orchestrate this agent's exploration over ``self.env``.
+
+        Default: a step loop driven by the dispatcher (``select_action`` →
+        ``env.step`` → ``update_policy`` per trial). Agents that drive their
+        own training loop (e.g. RLlib-based agents calling ``algo.train()``)
+        override this method.
+
+        Returns:
+            int: Process-style return code (``0`` success, non-zero failure).
+            ``handle_dse_job`` accumulates this via ``err |= agent.run()``.
+        """
+        observation, _ = self.env.reset()
+        for _ in range(self.max_steps):
+            result = self.select_action(observation=observation)
+            if result is None:
+                break
+            step, action = result
+            logging.info(f"Running step {step} (of {self.max_steps}) with action {action}")
+            prev_observation = observation
+            observation, reward, done, *_ = self.env.step(action)
+            self.update_policy(
+                {
+                    "trial_index": step,
+                    "value": reward,
+                    "observation": observation,
+                    "prev_observation": prev_observation,
+                    "action": action,
+                    "done": done,
+                }
+            )
+            logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
+        return 0


🚀 Performance & Scalability | 🔵 Trivial | 💤 Low value

Consider lazy-evaluation logging for consistency with Ruff G004.

The f-string logging on lines 135 and 148 works correctly but eagerly evaluates formatting arguments even when the log level would skip the message. Using %-style formatting (logging.info("Running step %s (of %s) with action %s", step, self.max_steps, action)) allows the logging framework to defer evaluation, improving performance in production when info-level logs may be disabled.

This is a minor style and performance optimization; the current implementation is functionally correct.

♻️ Proposed refactor to %-style logging

- logging.info(f"Running step {step} (of {self.max_steps}) with action {action}") + logging.info("Running step %s (of %s) with action %s", step, self.max_steps, action) prev_observation = observation observation, reward, done, *_ = self.env.step(action) self.update_policy( { "trial_index": step, "value": reward, "observation": observation, "prev_observation": prev_observation, "action": action, "done": done, } ) - logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}") + logging.info( + "Step %s: Observation: %s, Reward: %.4f", + step, + [round(obs, 4) for obs in observation], + reward, + )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def run(self) -> int:

"""

Orchestrate this agent's exploration over ``self.env``.

Default: a step loop driven by the dispatcher (``select_action`` →

``env.step`` → ``update_policy`` per trial). Agents that drive their

own training loop (e.g. RLlib-based agents calling ``algo.train()``)

override this method.

Returns:

int: Process-style return code (``0`` success, non-zero failure).

``handle_dse_job`` accumulates this via ``err |= agent.run()``.

"""

observation, _ = self.env.reset()

for _ in range(self.max_steps):

result = self.select_action(observation=observation)

if result is None:

break

step, action = result

logging.info(f"Running step {step} (of {self.max_steps}) with action {action}")

prev_observation = observation

observation, reward, done, *_ = self.env.step(action)

self.update_policy(

{

"trial_index": step,

"value": reward,

"observation": observation,

"prev_observation": prev_observation,

"action": action,

"done": done,

}

)

logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")

return 0

def run(self) -> int:

"""

Orchestrate this agent's exploration over ``self.env``.

Default: a step loop driven by the dispatcher (``select_action`` →

``env.step`` → ``update_policy`` per trial). Agents that drive their

own training loop (e.g. RLlib-based agents calling ``algo.train()``)

override this method.

Returns:

int: Process-style return code (``0`` success, non-zero failure).

``handle_dse_job`` accumulates this via ``err |= agent.run()``.

"""

observation, _ = self.env.reset()

for _ in range(self.max_steps):

result = self.select_action(observation=observation)

if result is None:

break

step, action = result

logging.info("Running step %s (of %s) with action %s", step, self.max_steps, action)

prev_observation = observation

observation, reward, done, *_ = self.env.step(action)

self.update_policy(

{

"trial_index": step,

"value": reward,

"observation": observation,

"prev_observation": prev_observation,

"action": action,

"done": done,

}

)

logging.info(

"Step %s: Observation: %s, Reward: %.4f",

step,

[round(obs, 4) for obs in observation],

reward,

)

return 0

🧰 Tools

🪛 Ruff (0.15.15)

[warning] 135-135: Logging statement uses f-string

(G004)

[warning] 148-148: Logging statement uses f-string

(G004)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/base_agent.py` around lines 116 - 149, In run(), replace the eager f-string logging calls with %-style deferred logging so formatting happens only if the message will be emitted; specifically update the two logging.info calls inside the run method (the "Running step ..." and "Step {step}: Observation..." messages) to use logging.info("Running step %s (of %s) with action %s", step, self.max_steps, action) and logging.info("Step %s: Observation: %s, Reward: %.4f", step, ..., reward) respectively, ensuring you format the observation list without using f-strings so the logging framework can lazily interpolate.

coderabbitai · 2026-06-12T21:38:05Z

+# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
+# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.


📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Fix the copyright year header.

CI is already failing here: the new file is checked as touched in 2026, but the header still says 2024-2026. Update it to the exact year string the repo’s header test expects for this file, otherwise the PR cannot merge. Based on learnings, this repository’s copyright check requires touched-year formatting to match tests/test_check_copyright_headers.py.

🧰 Tools

🪛 GitHub Actions: CI / 0_Run pytest (3.12).txt

[error] 1-1: Copyright header year check failed in tests/test_check_copyright_headers.py. AssertionError: Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / 1_Run pytest (3.10).txt

[error] 1-1: Copyright header check failed in tests/test_check_copyright_headers.py. Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / Run pytest (3.10)

[error] 1-1: pytest tests/test_check_copyright_headers.py::test_src_copyright_header failed: Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / Run pytest (3.12)

[error] 1-1: test_src_copyright_header failed (Copyright year is not valid). Expected: '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'; Found: '# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/env_params.py` around lines 1 - 2, The file header years in src/cloudai/configurator/env_params.py are incorrect for the touched year; update the two top lines (the SPDX copyright line and the subsequent Copyright line) by replacing "2024-2026" with the exact year string the repo's header test expects for this file (e.g., "2026") so the header matches tests/test_check_copyright_headers.py.

Sources: Learnings, Pipeline failures

coderabbitai · 2026-06-12T21:38:05Z

+        raw_action_space = env.define_action_space()
+        self._tunable_params: dict[str, list] = {k: v for k, v in raw_action_space.items() if len(v) > 1}
+        self._fixed_params: dict[str, Any] = {k: v[0] for k, v in raw_action_space.items() if len(v) == 1}
+


🎯 Functional Correctness | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== BaseGym contract and declarations ===" rg -nC3 --type=py '\bclass\s+BaseGym\b|def define_action_space\s*$' src tests echo echo "=== Concrete define_action_space implementations ===" rg -nC3 --type=py 'def define_action_space\s*\(' src tests echo echo "=== Existing validation for empty candidate lists ===" rg -nC3 --type=py 'ValueError|assert|len\([^)]*$\s*==\s*0|if not [^:]+:' src tests

Repository: NVIDIA/cloudai

Length of output: 50372

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Inspect gymnasium_adapter.py around lines 60-110 ===" nl -ba src/cloudai/configurator/gymnasium_adapter.py | sed -n '50,130p' echo echo "=== Search for empty candidate validation in gymnasium_adapter.py ===" rg -n --type=py 'len$[^)]*$\s*==\s*0|==\s*0|empty|candidate|action_space' src/cloudai/configurator/gymnasium_adapter.py echo echo "=== Search tests for empty candidate cases ===" rg -n --type=py 'empty|len$[^)]*$\s*==\s*0|ValueError|candidate' tests/test_gymnasium_adapter.py echo echo "=== Show entire tests/test_gymnasium_adapter.py (small file?) ===" wc -l tests/test_gymnasium_adapter.py nl -ba tests/test_gymnasium_adapter.py | sed -n '1,220p'

Repository: NVIDIA/cloudai

Length of output: 176

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Inspect gymnasium_adapter.py around lines 60-110 ===" nl -ba src/cloudai/configurator/gymnasium_adapter.py | sed -n '50,130p' echo echo "=== Search for empty candidate validation in gymnasium_adapter.py ===" rg -n --type=py 'len$[^)]*$\s*==\s*0|==\s*0|empty|candidate|action_space' src/cloudai/configurator/gymnasium_adapter.py echo echo "=== Search tests for empty candidate cases ===" rg -n --type=py 'empty|len$[^)]*$\s*==\s*0|ValueError|candidate' tests/test_gymnasium_adapter.py echo echo "=== Show tests/test_gymnasium_adapter.py size and contents (if small) ===" wc -l tests/test_gymnasium_adapter.py nl -ba tests/test_gymnasium_adapter.py | sed -n '1,260p'

Repository: NVIDIA/cloudai

Length of output: 176

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Inspect gymnasium_adapter.py around lines 60-130 ===" sed -n '60,130p' src/cloudai/configurator/gymnasium_adapter.py echo echo "=== Search for empty candidate validation in gymnasium_adapter.py ===" rg -n --type=py 'len$[^)]*$\s*==\s*0|==\s*0|empty|candidate|action_space' src/cloudai/configurator/gymnasium_adapter.py echo echo "=== Search tests for empty candidate cases ===" rg -n --type=py 'empty|len$[^)]*$\s*==\s*0|ValueError|candidate' tests/test_gymnasium_adapter.py echo echo "=== Test file size and contents (first 260 lines) ===" wc -l tests/test_gymnasium_adapter.py sed -n '1,260p' tests/test_gymnasium_adapter.py

Repository: NVIDIA/cloudai

Length of output: 12033

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Inspect BaseGym define_action_space docstring ===" sed -n '1,120p' src/cloudai/configurator/base_gym.py echo echo "=== Inspect cloudai_gym define_action_space and where param_space comes from ===" sed -n '1,200p' src/cloudai/configurator/cloudai_gym.py echo echo "=== Search for define_action_space usage/validation of empty lists ===" rg -n --type=py 'define_action_space|param_space|action_space' src/cloudai/configurator echo echo "=== Search for explicit checks against empty candidate lists ([] or len==0) ===" rg -n --type=py 'len$[^)]*$\s*==\s*0|not\s+[^:]+|empty\s+|ValueError|assert' src/cloudai/configurator/gymnasium_adapter.py src/cloudai/configurator/base_gym.py src/cloudai/configurator/cloudai_gym.py

Repository: NVIDIA/cloudai

Length of output: 15364

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Search for 'param_space' definition/assignment ===" rg -n --type=py '\bparam_space\b\s*=' src echo echo "=== Search for 'apply_params_set' and how action keys/values are validated ===" rg -n --type=py 'def apply_params_set\s*$' src/cloudai echo echo "=== Show BaseGym / adapter helper _assert_keys and surrounding logic ===" sed -n '110,210p' src/cloudai/configurator/gymnasium_adapter.py echo echo "=== Search for empties in param_space values (len(...)==0) across src ===" rg -n --type=py 'len\([^)]*$\s*==\s*0|==\s*\[\s*\]|\[\s*\]\s*$' src/cloudai

Repository: NVIDIA/cloudai

Length of output: 113

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Locate all param_space references ===" rg -n --type=py '\bparam_space\b' src echo echo "=== Search for 'explorable' / action space construction utilities ===" rg -n --type=py '(explorable|action_space|param_space|candidate|candidates|param|grid_search)' src/cloudai/configurator echo echo "=== Search for where TestRun params set are applied/validated ===" rg -n --type=py 'apply_params_set|current_env_params|increment_step' src/cloudai echo echo "=== Search for any explicit validation of action values (empty list, len==0) ===" rg -n --type=py 'len$[^)]*$\s*==\s*0|ValueError\([^)]*empty|empty candidate|candidate list' src/cloudai

Repository: NVIDIA/cloudai

Length of output: 12650

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Inspect src/cloudai/_core/test_scenario.py around param_space property ===" sed -n '130,230p' src/cloudai/_core/test_scenario.py echo echo "=== Inspect apply_params_set in src/cloudai/_core/test_scenario.py ===" sed -n '230,340p' src/cloudai/_core/test_scenario.py echo echo "=== Inspect env_params categorical spec validation for candidate lists ===" sed -n '1,220p' src/cloudai/configurator/env_params.py

Repository: NVIDIA/cloudai

Length of output: 11074

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Locate TestDefinition / cmd_args model types ===" rg -n --type=py '\bclass\s+TestDefinition\b|cmd_args.*BaseModel|extra_env_vars|is_dse_excluded_arg|is_dse_job' src/cloudai/_core src/cloudai echo echo "=== Locate implementation of is_dse_excluded_arg ===" rg -n --type=py 'def\s+is_dse_excluded_arg\s*$' src/cloudai echo echo "=== Inspect likely DSE candidate-list validations (min_length, non-empty, ValueError) ===" rg -n --type=py 'min_length\s*=\s*1|min_length\s*=\s*2|non-empty|nonempty|empty|len\(.*$\s*==\s*0|min_items|min_length|validate.*list|model_validator.*list' src/cloudai/_core src/cloudai/configurator src/cloudai

Repository: NVIDIA/cloudai

Length of output: 13895

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "=== Inspect src/cloudai/models/workload.py around TestDefinition and DSE helpers ===" sed -n '1,260p' src/cloudai/models/workload.py echo echo "=== Inspect remaining part of workload.py for cmd_args model validators ===" sed -n '260,520p' src/cloudai/models/workload.py echo echo "=== Search within workload.py for list validators on cmd_args / action params ===" rg -n --type=py 'cmd_args|model_validator|validator|Field\(|min_length|min_items|empty|ValueError' src/cloudai/models/workload.py

Repository: NVIDIA/cloudai

Length of output: 9136

Add a fail-fast guard for empty candidate lists
TestRun.param_space includes cmd_args entries whenever the value is a list (even []), but GymnasiumAdapter.__init__ only treats len(v) > 1 as tunable and len(v) == 1 as fixed—so len(v) == 0 keys are dropped from both _tunable_params and _fixed_params, silently changing the action schema and leaving the underlying cmd_args unset. Add a ValueError guard for len(v) == 0 in GymnasiumAdapter and a unit test for the empty-list case.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/gymnasium_adapter.py` around lines 77 - 80, GymnasiumAdapter.__init__ currently filters action space lists into _tunable_params (len>1) and _fixed_params (len==1) but silently drops any keys with empty lists; modify GymnasiumAdapter.__init__ to detect any entry in the raw_action_space where len(v) == 0 and raise a ValueError identifying the offending key(s) so callers (e.g., TestRun.param_space) fail fast; add a unit test that constructs a TestRun or raw_action_space containing an empty list for a cmd_args key and asserts that GymnasiumAdapter(...) raises ValueError with a message mentioning the empty key(s).

coderabbitai · 2026-06-12T21:38:05Z

+    def render(self) -> None:
+        self._env.render()
+
+    @staticmethod
+    def _assert_keys(received: Any, expected: set[str], ctx: str) -> None:
+        received_set = set(received)
+        if received_set == expected:
+            return
+        missing = sorted(expected - received_set)
+        extra = sorted(received_set - expected)
+        raise ValueError(f"{ctx} keys mismatch; missing={missing}, extra={extra}")
+
+    def _step_with_params(self, params: dict[str, Any]) -> tuple[Any, float, bool, bool, dict[str, Any]]:
+        self._sync_underlying_step_counter()
+        obs, reward, done, info = self._env.step(params)
+        self._step_count += 1
+        return self._as_obs_array(obs), float(reward), bool(done), False, info
+
+    def _sync_underlying_step_counter(self) -> None:
+        """
+        Mirror ``handle_dse_job``'s 1-based ``test_run.step`` so artifact paths match.
+
+        The first step is written under ``…/<iteration>/1/``, matching how
+        ``handle_dse_job`` numbers steps; this keeps reports and trajectory
+        analysis consistent regardless of whether the env is driven by the
+        DSE loop or by an external training loop wrapping the adapter.
+        """
+        test_run = getattr(self._env, "test_run", None)
+        if test_run is not None:
+            test_run.step = self._step_count + 1
+
+    def _as_obs_array(self, obs: Any) -> Any:
+        return self._np.asarray(obs, dtype=self._np.float32)


🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail FILE="src/cloudai/configurator/gymnasium_adapter.py" echo "== File head ==" sed -n '1,220p' "$FILE" echo "== Search for close() in this file ==" rg -n "def close\\(" "$FILE" || true echo "== Search for close usage in repo ==" rg -n "\\.close\$" src || true echo "== Search for gymnasium_adapter usage ==" rg -n "gymnasium_adapter" -S . || true

Repository: NVIDIA/cloudai

Length of output: 7178

🏁 Script executed:

#!/bin/bash set -euo pipefail rg -n "close" tests/test_gymnasium_adapter.py src/cloudai/configurator/gymnasium_adapter.py || true sed -n '1,260p' tests/test_gymnasium_adapter.py

Repository: NVIDIA/cloudai

Length of output: 7906

🏁 Script executed:

#!/bin/bash set -euo pipefail # Inspect BaseGym for close() fd -a "base_gym.py" src/cloudai/configurator -t f -x 'echo {}' || true BASE_GYM_FILE="$(fd -a "base_gym.py" src/cloudai/configurator -t f | head -n 1)" echo "Using: $BASE_GYM_FILE" sed -n '1,240p' "$BASE_GYM_FILE" echo "== Search for close() usage on GymnasiumAdapter instances ==" rg -n "GymnasiumAdapter\\(" -S src tests || true rg -n "gymnasium_adapter\\.GymnasiumAdapter|configurator\\.GymnasiumAdapter|from cloudai\\.configurator import GymnasiumAdapter" -S src tests || true rg -n "\\.close\\(\$" -S src tests || true echo "== Search for __getattr__ / delegation patterns in gymnasium_adapter.py ==" rg -n "__getattr__|__getattribute__|wrap|delegate" src/cloudai/configurator/gymnasium_adapter.py || true

Repository: NVIDIA/cloudai

Length of output: 4640

🌐 Web query:

Gymnasium Env close method API optional wrapper close

💡 Result:

In the Gymnasium API, the close method is used to release resources (such as rendering windows, open files, or database connections) used by an environment [1][2]. Implementation of the close method is optional in custom environments [2]. If an environment does not require any specific resource cleanup, you do not need to implement this method [2]. When using Gymnasium wrappers: 1. The base gymnasium.Wrapper class provides a default close method that calls close on the underlying environment attribute (self.env.close) [3]. 2. If you are creating a custom wrapper that adds its own resource-dependent logic, you can override the close method [4][5]. In such cases, it is standard practice to ensure the underlying environment's close method is still called, typically by including super.close in your override to maintain the chain of closing across wrapped environments [3]. You should not interact with an environment after calling its close method [2]. Additionally, the Gymnasium environment checker utility may test the close method to ensure it functions correctly and can be called multiple times without raising an error [6].

Citations:

1: https://gymnasium.farama.org/api/env/

2: https://gymnasium.farama.org/main/tutorials/gymnasium_basics/environment_creation/

3: https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/core.py

4: https://gymnasium.farama.org/main/tutorials/gymnasium_basics/implementing_custom_wrappers/

5: https://gymnasium.farama.org/tutorials/gymnasium_basics/implementing_custom_wrappers/

6: https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/utils/env_checker.py

🌐 Web query:

Gymnasium Env class has close() default implementation no-op core.py

💡 Result:

In the Gymnasium library, the Env class does not define a default, no-op implementation of the close method in its core.py file. Rather, close is defined as an abstract method or a required interface that custom environments are expected to implement if they need to release external resources [1][2][3]. While Gymnasium's documentation acknowledges that in many cases a custom environment may not need to implement close, the base Env class does not provide a functional default [4][5]. Users are encouraged to override this method to perform necessary cleanup, such as closing windows (e.g., Pygame) or releasing file handles [4][5]. Because there is no default implementation in the base Env class, calling close on an environment that has not explicitly defined it—or failing to implement it when resources are involved—can lead to issues where resources remain open [4][6]. Consequently, users must explicitly manage environment cleanup to ensure proper resource management [6].

Citations:

1: https://github.com/Farama-Foundation/Gymnasium/blob/58c6b7b49da20418b21df632866eb215f8c26876/gymnasium/core.py

2: https://gymnasium.farama.org/api/env/

3: https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/core.py

4: https://gymnasium.farama.org/main/tutorials/gymnasium_basics/environment_creation/

5: https://gymnasium.farama.org/v1.1.0/tutorials/gymnasium_basics/environment_creation/

6: Removed destructor of VectorEnv as it causes VectorEnv.close() to be called unintentionally. Fixes #1521. Farama-Foundation/Gymnasium#1522

Add close() to the Gymnasium-shaped adapter API

GymnasiumAdapter is a standalone “Env-shaped” object (it doesn’t inherit from gymnasium.Env/Wrapper) and the class provides no close() method, so teardown code that calls env.close() can fail with AttributeError. Delegate to the wrapped BaseGym.close when available.

♻️ Proposed fix

def render(self) -> None: self._env.render() + def close(self) -> None: + close = getattr(self._env, "close", None) + if callable(close): + close() + `@staticmethod` def _assert_keys(received: Any, expected: set[str], ctx: str) -> None: received_set = set(received)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def render(self) -> None:

self._env.render()

@staticmethod

def _assert_keys(received: Any, expected: set[str], ctx: str) -> None:

received_set = set(received)

if received_set == expected:

return

missing = sorted(expected - received_set)

extra = sorted(received_set - expected)

raise ValueError(f"{ctx} keys mismatch; missing={missing}, extra={extra}")

def _step_with_params(self, params: dict[str, Any]) -> tuple[Any, float, bool, bool, dict[str, Any]]:

self._sync_underlying_step_counter()

obs, reward, done, info = self._env.step(params)

self._step_count += 1

return self._as_obs_array(obs), float(reward), bool(done), False, info

def _sync_underlying_step_counter(self) -> None:

"""

Mirror ``handle_dse_job``'s 1-based ``test_run.step`` so artifact paths match.

The first step is written under ``…/<iteration>/1/``, matching how

``handle_dse_job`` numbers steps; this keeps reports and trajectory

analysis consistent regardless of whether the env is driven by the

DSE loop or by an external training loop wrapping the adapter.

"""

test_run = getattr(self._env, "test_run", None)

if test_run is not None:

test_run.step = self._step_count + 1

def _as_obs_array(self, obs: Any) -> Any:

return self._np.asarray(obs, dtype=self._np.float32)

def render(self) -> None:

self._env.render()

def close(self) -> None:

close = getattr(self._env, "close", None)

if callable(close):

close()

`@staticmethod`

def _assert_keys(received: Any, expected: set[str], ctx: str) -> None:

received_set = set(received)

if received_set == expected:

return

missing = sorted(expected - received_set)

extra = sorted(received_set - expected)

raise ValueError(f"{ctx} keys mismatch; missing={missing}, extra={extra}")

def _step_with_params(self, params: dict[str, Any]) -> tuple[Any, float, bool, bool, dict[str, Any]]:

self._sync_underlying_step_counter()

obs, reward, done, info = self._env.step(params)

self._step_count += 1

return self._as_obs_array(obs), float(reward), bool(done), False, info

def _sync_underlying_step_counter(self) -> None:

"""

Mirror ``handle_dse_job``'s 1-based ``test_run.step`` so artifact paths match.

The first step is written under ``…/<iteration>/1/``, matching how

``handle_dse_job`` numbers steps; this keeps reports and trajectory

analysis consistent regardless of whether the env is driven by the

DSE loop or by an external training loop wrapping the adapter.

"""

test_run = getattr(self._env, "test_run", None)

if test_run is not None:

test_run.step = self._step_count + 1

def _as_obs_array(self, obs: Any) -> Any:

return self._np.asarray(obs, dtype=self._np.float32)

🧰 Tools

🪛 Ruff (0.15.15)

[warning] 137-137: Dynamically typed expressions (typing.Any) are disallowed in received

(ANN401)

[warning] 143-143: Avoid specifying long messages outside the exception class

(TRY003)

[warning] 164-164: Dynamically typed expressions (typing.Any) are disallowed in obs

(ANN401)

[warning] 164-164: Dynamically typed expressions (typing.Any) are disallowed in _as_obs_array

(ANN401)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/gymnasium_adapter.py` around lines 133 - 165, Add a public close(self) -> None method to GymnasiumAdapter that safely delegates to the wrapped environment's close method: check for an attribute (e.g. getattr(self._env, "close", None)), call it if callable, and return None; do not assume inheritance from gymnasium.Env. Add the method near other lifecycle helpers (e.g. alongside _sync_underlying_step_counter, _step_with_params) so callers that call env.close() won't raise AttributeError.

coderabbitai · 2026-06-12T21:38:05Z

+# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
+# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.


📐 Maintainability & Code Quality | 🔴 Critical | ⚡ Quick win

Fix copyright header year to match CI expectations.

The copyright header shows 2025-2026 but CI expects 2026 for new files created in 2026. This is breaking the copyright header check in CI.

📅 Proposed fix

# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES -# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. # SPDX-License-Identifier: Apache-2.0

Based on learnings: In the NVIDIA/cloudai repo, copyright headers' year formatting must match the test logic enforced by tests/test_check_copyright_headers.py. New files should list only the year they were created.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES

# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES

# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

🧰 Tools

🪛 GitHub Actions: CI / 0_Run pytest (3.12).txt

[error] 1-1: Copyright header year check failed in tests/test_check_copyright_headers.py. AssertionError: Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / 1_Run pytest (3.10).txt

[error] 1-1: Copyright header check failed in tests/test_check_copyright_headers.py. Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / Run pytest (3.10)

[error] 1-1: pytest tests/test_check_copyright_headers.py::test_src_copyright_header failed: Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / Run pytest (3.12)

[error] 1-1: test_src_copyright_header failed (Copyright year is not valid). Expected: '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'; Found: '# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_env_params.py` around lines 1 - 2, The copyright header in the file contains the range "2025-2026" which fails CI; update the header lines (the SPDX line and the Copyright line that currently reads "2025-2026 NVIDIA CORPORATION & AFFILIATES") to use a single year "2026" so they match the expected format for new 2026 files.

Sources: Learnings, Pipeline failures

coderabbitai · 2026-06-12T21:38:05Z

+    with pytest.raises(ValueError):
+        sink.write(0, {"drop_rate": 0.0})


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add match parameter to narrow the pytest.raises scope.

The bare pytest.raises(ValueError) is broad. Adding a match parameter would verify the specific error condition being tested.

🔍 Proposed fix

- with pytest.raises(ValueError): - sink.write(0, {"drop_rate": 0.0}) + with pytest.raises(ValueError, match=r"step.*must.*positive|step.*0"): + sink.write(0, {"drop_rate": 0.0})

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

with pytest.raises(ValueError):

sink.write(0, {"drop_rate": 0.0})

with pytest.raises(ValueError, match=r"step.*must.*positive|step.*0"):

sink.write(0, {"drop_rate": 0.0})

🧰 Tools

🪛 Ruff (0.15.15)

[warning] 86-86: pytest.raises(ValueError) is too broad, set the match parameter or use a more specific exception

(PT011)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_env_params.py` around lines 86 - 87, The test uses a broad pytest.raises(ValueError) when calling sink.write(0, {"drop_rate": 0.0}); narrow it by adding a match parameter to pytest.raises that matches the specific ValueError message expected (e.g., the message about invalid drop_rate or index) so the test asserts the exact error condition raised by sink.write; update the pytest.raises invocation around sink.write to include match="expected message substring" referencing the sink.write call and pytest.raises.

Source: Linters/SAST tools

coderabbitai · 2026-06-12T21:38:05Z

+    def select_action(self) -> tuple[int, dict[str, Any]]:  # pragma: no cover - never called
+        raise AssertionError("select_action must not be called when run() is overridden")


🎯 Functional Correctness | 🔴 Critical | ⚡ Quick win

Fix select_action() signature to match base class.

The override is missing the observation parameter that exists in BaseAgent.select_action(self, observation: list[float] | None = None). This breaks pyright type checking and CI.

🔧 Proposed fix

- def select_action(self) -> tuple[int, dict[str, Any]]: # pragma: no cover - never called + def select_action(self, observation: list[float] | None = None) -> tuple[int, dict[str, Any]]: # pragma: no cover raise AssertionError("select_action must not be called when run() is overridden")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def select_action(self) -> tuple[int, dict[str, Any]]: # pragma: no cover - never called

raise AssertionError("select_action must not be called when run() is overridden")

def select_action(self, observation: list[float] | None = None) -> tuple[int, dict[str, Any]]: # pragma: no cover

raise AssertionError("select_action must not be called when run() is overridden")

🧰 Tools

🪛 GitHub Actions: CI / 2_Linting.txt

[error] 287-287: pyright failed (hook id: pyright). Method "select_action" overrides class "BaseAgent" incompatibly: positional parameter count mismatch (base has 2, override has 1). Parameter 2 mismatch: base "observation" is keyword parameter, override parameter is position-only (reportIncompatibleMethodOverride).

🪛 GitHub Actions: CI / Linting

[error] 287-287: pyright (hook id: pyright) failed: Method "select_action" overrides class "BaseAgent" in an incompatible manner (reportIncompatibleMethodOverride). Positional parameter count mismatch (base has 2, override has 1). Parameter 2 mismatch: base parameter "observation" is keyword parameter, override parameter is position-only.

🪛 Ruff (0.15.15)

[warning] 288-288: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_handlers.py` around lines 287 - 288, The override of select_action in tests/test_handlers.py has the wrong signature; change the function declaration for select_action in that file to match the base class signature BaseAgent.select_action(self, observation: list[float] | None = None) (i.e., add the observation parameter with the same type and default) and keep the existing body (raising AssertionError) so pyright/CI type-checking passes while preserving the test intent.

Source: Pipeline failures

rutayan-nv added 10 commits June 12, 2026 16:17

rutayan-nv force-pushed the rpatro/env-params-first-class branch from d0375b4 to d076095 Compare June 12, 2026 21:14

This was referenced Jun 12, 2026

[CLI/Core] Polymorphic agent dispatch + TestRun owns trial counter #893

Open

[TDD-red] CloudAIGymEnv cache key must include env_params (drives domain-randomization correctness fix) #900

Open

rutayan-nv marked this pull request as ready for review June 12, 2026 21:27

rutayan-nv requested review from jeffnvidia, podkidyshev and srivatsankrishnan as code owners June 12, 2026 21:27

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(configurator): make env_params first-class to fix the trajectory cache key#901

fix(configurator): make env_params first-class to fix the trajectory cache key#901
rutayan-nv wants to merge 10 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/env-params-first-class

rutayan-nv commented May 26, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 26, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Uh oh!

coderabbitai Bot Jun 12, 2026

Uh oh!

coderabbitai Bot Jun 12, 2026

Uh oh!

coderabbitai Bot Jun 12, 2026

Uh oh!

coderabbitai Bot Jun 12, 2026

Uh oh!

coderabbitai Bot Jun 12, 2026

Uh oh!

coderabbitai Bot Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
		# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

		# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
		# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

		with pytest.raises(ValueError):
		sink.write(0, {"drop_rate": 0.0})

		def select_action(self) -> tuple[int, dict[str, Any]]: # pragma: no cover - never called
		raise AssertionError("select_action must not be called when run() is overridden")

Conversation

rutayan-nv commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Stacking

Stack & rebase status (updated 2026-06-12)

Uh oh!

coderabbitai Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rutayan-nv commented May 26, 2026 •

edited

Loading

coderabbitai Bot commented May 26, 2026 •

edited

Loading