Skip to content

fix(configurator): make env_params first-class to fix the trajectory cache key#901

Open
rutayan-nv wants to merge 10 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/env-params-first-class
Open

fix(configurator): make env_params first-class to fix the trajectory cache key#901
rutayan-nv wants to merge 10 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/env-params-first-class

Conversation

@rutayan-nv

@rutayan-nv rutayan-nv commented May 26, 2026

Copy link
Copy Markdown
Contributor

Summary

Turns the four TDD-red tests in #900 green and makes the underlying
design change required to fix domain randomization in cloudai.

Today, CloudAIGymEnv.get_cached_trajectory_result() keys the cache on
action only. An agent revisiting the same action under a different
env_params sample (e.g. drop_rate) gets the first trial's reward
served back from cache, silently corrupting the reward signal. The
parallel artifact (env.csv) lived in workload code and was skipped on
cache hits, so it was sparse and mis-aligned with trajectory.csv.

This PR moves env_params from a workload-private convention into
first-class cloudai schema so the same fix benefits every agent (PPO,
BO, GA, MAB) and every workload uniformly:

  • TestDefinition.env_params: dict[str, EnvParamSpec] — sibling to
    cmd_args; not part of the agent's action space.
  • TestRun.current_env_params: dict[str, Any] — per-trial sample
    attached by the env, visible to the cache key and to downstream
    sinks.
  • TrajectoryEntry gains env_params. The cache key is now
    (action, current_env_params).
  • CloudAIGymEnv builds an EnvParamsObserver whenever the
    TestDefinition declares env_params. before_step() samples
    deterministically by (seed, name, trial), stashes the sample on
    test_run, and appends to env.csv before the cache lookup — so
    env.csv is written on every trial (cache hit or miss) and aligns
    1:1 with trajectory.csv on step.
  • Workloads without any [env_params.*] block pay zero overhead: no
    observer, no env.csv, identical behaviour to today.

A follow-up PR in cloudaix will delete the now-redundant
env_params.py / env_params_sampler.py / env_params_sink.py
modules in domain_randomization/workloads/mrc_prt/ and route the
demo workload through the cloudai-owned schema.

Test plan

  • The 4 TDD-red tests added in [TDD-red] CloudAIGymEnv cache key must include env_params (drives domain-randomization correctness fix) #900 (test_cache_miss_when_env_params_differ,
    test_step_reruns_workload_when_env_params_change, plus two control
    tests) now pass.
  • 9 new unit tests in tests/test_env_params.py cover EnvParamSpec
    validation, EnvParamsSampler determinism + per-parameter
    independence, CsvSink format + edge cases, and
    EnvParamsObserver wiring.
  • 2 new integration tests in tests/test_cloudaigym.py pin the
    env.csvtrajectory.csv step-alignment contract and the
    zero-overhead claim for workloads without env_params.
  • Full cloudai test suite: 1479 passed, 4 skipped.
  • ruff check / ruff format / pyright clean on all touched files.

Stacking

Depends on #900 (the TDD-red tests). Either rebase #900 onto this PR
after merge, or merge them together. Both are stacked on #893
(rpatro/step-ownership-tdd).


Stack & rebase status (updated 2026-06-12)

Stack: main#893#900#901 (top)

Re-stacked on #900. Carries the env_params first-class fix + the cache-hit integration test. Folded in the increment_step() test adaptation — the env_params tests now let TestRun.increment_step() drive the trial counter, since #893's step-ownership change sits below this PR in the stack.

@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

This PR implements domain randomization (environment parameter sampling) for CloudAI DSE workloads, refactors agent execution to a delegated run() method, introduces a Gymnasium-compatible adapter, and updates trajectory caching and observation logic to account for sampled environment state.

Changes

Domain Randomization and Agent Execution Refactoring

Layer / File(s) Summary
Gymnasium dependency
pyproject.toml
Gymnasium ~1.2 is added to dev and rl optional dependencies.
Core TestRun and TestDefinition data model
src/cloudai/_core/test_scenario.py, src/cloudai/models/workload.py
TestRun gains current_env_params field and increment_step() method to track environment state. TestDefinition gains env_params: dict[str, EnvParamSpec] field for declaring domain-randomized parameters per trial.
Agent run() method and handler delegation
src/cloudai/configurator/base_agent.py, src/cloudai/cli/handlers.py
BaseAgent implements concrete run() method that resets env, iterates steps via select_action, executes env.step(), constructs feedback for update_policy, logs, and returns exit code. Handler delegates agent execution to agent.run() instead of manual iteration, aggregating return codes.
Environment parameter specification and sampling
src/cloudai/configurator/env_params.py
New module adds EnvParamSpec (categorical values with optional weights), EnvParamsSampler (deterministic per-trial sampling), EnvParamsSink protocol and CsvSink (env.csv persistence), StepObserver protocol, and EnvParamsObserver (samples before step, persists via sink).
CloudAIGymEnv observer and env_params integration
src/cloudai/configurator/cloudai_gym.py
TrajectoryEntry adds env_params field. CloudAIGymEnv builds observers on init, updates define_observation_space() to dynamic shape, invokes observer.before_step() before step and observer.after_step() after trajectory write. Cached and non-cached paths persist env_params with trajectory. get_cached_trajectory_result() matches cache by both action and env_params values (empty env_params for back-compat).
Gymnasium environment adapter
src/cloudai/configurator/gymnasium_adapter.py, src/cloudai/configurator/__init__.py
New GymnasiumAdapter wraps BaseGym with lazy Gymnasium/NumPy imports, splits parameters into tunable/fixed, exposes tunables as Dict of Discrete action spaces, converts observations to float32 arrays, decodes actions from indices, steps with Gymnasium 5-tuple output, and synchronizes test_run.step (1-based). Exported via module __all__.
Test suite
tests/test_env_params.py, tests/test_gymnasium_adapter.py, tests/test_cloudaigym.py, tests/test_handlers.py, tests/test_test_scenario.py
Validates EnvParamSpec constraints, EnvParamsSampler determinism, CsvSink CSV writing, EnvParamsObserver integration, GymnasiumAdapter action/observation spaces and step synchronization, error handling, CloudAIGymEnv step alignment and cache behavior with env_params (miss on differ, hit on match, back-compat absent), workload rerun on param change, env.csv/trajectory.csv alignment, observer execution on cache hit, and agent.run() delegation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • podkidyshev
  • srivatsankrishnan
  • jeffnvidia

🐰 With domain params now sampled per trial,
And agents delegating their runs with style,
The Gymnasium adapter joins the fold—
Environment randomization, a story to be told!

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: making env_params first-class to fix the trajectory cache key, which is the core objective of this PR.
Description check ✅ Passed The description is highly detailed and directly related to the changeset, explaining the problem, solution, design changes, test coverage, and stacking dependencies.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

- Agents that set HAS_CUSTOM_TRAINING_LOOP = True drive their own training loop;
  handle_dse_job calls agent.train() and skips the per-step env.step loop.
- New _run_custom_training_loop helper logs exceptions, returns a process-style
  exit code, and always invokes agent.shutdown() (when defined) in a finally
  block so resources are released on both success and failure paths.
- CustomTrainingLoopAgent Protocol documents the opt-in contract for type
  checkers and IDEs.
Pyright rejected calling _run_custom_training_loop(agent, ...) because the
plain bool predicate did not narrow agent's static type from BaseAgent to
CustomTrainingLoopAgent. Return TypeGuard[CustomTrainingLoopAgent] from
_has_custom_training_loop so the truthy branch in handle_dse_job sees the
opted-in shape and the helper can call agent.train() directly.
If agent.shutdown() raised from the finally block, Python suppressed the
earlier return 0/1 from agent.train() and propagated the exception, breaking
the outer test-run loop in handle_dse_job (skipped remaining scenarios,
failed to accumulate err |= rc). Wrap shutdown() in its own try/except,
log via logging.exception, set rc = 1, and return rc after finally so the
helper always honours the (int) -> int contract.

Adds tests for shutdown-only failure and combined train+shutdown failure.
…mutator

Previously, ``test_run.step`` had no clear owner: the dispatcher set it from
outside, the adapter rewound it on ``reset()``, and other callers wrote to
it ad hoc. In RLlib custom-loop runs this collapsed every trial onto
``step=1``, overwriting ``trajectory.csv`` and ``env.csv`` rows.

Centralize the invariant: ``TestRun.increment_step()`` is the single named
mutator, and ``CloudAIGymEnv.step()`` is its only caller. One ``env.step()``
call advances the trial counter by exactly one — independent of any episode
or dispatcher concept above the gym env.

Contract tests in ``TestIncrementStep`` cover the API; ``test_cloudaigym``
asserts ``step`` is advanced *before* ``output_path`` and trajectory rows
are computed, so cached and live trials both record the post-increment value.
…orphism

Earlier commits in this PR introduced ``HAS_CUSTOM_TRAINING_LOOP`` + a
``CustomTrainingLoopAgent`` Protocol + a TypeGuard helper + an ``if/else``
in ``handle_dse_job`` to switch between the cloudai step loop and an
agent-owned ``train()`` loop. That is a type-tagged conditional dispatching
on agent identity — the textbook signal to replace conditional with
polymorphism (Fowler).

Add a default ``BaseAgent.run() -> int`` that holds the step-loop body
(``select_action`` / ``env.step`` / ``update_policy`` per trial). Agents
that drive their own training (RLlib, etc.) override ``run()`` to delegate
to whatever loop they own and return a process-style exit code.

``handle_dse_job`` collapses to ``err |= agent.run()`` — one line, no
branching, no Protocol vocabulary. The handler no longer knows that
"custom training loops" exist as a category; that's an agent implementation
detail.

Net: -89 lines on cloudai. Surface area shrinks (no Protocol, no TypeGuard,
no flag). ``test_handlers`` replaces the 5 helper unit tests + 2 dispatcher
integration tests with 2 polymorphic tests asserting ``handle_dse_job``
delegates to ``agent.run()`` and propagates its return code.
- GymnasiumAdapter wraps a CloudAI BaseGym as a gymnasium-compatible env:
  Dict of Discrete actions over the tunable params, float32 Box observation
  derived from define_observation_space(), gymnasium 5-tuple from step/step_raw.
- Fixed (single-value) params are injected on every step so agents only ever
  see the tunable subset; step_raw bypasses index decoding and fixed injection
  for callers that already have a fully-formed parameter dict.
- The adapter mirrors handle_dse_job's 1-based test_run.step so artifact paths
  match whether the env is driven by the DSE loop or an external training loop.
- CloudAIGymEnv.define_observation_space() now returns one slot per agent
  metric (at least one), giving adapters the correct Box shape.
- gymnasium is an optional dependency: lazy-imported inside the adapter and
  exposed via a new rl extra (also added to dev) pinned to ~=1.2.
decode_action used to accept partial action dicts and step_raw accepted
arbitrary key sets, both silently propagating incomplete params to the
underlying env. Validate up front:

- decode_action requires keys exactly == self._tunable_params, and each
  discrete index must be in range.
- step_raw requires keys exactly == self._tunable_params | self._fixed_params.

A new _assert_keys helper raises ValueError with sorted missing / extra lists
so callers see exactly what was wrong. Adds tests for missing key, unknown
key, out-of-range index, missing fixed param, and unknown raw key paths.
CloudAIGymEnv.get_cached_trajectory_result(action) keys the trajectory
cache on action alone. When a workload declares env_params (e.g.
drop_rate) and the agent re-selects the same action under a different
env_params sample, the cache returns a stale reward measured under a
different env, silently invalidating any domain-randomization workflow.

Four tests pin the contract; today two FAIL (bug-exposing), two PASS
(back-compat sanity). All 27 pre-existing tests are untouched.

  FAIL  test_cache_miss_when_env_params_differ
  FAIL  test_step_reruns_workload_when_env_params_change
  PASS  test_cache_hit_when_action_and_env_params_match
  PASS  test_cache_hit_when_neither_has_env_params

Driver for the follow-up PR that adds env_params as a first-class field
on TestDefinition + TrajectoryEntry and rewrites the cache key as
(action, env_params). Both unit and integration shapes are covered so
the fix can be validated end-to-end through env.step().
Domain randomization broke under the trajectory cache because the cache
key was action-only: an agent revisiting the same action under a
different env_params sample (e.g. drop_rate) got the *first* trial's
reward served back, silently corrupting the reward signal. env.csv was
also workload-owned and skipped on cache hits, so it was sparse and
mis-aligned with trajectory.csv.

This change makes env_params first-class in cloudai so the design works
the same way for every agent (PPO, BO, GA, MAB) and every workload:

- TestDefinition.env_params: dict[str, EnvParamSpec] - sibling to
  cmd_args, not part of the action space.
- TestRun.current_env_params: dict[str, Any] - per-trial sample
  attached by the env so the cache key and downstream sinks see it.
- TrajectoryEntry gains env_params; the cache key is now
  (action, current_env_params).
- CloudAIGymEnv builds an EnvParamsObserver when env_params is
  declared. before_step() samples deterministically by
  (seed, name, trial), stashes the sample on test_run, and appends to
  env.csv *before* the cache lookup - so env.csv is written on every
  trial (cache hit or miss) and aligns 1:1 with trajectory.csv by step.
- Workloads with no [env_params.*] block pay zero overhead: no
  observer, no env.csv, identical behaviour to today.

Tests: turns the four TDD-red tests from the parent PR green and adds
nine unit tests for EnvParamSpec / EnvParamsSampler / CsvSink /
EnvParamsObserver plus two integration tests pinning the env.csv ↔
trajectory.csv contract.
Closes the most important gap in the env_params test matrix: the
cache-HIT path under an observer-driven TestDefinition.env_params.

The previous tests covered each axis separately - cache key unit tests
for env_params, a step()-level cache-miss integration with manually
set current_env_params, and a step-alignment test where all trials
were cache misses. None of them proved that a cache HIT still drives
the observer (and therefore env.csv) - which is precisely the
sparsity contract the fix was about.

The new test:
  - declares env_params on the TestDefinition (real observer wired in),
  - pre-seeds the trajectory with an entry whose action AND deterministic
    env_params sample match what the observer will produce for step 1,
  - calls env.step() once and asserts:
      (a) runner.run is NOT called (cache short-circuit honored),
      (b) env.csv gains a row (observer fired before the cache lookup),
      (c) the new trajectory entry carries the per-trial env_params.

This is the end-to-end proof that the env.csv sparsity bug cannot
recur, and that env.csv ↔ trajectory.csv stay step-aligned on the
cache-hit path too.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/cloudai/configurator/cloudai_gym.py (2)

143-171: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Keep trajectory.csv aligned on the constraint-failure path.

observer.before_step() has already persisted the sampled env row by Line 147, but the early return on Lines 169-171 skips write_trajectory(). Any trial rejected by constraint_check() now leaves an orphan env.csv row, which breaks the PR’s stated 1:1 env.csvtrajectory.csv contract and makes per-trial artifacts inconsistent.

Suggested fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 143 - 171, The early
return on constraint_check leaves an env row without a matching trajectory; to
fix, when test_run.test.constraint_check(test_run, runner.system) returns False,
write a TrajectoryEntry (use self.write_trajectory with step=self.test_run.step,
action=action, reward=self.rewards.constraint_failure, observation=[-1.0],
env_params=dict(self.test_run.current_env_params)) and call each
observer.after_step(self.test_run, observation, reward) before returning the
same quadruple (observation, reward, True, {}); this mirrors the cached-result
path and preserves the 1:1 env.csv ↔ trajectory.csv contract.

186-202: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Preserve current_env_params when recovering from runner failures.

On Line 189 the fallback rebuilds self.test_run from self.original_test_run, which resets current_env_params to its default. The trajectory row written on Line 202 then records {} even though env.csv already contains the sampled params, and the cached failure is stored under (action, {}) instead of (action, sampled_env_params). Repeating the same failing trial will miss the cache and produce disagreeing artifacts.

Suggested fix
         if self.runner.test_scenario.test_runs and self.runner.test_scenario.test_runs[0].output_path.exists():
             self.test_run = self.runner.test_scenario.test_runs[0]
         else:
-            self.test_run = copy.deepcopy(self.original_test_run)
+            self.test_run = copy.deepcopy(self.original_test_run).apply_params_set(action)
             self.test_run.step = new_tr.step
             self.test_run.output_path = new_tr.output_path
+            self.test_run.current_env_params = dict(new_tr.current_env_params)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 186 - 202, When
rebuilding self.test_run in the fallback branch (inside cloudai_gym.py) you must
preserve the sampled env params so the trajectory/cache keys match the actual
env.csv; after creating the deep-copied self.test_run from
self.original_test_run and after setting step/output_path, copy the sampled
params into self.test_run.current_env_params (e.g. assign a deepcopy of
self.original_test_run.current_env_params or the sampled params present on the
runner) before calling get_observation/compute_reward and write_trajectory; this
ensures dict(self.test_run.current_env_params) in the TrajectoryEntry reflects
the real sampled parameters.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 116-149: In run(), replace the eager f-string logging calls with
%-style deferred logging so formatting happens only if the message will be
emitted; specifically update the two logging.info calls inside the run method
(the "Running step ..." and "Step {step}: Observation..." messages) to use
logging.info("Running step %s (of %s) with action %s", step, self.max_steps,
action) and logging.info("Step %s: Observation: %s, Reward: %.4f", step, ...,
reward) respectively, ensuring you format the observation list without using
f-strings so the logging framework can lazily interpolate.

In `@src/cloudai/configurator/env_params.py`:
- Around line 1-2: The file header years in
src/cloudai/configurator/env_params.py are incorrect for the touched year;
update the two top lines (the SPDX copyright line and the subsequent Copyright
line) by replacing "2024-2026" with the exact year string the repo's header test
expects for this file (e.g., "2026") so the header matches
tests/test_check_copyright_headers.py.

In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 133-165: Add a public close(self) -> None method to
GymnasiumAdapter that safely delegates to the wrapped environment's close
method: check for an attribute (e.g. getattr(self._env, "close", None)), call it
if callable, and return None; do not assume inheritance from gymnasium.Env. Add
the method near other lifecycle helpers (e.g. alongside
_sync_underlying_step_counter, _step_with_params) so callers that call
env.close() won't raise AttributeError.
- Around line 77-80: GymnasiumAdapter.__init__ currently filters action space
lists into _tunable_params (len>1) and _fixed_params (len==1) but silently drops
any keys with empty lists; modify GymnasiumAdapter.__init__ to detect any entry
in the raw_action_space where len(v) == 0 and raise a ValueError identifying the
offending key(s) so callers (e.g., TestRun.param_space) fail fast; add a unit
test that constructs a TestRun or raw_action_space containing an empty list for
a cmd_args key and asserts that GymnasiumAdapter(...) raises ValueError with a
message mentioning the empty key(s).

In `@tests/test_env_params.py`:
- Around line 86-87: The test uses a broad pytest.raises(ValueError) when
calling sink.write(0, {"drop_rate": 0.0}); narrow it by adding a match parameter
to pytest.raises that matches the specific ValueError message expected (e.g.,
the message about invalid drop_rate or index) so the test asserts the exact
error condition raised by sink.write; update the pytest.raises invocation around
sink.write to include match="expected message substring" referencing the
sink.write call and pytest.raises.
- Around line 1-2: The copyright header in the file contains the range
"2025-2026" which fails CI; update the header lines (the SPDX line and the
Copyright line that currently reads "2025-2026 NVIDIA CORPORATION & AFFILIATES")
to use a single year "2026" so they match the expected format for new 2026
files.

In `@tests/test_handlers.py`:
- Around line 287-288: The override of select_action in tests/test_handlers.py
has the wrong signature; change the function declaration for select_action in
that file to match the base class signature BaseAgent.select_action(self,
observation: list[float] | None = None) (i.e., add the observation parameter
with the same type and default) and keep the existing body (raising
AssertionError) so pyright/CI type-checking passes while preserving the test
intent.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 143-171: The early return on constraint_check leaves an env row
without a matching trajectory; to fix, when
test_run.test.constraint_check(test_run, runner.system) returns False, write a
TrajectoryEntry (use self.write_trajectory with step=self.test_run.step,
action=action, reward=self.rewards.constraint_failure, observation=[-1.0],
env_params=dict(self.test_run.current_env_params)) and call each
observer.after_step(self.test_run, observation, reward) before returning the
same quadruple (observation, reward, True, {}); this mirrors the cached-result
path and preserves the 1:1 env.csv ↔ trajectory.csv contract.
- Around line 186-202: When rebuilding self.test_run in the fallback branch
(inside cloudai_gym.py) you must preserve the sampled env params so the
trajectory/cache keys match the actual env.csv; after creating the deep-copied
self.test_run from self.original_test_run and after setting step/output_path,
copy the sampled params into self.test_run.current_env_params (e.g. assign a
deepcopy of self.original_test_run.current_env_params or the sampled params
present on the runner) before calling get_observation/compute_reward and
write_trajectory; this ensures dict(self.test_run.current_env_params) in the
TrajectoryEntry reflects the real sampled parameters.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 0097c65c-5ab7-4bd2-8683-6e1ea26cc182

📥 Commits

Reviewing files that changed from the base of the PR and between 18ce72e and d076095.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • pyproject.toml
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/__init__.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/configurator/gymnasium_adapter.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_gymnasium_adapter.py
  • tests/test_handlers.py
  • tests/test_test_scenario.py

Comment on lines +116 to +149
def run(self) -> int:
"""
Orchestrate this agent's exploration over ``self.env``.

Default: a step loop driven by the dispatcher (``select_action`` →
``env.step`` → ``update_policy`` per trial). Agents that drive their
own training loop (e.g. RLlib-based agents calling ``algo.train()``)
override this method.

Returns:
int: Process-style return code (``0`` success, non-zero failure).
``handle_dse_job`` accumulates this via ``err |= agent.run()``.
"""
observation, _ = self.env.reset()
for _ in range(self.max_steps):
result = self.select_action(observation=observation)
if result is None:
break
step, action = result
logging.info(f"Running step {step} (of {self.max_steps}) with action {action}")
prev_observation = observation
observation, reward, done, *_ = self.env.step(action)
self.update_policy(
{
"trial_index": step,
"value": reward,
"observation": observation,
"prev_observation": prev_observation,
"action": action,
"done": done,
}
)
logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
return 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 Performance & Scalability | 🔵 Trivial | 💤 Low value

Consider lazy-evaluation logging for consistency with Ruff G004.

The f-string logging on lines 135 and 148 works correctly but eagerly evaluates formatting arguments even when the log level would skip the message. Using %-style formatting (logging.info("Running step %s (of %s) with action %s", step, self.max_steps, action)) allows the logging framework to defer evaluation, improving performance in production when info-level logs may be disabled.

This is a minor style and performance optimization; the current implementation is functionally correct.

♻️ Proposed refactor to %-style logging
-            logging.info(f"Running step {step} (of {self.max_steps}) with action {action}")
+            logging.info("Running step %s (of %s) with action %s", step, self.max_steps, action)
             prev_observation = observation
             observation, reward, done, *_ = self.env.step(action)
             self.update_policy(
                 {
                     "trial_index": step,
                     "value": reward,
                     "observation": observation,
                     "prev_observation": prev_observation,
                     "action": action,
                     "done": done,
                 }
             )
-            logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
+            logging.info(
+                "Step %s: Observation: %s, Reward: %.4f",
+                step,
+                [round(obs, 4) for obs in observation],
+                reward,
+            )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def run(self) -> int:
"""
Orchestrate this agent's exploration over ``self.env``.
Default: a step loop driven by the dispatcher (``select_action``
``env.step````update_policy`` per trial). Agents that drive their
own training loop (e.g. RLlib-based agents calling ``algo.train()``)
override this method.
Returns:
int: Process-style return code (``0`` success, non-zero failure).
``handle_dse_job`` accumulates this via ``err |= agent.run()``.
"""
observation, _ = self.env.reset()
for _ in range(self.max_steps):
result = self.select_action(observation=observation)
if result is None:
break
step, action = result
logging.info(f"Running step {step} (of {self.max_steps}) with action {action}")
prev_observation = observation
observation, reward, done, *_ = self.env.step(action)
self.update_policy(
{
"trial_index": step,
"value": reward,
"observation": observation,
"prev_observation": prev_observation,
"action": action,
"done": done,
}
)
logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
return 0
def run(self) -> int:
"""
Orchestrate this agent's exploration over ``self.env``.
Default: a step loop driven by the dispatcher (``select_action``
``env.step````update_policy`` per trial). Agents that drive their
own training loop (e.g. RLlib-based agents calling ``algo.train()``)
override this method.
Returns:
int: Process-style return code (``0`` success, non-zero failure).
``handle_dse_job`` accumulates this via ``err |= agent.run()``.
"""
observation, _ = self.env.reset()
for _ in range(self.max_steps):
result = self.select_action(observation=observation)
if result is None:
break
step, action = result
logging.info("Running step %s (of %s) with action %s", step, self.max_steps, action)
prev_observation = observation
observation, reward, done, *_ = self.env.step(action)
self.update_policy(
{
"trial_index": step,
"value": reward,
"observation": observation,
"prev_observation": prev_observation,
"action": action,
"done": done,
}
)
logging.info(
"Step %s: Observation: %s, Reward: %.4f",
step,
[round(obs, 4) for obs in observation],
reward,
)
return 0
🧰 Tools
🪛 Ruff (0.15.15)

[warning] 135-135: Logging statement uses f-string

(G004)


[warning] 148-148: Logging statement uses f-string

(G004)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/base_agent.py` around lines 116 - 149, In run(),
replace the eager f-string logging calls with %-style deferred logging so
formatting happens only if the message will be emitted; specifically update the
two logging.info calls inside the run method (the "Running step ..." and "Step
{step}: Observation..." messages) to use logging.info("Running step %s (of %s)
with action %s", step, self.max_steps, action) and logging.info("Step %s:
Observation: %s, Reward: %.4f", step, ..., reward) respectively, ensuring you
format the observation list without using f-strings so the logging framework can
lazily interpolate.

Comment on lines +1 to +2
# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Fix the copyright year header.

CI is already failing here: the new file is checked as touched in 2026, but the header still says 2024-2026. Update it to the exact year string the repo’s header test expects for this file, otherwise the PR cannot merge. Based on learnings, this repository’s copyright check requires touched-year formatting to match tests/test_check_copyright_headers.py.

🧰 Tools
🪛 GitHub Actions: CI / 0_Run pytest (3.12).txt

[error] 1-1: Copyright header year check failed in tests/test_check_copyright_headers.py. AssertionError: Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / 1_Run pytest (3.10).txt

[error] 1-1: Copyright header check failed in tests/test_check_copyright_headers.py. Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / Run pytest (3.10)

[error] 1-1: pytest tests/test_check_copyright_headers.py::test_src_copyright_header failed: Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / Run pytest (3.12)

[error] 1-1: test_src_copyright_header failed (Copyright year is not valid). Expected: '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'; Found: '# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 1 - 2, The file header
years in src/cloudai/configurator/env_params.py are incorrect for the touched
year; update the two top lines (the SPDX copyright line and the subsequent
Copyright line) by replacing "2024-2026" with the exact year string the repo's
header test expects for this file (e.g., "2026") so the header matches
tests/test_check_copyright_headers.py.

Sources: Learnings, Pipeline failures

Comment on lines +77 to +80
raw_action_space = env.define_action_space()
self._tunable_params: dict[str, list] = {k: v for k, v in raw_action_space.items() if len(v) > 1}
self._fixed_params: dict[str, Any] = {k: v[0] for k, v in raw_action_space.items() if len(v) == 1}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== BaseGym contract and declarations ==="
rg -nC3 --type=py '\bclass\s+BaseGym\b|def define_action_space\s*\(' src tests

echo
echo "=== Concrete define_action_space implementations ==="
rg -nC3 --type=py 'def define_action_space\s*\(' src tests

echo
echo "=== Existing validation for empty candidate lists ==="
rg -nC3 --type=py 'ValueError|assert|len\([^)]*\)\s*==\s*0|if not [^:]+:' src tests

Repository: NVIDIA/cloudai

Length of output: 50372


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Inspect gymnasium_adapter.py around lines 60-110 ==="
nl -ba src/cloudai/configurator/gymnasium_adapter.py | sed -n '50,130p'

echo
echo "=== Search for empty candidate validation in gymnasium_adapter.py ==="
rg -n --type=py 'len\([^)]*\)\s*==\s*0|==\s*0|empty|candidate|action_space' src/cloudai/configurator/gymnasium_adapter.py

echo
echo "=== Search tests for empty candidate cases ==="
rg -n --type=py 'empty|len\([^)]*\)\s*==\s*0|ValueError|candidate' tests/test_gymnasium_adapter.py

echo
echo "=== Show entire tests/test_gymnasium_adapter.py (small file?) ==="
wc -l tests/test_gymnasium_adapter.py
nl -ba tests/test_gymnasium_adapter.py | sed -n '1,220p'

Repository: NVIDIA/cloudai

Length of output: 176


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Inspect gymnasium_adapter.py around lines 60-110 ==="
nl -ba src/cloudai/configurator/gymnasium_adapter.py | sed -n '50,130p'

echo
echo "=== Search for empty candidate validation in gymnasium_adapter.py ==="
rg -n --type=py 'len\([^)]*\)\s*==\s*0|==\s*0|empty|candidate|action_space' src/cloudai/configurator/gymnasium_adapter.py

echo
echo "=== Search tests for empty candidate cases ==="
rg -n --type=py 'empty|len\([^)]*\)\s*==\s*0|ValueError|candidate' tests/test_gymnasium_adapter.py

echo
echo "=== Show tests/test_gymnasium_adapter.py size and contents (if small) ==="
wc -l tests/test_gymnasium_adapter.py
nl -ba tests/test_gymnasium_adapter.py | sed -n '1,260p'

Repository: NVIDIA/cloudai

Length of output: 176


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Inspect gymnasium_adapter.py around lines 60-130 ==="
sed -n '60,130p' src/cloudai/configurator/gymnasium_adapter.py

echo
echo "=== Search for empty candidate validation in gymnasium_adapter.py ==="
rg -n --type=py 'len\([^)]*\)\s*==\s*0|==\s*0|empty|candidate|action_space' src/cloudai/configurator/gymnasium_adapter.py

echo
echo "=== Search tests for empty candidate cases ==="
rg -n --type=py 'empty|len\([^)]*\)\s*==\s*0|ValueError|candidate' tests/test_gymnasium_adapter.py

echo
echo "=== Test file size and contents (first 260 lines) ==="
wc -l tests/test_gymnasium_adapter.py
sed -n '1,260p' tests/test_gymnasium_adapter.py

Repository: NVIDIA/cloudai

Length of output: 12033


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Inspect BaseGym define_action_space docstring ==="
sed -n '1,120p' src/cloudai/configurator/base_gym.py

echo
echo "=== Inspect cloudai_gym define_action_space and where param_space comes from ==="
sed -n '1,200p' src/cloudai/configurator/cloudai_gym.py

echo
echo "=== Search for define_action_space usage/validation of empty lists ==="
rg -n --type=py 'define_action_space|param_space|action_space' src/cloudai/configurator

echo
echo "=== Search for explicit checks against empty candidate lists ([] or len==0) ==="
rg -n --type=py 'len\([^)]*\)\s*==\s*0|not\s+[^:]+|empty\s+|ValueError|assert' src/cloudai/configurator/gymnasium_adapter.py src/cloudai/configurator/base_gym.py src/cloudai/configurator/cloudai_gym.py

Repository: NVIDIA/cloudai

Length of output: 15364


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for 'param_space' definition/assignment ==="
rg -n --type=py '\bparam_space\b\s*=' src

echo
echo "=== Search for 'apply_params_set' and how action keys/values are validated ==="
rg -n --type=py 'def apply_params_set\s*\(' src/cloudai

echo
echo "=== Show BaseGym / adapter helper _assert_keys and surrounding logic ==="
sed -n '110,210p' src/cloudai/configurator/gymnasium_adapter.py

echo
echo "=== Search for empties in param_space values (len(...)==0) across src ==="
rg -n --type=py 'len\([^)]*\)\s*==\s*0|==\s*\[\s*\]|\[\s*\]\s*$' src/cloudai

Repository: NVIDIA/cloudai

Length of output: 113


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Locate all param_space references ==="
rg -n --type=py '\bparam_space\b' src

echo
echo "=== Search for 'explorable' / action space construction utilities ==="
rg -n --type=py '(explorable|action_space|param_space|candidate|candidates|param|grid_search)' src/cloudai/configurator

echo
echo "=== Search for where TestRun params set are applied/validated ==="
rg -n --type=py 'apply_params_set|current_env_params|increment_step' src/cloudai

echo
echo "=== Search for any explicit validation of action values (empty list, len==0) ==="
rg -n --type=py 'len\([^)]*\)\s*==\s*0|ValueError\([^)]*empty|empty candidate|candidate list' src/cloudai

Repository: NVIDIA/cloudai

Length of output: 12650


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Inspect src/cloudai/_core/test_scenario.py around param_space property ==="
sed -n '130,230p' src/cloudai/_core/test_scenario.py

echo
echo "=== Inspect apply_params_set in src/cloudai/_core/test_scenario.py ==="
sed -n '230,340p' src/cloudai/_core/test_scenario.py

echo
echo "=== Inspect env_params categorical spec validation for candidate lists ==="
sed -n '1,220p' src/cloudai/configurator/env_params.py

Repository: NVIDIA/cloudai

Length of output: 11074


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Locate TestDefinition / cmd_args model types ==="
rg -n --type=py '\bclass\s+TestDefinition\b|cmd_args.*BaseModel|extra_env_vars|is_dse_excluded_arg|is_dse_job' src/cloudai/_core src/cloudai

echo
echo "=== Locate implementation of is_dse_excluded_arg ==="
rg -n --type=py 'def\s+is_dse_excluded_arg\s*\(' src/cloudai

echo
echo "=== Inspect likely DSE candidate-list validations (min_length, non-empty, ValueError) ==="
rg -n --type=py 'min_length\s*=\s*1|min_length\s*=\s*2|non-empty|nonempty|empty|len\(.*\)\s*==\s*0|min_items|min_length|validate.*list|model_validator.*list' src/cloudai/_core src/cloudai/configurator src/cloudai

Repository: NVIDIA/cloudai

Length of output: 13895


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Inspect src/cloudai/models/workload.py around TestDefinition and DSE helpers ==="
sed -n '1,260p' src/cloudai/models/workload.py

echo
echo "=== Inspect remaining part of workload.py for cmd_args model validators ==="
sed -n '260,520p' src/cloudai/models/workload.py

echo
echo "=== Search within workload.py for list validators on cmd_args / action params ==="
rg -n --type=py 'cmd_args|model_validator|validator|Field\(|min_length|min_items|empty|ValueError' src/cloudai/models/workload.py

Repository: NVIDIA/cloudai

Length of output: 9136


Add a fail-fast guard for empty candidate lists
TestRun.param_space includes cmd_args entries whenever the value is a list (even []), but GymnasiumAdapter.__init__ only treats len(v) > 1 as tunable and len(v) == 1 as fixed—so len(v) == 0 keys are dropped from both _tunable_params and _fixed_params, silently changing the action schema and leaving the underlying cmd_args unset. Add a ValueError guard for len(v) == 0 in GymnasiumAdapter and a unit test for the empty-list case.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/gymnasium_adapter.py` around lines 77 - 80,
GymnasiumAdapter.__init__ currently filters action space lists into
_tunable_params (len>1) and _fixed_params (len==1) but silently drops any keys
with empty lists; modify GymnasiumAdapter.__init__ to detect any entry in the
raw_action_space where len(v) == 0 and raise a ValueError identifying the
offending key(s) so callers (e.g., TestRun.param_space) fail fast; add a unit
test that constructs a TestRun or raw_action_space containing an empty list for
a cmd_args key and asserts that GymnasiumAdapter(...) raises ValueError with a
message mentioning the empty key(s).

Comment on lines +133 to +165
def render(self) -> None:
self._env.render()

@staticmethod
def _assert_keys(received: Any, expected: set[str], ctx: str) -> None:
received_set = set(received)
if received_set == expected:
return
missing = sorted(expected - received_set)
extra = sorted(received_set - expected)
raise ValueError(f"{ctx} keys mismatch; missing={missing}, extra={extra}")

def _step_with_params(self, params: dict[str, Any]) -> tuple[Any, float, bool, bool, dict[str, Any]]:
self._sync_underlying_step_counter()
obs, reward, done, info = self._env.step(params)
self._step_count += 1
return self._as_obs_array(obs), float(reward), bool(done), False, info

def _sync_underlying_step_counter(self) -> None:
"""
Mirror ``handle_dse_job``'s 1-based ``test_run.step`` so artifact paths match.

The first step is written under ``…/<iteration>/1/``, matching how
``handle_dse_job`` numbers steps; this keeps reports and trajectory
analysis consistent regardless of whether the env is driven by the
DSE loop or by an external training loop wrapping the adapter.
"""
test_run = getattr(self._env, "test_run", None)
if test_run is not None:
test_run.step = self._step_count + 1

def _as_obs_array(self, obs: Any) -> Any:
return self._np.asarray(obs, dtype=self._np.float32)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="src/cloudai/configurator/gymnasium_adapter.py"

echo "== File head =="
sed -n '1,220p' "$FILE"

echo "== Search for close() in this file =="
rg -n "def close\\(" "$FILE" || true

echo "== Search for close usage in repo =="
rg -n "\\.close\\(" src || true

echo "== Search for gymnasium_adapter usage =="
rg -n "gymnasium_adapter" -S . || true

Repository: NVIDIA/cloudai

Length of output: 7178


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "close" tests/test_gymnasium_adapter.py src/cloudai/configurator/gymnasium_adapter.py || true

sed -n '1,260p' tests/test_gymnasium_adapter.py

Repository: NVIDIA/cloudai

Length of output: 7906


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect BaseGym for close()
fd -a "base_gym.py" src/cloudai/configurator -t f -x 'echo {}' || true
BASE_GYM_FILE="$(fd -a "base_gym.py" src/cloudai/configurator -t f | head -n 1)"
echo "Using: $BASE_GYM_FILE"
sed -n '1,240p' "$BASE_GYM_FILE"

echo "== Search for close() usage on GymnasiumAdapter instances =="
rg -n "GymnasiumAdapter\\(" -S src tests || true
rg -n "gymnasium_adapter\\.GymnasiumAdapter|configurator\\.GymnasiumAdapter|from cloudai\\.configurator import GymnasiumAdapter" -S src tests || true
rg -n "\\.close\\(\\)" -S src tests || true

echo "== Search for __getattr__ / delegation patterns in gymnasium_adapter.py =="
rg -n "__getattr__|__getattribute__|wrap|delegate" src/cloudai/configurator/gymnasium_adapter.py || true

Repository: NVIDIA/cloudai

Length of output: 4640


🌐 Web query:

Gymnasium Env close method API optional wrapper close

💡 Result:

In the Gymnasium API, the close method is used to release resources (such as rendering windows, open files, or database connections) used by an environment [1][2]. Implementation of the close method is optional in custom environments [2]. If an environment does not require any specific resource cleanup, you do not need to implement this method [2]. When using Gymnasium wrappers: 1. The base gymnasium.Wrapper class provides a default close method that calls close on the underlying environment attribute (self.env.close) [3]. 2. If you are creating a custom wrapper that adds its own resource-dependent logic, you can override the close method [4][5]. In such cases, it is standard practice to ensure the underlying environment's close method is still called, typically by including super.close in your override to maintain the chain of closing across wrapped environments [3]. You should not interact with an environment after calling its close method [2]. Additionally, the Gymnasium environment checker utility may test the close method to ensure it functions correctly and can be called multiple times without raising an error [6].

Citations:


🌐 Web query:

Gymnasium Env class has close() default implementation no-op core.py

💡 Result:

In the Gymnasium library, the Env class does not define a default, no-op implementation of the close method in its core.py file. Rather, close is defined as an abstract method or a required interface that custom environments are expected to implement if they need to release external resources [1][2][3]. While Gymnasium's documentation acknowledges that in many cases a custom environment may not need to implement close, the base Env class does not provide a functional default [4][5]. Users are encouraged to override this method to perform necessary cleanup, such as closing windows (e.g., Pygame) or releasing file handles [4][5]. Because there is no default implementation in the base Env class, calling close on an environment that has not explicitly defined it—or failing to implement it when resources are involved—can lead to issues where resources remain open [4][6]. Consequently, users must explicitly manage environment cleanup to ensure proper resource management [6].

Citations:


Add close() to the Gymnasium-shaped adapter API

GymnasiumAdapter is a standalone “Env-shaped” object (it doesn’t inherit from gymnasium.Env/Wrapper) and the class provides no close() method, so teardown code that calls env.close() can fail with AttributeError. Delegate to the wrapped BaseGym.close when available.

♻️ Proposed fix
     def render(self) -> None:
         self._env.render()
 
+    def close(self) -> None:
+        close = getattr(self._env, "close", None)
+        if callable(close):
+            close()
+
     `@staticmethod`
     def _assert_keys(received: Any, expected: set[str], ctx: str) -> None:
         received_set = set(received)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def render(self) -> None:
self._env.render()
@staticmethod
def _assert_keys(received: Any, expected: set[str], ctx: str) -> None:
received_set = set(received)
if received_set == expected:
return
missing = sorted(expected - received_set)
extra = sorted(received_set - expected)
raise ValueError(f"{ctx} keys mismatch; missing={missing}, extra={extra}")
def _step_with_params(self, params: dict[str, Any]) -> tuple[Any, float, bool, bool, dict[str, Any]]:
self._sync_underlying_step_counter()
obs, reward, done, info = self._env.step(params)
self._step_count += 1
return self._as_obs_array(obs), float(reward), bool(done), False, info
def _sync_underlying_step_counter(self) -> None:
"""
Mirror ``handle_dse_job``'s 1-based ``test_run.step`` so artifact paths match.
The first step is written under ``/<iteration>/1/``, matching how
``handle_dse_job`` numbers steps; this keeps reports and trajectory
analysis consistent regardless of whether the env is driven by the
DSE loop or by an external training loop wrapping the adapter.
"""
test_run = getattr(self._env, "test_run", None)
if test_run is not None:
test_run.step = self._step_count + 1
def _as_obs_array(self, obs: Any) -> Any:
return self._np.asarray(obs, dtype=self._np.float32)
def render(self) -> None:
self._env.render()
def close(self) -> None:
close = getattr(self._env, "close", None)
if callable(close):
close()
`@staticmethod`
def _assert_keys(received: Any, expected: set[str], ctx: str) -> None:
received_set = set(received)
if received_set == expected:
return
missing = sorted(expected - received_set)
extra = sorted(received_set - expected)
raise ValueError(f"{ctx} keys mismatch; missing={missing}, extra={extra}")
def _step_with_params(self, params: dict[str, Any]) -> tuple[Any, float, bool, bool, dict[str, Any]]:
self._sync_underlying_step_counter()
obs, reward, done, info = self._env.step(params)
self._step_count += 1
return self._as_obs_array(obs), float(reward), bool(done), False, info
def _sync_underlying_step_counter(self) -> None:
"""
Mirror ``handle_dse_job``'s 1-based ``test_run.step`` so artifact paths match.
The first step is written under ``/<iteration>/1/``, matching how
``handle_dse_job`` numbers steps; this keeps reports and trajectory
analysis consistent regardless of whether the env is driven by the
DSE loop or by an external training loop wrapping the adapter.
"""
test_run = getattr(self._env, "test_run", None)
if test_run is not None:
test_run.step = self._step_count + 1
def _as_obs_array(self, obs: Any) -> Any:
return self._np.asarray(obs, dtype=self._np.float32)
🧰 Tools
🪛 Ruff (0.15.15)

[warning] 137-137: Dynamically typed expressions (typing.Any) are disallowed in received

(ANN401)


[warning] 143-143: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 164-164: Dynamically typed expressions (typing.Any) are disallowed in obs

(ANN401)


[warning] 164-164: Dynamically typed expressions (typing.Any) are disallowed in _as_obs_array

(ANN401)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/gymnasium_adapter.py` around lines 133 - 165, Add a
public close(self) -> None method to GymnasiumAdapter that safely delegates to
the wrapped environment's close method: check for an attribute (e.g.
getattr(self._env, "close", None)), call it if callable, and return None; do not
assume inheritance from gymnasium.Env. Add the method near other lifecycle
helpers (e.g. alongside _sync_underlying_step_counter, _step_with_params) so
callers that call env.close() won't raise AttributeError.

Comment thread tests/test_env_params.py
Comment on lines +1 to +2
# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🔴 Critical | ⚡ Quick win

Fix copyright header year to match CI expectations.

The copyright header shows 2025-2026 but CI expects 2026 for new files created in 2026. This is breaking the copyright header check in CI.

📅 Proposed fix
 # SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
-# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0

Based on learnings: In the NVIDIA/cloudai repo, copyright headers' year formatting must match the test logic enforced by tests/test_check_copyright_headers.py. New files should list only the year they were created.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-FileCopyrightText: NVIDIA CORPORATION & AFFILIATES
# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
🧰 Tools
🪛 GitHub Actions: CI / 0_Run pytest (3.12).txt

[error] 1-1: Copyright header year check failed in tests/test_check_copyright_headers.py. AssertionError: Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / 1_Run pytest (3.10).txt

[error] 1-1: Copyright header check failed in tests/test_check_copyright_headers.py. Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / Run pytest (3.10)

[error] 1-1: pytest tests/test_check_copyright_headers.py::test_src_copyright_header failed: Copyright year is not valid. Expected '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.' but found '# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🪛 GitHub Actions: CI / Run pytest (3.12)

[error] 1-1: test_src_copyright_header failed (Copyright year is not valid). Expected: '# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'; Found: '# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.'

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_env_params.py` around lines 1 - 2, The copyright header in the
file contains the range "2025-2026" which fails CI; update the header lines (the
SPDX line and the Copyright line that currently reads "2025-2026 NVIDIA
CORPORATION & AFFILIATES") to use a single year "2026" so they match the
expected format for new 2026 files.

Sources: Learnings, Pipeline failures

Comment thread tests/test_env_params.py
Comment on lines +86 to +87
with pytest.raises(ValueError):
sink.write(0, {"drop_rate": 0.0})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add match parameter to narrow the pytest.raises scope.

The bare pytest.raises(ValueError) is broad. Adding a match parameter would verify the specific error condition being tested.

🔍 Proposed fix
-    with pytest.raises(ValueError):
-        sink.write(0, {"drop_rate": 0.0})
+    with pytest.raises(ValueError, match=r"step.*must.*positive|step.*0"):
+        sink.write(0, {"drop_rate": 0.0})
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
with pytest.raises(ValueError):
sink.write(0, {"drop_rate": 0.0})
with pytest.raises(ValueError, match=r"step.*must.*positive|step.*0"):
sink.write(0, {"drop_rate": 0.0})
🧰 Tools
🪛 Ruff (0.15.15)

[warning] 86-86: pytest.raises(ValueError) is too broad, set the match parameter or use a more specific exception

(PT011)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_env_params.py` around lines 86 - 87, The test uses a broad
pytest.raises(ValueError) when calling sink.write(0, {"drop_rate": 0.0}); narrow
it by adding a match parameter to pytest.raises that matches the specific
ValueError message expected (e.g., the message about invalid drop_rate or index)
so the test asserts the exact error condition raised by sink.write; update the
pytest.raises invocation around sink.write to include match="expected message
substring" referencing the sink.write call and pytest.raises.

Source: Linters/SAST tools

Comment thread tests/test_handlers.py
Comment on lines +287 to +288
def select_action(self) -> tuple[int, dict[str, Any]]: # pragma: no cover - never called
raise AssertionError("select_action must not be called when run() is overridden")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🔴 Critical | ⚡ Quick win

Fix select_action() signature to match base class.

The override is missing the observation parameter that exists in BaseAgent.select_action(self, observation: list[float] | None = None). This breaks pyright type checking and CI.

🔧 Proposed fix
-    def select_action(self) -> tuple[int, dict[str, Any]]:  # pragma: no cover - never called
+    def select_action(self, observation: list[float] | None = None) -> tuple[int, dict[str, Any]]:  # pragma: no cover
         raise AssertionError("select_action must not be called when run() is overridden")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def select_action(self) -> tuple[int, dict[str, Any]]: # pragma: no cover - never called
raise AssertionError("select_action must not be called when run() is overridden")
def select_action(self, observation: list[float] | None = None) -> tuple[int, dict[str, Any]]: # pragma: no cover
raise AssertionError("select_action must not be called when run() is overridden")
🧰 Tools
🪛 GitHub Actions: CI / 2_Linting.txt

[error] 287-287: pyright failed (hook id: pyright). Method "select_action" overrides class "BaseAgent" incompatibly: positional parameter count mismatch (base has 2, override has 1). Parameter 2 mismatch: base "observation" is keyword parameter, override parameter is position-only (reportIncompatibleMethodOverride).

🪛 GitHub Actions: CI / Linting

[error] 287-287: pyright (hook id: pyright) failed: Method "select_action" overrides class "BaseAgent" in an incompatible manner (reportIncompatibleMethodOverride). Positional parameter count mismatch (base has 2, override has 1). Parameter 2 mismatch: base parameter "observation" is keyword parameter, override parameter is position-only.

🪛 Ruff (0.15.15)

[warning] 288-288: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_handlers.py` around lines 287 - 288, The override of select_action
in tests/test_handlers.py has the wrong signature; change the function
declaration for select_action in that file to match the base class signature
BaseAgent.select_action(self, observation: list[float] | None = None) (i.e., add
the observation parameter with the same type and default) and keep the existing
body (raising AssertionError) so pyright/CI type-checking passes while
preserving the test intent.

Source: Pipeline failures

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant