Mdc/add protpardelle #274

Open
marcuscollins wants to merge 7 commits into main from mdc/add-protpardelle

Conversation

@marcuscollins

@marcuscollins marcuscollins commented Jun 29, 2026

Collaborator

This is a working version of Protpardelle-1c in SampleWorks. It may still require parameter tuning and other updates; in particular, it isn't clear how to use self-conditioning. However, we can generate structures with this version, so I'm making this PR as a milestone.

Only a couple of significant changes have been made outside the Protpardelle wrapper code itself. One is a features class that is unfrozen, so that we can pass self-conditioning input forward to the next Euler step during sampling. The other is that we define different sampling parameters for Protpardelle when instantiating the sampler. There is no CLI control for either change in this PR.
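
A minimal sketch of the idea behind an unfrozen features class: most conditioning fields are locked once construction finishes, while the self-conditioning slot stays writable so a sampler can update it between steps. This is illustrative only. The class name `ConditioningSketch`, the `_FROZEN_FIELDS` set, the `_initialized` flag, and the tuple-typed fields are assumptions, not the wrapper's actual code.

```python
from dataclasses import dataclass, field
from typing import Any, ClassVar


@dataclass
class ConditioningSketch:
    """Hypothetical stand-in for a conditioning container (not the real class)."""

    # Fields that must not change once the object has been constructed.
    _FROZEN_FIELDS: ClassVar[frozenset] = frozenset({"aatype", "seq_mask"})

    aatype: tuple
    seq_mask: tuple
    # Deliberately mutable: a sampler may overwrite this after every step.
    x_self_conditioning: Any = None
    _initialized: bool = field(default=False, init=False, repr=False)

    def __post_init__(self) -> None:
        # Freezing starts only after the generated __init__ has set every field.
        self._initialized = True

    def __setattr__(self, name: str, value: Any) -> None:
        if getattr(self, "_initialized", False) and name in self._FROZEN_FIELDS:
            raise AttributeError(f"{name!r} is frozen after construction")
        super().__setattr__(name, value)
```

Deferring the freeze to `__post_init__` is what lets the dataclass-generated `__init__` assign every field normally; a fully `frozen=True` dataclass would reject the later `x_self_conditioning` update as well.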

Summary by CodeRabbit

  • New Features

    • Added support for a new Protpardelle-based structure guidance workflow, including CLI options and model selection.
    • Introduced a new sample configuration for a cc89-based setup and expanded supported environment/package configuration.
  • Bug Fixes

    • Improved guidance logging and model availability checks for clearer runtime feedback.
    • Updated ignore rules so the sampleworks data files remain included while other data folders stay excluded.

marcuscollins and others added 7 commits June 29, 2026 09:34
Wrap sequence-conditioned all-atom Protpardelle-1c models (task
"ai-allatom", e.g. cc89.yaml) under the StructureModelWrapper protocol.
featurize() derives sequence conditioning (aatype/seq_mask/residue_index/
chain_index/atom37 mask) from an Atomworks structure's chain_info, and
step() runs the full reverse-diffusion trajectory internally via
Protpardelle.sample(gt_aatype=...), returning [batch, atoms, 3] coords.

- Register PROTPARDELLE_AVAILABLE / require_protpardelle /
  check_protpardelle_available (catching OSError from protpardelle.env).
- Add StructurePredictor.PROTPARDELLE.
- Tests build a small random ai-allatom model (no weights needed);
  cover sequence extraction, featurization shapes, prior init, protocol
  conformance, and an end-to-end short-trajectory sampling smoke test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
## Summary

Completes the Protpardelle debugging work from ENG-75 / #257 on top of
`mdc/add-protpardelle`.

This PR intentionally targets `mdc/add-protpardelle` so the review
contains only the final Protpardelle fixes, not the whole in-progress
integration branch.

Key changes:

- Restores the EDM sampler/model-wrapper call shape to the
protocol-compatible `step(noisy_state, t_hat, features=features)` after
removing the temporary extra `eps` argument.
- Implements Protpardelle step-time noise handling via
`_expand_noise_level()`, broadcasting scalar or per-batch EDM timesteps
to Protpardelle's expected `B x L` tensor.
- Moves Protpardelle step inputs/conditioning tensors onto the wrapper
device before the model forward pass.
- Keeps `prot_lens_per_chain` on CPU when calling Protpardelle's
sampling helper to avoid CPU/GPU device mismatch in helper-created
residue indices.
- Maps Atomworks selenomethionine atom name `SE` into Protpardelle
atom37's methionine sulfur slot `SD`.
- Fixes `ProtpardelleConditioning` immutability so dataclass
construction can complete before selected conditioning fields become
frozen.
- Allows packaged Protpardelle config/data under
`src/sampleworks/data/**` through `.actlignore` so ACTL sync includes
the runtime YAML config.
- Re-enables the slow `step()` behavior test and adds coverage for the
MSE selenium atom mapping.
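
The noise-level broadcasting described for `_expand_noise_level()` can be illustrated with a shape-only sketch. Plain nested lists stand in for tensors so it runs without PyTorch; the function name `expand_noise_level_sketch`, its arguments, and the accepted input forms are assumptions made for illustration, not the wrapper's implementation.

```python
from __future__ import annotations

from collections.abc import Sequence


def expand_noise_level_sketch(
    t_hat: float | Sequence[float], batch_size: int, seq_len: int
) -> list[list[float]]:
    """Broadcast a scalar or per-batch noise level to a B x L nested list."""
    if isinstance(t_hat, (int, float)):
        # Scalar timestep: every batch element shares the same noise level.
        per_batch = [float(t_hat)] * batch_size
    else:
        # Per-batch timesteps: one noise level per batch element.
        per_batch = [float(t) for t in t_hat]
        if len(per_batch) != batch_size:
            raise ValueError(
                f"expected {batch_size} noise levels, got {len(per_batch)}"
            )
    # Repeat each batch element's noise level across all L residues.
    return [[t] * seq_len for t in per_batch]
```

With tensors, the same result would typically come from reshaping to `B x 1` and expanding along the residue dimension rather than building lists.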

## Validation

Local static checks:

- `uvx ty check src/sampleworks/models/protpardelle/wrapper.py
src/sampleworks/core/samplers/edm.py
tests/models/protpardelle/test_protpardelle_wrapper.py`
- `uvx ruff check src/sampleworks/models/protpardelle/wrapper.py
src/sampleworks/core/samplers/edm.py
tests/models/protpardelle/test_protpardelle_wrapper.py`

Remote ACTL / Protpardelle environment checks:

- `pixi run -e protpardelle-dev pytest tests/models/protpardelle -m "not
slow"`
  - `27 passed, 1 deselected`
- `pixi run -e protpardelle-dev pytest
tests/models/protpardelle/test_protpardelle_wrapper.py::TestStep::test_step_returns_coords`
  - `1 passed`
- Exact reduced Protpardelle guidance smoke command from #257 using the
shared checkpoint and 1VME inputs
  - completed successfully with `Guidance run successfully!`
  - wrote results to `output/protpardelle-smoke-guided/`

Non-blocking warnings observed during the smoke run were expected
environment warnings for unavailable optional model/tool paths and
missing mirror environment variables; they did not prevent the
Protpardelle run from completing.

Refs #257.

Co-authored-by: xraymemory <me.anzuoni@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Contributor

📝 Walkthrough


Adds a complete ProtpardelleWrapper class (762 lines) implementing featurize/step/initialize_from_prior for the Protpardelle-1c all-atom model. Wires it into the guidance CLI via a new StructurePredictor.PROTPARDELLE enum value, argument helper, and _run_guidance branch. Adds availability guards, a bundled cc89_epoch415.yaml config, pytest fixtures, and comprehensive wrapper tests.

Changes

Protpardelle wrapper and guidance integration

| Layer / File(s) | Summary |
| --- | --- |
| Bundled config and availability infrastructure: `src/sampleworks/data/cc89_epoch415.yaml`, `src/sampleworks/utils/imports.py` | Adds `cc89_epoch415.yaml` with data/diffusion/model/train blocks. Adds `PROTPARDELLE_AVAILABLE` flag, guarded import suppressing `OSError`, `require_protpardelle` decorator, `check_protpardelle_available`, and extends `require_any_model`/`check_any_model_available` to include Protpardelle. |
| ProtpardelleConditioning, ProtpardelleConfig, and helpers: `src/sampleworks/models/protpardelle/__init__.py`, `src/sampleworks/models/protpardelle/wrapper.py` | Defines atom37 constants, SE→SD alias, loads default YAML sub-configs. Adds `ProtpardelleConditioning` dataclass with frozen-field logic, `ProtpardelleConfig` dataclass, `annotate_structure_for_protpardelle`, and `extract_protein_sequences`. |
| ProtpardelleWrapper core (featurize, step, coordinate conversion): `src/sampleworks/models/protpardelle/wrapper.py` | Implements `ProtpardelleWrapper.__init__`/`featurize`, `_atom37_indices_from_atom_array`, `_convert_to_atom37` (indexed scatter/gather), `_build_sampling_kwargs`, `_expand_noise_level`, `step`, and `initialize_from_prior`. |
| Guidance CLI wiring: `src/sampleworks/utils/guidance_constants.py`, `src/sampleworks/utils/guidance_script_arguments.py`, `src/sampleworks/utils/guidance_script_utils.py`, `src/sampleworks/cli/guidance.py` | Adds `PROTPARDELLE` enum member, `add_protpardelle_specific_args` CLI helper, optional `ProtpardelleWrapper` import in `guidance_script_utils`, `get_model_and_device` Protpardelle branch, `_run_guidance` annotation and EDM sampler kwargs population, `EDMSamplerConfig` forwarding, and a `logger.info` on parsed config. |
| Tests and fixtures: `tests/models/protpardelle/conftest.py`, `tests/models/protpardelle/test_protpardelle_wrapper.py` | Adds session-scoped fixtures building a small randomly initialized Protpardelle model and wrapper. Tests cover `extract_protein_sequences`, `annotate_structure_for_protpardelle`, `featurize` (shapes, aatype, SE→SD, multichain), `_convert_to_atom37` (placement, gradient flow), `initialize_from_prior`, and `step` including a slow integration trajectory. |

Supporting config and infra fixes

| Layer / File(s) | Summary |
| --- | --- |
| pyproject.toml, .actlignore, EDM comment: `pyproject.toml`, `.actlignore`, `src/sampleworks/core/samplers/edm.py` | Adds protpardelle/protpardelle-dev pixi environments, dependency overrides pinning `pandas==2.3.1` and `gemmi==0.6.7`, extends boltz-osx pypi-deps, narrows `.actlignore` to re-include `src/sampleworks/data/`, and removes a `ty: ignore` comment in `AF3EDMSampler.step`. |
| test_runner.py GPU count hardening: `tests/runs/test_runner.py` | Multiple tests now pass `jobs.*.gpu_count=1` when loading rf3_partial/protenix_dual presets; RF3 argv assertion switched to substring check for the script path. |

Sequence Diagram

sequenceDiagram
  participant CLI as guidance CLI
  participant Utils as guidance_script_utils
  participant Wrapper as ProtpardelleWrapper
  participant Model as Protpardelle model

  CLI->>Utils: get_model_and_device(PROTPARDELLE, checkpoint)
  Utils->>Wrapper: ProtpardelleWrapper(checkpoint, config_path, device)
  Wrapper-->>Utils: wrapper instance
  CLI->>Utils: _run_guidance(structure, wrapper)
  Utils->>Wrapper: featurize(structure)
  Wrapper-->>Utils: GenerativeModelInput[ProtpardelleConditioning]
  loop EDM sampling steps
    Utils->>Wrapper: step(x_t, t, features)
    Wrapper->>Wrapper: _convert_to_atom37(x_t)
    Wrapper->>Model: forward(atom37_coords, noise_level, seq_conditioning)
    Model-->>Wrapper: predicted x0 atom37
    Wrapper-->>Utils: x0 flat coords (batch x atoms x 3)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • diff-use/sampleworks#73: Overlaps with _run_guidance EDM sampler kwargs and model-wrapper orchestration changes in guidance_script_utils.py.
  • diff-use/sampleworks#207: Directly related to the guidance.py CLI entrypoint where this PR adds a logger.info after GuidanceConfig.from_cli.
  • diff-use/sampleworks#255: Both PRs modify pyproject.toml Pixi boltz-osx feature dependencies.

Suggested reviewers

  • k-chrispens

🐇 A rabbit in a lab coat writes:

Atom37 slots, all filled with care,
SE maps to SD — a selenium snare!
Conditioning tensors expand and grow,
Through EDM's churn the proteins flow.
Hop hop — the wrapper's here to stay! 🧬

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 51.39% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title is concise and clearly points to the main change: adding Protpardelle support. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 8

🧹 Nitpick comments (1)
tests/models/protpardelle/test_protpardelle_wrapper.py (1)

147-160: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Keep these assertions at the public wrapper boundary.

These tests reach into _build_sampling_kwargs() and _convert_to_atom37() directly, which makes the suite brittle to internal refactors that preserve featurize(), initialize_from_prior(), and step() behavior. Either move the coverage to public API outcomes or promote the helpers if they are intended to be part of the contract. As per coding guidelines, tests/**/*.py: "Write black-box tests that verify behavior, not implementation."

Also applies to: 287-320

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/models/protpardelle/test_protpardelle_wrapper.py` around lines 147 -
160, The tests are asserting private helper behavior via
ProtpardelleWrapper._build_sampling_kwargs() and _convert_to_atom37(), which
makes them brittle to internal refactors. Move this coverage to black-box
assertions around the public wrapper methods featurize(),
initialize_from_prior(), and step(), or explicitly promote those helpers if they
are meant to be part of the contract. Keep the checks focused on observable
outputs and behavior at the wrapper boundary rather than internal implementation
details.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pyproject.toml`:
- Around line 139-141: The pyproject.toml currently defines the
dependency-overrides table twice under tool.pixi.pypi-options, which breaks TOML
parsing. Consolidate the duplicate [tool.pixi.pypi-options.dependency-overrides]
blocks into a single table, and keep the pandas and gemmi pins together there
only if they are intended to apply globally.

In `@src/sampleworks/models/protpardelle/wrapper.py`:
- Around line 707-709: The self-conditioning cached in
features.conditioning.x_self_conditioning is being stored with its autograd
history intact, which can carry the prior graph into later sampling steps.
Update the assignment in step() to cache a detached version of x_self_cond
before writing it to features.conditioning.x_self_conditioning, so repeated
iterations do not retain gradients across steps.

In `@src/sampleworks/utils/guidance_script_utils.py`:
- Around line 223-230: In the Protpardelle branch of guidance_script_utils, the
config_path is still built from a hardcoded src/sampleworks/data path, which
breaks when the package is installed and run from outside the repo. Update the
ProtpardelleWrapper config_path handling to load cc89_epoch415.yaml as a bundled
package resource using the package’s resource-loading APIs instead of resolving
a filesystem path relative to the current working directory. Keep the change
localized to the model-loading logic around StructurePredictor.PROTPARDELLE and
ProtpardelleWrapper.
- Around line 63-67: The Protpardelle fallback in the import block for
ProtpardelleWrapper is too narrow and can still crash non-Protpardelle runs.
Update the try/except around the ProtpardelleWrapper import to catch the
additional failure modes noted in imports.py, not just ImportError, and keep the
existing fallback assignment/logging. Also make the bundled YAML path in the
guidance script resolution independent of the current checkout by resolving it
from the package/module location instead of using a repo-relative path.

In `@src/sampleworks/utils/imports.py`:
- Around line 265-270: The install hint in require_any_model() is stale because
the availability check now includes PROTPARDELLE_AVAILABLE, but the
default_message still only mentions Boltz, Protenix, and RF3. Update the
default_message text in src/sampleworks/utils/imports.py to include Protpardelle
alongside the other supported model options so the decorator’s remediation
guidance matches the current logic.
- Around line 187-230: The bare decorator usage of require_protpardelle is
broken because the function object is being passed into the message parameter,
so the wrapper is not applied. Update require_protpardelle in imports.py to
support both `@require_protpardelle` and `@require_protpardelle("...")` by detecting
when the first argument is a callable vs a custom message, or remove the bare
form from the examples/docstring so only the parenthesized usage is advertised.

In `@tests/models/protpardelle/conftest.py`:
- Around line 17-21: The `os.environ.setdefault(...)` usage is eagerly creating
a temp directory via `tempfile.mkdtemp(...)` even when
`PROTPARDELLE_MODEL_PARAMS` is already set, so update the setup in `conftest.py`
to only call `mkdtemp` when the env var is missing. Apply the same lazy pattern
in `test_protpardelle_wrapper.py` where the same `setdefault`/`mkdtemp` usage
appears, using the existing `PROTPARDELLE_MODEL_PARAMS` guard to avoid
allocating untracked temp dirs on import.

In `@tests/runs/test_runner.py`:
- Around line 29-30: The argv assertion in the test is not hermetic because
runner._build_argv() can switch to a direct Python executable when a baked pixi
env is detected. Update the test setup for the affected assertion in
test_runner.py to force the pixi code path before checking the ["pixi", "run",
"-e", "rf3", "python"] prefix, using the existing test fixture/monkeypatch
around runner._build_argv() so the environment detection cannot change the
expected argv.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8a25473f-58ce-43f8-b062-7b353055864f

📥 Commits

Reviewing files that changed from the base of the PR and between 4826154 and 6713208.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (15)
  • .actlignore
  • pyproject.toml
  • src/sampleworks/cli/guidance.py
  • src/sampleworks/core/samplers/edm.py
  • src/sampleworks/data/cc89_epoch415.yaml
  • src/sampleworks/models/protpardelle/__init__.py
  • src/sampleworks/models/protpardelle/wrapper.py
  • src/sampleworks/utils/guidance_constants.py
  • src/sampleworks/utils/guidance_script_arguments.py
  • src/sampleworks/utils/guidance_script_utils.py
  • src/sampleworks/utils/imports.py
  • tests/models/protpardelle/__init__.py
  • tests/models/protpardelle/conftest.py
  • tests/models/protpardelle/test_protpardelle_wrapper.py
  • tests/runs/test_runner.py

Comment thread pyproject.toml
Comment on lines +139 to +141
[tool.pixi.pypi-options.dependency-overrides]
pandas = "==2.3.1"
gemmi = "==0.6.7"

Contributor

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
python - <<'PY'
import pathlib, tomllib
data = tomllib.loads(pathlib.Path("pyproject.toml").read_text())
pixi = data["tool"]["pixi"]

print("Top-level dependency overrides:")
print(pixi.get("pypi-options", {}).get("dependency-overrides", {}))

for feature_name, feature_cfg in pixi.get("feature", {}).items():
    deps = feature_cfg.get("dependencies", {})
    pypi_deps = feature_cfg.get("pypi-dependencies", {})
    overrides = feature_cfg.get("pypi-options", {}).get("dependency-overrides", {})
    if "gemmi" in deps or "gemmi" in pypi_deps or overrides:
        print(f"\nfeature={feature_name}")
        if "gemmi" in deps:
            print("  dependencies.gemmi =", deps["gemmi"])
        if "gemmi" in pypi_deps:
            print("  pypi-dependencies.gemmi =", pypi_deps["gemmi"])
        if overrides:
            print("  dependency-overrides =", overrides)
PY

echo
echo "Other gemmi references in the repo:"
rg -n '\bgemmi\b' .

Repository: diff-use/sampleworks

Length of output: 670


🏁 Script executed:

#!/bin/bash
set -euo pipefail
nl -ba pyproject.toml | sed -n '55,170p'

Repository: diff-use/sampleworks

Length of output: 198


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Relevant pyproject.toml slice:"
sed -n '55,170p' pyproject.toml

echo
echo "Occurrences of dependency-overrides headers:"
grep -n '^\[tool\.pixi\.pypi-options\.dependency-overrides\]$' pyproject.toml || true

echo
echo "All gemmi/pandas override lines:"
grep -nE '^(pandas|gemmi)\s*=' pyproject.toml || true

Repository: diff-use/sampleworks

Length of output: 3634


Remove the duplicate dependency-overrides table. pyproject.toml declares [tool.pixi.pypi-options.dependency-overrides] twice (around lines 139 and 150), which makes the file invalid TOML. Keep a single table and move gemmi = "==0.6.7" into it only if that pin is meant to apply globally.


Comment on lines +707 to +709
# pass the self-conditioning to the next step by updating the features.
# TODO: I wonder if we need to adjust this since we will apply additional guidance.
features.conditioning.x_self_conditioning = x_self_cond

Contributor

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the relevant section around lines 707-709.
sed -n '650,740p' src/sampleworks/models/protpardelle/wrapper.py

# Find all references to x_self_conditioning in the repository.
rg -n "x_self_conditioning|x_self_cond" src/sampleworks/models/protpardelle -S

# Inspect the type/shape of the conditioning object and any stateful reuse patterns nearby.
rg -n "conditioning\." src/sampleworks/models/protpardelle/wrapper.py -S

Repository: diff-use/sampleworks

Length of output: 5604


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect the conditioning dataclass definitions and nearby setup.
sed -n '80,140p' src/sampleworks/models/protpardelle/wrapper.py

# Inspect the parts that initialize conditioning and any other writes to x_self_conditioning.
sed -n '400,450p' src/sampleworks/models/protpardelle/wrapper.py
rg -n "x_self_conditioning\s*=" src/sampleworks/models/protpardelle/wrapper.py -S

# Check whether the model forward or surrounding code explicitly detaches cached tensors.
rg -n "detach\(|requires_grad|no_grad|self_cond" src/sampleworks/models/protpardelle/wrapper.py -S

Repository: diff-use/sampleworks

Length of output: 6153


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the sampling/caller context for step().
rg -n "\.step\(" src/sampleworks -S

# Inspect the sampler implementation around the Protpardelle wrapper usage.
rg -n "ProtpardelleWrapper|PureGuidance|sample\(" src/sampleworks -S

# Look for no_grad / inference_mode wrappers around the sampling loop.
rg -n "no_grad|inference_mode|grad_enabled|set_grad_enabled" src/sampleworks -S

Repository: diff-use/sampleworks

Length of output: 3909


🏁 Script executed:

#!/bin/bash
set -euo pipefail

sed -n '400,440p' src/sampleworks/core/samplers/edm.py
sed -n '90,140p' src/sampleworks/core/scalers/pure_guidance.py
sed -n '560,620p' src/sampleworks/models/protpardelle/wrapper.py

Repository: diff-use/sampleworks

Length of output: 6461


Detach x_self_conditioning before caching it.

step() can run with gradients enabled, and this tensor is fed back into the next sampling step. Storing x_self_cond directly on features.conditioning can retain the prior autograd graph across iterations and trigger graph bloat or double-backward failures. Cache a detached tensor here.

Suggested fix
-        features.conditioning.x_self_conditioning = x_self_cond
+        features.conditioning.x_self_conditioning = (
+            x_self_cond.detach()
+            if x_self_cond is not None and x_self_cond.requires_grad
+            else x_self_cond
+        )

Source: Coding guidelines

Comment on lines +63 to +67
try:
    from sampleworks.models.protpardelle.wrapper import ProtpardelleWrapper
except ImportError:
    ProtpardelleWrapper = None
    logger.warning("Failed to import Protpardelle, hopefully you're running a different model")

Contributor

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Broaden the Protpardelle fallback. src/sampleworks/utils/imports.py says this import can also fail with OSError/NotADirectoryError, so catching only ImportError can still crash sampleworks.cli.guidance for non-Protpardelle runs. The bundled YAML path is also repo-relative, so it will break outside a checkout.
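
A broadened fallback could look like the sketch below. It is a generic illustration, not the project's code: the helper name `optional_import` is hypothetical, and it returns the module rather than binding `ProtpardelleWrapper` directly.

```python
from __future__ import annotations

import importlib
import logging
from types import ModuleType

logger = logging.getLogger(__name__)


def optional_import(module_name: str) -> ModuleType | None:
    """Import a module, returning None if it is missing or fails on file access."""
    try:
        return importlib.import_module(module_name)
    # NotADirectoryError is a subclass of OSError, so OSError covers both cases.
    except (ImportError, OSError) as exc:
        logger.warning(
            "Could not import %s (%s); continuing without it", module_name, exc
        )
        return None
```

Catching the tuple `(ImportError, OSError)` keeps unrelated exceptions visible, which a bare `except Exception` would hide.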


Comment on lines +223 to +230
    elif model_type == StructurePredictor.PROTPARDELLE:
        if ProtpardelleWrapper is None:
            raise ImportError("Protpardelle dependencies not installed")
        logger.debug(f"Loading Protpardelle model from {validated_checkpoint_path}")
        model_wrapper = ProtpardelleWrapper(
            config_path=str(Path("src/sampleworks/data/cc89_epoch415.yaml").expanduser().resolve()),
            checkpoint_path=validated_checkpoint_path,
            device=device,

Contributor

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Load the bundled YAML as a package resource, not from src/….

This branch resolves src/sampleworks/data/cc89_epoch415.yaml from the caller's current working directory. That works from the repo root, but an installed CLI launched elsewhere will not have a src/ tree there, so Protpardelle becomes unusable outside development checkouts.

Suggested fix
+from importlib.resources import files
...
         model_wrapper = ProtpardelleWrapper(
-            config_path=str(Path("src/sampleworks/data/cc89_epoch415.yaml").expanduser().resolve()),
+            config_path=str(files("sampleworks.data").joinpath("cc89_epoch415.yaml")),
             checkpoint_path=validated_checkpoint_path,
             device=device,
         )
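
To show why a package resource lookup is independent of the working directory, here is a small self-contained sketch. It uses the standard-library `email` package as a stand-in because `sampleworks` may not be installed where this is read; the helper names `resource_text` and `resource_exists` are hypothetical.

```python
from importlib.resources import as_file, files


def resource_text(package: str, name: str) -> str:
    """Read a file shipped inside an importable package, regardless of the cwd."""
    return files(package).joinpath(name).read_text(encoding="utf-8")


def resource_exists(package: str, name: str) -> bool:
    """Check for a packaged file via a real filesystem path."""
    # as_file() yields a concrete path even if the package is not a plain directory.
    with as_file(files(package).joinpath(name)) as path:
        return path.is_file()
```

Note that this only works for the YAML if `sampleworks.data` is importable and the file is actually included as package data in the built distribution; neither is shown in this PR, so both are worth verifying.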

Comment on lines +187 to +230
def require_protpardelle(message: str | None = None) -> Callable[[F], F]:
    """Decorator to require Protpardelle model availability.

    Parameters
    ----------
    message: str, optional
        Custom error message. If None, uses default message.

    Returns
    -------
    Callable
        Decorator function

    Examples
    --------
    >>> @require_protpardelle
    ... def sample_protpardelle():
    ...     pass

    >>> @require_protpardelle("Custom error message")
    ... def custom_function():
    ...     pass
    """
    default_message = (
        "Protpardelle model wrapper is not available. Install with: "
        "pixi install -e protpardelle"
    )

    def decorator(func: F) -> F:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if not PROTPARDELLE_AVAILABLE:
                error_msg = message or default_message
                try:
                    import pytest

                    pytest.skip(error_msg)
                except ImportError:
                    raise ImportError(error_msg) from None
            return func(*args, **kwargs)

        return wrapper  # type: ignore

    return decorator

Contributor

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

git ls-files src/sampleworks/utils/imports.py
wc -l src/sampleworks/utils/imports.py
sed -n '150,260p' src/sampleworks/utils/imports.py | cat -n

Repository: diff-use/sampleworks

Length of output: 3914


Support the bare decorator form or remove it from the docs. Written without parentheses, @require_protpardelle passes the function object in as message and rebinds the decorated name to the inner decorator, so the first docstring example never wraps its target function. Either accept both decorator forms or change the examples to require parentheses.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/sampleworks/utils/imports.py` around lines 187 - 230, The bare decorator
usage of require_protpardelle is broken because the function object is being
passed into the message parameter, so the wrapper is not applied. Update
require_protpardelle in imports.py to support both `@require_protpardelle` and
`@require_protpardelle("...")` by detecting when the first argument is a callable
vs a custom message, or remove the bare form from the examples/docstring so only
the parenthesized usage is advertised.

Comment on lines +265 to +270
if (
    not BOLTZ_AVAILABLE
    and not PROTENIX_AVAILABLE
    and not RF3_AVAILABLE
    and not PROTPARDELLE_AVAILABLE
):

Contributor

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Update the require_any_model() install hint to include Protpardelle.

This branch now treats PROTPARDELLE_AVAILABLE as a valid model, but the default_message immediately above still only tells users to install Boltz, Protenix, or RF3. When the decorator fires, the remediation text will be stale.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/sampleworks/utils/imports.py` around lines 265 - 270, The install hint in
require_any_model() is stale because the availability check now includes
PROTPARDELLE_AVAILABLE, but the default_message still only mentions Boltz,
Protenix, and RF3. Update the default_message text in
src/sampleworks/utils/imports.py to include Protpardelle alongside the other
supported model options so the decorator’s remediation guidance matches the
current logic.
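
One way to keep the hint from going stale again is to generate it from the same table that lists the supported models. This is a sketch only: `MODEL_ENVS` and `build_install_hint` are hypothetical names, and the `boltz` and `protenix` environment names are assumptions (only `rf3` and `protpardelle` appear in this PR).

```python
# Display name -> pixi environment name. Only "rf3" and "protpardelle" are
# taken from the PR; the other environment names are assumptions.
MODEL_ENVS = {
    "Boltz": "boltz",
    "Protenix": "protenix",
    "RF3": "rf3",
    "Protpardelle": "protpardelle",
}


def build_install_hint(model_envs: dict[str, str]) -> str:
    """Build the remediation text from the table of supported models."""
    names = ", ".join(model_envs)
    commands = " | ".join(f"pixi install -e {env}" for env in model_envs.values())
    return (
        f"No structure prediction model is available (checked: {names}). "
        f"Install one with: {commands}"
    )


default_message = build_install_hint(MODEL_ENVS)
```

With this shape, adding a model to the table updates the error message in the same change.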

Comment on lines +17 to +21
# Must be set before any `import protpardelle...` happens. Respect an
# externally configured directory (e.g. when real weights are available).
os.environ.setdefault(
    "PROTPARDELLE_MODEL_PARAMS", tempfile.mkdtemp(prefix="protpardelle_model_params_")
)

Contributor

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Avoid allocating a temp dir inside setdefault().

setdefault() evaluates tempfile.mkdtemp(...) eagerly, so this creates an untracked temp directory on every import even when PROTPARDELLE_MODEL_PARAMS is already set. The same pattern appears again in tests/models/protpardelle/test_protpardelle_wrapper.py.

Suggested fix
-os.environ.setdefault(
-    "PROTPARDELLE_MODEL_PARAMS", tempfile.mkdtemp(prefix="protpardelle_model_params_")
-)
+_MODEL_PARAMS_DIR = None
+if "PROTPARDELLE_MODEL_PARAMS" not in os.environ:
+    _MODEL_PARAMS_DIR = tempfile.TemporaryDirectory(
+        prefix="protpardelle_model_params_"
+    )
+    os.environ["PROTPARDELLE_MODEL_PARAMS"] = _MODEL_PARAMS_DIR.name
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-# Must be set before any `import protpardelle...` happens. Respect an
-# externally configured directory (e.g. when real weights are available).
-os.environ.setdefault(
-    "PROTPARDELLE_MODEL_PARAMS", tempfile.mkdtemp(prefix="protpardelle_model_params_")
-)
+# Must be set before any `import protpardelle...` happens. Respect an
+# externally configured directory (e.g. when real weights are available).
+_MODEL_PARAMS_DIR = None
+if "PROTPARDELLE_MODEL_PARAMS" not in os.environ:
+    _MODEL_PARAMS_DIR = tempfile.TemporaryDirectory(
+        prefix="protpardelle_model_params_"
+    )
+    os.environ["PROTPARDELLE_MODEL_PARAMS"] = _MODEL_PARAMS_DIR.name
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/models/protpardelle/conftest.py` around lines 17 - 21, The
`os.environ.setdefault(...)` usage is eagerly creating a temp directory via
`tempfile.mkdtemp(...)` even when `PROTPARDELLE_MODEL_PARAMS` is already set, so
update the setup in `conftest.py` to only call `mkdtemp` when the env var is
missing. Apply the same lazy pattern in `test_protpardelle_wrapper.py` where the
same `setdefault`/`mkdtemp` usage appears, using the existing
`PROTPARDELLE_MODEL_PARAMS` guard to avoid allocating untracked temp dirs on
import.
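
The eager evaluation described above can be seen in a small, self-contained demonstration. A plain dict stands in for `os.environ` here (both are mutable mappings, and in either case Python evaluates the argument before `setdefault` is called); the key `DEMO_MODEL_PARAMS` and the helper `make_dir` are hypothetical names used only for illustration.

```python
import os
import tempfile

created: list[str] = []


def make_dir() -> str:
    """Create a temp directory and record it so the demo can clean up."""
    path = tempfile.mkdtemp(prefix="demo_model_params_")
    created.append(path)
    return path


# The variable is already configured, as it would be with real weights.
env = {"DEMO_MODEL_PARAMS": "/already/configured"}

# Eager: make_dir() runs before setdefault() can see that the key exists.
env.setdefault("DEMO_MODEL_PARAMS", make_dir())
eager_dirs = len(created)

# Lazy: the guard keeps make_dir() from running when the key is present.
if "DEMO_MODEL_PARAMS" not in env:
    env["DEMO_MODEL_PARAMS"] = make_dir()
lazy_dirs = len(created) - eager_dirs

# Remove the directory that the eager call left behind.
for path in created:
    os.rmdir(path)
```

The configured value is untouched in both cases; the difference is only the stray directory that the eager form creates.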

Comment thread tests/runs/test_runner.py
Comment on lines +29 to +30
assert argv[:5] == ["pixi", "run", "-e", "rf3", "python"]
assert "run_grid_search.py" in argv[5] # the exact path varies by invocation

Contributor

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Make this argv assertion hermetic against baked env detection.

runner._build_argv() switches to a direct Python executable whenever a baked pixi env exists, so this test can still fail on machines that already have .pixi/envs/rf3/bin/python even though the script-path check is now looser. Force the pixi path in the test setup before asserting the ["pixi", "run", ...] prefix.

Suggested hardening
monkeypatch.setenv("SAMPLEWORKS_FORCE_PIXI", "1")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/runs/test_runner.py` around lines 29 - 30, The argv assertion in the
test is not hermetic because runner._build_argv() can switch to a direct Python
executable when a baked pixi env is detected. Update the test setup for the
affected assertion in test_runner.py to force the pixi code path before checking
the ["pixi", "run", "-e", "rf3", "python"] prefix, using the existing test
fixture/monkeypatch around runner._build_argv() so the environment detection
cannot change the expected argv.
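
A sketch of why the assertion depends on the machine, and how forcing the pixi path removes that dependence. `build_argv` below is a hypothetical stand-in for `runner._build_argv()`; the real detection logic, and exactly how it reads `SAMPLEWORKS_FORCE_PIXI`, are assumptions. The standard-library `mock.patch.dict` plays the role that `monkeypatch.setenv` would play in the pytest test.

```python
import os
import tempfile
from pathlib import Path
from unittest import mock


def build_argv(env_name: str, script: str, env_root: Path) -> list[str]:
    """Hypothetical stand-in for runner._build_argv(); real logic may differ."""
    baked_python = env_root / env_name / "bin" / "python"
    force_pixi = os.environ.get("SAMPLEWORKS_FORCE_PIXI") == "1"
    if baked_python.exists() and not force_pixi:
        # A baked environment exists, so call its interpreter directly.
        return [str(baked_python), script]
    return ["pixi", "run", "-e", env_name, "python", script]


with tempfile.TemporaryDirectory() as root:
    # Simulate a machine that already has a baked rf3 environment.
    baked = Path(root) / "rf3" / "bin" / "python"
    baked.parent.mkdir(parents=True)
    baked.touch()

    # Without forcing, the result depends on what exists on disk.
    with mock.patch.dict(os.environ, {"SAMPLEWORKS_FORCE_PIXI": "0"}):
        direct_argv = build_argv("rf3", "scripts/run_grid_search.py", Path(root))

    # With the override set, the pixi prefix is produced regardless.
    with mock.patch.dict(os.environ, {"SAMPLEWORKS_FORCE_PIXI": "1"}):
        forced_argv = build_argv("rf3", "scripts/run_grid_search.py", Path(root))
```

Both context managers restore the previous environment on exit, so the override does not leak into other tests.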
