Mdc/add protpardelle #274
Wrap sequence-conditioned all-atom Protpardelle-1c models (task "ai-allatom", e.g. cc89.yaml) under the StructureModelWrapper protocol. featurize() derives sequence conditioning (aatype/seq_mask/residue_index/chain_index/atom37 mask) from an Atomworks structure's chain_info, and step() runs the full reverse-diffusion trajectory internally via Protpardelle.sample(gt_aatype=...), returning [batch, atoms, 3] coords.

- Register PROTPARDELLE_AVAILABLE / require_protpardelle / check_protpardelle_available (catching OSError from protpardelle.env).
- Add StructurePredictor.PROTPARDELLE.
- Tests build a small random ai-allatom model (no weights needed); cover sequence extraction, featurization shapes, prior init, protocol conformance, and an end-to-end short-trajectory sampling smoke test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…sn't just give us unguided samples
## Summary

Completes the Protpardelle debugging work from ENG-75 / #257 on top of `mdc/add-protpardelle`. This PR intentionally targets `mdc/add-protpardelle` so the review contains only the final Protpardelle fixes, not the whole in-progress integration branch.

Key changes:
- Restores the EDM sampler/model-wrapper call shape to the protocol-compatible `step(noisy_state, t_hat, features=features)` after removing the temporary extra `eps` argument.
- Implements Protpardelle step-time noise handling via `_expand_noise_level()`, broadcasting scalar or per-batch EDM timesteps to Protpardelle's expected `B x L` tensor.
- Moves Protpardelle step inputs/conditioning tensors onto the wrapper device before the model forward pass.
- Keeps `prot_lens_per_chain` on CPU when calling Protpardelle's sampling helper to avoid CPU/GPU device mismatch in helper-created residue indices.
- Maps Atomworks selenomethionine atom name `SE` into Protpardelle atom37's methionine sulfur slot `SD`.
- Fixes `ProtpardelleConditioning` immutability so dataclass construction can complete before selected conditioning fields become frozen.
- Allows packaged Protpardelle config/data under `src/sampleworks/data/**` through `.actlignore` so ACTL sync includes the runtime YAML config.
- Re-enables the slow `step()` behavior test and adds coverage for the MSE selenium atom mapping.
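The selenomethionine mapping listed above can be pictured as an atom-name alias applied before the slot lookup. This is a minimal sketch, not the wrapper's code: `ATOM_NAME_ALIASES`, `MET_ATOMS`, `canonical_atom_name`, and `met_slot` are hypothetical names, and `MET_ATOMS` is an illustrative methionine-only ordering rather than the real atom37 layout.

```python
# Hypothetical illustration; none of these names come from the Protpardelle wrapper.
# Selenomethionine (MSE) carries selenium ("SE") where methionine has sulfur ("SD").
ATOM_NAME_ALIASES: dict[tuple[str, str], str] = {
    ("MSE", "SE"): "SD",
}

# Illustrative methionine-only slot order (NOT the atom37 ordering).
MET_ATOMS = ("N", "CA", "C", "O", "CB", "CG", "SD", "CE")


def canonical_atom_name(res_name: str, atom_name: str) -> str:
    """Return the atom name used for slot lookup, applying known aliases."""
    return ATOM_NAME_ALIASES.get((res_name, atom_name), atom_name)


def met_slot(res_name: str, atom_name: str) -> int:
    """Look up the slot index after alias resolution."""
    return MET_ATOMS.index(canonical_atom_name(res_name, atom_name))
```

With this shape, the selenium atom of an MSE residue and the sulfur atom of a MET residue resolve to the same slot, so downstream masks and coordinates do not silently drop the atom.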
## Validation

Local static checks:
- `uvx ty check src/sampleworks/models/protpardelle/wrapper.py src/sampleworks/core/samplers/edm.py tests/models/protpardelle/test_protpardelle_wrapper.py`
- `uvx ruff check src/sampleworks/models/protpardelle/wrapper.py src/sampleworks/core/samplers/edm.py tests/models/protpardelle/test_protpardelle_wrapper.py`

Remote ACTL / Protpardelle environment checks:
- `pixi run -e protpardelle-dev pytest tests/models/protpardelle -m "not slow"`
  - `27 passed, 1 deselected`
- `pixi run -e protpardelle-dev pytest tests/models/protpardelle/test_protpardelle_wrapper.py::TestStep::test_step_returns_coords`
  - `1 passed`
- Exact reduced Protpardelle guidance smoke command from #257 using the shared checkpoint and 1VME inputs
  - completed successfully with `Guidance run successfully!`
  - wrote results to `output/protpardelle-smoke-guided/`

Non-blocking warnings observed during the smoke run were expected environment warnings for unavailable optional model/tool paths and missing mirror environment variables; they did not prevent the Protpardelle run from completing.

Refs #257.

Co-authored-by: xraymemory <me.anzuoni@gmail.com>
📝 Walkthrough
Adds a complete
Changes
- Protpardelle wrapper and guidance integration
- Supporting config and infra fixes
Sequence Diagram
sequenceDiagram
participant CLI as guidance CLI
participant Utils as guidance_script_utils
participant Wrapper as ProtpardelleWrapper
participant Model as Protpardelle model
CLI->>Utils: get_model_and_device(PROTPARDELLE, checkpoint)
Utils->>Wrapper: ProtpardelleWrapper(checkpoint, config_path, device)
Wrapper-->>Utils: wrapper instance
CLI->>Utils: _run_guidance(structure, wrapper)
Utils->>Wrapper: featurize(structure)
Wrapper-->>Utils: GenerativeModelInput[ProtpardelleConditioning]
loop EDM sampling steps
Utils->>Wrapper: step(x_t, t, features)
Wrapper->>Wrapper: _convert_to_atom37(x_t)
Wrapper->>Model: forward(atom37_coords, noise_level, seq_conditioning)
Model-->>Wrapper: predicted x0 atom37
Wrapper-->>Utils: x0 flat coords (batch x atoms x 3)
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 8
🧹 Nitpick comments (1)
tests/models/protpardelle/test_protpardelle_wrapper.py (1)
147-160: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Keep these assertions at the public wrapper boundary.
These tests reach into `_build_sampling_kwargs()` and `_convert_to_atom37()` directly, which makes the suite brittle to internal refactors that preserve `featurize()`, `initialize_from_prior()`, and `step()` behavior. Either move the coverage to public API outcomes or promote the helpers if they are intended contract. As per coding guidelines, `tests/**/*.py`: "Write black-box tests that verify behavior, not implementation."
Also applies to: 287-320
Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pyproject.toml`:
- Around line 139-141: The pyproject.toml currently defines the
dependency-overrides table twice under tool.pixi.pypi-options, which breaks TOML
parsing. Consolidate the duplicate [tool.pixi.pypi-options.dependency-overrides]
blocks into a single table, and keep the pandas and gemmi pins together there
only if they are intended to apply globally.
In `@src/sampleworks/models/protpardelle/wrapper.py`:
- Around line 707-709: The self-conditioning cached in
features.conditioning.x_self_conditioning is being stored with its autograd
history intact, which can carry the prior graph into later sampling steps.
Update the assignment in step() to cache a detached version of x_self_cond
before writing it to features.conditioning.x_self_conditioning, so repeated
iterations do not retain gradients across steps.
In `@src/sampleworks/utils/guidance_script_utils.py`:
- Around line 223-230: In the Protpardelle branch of guidance_script_utils, the
config_path is still built from a hardcoded src/sampleworks/data path, which
breaks when the package is installed and run from outside the repo. Update the
ProtpardelleWrapper config_path handling to load cc89_epoch415.yaml as a bundled
package resource using the package’s resource-loading APIs instead of resolving
a filesystem path relative to the current working directory. Keep the change
localized to the model-loading logic around StructurePredictor.PROTPARDELLE and
ProtpardelleWrapper.
- Around line 63-67: The Protpardelle fallback in the import block for
ProtpardelleWrapper is too narrow and can still crash non-Protpardelle runs.
Update the try/except around the ProtpardelleWrapper import to catch the
additional failure modes noted in imports.py, not just ImportError, and keep the
existing fallback assignment/logging. Also make the bundled YAML path in the
guidance script resolution independent of the current checkout by resolving it
from the package/module location instead of using a repo-relative path.
In `@src/sampleworks/utils/imports.py`:
- Around line 265-270: The install hint in require_any_model() is stale because
the availability check now includes PROTPARDELLE_AVAILABLE, but the
default_message still only mentions Boltz, Protenix, and RF3. Update the
default_message text in src/sampleworks/utils/imports.py to include Protpardelle
alongside the other supported model options so the decorator’s remediation
guidance matches the current logic.
- Around line 187-230: The bare decorator usage of require_protpardelle is
broken because the function object is being passed into the message parameter,
so the wrapper is not applied. Update require_protpardelle in imports.py to
support both `@require_protpardelle` and `@require_protpardelle("...")` by detecting
when the first argument is a callable vs a custom message, or remove the bare
form from the examples/docstring so only the parenthesized usage is advertised.
In `@tests/models/protpardelle/conftest.py`:
- Around line 17-21: The `os.environ.setdefault(...)` usage is eagerly creating
a temp directory via `tempfile.mkdtemp(...)` even when
`PROTPARDELLE_MODEL_PARAMS` is already set, so update the setup in `conftest.py`
to only call `mkdtemp` when the env var is missing. Apply the same lazy pattern
in `test_protpardelle_wrapper.py` where the same `setdefault`/`mkdtemp` usage
appears, using the existing `PROTPARDELLE_MODEL_PARAMS` guard to avoid
allocating untracked temp dirs on import.
In `@tests/runs/test_runner.py`:
- Around line 29-30: The argv assertion in the test is not hermetic because
runner._build_argv() can switch to a direct Python executable when a baked pixi
env is detected. Update the test setup for the affected assertion in
test_runner.py to force the pixi code path before checking the ["pixi", "run",
"-e", "rf3", "python"] prefix, using the existing test fixture/monkeypatch
around runner._build_argv() so the environment detection cannot change the
expected argv.
---
Nitpick comments:
In `@tests/models/protpardelle/test_protpardelle_wrapper.py`:
- Around line 147-160: The tests are asserting private helper behavior via
ProtpardelleWrapper._build_sampling_kwargs() and _convert_to_atom37(), which
makes them brittle to internal refactors. Move this coverage to black-box
assertions around the public wrapper methods featurize(),
initialize_from_prior(), and step(), or explicitly promote those helpers if they
are meant to be part of the contract. Keep the checks focused on observable
outputs and behavior at the wrapper boundary rather than internal implementation
details.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8a25473f-58ce-43f8-b062-7b353055864f
⛔ Files ignored due to path filters (1)
- `pixi.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (15)
- `.actlignore`
- `pyproject.toml`
- `src/sampleworks/cli/guidance.py`
- `src/sampleworks/core/samplers/edm.py`
- `src/sampleworks/data/cc89_epoch415.yaml`
- `src/sampleworks/models/protpardelle/__init__.py`
- `src/sampleworks/models/protpardelle/wrapper.py`
- `src/sampleworks/utils/guidance_constants.py`
- `src/sampleworks/utils/guidance_script_arguments.py`
- `src/sampleworks/utils/guidance_script_utils.py`
- `src/sampleworks/utils/imports.py`
- `tests/models/protpardelle/__init__.py`
- `tests/models/protpardelle/conftest.py`
- `tests/models/protpardelle/test_protpardelle_wrapper.py`
- `tests/runs/test_runner.py`
[tool.pixi.pypi-options.dependency-overrides]
pandas = "==2.3.1"
gemmi = "==0.6.7"
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python - <<'PY'
import pathlib, tomllib

data = tomllib.loads(pathlib.Path("pyproject.toml").read_text())
pixi = data["tool"]["pixi"]
print("Top-level dependency overrides:")
print(pixi.get("pypi-options", {}).get("dependency-overrides", {}))
for feature_name, feature_cfg in pixi.get("feature", {}).items():
    deps = feature_cfg.get("dependencies", {})
    pypi_deps = feature_cfg.get("pypi-dependencies", {})
    overrides = feature_cfg.get("pypi-options", {}).get("dependency-overrides", {})
    if "gemmi" in deps or "gemmi" in pypi_deps or overrides:
        print(f"\nfeature={feature_name}")
        if "gemmi" in deps:
            print(" dependencies.gemmi =", deps["gemmi"])
        if "gemmi" in pypi_deps:
            print(" pypi-dependencies.gemmi =", pypi_deps["gemmi"])
        if overrides:
            print(" dependency-overrides =", overrides)
PY
echo
echo "Other gemmi references in the repo:"
rg -n '\bgemmi\b' .
Repository: diff-use/sampleworks
Length of output: 670
🏁 Script executed:
#!/bin/bash
set -euo pipefail
nl -ba pyproject.toml | sed -n '55,170p'
Repository: diff-use/sampleworks
Length of output: 198
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Relevant pyproject.toml slice:"
sed -n '55,170p' pyproject.toml
echo
echo "Occurrences of dependency-overrides headers:"
grep -n '^\[tool\.pixi\.pypi-options\.dependency-overrides\]$' pyproject.toml || true
echo
echo "All gemmi/pandas override lines:"
grep -nE '^(pandas|gemmi)\s*=' pyproject.toml || true
Repository: diff-use/sampleworks
Length of output: 3634
Remove the duplicate dependency-overrides table. pyproject.toml declares [tool.pixi.pypi-options.dependency-overrides] twice (around lines 139 and 150), which makes the file invalid TOML. Keep a single table and move gemmi = "==0.6.7" into it only if that pin is meant to apply globally.
# pass the self-conditioning to the next step by updating the features.
# TODO: I wonder if we need to adjust this since we will apply additional guidance.
features.conditioning.x_self_conditioning = x_self_cond
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the relevant section around lines 707-709.
sed -n '650,740p' src/sampleworks/models/protpardelle/wrapper.py
# Find all references to x_self_conditioning in the repository.
rg -n "x_self_conditioning|x_self_cond" src/sampleworks/models/protpardelle -S
# Inspect the type/shape of the conditioning object and any stateful reuse patterns nearby.
rg -n "conditioning\." src/sampleworks/models/protpardelle/wrapper.py -S
Repository: diff-use/sampleworks
Length of output: 5604
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the conditioning dataclass definitions and nearby setup.
sed -n '80,140p' src/sampleworks/models/protpardelle/wrapper.py
# Inspect the parts that initialize conditioning and any other writes to x_self_conditioning.
sed -n '400,450p' src/sampleworks/models/protpardelle/wrapper.py
rg -n "x_self_conditioning\s*=" src/sampleworks/models/protpardelle/wrapper.py -S
# Check whether the model forward or surrounding code explicitly detaches cached tensors.
rg -n "detach\(|requires_grad|no_grad|self_cond" src/sampleworks/models/protpardelle/wrapper.py -S
Repository: diff-use/sampleworks
Length of output: 6153
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the sampling/caller context for step().
rg -n "\.step\(" src/sampleworks -S
# Inspect the sampler implementation around the Protpardelle wrapper usage.
rg -n "ProtpardelleWrapper|PureGuidance|sample\(" src/sampleworks -S
# Look for no_grad / inference_mode wrappers around the sampling loop.
rg -n "no_grad|inference_mode|grad_enabled|set_grad_enabled" src/sampleworks -S
Repository: diff-use/sampleworks
Length of output: 3909
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '400,440p' src/sampleworks/core/samplers/edm.py
sed -n '90,140p' src/sampleworks/core/scalers/pure_guidance.py
sed -n '560,620p' src/sampleworks/models/protpardelle/wrapper.py
Repository: diff-use/sampleworks
Length of output: 6461
Detach x_self_conditioning before caching it.
step() can run with gradients enabled, and this tensor is fed back into the next sampling step. Storing x_self_cond directly on features.conditioning can retain the prior autograd graph across iterations and trigger graph bloat or double-backward failures. Cache a detached tensor here.
Suggested fix
-        features.conditioning.x_self_conditioning = x_self_cond
+        features.conditioning.x_self_conditioning = (
+            x_self_cond.detach()
+            if x_self_cond is not None and x_self_cond.requires_grad
+            else x_self_cond
+        )
Source: Coding guidelines
try:
    from sampleworks.models.protpardelle.wrapper import ProtpardelleWrapper
except ImportError:
    ProtpardelleWrapper = None
    logger.warning("Failed to import Protpardelle, hopefully you're running a different model")
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Broaden the Protpardelle fallback. src/sampleworks/utils/imports.py says this import can also fail with OSError/NotADirectoryError, so catching only ImportError can still crash sampleworks.cli.guidance for non-Protpardelle runs. The bundled YAML path is also repo-relative, so it will break outside a checkout.
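One way to picture the broadened fallback is a small helper that treats both import errors and filesystem errors as "module unavailable". This is a sketch only: `optional_import` is a hypothetical helper, not part of sampleworks, and it relies on the fact that `NotADirectoryError` is a subclass of `OSError`, so catching `OSError` covers both cases named in the comment.

```python
import importlib
import logging
from types import ModuleType
from typing import Optional

logger = logging.getLogger(__name__)


def optional_import(module_name: str) -> Optional[ModuleType]:
    """Import a module, returning None if it is missing or fails at import time.

    OSError is caught alongside ImportError because some packages touch the
    filesystem during import; NotADirectoryError is an OSError subclass.
    """
    try:
        return importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        logger.warning(
            "Could not import %s (%s: %s)", module_name, type(exc).__name__, exc
        )
        return None
```

The call site can then check for `None` before using the module, which keeps runs for other models from failing at import time.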
elif model_type == StructurePredictor.PROTPARDELLE:
    if ProtpardelleWrapper is None:
        raise ImportError("Protpardelle dependencies not installed")
    logger.debug(f"Loading Protpardelle model from {validated_checkpoint_path}")
    model_wrapper = ProtpardelleWrapper(
        config_path=str(Path("src/sampleworks/data/cc89_epoch415.yaml").expanduser().resolve()),
        checkpoint_path=validated_checkpoint_path,
        device=device,
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Load the bundled YAML as a package resource, not from src/….
This branch resolves src/sampleworks/data/cc89_epoch415.yaml from the caller's current working directory. That works from the repo root, but an installed CLI launched elsewhere will not have a src/ tree there, so Protpardelle becomes unusable outside development checkouts.
Suggested fix
+from importlib.resources import files
 ...
     model_wrapper = ProtpardelleWrapper(
-        config_path=str(Path("src/sampleworks/data/cc89_epoch415.yaml").expanduser().resolve()),
+        config_path=str(files("sampleworks.data").joinpath("cc89_epoch415.yaml")),
         checkpoint_path=validated_checkpoint_path,
         device=device,
     )
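If the consumer needs a real filesystem path, as a `config_path` argument does, `importlib.resources.as_file` is the standard-library way to get one even when the package is not installed as plain files. The sketch below is an assumption-laden illustration: `resource_path` is a hypothetical helper, and the usage in the note reads a resource from the standard-library `json` package purely so the example is self-contained.

```python
from contextlib import contextmanager
from importlib.resources import as_file, files
from pathlib import Path
from typing import Iterator


@contextmanager
def resource_path(package: str, name: str) -> Iterator[Path]:
    """Yield a filesystem path for a resource bundled inside a package.

    For packages installed as regular files this is the file itself; for
    other loaders, as_file may extract it to a temporary location that is
    cleaned up when the context exits.
    """
    with as_file(files(package).joinpath(name)) as path:
        yield path
```

For example, `with resource_path("json", "__init__.py") as p:` yields a path that exists for the duration of the block. A path obtained this way should only be used inside the `with` block.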
def require_protpardelle(message: str | None = None) -> Callable[[F], F]:
    """Decorator to require Protpardelle model availability.

    Parameters
    ----------
    message: str, optional
        Custom error message. If None, uses default message.

    Returns
    -------
    Callable
        Decorator function

    Examples
    --------
    >>> @require_protpardelle
    ... def sample_protpardelle():
    ...     pass

    >>> @require_protpardelle("Custom error message")
    ... def custom_function():
    ...     pass
    """
    default_message = (
        "Protpardelle model wrapper is not available. Install with: "
        "pixi install -e protpardelle"
    )

    def decorator(func: F) -> F:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if not PROTPARDELLE_AVAILABLE:
                error_msg = message or default_message
                try:
                    import pytest

                    pytest.skip(error_msg)
                except ImportError:
                    raise ImportError(error_msg) from None
            return func(*args, **kwargs)

        return wrapper  # type: ignore

    return decorator
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
git ls-files src/sampleworks/utils/imports.py
wc -l src/sampleworks/utils/imports.py
sed -n '150,260p' src/sampleworks/utils/imports.py | cat -n
Repository: diff-use/sampleworks
Length of output: 3914
Support the bare decorator form or remove it from the docs. @require_protpardelle passes the function object into message, so the example does not wrap the target function. Either accept both decorator forms or change the examples to require parentheses.
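A common pattern for accepting both decorator forms is to check whether the first argument is callable. The sketch below illustrates that pattern under simplifying assumptions: `require_feature` and `FEATURE_AVAILABLE` are hypothetical stand-ins for `require_protpardelle` and `PROTPARDELLE_AVAILABLE`, and the `pytest.skip` branch from the quoted code is omitted so the example has no third-party dependency.

```python
import functools
from typing import Any, Callable, Optional, Union

FEATURE_AVAILABLE = False  # hypothetical stand-in for PROTPARDELLE_AVAILABLE
DEFAULT_MESSAGE = "Feature is not available."


def require_feature(arg: Union[Callable[..., Any], Optional[str]] = None) -> Any:
    """Usable as @require_feature or as @require_feature("custom message")."""

    def build(func: Callable[..., Any], message: Optional[str]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if not FEATURE_AVAILABLE:
                raise ImportError(message or DEFAULT_MESSAGE)
            return func(*args, **kwargs)

        return wrapper

    if callable(arg):
        # Bare form: Python passed the decorated function directly.
        return build(arg, None)
    # Parenthesized form: arg is a message (or None); return the real decorator.
    return lambda func: build(func, arg)


@require_feature
def bare() -> str:
    return "ok"


@require_feature("custom message")
def custom() -> str:
    return "ok"
```

The trade-off is a looser signature: a callable can no longer be passed as the message, which is rarely a problem for string messages.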
if (
    not BOLTZ_AVAILABLE
    and not PROTENIX_AVAILABLE
    and not RF3_AVAILABLE
    and not PROTPARDELLE_AVAILABLE
):
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Update the require_any_model() install hint to include Protpardelle.
This branch now treats PROTPARDELLE_AVAILABLE as a valid model, but the default_message immediately above still only tells users to install Boltz, Protenix, or RF3. When the decorator fires, the remediation text will be stale.
# Must be set before any `import protpardelle...` happens. Respect an
# externally configured directory (e.g. when real weights are available).
os.environ.setdefault(
    "PROTPARDELLE_MODEL_PARAMS", tempfile.mkdtemp(prefix="protpardelle_model_params_")
)
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Avoid allocating a temp dir inside setdefault().
setdefault() evaluates tempfile.mkdtemp(...) eagerly, so this creates an untracked temp directory on every import even when PROTPARDELLE_MODEL_PARAMS is already set. The same pattern appears again in tests/models/protpardelle/test_protpardelle_wrapper.py.
Suggested fix
-os.environ.setdefault(
-    "PROTPARDELLE_MODEL_PARAMS", tempfile.mkdtemp(prefix="protpardelle_model_params_")
-)
+_MODEL_PARAMS_DIR = None
+if "PROTPARDELLE_MODEL_PARAMS" not in os.environ:
+    _MODEL_PARAMS_DIR = tempfile.TemporaryDirectory(
+        prefix="protpardelle_model_params_"
+    )
+    os.environ["PROTPARDELLE_MODEL_PARAMS"] = _MODEL_PARAMS_DIR.name
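The eager-evaluation point is general Python behavior: arguments are evaluated before `setdefault()` is called, so the default expression runs whether or not the key exists. The sketch below demonstrates this with a plain dict standing in for `os.environ` and a hypothetical `make_default()` that counts its calls instead of creating a directory.

```python
calls: list[int] = []


def make_default() -> str:
    """Stand-in for tempfile.mkdtemp(); records each call instead of touching disk."""
    calls.append(1)
    return "/tmp/fallback"


# Plain dict used in place of os.environ so the example has no side effects.
env = {"PROTPARDELLE_MODEL_PARAMS": "/already/set"}

# Eager: make_default() runs even though the key is already present.
env.setdefault("PROTPARDELLE_MODEL_PARAMS", make_default())
eager_calls = len(calls)

# Lazy: the default is only computed when the key is missing.
calls.clear()
if "PROTPARDELLE_MODEL_PARAMS" not in env:
    env["PROTPARDELLE_MODEL_PARAMS"] = make_default()
lazy_calls = len(calls)
```

In both cases the existing value is preserved; the difference is only whether the side effect of computing the default happens.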
assert argv[:5] == ["pixi", "run", "-e", "rf3", "python"]
assert "run_grid_search.py" in argv[5]  # the exact path varies by invocation
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Make this argv assertion hermetic against baked env detection.
runner._build_argv() switches to a direct Python executable whenever a baked pixi env exists, so this test can still fail on machines that already have .pixi/envs/rf3/bin/python even though the script-path check is now looser. Force the pixi path in the test setup before asserting the ["pixi", "run", ...] prefix.
Suggested hardening
monkeypatch.setenv("SAMPLEWORKS_FORCE_PIXI", "1")
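The hermeticity concern can be illustrated with a toy version of the argv builder. Everything here is an assumption made for illustration: `build_argv` is a hypothetical stand-in for `runner._build_argv()`, and the way it interprets `SAMPLEWORKS_FORCE_PIXI` is guessed from the suggested hardening rather than taken from the runner. `unittest.mock.patch.dict` plays the role that pytest's `monkeypatch.setenv` would in the real test.

```python
import os
from unittest import mock


def build_argv(env_name: str, script: str) -> list[str]:
    """Toy stand-in: prefer a baked env's python unless pixi is forced."""
    baked_python = os.path.join(".pixi", "envs", env_name, "bin", "python")
    force_pixi = os.environ.get("SAMPLEWORKS_FORCE_PIXI") == "1"
    if not force_pixi and os.path.exists(baked_python):
        return [baked_python, script]
    return ["pixi", "run", "-e", env_name, "python", script]


def argv_with_forced_pixi(env_name: str, script: str) -> list[str]:
    """Build argv with the env var set only for the duration of the call."""
    with mock.patch.dict(os.environ, {"SAMPLEWORKS_FORCE_PIXI": "1"}):
        return build_argv(env_name, script)
```

Because the environment is pinned inside the test, the result no longer depends on whether the machine happens to have a baked environment on disk.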
This is a working version of Protpardelle-1c in SampleWorks. It may still require parameter tuning and other updates; in particular, it isn't clear how to use self-conditioning. However, we can generate structures with this version, so I'm making this PR as a milestone.
Only a couple of significant changes have been made outside the Protpardelle wrapper code itself. One is to make a features class that is unfrozen, so that we can pass self-conditioning input forward to the next Euler step during sampling. Another is that we define different sampling parameters for Protpardelle when instantiating the sampler. There is no CLI control for either change in this PR.
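One way to keep most conditioning fields fixed while still letting the self-conditioning slot be updated between steps is a dataclass that freezes selected fields only after construction finishes. This is a sketch of that idea, not the actual `ProtpardelleConditioning`: the class name, field names, and `_FROZEN_FIELDS` set are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, ClassVar


@dataclass
class Conditioning:
    """Hypothetical features class: `aatype` is frozen, self-conditioning is not."""

    aatype: tuple[int, ...]
    x_self_conditioning: Any = None

    # Names that become read-only once __post_init__ has run.
    _FROZEN_FIELDS: ClassVar[frozenset[str]] = frozenset({"aatype"})
    _initialized: bool = field(default=False, init=False, repr=False, compare=False)

    def __post_init__(self) -> None:
        # Construction is complete; start enforcing immutability.
        object.__setattr__(self, "_initialized", True)

    def __setattr__(self, name: str, value: Any) -> None:
        if getattr(self, "_initialized", False) and name in self._FROZEN_FIELDS:
            raise AttributeError(f"{name!r} is frozen after construction")
        object.__setattr__(self, name, value)
```

The generated `__init__` can assign every field because the guard is inactive until `__post_init__` runs; afterwards `x_self_conditioning` stays writable for the next sampling step while `aatype` does not.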
Summary by CodeRabbit
New Features
Bug Fixes