Docker implementation#1219
Conversation
9fa7db8 to
9f20cbe
Compare
|
@Jrice1317 I'd be happy to do a review once the items from Marco have been resolved, it'll also help me understand the changes better |
| } | ||
| ], | ||
| "default": null, | ||
| "description": "If set, builds a docker image using the Dockerfile generated by constructor and saves it as a portable tarball either uncompressed or compressed. ``<name>-<version>-<platform>-<arch>-docker.tar`` will be created in the output docker directory.", |
There was a problem hiding this comment.
Is the format <name>-<version>-<platform>-<arch> true?
From what I could see in the code it looks like:
tag = info.get("docker_tag", f"{info['name'].lower()}:{info['version']}")
| The labels `org.opencontainers.image.title` and `org.opencontainers.image.version` | ||
| are set automatically from `name` and `version`. | ||
| """ | ||
| docker_image: Literal["tar", "gz", "zst"] | None = None |
There was a problem hiding this comment.
Should we add explicit NotImplementedError for those values that are not yet implemented? I see in the PR it says:
(gz, zst) are defined in the schema but not yet implemented.
lrandersson
left a comment
There was a problem hiding this comment.
Well done Jaida! In the future I think we should split these large PRs into smaller separate ones (to simplify review) If the comments are addressed from Marco I approve!
marcoesters
left a comment
There was a problem hiding this comment.
Good changes - a few more issues to address.
| # Avoid duplicating PREFIX/bin in interactive shell PATH after shell init. | ||
| RUN echo 'export PATH=$(sed -e "s,:\?{{ default_prefix }}/bin:,," <<< "${PATH}")' >> ~/.bashrc && \ | ||
| {%- if has_mamba %} | ||
| "$PREFIX/bin/mamba" shell init --all || "$PREFIX/bin/python" -m mamba.mamba init --all | ||
| {%- elif has_conda %} | ||
| $PREFIX/bin/python -m conda init --all | ||
| {%- else %} | ||
| echo "No conda executable found, skipping shell init" | ||
| {%- endif %} |
There was a problem hiding this comment.
That shell initialization is still not quite right.
- We don't need the
PATHmanipulation if no shell has been initialized, so the entire block needs a{%- if has_conda or has_mamba %}wrapper. - The
--condabininitialization option is missing. - I don't see the purpose of the
echocommand. Presumably, the installer creator skippedcondaandmambaon purpose, so that information seems unnecessary.
| "$PREFIX/bin/python" -m conda init --all{%- if has_mamba %} && \ | ||
| ("$PREFIX/bin/mamba" shell init --all || "$PREFIX/bin/python" -m mamba.mamba init --all){% endif %} |
There was a problem hiding this comment.
This is very hard to read but the way I read this, conda will always be initialized. But if the installer only has mamba, that would fail.
Could we put if-conditions into their own lines instead of inlining them?
There was a problem hiding this comment.
I was following what header.sh does as you suggested. See here.
How do you suggest I handle this condition? I was trying to refrain from creating separate RUN commands, but I'm not sure what the third option is here.
There was a problem hiding this comment.
Hmm, I don't know if I agree with what header.sh is doing. The initialization seems to be predicated on conda being in the installer. That's out of scope for this PR.
I also just noticed that the way this is currently written, we would see an error message on mamba v1 (you'd have to redirect stderr to /dev/null). We have some fairly complex logic here with two nested branches: the condabin branch (which adds condabin, not bin. to PATH) and the classic branch. I think Python may be the appropriate way to describe some of it instead of trying to force it into Jinja.
There was a problem hiding this comment.
Alternatively, the -c option is available. While this doesn't initialize all shells, I'd be okay with punting that to another PR instead of introducing complex shell initialization template logic.
There was a problem hiding this comment.
I have a few clarifying questions:
- For
-c— are you suggesting we use-cnow and defer the complex init logic (condabin, mamba v1/v2, --all) to a separate PR? Or are you suggesting we defer-citself to a separate PR? - For condabin — the
ENV PATH="${PREFIX}/bin:${PATH}"line in Stage 2 addsbinunconditionally, which seems wrong for condabin. Should I do something likeif initialize_conda != 'condabin':<ENV_PATH_command>``, or should condabin support be deferred to a follow-up PR entirely? - For moving complex logic to Python — is the mamba version check and stderr redirect what you had in mind for moving to Python, or did you mean moving the entire init block out of the template?
- For scope — when you say the init being predicated on conda is out of scope, do you mean I should remove the
conda initcall entirely for this PR, or condition it solely on whether_has_condais true rather than followingheader.sh's assumption that conda is always present?
There was a problem hiding this comment.
- Use
-cnow, defer the rest to a separate PR. - Your assessment is correct. I don't think we should defer this since using
binforcondabinis incorrect whereas using-cis just incomplete. - The init block since this has too many logical branches.
- You correctly pointed out that your current solution is mimicking what the SH installer does. While I don't think the logic is correct, I think consistency with the SH installer is more important. Fixing the logic is out of scope - keep it consistent with the SH installer for this PR.
|
@Jrice1317 are the 2 open threads the last remaining tasks in this PR? Also let me know if you need help with rebasing, there are many changes on |
First pass Revert shar changes Fix logic with building on non-native platforms Add mamba logic Require base image to be provided in construct.yaml Update docker_build.py Use existing vars in template Add docker as installer type Add example construct.yaml for tests Add test Add clean command Call proper docker command in test Fix pre-commit errors Add docstring to beginning of file Use schema vars properly Use correct image name in test Update docs Add news file Do not generate file extension .docker Fix typos Always use sh for docker Pre-commit fix Remove docker from os_allowed Regenerate schema Add docker_build to schema Make whitespace adjustments Fix logic regarding base image requirement Make image portable Update wording Update docs Revert to using one path Revert back to docker load Add output Update logic Apply suggestions from code review Co-authored-by: Marco Esters <mesters@anaconda.com> Change wording in schema Be more descriptive Be more generalized Use multiline block Refine logic Move check to main if docker installed for building image Change docker_build to docker_image Expand tests Pre-commit fixes Change logic Update docs Fix whitespace in template Remove redundant logic and improve wording Fix test Update docs Make tests linux only Remove restriction on docker_tag and apply code review suggestions Add cross-build support to tests Update the docs Fix typo Add output for debugging Add docker buildx check to utils Apply code review suggestions and update docs Apply code review suggestions Remove code Update docs Remove unnecessary asserts Revert broken tests Update test to verify sh installer is cleaned up Fix pre-commit Apply code review suggestions and update docs Update tests to align with updated schema Apply code review suggestions Fix typo Add logic for condabin Move logic from template to script
3f08ffc to
0ba92bc
Compare
lrandersson
left a comment
There was a problem hiding this comment.
Well done Jaida! I'm approving because I saw you fixed the conflict. Let Marco have the final approval before merging since he has done the heavy-lifting in this review.
marcoesters
left a comment
There was a problem hiding this comment.
Some style/efficiency suggestions and one error left to fix.
| else: | ||
| result = subprocess.run( | ||
| ["docker", "run", "--rm", image_name, "/bin/bash", "-c", "echo 'Hello, World!'"], | ||
| capture_output=True, | ||
| text=True, | ||
| ) | ||
| assert "Hello, World!" in result.stdout |
There was a problem hiding this comment.
| else: | |
| result = subprocess.run( | |
| ["docker", "run", "--rm", image_name, "/bin/bash", "-c", "echo 'Hello, World!'"], | |
| capture_output=True, | |
| text=True, | |
| ) | |
| assert "Hello, World!" in result.stdout | |
| result = subprocess.run( | |
| ["docker", "run", "--rm", image_name, "/bin/bash", "-c", "echo 'Hello, World!'"], | |
| capture_output=True, | |
| text=True, | |
| ) | |
| assert "Hello, World!" in result.stdout |
Since this should work no matter the initialization level, we can remove the else.
| if init == "mamba_v1": | ||
| config["specs"] += ["mamba <2.0.0"] | ||
| else: | ||
| config["specs"] += ["mamba"] |
There was a problem hiding this comment.
| if init == "mamba_v1": | |
| config["specs"] += ["mamba <2.0.0"] | |
| else: | |
| config["specs"] += ["mamba"] | |
| if init == "mamba_v1": | |
| config["specs"].append("mamba <2.0.0") | |
| else: | |
| config["specs"].append("mamba >=2.0.0") |
I find append more semantic. Either way, let's make sure we get mamba v2 by pinning it.
| {%- if initialize_conda == 'condabin' %} | ||
| ENV PATH="${PREFIX}/condabin:${PATH}" | ||
| {%- elif initialize_conda %} | ||
| ENV PATH="${PREFIX}/condabin:${PREFIX}/bin:${PATH}" |
There was a problem hiding this comment.
| ENV PATH="${PREFIX}/condabin:${PREFIX}/bin:${PATH}" | |
| ENV PATH="${PREFIX}/bin:${PREFIX}/condabin:${PATH}" |
bin comes before condabin
| def _build_init_run_block(info): | ||
| from .conda_interface import MatchSpec | ||
|
|
||
| specs = {MatchSpec(spec).name for spec in info.get("specs", ())} | ||
| has_mamba = "mamba" in specs | ||
| has_conda = "conda" in specs | ||
| initialize_conda = info.get("initialize_conda") | ||
|
|
||
| if not (has_conda or has_mamba) or not initialize_conda or initialize_conda == "condabin": | ||
| return "" | ||
| run = 'RUN "${PREFIX}/bin/conda" init --all' | ||
|
|
||
| if has_mamba: | ||
| mamba_version = None | ||
| for record in info.get("_all_pkg_records", ()): | ||
| if record.name == "mamba": | ||
| mamba_version = record.version | ||
| break | ||
| if check_version(mamba_version, min_version="2.0.0"): | ||
| run += ' && "${PREFIX}/bin/mamba" shell init --shell bash' | ||
| else: | ||
| run += ' && "${PREFIX}/bin/python" -m mamba.mamba init --all' | ||
| return run |
There was a problem hiding this comment.
| def _build_init_run_block(info): | |
| from .conda_interface import MatchSpec | |
| specs = {MatchSpec(spec).name for spec in info.get("specs", ())} | |
| has_mamba = "mamba" in specs | |
| has_conda = "conda" in specs | |
| initialize_conda = info.get("initialize_conda") | |
| if not (has_conda or has_mamba) or not initialize_conda or initialize_conda == "condabin": | |
| return "" | |
| run = 'RUN "${PREFIX}/bin/conda" init --all' | |
| if has_mamba: | |
| mamba_version = None | |
| for record in info.get("_all_pkg_records", ()): | |
| if record.name == "mamba": | |
| mamba_version = record.version | |
| break | |
| if check_version(mamba_version, min_version="2.0.0"): | |
| run += ' && "${PREFIX}/bin/mamba" shell init --shell bash' | |
| else: | |
| run += ' && "${PREFIX}/bin/python" -m mamba.mamba init --all' | |
| return run | |
| def _build_init_run_block(info): | |
| if not info.get("_has_conda"): | |
| return "" | |
| initialize_conda = info.get("initialize_conda") | |
| if not initialize_conda or initialize_conda == "condabin": | |
| return "" | |
| run = 'RUN "${PREFIX}/bin/conda" init --all' | |
| for record in info.get("_all_pkg_records", ()): | |
| if not record.name == "mamba": | |
| continue | |
| if check_version(record.version, min_version="2.0.0"): | |
| run += ' && "${PREFIX}/bin/mamba" shell init --shell bash' | |
| else: | |
| run += ' && "${PREFIX}/bin/python" -m mamba.mamba init --all' | |
| break | |
| return run |
This avoids an unnecessary imports and is more efficient by returning early.
There was a problem hiding this comment.
Ah, I see, but why separate if has_conda and initialize_conda when they both return "" and we're focusing on the init?
There was a problem hiding this comment.
Should it be removed since if there is no conda, the init will fail early?
"it" being the has_conda check
There was a problem hiding this comment.
That should have been if not info.get("_has_conda"), sorry. I fixed it in my suggestion.
I chose the separation to just return early. I find multiple returns that use different pieces of information easier to read than a more complex if condition.
|
|
||
| COPY --from=builder ${PREFIX} ${PREFIX} | ||
| {%- if register_envs %} | ||
| COPY --from=builder /root/.conda /root/.conda |
There was a problem hiding this comment.
| COPY --from=builder /root/.conda /root/.conda | |
| COPY --from=builder ${HOME}/.conda ${HOME}/.conda |
Can we use $HOME here in case the base image uses a different default user?
Description
Summary
This PR adds Docker output support to constructor via two new opt-in flows:
Flow 1: Dockerfile generation (
installer_type: docker)Generates a multi-stage
Dockerfilealongside the.shinstaller and stages bothinto a named output directory. No Docker CLI required — the output can be used
directly or customized before building.
Output structure:
Usage:
The generated Dockerfile uses a two-stage build: Stage 1 runs the
.shinstallerin batch mode and cleans up build artifacts (
.afiles,__pycache__, optionallypkgs/). Stage 2 copies the finished environment into a clean final image and optionally initializes the shell.Flow 2: Portable image tarball (
docker_image_format: tar)Builds a Docker image from the generated Dockerfile using
docker buildxandexports it as a portable
.tarfile viadocker save. Requires the Docker CLIto be installed on the host. The target platform must be Linux.
Output:
Usage:
New schema keys
docker_base_imagedocker_image_formattarto build and export a portable image tarball.docker_tagname:version.docker_labelstitleandversionare set automatically.Platform support
linux-*for all Docker features.docker_image_format: taradditionally requires the Docker CLI on the host.initialize_condasupportfalse/ unsetcondabinPREFIX/condabintoPATHonly.true/classicPREFIX/condabinandPREFIX/bintoPATH, runsconda init --all.classic+ mamba in specsAll cases support non-interactive
docker runviaENV PATH. Interactive shells are covered byconda init --allwriting to shell rc files at image build time.Notes
docker_image_formatis designed to be extendable. Additional output formats(
gz,zst) are mentioned in the schema but not yet implemented..shinstaller is always built first and reused as the Docker buildcontext, keeping the two outputs consistent.
Changes
New:
constructor/docker_build.py: Handles Docker output by rendering template and optionally building portable imageconstructor/dockerfile_template.tmpl: Template used to generate Dockerfileexamples/dockerfile/construct.yaml: Example forinstaller_type: dockerflowexamples/docker_image_format/construct.yaml: Example fordocker_image_format: tarflowUpdated:
constructor/_schema.py: Addsdockertoinstaller_typeand addsdocker_base_image,docker_tag,docker_labels,docker_image_formatconstructor/main.py: Addsdockerto installer typestests/test_examples.py: Addstest_dockerfile_generationandtest_docker_image_buildto cover both flowsChecklist - did you ...
newsdirectory (using the template) for the next release's release notes?