
Docker implementation #1219

Open
Jrice1317 wants to merge 7 commits into conda:main from Jrice1317:docker-implementation

@Jrice1317 Jrice1317 commented Apr 22, 2026

Contributor

Description

Summary

This PR adds Docker output support to constructor via two new opt-in flows:

Flow 1: Dockerfile generation (installer_type: docker)

Generates a multi-stage Dockerfile alongside the .sh installer and stages both
into a named output directory. No Docker CLI required — the output can be used
directly or customized before building.

Output structure:

{output_dir}/<name>-<version>-<platform>/
    Dockerfile
    <name>-<version>-<platform>.sh

Usage:

installer_type: docker
docker_base_image: "debian:13.4-slim@sha256:..."

The generated Dockerfile uses a two-stage build: Stage 1 runs the .sh installer
in batch mode and cleans up build artifacts (.a files, __pycache__, optionally
pkgs/). Stage 2 copies the finished environment into a clean final image and optionally initializes the shell.
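
As a rough illustration of the two-stage layout described above, the following Python sketch assembles a minimal Dockerfile string. It is not the template shipped in this PR: the function name `render_two_stage_dockerfile`, the `/opt/conda` prefix, the `/tmp/installer.sh` path, and the exact cleanup commands are assumptions made for illustration.

```python
def render_two_stage_dockerfile(base_image, installer, prefix="/opt/conda", keep_pkgs=False):
    """Return the text of a minimal two-stage Dockerfile (illustrative only)."""
    # Stage 1 commands: run the .sh installer in batch mode (-b) into the
    # prefix (-p), then remove build artifacts in the same layer.
    steps = [
        f'sh /tmp/installer.sh -b -p "{prefix}"',
        f'find "{prefix}" -name "*.a" -delete',
        f'find "{prefix}" -type d -name "__pycache__" -prune -exec rm -rf {{}} +',
    ]
    if not keep_pkgs:
        # Assumed way of expressing the optional pkgs/ cleanup.
        steps.append(f'rm -rf "{prefix}/pkgs"')
    run = "RUN " + " && \\\n    ".join(steps)

    lines = [
        f"FROM {base_image} AS builder",
        f"COPY {installer} /tmp/installer.sh",
        run,
        "",
        # Stage 2: start again from the base image and copy only the prefix.
        f"FROM {base_image}",
        f"COPY --from=builder {prefix} {prefix}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_two_stage_dockerfile("debian:13.4-slim", "demo-1.0-Linux-x86_64.sh"))
```

Because the installer and the cleanup run in a single `RUN` instruction of the builder stage, none of the removed files end up in a layer of the final image.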

Flow 2: Portable image tarball (docker_image_format: tar)

Builds a Docker image from the generated Dockerfile using docker buildx and
exports it as a portable .tar file via docker save. Requires the Docker CLI
to be installed on the host. The target platform must be Linux.

Output:

{output_dir}/<name>-<version>-<platform>-docker.tar

Usage:

docker_image_format: tar
docker_base_image: "debian:13.4-slim@sha256:..."
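
A minimal sketch of the two commands this flow relies on, written as argument lists that could be passed to `subprocess.run`. The helper name `docker_commands`, the `DOCKER_PLATFORMS` mapping, and the use of `--load` are assumptions for illustration; the PR's `docker_build.py` may assemble these calls differently.

```python
# Assumed mapping from conda platform names to Docker platform names.
DOCKER_PLATFORMS = {
    "linux-64": "linux/amd64",
    "linux-aarch64": "linux/arm64",
}


def docker_commands(context_dir, tag, conda_platform, tarball):
    """Return (build_argv, save_argv) for building and exporting an image."""
    docker_platform = DOCKER_PLATFORMS[conda_platform]
    build = [
        "docker", "buildx", "build",
        "--platform", docker_platform,
        "--tag", tag,
        "--load",  # load the result into the local image store for docker save
        str(context_dir),
    ]
    save = ["docker", "save", "--output", str(tarball), tag]
    return build, save


if __name__ == "__main__":
    build, save = docker_commands("out/demo-1.0-linux-64", "demo:1.0", "linux-64", "out/demo.tar")
    print(" ".join(build))
    print(" ".join(save))
```

The resulting tarball can later be imported on another machine with `docker load --input <file>`.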

New schema keys

| Key | Description |
|---|---|
| docker_base_image | Required for all Docker features. Base image reference. |
| docker_image_format | Set to tar to build and export a portable image tarball. |
| docker_tag | Optional tag for the built image. Defaults to name:version. |
| docker_labels | Additional OCI labels. title and version are set automatically. |
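
To make the defaults concrete, here is a small sketch of how the tag and labels could be resolved from the parsed `construct.yaml` data. The tag default mirrors the expression quoted later in this review; the helper name `resolve_tag_and_labels` and the choice to let the automatic labels override user-supplied ones are assumptions.

```python
def resolve_tag_and_labels(info):
    """Return (tag, labels) for an info dict with 'name' and 'version' keys."""
    # Default tag: lower-cased name plus version.
    tag = info.get("docker_tag", f"{info['name'].lower()}:{info['version']}")

    # Start from user-provided labels, then set the automatic ones.
    # Letting the automatic labels win on conflict is an assumption.
    labels = dict(info.get("docker_labels") or {})
    labels["org.opencontainers.image.title"] = info["name"]
    labels["org.opencontainers.image.version"] = info["version"]
    return tag, labels


if __name__ == "__main__":
    example = {"name": "MyDist", "version": "1.2.3", "docker_labels": {"maintainer": "me"}}
    print(resolve_tag_and_labels(example))
```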

Platform support

  • Target platform must be linux-* for all Docker features.
  • No host-level restriction for Dockerfile-only output.
  • docker_image_format: tar additionally requires the Docker CLI on the host.
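
The three rules above could be checked along these lines. This is a sketch under assumptions: the function name `check_docker_requirements`, the exception types, and the messages are hypothetical, and the `which` parameter exists only so the lookup can be replaced in tests.

```python
import shutil


def check_docker_requirements(platform, image_format=None, which=shutil.which):
    """Raise if the requested Docker output cannot be produced."""
    # All Docker features require a Linux target platform.
    if not platform.startswith("linux-"):
        raise ValueError(f"Docker output requires a linux-* target platform, got {platform!r}")
    # Only the tarball flow needs the Docker CLI on the host.
    if image_format == "tar" and which("docker") is None:
        raise RuntimeError("docker_image_format: tar requires the Docker CLI on the host")


if __name__ == "__main__":
    check_docker_requirements("linux-64")  # Dockerfile-only output, no CLI needed
    print("linux-64 Dockerfile output is allowed on this host")
```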

initialize_conda support

| Value | Behavior |
|---|---|
| false / unset | No PATH modification, no init. |
| condabin | Adds PREFIX/condabin to PATH only. |
| true / classic | Adds PREFIX/condabin and PREFIX/bin to PATH, runs conda init --all. |
| classic + mamba in specs | Same as above, additionally runs mamba shell init (v1/v2 detected from resolved package version). |

All cases support non-interactive docker run via ENV PATH. Interactive shells are covered by conda init --all writing to shell rc files at image build time.
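
The table can be read as a small decision function. The sketch below returns the `ENV PATH` line and whether `conda init --all` would run; the name `path_settings`, the `/opt/conda` prefix, and the order of `bin` and `condabin` within PATH are assumptions (a later review comment in this thread asks for `bin` first).

```python
def path_settings(initialize_conda, prefix="/opt/conda"):
    """Return (env_path_line, runs_conda_init) for an initialize_conda value."""
    if not initialize_conda:
        # false / unset: no PATH modification, no init.
        return None, False
    if initialize_conda == "condabin":
        # condabin: only PREFIX/condabin is added, no shell init.
        return f'ENV PATH="{prefix}/condabin:${{PATH}}"', False
    # true / classic: both directories are added and conda init runs.
    return f'ENV PATH="{prefix}/bin:{prefix}/condabin:${{PATH}}"', True


if __name__ == "__main__":
    for value in (False, "condabin", True, "classic"):
        print(repr(value), path_settings(value))
```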

Notes

  • docker_image_format is designed to be extendable. Additional output formats
    (gz, zst) are mentioned in the schema but not yet implemented.
  • The .sh installer is always built first and reused as the Docker build
    context, keeping the two outputs consistent.
  • The feature is intended to support Dockerfile generation from any host when targeting Linux, but CI currently validates the Docker flows on Linux only because we do not yet have buildx available on macOS/Windows hosts.

Changes

New:

  • constructor/docker_build.py: Handles Docker output by rendering template and optionally building portable image
  • constructor/dockerfile_template.tmpl: Template used to generate Dockerfile
  • examples/dockerfile/construct.yaml: Example for installer_type: docker flow
  • examples/docker_image_format/construct.yaml: Example for docker_image_format: tar flow

Updated:

  • constructor/_schema.py: Adds docker to installer_type and adds docker_base_image, docker_tag, docker_labels, docker_image_format
  • constructor/main.py: Adds docker to installer types
  • tests/test_examples.py: Adds test_dockerfile_generation and test_docker_image_build to cover both flows

Checklist - did you ...

  • Add a file to the news directory (using the template) for the next release's release notes?
  • Add / update necessary tests?
  • Add / update outdated documentation?

@Jrice1317 Jrice1317 requested a review from a team as a code owner April 22, 2026 19:32
@github-project-automation github-project-automation Bot moved this to 🆕 New in 🔎 Review Apr 22, 2026
@conda-bot conda-bot added the cla-signed [bot] added once the contributor has signed the CLA label Apr 22, 2026
@Jrice1317 Jrice1317 marked this pull request as draft April 22, 2026 19:45
@Jrice1317 Jrice1317 changed the title Docker implementation Docker implementation [skip windows] May 6, 2026
@Jrice1317 Jrice1317 force-pushed the docker-implementation branch from 9fa7db8 to 9f20cbe Compare May 6, 2026 16:10
@Jrice1317 Jrice1317 changed the title Docker implementation [skip windows] Docker implementation May 8, 2026
@Jrice1317 Jrice1317 marked this pull request as ready for review May 8, 2026 23:01
Comment thread constructor/_schema.py Outdated
Comment thread constructor/_schema.py Outdated
Comment thread constructor/_schema.py Outdated
Comment thread constructor/_schema.py
Comment thread constructor/_schema.py Outdated
Comment thread constructor/dockerfile_template.tmpl Outdated
Comment thread constructor/main.py Outdated
Comment thread constructor/main.py Outdated
Comment thread tests/test_examples.py Outdated
Comment thread constructor/main.py
@lrandersson

Contributor

@Jrice1317 I'd be happy to do a review once the items from Marco have been resolved. It'll also help me understand the changes better.

@Jrice1317 Jrice1317 requested a review from marcoesters May 14, 2026 15:44
@Jrice1317 Jrice1317 marked this pull request as draft May 14, 2026 22:47
@Jrice1317 Jrice1317 marked this pull request as ready for review May 19, 2026 13:32
Comment thread constructor/docker_build.py Outdated
Comment thread constructor/data/construct.schema.json Outdated
}
],
"default": null,
"description": "If set, builds a docker image using the Dockerfile generated by constructor and saves it as a portable tarball either uncompressed or compressed. ``<name>-<version>-<platform>-<arch>-docker.tar`` will be created in the output docker directory.",

Contributor

Is the format <name>-<version>-<platform>-<arch> true?
From what I could see in the code it looks like:

tag = info.get("docker_tag", f"{info['name'].lower()}:{info['version']}")

Comment thread constructor/_schema.py Outdated
The labels `org.opencontainers.image.title` and `org.opencontainers.image.version`
are set automatically from `name` and `version`.
"""
docker_image: Literal["tar", "gz", "zst"] | None = None

Contributor

Should we add explicit NotImplementedError for those values that are not yet implemented? I see in the PR it says:

(gz, zst) are defined in the schema but not yet implemented.

Contributor Author

Good catch!
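
One way the explicit error suggested above could look is sketched here. The constant names, the function name `check_image_format`, and the messages are hypothetical; only the three format values and the fact that `tar` is the implemented one come from this PR.

```python
SCHEMA_FORMATS = ("tar", "gz", "zst")  # values accepted by the schema
IMPLEMENTED_FORMATS = ("tar",)         # values the build step can produce today


def check_image_format(image_format):
    """Raise for unknown or not-yet-implemented image formats."""
    if image_format is None:
        return  # no image tarball requested
    if image_format not in SCHEMA_FORMATS:
        raise ValueError(f"Unknown Docker image format: {image_format!r}")
    if image_format not in IMPLEMENTED_FORMATS:
        raise NotImplementedError(
            f"Docker image format {image_format!r} is not implemented yet; "
            f"supported: {', '.join(IMPLEMENTED_FORMATS)}"
        )


if __name__ == "__main__":
    check_image_format("tar")
    try:
        check_image_format("zst")
    except NotImplementedError as exc:
        print(exc)
```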

@Jrice1317 Jrice1317 requested a review from lrandersson May 20, 2026 22:16
lrandersson previously approved these changes May 21, 2026

@lrandersson lrandersson left a comment

Contributor

Well done Jaida! In the future I think we should split these large PRs into smaller separate ones (to simplify review). If the comments from Marco are addressed, I approve!

@github-project-automation github-project-automation Bot moved this from 🆕 New to ✅ Approved in 🔎 Review May 21, 2026

@marcoesters marcoesters left a comment

Contributor

Good changes - a few more issues to address.

Comment thread constructor/_schema.py Outdated
Comment thread constructor/_schema.py Outdated
Comment thread constructor/docker_build.py Outdated
Comment thread constructor/_schema.py Outdated
Comment thread constructor/docker_build.py
Comment thread constructor/main.py Outdated
Comment thread examples/docker_image_format/construct.yaml
Comment thread tests/test_examples.py
Comment thread tests/test_examples.py
Comment thread tests/test_examples.py Outdated
Comment thread constructor/dockerfile_template.tmpl Outdated
Comment thread constructor/dockerfile_template.tmpl Outdated
Comment on lines +47 to +55
# Avoid duplicating PREFIX/bin in interactive shell PATH after shell init.
RUN echo 'export PATH=$(sed -e "s,:\?{{ default_prefix }}/bin:,," <<< "${PATH}")' >> ~/.bashrc && \
{%- if has_mamba %}
"$PREFIX/bin/mamba" shell init --all || "$PREFIX/bin/python" -m mamba.mamba init --all
{%- elif has_conda %}
$PREFIX/bin/python -m conda init --all
{%- else %}
echo "No conda executable found, skipping shell init"
{%- endif %}

Contributor

That shell initialization is still not quite right.

  • We don't need the PATH manipulation if no shell has been initialized, so the entire block needs a {%- if has_conda or has_mamba %} wrapper.
  • The --condabin initialization option is missing.
  • I don't see the purpose of the echo command. Presumably, the installer creator skipped conda and mamba on purpose, so that information seems unnecessary.

@Jrice1317 Jrice1317 requested a review from marcoesters June 4, 2026 19:26
Comment thread constructor/dockerfile_template.tmpl Outdated
Comment on lines +53 to +54
"$PREFIX/bin/python" -m conda init --all{%- if has_mamba %} && \
("$PREFIX/bin/mamba" shell init --all || "$PREFIX/bin/python" -m mamba.mamba init --all){% endif %}

@marcoesters marcoesters Jun 4, 2026

Contributor

This is very hard to read but the way I read this, conda will always be initialized. But if the installer only has mamba, that would fail.

Could we put if-conditions into their own lines instead of inlining them?

Contributor Author

I was following what header.sh does as you suggested. See here.

How do you suggest I handle this condition? I was trying to refrain from creating separate RUN commands, but I'm not sure what the third option is here.

Contributor

Hmm, I don't know if I agree with what header.sh is doing. The initialization seems to be predicated on conda being in the installer. That's out of scope for this PR.

I also just noticed that the way this is currently written, we would see an error message on mamba v1 (you'd have to redirect stderr to /dev/null). We have some fairly complex logic here with two nested branches: the condabin branch (which adds condabin, not bin, to PATH) and the classic branch. I think Python may be the appropriate way to describe some of it instead of trying to force it into Jinja.

Contributor

Alternatively, the -c option is available. While this doesn't initialize all shells, I'd be okay with punting that to another PR instead of introducing complex shell initialization template logic.

Contributor Author

I have a few clarifying questions:

  1. For -c — are you suggesting we use -c now and defer the complex init logic (condabin, mamba v1/v2, --all) to a separate PR? Or are you suggesting we defer -c itself to a separate PR?
  2. For condabin — the ENV PATH="${PREFIX}/bin:${PATH}" line in Stage 2 adds bin unconditionally, which seems wrong for condabin. Should I do something like if initialize_conda != 'condabin': <ENV_PATH_command>, or should condabin support be deferred to a follow-up PR entirely?
  3. For moving complex logic to Python — is the mamba version check and stderr redirect what you had in mind for moving to Python, or did you mean moving the entire init block out of the template?
  4. For scope — when you say the init being predicated on conda is out of scope, do you mean I should remove the conda init call entirely for this PR, or condition it solely on whether _has_conda is true rather than following header.sh's assumption that conda is always present?

Contributor

  1. Use -c now, defer the rest to a separate PR.
  2. Your assessment is correct. I don't think we should defer this since using bin for condabin is incorrect whereas using -c is just incomplete.
  3. The init block since this has too many logical branches.
  4. You correctly pointed out that your current solution is mimicking what the SH installer does. While I don't think the logic is correct, I think consistency with the SH installer is more important. Fixing the logic is out of scope - keep it consistent with the SH installer for this PR.

@lrandersson

lrandersson commented Jun 15, 2026

Contributor

@Jrice1317 are the 2 open threads the last remaining tasks in this PR? Also, let me know if you need help with rebasing. There have been many changes on main since the MSI PR was merged, so you have some merge conflicts at the moment. I'd probably squash all of your commits into 1 before attempting to rebase.

First pass

Revert shar changes

Fix logic with building on non-native platforms

Add mamba logic

Require base image to be provided in construct.yaml

Update docker_build.py

Use existing vars in template

Add docker as installer type

Add example construct.yaml for tests

Add test

Add clean command

Call proper docker command in test

Fix pre-commit errors

Add docstring to beginning of file

Use schema vars properly

Use correct image name in test

Update docs

Add news file

Do not generate file extension .docker

Fix typos

Always use sh for docker

Pre-commit fix

Remove docker from os_allowed

Regenerate schema

Add docker_build to schema

Make whitespace adjustments

Fix logic regarding base image requirement

Make image portable

Update wording

Update docs

Revert to using one path

Revert back to docker load

Add output

Update logic

Apply suggestions from code review

Co-authored-by: Marco Esters <mesters@anaconda.com>

Change wording in schema

Be more descriptive

Be more generalized

Use multiline block

Refine logic

Move check to main if docker installed for building image

Change docker_build to docker_image

Expand tests

Pre-commit fixes

Change logic

Update docs

Fix whitespace in template

Remove redundant logic and improve wording

Fix test

Update docs

Make tests linux only

Remove restriction on docker_tag and apply code review suggestions

Add cross-build support to tests

Update the docs

Fix typo

Add output for debugging

Add docker buildx check to utils

Apply code review suggestions and update docs

Apply code review suggestions

Remove code

Update docs

Remove unnecessary asserts

Revert broken tests

Update test to verify sh installer is cleaned up

Fix pre-commit

Apply code review suggestions and update docs

Update tests to align with updated schema

Apply code review suggestions

Fix typo

Add logic for condabin

Move logic from template to script
@Jrice1317 Jrice1317 force-pushed the docker-implementation branch from 3f08ffc to 0ba92bc Compare June 16, 2026 22:41
Comment thread constructor/data/construct.schema.json Outdated

@lrandersson lrandersson left a comment

Contributor

Well done Jaida! I'm approving because I saw you fixed the conflict. Let Marco have the final approval before merging since he has done the heavy-lifting in this review.

@marcoesters marcoesters left a comment

Contributor

Some style/efficiency suggestions and one error left to fix.

Comment thread tests/test_examples.py
Comment on lines +2008 to +2014
else:
    result = subprocess.run(
        ["docker", "run", "--rm", image_name, "/bin/bash", "-c", "echo 'Hello, World!'"],
        capture_output=True,
        text=True,
    )
    assert "Hello, World!" in result.stdout

Contributor

Suggested change
- else:
-     result = subprocess.run(
-         ["docker", "run", "--rm", image_name, "/bin/bash", "-c", "echo 'Hello, World!'"],
-         capture_output=True,
-         text=True,
-     )
-     assert "Hello, World!" in result.stdout
+ result = subprocess.run(
+     ["docker", "run", "--rm", image_name, "/bin/bash", "-c", "echo 'Hello, World!'"],
+     capture_output=True,
+     text=True,
+ )
+ assert "Hello, World!" in result.stdout

Since this should work no matter the initialization level, we can remove the else.

Comment thread tests/test_examples.py
Comment on lines +1968 to +1971
if init == "mamba_v1":
    config["specs"] += ["mamba <2.0.0"]
else:
    config["specs"] += ["mamba"]

Contributor

Suggested change
- if init == "mamba_v1":
-     config["specs"] += ["mamba <2.0.0"]
- else:
-     config["specs"] += ["mamba"]
+ if init == "mamba_v1":
+     config["specs"].append("mamba <2.0.0")
+ else:
+     config["specs"].append("mamba >=2.0.0")

I find append more semantic. Either way, let's make sure we get mamba v2 by pinning it.

{%- if initialize_conda == 'condabin' %}
ENV PATH="${PREFIX}/condabin:${PATH}"
{%- elif initialize_conda %}
ENV PATH="${PREFIX}/condabin:${PREFIX}/bin:${PATH}"

Contributor

Suggested change
- ENV PATH="${PREFIX}/condabin:${PREFIX}/bin:${PATH}"
+ ENV PATH="${PREFIX}/bin:${PREFIX}/condabin:${PATH}"

bin comes before condabin

Comment on lines +28 to +50
def _build_init_run_block(info):
    from .conda_interface import MatchSpec

    specs = {MatchSpec(spec).name for spec in info.get("specs", ())}
    has_mamba = "mamba" in specs
    has_conda = "conda" in specs
    initialize_conda = info.get("initialize_conda")

    if not (has_conda or has_mamba) or not initialize_conda or initialize_conda == "condabin":
        return ""
    run = 'RUN "${PREFIX}/bin/conda" init --all'

    if has_mamba:
        mamba_version = None
        for record in info.get("_all_pkg_records", ()):
            if record.name == "mamba":
                mamba_version = record.version
                break
        if check_version(mamba_version, min_version="2.0.0"):
            run += ' && "${PREFIX}/bin/mamba" shell init --shell bash'
        else:
            run += ' && "${PREFIX}/bin/python" -m mamba.mamba init --all'
    return run

@marcoesters marcoesters Jun 18, 2026

Contributor

Suggested change
- def _build_init_run_block(info):
-     from .conda_interface import MatchSpec
-     specs = {MatchSpec(spec).name for spec in info.get("specs", ())}
-     has_mamba = "mamba" in specs
-     has_conda = "conda" in specs
-     initialize_conda = info.get("initialize_conda")
-     if not (has_conda or has_mamba) or not initialize_conda or initialize_conda == "condabin":
-         return ""
-     run = 'RUN "${PREFIX}/bin/conda" init --all'
-     if has_mamba:
-         mamba_version = None
-         for record in info.get("_all_pkg_records", ()):
-             if record.name == "mamba":
-                 mamba_version = record.version
-                 break
-         if check_version(mamba_version, min_version="2.0.0"):
-             run += ' && "${PREFIX}/bin/mamba" shell init --shell bash'
-         else:
-             run += ' && "${PREFIX}/bin/python" -m mamba.mamba init --all'
-     return run
+ def _build_init_run_block(info):
+     if not info.get("_has_conda"):
+         return ""
+     initialize_conda = info.get("initialize_conda")
+     if not initialize_conda or initialize_conda == "condabin":
+         return ""
+     run = 'RUN "${PREFIX}/bin/conda" init --all'
+     for record in info.get("_all_pkg_records", ()):
+         if not record.name == "mamba":
+             continue
+         if check_version(record.version, min_version="2.0.0"):
+             run += ' && "${PREFIX}/bin/mamba" shell init --shell bash'
+         else:
+             run += ' && "${PREFIX}/bin/python" -m mamba.mamba init --all'
+         break
+     return run

This avoids an unnecessary import and is more efficient by returning early.

@Jrice1317 Jrice1317 Jun 18, 2026

Contributor Author

Ah, I see, but why separate if has_conda and initialize_conda when they both return "" and we're focusing on the init?

@Jrice1317 Jrice1317 Jun 18, 2026

Contributor Author

Should it be removed since if there is no conda, the init will fail early?

"it" being the has_conda check

Contributor

That should have been if not info.get("_has_conda"), sorry. I fixed it in my suggestion.

I chose the separation to just return early. I find multiple returns that use different pieces of information easier to read than a more complex if condition.


COPY --from=builder ${PREFIX} ${PREFIX}
{%- if register_envs %}
COPY --from=builder /root/.conda /root/.conda

Contributor

Suggested change
- COPY --from=builder /root/.conda /root/.conda
+ COPY --from=builder ${HOME}/.conda ${HOME}/.conda

Can we use $HOME here in case the base image uses a different default user?


Labels

cla-signed [bot] added once the contributor has signed the CLA

Projects

Status: ✅ Approved

4 participants