
Fix: FLUX.2 encode_prompt dtype and add bitsandbytes int8+bfloat16 warning #13798

Open
CoderSATTY wants to merge 1 commit into huggingface:main from CoderSATTY:fix/flux2-encode-prompt-bnb-int8-warning

Conversation

CoderSATTY commented May 23, 2026

What does this PR do?

Fixes #13772

Two related fixes for the Flux2Pipeline:

  1. Added a defensive warning for bitsandbytes 8-bit + bfloat16 quantization.
    With this combination, bitsandbytes downcasts bfloat16 inputs to float16 inside its 8-bit matmul. The resulting precision loss accumulates across layers and corrupts images in FLUX models. The warning is raised at pipeline initialization and suggests NF4 4-bit quantization instead.

  2. Fixed encode_prompt() for pre-computed embedding workflows.

    • Skips prompt string formatting if embeddings are already provided.
    • Automatically casts pre-computed embeddings to the dtype and device the pipeline expects. This prevents the silent precision mismatches that caused corrupted or noisy outputs when embeddings were passed between different pipeline instances.
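
For reviewers, the intended `encode_prompt()` behaviour can be sketched roughly as below. This is a minimal illustration and not the code in this PR: `resolve_prompt_embeds`, `encode_fn`, and the `SupportsTo` protocol are hypothetical names, and the sketch only assumes the embeddings expose a tensor-style `.to(dtype=..., device=...)` method, as `torch.Tensor` does.

```python
from typing import Any, Callable, Optional, Protocol


class SupportsTo(Protocol):
    """Anything with a tensor-style .to(dtype=..., device=...), e.g. torch.Tensor."""

    def to(self, *, dtype: Any, device: Any) -> "SupportsTo": ...


def resolve_prompt_embeds(
    prompt: Optional[str],
    prompt_embeds: Optional[SupportsTo],
    encode_fn: Callable[[str], SupportsTo],  # hypothetical: formats and encodes a prompt
    dtype: Any,
    device: Any,
) -> SupportsTo:
    """Hypothetical helper: return embeddings in the pipeline's dtype and device."""
    if prompt_embeds is None:
        if prompt is None:
            raise ValueError("Provide either `prompt` or `prompt_embeds`.")
        # Format and encode the prompt string only when no embeddings were supplied.
        prompt_embeds = encode_fn(prompt)
    # Cast unconditionally so embeddings produced by another pipeline instance
    # (possibly in a different precision or on another device) match this one.
    return prompt_embeds.to(dtype=dtype, device=device)
```

With real tensors, a call would look like `resolve_prompt_embeds(None, saved_embeds, encode_fn, torch.bfloat16, "cuda")`; the prompt-formatting path is never entered in that case.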

Testing

  • Tested on NVIDIA A100-80GB via Modal.
  • Verified that the pipeline correctly warns when loaded with BitsAndBytesConfig(load_in_8bit=True) and torch.bfloat16.
  • Verified that single-pipeline and two-phase workflows produce correct output and emit no warning when using NF4 quantization.
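
The warning checks above could be reproduced without a GPU along these lines. This is a hedged sketch, not the check added in this PR: `warn_if_int8_bf16` is a hypothetical function, the warning text is illustrative, and `SimpleNamespace` stands in for `BitsAndBytesConfig` (the sketch relies only on its `load_in_8bit` attribute).

```python
import warnings
from types import SimpleNamespace
from typing import Any


def warn_if_int8_bf16(quant_config: Any, dtype: Any) -> bool:
    """Hypothetical check: warn on 8-bit bitsandbytes loading paired with bfloat16.

    Returns True if a warning was emitted.
    """
    # Duck-typed lookup; a missing attribute is treated as "not 8-bit".
    is_int8 = bool(getattr(quant_config, "load_in_8bit", False))
    # str(torch.bfloat16) == "torch.bfloat16", so real torch dtypes work here too.
    is_bf16 = str(dtype) == "torch.bfloat16"
    if is_int8 and is_bf16:
        warnings.warn(
            "bitsandbytes 8-bit quantization with bfloat16 may lose precision "
            "and corrupt FLUX outputs; consider NF4 4-bit quantization instead.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


# Stand-ins for the two configurations exercised in testing.
int8_config = SimpleNamespace(load_in_8bit=True)
nf4_config = SimpleNamespace(load_in_8bit=False, load_in_4bit=True)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_if_int8_bf16(int8_config, "torch.bfloat16")  # expected to warn
    warn_if_int8_bf16(nf4_config, "torch.bfloat16")   # expected to stay silent

for w in caught:
    print(w.category.__name__, w.message)
```

Comparing on `str(dtype)` keeps the sketch free of a torch import; inside the pipeline a direct `dtype == torch.bfloat16` comparison would be the more natural choice.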

Before submitting

Who can review?

@yiyixuxu @sayakpaul

Two related fixes for Flux2Pipeline:
1. Add a warning for bitsandbytes 8-bit + bfloat16 quantization.
   This combination causes precision loss and corrupted images in
   FLUX models. The warning alerts users immediately at pipeline
   initialization and suggests using NF4 4-bit quantization instead.
2. Fix encode_prompt() for pre-computed embedding workflows.
   - Skips prompt string formatting if embeddings are already provided.
   - Automatically casts pre-computed embeddings to the exact precision
     (dtype) expected by the pipeline. This prevents silent image
     corruption when loading embeddings from a different pipeline.
github-actions bot added the size/S (PR with diff < 50 LOC) and pipelines labels May 23, 2026

Labels

pipelines, size/S (PR with diff < 50 LOC)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bad image output for Flux.2-dev, rocm, quantization and separate prompt encoding sequence

1 participant