Add Ideogram support and improve BF16 dequantization handling by molbal · Pull Request #459 · city96/ComfyUI-GGUF

molbal · 2026-06-09T20:49:51Z

Summary

This adds support for Ideogram GGUF models.

What Changed

Added ideogram to the supported image GGUF architectures.
Added Ideogram model detection to the converter.
Added GGUF dtype handling needed by Ideogram inference.
Fixed an Ideogram inference failure in which a packed GGUF weight dtype caused a raw byte (uint8) tensor to reach the CUDA linear op.
Adjusted BF16 GGUF loading so Ideogram can start inference faster.
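
For readers unfamiliar with why BF16 tensors are cheap to load: a bfloat16 value is the upper 16 bits of an IEEE-754 float32, so widening it is a bit shift rather than a real dequantization step. The following is a minimal standard-library sketch of that idea, not the code in this PR; the function name `bf16_bytes_to_floats` is hypothetical.

```python
import struct


def bf16_bytes_to_floats(raw: bytes) -> list[float]:
    """Decode little-endian bfloat16 values into Python floats (illustrative only)."""
    if len(raw) % 2 != 0:
        raise ValueError("BF16 data must contain an even number of bytes")
    count = len(raw) // 2
    halves = struct.unpack(f"<{count}H", raw)
    # A bfloat16 holds the top 16 bits of a float32, so shifting left by 16
    # and reinterpreting the result as float32 recovers the value exactly.
    return [struct.unpack("<f", struct.pack("<I", h << 16))[0] for h in halves]


# 0x3F80 is 1.0 and 0xC000 is -2.0 in bfloat16.
print(bf16_bytes_to_floats(b"\x80\x3f\x00\xc0"))
```

On hardware with native BF16 support, a tensor library can usually reinterpret the buffer in place instead of widening it, which is presumably why BF16 loading can be made fast.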

Notes

Tested on Windows 11, Python version: 3.12.11 (main, Jul 23 2025, 00:32:20) [MSC v.1944 64 bit (AMD64)]
[INFO] Total VRAM 8192 MB, total RAM 48394 MB
[INFO] pytorch version: 2.12.0+cu130
[INFO] Set vram state to: LOW_VRAM
[INFO] Device: cuda:0 NVIDIA GeForce RTX 3080 Laptop GPU

Tested with Q4_0 gguf from https://huggingface.co/leejet/ideogram-4-GGUF

Other GGUF quant types still use the existing dequant paths.
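
As background on the Q4_0 type used for testing, here is a rough pure-Python sketch of how one Q4_0 block is unpacked, based on the ggml block layout (one fp16 scale followed by 16 bytes holding 32 four-bit values). It is not the dequant path used by this repository, and `dequantize_q4_0` is a hypothetical name chosen for illustration.

```python
import struct

QK4_0 = 32                     # weights per Q4_0 block
BLOCK_BYTES = 2 + QK4_0 // 2   # fp16 scale + 16 bytes of packed nibbles


def dequantize_q4_0(raw: bytes) -> list[float]:
    """Unpack Q4_0 blocks into floats (illustrative, not optimized)."""
    if len(raw) % BLOCK_BYTES != 0:
        raise ValueError("Q4_0 data must be a whole number of 18-byte blocks")
    out: list[float] = []
    for start in range(0, len(raw), BLOCK_BYTES):
        (scale,) = struct.unpack_from("<e", raw, start)  # little-endian fp16
        packed = raw[start + 2 : start + BLOCK_BYTES]
        # Low nibbles hold weights 0..15, high nibbles hold weights 16..31.
        # Each 4-bit value is stored with an offset of 8.
        low = [(b & 0x0F) - 8 for b in packed]
        high = [(b >> 4) - 8 for b in packed]
        out.extend(q * scale for q in low + high)
    return out


block = struct.pack("<e", 0.5) + bytes(range(16))
values = dequantize_q4_0(block)
print(len(values), values[:4])
```

If packed bytes like these are handed to a linear layer without this unpacking step, the layer sees a uint8 tensor instead of floating-point weights.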

yu234567 · 2026-06-10T07:21:31Z

Great! I successfully ran it, but my device doesn't support bf16; it gets converted to fp32 computation, which makes it very slow. Can you make it run on my device in fp16?

[INFO] got prompt
[INFO] Using xformers attention in VAE
[INFO] Using xformers attention in VAE
[INFO] VAE load device: cuda:0, offload device: cpu, dtype: torch.float32
[INFO] Found quantization metadata version 1
[INFO] Using MixedPrecisionOps for text encoder
[INFO] CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
[INFO] Requested to load Ideogram4TEModel_
[INFO] Model Ideogram4TEModel_ prepared for dynamic VRAM loading. 4319MB Staged. 0 patches attached. Force pre-loaded 144 weights: 594 KB.
[WARNING] Warning: This gguf model file is loaded in compatibility mode 'sd.cpp' [arch:ideogram]
[INFO] gguf qtypes: BF16 (254), Q4_0 (204)
[INFO] model weight dtype torch.bfloat16, manual cast: torch.float32
[INFO] model_type FLOW
[INFO] Requested to load Ideogram4
[INFO] loaded completely; 7997.15 MB usable, 5506.41 MB loaded, full load: True
8%|████ | 1/12 [00:16<03:03, 16.65s/it, Model Initialization complete! ]
[INFO] Interrupting prompt 2d4b1e4f-b3b1-4a31-9d36-604af4910de5
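
The log line `model weight dtype torch.bfloat16, manual cast: torch.float32` shows the fallback that makes this run slow. One way to express the requested behaviour (prefer fp16 over fp32 when bf16 is unavailable) is sketched below. This is an assumption about the policy, not the change made in this PR; `pick_compute_dtype` and its parameters are hypothetical, and dtypes are plain strings so the sketch runs without PyTorch.

```python
def pick_compute_dtype(
    weight_dtype: str,
    supports_bf16: bool,
    supports_fp16: bool = True,
) -> str:
    """Choose a compute dtype for the given weight dtype (hypothetical policy)."""
    if weight_dtype != "bfloat16":
        # Only the bf16 case is handled here; other dtypes pass through.
        return weight_dtype
    if supports_bf16:
        return "bfloat16"
    if supports_fp16:
        # fp16 is faster than fp32 on most GPUs, but its exponent range is
        # narrower than bf16, so large activations can overflow.
        return "float16"
    return "float32"


# In PyTorch the capability flag could come from torch.cuda.is_bf16_supported().
print(pick_compute_dtype("bfloat16", supports_bf16=False))
```

The trade-off is numerical: fp16 has more mantissa bits than bf16 but a much smaller range, so a model trained in bf16 may need care when cast to fp16.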

molbal · 2026-06-11T12:00:38Z

Hi @yu234567 - try now. It should work better now, can you verify please?

yu234567 · 2026-06-13T08:11:28Z

> Hi @yu234567 - try now. It should work better now, can you verify please?

Thank you so much, it worked!

.

dffa75b

m8rr mentioned this pull request Jun 14, 2026

Add Ideogram 4 architecture support #460

Open

molbal added 3 commits June 14, 2026 20:21

Added debug logging for quantization stats and improved tensor dequantization handling.

2533af0

Added GGUF loader nodes and removed support for K quant tensors to improve dequantization efficiency.

a5d4ce5

Removed Ideogram 4 GGUF section and updated README to clarify pre-quantized model support and LoRA compatibility.

904971e

molbal force-pushed the main branch from e28c638 to 904971e Compare June 15, 2026 20:13

molbal added 2 commits June 15, 2026 22:13

Update .gitignore

ff95bbc

Remove fp8 files from repository

568382d

molbal wants to merge 6 commits into city96:main from molbal:main
