Add Ideogram support and improve BF16 dequantization handling #459
molbal wants to merge 6 commits into
Great! I successfully ran it, but my device doesn't support bf16; it gets converted to fp32 computation, which makes it very slow. Can you make it run on my device in fp16?
[INFO] got prompt
Hi @yu234567, try now. It should work better; can you verify, please?
Thank you so much, it worked! |
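For readers following the thread, the fallback being asked for can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in this PR: `pick_compute_dtype` and its two flags are hypothetical names, and the caller is assumed to have already probed the device.

```python
def pick_compute_dtype(bf16_ok: bool, fp16_ok: bool) -> str:
    """Hypothetical helper: choose the dtype name to compute in.

    Prefers bf16, falls back to fp16 when the device lacks bf16,
    and only uses fp32 when neither half-precision type is usable.
    """
    if bf16_ok:
        return "bfloat16"
    if fp16_ok:
        return "float16"
    return "float32"


# Example: a device without bf16 support but with fp16 support.
print(pick_compute_dtype(bf16_ok=False, fp16_ok=True))
```

With PyTorch, a probe such as `torch.cuda.is_bf16_supported()` could supply the first flag. Note that fp16 has a narrower exponent range than bf16, so large values can overflow after such a fallback.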
…tization handling.
…prove dequantization efficiency.
…ntized model support and LoRA compatibility.
Summary
This adds support for Ideogram GGUF models.
What Changed
Added ideogram to the supported image GGUF architectures.
Notes
Tested on Windows 11
Python version: 3.12.11 (main, Jul 23 2025, 00:32:20) [MSC v.1944 64 bit (AMD64)]
[INFO] Total VRAM 8192 MB, total RAM 48394 MB
[INFO] pytorch version: 2.12.0+cu130
[INFO] Set vram state to: LOW_VRAM
[INFO] Device: cuda:0 NVIDIA GeForce RTX 3080 Laptop GPU
Tested with Q4_0 gguf from https://huggingface.co/leejet/ideogram-4-GGUF
Other GGUF quant types still use the existing dequant paths.
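For context on what BF16 dequantization involves, here is a minimal stdlib-only sketch. It is not the implementation in this PR: the function names are hypothetical, and it assumes the tensor bytes are little-endian. A BF16 value is the upper 16 bits of an IEEE 754 float32, so decoding amounts to shifting the 16-bit word left by 16 and reinterpreting the result as float32.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Decode one BF16 word (hypothetical helper, not from this PR)."""
    # Place the 16 BF16 bits in the high half of a 32-bit word,
    # then reinterpret those 32 bits as a float32.
    word = (bits & 0xFFFF) << 16
    return struct.unpack("<f", struct.pack("<I", word))[0]


def bf16_bytes_to_floats(raw: bytes) -> list:
    """Decode a little-endian BF16 byte buffer into Python floats."""
    count = len(raw) // 2  # two bytes per BF16 value
    words = struct.unpack(f"<{count}H", raw[: count * 2])
    return [bf16_bits_to_float(w) for w in words]


# 0x3F80 encodes 1.0 and 0xC000 encodes -2.0 in BF16.
print(bf16_bytes_to_floats(bytes([0x80, 0x3F, 0x00, 0xC0])))
```

A real dequant path would do the same shift on whole tensors at once rather than value by value; the loop here only keeps the bit layout easy to follow.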