Bump llama.cpp to 8452824 (b9739), release v0.8.27 #59

Merged
nyo16 merged 1 commit into master from bump-llama-cpp-8452824
Jun 20, 2026

nyo16 commented Jun 20, 2026

Summary

Bumps the vendor/llama.cpp submodule from 74ade5274 to 845282461
(67 commits, tag b9739) and cuts release v0.8.27.

No NIF changes were required. Of the headers the binding compiles against:

  • include/llama.h, ggml/include/ggml.h, ggml/include/ggml-backend.h,
    common/chat.h, common/json-schema-to-grammar.h, and common/sampling.h
    are unchanged.
  • common/speculative.h only gains two optional declarations —
    common_speculative_get_state / common_speculative_set_state (stash/restore
    internal speculative state) — which the binding does not call.
  • In common/common.h, common_params_model swaps its name field for a
    get_name() method, the deprecated
    webui / webui_mcp_proxy / webui_config_json fields are dropped from
    common_params, a models_preset_hf field and an fs_open_ifstream helper
    are added, and common_prompt_checkpoint gains a data_spec blob. The NIF
    constructs only common_params_speculative (setting types and draft.*),
    never common_params or common_params_model, and the sole
    common_params_speculative change is internal need_n_rs_seq() logic that
    now also reserves a recurrent-state seq for EAGLE3 drafts.
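
The header audit above can be reproduced with a path-limited diff inside the submodule. This is a minimal sketch, assuming it is run from the repository root with the submodule checked out and both commits fetched; the OLD, NEW, and HEADERS variable names are illustrative and not part of the project.

```shell
#!/bin/sh
# Sketch: list which binding-relevant headers changed between the two
# submodule commits. Assumes the current directory is the repository root;
# the variable names are illustrative.
set -eu

OLD=74ade5274
NEW=845282461
HEADERS="include/llama.h ggml/include/ggml.h ggml/include/ggml-backend.h
common/chat.h common/json-schema-to-grammar.h common/sampling.h
common/speculative.h common/common.h"

if [ -e vendor/llama.cpp/.git ]; then
  # --stat prints one line per changed file; unchanged headers print nothing.
  # $HEADERS is left unquoted on purpose so it splits into separate paths.
  git -C vendor/llama.cpp diff --stat "$OLD" "$NEW" -- $HEADERS
else
  echo "vendor/llama.cpp is not checked out; skipping diff"
fi
```

Dropping --stat shows the full diff for any header that appears in the output, which is where changes such as the new speculative-state declarations would be visible.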

Notable upstream changes

  • MTP/speculative: EAGLE3 support for Qwen3.5 & 3.6 (#24593); EAGLE3
    long-prompt segfault fix (#24707).
  • model/convert: optional GLM-DSA indexer tensors (#24770); more consistent
    rope_parameters handling (#24833); skip main_gpu validation when no
    devices are available (#23405).
  • metal: BF16 concat-kernel support check (#24747), f16/bf16 concat
    operator (#24724), rope_back operator (#24725).
  • ggml: sync + bump to 0.15.2; AMX optimization (#24806).
  • mtmd: Windows UTF-8 fix (#24779), InternVL/mtmd-cli batching (#24775,
    #24778), preprocessor refactor (#24736).
  • server: CORS-proxy auth-header fix (#24373), router fixes (#24728,
    #24760, #24739, #23976), invalid-grammar HTTP 400 (#24154).
  • Plus SYCL, Vulkan, OpenCL, hexagon, OpenVINO, webgpu, webui, and CI/docker
    updates. See CHANGELOG.md for the full list.

Verification

Against a freshly rebuilt NIF (Metal, Apple M4 Max):

  • mix test: 158 passed, 4 skipped
  • ✅ Smoke suite — 7/7 passed (generation, streaming, chat templates,
    JSON-schema grammar, raw GBNF, and embeddings — embedding paths fully
    exercised with a Qwen3-Embedding-0.6B model)
  • mix format --check-formatted — clean
  • mix dialyzer — 0 errors

Notes

  • checksum.exs is intentionally not updated — CI regenerates it against
    the precompiled release artifacts after the v0.8.27 tag is pushed.
  • After merge, tag v0.8.27 on master to trigger the precompile + checksum
    workflow.
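
The tagging step in the last note could look like the following. This is a sketch only: the remote name origin and the use of an annotated tag are assumptions, and the run helper and DRY_RUN switch are hypothetical conveniences that print the commands by default instead of executing them.

```shell
#!/bin/sh
# Sketch: tag the release on master after the merge.
# The remote name and the annotated-tag choice are assumptions.
set -eu

VERSION=v0.8.27
REMOTE=origin              # assumed remote name
DRY_RUN="${DRY_RUN:-1}"    # 1 = print commands only, 0 = execute them

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run git checkout master
run git pull --ff-only "$REMOTE" master
run git tag -a "$VERSION" -m "Release $VERSION"
run git push "$REMOTE" "$VERSION"
```

Setting DRY_RUN=0 executes the commands; pushing the tag is the step that triggers the precompile and checksum workflow described above.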

Update vendor/llama.cpp from 74ade5274 to 845282461 (67 commits, tag
b9739). No NIF changes required: all binding-relevant headers are
unchanged except common/speculative.h (two new optional get/set_state
declarations the NIF does not call) and common/common.h (changes to
common_params / common_params_model, which the NIF never constructs —
it builds only common_params_speculative, whose sole change is internal
need_n_rs_seq() logic now also covering EAGLE3 drafts).

Verified against a freshly built NIF: 158 tests passed, 4 skipped, all 7
end-to-end smoke tests pass (generation, streaming, chat templates,
JSON-schema grammar, raw GBNF, and embeddings), mix format clean,
Dialyzer 0 errors.

nyo16 merged commit cf82b74 into master on Jun 20, 2026
4 checks passed
nyo16 deleted the bump-llama-cpp-8452824 branch on June 20, 2026 22:21