Is there an existing issue ? / 是否已有相关的 issue ?
Describe the bug / 描述这个 bug
The official GGUF files hosted at openbmb/MiniCPM5-1B-GGUF produce broken tool-calling output. The XML tags (<function>, </function>, <param>, </param>) that the model generates are stripped from the decoded text, leaving garbled output such as:
name="get_weather"> name="city">Tokyo
To Reproduce / 如何复现
- Download any GGUF from openbmb/MiniCPM5-1B-GGUF
- Load in llama-server or Ollama
- Send a chat completion with tools:
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "MiniCPM5-1B",
"messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
"tools": [{"type":"function","function":{"name":"get_weather","description":"Get weather for a city","parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}}]
}'
Result: the XML tags are missing from the decoded output. Requesting logprobs: true confirms that the token IDs (18, 20, 21, 19) are generated but have empty text, which means they were suppressed during decoding.
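The logprobs check can be automated with a small helper. This is only a sketch: it assumes the response follows the OpenAI-style layout (choices[].logprobs.content[].token) and that each entry also carries an `id` field with the token ID. The `id` field and the function name `find_empty_text_tokens` are assumptions for illustration.

```python
from __future__ import annotations

from typing import Any


def find_empty_text_tokens(response: dict[str, Any]) -> list[int | None]:
    """Return the IDs of generated tokens whose decoded text is empty.

    Assumes an OpenAI-style chat completion response requested with
    logprobs enabled. The per-token "id" field is an assumption; if it
    is absent, None is recorded for that token instead.
    """
    suppressed: list[int | None] = []
    for choice in response.get("choices", []):
        logprobs = choice.get("logprobs") or {}
        for entry in logprobs.get("content") or []:
            # A token that was generated but decodes to "" was hidden
            # by the detokenizer rather than skipped by the model.
            if entry.get("token", "") == "":
                suppressed.append(entry.get("id"))
    return suppressed
```

Feed it the parsed JSON body of the request above with "logprobs": true added. A non-empty result means tokens were generated but hidden at decode time.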
Expected behavior / 期望的结果
The <function>, </function>, <param>, and </param> tokens should appear in the model's output during tool calling, producing valid XML like:
<function name="get_weather"><param name="city">Tokyo</param></function>
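To check whether a given output matches the expected shape, a client could try parsing it as XML. The following is a minimal sketch using Python's standard library. The helper name `parse_tool_call` is hypothetical, and it assumes a single <function> element per call, as in the example above.

```python
import xml.etree.ElementTree as ET


def parse_tool_call(text: str):
    """Parse one <function> element into (name, arguments).

    Returns None if the text is not well-formed XML or is not a
    <function> element with a name attribute.
    """
    try:
        root = ET.fromstring(text.strip())
    except ET.ParseError:
        # Output with stripped tags fails here.
        return None
    if root.tag != "function" or "name" not in root.attrib:
        return None
    # Collect each <param name="...">value</param> child.
    args = {p.attrib.get("name"): (p.text or "") for p in root.findall("param")}
    return root.attrib["name"], args
```

With the tags intact this returns the function name and its parameters. With the stripped output shown earlier it returns None.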
Screenshots / 截图
No response
Environment / 环境
- OS: Windows
- llama.cpp version: 9631 (6e14286ed)
- Model: MiniCPM5-1B-GGUF
Additional context / 其他信息
Root Cause
The GGUF converter (convert_hf_to_gguf.py) reads tokenizer_config.json where token IDs 18–21 are marked "special": true. The converter assigns negative scores to special tokens, which signals llama.cpp to suppress them from output. These are content tokens, not control tokens like <s> or </s>, so they should not be suppressed.
Fix
The GGUF needs to be re-converted with "special": false for token IDs 18–21 in tokenizer_config.json. I verified this fix locally: tool calling works correctly after re-conversion. The GGUF files hosted on Hugging Face need to be updated.
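The tokenizer_config.json edit described above could be scripted roughly as follows. This is a sketch under assumptions: it presumes the flags live under an `added_tokens_decoder` map keyed by the token ID as a string, which is the common Hugging Face layout but has not been checked against this model's file. The function names are hypothetical.

```python
import json
from pathlib import Path


def unmark_special(config: dict, token_ids=(18, 19, 20, 21)) -> list:
    """Set "special" to False for the given token IDs.

    Assumes config["added_tokens_decoder"] maps stringified token IDs
    to entries with "content" and "special" keys. Returns the content
    strings of the entries that were changed.
    """
    decoder = config.get("added_tokens_decoder", {})
    changed = []
    for token_id in token_ids:
        entry = decoder.get(str(token_id))
        if entry is not None and entry.get("special"):
            entry["special"] = False
            changed.append(entry.get("content", ""))
    return changed


def patch_file(path: str) -> list:
    """Apply unmark_special to a tokenizer_config.json file in place."""
    p = Path(path)
    config = json.loads(p.read_text(encoding="utf-8"))
    changed = unmark_special(config)
    p.write_text(json.dumps(config, ensure_ascii=False, indent=2), encoding="utf-8")
    return changed
```

After patching, the model still has to be re-converted with convert_hf_to_gguf.py for the change to reach the GGUF file.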