[Bug]: GGUF files on HuggingFace strip tool call XML tags from output, breaking tool calling #361

### Is there an existing issue?

- [x] I have searched, and there is no existing issue.

### Describe the bug

The official GGUF files hosted at [openbmb/MiniCPM5-1B-GGUF](https://huggingface.co/openbmb/MiniCPM5-1B-GGUF) produce broken tool calling output. The XML tags (`<function>`, `</function>`, `<param>`, `</param>`) that the model generates are stripped from decoded text, leaving garbled output like:

```
 name="get_weather"> name="city">Tokyo
```


### To Reproduce


1. Download any GGUF from [openbmb/MiniCPM5-1B-GGUF](https://huggingface.co/openbmb/MiniCPM5-1B-GGUF)
2. Load it in llama-server or Ollama
3. Send a chat completion with tools:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniCPM5-1B",
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}],
    "tools": [{"type":"function","function":{"name":"get_weather","description":"Get weather for a city","parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}}}]
  }'
```

**Result**: The XML tags are missing from the decoded output. Requesting `logprobs: true` confirms that token IDs 18, 20, 21, and 19 are generated but come back with empty text, so they are being suppressed during decoding rather than never produced.
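
The logprobs check can be scripted. The sketch below assumes the server replies in the OpenAI-style layout, with a `choices[0].logprobs.content` list that carries a `token` string per generated token. The helper names (`build_payload`, `empty_token_positions`, `post_chat`) are illustrative and not part of llama.cpp or Ollama.

```python
import json
import urllib.request

# Same tool definition as in the curl request above.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def build_payload(prompt):
    """Build the chat completion request with logprobs enabled."""
    return {
        "model": "MiniCPM5-1B",
        "messages": [{"role": "user", "content": prompt}],
        "tools": TOOLS,
        "logprobs": True,
    }


def empty_token_positions(response):
    """Return positions of generated tokens whose decoded text is empty."""
    choice = response["choices"][0]
    # Assumed layout: logprobs may be absent or null when not requested.
    entries = (choice.get("logprobs") or {}).get("content") or []
    return [i for i, entry in enumerate(entries) if entry.get("token", "") == ""]


def post_chat(payload, url="http://localhost:8080/v1/chat/completions"):
    """POST the payload to a running server and return the parsed JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Usage against a running server:
#   reply = post_chat(build_payload("What is the weather in Tokyo?"))
#   print(empty_token_positions(reply))
```

A non-empty result means tokens were sampled but decoded to empty strings, which matches the symptom described here.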



### Expected behavior


The `<function>`, `</function>`, `<param>`, and `</param>` tokens should appear in the model's output during tool calling, producing valid XML like:

```xml
<function name="get_weather"><param name="city">Tokyo</param></function>
```


### Screenshots

_No response_

### Environment

```shell
- OS: Windows
- llama.cpp version: 9631 (6e14286ed)
- Model: MiniCPM5-1B-GGUF
```

### Additional context


## Root Cause

The GGUF converter (`convert_hf_to_gguf.py`) reads `tokenizer_config.json`, where token IDs 18–21 are marked `"special": true`. The converter assigns negative scores to special tokens, which signals llama.cpp to suppress them from output. These four tokens are **content tokens** that carry the tool call markup, not control tokens like `<s>` or `</s>`, so they should not be suppressed.
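
To check the flags before converting, something like the following can be run against the downloaded Hugging Face model directory. It is a minimal sketch that assumes the flags live under the `added_tokens_decoder` key of `tokenizer_config.json` (the usual Hugging Face layout, keyed by token ID as a string); `special_flags` and `load_config` are hypothetical helper names.

```python
import json
from pathlib import Path

# Token IDs reported in this issue.
SUSPECT_IDS = (18, 19, 20, 21)


def special_flags(config, token_ids=SUSPECT_IDS):
    """Map each token ID to (content, special) as listed in added_tokens_decoder."""
    # Assumption: per-token flags are stored under "added_tokens_decoder".
    decoder = config.get("added_tokens_decoder", {})
    flags = {}
    for token_id in token_ids:
        entry = decoder.get(str(token_id))
        if entry is not None:
            flags[token_id] = (entry.get("content"), bool(entry.get("special", False)))
    return flags


def load_config(model_dir):
    """Read tokenizer_config.json from a local model directory."""
    path = Path(model_dir) / "tokenizer_config.json"
    return json.loads(path.read_text(encoding="utf-8"))


# Usage (the path is a placeholder for the local checkout):
#   for token_id, (content, special) in special_flags(load_config("path/to/model")).items():
#       print(token_id, repr(content), "special" if special else "not special")
```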

## Fix

The GGUF files need to be re-converted with `"special": false` set for token IDs 18–21 in `tokenizer_config.json`. I verified this fix locally: tool calling works correctly after re-conversion. The GGUF files hosted on HuggingFace need to be updated.
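
One way to apply that change before re-running `convert_hf_to_gguf.py` is sketched below. It assumes, as above, that the flags sit under `added_tokens_decoder` in `tokenizer_config.json`; the helper names are hypothetical, and this is not necessarily the exact procedure used for the local verification. If the converter also consults `tokenizer.json`, the same flag may need changing there.

```python
import copy
import json
from pathlib import Path

# Token IDs reported in this issue.
TOOL_TAG_IDS = (18, 19, 20, 21)


def mark_non_special(config, token_ids=TOOL_TAG_IDS):
    """Return a copy of the config with "special": false for the given IDs."""
    patched = copy.deepcopy(config)
    # Assumption: per-token flags are stored under "added_tokens_decoder".
    decoder = patched.get("added_tokens_decoder", {})
    for token_id in token_ids:
        entry = decoder.get(str(token_id))
        if entry is not None:
            entry["special"] = False
    return patched


def patch_file(model_dir, token_ids=TOOL_TAG_IDS):
    """Rewrite tokenizer_config.json in place inside a local model directory."""
    path = Path(model_dir) / "tokenizer_config.json"
    config = json.loads(path.read_text(encoding="utf-8"))
    patched = mark_non_special(config, token_ids)
    path.write_text(json.dumps(patched, ensure_ascii=False, indent=2), encoding="utf-8")


# Usage (the path is a placeholder for the local checkout):
#   patch_file("path/to/model")
```

After patching, re-run the converter on the same directory and repeat the reproduction steps to confirm the tags appear.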

