Swap K2V3 TITO tokenizer to IFM template; rename legacy to k2v3_oldbackup by ZhentingWang · Pull Request #43 · LLM360/miles

ZhentingWang · 2026-06-13T07:26:21Z

Summary

The K2V3 family is migrating to the IFM-style chat template introduced in bbq-0601 (used by bbq-8b-mid3_v3 and later checkpoints). This PR swaps K2V3TITOTokenizer to target the new template; the legacy <|im_end|>\n implementation is preserved under a new name (K2V3OldBackupTITOTokenizer / --tito-model k2v3_oldbackup) for checkpoints that haven't migrated yet (bbq-8b-mid3-final and earlier).

Breaking change

--tito-model k2v3 now refers to the IFM template. Legacy K2V3 checkpoint users must update their sbatch:

- --tito-model k2v3
+ --tito-model k2v3_oldbackup

Both classes hard-assert at __init__ that the loaded tokenizer's vocab matches the target template, so a misconfigured combination fails loudly at startup rather than silently producing wrong TITO buffers. Error messages point at the right --tito-model value.
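As a rough illustration of the startup check described above, here is a minimal sketch. The class names, the `check_vocab` helper, and the assumption that each vocab contains only its own template's special tokens are hypothetical; the actual asserts in tito_tokenizer.py may test different tokens or use different wording.

```python
# Hypothetical sketch of a vocab/template sanity check at construction time.
# Token strings come from the PR description; everything else is illustrative.
IFM_TOKENS = ("<|ifm|im_start|>", "<|ifm|im_end|>")
LEGACY_TOKENS = ("<|im_start|>", "<|im_end|>")


def check_vocab(vocab, required, other_model):
    """Raise if any required special token is missing from the vocab."""
    missing = [tok for tok in required if tok not in vocab]
    if missing:
        raise ValueError(
            f"tokenizer vocab is missing {missing}; "
            f"this checkpoint probably needs --tito-model {other_model}"
        )


class IFMTokenizerSketch:
    """Stand-in for the IFM-targeting class (not the real K2V3TITOTokenizer)."""

    def __init__(self, vocab):
        check_vocab(vocab, IFM_TOKENS, other_model="k2v3_oldbackup")
        self.vocab = vocab


class LegacyTokenizerSketch:
    """Stand-in for the legacy class (not the real K2V3OldBackupTITOTokenizer)."""

    def __init__(self, vocab):
        check_vocab(vocab, LEGACY_TOKENS, other_model="k2v3")
        self.vocab = vocab
```

With this shape, constructing `IFMTokenizerSketch` on a legacy vocab fails immediately, and the error text names the flag value to switch to.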

What changed

miles/utils/chat_template_utils/tito_tokenizer.py
- K2V3TITOTokenizer rewritten for the IFM template (no boundary fix; merge_tokens is base concat — buffer already matches canonical render).
- Renamed previous K2V3 implementation to K2V3OldBackupTITOTokenizer (legacy <|im_end|> + \n boundary fix preserved bit-for-bit).
- Both classes have hard __init__ asserts against checkpoint misconfiguration.
- New enum value TITOTokenizerType.K2V3_OLDBACKUP = "k2v3_oldbackup" + registry entry.
tests/fast/utils/chat_template_utils/test_tito_k2v3.py — rewritten for IFM invariants:
- I1: IFM template emits <|ifm|im_end|> with no trailing whitespace
- I2: rollout buffer ends at <|ifm|im_end|> matching canonical
- I3: merge_tokens is pure concat (regression guard against reintroducing legacy \n fix)
- I4: env append round-trips through merge_tokens (8 traj × 4 env = 32 cases)
- I5: init raises on legacy-checkpoint misconfiguration
tests/fast/utils/chat_template_utils/test_tito_k2v3_oldbackup.py — renamed from previous test_tito_k2v3.py with K2V3OldBackup references; covers the legacy <|im_end|> + \n boundary fix unchanged.
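The difference between the two merge strategies listed above can be sketched as follows. This is an assumption-laden illustration: the function names and token ids are invented, and the legacy branch only models the idea that the old template renders a "\n" after <|im_end|> while the model stops at <|im_end|>. The real boundary fix may handle more cases.

```python
# Hypothetical sketch contrasting the two merge behaviors; not the real merge_tokens.
def merge_concat(prefix_ids, new_ids):
    """IFM-style merge: the buffer already matches the canonical render, so concatenate."""
    return list(prefix_ids) + list(new_ids)


def merge_with_newline_fix(prefix_ids, new_ids, im_end_id, newline_id):
    """Legacy-style merge (assumed): re-insert the newline the template emits after <|im_end|>."""
    merged = list(prefix_ids)
    if merged and merged[-1] == im_end_id:
        # The rollout buffer stops at <|im_end|>, but the canonical render continues with "\n".
        merged.append(newline_id)
    return merged + list(new_ids)
```

A regression guard in the spirit of I3 would then assert that the IFM path never adds a token at the boundary, i.e. that the merged length equals the sum of the two input lengths.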

What this does NOT cover

Production training on the IFM K2V3 checkpoint also requires IFM-compatible SGLang parsers — the existing hermes / deepseek-r1 parsers read <tool_call> / <think>, not <ifm|tool_call> / <ifm|think>. The IFM-compatible parsers are tracked in LLM360/sglang#33; that PR + this one together unblock IFM rollout.
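To see why the existing parsers cannot simply be reused, consider a toy check on opening tags only. The helper and sample string below are hypothetical, and closing tags are deliberately left out because the source does not specify them; this is not how the SGLang parsers are implemented.

```python
# Hypothetical illustration: a parser keyed on the legacy tags finds nothing in IFM output.
LEGACY_OPEN_TAGS = ("<think>", "<tool_call>")
IFM_OPEN_TAGS = ("<ifm|think>", "<ifm|tool_call>")


def find_open_tags(text, tags):
    """Return the tags from `tags` that occur verbatim in `text`."""
    return [tag for tag in tags if tag in text]


# Made-up sample of IFM-style output (opening tags only).
sample = "<ifm|think>plan the call<ifm|tool_call>get_weather"

legacy_hits = find_open_tags(sample, LEGACY_OPEN_TAGS)  # no legacy tag is a substring
ifm_hits = find_open_tags(sample, IFM_OPEN_TAGS)        # both IFM tags are found
```

Because "<think>" is not a substring of "<ifm|think>", a parser looking for the legacy tags would treat the whole IFM output as plain content.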

Verification

Both halves run inside the agentic-rl runtime container against the appropriate checkpoints. The IFM parser tests are verified by shadowing the container's SGLang with the LLM360/sglang#33 branch via PYTHONPATH.

# New (IFM) class against IFM checkpoint, with PR #33 parsers shadowed in:
PYTHONPATH=/path/to/sglang-pr33/python:$PWD:$PYTHONPATH \
  pytest tests/fast/utils/chat_template_utils/test_tito_k2v3.py -v
# → 55 passed (including all 12 parser round-trip / boss flow cases)

# Without PR #33 shadowing (container's stock SGLang):
PYTHONPATH=$PWD:$PYTHONPATH \
  pytest tests/fast/utils/chat_template_utils/test_tito_k2v3.py -v
# → 43 passed, 12 skipped (parser tests skip because k2_v3 parser not registered)

# Legacy class against legacy checkpoint:
PYTHONPATH=$PWD:$PYTHONPATH \
  pytest tests/fast/utils/chat_template_utils/test_tito_k2v3_oldbackup.py -v
# → 54 passed (unchanged from prior baseline)

End-to-end on M2 SLURM:

# To pull the PR #33 SGLang fork once:
git clone --depth 1 --branch fix/tool-parser https://github.com/LLM360/sglang.git ~/sglang-pr33

srun --partition=main --time=15:00 --cpus-per-task=2 \
  --container-image=/mnt/weka/shrd/k2pta/agentic_rl_images/agentic-rl-f9986751.sqsh \
  --container-mounts=/mnt/weka:/mnt/weka \
  bash -lc 'cd /path/to/miles-checkout && \
            PYTHONPATH=$HOME/sglang-pr33/python:$PWD:$PYTHONPATH \
            pytest tests/fast/utils/chat_template_utils/test_tito_k2v3.py \
                   tests/fast/utils/chat_template_utils/test_tito_k2v3_oldbackup.py -v'

Env overrides:

TITO_TEST_MODEL_PATH_K2V3 — IFM checkpoint (default: bbq-8b-mid3_v3/checkpoint_0005500)
TITO_TEST_MODEL_PATH_K2V3_OLDBACKUP — legacy checkpoint (default: bbq-8b-mid3-final)
TITO_TEST_TOOL_PARSER_K2V3 — IFM tool parser name (default: k2_v3, per LLM360/sglang#33)
TITO_TEST_REASONING_PARSER_K2V3 — IFM reasoning parser name (default: k2_v3)
TITO_TEST_REASONING_EFFORT_K2V3 — defaults to high
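One plausible way for the tests to consume these overrides is sketched below. The variable names and defaults are copied from the list above; the `resolve_test_settings` helper is hypothetical, and the real test files may resolve the checkpoint defaults to full paths.

```python
import os

# Defaults as listed in the PR description (checkpoint names shown as written there).
DEFAULTS = {
    "TITO_TEST_MODEL_PATH_K2V3": "bbq-8b-mid3_v3/checkpoint_0005500",
    "TITO_TEST_MODEL_PATH_K2V3_OLDBACKUP": "bbq-8b-mid3-final",
    "TITO_TEST_TOOL_PARSER_K2V3": "k2_v3",
    "TITO_TEST_REASONING_PARSER_K2V3": "k2_v3",
    "TITO_TEST_REASONING_EFFORT_K2V3": "high",
}


def resolve_test_settings(env=None):
    """Return each setting from the environment, falling back to its default."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}
```

Passing an explicit mapping instead of relying on `os.environ` keeps the helper easy to exercise without mutating the process environment.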

Reviewers

@LLM360/RL360-Maintainers

…backup

The K2V3 family is migrating to the IFM-style chat template introduced in bbq-0601 (used by bbq-8b-mid3_v3 and later checkpoints). The new template namespaces ChatML tokens as <|ifm|im_start|> / <|ifm|im_end|>, emits no whitespace between messages, and requires assistant messages to carry a thinking field. The legacy <|im_end|>\n template stays supported for older K2V3 checkpoints (bbq-8b-mid3-final and earlier) that haven't migrated yet.

Changes:
- K2V3TITOTokenizer now targets the IFM template. merge_tokens is pure concat: the buffer already matches the canonical render (model stops at <|ifm|im_end|> and no trailing whitespace follows in the template).
- Renamed the legacy K2V3TITOTokenizer to K2V3OldBackupTITOTokenizer. Its <|im_end|> + \n boundary-fix logic is preserved bit-for-bit.
- Added TITOTokenizerType.K2V3_OLDBACKUP enum value and registry entry. TITOTokenizerType.K2V3 now points at the new IFM class.
- Both classes hard-assert at __init__ that the loaded tokenizer's vocab matches their target template (refuses to load on a misconfigured checkpoint, with an error pointing at the right --tito-model value).
- test_tito_k2v3.py rewritten for IFM invariants (no boundary fix, BOS prepend, thinking required, hard-assert sanity).
- Renamed previous test file to test_tito_k2v3_oldbackup.py with K2V3OldBackup references.

Breaking change for downstream sbatch: --tito-model k2v3 now refers to the IFM template. Legacy checkpoint users must update to --tito-model k2v3_oldbackup. Misconfiguration raises at init rather than silently producing wrong TITO buffers.

Out of scope (required separately for IFM training):
- IFM-compatible SGLang reasoning_parser + tool_parser (see LLM360/sglang#33).

Verification:
- tests/fast/.../test_tito_k2v3.py: 43 passed, 12 skipped (skipped = SGLang IFM parsers not yet in this container build).
- tests/fast/.../test_tito_k2v3_oldbackup.py: 54 passed (legacy behavior unchanged).

…s/tests

Docstrings on K2V3TITOTokenizer / K2V3OldBackupTITOTokenizer and the two K2V3 test files contain visual references to the literal `\n` escape sequence (the chat-template trailing newline). The previous \\n escaping renders correctly but reads awkwardly in source. Convert the affected docstrings to raw strings (r"""...""") so the source literally contains \n, which is easier to read and write. No code or test behavior changes.

Tested: 109 passed (55 IFM + 54 oldbackup) inside the agentic-rl container with sglang PR #33 shadowed for the parser tests.

moonfolk · 2026-06-15T02:20:42Z

Note that the chat template was updated last week, and we have switched to the newest version in training (and updated all past v3 checkpoints): https://github.com/LLM360/bbq-chat-template/tree/main/bbq-0610. The changes are specific to tool presentation in markdown and xml formats. The new default presentation format is markdown, and it can fall back to json when it can't render all of the tool information in markdown. The default/recommended tool-calling format remains unchanged (xml). This shouldn't affect TITO or tool/reasoning parsing, but I'm noting it just in case.

ZhentingWang requested a review from a team June 13, 2026 07:26

ZhentingWang force-pushed the swap-tito-tokenizer-to-0610-template branch from 29deffe to e425738 Compare June 13, 2026 07:55

ZhentingWang wants to merge 2 commits into prod from swap-tito-tokenizer-to-0610-template
