Swap K2V3 TITO tokenizer to IFM template; rename legacy to k2v3_oldbackup #43

Open
ZhentingWang wants to merge 2 commits into prod from swap-tito-tokenizer-to-0610-template

@ZhentingWang ZhentingWang commented Jun 13, 2026

Summary

The K2V3 family is migrating to the IFM-style chat template introduced in bbq-0601 (used by bbq-8b-mid3_v3 and later checkpoints). This PR swaps K2V3TITOTokenizer to target the new template; the legacy <|im_end|>\n implementation is preserved under a new name (K2V3OldBackupTITOTokenizer / --tito-model k2v3_oldbackup) for checkpoints that haven't migrated yet (bbq-8b-mid3-final and earlier).

Breaking change

--tito-model k2v3 now refers to the IFM template. Legacy K2V3 checkpoint users must update their sbatch:

- --tito-model k2v3
+ --tito-model k2v3_oldbackup

Both classes hard-assert at __init__ that the loaded tokenizer's vocab matches the target template, so a misconfigured combination fails loudly at startup rather than silently producing wrong TITO buffers. Error messages point at the right --tito-model value.
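
For readers who have not opened the diff, the startup check can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the helper name `assert_vocab_matches`, the plain-dict vocab, and the error wording are assumptions for illustration. Only the token strings and the `--tito-model` values come from the description above.

```python
def assert_vocab_matches(vocab, required_tokens, hint):
    """Raise ValueError if any token the template relies on is missing.

    Hypothetical helper: `vocab` is any mapping of token string -> id,
    for example the dict a Hugging Face tokenizer returns from get_vocab().
    """
    missing = [tok for tok in required_tokens if tok not in vocab]
    if missing:
        raise ValueError(
            f"tokenizer vocab is missing {missing}, so it does not match "
            f"the target chat template; {hint}"
        )


# Toy vocabs standing in for real checkpoints (ids are made up).
ifm_vocab = {"<|ifm|im_start|>": 1, "<|ifm|im_end|>": 2}
legacy_vocab = {"<|im_start|>": 1, "<|im_end|>": 2}

IFM_HINT = "legacy checkpoints should use --tito-model k2v3_oldbackup"

# Matching combination: passes silently.
assert_vocab_matches(ifm_vocab, ["<|ifm|im_end|>"], IFM_HINT)

# Misconfigured combination: fails at construction time with a pointer
# to the right flag value.
try:
    assert_vocab_matches(legacy_vocab, ["<|ifm|im_end|>"], IFM_HINT)
except ValueError as err:
    print(err)
```

Running a check like this in `__init__` means a wrong checkpoint/flag pairing stops the job at startup instead of partway through a rollout.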

What changed

  • miles/utils/chat_template_utils/tito_tokenizer.py
    • K2V3TITOTokenizer rewritten for the IFM template (no boundary fix; merge_tokens is base concat — buffer already matches canonical render).
    • Renamed previous K2V3 implementation to K2V3OldBackupTITOTokenizer (legacy <|im_end|> + \n boundary fix preserved bit-for-bit).
    • Both classes have hard __init__ asserts against checkpoint misconfiguration.
    • New enum value TITOTokenizerType.K2V3_OLDBACKUP = "k2v3_oldbackup" + registry entry.
  • tests/fast/utils/chat_template_utils/test_tito_k2v3.py — rewritten for IFM invariants:
    • I1: IFM template emits <|ifm|im_end|> with no trailing whitespace
    • I2: rollout buffer ends at <|ifm|im_end|> matching canonical
    • I3: merge_tokens is pure concat (regression guard against reintroducing legacy \n fix)
    • I4: env append round-trips through merge_tokens (8 traj × 4 env = 32 cases)
    • I5: init raises on legacy-checkpoint misconfiguration
  • tests/fast/utils/chat_template_utils/test_tito_k2v3_oldbackup.py — renamed from previous test_tito_k2v3.py with K2V3OldBackup references; covers the legacy <|im_end|> + \n boundary fix unchanged.
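
The difference between the two `merge_tokens` behaviours can be illustrated with toy token ids. This is a sketch under assumptions, not the code in `tito_tokenizer.py`: the function names, the ids, and the exact condition used by the legacy fix are guesses based on the description above (the legacy template renders a `\n` after `<|im_end|>`, while the model stops at `<|im_end|>`).

```python
END_ID = 2       # toy id standing in for the end-of-message token
NEWLINE_ID = 10  # toy id standing in for "\n"


def merge_concat(prefix_ids, new_ids):
    """IFM-style merge: plain concatenation, no boundary repair."""
    return list(prefix_ids) + list(new_ids)


def merge_with_newline_fix(prefix_ids, new_ids):
    """Assumed legacy-style merge.

    If the rollout buffer stops at the end-of-message token, append the
    newline that the legacy template would have rendered after it.
    """
    merged = list(prefix_ids)
    if merged and merged[-1] == END_ID:
        merged.append(NEWLINE_ID)
    return merged + list(new_ids)


rollout = [1, 5, 6, END_ID]   # buffer ends where the model stopped
env_turn = [1, 7, 8, END_ID]  # tokens appended for the next message

ifm_merged = merge_concat(rollout, env_turn)
legacy_merged = merge_with_newline_fix(rollout, env_turn)
```

Under these assumptions `ifm_merged` is the two lists back to back, while `legacy_merged` has one extra `NEWLINE_ID` at the boundary. Invariant I3 guards against the second behaviour reappearing in the IFM class.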

What this does NOT cover

Production training on the IFM K2V3 checkpoint also requires IFM-compatible SGLang parsers — the existing hermes / deepseek-r1 parsers read <tool_call> / <think>, not <ifm|tool_call> / <ifm|think>. The IFM-compatible parsers are tracked in LLM360/sglang#33; that PR + this one together unblock IFM rollout.
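
A small illustration of why the existing parsers cannot be reused as-is: a search for the legacy opening tag finds nothing in IFM-style output, because `<think>` is not a substring of `<ifm|think>`. The sample string and the helper name are made up for this sketch; the real parsers live in SGLang.

```python
def find_open_tag(text, open_tag):
    """Return the index of the first occurrence of open_tag, or -1."""
    return text.find(open_tag)


# Made-up model output, for illustration only.
ifm_output = "<ifm|think>check the weather first"

legacy_hit = find_open_tag(ifm_output, "<think>")   # legacy tag: not found
ifm_hit = find_open_tag(ifm_output, "<ifm|think>")  # IFM tag: found at index 0
```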

Verification

Both halves run inside the agentic-rl runtime container against the appropriate checkpoints. The IFM parser tests are verified by shadowing the container's SGLang with the LLM360/sglang#33 branch via PYTHONPATH.

# New (IFM) class against IFM checkpoint, with PR #33 parsers shadowed in:
PYTHONPATH=/path/to/sglang-pr33/python:$PWD:$PYTHONPATH \
  pytest tests/fast/utils/chat_template_utils/test_tito_k2v3.py -v
# → 55 passed (including all 12 parser round-trip / boss flow cases)

# Without PR #33 shadowing (container's stock SGLang):
PYTHONPATH=$PWD:$PYTHONPATH \
  pytest tests/fast/utils/chat_template_utils/test_tito_k2v3.py -v
# → 43 passed, 12 skipped (parser tests skip because k2_v3 parser not registered)

# Legacy class against legacy checkpoint:
PYTHONPATH=$PWD:$PYTHONPATH \
  pytest tests/fast/utils/chat_template_utils/test_tito_k2v3_oldbackup.py -v
# → 54 passed (unchanged from prior baseline)

End-to-end on M2 SLURM:

# To pull the PR #33 SGLang fork once:
git clone --depth 1 --branch fix/tool-parser https://github.com/LLM360/sglang.git ~/sglang-pr33

srun --partition=main --time=15:00 --cpus-per-task=2 \
  --container-image=/mnt/weka/shrd/k2pta/agentic_rl_images/agentic-rl-f9986751.sqsh \
  --container-mounts=/mnt/weka:/mnt/weka \
  bash -lc 'cd /path/to/miles-checkout && \
            PYTHONPATH=$HOME/sglang-pr33/python:$PWD:$PYTHONPATH \
            pytest tests/fast/utils/chat_template_utils/test_tito_k2v3.py \
                   tests/fast/utils/chat_template_utils/test_tito_k2v3_oldbackup.py -v'

Env overrides:

  • TITO_TEST_MODEL_PATH_K2V3 — IFM checkpoint (default: bbq-8b-mid3_v3/checkpoint_0005500)
  • TITO_TEST_MODEL_PATH_K2V3_OLDBACKUP — legacy checkpoint (default: bbq-8b-mid3-final)
  • TITO_TEST_TOOL_PARSER_K2V3 — IFM tool parser name (default: k2_v3, per LLM360/sglang#33, "Updated reasoning tokens while maintaining backward compatibility")
  • TITO_TEST_REASONING_PARSER_K2V3 — IFM reasoning parser name (default: k2_v3)
  • TITO_TEST_REASONING_EFFORT_K2V3 — defaults to high
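
One way to picture how these overrides resolve, as a sketch only: the helper `resolve_setting` is hypothetical and the test files may read the variables differently. The checkpoint defaults are copied as abbreviated in the list above and are probably not full paths.

```python
import os

# Defaults as listed in the PR description (checkpoint values abbreviated there).
DEFAULTS = {
    "TITO_TEST_MODEL_PATH_K2V3": "bbq-8b-mid3_v3/checkpoint_0005500",
    "TITO_TEST_MODEL_PATH_K2V3_OLDBACKUP": "bbq-8b-mid3-final",
    "TITO_TEST_TOOL_PARSER_K2V3": "k2_v3",
    "TITO_TEST_REASONING_PARSER_K2V3": "k2_v3",
    "TITO_TEST_REASONING_EFFORT_K2V3": "high",
}


def resolve_setting(name, environ=None):
    """Return the override for `name` if set, otherwise the listed default."""
    environ = os.environ if environ is None else environ
    return environ.get(name, DEFAULTS[name])


# With no override the default applies; with one set, the override wins.
effort_default = resolve_setting("TITO_TEST_REASONING_EFFORT_K2V3", environ={})
effort_custom = resolve_setting(
    "TITO_TEST_REASONING_EFFORT_K2V3",
    environ={"TITO_TEST_REASONING_EFFORT_K2V3": "low"},
)
```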

Reviewers

@LLM360/RL360-Maintainers

@ZhentingWang ZhentingWang requested a review from a team June 13, 2026 07:26
…backup

The K2V3 family is migrating to the IFM-style chat template introduced
in bbq-0601 (used by bbq-8b-mid3_v3 and later checkpoints). The new
template namespaces ChatML tokens as <|ifm|im_start|> / <|ifm|im_end|>,
emits no whitespace between messages, and requires assistant messages
to carry a thinking field. The legacy <|im_end|>\n template stays
supported for older K2V3 checkpoints (bbq-8b-mid3-final and earlier)
that haven't migrated yet.

Changes:
  - K2V3TITOTokenizer now targets the IFM template. merge_tokens is
    pure concat — the buffer already matches the canonical render
    (model stops at <|ifm|im_end|> and no trailing whitespace
    follows in the template).
  - Renamed the legacy K2V3TITOTokenizer to K2V3OldBackupTITOTokenizer.
    Its <|im_end|> + \n boundary-fix logic is preserved bit-for-bit.
  - Added TITOTokenizerType.K2V3_OLDBACKUP enum value and registry
    entry. TITOTokenizerType.K2V3 now points at the new IFM class.
  - Both classes hard-assert at __init__ that the loaded tokenizer's
    vocab matches their target template (refuses to load on a
    misconfigured checkpoint, with an error pointing at the right
    --tito-model value).
  - test_tito_k2v3.py rewritten for IFM invariants (no boundary fix,
    BOS prepend, thinking required, hard-assert sanity).
  - Renamed previous test file to test_tito_k2v3_oldbackup.py with
    K2V3OldBackup references.
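
The enum and registry change can be sketched as below. The class bodies are empty placeholders, and `_REGISTRY` and `resolve_tito_class` are hypothetical names; only the enum member `K2V3_OLDBACKUP = "k2v3_oldbackup"`, the two class names, and the mapping direction come from this commit message.

```python
from enum import Enum


class TITOTokenizerType(str, Enum):
    K2V3 = "k2v3"                      # now the IFM template
    K2V3_OLDBACKUP = "k2v3_oldbackup"  # legacy <|im_end|> + \n template


class K2V3TITOTokenizer:
    """Placeholder for the IFM implementation."""


class K2V3OldBackupTITOTokenizer:
    """Placeholder for the preserved legacy implementation."""


# Hypothetical registry mapping each --tito-model value to its class.
_REGISTRY = {
    TITOTokenizerType.K2V3: K2V3TITOTokenizer,
    TITOTokenizerType.K2V3_OLDBACKUP: K2V3OldBackupTITOTokenizer,
}


def resolve_tito_class(name):
    """Look up the tokenizer class for a --tito-model string.

    Unknown names raise ValueError from the enum lookup.
    """
    return _REGISTRY[TITOTokenizerType(name)]
```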

Breaking change for downstream sbatch:
  --tito-model k2v3 now refers to the IFM template. Legacy checkpoint
  users must update to --tito-model k2v3_oldbackup. Misconfiguration
  raises at init rather than silently producing wrong TITO buffers.

Out of scope (required separately for IFM training):
  - IFM-compatible SGLang reasoning_parser + tool_parser (see
    LLM360/sglang#33).

Verification:
  - tests/fast/.../test_tito_k2v3.py: 43 passed, 12 skipped (skipped =
    SGLang IFM parsers not yet in this container build).
  - tests/fast/.../test_tito_k2v3_oldbackup.py: 54 passed (legacy
    behavior unchanged).
@ZhentingWang ZhentingWang force-pushed the swap-tito-tokenizer-to-0610-template branch from 29deffe to e425738 Compare June 13, 2026 07:55
…s/tests

Docstrings on K2V3TITOTokenizer / K2V3OldBackupTITOTokenizer and the
two K2V3 test files contain visual references to the literal `\n`
escape sequence (the chat-template trailing newline). The previous
\\n escaping renders correctly but reads awkwardly in source. Convert
the affected docstrings to raw strings (r"""...""") so the source
literally contains \n, which is easier to read and write.

No code or test behavior changes.

Tested: 109 passed (55 IFM + 54 oldbackup) inside the agentic-rl
container with sglang PR #33 shadowed for the parser tests.
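
A short check of the claim that the raw-string conversion changes no behaviour: an escaped docstring and a raw docstring produce the same `__doc__` text. The two function names and the docstring wording are made up for this example.

```python
def escaped_docstring():
    """The legacy template ends each message with <|im_end|>\\n."""


def raw_docstring():
    r"""The legacy template ends each message with <|im_end|>\n."""


# Both docstrings contain a literal backslash followed by "n",
# not a newline character.
same_text = escaped_docstring.__doc__ == raw_docstring.__doc__
has_real_newline = "\n" in raw_docstring.__doc__
```

Here `same_text` is `True` and `has_real_newline` is `False`, so only the source spelling differs.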
@moonfolk

Note that the chat template was updated last week, and we have switched to the newest version in training (and updated all past v3 checkpoints): https://github.com/LLM360/bbq-chat-template/tree/main/bbq-0610. The changes are specific to tool presentation in the markdown and XML formats. The new default presentation format is markdown, which can fall back to JSON when it can't render all of the tool information in markdown. The default/recommended tool-calling format remains unchanged (XML). This shouldn't affect TITO or tool/reasoning parsing, but I'm noting it just in case.
