iamboosted
📊 Benchmarking

Popular repositories

  1. Qwen3.5-9B-OSS-Distill (Public)

    Distills GPT-OSS reasoning traces into Qwen 3.5 9B to fix reasoning spirals: the no-answer rate drops from 36.2% to 0.5% on a hard holdout.

  2. falcon-h1-slerp-merge (Public)

    First SLERP merge of Mamba-2 hybrid LLMs (Falcon-H1-7B-Instruct × H1R-7B). Includes merge script, benchmarks, and architecture documentation.

    Python

  3. Zamba2-SLERP-Merge (Public)

    SLERP merge of Zamba2-7B hybrid models. The merge succeeds, but the weight-sharing architecture prevents evaluation. Second in a series on non-transformer SLERP merging.

    Python

  4. falcon-h1-deep-reasoning (Public)

    QLoRA math reasoning adapter for Falcon-H1-1.5B-Deep, the deepest Mamba-2 hybrid (66 layers, 1.5B params). Improves from 50% to 65% on math benchmarks with 2000 training examples.

    Python

  5. Qwen3.5-9B-Dense-To-Moe (Public)

    Attempted dense-to-MoE conversion of Qwen 3.5 9B (DeltaNet hybrid) using CMoE and D2DMoE. Documents why post-hoc MoEfication fails on SwiGLU models without extensive sparsification. Negative result.
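
For readers unfamiliar with the SLERP merging mentioned in repositories 2 and 3, the following is a minimal, generic sketch of spherical linear interpolation between two flattened weight vectors. It is not taken from either repository's merge script; the function name `slerp`, the `eps` threshold, and the fallback to plain linear interpolation for near-parallel vectors are all assumptions made for illustration.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length vectors (hypothetical helper).

    t = 0 returns v0, t = 1 returns v1. Falls back to linear interpolation
    when either vector is near zero or the two are nearly (anti-)parallel.
    """
    lerp = [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp

    # Cosine of the angle between the two vectors, clamped for acos.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return lerp

    # Standard SLERP weights applied to the original (unnormalized) vectors.
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    # Midpoint between two orthogonal unit vectors stays on the unit circle.
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

In a model merge, a function like this would typically be applied tensor by tensor to the flattened weights of two checkpoints that share an architecture; how each repository handles Mamba-2 or shared-weight layers is not described here.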