attention-heads

Here are 10 public repositories matching this topic...

lena-voita / the-story-of-heads

This is a repository with the code for the ACL 2019 paper "Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned" and the ACL 2021 paper "Analyzing Source and Target Contributions to NMT Predictions".

transformer attention-heads

Updated Aug 2, 2021
Python

viktor-shcherb / qk-pca-analysis

PCA analysis of Q/K attention vectors to discover position-correlated components across transformer heads

transformers pca scipy rope huggingface positional-encoding attention-heads mechanistic-interpretability

Updated Feb 7, 2026
Python

viktor-shcherb / attention-plasticity

CLI toolkit that ingests qk-sniffer dumps, measures per-head positional predictability and attention plasticity, and exports CSV stats plus ready-to-share plots.

python-package interpretability attention-heads huggingface-datasets transformer-attention research-tooling qk-sniffer

Updated Dec 12, 2025
Python

AdityaSinghDevs / nanolens

Configurable character-level transformer training suite with built-in mechanistic interpretability toolkit — scale to 150M+ parameters and beyond, no ceilings, only hardware limits. Inspect attention weights, hidden states, and head specialisation across all layers. Documented circuit findings included.

nlp deep-learning pytorch transformer gpt language-model attention-mechanism circuit-analysis interpretability character-level-language-model attention-visualization attention-heads transformer-interpretability mechanistic-interpretability residual-stream hidden-state-analysis

Updated Jun 5, 2026
Jupyter Notebook

designer-coderajay / induction-head-detector

Mechanistic interpretability tool to detect induction heads in GPT-2 using TransformerLens

nlp machine-learning deep-learning transformers pytorch gpt-2 attention-heads mechanistic-interpretability transformer-lens

Updated Dec 15, 2025
Python

RayoHQ / attention-binding-a11y

TMLR 2026 | Mechanistic interpretability: attention-head binding (EB*) as a marker of concept emergence. 7 models, 5 architectures (Pythia 160M–2.8B, OLMo-1B, CRFM GPT-2, SmolLM3-3B, Qwen2.5-1.5B), 41 terms.

nlp accessibility transformers language-models pythia interpretability few-shot-learning training-dynamics attention-heads mechanistic-interpretability qwen tmlr olmo smollm3 tmlr-2026 concept-emergence crfm

Updated Jun 9, 2026
Python

atgugu / mechinterp-rfh-replication

Replication of 'From Reasoning to Answer' (EMNLP 2025) — Reasoning-Focus Heads + Activation Patching on DeepSeek-R1-Distill-Qwen-7B

reasoning attention-heads mechanistic-interpretability llm-interpretability deepseek-r1 emnlp-2025 transformer-lens

Updated Mar 27, 2026
Jupyter Notebook

yonseicasl / REAL

REAL: REtrieval-reAsoning and Logic-constructed Attention Behaviors for Long-Context KV Cache Compression

machine-learning compression retrieval transformers inference budget attention-mechanism reasoning eviction kv-cache attention-heads llm long-context cache-eviction longbench

Updated Apr 15, 2026
Python

guilhermezambuzi / stochastic-rupture-pruning

Adaptive inference algorithm for transformers inspired by quantum collapse (SR framework)

deep-learning best-practices efficiency pytorch language-model reduction-strategies efficient-inference attention-mechanisms inference-optimization green-ai attention-heads model-efficiency llm compute-efficiency strange-rose

Updated May 4, 2026
Python

Franzabner / attention-head-surgery-epi

Public derivative EPI scaffold for attention-head pruning research framing, review gates, and claim boundaries.

raspberry-pi scaffold transformer energy-efficiency model-pruning epi claim-review power-measurement ai-research attention-heads public-safe review-gates boundary-review measurement-discipline franzabner energy-per-intelligence

Updated May 8, 2026
Python
