[¬Re] Vocabulary-Activation Correspondence in Self-Referential LLM Processing #105

@jmccardle

Paper title: [¬Re] Vocabulary-Activation Correspondence in Self-Referential LLM Processing

Paper authors: McCardle, John P.

Paper PDF URL: https://doi.org/10.5281/zenodo.19139301

Metadata URL: https://github.com/jmccardle/dadfar-vac-replication/blob/master/rescience-template/metadata.yaml

Code repository URL: https://github.com/jmccardle/dadfar-vac-replication

Code DOI: (to be minted after acceptance)

Data URL: https://zenodo.org/records/19139301

Data DOI: 10.5281/zenodo.19139301

Original article: Z. Dadfar, "When models examine themselves: Vocabulary-activation correspondence in self-referential processing," arXiv:2602.11358, 2026. https://arxiv.org/abs/2602.11358

Domain: Machine Learning

Language: Python

Type: Replication (failed)

Suggested editor: @gdetorakis (Georgios Detorakis — ML / Computational Neuroscience)

Suggested reviewers: (leave to editor discretion; domain: LLM interpretability / computational neuroscience)

Abstract:
We attempt to replicate the core claims of Dadfar (2026), who reports Vocabulary-Activation Correspondence (VAC), a correlation between spontaneously adopted vocabulary and concurrent neural activation metrics, during extended self-referential processing in Qwen 2.5-32B-Instruct. Using the author's published configuration and Zenodo data, we identify four obstacles to replication: (1) the generation pipeline produces a bimodal output distribution, and all published terminal words derive from a degenerate summary mode; (2) the primary activation metric scales superlinearly with generation length, confounding VAC with a length artifact; (3) runs that complete 1,000 observations enter limit cycles, making terminal words phase artifacts; and (4) the TRACE-REPRO code repository targets a different model (Llama) than the one behind the paper's core claims (Qwen). Cross-model replication on Llama 3.1 70B yields zero compliant baseline runs. The replication fails on all four grounds.
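
For readers who want to see what obstacles (2) and (3) mean in practice, here is a minimal sketch of the two checks. It is not the code in the replication repository. The function names `scaling_exponent` and `tail_period`, the `min_repeats` threshold, and the sample values are all hypothetical and chosen only for illustration. The first function fits a log-log slope of a metric against generation length (a slope clearly above 1 suggests superlinear growth). The second looks for a repeating tail in a sequence of outputs, which is one simple way to flag a limit cycle.

```python
import math


def scaling_exponent(lengths, metric_values):
    """Least-squares slope of log(metric) against log(length).

    A slope near 1 means the metric grows roughly linearly with length;
    a slope clearly above 1 points to superlinear growth.
    """
    xs = [math.log(n) for n in lengths]
    ys = [math.log(m) for m in metric_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def tail_period(sequence, min_repeats=3):
    """Smallest period p such that the last min_repeats * p items
    repeat with period p; returns None if no such period exists."""
    n = len(sequence)
    for p in range(1, n // min_repeats + 1):
        tail = sequence[n - min_repeats * p:]
        if all(tail[i] == tail[i + p] for i in range(len(tail) - p)):
            return p
    return None


# Synthetic illustration only: these numbers are not from the paper or the data.
lengths = [100, 200, 400, 800]
metric = [n ** 1.5 for n in lengths]  # constructed to be superlinear
slope = scaling_exponent(lengths, metric)

words = ["a", "b", "c", "x", "y", "x", "y", "x", "y"]  # tail cycles with period 2
period = tail_period(words)
```

If a run ends inside a cycle of period `p`, the terminal word depends only on where in the cycle generation happened to stop, which is the sense in which the abstract calls terminal words phase artifacts.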

content.pdf
