Representational Structure
Transformer architectures’ internal representations closely match human language-network responses and generalize across modalities/datasets.
"Second, the best models explain nearly 100% of the explainable variance (up to the noise ceiling) in neural responses to sentences. ... Fourth, intriguingly, the scores of models initialized with random weights (prior to training, but with a trained linear readout) are well above chance and correlate with trained model scores, which suggests that network architecture is an important contributor to a model’s brain score. In particular, one architecture introduced just in 2019, the generative pretrained transformer (GPT-2), consistently outperforms all other models and explains almost all variance in both fMRI and ECoG data from sentence-processing tasks."
Results, p. 2
These results indicate that specific representational structures in transformer LMs (notably GPT-2) align with population-level neural codes during human sentence processing, supporting a shared representational geometry between artificial models and the brain that is relevant to conscious-access mechanisms.
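The quoted "brain score" rests on a linear-readout mapping: model activations for each stimulus are regressed onto neural responses, and predictivity is measured on held-out data. The sketch below is a simplified, hypothetical version of that idea (cross-validated ridge regression, Pearson correlation on held-out folds); all names, shapes, and the synthetic data are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

def brain_score(model_acts, neural_resp, alpha=1.0, n_folds=5, seed=0):
    """Cross-validated linear readout from model activations
    (n_stimuli x n_features) to neural responses (n_stimuli x n_voxels).
    Returns the mean held-out Pearson r across folds and voxels."""
    n = model_acts.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    rs = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        X, Y = model_acts[train], neural_resp[train]
        # closed-form ridge regression: W = (X'X + alpha*I)^-1 X'Y
        W = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ Y)
        pred = model_acts[test] @ W
        for v in range(neural_resp.shape[1]):
            if pred[:, v].std() > 0 and neural_resp[test, v].std() > 0:
                rs.append(np.corrcoef(pred[:, v], neural_resp[test, v])[0, 1])
    return float(np.mean(rs))

# Synthetic demo: "neural responses" that are a noisy linear
# function of "model activations", so the readout should predict well.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20))                      # model activations
Y = X @ rng.standard_normal((20, 10)) + 0.5 * rng.standard_normal((200, 10))
score = brain_score(X, Y)
```

On this synthetic data the held-out correlation is high by construction; in the paper, scores are instead reported relative to a noise ceiling estimated from inter-subject reliability.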
"Model scores are consistent across experiments/datasets. To test the generality of the model representations, we examined the consistency of model brain scores across datasets. Indeed, if a model achieves a high brain score on one dataset it tends to also do well on other datasets (Fig. 2D), ruling out the possibility that we are picking up on spurious, dataset-idiosyncratic predictivity and suggesting that the models’ internal representations are general enough to capture brain responses to diverse linguistic materials presented visually or auditorily, and across three independent sets of participants."
Results, p. 3
Cross-dataset generalization of model-to-brain fits suggests stable representational structure that captures human language responses across modalities—key for theories linking distributed codes to unified, reportable content.
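The consistency claim (Fig. 2D) amounts to correlating per-model brain scores across datasets: if rankings agree, the predictivity is not dataset-idiosyncratic. A minimal sketch, using made-up illustrative scores rather than values from the paper:

```python
import numpy as np

# Hypothetical brain scores for five models on two independent datasets
# (labels and numbers are assumptions for illustration only).
scores_fmri = np.array([0.32, 0.55, 0.41, 0.78, 0.60])  # e.g., an fMRI dataset
scores_ecog = np.array([0.28, 0.50, 0.44, 0.74, 0.57])  # e.g., an ECoG dataset

# A high correlation means a model that scores well on one dataset
# tends to score well on the other, i.e., generalizable representations.
consistency = np.corrcoef(scores_fmri, scores_ecog)[0, 1]
```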
Figures
Fig. 1 (p. 3): By jointly mapping model representations to neural and behavioral data, the figure operationalizes representational-structure comparisons across AI and brain.
Limitations: Architectural correlations with neural/behavioral fit do not establish causal mechanisms, and the models are off-the-shelf and task-level, leaving open how specific circuits implement these representations.