Brains and algorithms partially converge in natural language processing

Charlotte Caucheteux, Alexandre Gramfort, Jean-Rémi King · 2022

Evidence (4)
Representational Structure
Middle layers of transformers best align (linearly) with brain responses, indicating structured, intermediate representations.
"Overall, we observe that the corresponding brain scores largely vary as a function of the relative depth of the embedding within the language transformer. Specifically, both MEG and fMRI scores follow an inverted U-shaped pattern across layers for all architectures (Fig. 4a, e): the middle layers systematically outperform the output (fMRI: ΔR= 0.011 ± 0.001, p < 10−18, MEG: ΔR= 0.003 ± 0.0005, p < 10−13) and the input layers (fMRI: ΔR=.031 ± .001, p < 10−18, MEG: ΔR=.009 ± .001, p < 10−17)."
Compositional embeddings best predict brain responses, p. 3
An inverted-U depth effect with middle layers most brain-like supports the presence of structured, intermediate representational subspaces in AI models that align with human cortical organization, informing cross-system analysis of representational structure in consciousness research (a layer-wise brain-score sketch follows this entry's limitations).
Figures
Fig. 3 (p. 3): The figure emphasizes that a middle transformer layer (e.g., 9/12) captures compositional representations that best map to brain data, highlighting structured intermediate codes relevant to representational organization across AI and brain.
Limitations: Linear mapping may miss nonlinear correspondences; alignment is correlational and task/domain-specific (reading in Dutch).
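The layer-wise comparison above rests on the paper's brain-score logic: a ridge regression from word embeddings to brain responses, scored as a cross-validated correlation on held-out words. The sketch below illustrates that logic on synthetic data; the array sizes, the `brain_score` helper, and the layer-weighting scheme are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch, not the authors' code: a layer-wise "brain score" as a
# ridge regression from word embeddings to brain responses, evaluated as a
# cross-validated Pearson correlation. All data below are synthetic.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n_words, n_dims, n_voxels, n_layers = 500, 64, 20, 12

def brain_score(X, Y, n_splits=5):
    """Mean correlation between ridge predictions and held-out brain responses."""
    scores = []
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(X):
        model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X[train], Y[train])
        Y_hat = model.predict(X[test])
        r = [np.corrcoef(Y_hat[:, v], Y[test, v])[0, 1] for v in range(Y.shape[1])]
        scores.append(np.mean(r))
    return float(np.mean(scores))

# Synthetic brain responses driven by a latent signal that middle layers share most.
latent = rng.standard_normal((n_words, n_dims))
Y = latent[:, :n_voxels] + 0.5 * rng.standard_normal((n_words, n_voxels))

for layer in range(n_layers):
    weight = 1.0 - abs(layer - n_layers // 2) / n_layers   # toy inverted-U mixing
    X = weight * latent + (1.0 - weight) * rng.standard_normal((n_words, n_dims))
    print(f"layer {layer:2d}: brain score R = {brain_score(X, Y):.3f}")
```

Run as written, the toy mixing makes mid-depth "layers" share the most variance with the synthetic responses, reproducing the inverted-U shape of the reported scores.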
Temporal Coordination
MEG/fMRI reveal temporally ordered cortical stages (V1 ~100 ms → posterior fusiform ~200 ms → temporal/frontal 150–500 ms).
"As expected38–41, the average fMRI and MEG responses to words reveals a hierarchy of neural responses originating in V1 around 100 ms and continuing within the left posterior fusiform gyrus around 200 ms, the superior and middle temporal gyri, as well as the pre-motor and infero-frontal cortices between 150 and 500 ms after word onset (Supplementary Movie 1 and Supplementary Note 1 and Fig. 2a)."
Shared brain responses to words and sentences across subjects, p. 2
This staged time course evidences temporal coordination mechanisms in human language processing, providing concrete millisecond-scale anchors for comparing timing implementations (e.g., positional dynamics, attention synchronization) in AI models (a peak-latency sketch follows this entry's limitations).
Figures
Fig. 2 (p. 3): Peak timing per region visualizes coordinated temporal dynamics across the reading network, aligning with the phenomenon of temporal coordination in cortical processing.
Limitations: Temporal precision is limited by MEG source localization and single-sample SNR; results are specific to visually presented sentences.
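As a concrete illustration of how such millisecond-scale anchors can be read off evoked responses, the sketch below finds each region's peak latency. The sampling rate, region names, and Gaussian waveforms are synthetic placeholders chosen to mirror the latencies quoted above; this is not the paper's analysis code.

```python
# Illustrative sketch: derive per-region peak latencies from evoked time courses,
# mirroring the V1 -> fusiform -> temporal/frontal progression reported above.
# Region names and waveforms are synthetic placeholders, not the paper's data.
import numpy as np

sfreq = 1000.0                               # assumed sampling rate (Hz)
times = np.arange(-0.1, 0.6, 1.0 / sfreq)    # seconds around word onset

def bump(center_s, width_s=0.03):
    """Toy evoked response peaking at center_s."""
    return np.exp(-0.5 * ((times - center_s) / width_s) ** 2)

rois = {
    "V1": bump(0.10),
    "posterior fusiform": bump(0.20),
    "superior/middle temporal": bump(0.30),
    "pre-motor / infero-frontal": bump(0.45),
}

for name, evoked in rois.items():
    peak_ms = 1000.0 * times[np.argmax(np.abs(evoked))]
    print(f"{name:>28s}: peak at {peak_ms:4.0f} ms")
```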
Information Integration
Compositional representations engage a large, bilateral fronto-temporo-parietal network, peaking ~1 s after word onset.
"Finally, the brain scores of the compositional embedding are significantly higher than those of lexical of embeddings in the superior temporal gyrus (ΔR= 0.012 ± 0.001, p < 10−16), the angular gyrus (ΔR= 0.010 ± 0.001, p < 10−16), the infero-frontal cortex (ΔR= 0.016 ± 0.001, p < 10−16) and the dorsolateral prefrontal cortex (ΔR= 0.012 ± 0.001, p < 10−13). While these effects are lateralized (left hemisphere versus right hemisphere: ΔR= 0.010 ± 0.001, p < 10−14), they are significant across a remarkably large number of bilateral areas (Fig. 3b)."
Tracking the sequential generation of language representations over time and space, p. 3
Elevated compositional mapping across bilateral fronto-temporo-parietal hubs indicates distributed integration of linguistic information, consistent with system-wide access/integration phenomena relevant to consciousness theories of global broadcasting and binding (a sketch of the underlying contrast follows this entry's limitations).
"achieved by the compositional embedding is observed in a large number of bilateral brain regions, and peaks around 1 s after word onset (Fig. 3c, d)."
Tracking the sequential generation of language representations over time and space, p. 3
The late (~1 s) bilateral peak suggests integration over extended time, aligning with notions of global, unified representations rather than strictly local, transient encoding.
Figures
Fig. 3 (p. 3): Comparing embeddings shows compositional signals best predict activity across a wide network, illustrating integrated content across regions.
Limitations: Mapping is linear and correlational; bilateral effects could partly reflect task structure or feedback and may not directly index conscious access.
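A minimal sketch of the contrast behind these ΔR values: fit the same linear brain-score mapping separately for a contextual ("compositional") embedding and a context-free ("lexical") embedding, then take the per-region difference. The scoring helper, region names, and synthetic signals are illustrative assumptions, not the authors' pipeline.

```python
# Illustrative sketch of the compositional-vs-lexical contrast (ΔR): score a
# contextual embedding and a context-free embedding against each region's
# responses with the same linear mapping, then compare. Synthetic data only.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(1)
n_words, n_dims, n_voxels = 500, 64, 10

def brain_score(X, Y):
    """Cross-validated correlation between ridge predictions and responses."""
    Y_hat = cross_val_predict(RidgeCV(alphas=np.logspace(-2, 4, 7)), X, Y, cv=5)
    return float(np.mean([np.corrcoef(Y_hat[:, v], Y[:, v])[0, 1]
                          for v in range(Y.shape[1])]))

context = rng.standard_normal((n_words, n_dims))     # sentence-context signal
lexical = rng.standard_normal((n_words, n_dims))     # word-identity signal
compositional = 0.7 * context + 0.3 * lexical        # toy contextual embedding

regions = ["superior temporal", "angular", "infero-frontal", "dorsolateral prefrontal"]
for region in regions:
    # Toy regional responses mixing contextual and lexical variance.
    Y = (0.6 * context[:, :n_voxels] + 0.4 * lexical[:, :n_voxels]
         + rng.standard_normal((n_words, n_voxels)))
    delta = brain_score(compositional, Y) - brain_score(lexical, Y)
    print(f"{region:>24s}: ΔR = {delta:+.3f}")
```

Because the toy responses contain context-dependent variance that the lexical embedding cannot capture, ΔR comes out positive in every region, which is the pattern the quote reports across the bilateral network.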
Emergent Dynamics
Brain-likeness emerges with training: brain scores increase with language accuracy, though random networks already show nonzero mappings.
"Second, brain scores strongly correlate with language accuracy in both MEG (R= 0.77 Pearson’s correlation on average ± 0.01 across subjects) and fMRI (R= 0.57 ± 0.02, Fig. 4b, c)."
The emergence of brain-like representations predominantly depends on the algorithm’s ability to predict missing words, p. 3
Training-driven improvements in prediction foster brain-like representations, indicative of emergent dynamics where representational geometry reorganizes with learning.
"We observe three main findings. First, random embeddings systematically lead to significant brain scores across subjects and architectures. The mean fMRI score across voxels is R= 0.019 ± 0.001, p < 10−16. The mean MEG score across channels and time sample is R= 0.018 ± 0.0008, p < 10−16."
The emergence of brain-like representations predominantly depends on the algorithm’s ability to predict missing words, p. 3
Even untrained networks partially align with brain signals, but stronger alignment emerges as models learn, suggesting graded emergence rather than a sharp threshold.
"We froze the networks at ≈100 training stages (log distributed between 0 and 4, 5 M gradient updates, which corresponds to ≈35 passes over the full corpus), resulting in 3600 networks in total, and 32,400 word representations (one per layer)."
Deep language transformers, p. 6
Reporting ≈100 training stages spanning ≈4.5 M gradient updates anchors the timescale of representational emergence across training for AI–brain comparisons (see the sketch after this entry's limitations).
Limitations: Correlations may depend on task and dataset (Dutch Wikipedia); emergence is assessed via linear mapping and does not establish causal similarity of mechanisms.
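To make the emergence analysis concrete, the sketch below correlates per-checkpoint brain scores with language accuracy across training, with an untrained baseline above zero, as in Fig. 4b, c. The checkpoint values are synthetic placeholders; only the analysis shape (a Pearson correlation across ≈100 checkpoints) follows the paper.

```python
# Illustrative sketch of the emergence analysis: across ~100 training
# checkpoints, correlate each checkpoint's brain score with its language
# accuracy (Pearson R, as in Fig. 4b, c). All checkpoint values are synthetic.
import numpy as np

rng = np.random.default_rng(2)
n_checkpoints = 100

# Toy learning curves: accuracy grows with training; the brain score tracks it,
# and even the untrained checkpoint has a small but nonzero brain score.
language_accuracy = (np.linspace(0.05, 0.45, n_checkpoints)
                     + 0.01 * rng.standard_normal(n_checkpoints))
brain_scores = (0.019 + 0.08 * language_accuracy
                + 0.005 * rng.standard_normal(n_checkpoints))

r = np.corrcoef(language_accuracy, brain_scores)[0, 1]
print(f"untrained-checkpoint brain score ≈ {brain_scores[0]:.3f}")
print(f"Pearson R(brain score, language accuracy) = {r:.2f}")
```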