Information Integration
AI
Model representations that pool over the last 10 words predict activity in both short- and long-context language regions, indicating integrated context representations.
"We investigate whether the four NLP models we consider are able to create an integrated representation of a text sequence by comparing the performance of encoding models trained with two kinds of representations: a token-level word-embedding ... and a 10-word representation corresponding to the 10 most recent words. ... ELMo, BERT, and T-XL long context representations predict subsets of both group 1 regions and group 2 regions."
Section 4: Interpreting long-range contextual representations, p. 6
By showing that 10-word (long-context) representations map onto both short- and long-context brain regions, the paper provides direct evidence that modern NLP models integrate information across distributed inputs, in alignment with neural data.
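The encoding-model comparison can be sketched as follows. This is an illustrative toy example, not the paper's code: the function names (`context_representation`, `fit_ridge`), the concatenation-based pooling of the 10 most recent words, and the closed-form ridge fit are all assumptions.

```python
import numpy as np

def context_representation(embeddings: np.ndarray, window: int = 10) -> np.ndarray:
    """Stack the `window` most recent word embeddings for each time point,
    zero-padding the start of the sequence."""
    n, d = embeddings.shape
    padded = np.vstack([np.zeros((window - 1, d)), embeddings])
    return np.stack([padded[i:i + window].reshape(-1) for i in range(n)])

def fit_ridge(X: np.ndarray, y: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression: one linear map from features to all voxels."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
emb = rng.standard_normal((200, 16))    # toy word-level embeddings (200 words, dim 16)
brain = rng.standard_normal((200, 5))   # toy voxel responses (200 words, 5 voxels)

X_long = context_representation(emb, window=10)  # long-context features
W = fit_ridge(X_long, brain)                     # encoding-model weights
pred = X_long @ W                                # predicted voxel activity
```

In the actual analysis the two feature sets (token-level vs. 10-word context) would each be fit this way and compared on held-out prediction accuracy per voxel.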
Figures
Figure 3 (p. 5): Quantifies that longer-context representations explain more voxels in regions selective for long-range context, consistent with integrated representations that support system-wide access.
Figure 4 (p. 6): Performance rises with more context, especially in mid-layers, indicating the integration of distributed inputs into unified representations as context grows.
Limitations: Alignment is correlational and relies on a linear mapping; interpretations about integration assume that predictive alignment indicates shared representational content.
Selective Routing
AI
Replacing learned attention with uniform attention at a single layer changes brain-prediction performance: shallow layers improve while deep layers degrade.
"We further investigate the effect of attention across different layers by measuring the negative impact that removing its learned attention has on its brain prediction performance. Specifically we replaced the learned attention with uniform attention over the representations from the previous layer... The performance of deep layers, other than the output layer, is harmed by the change in attention. However, surprisingly... shallow layers benefit from the uniform attention for context lengths up to 25 words."
Effect of attention on layer representation, p. 7
Attention acts as a routing mechanism: altering the gating (attention weights) causally changes what information is available for prediction, with layer-dependent effects consistent with selective routing dynamics in transformers.
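The intervention amounts to overwriting the learned softmax weights with a uniform 1/n at one layer. A minimal single-head NumPy sketch (illustrative only; the paper's experiments use BERT's multi-head attention, and all names here are hypothetical):

```python
import numpy as np

def attention(q: np.ndarray, k: np.ndarray, v: np.ndarray,
              uniform: bool = False) -> np.ndarray:
    """Scaled dot-product attention over one head.
    `uniform=True` replaces the learned weights with an equal weight on
    every previous-layer position (the ablation described in the text)."""
    n, d = q.shape
    if uniform:
        weights = np.full((n, n), 1.0 / n)          # every position attends equally
    else:
        scores = q @ k.T / np.sqrt(d)
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        e = np.exp(scores)
        weights = e / e.sum(axis=-1, keepdims=True)   # learned softmax weights
    return weights @ v

rng = np.random.default_rng(1)
q = k = v = rng.standard_normal((6, 8))   # toy sequence: 6 positions, dim 8
learned = attention(q, k, v)
ablated = attention(q, k, v, uniform=True)
# Under uniform attention, every output position is the mean of the value vectors.
```

The paper's measurement then compares brain-prediction performance of the layer's representations before and after this substitution.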
Figures
Figure 6 (p. 7): Directly visualizes how altering attention (routing) modulates representational efficacy across layers, evidencing selective control of information flow.
Limitations: Interventions are limited to uniform attention in BERT; results may not generalize to other routing manipulations or architectures without recurrence.
Representational Structure
AI
Layer-depth interacts with context length: middle layers best capture long-range context, while deepest layers favor short-range context.
"We observe that in all networks, the middle layers perform the best for contexts longer than 15 words. In addition, the deepest layers across all networks show a sharp increase in performance at short-range context (fewer than 10 words), followed by a decrease in performance."
Relationship between layer depth and context length, p. 7
These trends indicate a structured organization of representations across layers, with mid-layers encoding broader contextual features and deeper layers focusing on local detail—consistent with representational subspaces that vary with depth.
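The depth-context trend comes from sweeping encoding performance over a grid of layers and context lengths. A schematic of that sweep, with the real train/evaluate loop stubbed out by a random placeholder (`encoding_score` is a hypothetical name, and the layer/context values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

def encoding_score(layer: int, context_len: int) -> float:
    """Placeholder for: extract layer `layer` representations pooled over
    `context_len` words, fit an encoding model, return held-out accuracy."""
    return rng.random()

layers = range(1, 13)                     # e.g. the 12 layers of a BERT-base model
context_lengths = [1, 5, 10, 15, 25, 50]  # words of preceding context

# One score per (layer, context length) cell.
grid = np.array([[encoding_score(l, c) for c in context_lengths] for l in layers])

# Which depth wins at each context length (1-indexed layer numbers).
best_layer_per_context = 1 + grid.argmax(axis=0)
```

With real scores, the paper's observation would appear as `best_layer_per_context` settling on middle layers for contexts longer than 15 words, with the deepest layers peaking only at short contexts.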
Figures
Figure 5 (p. 7): Adjusting for layer-1 performance alters the depth-context profile, revealing how representational geometry shifts across layers and models.
Limitations: Comparisons across architectures are interpretive and depend on alignment quality; representational attributions are based on predictive mapping rather than direct causal readouts.
Temporal Coordination
BIO
MEG latencies show early visual-letter processing (~100 ms) and later part-of-speech-related responses (~200 ms) during word reading.
"the number of letters of a word and its ELMo embedding predict a shared portion of brain activity early on (starting 100ms after word onset) ... Further, a word’s part of speech and its ELMo embedding predict a shared portion of brain activity around 200ms after word onset in the left front of the MEG sensor."
Evaluation of predictions; Proof of concept, p. 5
These time-locked signatures support temporally coordinated processing stages (visual feature processing then linguistic categorization), anchoring timing mechanisms relevant to binding and segmentation in language comprehension.
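The "shared portion of brain activity" can be estimated by a variance-partitioning scheme: compare the R² of each feature set alone against their union at a given latency. A toy single-timepoint sketch on synthetic data (the subtraction-based shared-variance estimate shown here is one common approach, assumed for illustration rather than taken from the paper):

```python
import numpy as np

def r2(X: np.ndarray, y: np.ndarray, alpha: float = 1.0) -> float:
    """In-sample R² of a ridge fit from features X to one sensor's response y."""
    d = X.shape[1]
    w = np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)
    resid = y - X @ w
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(3)
n_words = 300
letters = rng.integers(2, 10, size=(n_words, 1)).astype(float)  # word length feature
embed = rng.standard_normal((n_words, 8))                       # toy embedding feature
# Synthetic MEG response at one sensor/latency, driven partly by word length.
meg_t = 0.5 * letters[:, 0] + rng.standard_normal(n_words)

# Shared variance at this latency: R²(A) + R²(B) − R²(A ∪ B).
shared = (r2(letters, meg_t) + r2(embed, meg_t)
          - r2(np.hstack([letters, embed]), meg_t))
```

Repeating this at each post-onset latency yields the time course in which letter-count-related shared variance emerges around 100 ms and part-of-speech-related shared variance around 200 ms.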
Limitations: MEG results are a proof-of-concept focused on specific features (letters, part of speech) and reference supplementary analyses; generalization to richer semantics is not established here.