In vitro neurons learn and exhibit sentience when embodied in a simulated game-world

Emergent Dynamics

Closed-loop embodiment elicits rapid learning with increased functional plasticity and entropy shifts.

"Figure 7I suggests that closed-loop training during Gameplay displays significantly increased plasticity compared with baseline plasticity measured at Rest before training, indicating that functional plasticity was upregulated during gameplay (Table S1)."

RESULTS, p. 3962

Increased functional plasticity during real-time gameplay indicates system-level reconfiguration consistent with learning-driven dynamics. Such reconfiguration is relevant to the complexity measures associated with consciousness in neural systems, and potentially to AI-inspired embodiments.

"For the studies reported in Figure 5, the mean information entropy was found to be lower during Gameplay than during Rest, both before and after the unpredictable feedback stimulation (Figure 7J and Table S1)."

RESULTS, p. 3962

Lower entropy during closed-loop gameplay than at rest, together with increases after unpredictable feedback, supports emergent, context-sensitive dynamics. Such features are often sought as neural signatures of conscious processing and adaptive control.

Figures

Figure 7 (p. 3962): Entropy modulation with gameplay and feedback is evidence of emergent dynamics as the culture adapts to predictive structure, a property relevant to theories linking complexity and consciousness.

Limitations: In vitro monolayers with coarse stimulation may not generalize to in vivo global dynamics; claims about ‘sentience’ are interpretive and lack direct report-based corroboration.
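
The entropy comparison quoted above can be illustrated with a minimal sketch. It assumes that "information entropy" means the Shannon entropy of the empirical distribution of binned spike counts; the excerpt does not give the paper's exact definition, so this is an assumption. The function name and the toy spike counts are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def shannon_entropy(spike_counts):
    """Shannon entropy (in bits) of the empirical distribution of binned spike counts."""
    if not spike_counts:
        raise ValueError("need at least one time bin")
    total = len(spike_counts)
    freq = Counter(spike_counts)  # how often each count value occurs
    return -sum((n / total) * math.log2(n / total) for n in freq.values())


# Hypothetical toy data (spike counts per time bin), not values from the paper.
rest_counts = [0, 3, 1, 5, 2, 4, 0, 6]      # counts spread over many values
gameplay_counts = [2, 2, 3, 2, 2, 3, 2, 2]  # counts concentrated on few values

h_rest = shannon_entropy(rest_counts)
h_gameplay = shannon_entropy(gameplay_counts)
```

In this toy case the concentrated distribution yields the lower entropy, which is the direction of the Gameplay versus Rest difference described in the quote. The sketch says nothing about the statistical test used in the paper.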

Causal Control

Structured closed-loop feedback causally shapes learning; removing or altering feedback degrades performance.

"We have emphasized the requirement for embodiment in neural systems for goal-directed learning to occur. This is seen in the relative performance over experiments, where denser information and more diverse feedback impacted performance. Likewise, when no feedback was provided yet information on ball position was available, cultures showed significantly poorer performance and no learning."

DISCUSSION, p. 3964

Feedback-contingent embodiment exerts causal control over neural computations and behavior, paralleling causal intervention studies that seek mechanistic links to conscious access and control in brains and AI systems.

Figures

Figure 6 (p. 3962): Performance differences across the Stimulus, Silent, and No-feedback conditions indicate that feedback manipulation causally gates learning-related behavior.

Limitations: Control conditions differ in stimulation statistics; causal attributions rely on behavioral surrogates rather than direct readouts of internal computational goals.

Information Integration

The closed-loop system reads out neural activity and writes sensory input, so that neural actions condition subsequent sensory input in real time.

"DishBrain (Figure S2) was designed to integrate these functions to “read” information from and “write” sensory data to a neural culture in a closed-loop system so neural “action” influences future incoming “sensory” stimulation in real time."

Building a modular, real-time platform to harness neuronal computation, p. 3956

Real-time bidirectional coupling integrates distributed neural activity into a unified, behaviorally relevant loop, analogous to the global access and integration motifs discussed in consciousness models and AI attention architectures.

Limitations: Integration is demonstrated at the sensorimotor loop level rather than via identified large-scale binding or workspace-like signatures across distinct neural subsystems.
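
The "read" and "write" loop described in the quote can be sketched as a toy simulation. This is not the DishBrain implementation: the culture is replaced by a random stand-in, and every name here (`read_motor_activity`, `run_closed_loop`, the eight stimulation sites, the "predictable" label for feedback after a hit) is a hypothetical choice made for illustration. The only property the sketch aims to show is that the decoded action determines which feedback arrives next.

```python
import random


def read_motor_activity(stimulus_site, n_sites, rng):
    """Stand-in for the "read" step: spike counts for two hypothetical motor regions."""
    # Placeholder dynamics only; a real culture is not modelled here.
    bias = 2 if stimulus_site >= n_sites // 2 else 0
    up = rng.randint(0, 10) + bias
    down = rng.randint(0, 10) + (2 - bias)
    return up, down


def run_closed_loop(steps=200, n_sites=8, seed=0):
    """Run a toy read/write loop and log (ball, paddle, feedback) per step."""
    rng = random.Random(seed)
    ball = paddle = n_sites // 2
    log = []
    for _ in range(steps):
        # "Write": the ball position selects the stimulation site.
        up, down = read_motor_activity(ball, n_sites, rng)
        # "Read": the more active region moves the paddle.
        if up > down:
            paddle = min(n_sites - 1, paddle + 1)
        elif down > up:
            paddle = max(0, paddle - 1)
        # The ball drifts randomly in this toy world.
        ball = max(0, min(n_sites - 1, ball + rng.choice((-1, 0, 1))))
        hit = abs(ball - paddle) <= 1
        # The outcome of the action determines the next feedback.
        feedback = "predictable" if hit else "unpredictable"
        log.append((ball, paddle, feedback))
    return log


log = run_closed_loop()
miss_rate = sum(f == "unpredictable" for _, _, f in log) / len(log)
```

Because the stand-in does not learn, `miss_rate` here only reflects chance; in the study, the question was whether a real culture changes this rate over time.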

Valence and Welfare

Cultures adapt to avoid unpredictable stimulation delivered after ‘misses,’ consistent with minimizing aversive surprise.

"We therefore hypothesize that when provided a structured external stimulation simulating the classic arcade game “Pong” within the DishBrain system, the BNN would modify internal activity to avoid adopting states linked to unpredictable external stimulation."

RESULTS, p. 3954

The design treats unpredictability as a negative outcome that the system learns to avoid, aligning with aversive-cost signals and ‘negative reward’ channels relevant to welfare-sensitive interpretations of adaptive behavior in embodied systems.

Figures

Figure 7 (p. 3962): Entropy suppression during successful play and increases after unpredictable feedback are consistent with minimizing aversive surprise in a feedback-defined valence landscape.

Limitations: ‘Valence’ is inferred from behavioral adaptation to feedback statistics without direct affective or nociceptive markers; in vitro systems lack many substrates typically associated with welfare.

State Transitions

Correlations between motor regions change over early minutes of gameplay, then stabilize, suggesting a regime shift.

"The correlation between the two motor regions was found to vary substantially over time (Figure 7G). A linear regression of the correlation in 100ms-time bins between motor regions was found to decrease with time significantly until approximately 5 min of gameplay ... After this point, little further change was observed ... suggesting a degree of homeostasis."

RESULTS, p. 3962

A shift from variable to stabilized correlation structure within minutes resembles a transition between processing regimes, consistent with state-switch phenomena discussed in consciousness dynamics and learning phase changes in AI.

Figures

Figure 7 (p. 3962): Early-time reconfiguration followed by stabilization indicates a transition to a new dynamical regime during learning.

Limitations: Correlation-based metrics do not uniquely specify underlying causal mechanisms; stabilization could reflect fatigue or other non-specific processes without simultaneous mechanistic assays.
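
The analysis quoted above, a linear regression of the inter-region correlation over time, can be sketched as follows. The sketch assumes that Pearson correlation is computed within consecutive windows of binned spike counts and that a least-squares slope is then fitted to the resulting series; the window size, function names, and synthetic data are hypothetical, and the paper's exact procedure may differ.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def windowed_correlation(region_a, region_b, bins_per_window):
    """Correlation between two regions in consecutive, non-overlapping windows."""
    w = bins_per_window
    return [pearson(region_a[i:i + w], region_b[i:i + w])
            for i in range(0, len(region_a) - w + 1, w)]


def slope(values):
    """Least-squares slope of values against their index (time order)."""
    n = len(values)
    mt, mv = (n - 1) / 2, sum(values) / n
    num = sum((t - mt) * (v - mv) for t, v in enumerate(values))
    den = sum((t - mt) ** 2 for t in range(n))
    return num / den


# Synthetic data (assumption): region_b starts as a copy of region_a and is
# progressively replaced by independent noise, so the correlation decays.
rng = random.Random(0)
n_windows, bins_per_window = 10, 50
n_bins = n_windows * bins_per_window
region_a = [rng.random() for _ in range(n_bins)]
noise = [rng.random() for _ in range(n_bins)]
region_b = [(1 - i / n_bins) * a + (i / n_bins) * e
            for i, (a, e) in enumerate(zip(region_a, noise))]

corrs = windowed_correlation(region_a, region_b, bins_per_window)
trend = slope(corrs)  # a negative value indicates decreasing correlation over time
```

To probe the stabilization described in the quote, the same `slope` function could be applied separately to the early and late portions of `corrs`; a slope near zero in the late portion would be consistent with little further change.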