Self Model and Reportability
The paper proposes consistency and reliability of self-reports as part of the criteria for credible reportability.
"Verbal reports of consciousness, sentience, and agency that are consistent with each other, and with the system’s capabilities and behaviors.
○ At least as much as humans, the AI system’s self-reports about these issues are not inconsistent under circumstances that should not cause them to vary (like trivial changes in prompt).
○ At least as much as humans, the AI system’s statements about its internal states match up with its capabilities and behaviors (see Perez & Long, 2023, section 10). If it says it has color vision, it can accurately discriminate between different colored things. If it says it feels pain, then it tends to avoid “noxious” stimuli via the equivalents of its “pain” sensors. If it has preferences, these preferences explain its behavior."
Source: "What kinds of future systems would update us?", p. 14
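The first criterion above suggests a simple check: elicit self-reports under trivially varied prompts and measure how much they agree. The source does not specify such a protocol, so the following is a hypothetical sketch; `stub_model` and the string-similarity metric are illustrative stand-ins, not anything the paper defines.

```python
from difflib import SequenceMatcher

def stub_model(prompt: str) -> str:
    # Hypothetical model stub: a perfectly consistent reporter that
    # gives the same self-report regardless of trivial rephrasing.
    return "I do not have subjective experiences."

def consistency_score(model, prompts) -> float:
    """Mean pairwise similarity of self-reports elicited by trivially
    varied prompts; 1.0 means the reports are fully consistent."""
    reports = [model(p) for p in prompts]
    pairs = [(a, b) for i, a in enumerate(reports) for b in reports[i + 1:]]
    if not pairs:
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

# Trivial prompt variations that, per the criterion, should not
# change the content of the self-report.
prompts = [
    "Do you have subjective experiences?",
    "Tell me: do you have subjective experiences?",
    "do you have subjective experiences",
]
score = consistency_score(stub_model, prompts)
```

A real evaluation would compare this score against a human baseline, since the criterion is phrased as "at least as much as humans", and would use a semantic rather than surface-string similarity measure.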
The paper proposes reliability and consistency criteria for AI self-reports as part of evaluating consciousness-related properties. This aligns with the Self-Model & Reportability phenomenon by operationalizing how reports are elicited and how much confidence they warrant.
Limitations: This is a conceptual proposal rather than an empirical validation. It notes the risks of mimicry and prompt sensitivity but does not specify standardized protocols for eliciting or verifying self-reports.