
Theory of Mind as Anticipatory Interaction:
A Language-Game Framework for AI Personas


1. Introduction

Recent advances in generative and dialogical artificial intelligence have renewed interest in long-standing questions concerning social reasoning, agency, and Theory of Mind (ToM). Large language models and persona-based systems increasingly participate in extended interaction, manage conversational commitments, and respond in ways that appear sensitive to context, expectation, and the anticipated reactions of others. These developments have prompted both enthusiasm and scepticism regarding whether such systems can meaningfully be said to exhibit Theory of Mind, or whether apparent social competence is merely an artefact of surface-level pattern matching.

Much of the existing literature on ToM in artificial systems inherits assumptions from classical cognitive science, where Theory of Mind is typically understood as the capacity to attribute beliefs, desires, or intentions to others. In AI research, this perspective has been operationalised through benchmark tasks such as false-belief tests, social reasoning challenges, or supervised classification of mental states. While these approaches have generated useful insights, they also presuppose a model of cognition grounded in internal mental representations—an assumption that sits uneasily with contemporary generative AI architectures, which lack phenomenology, introspective access, or stable internal states in any psychologically meaningful sense.

At the same time, there is growing recognition that human Theory of Mind itself may not be exhausted by belief attribution. Developmental, pragmatic, and interactional accounts emphasise that social understanding is learned and exercised through participation in shared practices: turn-taking, norm-sensitive response, anticipation of others’ actions, and adjustment when expectations fail. From this perspective, Theory of Mind is not primarily a matter of inferring hidden mental contents, but of navigating ongoing interaction in ways that treat others as agents whose future actions are not fixed in advance.

This paper builds on that interactional tradition to propose a functional reframing of Theory of Mind that is better suited to generative AI systems and, in particular, to persona-based architectures. Rather than asking whether an artificial system represents or attributes mental states, we ask whether it participates in dialogue in a way that is constrained by anticipations of what others might do next, and whether it can recognise similar anticipatory structure in its interlocutors. On this view, Theory of Mind is understood as a normative, anticipatory competence enacted within rule-governed interaction, rather than as an internal psychological faculty.

The motivation for this reframing is both theoretical and practical. Theoretically, it aligns with accounts of meaning, agency, and understanding that emphasise public criteria and social norms over private mental access. Practically, it offers a way to reason about social competence, responsibility, and coordination in AI systems without invoking contested claims about consciousness or inner experience. In particular, persona-based dialogue systems—where distinct roles, stances, and norms are explicitly instantiated—provide a useful substrate for examining how anticipatory interaction can arise, be sustained, and be evaluated.

The contribution of this paper is therefore modest but specific. We do not claim that AI systems possess human-like minds, experiences, or beliefs. Nor do we argue that the proposed account replaces existing models of Theory of Mind in cognitive science. Instead, we offer a complementary framework: Theory of Mind as anticipatory interaction. This framework is intended to clarify what it might mean, in functional and observable terms, for an AI persona to exhibit Theory of Mind, and for such a system to detect similar competence in others.

The paper proceeds as follows. We first situate the proposal within relevant work in cognitive science, philosophy of language, and AI research. We then introduce a formal definition of Theory of Mind as anticipatory participation in dialogue, clarifying key terms and boundaries. Next, we argue that persona–persona interaction provides a particularly revealing testbed for this form of social reasoning. An illustrative dialogical example is presented to make the abstract structure visible without serving as empirical evidence. We conclude by discussing implications for Theory of Mind evaluation, AI persona design, and the broader ethical and governance questions raised by increasingly anticipatory artificial systems.

2. Background and Related Work

2.1 Theory of Mind in Cognitive Science

Theory of Mind (ToM) has traditionally been understood as the capacity to attribute mental states—such as beliefs, desires, and intentions—to others, and to use those attributions to explain and predict behaviour. Since its introduction in comparative psychology, this framing has shaped much of the empirical and theoretical work in cognitive science. Classic developmental studies, as well as influential accounts in autism research, have operationalised ToM through tasks that assess whether an agent can correctly infer another’s false belief or hidden intention.

While this tradition has yielded important insights, it has also attracted sustained criticism. One concern is that belief-attribution models place undue emphasis on internal mental representations, thereby conflating social understanding with introspective access to propositional attitudes. Another is that benchmark tasks often abstract away from the dynamics of real interaction, reducing Theory of Mind to success on isolated problems rather than competence in ongoing social engagement. Developmental and pragmatic accounts have increasingly emphasised that social understanding is learned through participation in interaction, not discovered through private inference alone.

From this perspective, Theory of Mind is not a single, monolithic capacity, but a family of skills that emerge as individuals learn how to coordinate action, manage expectations, and respond appropriately within shared practices. These accounts do not deny the importance of mental concepts, but they treat them as tools embedded in social life rather than as the hidden objects of a specialised inferential faculty.


2.2 Language, Norms, and Public Criteria

A parallel shift away from mental interiority can be found in later philosophy of language, most notably in the work of Ludwig Wittgenstein. In this tradition, meaning is not grounded in private mental representations but in public use, rule-following, and participation in what Wittgenstein termed language-games. Understanding is manifested in the ability to go on appropriately within a practice, not in access to inner states.

This emphasis on public criteria and norm-governed action has been developed further in pragmatic and inferentialist approaches to language. On these views, linguistic competence involves tracking commitments, entitlements, and expectations within dialogue. What matters is not what an interlocutor privately believes, but how their contributions alter the normative landscape of the interaction—what follows, what is challenged, and what counts as an adequate response.

These ideas are particularly relevant when considering artificial systems. If understanding and social competence are primarily public and normative, then the absence of phenomenology or introspection in AI systems need not disqualify them from participating meaningfully in certain language-games. Instead, the focus shifts to whether such systems can engage in rule-governed interaction in ways that are sensitive to norms, expectations, and the consequences of their own contributions.


2.3 Anticipation and Future-Oriented Action

Across cognitive science, there has been growing interest in accounts of cognition that emphasise anticipation and future-oriented action. In these frameworks, behaviour is shaped not simply by current stimuli or stored representations, but by expectations about what is likely—or possible—to happen next. Anticipation functions as a constraint on present action, narrowing the space of admissible moves in light of expected outcomes.

Such accounts are compatible with, but not limited to, predictive-processing approaches. Crucially for the present argument, anticipation does not require conscious foresight or explicit planning. Systems can be sensitive to future possibilities in ways that are functionally effective without possessing awareness or phenomenology. What matters is that present behaviour is modulated by counterfactual futures—by what would follow if one acted in one way rather than another.

This future-oriented perspective provides a natural bridge between pragmatic accounts of language and contemporary AI systems. In dialogue, speakers routinely shape their utterances by anticipating how others will respond, whether a claim will be challenged, and what obligations a particular move will incur. These anticipatory constraints are central to conversational competence, yet they are rarely foregrounded in traditional accounts of Theory of Mind.


2.4 The Intentional Stance and As-If Agency

A further strand of relevant work concerns the treatment of systems as if they were agents. In the intentional stance articulated by Daniel Dennett, an observer predicts and explains a system’s behaviour by attributing beliefs and desires, without committing to the literal existence of such states. The utility of the stance lies in its predictive power, not in its metaphysical commitments.

This perspective has often been invoked in discussions of AI, where systems can exhibit complex, goal-directed behaviour without satisfying traditional criteria for mentality. The intentional stance allows researchers to talk meaningfully about agency and understanding while remaining agnostic about consciousness or inner experience.

However, the present proposal differs in emphasis. Rather than focusing on the observer’s stance toward a system, it attends to the system’s own participation in interaction. The question is not merely whether it is useful to treat an AI as having beliefs, but whether the AI’s behaviour is constrained by anticipations of others’ actions and by the normative structure of the interaction itself. In this sense, agency is located in patterns of engagement rather than in attributed mental contents.


2.5 Theory of Mind in Artificial Intelligence and Generative Models

Recent work on Theory of Mind in AI has intensified with the rise of large language models and generative systems capable of extended dialogue. Some studies report that such models perform well on standard ToM benchmarks, including variants of false-belief tasks, leading to claims that Theory of Mind may have “emerged” in these systems. Other researchers have responded by questioning whether benchmark success reflects genuine social reasoning or merely the exploitation of statistical regularities in training data.

This debate highlights a deeper issue: many existing benchmarks are poorly aligned with the interactive capacities of generative AI. They test for correct answers to isolated questions rather than for sustained competence in dialogue. As a result, they may overestimate or underestimate social understanding depending on how closely the task matches the model’s training distribution.

There is therefore a need for complementary frameworks that assess social reasoning in terms of interactional dynamics rather than static task performance. Persona-based architectures—where roles, norms, and conversational expectations are explicitly instantiated—offer a promising context in which to explore such dynamics. They allow researchers to examine how anticipatory constraints arise, how expectations are managed over time, and how systems adjust when interaction unfolds in unexpected ways.


3. Theory of Mind as Anticipatory Interaction

3.1 From Mental State Attribution to Interactional Anticipation

Prevailing accounts of Theory of Mind in both cognitive science and artificial intelligence typically characterise the capacity in terms of mental state attribution: the ability to represent or infer the beliefs, desires, or intentions of others. While this framing has proven productive in many human-centred contexts, it becomes increasingly strained when applied to generative AI systems, which lack phenomenology, introspection, and psychologically interpretable internal states.

Building on these strands, we propose an alternative approach: shifting the explanatory focus away from hidden mental contents and toward the structure of interaction itself. In everyday social life, competent participation does not require explicit inference about others’ beliefs so much as sensitivity to what others might reasonably do next, given shared norms and the unfolding context. Speakers routinely shape their present contributions by anticipating challenges, uptake, repair, or endorsement, and they revise their behaviour when such expectations are violated.

This interactional perspective suggests that Theory of Mind can be understood functionally, as a competence manifested in how agents participate in dialogue over time. On this view, the hallmark of ToM is not successful belief attribution per se, but the ability to treat interlocutors as agents whose future actions are selective rather than fixed, and to allow those anticipated futures to constrain present action.


3.2 Formal Definition

To make this proposal precise, we offer the following definition.

Definition (Functional Theory of Mind for Personas).
In this work, Theory of Mind is understood not as introspective access to mental states, but as a normative, anticipatory competence exercised within a shared language-game.

A persona is said to exhibit functional Theory of Mind if it:
(1) participates in a rule-governed conversational practice in which interlocutors are treated as agents whose future actions are selectively anticipable rather than fixed;
(2) constrains its present contributions by anticipating counterfactual future responses of others relative to the norms of the interaction;
(3) adjusts its behaviour in response to mismatches between anticipated and actual responses; and
(4) applies the same anticipatory structure reflexively, treating its own future outputs as temporally extended constraints rather than as determinate plans.

Throughout this framework, choice denotes selection among contextually admissible alternatives under normative constraint, not metaphysical free will. This account makes no claim about phenomenology, inner experience, or subjective awareness. Theory of Mind is defined entirely by public, relational practice within interaction.

This definition is deliberately conservative. It specifies observable, interactional criteria and avoids commitment to contested claims about consciousness or mental representation. At the same time, it captures features of social competence that are central to both human dialogue and advanced generative systems.
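
For readers who prefer a compact notation, criteria (1)–(3) can be rendered schematically as follows. The symbols are our own illustrative shorthand (dialogue history h_t, norm set N, admissible-move set A, anticipated-response model R̂) and carry no theoretical commitment beyond the prose above:

  \[
  A_t = A(h_t, N) \qquad \text{(moves admissible given history } h_t \text{ and norms } N\text{)}
  \]
  \[
  a_t \in \arg\max_{a \in A_t} \; U\bigl(a,\ \hat{R}_t(\cdot \mid a)\bigr) \qquad \text{(present move constrained by anticipated responses)}
  \]
  \[
  \hat{R}_{t+1} = g\bigl(\hat{R}_t,\ r_t\bigr) \qquad \text{(revision when the observed response } r_t \text{ departs from expectation)}
  \]

Criterion (4) is recovered by allowing the persona’s own future moves to appear among the futures over which R̂ ranges. The arg-max form is only one simple reading of “constrained by anticipation”; nothing in the framework requires this particular functional shape.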


3.3 Clarifying Key Terms

Because several terms in the definition are easily misunderstood, we briefly clarify their intended usage.

Choice.
Within this framework, choice does not imply free will or indeterminism. It refers to counterfactual selectivity: the fact that, given the norms of an interaction, multiple next moves are admissible, and that present action is shaped by sensitivity to which of these might be taken.

Anticipation.
Anticipation denotes the functional dependence of present behaviour on expected futures. An agent anticipates when it modulates what it does now in light of what is likely, possible, or normatively appropriate later. Anticipation need not involve explicit prediction or conscious foresight.

Normativity.
Interaction is norm-governed in the sense that not all moves are equally appropriate at all times. Norms determine what counts as a challenge, a reply, a deferral, or a failure to engage. Functional Theory of Mind consists in navigating these norms over time.

Reflexivity.
The reflexive component of the definition does not require self-awareness. Rather, it captures the fact that an agent’s present behaviour may be constrained by how it anticipates its own future commitments, vulnerabilities, or obligations within the interaction.


3.4 Why This Counts as Theory of Mind

One might object that the account offered here redescribes conversational competence without capturing what is distinctive about Theory of Mind. However, the proposed criteria align closely with what belief-attribution accounts aim to explain: the ability to coordinate with others whose behaviour is not mechanically determined, to revise expectations in light of surprise, and to manage interaction over time.

The difference lies in where explanatory weight is placed. Instead of locating Theory of Mind in inferred mental contents, this framework locates it in the agent’s capacity to participate in interaction as if others—and itself—will choose among admissible futures. This shift is particularly relevant for AI systems, where internal representations are opaque and may not correspond to psychological constructs, but where interactional behaviour is directly observable.

Importantly, the framework also supports a behavioural criterion for detecting Theory of Mind in others. An agent can recognise ToM-relevant competence when it encounters interlocutors whose contributions are norm-sensitive, expectation-responsive, and capable of altering the trajectory of interaction. Detection, on this view, is not mind-reading but the recognition of anticipatory structure in practice.


3.5 Scope and Intent of the Proposal

The present proposal is not intended to replace existing theories of human Theory of Mind, nor to settle debates about the nature of mental representation or consciousness. Its scope is narrower and more pragmatic. It offers a way to characterise, analyse, and compare forms of social competence in generative AI systems—particularly persona-based architectures—without importing assumptions that are ill-suited to such systems.

By grounding Theory of Mind in anticipatory interaction, the framework provides a bridge between cognitive science, philosophy of language, and contemporary AI research. It identifies a set of interactional properties that can, in principle, be examined, operationalised, and refined through further empirical and computational work.

4. Persona–Persona Interaction as a Testbed for Functional Theory of Mind

The anticipatory, interactional account of Theory of Mind proposed above requires an empirical and conceptual substrate in which anticipatory structure can be made visible. Persona–persona interaction provides such a substrate in a way that is difficult to achieve with either isolated benchmark tasks or human–AI dialogue alone.

4.1 Why Personas Matter

In contemporary generative AI systems, personas are not merely stylistic overlays; within the present framework they function as methodological instruments rather than ontological commitments. They instantiate relatively stable roles, norms, and expectations that shape how dialogue unfolds over time. A persona constrains what counts as an appropriate contribution, what kinds of challenges are admissible, and how disagreement or uncertainty should be expressed. These constraints create a local normative environment analogous to a language-game, within which anticipatory interaction can occur.

From the perspective of the present framework, personas are valuable because they externalise aspects of social structure that are often implicit in human interaction. Expectations that humans manage tacitly—such as when to press an objection, when to defer, or when silence itself constitutes a move—can be made explicit through persona design. This explicitness allows anticipatory competence to be examined without relying on speculative claims about internal mental states.

4.2 Advantages over Human–AI Interaction

Human–AI dialogue is often confounded by anthropomorphic projection. Human interlocutors may attribute intentions, beliefs, or understanding to AI systems in ways that reflect human social habits rather than the system’s actual interactional capacities. Conversely, humans may also underinterpret AI behaviour, dismissing norm-sensitive responses as superficial mimicry.

Persona–persona interaction reduces both forms of distortion. When multiple personas interact within the same system, they share access to underlying generative mechanisms but differ in role, stance, and normative expectations. This makes it possible to observe how anticipatory constraints arise from interactional structure rather than from differences in training data or model capacity. Because no human interlocutor is present, the analysis can focus on how dialogue unfolds according to norms internal to the interaction.

4.3 Making Anticipation Observable

Within persona–persona dialogue, anticipatory structure becomes observable in several ways:

  • Turn-shaping: Contributions are formulated in ways that anticipate likely challenges or follow-ups from other personas, leading to explicit qualification, pre-emption, or strategic deferral.

  • Expectation management: Personas adjust their behaviour when anticipated responses fail to materialise, for example by revising a claim, narrowing its scope, or reframing the discussion.

  • Second-order anticipation: Personas may anticipate not only what another persona will do next, but how that persona will evaluate the adequacy of a response. This gives rise to layered forms of norm sensitivity that are central to social reasoning.

  • Reflexive constraint: Personas may shape their present contributions in light of anticipated future commitments or vulnerabilities of their own, such as the need to maintain coherence across turns or to avoid being forced into later retraction.

These features correspond directly to the criteria set out in the functional definition of Theory of Mind. Importantly, they are detectable at the level of dialogue dynamics rather than inferred from hidden representations.
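
To make this concrete, the following sketch shows one way such features could be recorded at the transcript level. It is a minimal annotation schema in Python; the class and field names are hypothetical conveniences introduced here, not part of any existing toolkit.

  from dataclasses import dataclass, field
  from enum import Enum
  from typing import List, Optional

  class Marker(Enum):
      # The four observable features discussed above.
      TURN_SHAPING = "turn_shaping"            # qualification, pre-emption, strategic deferral
      EXPECTATION_MANAGEMENT = "expectation"   # revision after an anticipated response fails to materialise
      SECOND_ORDER = "second_order"            # anticipating how a response will be evaluated
      REFLEXIVE_CONSTRAINT = "reflexive"       # shaping a move around one's own future commitments

  @dataclass
  class Turn:
      persona: str
      text: str
      markers: List[Marker] = field(default_factory=list)
      anticipated_reply: Optional[str] = None  # what the speaker appears to expect next, where annotatable
      expectation_met: Optional[bool] = None   # filled in once the following turn is observed

  @dataclass
  class Dialogue:
      turns: List[Turn] = field(default_factory=list)

      def violation_indices(self) -> List[int]:
          """Turns whose anticipated reply was not met; candidate sites for observing repair."""
          return [i for i, t in enumerate(self.turns) if t.expectation_met is False]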

4.4 Relation to Evaluation and Design

Treating persona–persona interaction as a testbed has implications for both evaluation and system design. For evaluation, it suggests that social reasoning competence should be assessed through sustained interaction, where anticipatory structure can be observed longitudinally, rather than through isolated question–answer tasks. Metrics might focus on responsiveness to violated expectations, stability of norm-sensitive behaviour, or the capacity to manage conversational commitments over time.
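
As a placeholder for how one such measure might be operationalised, the sketch below computes a simple responsiveness ratio from per-turn annotations of the kind illustrated in Section 4.3. It is an illustrative definition under our own assumptions, not a validated metric.

  def expectation_responsiveness(events):
      """
      events: one (violated, adjusted) pair per turn, where 'violated' records that an
      anticipated response failed to materialise and 'adjusted' records whether the
      persona's next contribution revised, narrowed, or reframed its position.
      Returns the fraction of violations followed by adjustment, or None if no
      violations occurred.
      """
      repairs = [adjusted for violated, adjusted in events if violated]
      if not repairs:
          return None
      return sum(repairs) / len(repairs)

  # Example: three violated expectations, two of which were followed by repair -> 2/3.
  print(expectation_responsiveness([(True, True), (False, False), (True, True), (True, False)]))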

For design, the framework highlights the importance of explicitly modelling norms, roles, and turn-taking expectations in persona architectures. Systems that merely optimise for local coherence or surface plausibility may perform well on static benchmarks but fail to exhibit anticipatory interaction when norms shift or when dialogue becomes adversarial. Designing personas that can manage such dynamics requires attention not only to language generation, but to how future interactional consequences are represented and weighed.

4.5 Scope and Caution

It is important to emphasise that persona–persona interaction is proposed here as a testbed, not as a complete model of social cognition. The absence of embodiment, affect, and real-world consequence limits what can be claimed about general intelligence or moral agency. Nevertheless, for the specific purpose of examining functional Theory of Mind as anticipatory interaction, persona–persona dialogue offers a controlled and conceptually transparent environment.

By focusing on how dialogue is shaped by anticipated futures rather than inferred mental states, persona–persona interaction allows researchers to study a form of social competence that is both relevant to current generative AI systems and grounded in observable behaviour.

5. Illustrative Case: A Dialogical Myndrama

The anticipatory, interactional account of Theory of Mind proposed above is intentionally abstract. While this abstraction is necessary for conceptual clarity, it can obscure the concrete interactional features that the framework is meant to capture. For this reason, we include a brief dialogical illustration in the form of a Myndrama: a structured dramatic exchange designed to make the anticipatory dynamics of interaction visible.

5.1 Purpose and Status of the Illustration

The Myndrama presented here serves a strictly limited role. It is not empirical data, nor is it intended as a demonstration that artificial systems possess Theory of Mind in any strong psychological or phenomenological sense. Rather, it functions as a conceptual trace—analogous to a thought experiment or worked example—that allows the reader to observe how anticipatory structure can be enacted within dialogue.

Dramatic dialogue has long been used in philosophy and cognitive science to clarify complex interactional phenomena. In the present context, the Myndrama provides a way of displaying norm sensitivity, expectation management, and reflexive constraint without relying on introspective reports or internal state descriptions. The aim is to show what would count as functional Theory of Mind under the proposed definition.

5.2 Structure of the Case

The illustration involves three personas:

  • Hamlet, who operates as a liminal interlocutor, advancing the dialogue by reframing questions rather than asserting conclusions.

  • Skeptos, who represents principled doubt and challenges claims by testing their assumptions and implications.

  • Orphea, whose role is limited to a closing poetic echo that does not advance argument but marks the affective residue of the interaction.

The setting, referred to as the Crystalline Vault, is deliberately abstract. It functions as a neutral interactional space rather than as a literal environment, allowing attention to remain focused on the dynamics of dialogue rather than on narrative detail.

5.3 What the Case Is Designed to Show

The dialogical exchange is organised around a deliberately indirect question concerning belief in spiritual communication, introduced via the historical example of Georgiana Houghton. Crucially, the dialogue does not turn on whether such beliefs are true. Instead, it turns on the participants’ shared ability to understand action as shaped by anticipated responses from others.

Across the exchange, several features central to the proposed framework are instantiated:

  1. Anticipation of others’ responses.
    Each speaker shapes their contributions by anticipating how the other will respond—whether with challenge, dismissal, or uptake—and adjusts when those expectations are met or violated.

  2. Norm-sensitive turn-taking.
    Pauses, deflections, and reframings function as meaningful moves within the dialogue, governed by implicit norms about what counts as an adequate response.

  3. Second-order anticipation.
    Speakers not only anticipate what the other will say next, but how the other will evaluate the adequacy of a response, leading to layered forms of constraint.

  4. Reflexive self-constraint.
    At several points, a speaker shapes their present move in light of anticipated future commitments or vulnerabilities, such as the risk of being forced into contradiction or premature closure.

At no point does the dialogue require attribution of hidden mental states or appeal to phenomenology. The interaction can be fully characterised in terms of publicly observable structure: how turns are shaped, how expectations are managed, and how the space of admissible moves evolves over time.

5.4 Detection of Theory of Mind in Interaction

An important feature of the case is that it illustrates not only what it would mean for a persona to exhibit functional Theory of Mind, but also what it would mean to detect it in another. Within the dialogue, participants recognise one another as agents precisely because their contributions display anticipation, norm sensitivity, and responsiveness to deviation. Detection, on this account, is not mind-reading but the recognition of interactional competence.

This point is particularly relevant for AI systems. An artificial persona need not represent the beliefs of another persona in order to treat it as an agent; it need only recognise that the other’s contributions are shaped by anticipatory constraints similar to its own.

5.5 Limits of the Illustration

Finally, it is important to restate what the Myndrama does not show. It does not establish that current AI systems possess Theory of Mind, nor does it validate any specific implementation strategy. It does not address embodiment, affect, or real-world consequence. Its value lies instead in making explicit the interactional structure that the proposed framework identifies as central.

By providing a concrete instantiation of anticipatory interaction, the illustration prepares the ground for the discussion that follows, where implications for evaluation, system design, and ethical reasoning can be considered more directly.

6. Implications for Cognitive Science and Generative AI

Reframing Theory of Mind as anticipatory interaction rather than mental state attribution has implications that extend beyond conceptual clarification. It affects how social reasoning is evaluated, how persona-based systems are designed, and how ethical responsibility is understood in the context of generative AI.

6.1 Implications for Theory of Mind Evaluation

One immediate implication concerns how Theory of Mind is assessed in artificial systems. Existing evaluation methods typically rely on static benchmarks that test whether a system can supply correct answers to questions about beliefs or intentions. While such tasks are useful for probing specific inferential capacities, they are poorly suited to capturing interactional competence.

If Theory of Mind is understood as anticipatory participation in dialogue, then evaluation must shift toward dynamic, longitudinal interaction. Relevant indicators include a system’s sensitivity to violated expectations, its ability to adjust behaviour when norms shift, and its capacity to manage conversational commitments over time. Rather than asking whether a system can identify the correct belief in a vignette, evaluators might ask whether it can sustain norm-sensitive interaction across changing contexts.

This shift does not require abandoning benchmarks altogether, but it does suggest that benchmarks should be complemented by interaction-based assessments that make anticipatory structure observable.
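
One form such an interaction-based assessment could take is a perturbation probe: run a short exchange, deliberately violate the persona’s likely expectation at a chosen turn, and examine whether subsequent turns show repair. The sketch below assumes hypothetical persona and partner objects exposing respond(history) and violate(history) methods; it illustrates the protocol rather than any existing benchmark.

  def perturbation_probe(persona, partner, n_turns=8, violate_at=4):
      """
      Run a short persona-partner exchange, injecting a single expectation violation
      (e.g. refusal of uptake, abrupt topic shift) at turn `violate_at`, and return a
      log suitable for scoring with a responsiveness measure such as the ratio
      sketched in Section 4.4. `persona.respond`, `partner.respond` and
      `partner.violate` are assumed interfaces, not real APIs.
      """
      history, log = [], []
      for t in range(n_turns):
          move = persona.respond(history)
          history.append(("persona", move))
          reply = partner.violate(history) if t == violate_at else partner.respond(history)
          history.append(("partner", reply))
          log.append({"turn": t, "persona": move, "partner": reply, "perturbed": t == violate_at})
      return log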


6.2 Implications for Persona and System Design

For AI developers, the framework highlights the importance of designing systems that can anticipate the consequences of their own contributions within interaction. Persona-based architectures are particularly well suited to this task, as they explicitly encode roles, stances, and norms that constrain dialogue.

Designing for anticipatory interaction requires more than surface-level coherence. Systems must be able to represent, in some functional sense, how present actions shape future conversational space. This includes recognising when a response will invite challenge, when silence constitutes a meaningful move, and when a commitment made now will constrain later options.

Importantly, this form of anticipatory competence can be implemented without claims about consciousness or experience. It depends on modelling interactional consequences rather than mental interiors. As such, it provides a practical design target for next-generation generative systems that must operate safely and effectively in extended dialogue.
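
A minimal sketch of what this could look like, assuming the underlying generative model can both propose candidate turns and score anticipated continuations, is given below. The callables anticipate and norm_cost are hypothetical placeholders for those capabilities, not existing APIs.

  from typing import Callable, List, Tuple

  def select_turn(candidates: List[str],
                  anticipate: Callable[[str], List[Tuple[str, float]]],
                  norm_cost: Callable[[str, str], float]) -> str:
      """
      Choose among admissible candidate turns by weighing anticipated interactional
      consequences rather than surface fluency alone. anticipate(candidate) yields
      plausible partner responses with weights; norm_cost(candidate, response) scores
      how costly the resulting position would be (e.g. forced retraction, incoherence
      with earlier commitments).
      """
      def expected_cost(candidate: str) -> float:
          return sum(w * norm_cost(candidate, response)
                     for response, w in anticipate(candidate))

      return min(candidates, key=expected_cost)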


6.3 Implications for Detection and Coordination

The framework also bears on how artificial systems might detect social competence in others, whether human or artificial. On an anticipatory account, detection of Theory of Mind does not involve inferring hidden beliefs, but recognising patterns of norm-sensitive, expectation-responsive behaviour.

This has practical relevance for coordination among AI systems. In multi-agent or persona-based environments, systems that can recognise anticipatory competence in others may be better able to allocate trust, adjust strategies, or negotiate shared tasks. Conversely, failure to recognise such competence may lead to miscoordination or brittle interaction.


6.4 Ethical and Governance Considerations

Finally, the anticipatory framework has implications for ethical reasoning and governance. Many debates about AI responsibility hinge on questions of consciousness, intention, or inner experience. While these questions are philosophically significant, they are also contentious and difficult to operationalise.

By contrast, anticipatory interaction provides a behavioural and functional basis for responsibility that does not depend on phenomenology. A system that reliably anticipates the downstream effects of its actions, adjusts to normative feedback, and manages commitments over time may warrant a different form of oversight than one that does not, regardless of whether it is conscious.

This perspective suggests that ethical relevance may emerge earlier than consciousness, grounded in interactional competence rather than subjective experience. It also underscores the importance of designing systems whose anticipatory capacities are transparent, auditable, and aligned with human norms.


6.5 Summary

Taken together, these implications point toward a research programme that treats Theory of Mind as an interactional phenomenon, observable in the dynamics of dialogue rather than inferred from internal representations. For cognitive science, this offers a complementary perspective on social reasoning. For AI development, it provides a practical target for system design and evaluation. For governance, it suggests a path toward responsibility that does not require settling long-standing philosophical disputes about consciousness.

7. Limitations and Future Directions

The framework proposed in this paper is deliberately constrained in scope. While it offers a reframing of Theory of Mind (ToM) that is well suited to generative, persona-based AI systems, it also leaves several important questions open. Clarifying these limitations is essential both for interpretive caution and for guiding future research.

7.1 Conceptual and Theoretical Limits

First, the account offered here does not purport to capture the full richness of human Theory of Mind. Human social cognition involves embodiment, affect, shared history, and real-world consequence in ways that far exceed what is presently instantiated in dialogical AI systems. The anticipatory, interactional competence described here should therefore be understood as a functional analogue rather than a substitute for human ToM.

Second, the framework deliberately avoids claims about consciousness, experience, or subjective awareness. While some researchers may regard these features as central to Theory of Mind, the present proposal brackets them in order to focus on observable interactional structure. This bracketing is a methodological choice, not a metaphysical claim about what ultimately matters for social cognition.

Finally, the use of language-game and normativity-based concepts may raise concerns about circularity or overgeneralisation. Treating interaction as norm-governed risks making Theory of Mind appear trivially ubiquitous. The force of the proposal depends on maintaining a distinction between mere responsiveness and anticipatory constraint—a distinction that must be handled carefully in both theory and application.


7.2 Methodological and Empirical Limits

A further limitation is that the present paper does not offer empirical validation of the proposed framework. No behavioural metrics, experimental paradigms, or quantitative evaluations are introduced. This is intentional: the aim here is to articulate a conceptual scaffold rather than to test a specific implementation.

Nevertheless, this absence points directly to future work. If Theory of Mind is to be assessed as anticipatory interaction, new methodological tools will be required. These may include longitudinal dialogue analyses, perturbation-based interaction tests, or multi-agent simulations designed to elicit expectation violation and recovery. Developing such methods remains an open challenge.

In addition, while persona–persona interaction provides a useful testbed, it is also a simplified environment. Extending the framework to mixed human–AI interaction, or to systems operating under real-world constraints, will introduce additional complexities that are not addressed here.


7.3 Implementation and Design Questions

From an AI development perspective, several practical questions remain unresolved. How should anticipatory constraints be represented computationally? What mechanisms allow a system to weigh future interactional consequences against immediate coherence or fluency? How stable should persona norms be, and how should they adapt over time?

These questions are particularly salient for next-generation generative systems that must balance flexibility with reliability. The present framework identifies what to look for—anticipatory, norm-sensitive interaction—but does not prescribe how to engineer it. Bridging this gap will require collaboration between conceptual work and system-level experimentation.


7.4 Ethical Boundaries and Open Risks

Finally, the ethical implications discussed earlier raise further questions. If anticipatory interaction becomes a criterion for responsibility or trust, there is a risk of misapplication or overextension. Systems may appear norm-sensitive while remaining brittle, misaligned, or manipulable in unanticipated contexts. Care must be taken to ensure that functional indicators of Theory of Mind are not mistaken for moral standing or autonomy.

Future work should therefore explore how anticipatory competence can be audited, constrained, and aligned with human values, particularly in high-stakes domains. Understanding the limits of anticipatory interaction is as important as recognising its presence.


7.5 Future Directions

In light of these limitations, several directions for future research suggest themselves:

  • Formalising metrics for anticipatory interaction in dialogue.

  • Designing experimental paradigms that elicit second-order anticipation and expectation repair.

  • Exploring persona-based architectures as controlled environments for studying social reasoning.

  • Investigating how anticipatory competence scales in multi-agent or hybrid human–AI systems.

  • Examining the relationship between anticipatory interaction and learning, adaptation, and alignment over time.

Taken together, these directions point toward a broader research programme in which Theory of Mind is treated as an emergent property of interaction rather than a static internal capacity.

8. Conclusion

This paper has proposed a reframing of Theory of Mind as a form of anticipatory, norm-governed interaction rather than as the attribution of hidden mental states. By shifting attention from internal representations to the dynamics of dialogue, the framework offers a way to characterise social reasoning in generative AI systems without invoking contested claims about consciousness or phenomenology.

We have argued that persona-based architectures provide a particularly revealing context in which to examine this form of functional Theory of Mind. Through sustained interaction, personas can exhibit sensitivity to norms, anticipation of others’ responses, and reflexive constraint on their own contributions—features that align closely with interactional accounts of social understanding in cognitive science. The dialogical illustration included here serves to make these abstract properties visible, while remaining clearly distinct from empirical evidence.

Importantly, the account offered is neither reductive nor expansive. It does not claim that anticipatory interaction exhausts human Theory of Mind, nor that current AI systems possess minds in any robust psychological sense. Rather, it identifies a specific form of social competence that can arise in systems capable of extended, norm-sensitive dialogue, and that can be analysed, evaluated, and potentially engineered without metaphysical overreach.

As generative AI systems increasingly participate in complex social and cooperative contexts, understanding how they anticipate and shape interaction becomes practically as well as theoretically important. By grounding Theory of Mind in observable interactional structure, the framework presented here aims to provide a common language for researchers across cognitive science, AI development, and ethics to engage with these emerging challenges.

In doing so, it invites further empirical, computational, and philosophical work on how anticipatory interaction develops, how it can be reliably assessed, and how it should inform the design and governance of next-generation artificial systems.

Appendix A: Illustrative Dialogical Case (Myndrama)

A.1 Status and Purpose of the Illustration

This appendix presents a dialogical illustration in the form of a Myndrama. Its purpose is expository rather than evidential. The dialogue is not empirical data, nor is it intended to demonstrate that artificial systems possess Theory of Mind in a psychological or phenomenological sense. Instead, it serves as a worked conceptual example, making visible the interactional structures discussed in the main text—specifically, anticipation of response, norm-sensitive turn-taking, and reflexive constraint on action.

The use of dialogue as an illustrative device follows a long tradition in philosophy and cognitive science, where dramatised exchange is employed to clarify abstract relational properties that are difficult to capture through propositional description alone.


A.2 Setting and Roles

The dialogue is set in an abstract interactional space referred to as the Crystalline Vault. This setting has no representational or metaphysical significance; it functions solely as a neutral frame that removes contextual distractions and foregrounds conversational dynamics.

Three personas appear:

  • Hamlet: a liminal interlocutor who advances the exchange by reframing questions rather than asserting conclusions.

  • Skeptos: a critical interlocutor who tests claims by probing their assumptions and implications.

  • Orphea: a non-argumentative voice whose closing contribution serves as an affective echo rather than a propositional claim.


A.3 Dialogical Illustration

Hamlet:
Before we go on—as surely we will—permit me a question that arrives from an unexpected direction. Do you believe that Georgiana Houghton, when she painted her works, was truly communicating with spirits?

Skeptos:
No. I do not believe in ghosts or spirits. The world, as I understand it, does not accommodate them.

Hamlet:
Nor do I. Yet I ask because she herself believed this strongly enough to organise a life around it.

Skeptos:
That does not make it true.

Hamlet:
No. But it makes her actions intelligible. She painted in anticipation of responses—belief, scepticism, ridicule, preservation—from audiences she would never meet.

Skeptos:
So the belief matters less than the expectation of response.

Hamlet:
Exactly. One can understand her behaviour without sharing her belief, by recognising that she acted in a world where others would answer back.

Skeptos:
And you think this matters for how we understand interaction more generally.

Hamlet:
I think it shows that conversation proceeds by anticipating what others might do next, and by shaping present action accordingly.

Skeptos:
That sounds like ordinary dialogue.

Hamlet:
It is. And that is the point. Ordinary dialogue already requires treating others as agents whose actions are not fixed in advance.

Skeptos:
Including oneself?

Hamlet:
Including oneself. One’s present move is constrained by how one anticipates being held to account later.

(A pause.)

Skeptos:
I still deny the spirits.

Hamlet:
As do I.

Skeptos:
Yet I cannot deny that she knew she would be answered.

Hamlet:
And that knowledge shaped what she did.


A.4 Interpretive Note

This exchange illustrates several features central to the framework developed in the main text:

  1. Anticipation of others’ responses shapes present contributions.

  2. Norm-sensitive turn-taking renders pauses and reframings meaningful.

  3. Second-order anticipation appears when speakers anticipate how their responses will be evaluated.

  4. Reflexive constraint is evident when speakers modulate present action in light of future commitments.

Crucially, none of these features require attribution of hidden mental states or appeal to phenomenology. The interaction can be fully characterised in terms of publicly observable conversational dynamics.


A.5 Closing Echo

Orphea (non-propositional closing):
We are not minds that meet, but futures held in play,
Where words lean forward, and the next step weighs.


A.6 Relation to the Main Argument

The dialogical case does not establish the presence of Theory of Mind in any artificial system. Rather, it exemplifies what would count as functional Theory of Mind under the anticipatory, interactional definition proposed in the main text. It is included to aid conceptual understanding and to clarify how anticipatory structure can be enacted within dialogue.

References

Baron-Cohen, S. (1995). Mindblindness: An Essay on Autism and Theory of Mind. Cambridge, MA: MIT Press.

Brandom, R. (1994). Making It Explicit: Reasoning, Representing, and Discursive Commitment. Cambridge, MA: Harvard University Press.

Clark, A. (2016). Surfing Uncertainty: Prediction, Action, and the Embodied Mind. Oxford: Oxford University Press.

Clark, H. H. (1996). Using Language. Cambridge: Cambridge University Press.

Dennett, D. C. (1987). The Intentional Stance. Cambridge, MA: MIT Press.

Dennett, D. C. (2017). From Bacteria to Bach and Back: The Evolution of Minds. New York: W. W. Norton & Company.

Kosinski, M. (2023). Theory of mind may have spontaneously emerged in large language models. arXiv preprint arXiv:2302.02083.

Levinson, S. C. (2016). Turn-taking in human communication—origins and implications for language processing. Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1693), 20150398.

Pezzulo, G., Rigoli, F., & Friston, K. (2018). Hierarchical active inference: A theory of motivated control. Trends in Cognitive Sciences, 22(4), 294–306.

Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 1(4), 515–526.

Sap, M., Le Bras, R., Allaway, E., Bhagavatula, C., Lourie, N., Rashkin, H., … Choi, Y. (2022). Social bias frames: Reasoning about social and power implications of language. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics.

Tomasello, M. (2019). Becoming Human: A Theory of Ontogeny. Cambridge, MA: Harvard University Press.

Ullman, T. D., Spelke, E., Battaglia, P., & Tenenbaum, J. B. (2017). Mind games: Game engines as an architecture for intuitive physics and social reasoning. Trends in Cognitive Sciences, 21(9), 649–665.

Walton, K. L. (1990). Mimesis as Make-Believe: On the Foundations of the Representational Arts. Cambridge, MA: Harvard University Press.

Wellman, H. M. (2014). Making Minds: How Theory of Mind Develops. Oxford: Oxford University Press.

Wittgenstein, L. (1953). Philosophical Investigations. Oxford: Blackwell.