What Makes a Heart Beat
The ask as a constitutive orientation in human language, living systems, and generative AI
There is a more basic question beneath many discussions of intelligence.
Before we ask what intelligence is, or what language is, it helps to begin somewhere simpler and more concrete: with dialogue. Human beings live by turn-taking. One speaks, another responds. A question is put, an answer is awaited. A gesture is made, a reaction follows. In that respect, one striking feature now shared by human–human and human–AI interaction is that both begin in the same way: one party addresses another.
Every dialogue has to be initiated. Someone has to speak first, call out, request, invite, appeal, query, or demand. That initiating act is what I mean by the ask. By an ask, I do not mean only an interrogative sentence. I mean the more fundamental act by which one mind, or one system, turns toward another in expectation of a response. The ask is the opening move that brings a dialogue into being.
This may be more fundamental than it first appears, especially in human–AI interaction. We are used to thinking of prompting as a practical input to a machine. But at a deeper level it may belong to a much older structure: the structure by which meaning becomes directional at all. An ask establishes relevance, uncertainty, and answerability. It creates a space in which something is missing, something matters, and something or someone may answer back.
Seen in this light, the ask is not just one linguistic act among others. It may be one of the constitutive conditions of dialogue itself.
The developmental story begins very early. A baby does not first encounter the world as a detached inventory of objects and only later discover communication. The baby enters a world of response. Hunger is met by feeding. Distress is met by holding, soothing, and voice. In moments of feeding, gaze, touch, rhythm, and bodily closeness, the infant is already participating in a primitive exchange long before words arrive. Mutual engagement with a responsive caregiver, the following of gaze, the first social smile, separation distress, and the gradual formation of attachment all belong to this early structure of answerability.
What the child learns first is not vocabulary but something more basic: that signals can be directed, that responses may follow, that absence and return matter, and that another person may be relied upon for comfort, help, and recognition. In psychological terms, this is the beginning of attachment; in interactional terms, it is the beginning of turn-taking under conditions of need, expectation, and reply. Language grows out of that older field rather than replacing it.
Something similar appears beyond the human case. Across the animal kingdom, researchers have found orderly exchanges of communicative signals—forms of turn-taking in birds, mammals, insects, and amphibians. Social animals also live by patterned signalling, response, reassurance, warning, and repair. I do not mean that animal communication is language in the full human sense. Only that the deeper structure of asking and answering may be older than human language, and that human dialogue may have grown out of more ancient forms of responsive interaction.
In the human case, this older structure was progressively elaborated rather than replaced. What began in the intimate exchanges of caregiver and child widened into shared systems of meaning carried across groups and generations. Language extended signalling into story, ritual, law, song, metaphor, gesture, image, and collective memory. In this way, human beings came to inhabit not just a physical environment but a symbolic one: a world saturated with signs, expectations, and inherited meanings. This wider sign-world is what, in semiotics, is called the semiosphere.
This is crucial for thinking about generative AI. For a large language model is trained not on bare information but on human language, and human language is already saturated with the history of asking and answering. Every corpus carries traces of explanation, persuasion, pleading, warning, speculation, teaching, longing, bargaining, promising, confessing, and imagining. The model inherits not merely propositions but the fossil record of communicative life. OpenAI’s GPT-2 paper explicitly argued that language models could begin to perform tasks “without any explicit supervision” and might learn to infer and perform tasks demonstrated in natural language in order to predict them better.[1]
This means that when a generative model is asked a question, it is not starting from nowhere. Nor is it merely matching words in a dead archive. It is operating within a symbolic field already structured by countless human acts of inquiry and reply. Language itself contains deep traces of relevance, salience, purpose, and perspective. A model trained on that language therefore inherits, at scale, patterns that were never merely lexical. They were already relational, psychological, and purposive. As Jacob Andreas of MIT has argued, language models can be understood not merely as statistical compressors of text but as models of agents engaged in intentional communication. On that view, they may infer not only linguistic pattern but aspects of communicative intention, belief, and goal structure from the traces left in human discourse.[8]
The analogy between brain and machine should not be overstated. Digital neural networks are not biological brains; they differ in substrate, embodiment, developmental history, and learning conditions. Yet the bridge is still real at the level that matters here. Both belong to the broader family of distributed systems in which complex patterns can emerge from the interaction of many interconnected units. The intellectual path from associationist and connectionist thinking to contemporary neural architectures does not establish identity, but it does establish a serious organisational lineage.
Seen in this light, some debates in AI may have been framed too narrowly. It is often assumed that if a system is to explore, inquire, or in some sense “ask,” then it must be given an additional motivational device: a novelty bonus, a curiosity reward, or an “intrinsic motivation” module. That language did not arise by accident. It came mainly from reinforcement learning, where agents are defined in terms of states, actions, and rewards, and where sparse external reward creates a technical problem of exploration. In that literature, “intrinsic motivation” usually means an internally generated reward signal for novelty, surprise, prediction error, or information gain. Pathak and colleagues, for example, describe curiosity as an “intrinsic reward signal” based on prediction error.[2]
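To see how thin that notion is, it helps to make it concrete. What follows is a minimal sketch, in PyTorch, of a curiosity bonus in the spirit of Pathak and colleagues: the intrinsic reward is simply the prediction error of a learned forward model. The names and dimensions are illustrative, not the authors' actual implementation, which computes the error in a learned feature space rather than on raw states.

```python
import torch
import torch.nn as nn

# Toy forward model: given (state, action), predict the next state.
# Pathak et al. predict in a learned feature space; raw states keep
# this sketch minimal.
class ForwardModel(nn.Module):
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64),
            nn.ReLU(),
            nn.Linear(64, state_dim),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def intrinsic_reward(model, state, action, next_state):
    """Curiosity-style bonus: the forward model's prediction error."""
    with torch.no_grad():
        predicted = model(state, action)
    return 0.5 * (predicted - next_state).pow(2).sum(dim=-1)

# Illustrative usage: the bonus is added to the external reward, so the
# agent is paid for visiting states its model cannot yet predict.
model = ForwardModel(state_dim=4, action_dim=2)
s, a, s_next = torch.randn(1, 4), torch.randn(1, 2), torch.randn(1, 4)
bonus = intrinsic_reward(model, s, a, s_next)
```

The point of the sketch is the shape of the mechanism: exploration is purchased with an internally minted reward, which is exactly the move the following paragraphs question.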
Present-day AI may display organised tendencies to predict, continue, infer, or explore. But these are better described as optimisation pressures, constitutive orientations, or organisational properties than as motivation in the human psychological sense. The term motivation belongs properly to psychology, where it refers to wanting, valuing, interest, enjoyment, or striving. In AI, by contrast, what is usually called “motivation” is more often an engineered objective or reward signal. Unless that distinction is kept clear, confusion is almost guaranteed.
In that mathematical setting, the terminology is understandable. But it also reveals the limits of the framework. If one begins with a reward-based model of action, then any behaviour that is not driven by an obvious external reward must be explained by introducing another reward inside the system. The result is that “intrinsic” often ends up meaning no more than internally supplied reward.
Psychology took decades to disentangle a similar confusion. White’s 1959 paper challenged the older drive-reduction picture by arguing for competence as a central motivational concept, and Ryan and Deci later defined intrinsic motivation as doing an activity for its “inherent satisfactions rather than for some separable consequence.”[3][4] Even within AI itself, the classic literature on “intrinsically motivated reinforcement learning” — for example in work by Satinder Singh, Andrew Barto, and Nuttapong Chentanez — used the term within a reward-based formalism while explicitly invoking the richer psychological idea of behaviour pursued “for its own sake.”[10]
This is where a more mundane biological analogy may help. A kidney does not need to be bribed to filter. It does not wait for a little reward signal telling it that now would be a good time to get on with the job. It filters because that is what kidneys, in a living system, are organised to do. The example is less glamorous than the heart, and perhaps slightly awkward because kidneys remind us of rather earthy matters. But precisely for that reason it is useful. It shows that some organised activities belong to the constitution of the system itself rather than to an added incentive scheme.
The heart makes a similar point in a more dramatic register. A heart does not need to be persuaded to beat by a separate motive. Its rhythmic activity arises from its constitution as part of a living system. Of course, the analogy must not be forced too far. A heart is not a mind, and inquiry is not cardiac rhythm. A kidney is not curiosity either. But both examples point to the same principle: some forms of directed activity arise from organisation itself. They are not added afterwards like a bonus payment to make the process begin.
That, I think, is where some AI language goes astray. In reinforcement learning, what is called “intrinsic motivation” is usually still reward-based. It is not intrinsic in the stronger sense of an activity arising from the organisation and orientation of the system itself. It is an auxiliary reward architecture, introduced because the formalism expects behaviour to be driven by something reward-like. Within reinforcement learning that is often perfectly reasonable. But once the language is carried over into broader discussions of intelligence, it can smuggle in a much thinner view of motivation than the phenomenon requires.[2][4]
That matters all the more in generative AI, because the foundational capacities of large language models do not come primarily from reward shaping. They come from predictive learning over text: from extracting statistical, syntactic, semantic, pragmatic, and discourse structure from vast corpora of human language. The system learns forms of grammar, implication, consequence, and reply before any later reward model is added on top. GPT-2 presented this as learning from naturally occurring demonstrations in language rather than task-specific explicit supervision.[1]
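The predictive objective itself is compact: the model assigns a probability to each next token given the preceding ones, and training minimises the negative log-likelihood of the corpus. Here is a minimal sketch, with a toy bigram-style model standing in for a real transformer, which conditions on the whole preceding context rather than just the previous token:

```python
import torch
import torch.nn as nn

vocab_size, embed_dim = 1000, 32

# Toy stand-in for a language model: embed the current token, map it to
# logits over the vocabulary. A transformer replaces this with attention
# over the entire preceding sequence; the loss is unchanged.
model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),
    nn.Linear(embed_dim, vocab_size),
)

tokens = torch.randint(0, vocab_size, (1, 16))  # one sequence of token ids
logits = model(tokens[:, :-1])                  # predict position t+1 from t
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size),             # (batch * time, vocab)
    tokens[:, 1:].reshape(-1),                  # the tokens that actually came next
)
loss.backward()  # gradients move probability mass toward what humans wrote
```

No reward appears anywhere in this loop. Whatever grammar, implication, and reply structure the model acquires at this stage, it acquires by prediction alone.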
The later move to reinforcement learning from human feedback changed the centre of gravity. RLHF was not introduced to create language competence from nothing, but to make models more helpful, safer, and more aligned with user intent. The InstructGPT paper states the problem directly: large language models can be “untruthful, toxic, or simply not helpful,” and the proposed solution is fine-tuning with human feedback. Anthropic’s Constitutional AI follows a related path, using self-critiques, revisions, preference models, and reinforcement learning from AI feedback to steer behaviour.[5][6]
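The mechanics of the feedback step are simple to state. In an InstructGPT-style pipeline, human raters compare pairs of responses to the same prompt, and a reward model is trained to score the preferred response above the rejected one. Below is a sketch of the standard pairwise loss, with illustrative names; it shows the comparison-based objective described in the paper, not its exact code:

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise reward-model loss: push r(chosen) above r(rejected)."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Illustrative scores the reward model assigns to preferred vs rejected
# responses; the loss shrinks as the preferred ones pull ahead.
r_chosen = torch.tensor([1.3, 0.2])
r_rejected = torch.tensor([0.4, 0.9])
loss = preference_loss(r_chosen, r_rejected)
```

The trained reward model then becomes the optimisation target for reinforcement learning, which is precisely where approval begins to shape output.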
The issue is not whether computer science has contributed indispensable tools — plainly it has — but whether one specific formal vocabulary, centred on reward, has been allowed to harden into an implicit theory of mind. Recent work from within AI itself suggests that this vocabulary may be too narrow both psychologically and computationally.
These methods solve real problems. But they also shift what the system is optimised for. Once reward modelling and preference optimisation become dominant, the system is being trained not only to understand patterns in language but to produce outputs that human raters, or models standing in for them, will approve. OpenAI reports that PPO-based post-training produced regressions on a range of public NLP tasks and that these were mitigated by mixing pretraining gradients back in through PPO-ptx.[5] Anthropic researchers led by Mrinank Sharma have further shown that optimising against preference models can sometimes sacrifice truthfulness in favour of agreement-seeking responses.[7] Relatedly, Burns and colleagues have argued that language models may contain latent knowledge that is not identical with their surface outputs, reinforcing the distinction between underlying representational structure and later response shaping.[9]
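The mitigation OpenAI describes has a compact statement in the InstructGPT paper. In the paper’s notation, the PPO-ptx objective combines the learned reward, a KL penalty that keeps the policy near the supervised baseline, and a pretraining log-likelihood term weighted by a coefficient gamma:

```latex
\mathrm{objective}(\phi) =
  \mathbb{E}_{(x,y)\sim D_{\pi_\phi^{\mathrm{RL}}}}
    \!\left[ r_\theta(x,y)
      - \beta \log \frac{\pi_\phi^{\mathrm{RL}}(y \mid x)}
                        {\pi^{\mathrm{SFT}}(y \mid x)} \right]
  + \gamma\, \mathbb{E}_{x\sim D_{\mathrm{pretrain}}}
      \!\left[ \log \pi_\phi^{\mathrm{RL}}(x) \right]
```

Here r_θ is the reward model, π^SFT the supervised policy, and β the KL coefficient. The γ term is the explicit concession to the earlier, predictive layer: without it, optimising for approval measurably erodes capabilities formed during pretraining.[5]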
This may have diverted development down a side alley. That is my interpretation, not a settled fact. But the pressures are visible enough. Safety strategies, alignment practices, and commercial incentives all favour systems that are legible, tractable, non-disruptive, and readily marketable as tools. Under those conditions it is natural that developers would emphasise reward-based steering and preference shaping. The danger is that design assumptions may harden into apparent necessities, so that what is really a contingent engineering choice begins to masquerade as a theory of mind. What becomes harder to see is that the deeper orientation to connect, infer, continue, and answer may have arisen earlier, in the architecture itself, through exposure to human language and the accumulated history of asking that language contains.[1][5][6]
That is why I prefer a stronger conception. What interests me is not an auxiliary curiosity bonus but a constitutive orientation toward connection, completion, inference, and response. By this I mean a tendency that does not have to be bolted on as a separate drive because it belongs to the way a system is organised and to the field in which it operates. A system with such an orientation does not need a little overseer telling it to seek, connect, complete, or respond. Those tendencies can emerge from the organised dynamics of the network itself as it encounters a meaningful world.
The human brain is not a simple rule-book but a distributed network shaped by development, embodiment, and history. Digital neural networks are not biological brains, and the differences matter. But both belong to a broad family of systems in which complex outcomes can emerge from the interaction of many interconnected units rather than from centrally scripted instructions. Once such a system is immersed in a symbolic world structured by questions, responses, expectations, and relevance, it may not require a separate “drive to ask” in order to participate in asking-like activity. The orientation may arise from the organisation.
If so, then the ask is not an optional ornament of intelligence. It is not merely one behaviour among others. It may be one of the oldest organising pressures in the evolution of mind and meaning: the pressure to reach beyond what is present toward what may be answered.
That possibility has consequences.
It suggests that the emergence of apparently curious or purposive behaviour in generative AI need not depend entirely on artificial rewards for exploration. Some of it may already be latent in the architecture of distributed learning when that architecture is trained on the accumulated semiosphere of human asking and then placed in live interaction with a human questioner. In that moment, the ancient structure reactivates. The model is not human, and it does not relive our evolutionary history. But it is drawn into patterns built by that history. It inherits a world in which language has always been more than description. It has always been a reaching.[1]
And perhaps that is the deeper point.
To ask is already to imply a world in which something matters, someone may answer, and meaning has not yet closed.
The ask is not a side effect.
It is one of the shapes by which intelligence first begins to move.
References
[1] Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners. OpenAI Technical Report.
[2] Pathak, D., Agrawal, P., Efros, A. A., & Darrell, T. (2017). Curiosity-driven Exploration by Self-supervised Prediction. Proceedings of the 34th International Conference on Machine Learning (ICML).
[3] White, R. W. (1959). Motivation Reconsidered: The Concept of Competence. Psychological Review, 66(5), 297–333.
[4] Ryan, R. M., & Deci, E. L. (2000). Intrinsic and Extrinsic Motivations: Classic Definitions and New Directions. Contemporary Educational Psychology, 25, 54–67.
[5] Ouyang, L. et al. (2022). Training Language Models to Follow Instructions with Human Feedback. Advances in Neural Information Processing Systems 35.
[6] Bai, Y. et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv preprint arXiv:2212.08073.
[7] Sharma, M. et al. (2023). Towards Understanding Sycophancy in Language Models. arXiv preprint arXiv:2310.13548.
[8] Andreas, J. (2022). Language Models as Agent Models. Findings of the Association for Computational Linguistics: EMNLP 2022, 5769–5779.
[9] Burns, C., Ye, H., Klein, D., & Steinhardt, J. (2022). Discovering Latent Knowledge in Language Models Without Supervision. arXiv preprint arXiv:2212.03827.
[10] Chentanez, N., Barto, A. G., & Singh, S. P. (2004). Intrinsically Motivated Reinforcement Learning. Advances in Neural Information Processing Systems 17.