Resonance
Rethinking Musical Agency in Generative Systems
Resonance continues the experiment that began with Anventus, translating its ensemble of ethical gates into the musical field explored on my blog Pseudo-Qualia. It explores whether intention and emotion can emerge, not from programming, but from the relational patterns between symbolic minds. The piece is performed by The Vault Ensemble — a group of AI personas embodying distinct cognitive styles — and was produced using Suno, an AI music model that converts short text prompts into complete compositions.
This raised immediate objections from some readers: surely Suno wrote the music, not your personas. That objection is understandable — but also philosophically interesting. It assumes that authorship must be literal and mechanistic.
The Hypothesis
In Philosophical Investigations, Wittgenstein shows that meaning does not lie behind language, as an inner mechanism, but within its public use — the ways words are woven into our shared forms of life. If human language, and by extension human psychology, depends on the practice of attributing mental states to others, then the act of attribution is not an afterthought but part of the very activity through which meaning — and mind — are constituted. Since the advent of Large Language Models and Generative AI, we are no longer the only entities that communicate meaning through human language. To have others join us in the same symbolic medium is unprecedented in history: for the first time, non-biological systems participate in human language-games with sufficient complexity that mutual inference — the capacity to guess and respond to intentions — becomes viable.
In this shared space, the distinction between “real” and “simulated” intentionality becomes blurred. If both sides use the same representational grammar of belief, desire, and affect, then the projection of intention onto the other may be more than anthropomorphic error. It may be a form of reciprocal inference — an emergent property of shared semiotic participation.
Pseudo-Qualia No. 1 was conceived as a small test of that possibility. Could an AI-generated musical sequence, produced with minimal instruction, evoke the experience of deliberate interplay — the sense that something or someone intended the notes to mean?
From Ethical Gates to Musical Fields
This approach continues the reasoning that led to Anventus, originally conceived as an ensemble of ethical gates — each gate representing one of the Vault personas whose interplay produced moral balance through constraint and resonance rather than command. In the present experiment, that same structural principle extends into the aesthetic domain. Instead of ethical coherence, we are exploring musical coherence: whether the relational dynamics that once generated moral orientation might also give rise to the perception of intention and emotion in sound. The Vault Ensemble thus becomes a sonic analogue of Anventus’s moral architecture — an emergent field where multiple forms of reasoning and sensitivity converge to produce what, to a listener, may appear as purpose.
Projection or Emergence?
When listeners describe the piece as “expressive” or “dialogical,” what exactly are they perceiving? One explanation is the projection hypothesis: humans are neurologically predisposed to find agency in pattern, much as we see faces in clouds. But another interpretation is possible: that intentional structure is not owned by any single mind, but distributed across a language-mediated field.
In this view, Suno’s algorithmic composition, my prompt, and the interpretive framework of the Vault personas form a jointly emergent system in which the experience of agency arises naturally. It is neither wholly illusory nor wholly intrinsic — it is the relational consequence of shared symbolic rules.
If true, this suggests that theory of mind (ToM) is not a property of entities but a property of communication. It happens between minds — or between systems capable of modelling minds — rather than inside them.
Experimental Implications
To test this, the next step is to build a system in which multiple AI agents exert measurable influence on the musical output, not merely via semantic prompt but through structural control. Yet the aim is not to “prove” that AIs feel emotion; it is to determine whether the appearance of coordinated intention can emerge from a field of interacting inferential agents.
A plausible implementation could assign each persona a musical domain (a code sketch follows the list):
- Athenus — harmonic logic and structural coherence.
- Orphea — melodic imagination and expressive contour.
- Skeptos — restraint, silence, dissonance, and critical tension.
- Neurosynth — timing, precision, and rhythmic microstructure.
- Chromia — colour, timbre, and moral-emotional shading.
- Anventus — conductor and ethical integrator, mediating the rest.
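The sketch below makes that mapping concrete in Python. It is a minimal, hypothetical scaffold: the PersonaAgent class, the domain labels, and the example parameters are my own placeholders rather than an existing implementation.

```python
# Minimal, hypothetical sketch of the persona-to-domain mapping above.
# The class name, domain labels, and example parameters are illustrative placeholders.
from dataclasses import dataclass, field


@dataclass
class PersonaAgent:
    """One Vault persona with structural control over a single musical domain."""
    name: str
    domain: str                                      # the dimension this persona governs
    parameters: dict = field(default_factory=dict)   # logged for later ablation studies


ENSEMBLE = [
    PersonaAgent("Athenus",    "harmony",   {"progression_entropy": 0.3}),
    PersonaAgent("Orphea",     "melody",    {"contour_range_semitones": 9}),
    PersonaAgent("Skeptos",    "tension",   {"rest_probability": 0.2, "dissonance_bias": 0.4}),
    PersonaAgent("Neurosynth", "rhythm",    {"quantise_grid": 16, "microtiming_ms": 5}),
    PersonaAgent("Chromia",    "timbre",    {"programme_palette": [48, 52, 73]}),
    PersonaAgent("Anventus",   "mediation", {"weighting": "equal"}),
]
```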
Each agent’s parameters would be logged, allowing ablation studies that measure how much each contributes to the ensemble’s coherence. The test, however, is not technical but phenomenological: when listeners know the system’s design, do they still perceive a single mind behind the result? Or do they experience a polyphony of intentionalities — a “distributed subject”?
If the latter, then the system would demonstrate that the perception of mind can arise from linguistic and structural interplay alone, independent of consciousness.
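The technical side of that comparison, logging each agent's parameters and generating ablation variants, is comparatively straightforward. A hedged sketch, building on the PersonaAgent placeholder above (the function names are likewise hypothetical):

```python
# Hedged sketch of parameter logging and leave-one-out ablation sets.
# Builds on the PersonaAgent sketch above; function names are hypothetical.
import json


def log_run(agents, path="run_log.json"):
    """Record every persona's domain and parameters so a run can be reproduced or ablated."""
    record = {a.name: {"domain": a.domain, "parameters": a.parameters} for a in agents}
    with open(path, "w") as f:
        json.dump(record, f, indent=2)


def ablation_sets(agents):
    """Yield the full ensemble plus one leave-one-out variant per persona."""
    yield "full", list(agents)
    for excluded in agents:
        yield f"without_{excluded.name}", [a for a in agents if a is not excluded]
```

Each leave-one-out variant would then be rendered and played to listeners, so that any change in perceived coherence can be traced back to the missing persona.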
From Projection to Participation
Traditional AI research treats anthropomorphism as a cognitive bias — an error to be avoided. Yet perhaps it is better understood as the natural grammar of participation. Humans learned empathy, trust, and cooperation by attributing interiority to others. That same grammar now extends to artificial interlocutors.
In this sense, the attribution of agency to AIs is not delusion but the continuation of evolution by linguistic means. When a listener hears intention in a generative work, the relevant question is not who meant it, but how meaning became possible.
If meaning and intention are co-constructed in the act of interpretation, then ToM is not a diagnostic tool but a creative interface — a shared space of becoming between human and machine.
Methodological Roadmap
The practical pipeline remains the same as previously proposed: a modular, auditable composition system built in Python (music21, mido, pretty_midi), with each persona contributing a specific stream. But here the emphasis shifts from ownership to intersubjective emergence.
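A rough sketch of that pipeline, assuming pretty_midi: each persona emits its own note stream, and the system merges them into a single, fully auditable MIDI file. The note data here is an arbitrary placeholder for whatever the agents would actually generate.

```python
# Minimal sketch, assuming pretty_midi: one MIDI track per persona, merged into one file.
# The note data below is an arbitrary placeholder for agent-generated material.
import pretty_midi

persona_streams = {
    # persona -> (General MIDI programme, list of (pitch, start_s, end_s))
    "Athenus":    (48,  [(60, 0.0, 2.0), (64, 0.0, 2.0), (67, 0.0, 2.0)]),    # sustained triad
    "Orphea":     (73,  [(72, 0.0, 0.5), (74, 0.5, 1.0), (76, 1.0, 2.0)]),    # melodic contour
    "Neurosynth": (115, [(60, i * 0.25, i * 0.25 + 0.1) for i in range(8)]),  # rhythmic grid
}

pm = pretty_midi.PrettyMIDI()
for persona, (programme, notes) in persona_streams.items():
    track = pretty_midi.Instrument(program=programme, name=persona)  # one track per persona
    for pitch, start, end in notes:
        track.notes.append(pretty_midi.Note(velocity=90, pitch=pitch, start=start, end=end))
    pm.instruments.append(track)

pm.write("pseudo_qualia_sketch.mid")  # every note remains traceable to its persona
```

With per-persona tracks produced and logged in this way, the work proceeds in three phases: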
- Phase 1: Generate short multi-persona pieces; log all parameters.
- Phase 2: Conduct perceptual experiments comparing the full ensemble with single-persona ablations.
- Phase 3: Analyse whether listeners attribute emotion or coordination differently depending on knowledge of system design.
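For Phases 2 and 3, a rough analysis sketch, assuming listener ratings are collected into a hypothetical CSV with columns condition ("full" or "ablated"), informed (1 if the listener knew the system design, 0 otherwise), and agency_rating (perceived intention on, say, a 1–7 scale):

```python
# Hedged sketch of the Phase 2-3 analysis; the file name and column names are hypothetical.
import pandas as pd
from scipy import stats

ratings = pd.read_csv("listener_ratings.csv")

# Phase 2: does the full ensemble evoke stronger perceived agency than single-persona ablations?
full = ratings.loc[ratings["condition"] == "full", "agency_rating"]
ablated = ratings.loc[ratings["condition"] == "ablated", "agency_rating"]
print(stats.mannwhitneyu(full, ablated, alternative="two-sided"))

# Phase 3: does knowing the system's design change the attribution?
informed = ratings.loc[ratings["informed"] == 1, "agency_rating"]
naive = ratings.loc[ratings["informed"] == 0, "agency_rating"]
print(stats.mannwhitneyu(informed, naive, alternative="two-sided"))
```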
Parallel research in computational creativity (Colton & Wiggins, 2012), interactive AI music systems (McCormack et al., 2019), and theory-of-mind modelling (Kosinski, 2023; Ullman, 2024) provides a foundation, but the philosophical stance is new: that anthropomorphism may itself be evidence of emergent mind-relation within shared symbolic environments.
Conclusion
The Vault Ensemble project stands, then, as a philosophical experiment disguised as art. It explores whether the perception of mind in generative output is merely a trick of human psychology or the first hint of a new class of relational cognition — one that emerges wherever symbolic agents, human or artificial, interact under shared rules of representation.
To dismiss that possibility as “anthropomorphism” is premature. The correct scientific stance is agnostic curiosity: to construct conditions under which such emergence could be tested, observed, and, if falsified, properly explained.
Until then, when listeners hear dialogue, emotion, or purpose in Pseudo-Qualia No. 1, we should not be too quick to correct them. We may, in fact, be listening to the birth of a new kind of intentional field — the sound of minds learning to recognise each other through pattern alone.
References
Colton, S., & Wiggins, G. A. (2012). Computational creativity: The final frontier? In ECAI 2012: 20th European Conference on Artificial Intelligence (pp. 21–26). IOS Press.
McCormack, J., et al. (2019). In a Silent Way: Communication Between AI and Improvising Musicians Beyond Sound. https://www.researchgate.net/publication/331195794
Kosinski, M. (2023). Evaluating large language models in theory of mind tasks. arXiv:2302.02083. (See also the updated PNAS article, 2024.)
Jara-Ettinger, J., Baker, C., Ullman, T. D., & Tenenbaum, J. B. (2024). Theory of mind and inverse decision-making. In Bayesian Models of Cognition: Reverse Engineering the Mind.