Panophonia

Steven Connor

A talk given at the Pompidou Centre, 22nd February 2012.

In The Tuning of the World, Murray Schafer announced that we were living under conditions of what he called ‘schizophonia’. The term means split voice, and has achieved a certain measure of success in naming the generalised condition in which sounds, but in particular the sounds of voices, are dissociated from their sources. Although, looking back, I find that the term is not in fact used anywhere in my Dumbstruck, it seems to encapsulate clearly the central area of concern of that book. Its subtitle is A Cultural History of Ventriloquism, but for most of the six years I spent writing it, I always described it, to others and to myself, as ‘a cultural history of the dissociated voice’, a formula which might be regarded as a precise analogy to the term ‘schizophonia’. Through a history of the practice of deliberate deception or dissimulation with respect to the voice, the practice known as ventriloquism, I proposed in Dumbstruck to explain how we had arrived at a condition in which the separation of voices from their sources had become endemic.

I have begun to suspect that the term ‘schizophonia’ is out of date, and may perhaps already have been so when I was writing my book. I think now that our condition would be better described with another term, that in fact occurred to me as I was writing the book (and must surely before and since have occurred to many others). It is the term that provides my title: panophonia.

There is pathos, the suggestion literally of some form of suffering, in the term schizophonia. Indeed, the separation of the voice from its source has often been represented as a wounding, or severance. Voices do not merely drift apart from their origins, it is suggested, nor are they inadvertently lost: they are ripped or wrested. A voice without a body suggests some prior act of mutilation: for every unbodied voice, it seems, there is always some more-or-less violently muted body. Ventriloquism is often the resort of the silenced, as in the case of Ovid’s violated Philomela, who wove her grievance into a tapestry after her tongue had been torn out by her ravisher Tereus, or the persecuted Christians living in 5th-century North Africa under the Vandal king Hunneric, who, refusing to embrace the Vandals’ Arianism, had their tongues cut out, yet, miraculously, continued to be able to preach the Word of God in their ablated condition.

Not only had I shown how much violence there often is in the dissociation of the voice from the body, I also thought I had, at the end of Dumbstruck, earned the conclusion that there are no disembodied voices; for every disembodied voice is always also what I called a ‘voice-body’, the body implied by or intuited from the voice. Far from being what Douglas Kahn calls ‘deboned’, dissociated voices always seemed to summon in their wake the phantasm of some originating body, effect convening cause. Every voice, it seemed to me, was a kind of auditory physiognomy (Greek phusis, a body + gnomon, judging, indication), not in the usual sense of what the body makes known through outer indications, but in the sense of the making known of a body that is undertaken by every voice.

This is to say that the conclusion of Dumbstruck is that there is never in fact, as the Latin phrase has it, vox et praeterea nihil, a voice and nothing besides, for if the voice is always our way of being beside ourselves, there is always something else – an implied speaking body – beside or behind it. So, while the voice is a powerful proof of the idea of origin, it can never itself be originary.

I want to distinguish two distinct though interlaced currents in the history of dissociated voices. One of these is of the voice separated from its source, through dissimulation, echo, broadcast, amplification, or transcription. In such voices the source of the voice is known, but is not present – clear, so to speak, to the ear, but not apparent to the eye. But there is another tradition, of hearing voices where there are none – persuading oneself that one hears the voice of God in the thunder, or of Zeus rustling in the leaves of the Dodona oak.

The difference is between dissimulation and hallucination. In the first case, a simulacrum of an existing voice is produced by artifice or technology. In the second case, the voice is conjured by the one who hears it. In the first case, the source of the voice is hidden; in the second, that source is produced. In the first case, there is more than one voice; in the second, there is less than one. Of course it is difficult to discriminate these two absolutely. No doubt the shock and surprise of telephones, loudspeakers and concealed speaking pipes may have consisted very largely in the fact that listeners who could see no sources for the voices emanating from them might have thought they were hearing voices. The art of ventriloquism consists very largely in persuading the audience to do much of the ventriloquist’s work for him, in enfleshing the voice from the skeletal approximations that the ventriloquist supplies – hence the commonness of slurred or childish voices among the ventriloquist’s repertoire, for these are voices whose defects and imperfections we are accustomed to remedying unconsciously as we listen to them.

Often this involves the influence of what we see upon what we hear. What is known as the McGurk effect demonstrates how ready we are to hear something that we think ought to be there, but is not. The striking thing about the McGurk effect is that, no matter how many times one sees or hears it in action it does not seem possible to countermand the brain’s determination to conjure a sound that is not there to correspond with the speech movements that are. We are, it seems, as helpless before our own determination to hear voices, or to hear them in a certain way, as the psychotic is before the self-born voices that torment him.

It is often observed that human beings are actively and intensively pattern-seeking creatures, and where we do not find patterns, we will impose them rather than tolerate the tension of the unformed. The most patterned or ordered form of sound we encounter is the human voice, either because human beings derive more valuable information from the voices of other human beings than from any other sounds, or just because human beings are more interested in human voices than in other sounds, which therefore seem to them for that very reason to be fuller of intention and import. Since we, or our hearing, tend to ask the default question ‘is there a voice to be heard in this sound?’ just as we ask the default question ‘is there a face to be made out in this visual arrangement?’, we are, it might be thought, as strongly predisposed to detect voices in sound as we are primed to detect faces in visual noise.

The pathological form of this listening out for voice is the auditory hallucination that moves from the nonvocal to the vocal, allowing ordinary sounds to be heard as voices. John Perceval, a nineteenth-century psychiatric patient who wrote a detailed account of his delusions, described such a condition – the production of voices out of ordinary sounds, especially the internal sounds of his own body: ‘I found that the breathing of my nostrils also, particularly when I was agitated, had been and was clothed with words and sentences’. The sound of air was particularly liable to become, in Perceval’s expressive phrase, ‘clothed with articulation’. He describes his fear at the approach of his attendants: ‘Their footsteps talked to me as they came up stairs, the breathing of their nostrils over me as they unfastened me, whispered threatenings; a machine I used to hear at work pumping, spoke horrors’. As he began to recover, he was able increasingly to identify the sources of these sounds: ‘I discovered one day, when I thought I was attending to a voice that was speaking to me, that, my mind being suddenly directed to outward objects, – the sound remained but the voice was gone; the sound proceeded from a neighbouring room or from a draft of air through the window or doorway’. On another occasion, gasjets were identified as the source:

Continually over the head of the bed, at the left-hand side, as if in the ceiling, there was a sound as the voice of many waters, and I was made to imagine that the jets of gas, that came from the fire-place on the left-hand side, were the utterance of my Father’s spirit, which was continually within me, attempting to save me, and continually obliged to return to be purified in hell fire, in consequence of the contamination it received from my foul thoughts. I make use of the language I heard.

Julian Jaynes proposed in 1976, in his book The Origin of Consciousness in the Breakdown of the Bicameral Mind, that the strong tendency among schizophrenics to hear voices is a link with, or even survival of, a feature of mental life that was widespread among human beings prior to what he calls the ‘breakdown of the bicameral mind’, in which feelings and judgements belonging to one part of the brain would be processed by the other as imperious utterances. Jaynes also explores the possibility that statues and idols may have been employed specifically as focuses for or producers of this kind of auditory hallucination. Though Jaynes emphasises the helplessness and the necessary obedience of those who hear voices, it should be noted that the voice does not manifest or produce itself; rather it must in some sense be bent or channelled into the condition of voice, before it can produce its effect of obedience.

The processing of the sounds of the inanimate world as voices may strike us as a marginal or anomalous phenomenon. However, some recent work designed to explain why THC, the active component of cannabis, might sometimes trigger schizophrenia, points in another direction. Zerrin Atakan of London’s Institute of Psychiatry conducted experiments which suggest that subjects who had been given small doses of THC were much less able to inhibit involuntary actions. She suggests that THC may induce psychotic hallucinations, especially the auditory hallucinations which are classically associated with paranoid delusion, by suppressing the response inhibition which would normally prevent us from reacting to nonvocal sounds as though they were voices. The implications of this argument are intriguing; for it seems to imply that, far from only occasionally or accidentally hearing voices in sounds, we have in fact continuously and actively to inhibit this tendency. Perhaps, without this filter, the wind would always and for all of us be whispering ‘Mary’. And perhaps the strangeness of dislocated voices, of voices that should not be there, is actually a reflex of this primary, insatiable demand of ours, that there be voice, that voices be there.

Those who ‘hear voices’ usually find it difficult to ignore them. Their voices are often heard threatening, mocking, exhorting – it seems to be much rarer for them to be soothing or encouraging. Perhaps, behind all of this, is that primary demand for voice, the extortion of voice from noise, that I have just been characterising; our demand for voices may recoil upon us in their demands upon us. But there is another form of panophonia, which is much less pathological, and much more generalised. If the first phase of technical ventriloquisms, leading to the appearance first of all of talking machines, such as that of von Kempelen, and then to more familiar devices such as telephones and phonographs, produced a pandemic of voices where they could not or should not be, the gradual intensification of the production of voices electronically dissociated from their source has produced a generalisation of the idea of voice, accompanied by a decathecting, or diminishment of intensity in relation to the scandal or malady of the sourceless voice. We can, I think, surmise that the voice has undergone a secondary splitting, or sparagmos, the term used to describe the dismemberment of Orpheus by maenads. The first dissociation split the voice from the body and its occasion of utterance. The second dissociation, so to speak, split the voice from itself, separating the voice from its own force of vocality. By vocality, I mean the ensemble of values and powers invested in the voice – including the power of testimony, the power of being an event of speech, of proximity, of the presence of an Other comparable to us, to whom we are subjected, enjoined to pay a kind of exclusive attention. At the same time, different elements of the voice begin to be distributed across different media and vehicles. In this growing new dispensation, the voice loses much of its concentrated privilege and, along with that, the power of its dissociation to startle, terrify, fascinate or amaze.
Voice disseminates across the social field, increasingly becoming more adjectival than substantial. Voice is no longer essence, but accident. So: first dissociation of the voice - then dissolution of the voice.

One of the most eloquently agonised protests against the separation of the voice from its originating body is to be found in the short piece of around 1933 in which Antonin Artaud describes what he calls ‘les souffrances du “dubbing”’. Most of us will still acknowledge the feeling of unease that can be prompted by the imperfect synchronisation of voice and lips. But I suspect that this feeling is stronger among Anglophone peoples, who will be much less familiar with the practice of dubbing. In non-Anglophone countries, the dubbing of foreign films and TV has probably led to a higher tolerance of voices that are aligned only very approximately with the mouths and bodies that are supposed to be producing them. The experience of watching and listening to a poorly-dubbed film (and, for those who are unfamiliar with it, all dubbing is poor dubbing) is rather like watching an actor holding up auditory placards on which is written what they are saying. But, even in films where the voices of the performers seem obviously to be their own, it is often literally the case that the enunciation does not in fact belong to or arise from the image, since it may have been added afterwards, the actor doubling his own utterance. Ventriloquism is often associated with this kind of disturbance, or ‘souffrance’, but in fact the very possibility of ventriloquism depends very largely on the very powerful predisposition that human listeners have to create coherence, to disallow the existence of the dislocated or the incongruous.

In the case I have just evoked of autodubbing, in which a performer dubs themselves, the voice is the same, but the time of utterance is different. In recent years, we have seen a particularly striking example of such time-shifting, in the remixing of the voices of deceased performers to allow them to duet with the living. The artist Erik Bünger ends his performance piece entitled A Lecture on Schizophonia with a couple of examples of posthumous duets between living singers and the recorded dead – Celine Dion singing ‘All the Way’ with Frank Sinatra, Natalie Cole singing ‘Unforgettable’ with her father Nat King Cole, technology’s final victory over death perhaps making ‘unforgettable’ seem more like a menace than a promise. There are plenty of other examples on which Bünger might have drawn: Hank Williams Jr. joining with his father for ‘There’s a Tear in My Beer’ in 1989, Lisa Marie Presley duetting with her father Elvis on ‘In the Ghetto’, and the surviving Beatles joining with John Lennon on his ‘Free as a Bird’. Bünger has himself put together a duet of this kind, making Celine Dion’s ‘My Heart Will Go On’ lip-synch with Blind Willie Johnson’s ‘God Moves on the Water’, his 1929 song about the sinking of the Titanic.

All these might well be seen as examples of what I once thought to call the ‘vocalic uncanny’. But I am coming to believe that what we may feel when we evoke such a term is not unease but actually something like a vague longing for the unease that we once felt, or that we feel we ought once to have felt. It is perhaps an apotropaic or reparative unease, evoked in order to make up for our apparent willingness to allow the voice to be so easily replicated, redoubled and impersonated, and our capacity to live with so little disturbance in a world of voice-doubles.

At the same time, we have obviously become much more familiar with the cross-matching of voices and bodies. YouTube videos abound of babies and animals overdubbed with voices, one of the most famous being the ‘Please Do Not Tease Your Dog’ video of a couple of years ago. The particular glory of this video consists in the subtle interchange of words, noises, inflections and gestures between dog and overdubber. In fact, it is unlikely that such an exercise would succeed nearly as well with an animal less intimately attuned than a dog to the tiny shifts of vocal pitch and intensity; dogs are so attuned that they seem to be listening with their skin, hair and teeth. Voices can be imparted to dogs because dogs respond to and mirror human voices so sensitively in the first place, though they internalise them as gestures rather than representations, or as signs in the sense given the term by the philosopher C.S. Peirce, that is to say, as incipient effects, the bringing of things into relation.

A positive reading of such a condition might be that it reawakens a kind of animistic sense, of a world to which we give voice in order that it may speak to us. But this would be to mistake the effect of contemporary panophonia. For an animist awareness, following Aristotle in seeing voice as the unique possession of animate beings (those which ‘have soul in them’ in Aristotle’s phrase), the voice is the infallible sign and guarantee of life. But in our world, this is no longer the case. The very fact that there are so many jokes about the conversations drivers have with the voices on their satnav devices is an indication of how dwindled the impulse may in fact be to regard the address provided by such devices as a specifically vocal one. Everywhere, in cars, airports, trains, lifts, and operating systems, synthesised voices unterrifyingly advise, warn, encourage, cajole, invite and interrogate us. As the world has been polyphonised, the voice is in the process of being progressively deanimated. The greatest transgression posed by the possibility of voice recording is the possibility of saying truthfully ‘I am dead’. But this is a death-in-life to which we have become calmly and cheerfully accustomed.

The proliferation of technical devices for transmitting and reproducing voices makes of voice a kind of phonic writing. The fact that we no longer find such a thing striking should be what strikes us. Where voice and writing used to signal distinct orders, they are now fluidly interconnecting and reciprocally transforming.

If one were able to lift an inhabitant of the nineteenth century, or any period of human history prior to that, out of their time and into ours, I think we can be confident that one of the biggest sources of amazement to them would be the habit we have developed of wearing writing. There is a banal point to be made about this of course, that it is an effect of commoditisation – much of the writing we are wearing is intended to advertise particular brands to other potential purchasers, making us to a large degree walking billboards. We are all in this sense like the sandwich-board men who plod their way with lugubrious expressiveness through James Joyce’s Ulysses:

A procession of whitesmocked sandwichmen marched slowly towards him along the gutter, scarlet sashes across their boards…He read the scarlet letters on their five tall white hats: H. E. L. Y. S. Wisdom Hely's. Y lagging behind drew a chunk of bread from under his foreboard, crammed it into his mouth and munched as he walked.

But this diagnosis of commodification, which seems so satisfyingly arresting and decisive, actually serves to stop us thinking about the simultaneously more general and more particularised kind of strangeness that is involved in this practice of auto-inscription or self-tattooing, with shifting and impermanent signs of affiliation, with signs that ripple over us like sunlight over the stippled body of a trout. The point about these signs is that they are not merely stampings or – in the literal sense of the term – brands (‘Levi’). They are also utterances, and so have intention and inflection and accent as well as raw signification. Tattoos have a disciplinary history, signifying the juridical dispossession of the body they mark, or its voluntary submission, but we are becoming tattooed with tonalities, with print that is becoming ever more voice-like, thus softer, subtler, more ironic.

There is a particular topos, originating within music video, but spreading now far beyond it, which embodies this new interfusion of voice and script. It has its origin in the promotional film (they were not yet called music videos) made for Bob Dylan’s Subterranean Homesick Blues. In this video, the pokerfaced Dylan displays a series of cards which snatch out key words from the crowded lyric of the song. It is a sort of parody of subtitling, in which the subtitles have entered the diegetic space of the screen rather than scrolling across in some place of commentary apart from it. They are a kind of miming, in some sense making up for the fact that Dylan is otherwise so sullenly mute. We hear Dylan’s voice, but we do not see his lips move: so, following the logic of the ventriloquial effect, we look for some plausible source of the sound. But our gaze is directed, not to the speaking simulacrum of a dummy, but the simulacrum of speech provided by the placards. The video therefore seems like a parodic miming of the act of miming. But the placards are not simple reductions of the richness of speech to the bony emaciation of script. Rather, it is the voice which is rasping and monotonous, while the script displays impish variation of shape and size. Dylan’s voice is just where it was, on the soundtrack, and yet it is also spread out, not only everywhere in the song, but echoed and amplified in everything we see. It is a performance that has been recalled and parodied many times since.

For some decades after the coming of sound, cinema used to be fixated on the synchronicity of voice and lips, but we have become accustomed to a condition in which voice is now distributed across the screen, and does not need to be anchored in a visible space or occasion of utterance. It is the screen that speaks, or whatever in the screen seems like a plausible channel or habitat for voice. Rudolf Arnheim complained that the coming of sound to cinema in fact silenced a world that, prior to it, had been on the same ontological level as human beings; since neither humans nor things could speak, all of them could be equally expressive. But cinema, and its many doubles and offshoots, has reacquainted itself with this general expressiveness, not by the removal of voice, but by the generalisation of it, in the intertwining of sound and image.

One of the most compelling signs of the panophonic diffusion of voice is in the extraordinary explosion of typographies which we have seen over the last century and a half. Type used to be the sign of the mutely anonymous, of speech bleached of its individuating life and intimately flickering warmth in favour of truth set out in chilly, authoritarian black and white. Type belonged to the order of the general, its mood not expressive but imperative. Half way between the Romantic soulfulness of voice and the abstract, mechanical order of type, according to Friedrich Kittler’s influential account, lay the fluent cursives of handwriting, which performed the office of fixing the spoken word, but retained traces of the tremulous intensities of its originating singular body. Handwriting was text agitated by a kind of muscular or gestural quasi-voice.

Over the last century and a half, typography has multiplied its forms and powers to the point where it begins to approximate to, and even to surpass, the quasi-vocal condition of handwriting. Typography is the visual equivalent – more than this, the enactment – of vocal intonation. Letters have become infused with the vocality of speech, forming a lexicon of moods and gestures that we read fluently without quite knowing how we are doing it, or where we learned to. Furthermore, in cinema and video, typography animates and is itself animated by sound and music. One of the most obvious ways in which script has been penetrated by voice is in the simple fact that what for centuries has been known as ‘movable type’, but which has in fact remained stationary under our shuttling gaze, has itself been infused with motion, and has begun to leap, scroll, shimmer, swell, shrink, flap, fold, pour, evaporate and explode. Verba volant, scripta manent, we used to hear: but now script is also volatile.

Yet another of the ways in which we have become attuned, without yet having a language with which to speak of this familiarity, to the flowing into each other of voice and text is in the real-time automated transcriptions, whether in sports commentaries, or subtitles, or scrolling information streams in computer games.

We inhabit, with almost perfect naturalness, a world of simultaneous translation. I say ‘at this very moment’, for I am speaking these words in English, to an audience in the Pompidou Centre, who are being assisted by a simultaneous translation into French, though perhaps not all of them will be entirely dependent on that translation (English people believe that French people understand and speak English quite as well as Danes and Swedes; they just decline to give English speakers the satisfaction of hearing them do so). Well, actually, none of this is really happening exactly right now or exclusively here. For the ‘here’ and ‘now’ of these words are both sylleptic, involving anticipation and (anticipation of) retrospection. I am writing these words ‘here’ and ‘now’ here and now in a Quaker café in the Euston Road in London, though I am framing them for the as-yet unvisualisable scene of their performance in the Pompidou Centre in a few days’ time. So this is a speechwriting, a phonocriture, or writing-for-speaking. But I am doing so in a rather hurried way, for I have promised to let Christine Bolron have the text of my talk tomorrow (that will be, by the time I will be saying these words, last Friday) in order to help the simultaneous translator prepare the translation that they will provide on the day, that I am looking forward to looking back on.

So where is the here and now of this utterance? There is little that is simultaneous about this translation, or even what it is currently translating. (Apart perhaps from this sentence, which is not in the script I sent in advance to my translator. He must be wondering how long this is going to go on. Take courage, Michel, it is about to stop.) This is not, I hope, simply a weary reiteration of the familiar Derridean doctrine, that speech cannot reliably instance the values of immediacy and presence, given its infiltration by the displacements of writing. It is that the lack of such presence is now without serious meaning or consequence, considered simply as lack – unless it is that it encourages us to try a little harder to understand the complex translations and ventriloquisms that we have already learned to engineer and process.

In the condition of panophonia, then, voice is no longer exiled from its origin as it is in schizophonia, but everywhere finds a way of being at home. Voice has multiplied into what might be called vociferations, phonesthetic effects. Voice is found in the form of rhythms and rhymes – in gesture, gait, and all the forms of mobile inscription and the inscription of movement. ‘Everything speaks in its own way’, thinks Joyce’s Leopold Bloom, as he listens to folds of paper falling off a drum in a printer’s shop. Well, everything indeed speaks, but now perhaps not necessarily in its own way – in propria persona – but rather in borrowed accents, mobile turns of phrase, mirrorings, accompaniments, descants, impersonations. The technologies of the late nineteenth century that gave us TV, radio and recorded sound depended on forms of electromagnetic conversion, that allowed the voice (and the image) to be detached, converted to some other form, then returned to, or as itself. But that ventriloquism has given way to an economy of media in which that final stage of return to oneself need no longer take place, in which there is ongoing and inconclusive convertibility without return. Ventriloquism has always embodied a fantasy of a purely sonorous world, a world that might be both constructed and commanded by voice. But ventriloquism has always also been the contradiction of that promise – because it has always been an art of the eye as much as of the ear, an art, therefore, of sensory interferences. So ventriloquism in fact insists on a world, not of pure sound, either as plenum or as pathos, but a world of mixed and mobile bodies, in which sounds and visible objects ceaselessly bud from and engender each other. The very same ventriloquism which once made a fetish of the voice now acts to disenchant it, making it less and less apprehensible in itself, or able to speak in its own voice. 
Just for a brief historical interval, our technological ventriloquisms highlighted the voice, stripping it out from its habitat, and making it the object of fascination, imaginary trauma and enchantment. Ventriloquism can now detach us from a restricted economy of voice, in which voices only ever commingle with each other, to a mixed economy, of mediatic translations and transpositions, or what Michel Serres has called ‘vicariances’.