Text: Katrin Blawat
At some point in the conversation with Nick Enfield, doubts start to creep in. It’s those little pauses that crop up before the Australian answers: was that just a moment of inattention, a second of deliberation? Or was there a message in that brief hesitation? Enfield, his fellow scientist Tanya Stivers and Director Stephen Levinson are looking into questions like these in their project on multimodal interaction. While everyday conversations may often be mundane and littered with errors, they are all the more interesting for psycholinguists.
“It’s easy to study how pure information is transmitted by looking at individual, unrelated sentences,” says Enfield. “But if we observe an informal chat within its overall context – including things like the exchange of glances, body language and movements – we can learn a lot about the relationship between the people conversing.”
Nick Enfield moved into his office at the Max Planck Institute on the outskirts of the Radboud University campus in Nijmegen nine years ago. The institute building is situated in the middle of a forest and looks so inconspicuous from the outside that no one would ever guess that this is a place where scientists study the essence of what makes us human. That is, after all, exactly what Enfield and his fellow scientists aim to achieve when they spend days and weeks patiently transcribing every “um” and “ah” of a recorded conversation.
Humans are unique among species in terms of the complexity of their interaction with conspecifics. “These endless possibilities and the will to cooperate with others, to form friendships, to manipulate others for one’s own benefit or to quarrel with strangers – you don’t find that anywhere else to the same extent as in human society,” says Nick Enfield.
In his quest for the roots of this drive to be social, Enfield soon found his way to our everyday use of language – including sign language, gestures and facial expressions, such as eyebrows raised in skepticism or a hand raised in defense. It is only thanks to their sophisticated capacity for communication that people are able to interact and cooperate. If it weren’t for this, it would be pure coincidence if any of a human being’s actions were at all coordinated with those of another human being.
However, if the desire to cooperate is a characteristic of the human species and any form of language is the expression of this desire, does this not suggest the existence of universal principles of human communication across all cultures? A kind of “universal infrastructure,” as the Max Planck scientist calls it, that regulates coexistence in every human society.
The psycholinguists have already come up with some initial ideas on what this fundamental infrastructure might look like. For a conversation to go without a hitch, both the speaker and the listener have to see themselves in a similar situation. The listener must work out whether the sentence “I’m thirsty” is a purely informational utterance or whether it is also a request for something to drink. Even more indispensable is the ability to correctly read the speaker’s intention when irony, sarcasm or simply an imprecise mode of expression is involved – as is the case in almost any informal conversation.
Levinson and his team conducted an fMRI study to examine how this unspoken mutual understanding is achieved. If the sender signifies a communicative intention of an unconventional nature, the same regions of the brain are activated in both the speaker and the recipient – the recipient apparently simulates the intentions of the sender.
Scientists use the term theory of mind to describe our ability to get into someone else’s thoughts, and for Enfield and his fellow scientists, this ability is one of the core elements of human communication and cooperation. But how, for example, does a listener recognize that the speaker is about to stop speaking and give the other a turn to speak? Nick Enfield examined this issue with the help of an experiment in which his subjects were played excerpts of a phone conversation between two Dutch-speaking friends. The test subjects were asked to press a button the moment they thought the speaker was about to finish speaking. The experiment simulated a situation that occurs in every conversation: because it takes about half a second to produce a response to what is heard in a simple, casual chat, listeners have to predict when their conversation partner is going to stop talking in order to avoid long pauses.
“An ideal conversation has no gaps and no overlaps in the sense of people speaking at the same time,” says Enfield. There are still lulls in a conversation, of course, but they usually convey some sort of meaning. In fact, one of the findings of Enfield’s studies is that positive responses to yes/no questions are expressed faster than negative responses in all cultures. If the speaker notices the brief pause of a few milliseconds, their partner’s potentially negative response can be anticipated.
In his tests, Enfield manipulated the sound recordings, in one case making the words incomprehensible without changing attributes such as the pitch, volume and duration of what was said. The experimental subjects found it much more difficult to anticipate the end of a turn at conversation than they did when the recording was unedited. If, on the other hand, Enfield modified the pitch, the subjects had no trouble whatsoever.
Thus, it is the individual words and the grammar, rather than the pitch, that indicate that a speaker is about to end their contribution to the conversation. “At least this is true of Dutch, and probably other European languages, too,” says Enfield. “But what it’s like in a language like Japanese, in which a great many words can simply be omitted, we just don’t know yet.”
Nevertheless, the psycholinguists noticed that there is one thing that all languages have in common. “All languages ensure that people can cooperate,” says the Max Planck scientist, referencing anthropologist Robin Dunbar. The latter postulates that language evolved 200,000 to 400,000 years ago as a tool for regulating social matters in what is, for primates, a fairly large-scale society. According to his theory, gossiping was the engine of the development that eventually reached its peak in half a million different languages.
Linguists estimate that there are around 7,000 languages spoken in the world today. And in only about 10 percent of them did someone go to the trouble of writing down grammar rules and extensive lists of vocabulary. According to the Ethnologue database, 82 percent of languages are spoken by communities with fewer than 100,000 members, while 40 percent even have fewer than 10,000 speakers.
Nick Enfield knows only too well what it means to study languages like Kri, which is spoken by only 300 people living in a remote part of Southeast Asia. Enfield’s office contains photos of colorfully dressed speakers of the language to remind him of the annual “field season,” the time when he leaves the Netherlands to spend a few weeks in the foreign culture. Early on, he was surprised by the distinct hierarchies that manifested themselves in each apparently insignificant exchange of banter: the senior person sat virtually enthroned above whoever the dialog partner was – even if it meant that the latter had to literally kneel out in the street.
The scientists follow the everyday lives of the people in Asia or Africa with a video camera. Who initiates a conversation, where are they looking when they do so, how and when does the other person respond? “You can’t avoid becoming a part of the foreign society when you do this,” explains Enfield. “After all, we live among the people we study, we eat with them, we join in their celebrations.” Test conditions of this kind are a nightmare for psychologists, he says, because they are difficult to control. Since everything is happening in real time, a situation can never be repeated exactly.
The researchers eventually get back to their desks with hundreds of hours of video recordings. That’s when the most time-consuming part of the work begins. Like zoologists who return home from an excursion and need to examine under the microscope the legs of the beetles they collected and measure them down to the last millimeter, the psycholinguists transform their video recordings into annotated records, transcribing everything precisely, right down to the last millisecond. Those who refer back to their notes are amazed at how incomplete the sentences in a conversation usually are. The researchers do not skip a single pause or a single stammer, but this degree of meticulousness takes time: the accurate transcription of one minute of video recordings takes about two hours.
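The arithmetic behind that workload can be made concrete with a rough sketch. It uses only the ratio stated above (about two hours of transcription per minute of video); the corpus size of 100 hours is a hypothetical stand-in for "hundreds of hours", and the function name is invented for illustration.

```python
def transcription_hours(recorded_minutes: float, hours_per_minute: float = 2.0) -> float:
    """Estimate transcription effort from the article's ratio of
    roughly two working hours per minute of recorded video."""
    return recorded_minutes * hours_per_minute


# Hypothetical corpus size: 100 hours of video (an assumption, not a figure from the project).
recorded_minutes = 100 * 60
estimated_hours = transcription_hours(recorded_minutes)
print(f"{recorded_minutes} minutes of video -> about {estimated_hours:.0f} hours of transcription")
```

At this ratio, every hour of footage corresponds to roughly 120 hours of transcription work, which is why the annotation phase dominates the project timeline.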
Sometimes Enfield feels like a deep-sea researcher discovering a strange world of exotic species: creatures that look totally different from anything he has seen before but that, like all organisms, nevertheless pursue the objective of surviving and reproducing. It’s similar with the languages of the world, says Enfield. Structural similarities between a Western European language and Kri will be hard to find, but in the highly industrialized West as in the Asian jungle, the respective language serves to organize interpersonal activities.
If, however, the quest is to exchange factual information and nothing else, the spoken language, at least, is not always necessary, comments Enfield. The scientist knows this from his own experience of visiting a foreign country for the first time and having to come to grips with the new language before anything else. “It’s always challenging, but it’s fun,” he says. “You know you’ll make progress just as soon as you invest some time in it.”
Thanks to their experience in field research, the Max Planck scientists are challenging an established theory of linguistics with their own theory of a universal infrastructure of linguistic usage. To this day, Noam Chomsky’s concept of a single, deep grammar that is valid for all languages is still the dominant one in the discipline. Followers of Chomsky’s teachings claim to be able to work out universal commonalities in language structure. According to them, all languages have structures like nouns, verbs, adjectives and auxiliaries, as well as rules on word order in a sentence.
Not true, says linguist and anthropologist Stephen Levinson, Director at the Max Planck Institute in Nijmegen, citing examples from languages that an untrained language user could not even pronounce: “Riau Indonesian has no rules governing word order; the Australian languages Kayardild and Bininj Gun-wok have no auxiliary verbs; and in Lao, a special verb form is used instead of adjectives.”
Furthermore, there are certain specific characteristics that no inventor of an artificial language would ever think up, as they appear, at first glance, to be too obscure. For instance, the Native American language Kiowa has no standard form for the plural. Instead, speakers indicate whether they’re talking about an unusual number for a specific object, such as more than two legs or only two individual pebbles. All languages appear to have a common purpose: communication that makes possible the organization of society. But the existence of universality on a structural level – that is, Chomsky’s universal grammar described above – is something that Levinson and Nick Enfield consider to be a myth.
“Languages differ so significantly at every level of their structure that we find it difficult to identify even a single feature that is common to all of them,” says the Max Planck Director. “We are the only species whose communication systems fundamentally differ from each other in form and content. If you think about the evolution of language but ignore this fact, you miss the one characteristic that makes our species remarkable.”
Michael Tomasello backs Stephen Levinson up on this point. A Director at the Leipzig-based Max Planck Institute for Evolutionary Anthropology, Tomasello states emphatically: “Universal grammar is dead,” pointing out that scientists are not able to say exactly what it is that’s supposed to be universal – nor do they have any clear way of finding out.
The universal infrastructure postulated by Levinson and Enfield, however, is deeper – and therefore less obvious. It manifests itself in such aspects as the time lag before a response is articulated. And the researchers hold that they can prove this universal infrastructure by demonstrating that there are rules for informal linguistic usage that people of all cultures follow. Each of the psycholinguists speaks a handful of exotic languages to ensure that they are equipped for these kinds of intercultural studies. Enfield is an expert in Lao and Kri, and can also make himself understood in Khmer and Chinese. His boss, Levinson, has thus far specialized in languages of India, Mesoamerica, Australia and New Guinea.
But even if the scientists have no trouble communicating, cross-cultural studies are very time-consuming and costly. Organizing a study in just a single Western European language, on the other hand, is much easier, says Enfield, explaining that they barely even need to advertise for test subjects in such cases. The fact that most study findings stem from this relatively restricted cultural milieu is something he calls “ethnocentrism.”
To get around this problem, Nick Enfield expanded the topic of one of his most recently published studies to encompass a total of ten languages from five continents: Italian, English, Danish and Japanese were examined alongside languages from Mexico, Laos, Namibia and Papua New Guinea. The psycholinguists wanted to find out how much time elapses in the various language areas before a person reacts to a simple yes or no question in conversation. In a bid to make their study as true to life as possible, the scientists analyzed real-life situations that they had previously filmed in the individual cultures concerned.
In Lao, for example, two men discussed what route they should take with their truck to reach the next village. The findings support Enfield’s theory of a fundamental infrastructure in everyday linguistic usage: when it came to questions that could be answered with a yes or a no, people in all language areas responded, on average, after 208 milliseconds. “This time lapse is evidently a universal target,” says the scientist.
The study did, however, uncover minor culture-specific variations. The Japanese answered the fastest, the Danes the slowest. “The infrastructure of language use as we see it is not fixed – it’s a bunch of principles that can grow into specific local traits, thereby shaping a culture,” says Stephen Levinson, explaining this finding.
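How such a figure is obtained from annotated recordings can be sketched in a few lines. This is a minimal illustration, not the researchers' actual analysis: the timestamps below are invented, the function name is hypothetical, and the response offset is simply assumed to be the time between the end of the question and the start of the answer, with a negative value meaning the answer overlaps the question.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotations: (language, question_end_ms, answer_start_ms).
# These values are made up for illustration and are not data from the study.
annotations = [
    ("Japanese", 1200, 1210),
    ("Japanese", 5400, 5450),
    ("Danish", 2000, 2500),
    ("Danish", 7300, 7740),
    ("Lao", 3100, 3050),  # negative offset: the answer starts before the question ends
    ("Lao", 9000, 9400),
]


def offsets_by_language(rows):
    """Group response offsets (answer start minus question end, in ms) by language."""
    grouped = defaultdict(list)
    for language, question_end, answer_start in rows:
        grouped[language].append(answer_start - question_end)
    return dict(grouped)


grouped = offsets_by_language(annotations)
per_language = {language: mean(values) for language, values in grouped.items()}
all_offsets = [value for values in grouped.values() for value in values]
overall = mean(all_offsets)

for language, average in per_language.items():
    print(f"{language}: mean offset {average} ms")
print(f"All languages: mean offset {overall} ms")
```

Comparing the per-language means with the overall mean mirrors the two observations in the text: a shared target value across cultures, with local variation around it.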
How closely language use and culture are interwoven in people’s everyday lives is something on which Nick Enfield and five other scientists plan to gather evidence in another research project. Since January, Enfield has been heading the Human Sociality and Systems of Language Use project, financed with two million euros from the European Research Council.
Whose job is it to clear up misunderstandings in a casual conversation? How strongly can a person express a particular wish? Here too, the psycholinguists want to answer questions like these for a total of seven different languages. Enfield’s box of technology, including his video camera, is all packed and ready to go. Soon he will be back in Laos to meet up again with the colorfully dressed speakers whose photos adorn the filing cabinet in his office.
Theory of mind (ToM)
The ability to get into someone else’s thoughts, to identify that person’s intentions and align one’s own behavior accordingly. By the classical tests, children do not seem to have a fully developed ToM until the age of about four, when they become able to explicitly identify the beliefs and assumptions of another person as incorrect. However, work at the MPI in Nijmegen has shown implicit ToM as early as one year.
Universal grammar
The innate human ability to form new sentences with the help of a few grammatical rules and a limited vocabulary. As evidence of universal grammar, linguist Noam Chomsky argued that any healthy child could learn any language as its mother tongue. However, universal grammar is often also understood to be a set of principles that are common to all of the languages of the world (such as certain rules of sentence structure, lexical units like nouns, verbs, etc.). Many scientists have criticized this notion, citing counterexamples to most of the proposals.
Transcription
Transferring audio recordings of spoken language to a written record, often in a specially devised writing system. In the case of scientists Levinson and Enfield, it even includes all lulls in conversation, fillers and incomplete sentence fragments. For linguists and psycholinguists, it is one of the most time-consuming aspects of their studies.