"The World Atlas of Language Structures" Published

Max Planck scientists in Leipzig unveil one-of-a-kind documentation of world’s linguistic diversity / Surprising degree of grammatical borrowing between languages.

August 01, 2005

Grammar is often considered a dry and dauntingly complex subject, and the diversity of sound structures and syntactic patterns found in the world’s languages is so great that no individual linguist can hope to keep track of them all. Yet a thorough familiarity with language differences and language universals is indispensable for a deeper understanding of the human language faculty. A team of linguists at the Max Planck Institute for Evolutionary Anthropology in Leipzig has now unveiled a magnum opus that synthesizes the results of thousands of language-specific studies in a new and uniquely accessible way: The World Atlas of Language Structures (WALS). On 142 full-color maps, the atlas displays the geographical distribution of variables of language structure in a way that is user-friendly not just to specialists but also to laymen. Accompanying the printed volume is an interactive CD-ROM, enabling users to generate and test their own hypotheses and create their own maps. The wealth of data in the atlas will place comparative linguistics on a completely new footing. One surprising result has already emerged: structural features of language are much more strongly conditioned by geography than had previously been assumed.

Distribution of Genitive-Noun word order: Map from "The World Atlas of Language Structures".

Of the 7000-odd languages still spoken today, 2560 are represented in The World Atlas of Language Structures—even if the average number per map is "only" 400. This stems from the fact that only several hundred languages are really well described; of the rest we have only fragmentary knowledge or none at all. Some 6800 sources were utilized by a team of 50 authors, under the direction of Martin Haspelmath, David Gil, and Bernard Comrie of the Max Planck Institute, and Matthew Dryer of the University at Buffalo. All languages are equal in the atlas: each language, regardless of number of speakers, is represented on the map by the same circular symbol. For linguists, small and endangered languages threatened with imminent extinction are fully as interesting as large national languages.

The atlas provides information on a vast range of structural variables: number of consonants (from 6 to 122), presence of rare sounds like ö and ü, tone systems, gender categories, plural formation, number of cases, verbal future and past forms, imperatives, word order, passives, numerals, color terms, writing systems, and more.

For a few well-described variables such as word order (Verb-Object vs. Object-Verb, Adjective-Noun vs. Noun-Adjective, etc.), the maps display over a thousand languages. For relative-clause formation, however, information is harder to come by, so that the corresponding maps include fewer than 200 languages. The two maps on the grammar of sign languages show only 35 languages, as comparative research into sign languages is still in its infancy.

The non-randomness of the geographical distribution is immediately obvious on almost every map. Languages with ö and ü occur almost exclusively in northern Eurasia (from Paris to Peking), but not south of the Himalayas. The complex sounds gb and kp exist only in West and Central Africa. Languages with the word order Noun-Genitive ("the house of the father") are found in Africa, Europe, Southeast Asia, and Central America; elsewhere the dominant order is Genitive-Noun ("the father’s house"). In the languages of Eurasia and northern Africa the normal mode of expression is to say "I gave him the food"; in Australia and the Americas, by contrast, the more common structure is "I gave him with food" (cf. English "I plied him with food").

This is a surprising result. Since its inception in the 19th century, comparative linguistics has primarily sought to trace similarities between languages to a common inheritance from a reconstructed ancestral protolanguage. The WALS maps, however, show plainly that structural features exhibit a strong tendency to geographical homogeneity: a language will typically have much in common structurally with its neighboring languages, regardless of whether they are related to it or not. Thus Hindi, which is related genealogically to the Germanic, Romance, and Slavic languages of Europe — all going back to an Indo-European protolanguage which was spoken around 6000 years ago — shows striking structural similarities to (unrelated) Tamil and other languages of the Dravidian family of southern India. And Finnish is much more similar to its (unrelated) geographical neighbors Swedish and Russian than to its distant relatives in Siberia.

Commonalities like these must reflect borrowing of structural patterns between neighboring languages. That words are routinely borrowed from neighboring languages is well known, but the extent of grammatical borrowing revealed here comes as a surprise. The mechanisms of such borrowing are still insufficiently understood, and represent a challenge for future research.

The data in this atlas are also of profound importance to research into the fundamental cognitive structures, perhaps in part innate, that underlie the human language faculty. Many of the universals of language that have been observed involve correlations between logically independent linguistic variables. Much has been conjectured about such correlations, but in most cases the data have been too sparse for reliable conclusions. The database provided on the interactive CD-ROM will now enable users to look at combinations of any linguistic variables and search for correlations among them.
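To make the idea of combining two variables concrete, here is a minimal sketch in Python. It is not the software on the CD-ROM. The eight language entries are a toy sample entered by hand for illustration, not taken from the atlas data, and the function names are hypothetical. The sketch cross-tabulates verb-object order against adposition type and reports what share of the sample falls into the two pairings that typologists usually expect to go together.

```python
from collections import Counter

# Toy sample entered by hand for illustration only; NOT taken from the atlas data.
# Each language maps to (verb-object order, adposition type).
SAMPLE = {
    "English": ("VO", "prepositions"),
    "French": ("VO", "prepositions"),
    "Swahili": ("VO", "prepositions"),
    "Finnish": ("VO", "postpositions"),
    "Persian": ("OV", "prepositions"),
    "Japanese": ("OV", "postpositions"),
    "Turkish": ("OV", "postpositions"),
    "Hindi": ("OV", "postpositions"),
}


def cross_tabulate(sample):
    """Count how often each pair of feature values occurs together."""
    return Counter(sample.values())


def share_matching(table, pairs):
    """Return the fraction of languages whose value pair is in `pairs`."""
    total = sum(table.values())
    return sum(table[pair] for pair in pairs) / total


table = cross_tabulate(SAMPLE)
# The two pairings expected to co-occur if the variables are correlated.
expected_pairs = [("VO", "prepositions"), ("OV", "postpositions")]

for pair, count in sorted(table.items()):
    print(pair, count)
print("share in expected pairings:", share_matching(table, expected_pairs))
```

With only eight languages the resulting share says nothing reliable. The point of a database covering hundreds of languages per variable is that the same tabulation can be run on samples large enough to support conclusions.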
