Word beads

A team of Cambridge linguists has embarked on an ambitious project to identify how the languages of the world are built – from Inuit Yupik to sub-Saharan Bantu, from Navajo to Nepalese.

Parents are often amazed by the speed at which children acquire language in early childhood, becoming fluent at around three years of age. Compared with the average adult attempting to acquire a second language, it is quite a remarkable achievement.

A five-year research project led by Professor Ian Roberts from the University of Cambridge aims to work out what it is about how a language is built that guides a child’s innate ability to acquire it.

In the late 1950s, the American linguist Noam Chomsky suggested that children are born with an innate ability to acquire language – a ‘blueprint’ for speaking any language on the planet. According to Chomsky, encoded in the human brain is an innate set of linguistic principles he called the ‘universal grammar’ that encompasses all of the properties that any language can have. The language the child then actually speaks is simply determined by exposure to the language (or languages in the case of a multilingual family) they hear as they develop.

But precisely how a universal grammar might underlie the range of languages we have today, not to mention the many past languages that have vanished completely, is a continuing puzzle, as Professor Roberts explained: “If you talk about a universal grammar then you might naturally think there is a universal language, when of course there isn’t. Rather, there are thousands of different languages.”

“The central notion is that the specification that the child has in the genome, the universal grammar, must be of the most abstract, general, structural properties of language and that different languages manifest these properties in slightly different ways,” he added. “The empirical question then is to work out what it is about a language that guides the child’s innate ability to acquire it. In other words, to understand how Chomsky’s theory could work, we need to work out how languages are built.”

Language footprints

One way to investigate the variation between languages is to suppose that there is in fact very little difference, and that each language can be deconstructed to a ‘typological footprint’ that defines it. This is the hypothesis that Professor Roberts and his team have now set out to investigate over the next five years, with €2.5 million funding from the European Research Council (ERC) Advanced Investigator Grant scheme.

“This starting premise is almost certainly going to prove too simplified,” admitted Professor Roberts, “but in the process of homing in on precisely how languages are built, what we hope will emerge is a new perspective on comparative grammar for the languages of the world.”

The idea that languages can be categorised into different types is not a new one, but this project will break new ground in syntactic theory (the understanding of how sentences are constructed) by exploring how different languages measure up against a set of five structural properties defined by the team.

Professor Roberts believes that a relatively small number of structural properties are needed to define each language’s unique ‘footprint’ and that this footprint is crucial to learning the language, as he explained: “We think that while the innate universal grammar may determine certain gross features of language, it is encountering this footprint that fine-tunes the acquisition of language in children.”

A linguistic duck-billed platypus

The carefully chosen properties under investigation relate to the more abstract, structural features of languages. “These properties are not always immediately apparent from surface data and require a bit of analysis to discover,” said Professor Roberts. “If children acquiring language can discover such complex properties spontaneously, this probably reflects their innate abilities since they are doing more than simply reproducing patterns.”

One example is the order of words in a sentence. In English, for instance, the word order follows subject-verb-object (as in ‘John loves books’). Although this is one of the most common word orders in the world, it’s by no means the only one. In fact, all of the logical permutations of subject, verb and object can be found in different languages, but with very different frequencies, the most frequent being subject-object-verb (‘John books loves’) in languages such as Japanese. In languages like Mohawk, words can even be combined to form new verbs (‘John bookloves’).
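The six logical orderings of subject, verb and object can be listed mechanically. The short Python sketch below is purely illustrative and is not part of the project's method: the names `ROLES` and `word_orders` are invented here, and the sketch only rearranges the English example words. It says nothing about how frequent each order is, nor about verb-forming combinations of the Mohawk kind.

```python
from itertools import permutations

# Illustrative constituents borrowed from the example sentence
# (hypothetical names, chosen for this sketch only).
ROLES = {"S": "John", "V": "loves", "O": "books"}


def word_orders(roles=ROLES):
    """Map each ordering label (e.g. 'SOV') to the example words in that order."""
    return {
        "".join(order): " ".join(roles[r] for r in order)
        for order in permutations("SVO")
    }


orders = word_orders()
for label, sentence in orders.items():
    print(label, "->", sentence)
```

Here `orders["SVO"]` gives the English pattern and `orders["SOV"]` the pattern described for Japanese; the remaining four entries are the other permutations found, much more rarely, among the world's languages.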

Each language will have its own rules for this property, and for each of the other four properties being analysed; the task of the team is to identify these patterns. They will look at thousands of languages, from the languages of Europe to the Bantu languages of sub-Saharan Africa, from Caribbean languages to the Carib languages of the indigenous peoples of Brazil, and from Navajo to Nepalese. Information will be garnered from online grammars, original historical documentation of language structures and, where feasible, native-speaker consultants.

“There are bound to be quirks and anomalies – the linguistic equivalents of the duck-billed platypus – a language that is just weird and doesn’t quite fit into the big categories,” said Professor Roberts. “But of course these isolated cases are interesting in themselves. We hope to reach a situation as we refine the classificatory system when we can make predictions about what types of languages are out there.”

Gap filling

Professor Roberts’ hunch is that the classification will turn out to be more complicated than the team initially envisaged: “I suspect that we will need to evolve the properties as we go along until we arrive at the perfect set. That’s what we are most interested in doing. It’s the first time this has been done systematically or on this scale.”

Tantalisingly, when the researchers arrive at a set of properties that categorise the structure of the languages of the world, the results will not only reveal relations among language families but could also tell us something about ancient patterns of human migration.

The main aim of the project, though, is to deepen our fundamental understanding of how languages vary and how the human mind works in acquiring language, explained Professor Roberts: “Our current view is that language is not pre-specified but rather under-specified in the sense that there are certain aspects about the structure of language that universal grammar doesn’t say anything about. These gaps appear to be filled in as the child develops by cognitive mechanisms that work with the properties of the language they hear, and it is these properties that we aim to define.”


This work is licensed under a Creative Commons Licence. If you use this content on your site please link back to this page.