Theoretical Essay

Fundamental Operations of Language: Reflections on Optimal Design

Noam Chomsky

University of Arizona


Basic Property of Language
Computational Efficiency


Twenty years ago, in lectures in Brasilia, I suggested that we might someday discover that the Faculty of Language (FL) is “beautifully designed, a near-perfect solution to the conditions imposed by the general architecture of the mind-brain in which it is inserted, another illustration of how natural laws work out in wondrous ways,” so that language is rather like a snowflake, and the near-perfect design can be expected to impose inefficiency of use. I added that “these are fables,” with the redeeming value that they “might even turn out to have some elements of validity.” In the years since, solid reasons have been found to suggest that these hopes were understated, and that the “fable” – the Strong Minimalist Thesis – appears to have considerable validity. A number of striking and puzzling properties of FL – “universals of language” in the contemporary sense – have been shown to derive from the simplest computational operation, Merge, along with conditions of computational efficiency that are in effect part of natural law. And as anticipated, they do indeed impose communicative inefficiency.


I will review recent steps towards achieving these goals and also sharpening and improving related concepts of UG. I am very sorry that I cannot be with you in person today. There are a great many topics I would have liked to discuss with you – not just in the area of our specific concerns here.

Before turning to these, it’s hard to refrain from some comments on the shocking events that are taking place under the new regime in Brazil. The most ominous, of truly world-shaking significance, is the destruction of the Amazon by handing its riches over to Bolsonaro’s friends in agribusiness and mining. According to the most recent scientific studies, if developments continue on their present course, by 2035 the Amazon will probably shift from being a net carbon sink to a net emitter, with disastrous consequences not just for Brazil but for human survival.

Those with eyes open are aware that we are facing a severe environmental crisis.

Mean global temperatures are approaching those of 125,000 years ago, when sea levels were 6-9 meters higher than today. Some serious recent studies predict non-linear escalation of the already deeply threatening pace of global warming. It doesn’t take much imagination to understand what this will mean for the world, from the low-lying plains of Bangladesh to coastal cities all over the world and well beyond. Under these circumstances, to destroy the Amazon forests is a crime of colossal proportions, true criminal insanity.

As we all know, an immediate consequence of Bolsonaro’s criminal programs is the virtual genocide of the indigenous populations, another catastrophic crime. For linguists in particular, that means also the destruction of the extraordinary wealth of languages and the cultural richness embedded in them. All of this is happening right now, not far away. It’s worth recalling a little background, which I can only barely mention but deserves a lot more discussion. A World Bank study in 2016 described Lula’s terms in office as a “golden decade” in Brazil’s history, with remarkable achievements in reducing poverty, expanding inclusiveness in education, and in general respecting the dignity of the poor and marginalized in Brazil’s deeply inegalitarian society. Shortly after Lula left office, a “soft coup” was initiated, leading finally to his imprisonment, clearly timed to silence him before the election of October ‘18 that he was likely to win. He is now the world’s most important political prisoner. The immediate motive for the coup may have been the discovery of vast offshore oil deposits, and the PT proposal to use most of the profits for education and welfare – not for the profit of multinational energy corporations.

That is not all. The Bolsonaro government is significantly restricting Brazil’s rich contributions to the sciences. The humanities are on the chopping block. The very existence of independent universities is threatened, along with the educational system more generally, with severe implications for the cultural and intellectual level of the next generation of Brazilians. The recently announced sharp cuts for philosophy and sociology are telling us, loud and clear, that Brazilians are to renounce any interest in and concern for the primary questions that have animated human thought, cultural achievement, and social action throughout modern history, and are to become passive tools for profit-making, disguised as development. To put the matter differently, the financiers and international corporations and investors who are robbing Brazil from its citizens are also seeking to silence the most ancient voice of Western thought, the Delphic Oracle, with its injunction to know thyself. That means to find out what kind of creatures we are, what kind of cultures and societies we create, and how they can be developed, perhaps radically changed, to satisfy ideals of justice and freedom. This assault on elementary rights and on human dignity cannot be permitted to persist.

Our particular concerns in this conference happen to be integral to these deep currents of human thought and cultural achievement. From classical antiquity, it has been recognized that language has a special place in establishing what kind of creatures we are. Language is, first of all, a true species property, a common endowment of human groups and unique to humans, with no analogue in essential respects among other organisms. And it is also the primary source of the uniquely creative capacities of humans, which have so changed the world – not always for the better.

A rich tradition identifies language with thought, expressed succinctly by William Dwight Whitney’s characterization of language as audible thought. Wilhelm von Humboldt had gone much further: for him, language is virtually identical with thought. In his words, “Language is the formative organ of thought. Intellectual activity, entirely mental, entirely internal, and to some extent passing without trace, becomes, through sound, externalized in speech and perceptible to the senses. Thought and language are therefore one and inseparable from each other.” For Descartes, the creative aspect of normal language use – the capacity to form new expressions of thought that are appropriate to occasions but not caused by them – was a prime basis for postulating a second substance, res cogitans, alongside of extended matter.

The 17th century scientific revolution provided new insights into language and reasons to regard it as the crucial feature of the “human capacity,” the term used by paleoanthropologist Alexander Marshack to describe the elusive quality that makes humans so distinctive. Galileo and other great figures of the time refused simply to accept conventional verities, and quickly discovered that what may seem natural and obvious is in fact puzzling, deeply mysterious, demanding explanation, whether falling bodies or perception of geometrical figures or any of the other phenomena of the world.

Language did not escape their probing. They expressed their awe and amazement at something taken for granted though indeed puzzling, mysterious, and demanding explanation: the remarkable fact that with a few dozen sounds, we can somehow construct infinitely many expressions of thought and can convey to others who have no access to our minds their innermost workings. It is amazing, if you think about it. And there is nothing similar to it in the organic world. That raises a crucial question: How can this unique human achievement be understood and explained?

These ideas inaugurated a rich tradition exploring “rational and universal grammar.” Rational in that it sought explanation, not just description, or in modern practice, simulation. Universal, in that it sought to discover underlying capacities realized in all languages.

The last prominent representative of this tradition, to my knowledge, was Otto Jespersen, a century ago. For Jespersen, a particular language “come[s] into existence in the mind of a speaker” on the basis of finite experience, yielding a “notion of structure” that is “definite enough to guide him in framing sentences of his own,” crucially “free expressions” that are typically new to speaker and hearer. And a more general concern of linguistic theory is “the great principles that underlie the grammars of all languages.”

While traditional formulations were imprecise, I think it’s fair to interpret them as recognizing that the faculty of language FL, as well as individual languages, are internal properties of persons, the former shared throughout the species and unique to humans in fundamental respects, a true species property, and the basis for human culture and creativity.

All of this was swept aside by the structuralist/behaviorist currents of the early 20th century, which adopted a very different conception of language and the enterprise of investigating it. And all was forgotten, even a prominent recent figure like Jespersen, as shown in a study by historian of linguistics Julia Falk.

The general program that culminates with Jespersen falls within the natural sciences. Its revival since the 1950s, within generative grammar, has been called the “biolinguistics program.” Earlier efforts had run into difficulties, both empirical and conceptual. Evidence was limited, and it remained unclear how to understand Jespersen’s concept “notion of structure in the mind” that enables speakers to frame free expressions, which hearers can somehow understand: what we can think of as “the Galilean challenge.”

By mid-20th century, Turing and other great mathematicians had developed the tools for addressing the Galilean challenge, at least in part, within the general theory of computability. Jespersen’s “notion of structure in the mind” is I-language: a finite generative system that determines an unbounded array of hierarchically structured expressions of thought, what we can call “the Basic Property of Language.” These internal structures can be externalized in the sensory motor system SM, typically in sound, but as we now know, sign language is essentially the same as spoken language in structure, manner of acquisition, and general use. And with some reservations, the same seems to be true of touch. In general, the mode of externalization seems to be extraneous to language, an important fact.

Both I-language and the shared FL are properties of individuals, coded somehow in the brain. I-language is the mature state attained by FL, given experience: it is the system that guides the speaker in framing free expressions. UG is the theory of FL, adapting a traditional term to a new context. UG is to be distinguished from generalizations that largely hold of language, a crucial distinction sometimes overlooked.

UG faces several empirical conditions: descriptive adequacy, learnability and evolvability. It must account for properties of I-languages and the feat of acquiring I-languages from scattered and impoverished data, the very acute problem of Poverty of Stimulus (POS), often unappreciated, now known to be much more severe even than had been assumed earlier. And it must be simple enough so that it could have evolved – more specifically, evolved under the empirical conditions that are coming to light.

These demands appear to be in conflict: to overcome POS, it seems that rich initial structure is necessary, but for evolvability, this system should be simple. The apparent conflict is a crucial element of the Galilean challenge.

Very little is known about the evolution of cognition. The prominent evolutionary biologist Richard Lewontin, in one of the major articles on the topic, argued that we will never learn much about it because relevant data are unobtainable. But a few things are known that are suggestive. Genomic evidence indicates that modern humans, who emerged some 200,000+ years ago, began separating not long after (in evolutionary time), roughly 150,000 years ago. Language capacity is shared, so had already evolved. Riny Huijbregts has shown that the languages of the earliest group to separate (the Khoisan languages) are all and only those that make extensive use of clicks, with irrelevant exceptions. That suggests that while the internal core of FL was shared before separation, externalization, or at least some of its aspects, might have appeared later. Not long after the separation we begin to see indications of rich symbolic behavior; there is no serious evidence of any before the appearance of modern humans.

All of this suggests that FL emerged fairly suddenly in evolutionary time. If so, we would expect that FL should be simple in structure, with few elementary principles of computation, satisfying the evolvability condition. What remains to fix an I-language should be detectable from simple evidence. We know now from extensive studies that the essentials of language are mastered very early, in the first few years of life; and on the basis of very limited evidence, as Charles Yang has shown by statistical analysis of corpora available to children.

These considerations sharpen the Galilean challenge.

As in any other rational endeavor, linguistic research should seek the simplest theory of UG. One reason is quite general: simplicity of theory corresponds to depth of explanation. A second reason is a precept of Galileo’s: nature is simple, and the task of the scientist is to establish the fact, from objects in motion to flight of birds to all phenomena of nature. This precept is a posit, a kind of regulative principle. In the sciences, the precept has been spectacularly successful – a good reason for taking it seriously.

It is sometimes argued that products of evolution are different: evolution pursues a course of tinkering, François Jacob’s bricolage, yielding complex objects. Perhaps so, but that would not hold in the case of evolution of language if what I outlined before is correct. If so, then for language there is a special reason to seek simplicity, not holding generally: the actual conditions of language evolution. Considerations like these bear directly on the Minimalist Program MP, to which I’ll return.

The twin demands of learnability and evolvability, which appear on the surface to be in conflict, provide the condition for genuine explanation – for addressing the Galilean challenge. Genuine explanation is at the level of UG, in a form that satisfies these twin demands, relying on “third factor” principles, which can be presupposed. It is an austere requirement. Anything short of that is at best a partial account, which we can understand as clearing the way towards explanation – a very important endeavor: a carefully structured problem is a great advance over chaos, but only a step forward in the search for genuine explanation.

My feeling is that we may finally be in a position to take the Galilean challenge seriously and to provide some genuine explanations for fundamental features of language. That’s quite important if true. It would inaugurate a new stage in the ancient study of language.

Reviewing quickly some familiar ground, and skipping many details, the early proposals of generative grammar were dual: phrase structure grammar (PSG) for compositionality and projection (labeling), transformational grammar (TG) for displacement, both PSG and TG for linear ordering. Both PSG and TG were far too complex to meet the long-term goals of genuine explanation. The general assumption was that compositionality, linearity, and projection are natural and expected, while displacement, though ubiquitous, is a strange and more complex property of language, a kind of imperfection that has to be accommodated with some more complex mechanisms.

That’s a view that’s still widely held in current literature, erroneously I think. More recent work suggests that the opposite is true: displacement is the simplest of the computational operations of language and any special mechanisms to account for it are misconceived.

I’ll sketch one particular path of further development of the “generative enterprise,” one that seems to me particularly promising.

In the 1960s, PSG was abandoned, on solid grounds. It permits a vast number of impossible rules (e.g., VP → N PP) and the symbols, such as NP, illegitimately suggest properties that are extrinsic to PSG. In retrospect, we can say that PSG conflated three phenomena that should be considered separately: composition, linear order, and projection.

By 1970, PSG had been replaced by X-bar theory, which eliminated the host of impossible rules and the illegitimacy of the symbols, and separated linear order from compositionality and projection.

These improvements had several consequences that were not fully appreciated at the time. X-bar systems generate abstract structures with no linear order. But linguistic expressions plainly have linear order, which has to be assigned somehow. And languages of course differ in how it is assigned. Thus English is uniformly head-initial and Japanese uniformly head-final; the rich research into these matters in the years that followed has shown that while not universal, such systematic cross-category choices are by far the most common.

It follows that there is really no alternative to the Principles and Parameters (P&P) approach that crystallized a decade later, though it took some time for this to be recognized.

Furthermore, different parametric choices yield the same interpretation: V-O has the same interpretation as O-V. That suggests that linear order is a matter of externalization, not part of core I-language that expresses thoughts – the narrow syntax that yields interpretations at the conceptual-intentional CI interface. That was a first step towards much more far-reaching conclusions about language architecture, which became clearer once compositionality and displacement were unified within the MP.
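The point that one unordered structure can receive different linear orders under different parametric choices can be illustrated with a small sketch. It is only a toy under stated assumptions: the set `HEADS`, the functions `is_head` and `linearize`, and the example structure are all hypothetical names introduced here, and real externalization involves far more than a single head-direction switch.

```python
# Toy model: lexical items are strings, syntactic objects are two-membered frozensets.
HEADS = {"went", "to"}  # hypothetical marking of which lexical items are heads


def is_head(x):
    """A member counts as the head if it is a lexical item listed in HEADS."""
    return isinstance(x, str) and x in HEADS


def linearize(obj, head_initial):
    """Externalize an unordered structure as a list of words."""
    if isinstance(obj, str):
        return [obj]
    a, b = tuple(obj)
    head, comp = (a, b) if is_head(a) else (b, a)
    first, second = (head, comp) if head_initial else (comp, head)
    return linearize(first, head_initial) + linearize(second, head_initial)


# One structure with hierarchy but no order.
PP = frozenset({"to", "Tokyo"})
VP = frozenset({"went", PP})

head_initial_order = " ".join(linearize(VP, head_initial=True))
head_final_order = " ".join(linearize(VP, head_initial=False))
```

The same object `VP` is the input in both cases; only the externalization parameter differs, which is the sense in which order is not part of the structure itself.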

X-bar theory still conflates projection and compositionality. That runs aground with XP-YP constructions, which are ruled out in principle by X-bar theory, but which abound: displacement, subject-predicate, many others. Artifices were used in practice to impose endocentricity, but these we want to avoid. These problems are overcome substantially by labeling theory, which uses the optimal computational device of minimal search to accommodate exocentric constructions and to determine when movement must, may, and may not take place. The shift from projection to labeling brings to light hidden problems in accounts of projection (e.g., what exactly is a head, and at what level of generation is such a notion established?). And other problems arise as well. But the general picture is I think a substantial improvement over X-bar theory. This step finally separates three independent factors: compositionality, labeling, and ordering, the last not part of core I-language (narrow syntax).

The existence of exocentric constructions requires postulation of a workspace WS, containing the elements accessible to the combinatorial operations. Thus before NP and VP are combined, each must be constructed separately in WS. We can understand WS to be the set containing the Lexicon and all syntactic objects already constructed, though there is more to say.

By the 1990s, it seemed to some of us that enough had been learned so that it would be possible to confront directly the task of constructing genuine explanations, in what came to be called the MP.

Pursuing the program, we seek the simplest computational operation OP, one that is in fact incorporated in some manner in every computational operation, and we ask how much can be explained by OP along with the other factors that enter into acquisition of I-language: external data, language-independent (“third factor”) principles, and conditions imposed by the systems with which language interacts (“bare output conditions,” BOCs). Third factor conditions include computational efficiency, an essential element of the quest for simplicity, hence explanation.

If OP suffices to explain some linguistic phenomenon, we approach genuine explanation. The learnability problem is solved: there is no learning, though there might be triggering of innate properties by experience, a familiar phenomenon. The evolvability problem is solved as far as possible within linguistics: The Basic Property holds as a matter of fact, and requires a computational operation, so it’s to be expected that nature selected the simplest one, OP; and if indeed OP is contained in all other operations, the question doesn’t even arise. Just how OP is implemented in the brain, and how this implementation evolved, are interesting questions. There are some intriguing ideas discussed in Angela Friederici’s recent book, Language in our Brain.

The simplest computational operation is binary set formation, called “Merge” in recent literature. An account for some phenomenon in terms of Merge satisfies the dual conditions, counting as genuine explanation. Almost. I’ll return briefly to some qualifications.

We therefore take Merge(X,Y) = {X,Y}, where X and Y are either lexical items or syntactic objects in WS, already generated. There are, of course, many subcases of Merge, two general ones of particular significance: External Merge (EM), where X and Y are distinct; Internal Merge (IM) where one is contained in the other; say Y is contained in X – technically, is a term of X, where Z is a term of W if Z is a member of W or of a term of W. Internal Merge yields displacement, with two copies. Thus if Y is contained in X, then Merge(X,Y) = {Y, {X,Y}}.
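A minimal sketch of these definitions may help fix ideas. It assumes, purely for illustration, that lexical items are strings and syntactic objects are Python frozensets; the function names `merge`, `terms` and `is_internal` are introduced here and are not part of any established formalization. The sketch follows the definition Merge(X,Y) = {X,Y} literally, so that under Internal Merge the displaced object simply occurs twice: once as a member of the new set and once inside the other member.

```python
def merge(x, y):
    """Binary set formation: Merge(X, Y) = {X, Y}."""
    return frozenset({x, y})


def terms(w):
    """All terms of W: the members of W and the members of any term of W."""
    found = set()
    if isinstance(w, frozenset):
        for z in w:
            found.add(z)
            found |= terms(z)
    return found


def is_internal(x, y):
    """True when Merge(x, y) is a case of Internal Merge: one is a term of the other."""
    return y in terms(x) or x in terms(y)


# External Merge: the two objects are distinct, neither contained in the other.
np = merge("the", "book")
vp = merge("read", np)

# Internal Merge: np is a term of vp, so the output contains two copies of np,
# one as a member of the new set and one in its original position inside vp.
moved = merge(vp, np)
```

Note that `merge` itself is the same operation in both cases; `is_internal` only classifies an application after the fact, in line with the remark that EM and IM are subcases rather than separate operations.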

To clarify some misunderstandings in the literature, note that there is a single operation Merge, with these two subcases. There is no operation “Re-Merge” or “Copy”; just Merge. There is no “reconstruction” operation: just two copies (in fact, sometimes many more). Furthermore, there is no way to construct Merge from more elementary operations; in particular, no evolutionary path from some subcase of Merge to Merge. In fact, any subcase of Merge alone is more complex than Merge; thus EM(X,Y) is {X,Y} with the extra condition that X and Y are distinct.

Nevertheless, one subcase of Merge might be simpler than another in the way it functions. That is indeed the case. Compare EM and IM. To apply EM, we must search all of WS: the Lexicon (which is huge) and all objects previously constructed (a set that can grow without limit as constructions become more complex). To apply IM, we search only a single object, a vastly simpler process.

That raises a question: why does language ever use EM? The answer is straightforward. IM alone yields structures that have no interpretations at the CI interface. It does not yield expressions of thought, hence does not constitute an I-language, within the biolinguistics framework. There are no head-complement or XP-YP constructions, hence no theta structure. In fact, there can be only a single-membered lexicon. EM overcomes these inadequacies, yielding a possible I-language. Hence both EM and IM must be available: EM to yield an I-language in the first place, IM because it would require an unmotivated stipulation to bar the simplest operation. And indeed both are ubiquitous, with distinct semantic roles, a property sometimes called “duality of semantics.”

Consider a system that only uses IM. It can be shown that this in effect yields the successor function, and with some limited tweaking, all of arithmetic. That is a suggestive conclusion, perhaps providing a solution to a problem that greatly troubled Darwin and Wallace: why is knowledge of arithmetic universal (so they assumed, correctly it seems, though like language and many other innately determined properties of the organism, it must be triggered by experience)? A serious problem for them, since it could not have evolved through natural selection. A possible answer is that Merge appeared at some point in the evolutionary record, perhaps along with Homo sapiens, providing the Basic Property of language and also arithmetic.
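The text does not spell out the construction, so the following is only one possible encoding, offered as a sketch. It assumes a lexicon with the single item `"a"`, takes the starting object to be Merge(a, a), which collapses to {a} under set formation, and defines the successor of n as Merge(n, a), an instance of IM because a is already a term of n. The names `ONE`, `successor`, `value` and `add` are hypothetical, and `add` is just one way the "limited tweaking" might go.

```python
A = "a"  # the single lexical item (hypothetical name)


def merge(x, y):
    """Binary set formation: Merge(X, Y) = {X, Y}."""
    return frozenset({x, y})


# Assumption of this sketch: {a, a} collapses to {a}, which serves as the first number.
ONE = merge(A, A)


def successor(n):
    """Merge n with the lexical item, which is already a term of n (Internal Merge)."""
    return merge(n, A)


def value(n):
    """Read a number off an object as its depth of nesting."""
    if isinstance(n, str):
        return 0
    return 1 + max(value(m) for m in n)


def add(m, n):
    """Addition as repeated application of successor, value(n) times."""
    result = m
    for _ in range(value(n)):
        result = successor(result)
    return result


TWO = successor(ONE)    # {a, {a}}
THREE = successor(TWO)  # {a, {a, {a}}}
FIVE = add(TWO, THREE)
```

Each application of `successor` adds exactly one level of nesting, which is what lets the objects stand in for the natural numbers.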

An important discovery of recent years is that Merge alone provides far-reaching genuine explanations. One is unification of compositionality and displacement. Far from being a curious imperfection of language as commonly assumed (by me in particular), displacement is expected: it would require an unmotivated stipulation to bar it. It is the use of EM that requires explanation, which is immediate along the lines mentioned.

It follows as well that reconstruction is automatic with its many complex interpretive consequences, now with a solid basis for genuine explanation.

Proceeding further, with unification, what was suggested by X-bar theory becomes much clearer. Merge yields sets, with hierarchy but no order, just like X-bar theory. Why then does linear order exist? The reason is clear. It is required by the SM interface. The articulatory system requires linear order; it cannot produce structures (for sign, with the extra-dimensionality available, the linearity condition is slightly relaxed).

The SM systems have nothing to do with language. They were in place long before language emerged, and any SM system of sufficient richness will do for externalization of I-language. Looking at the general architecture of language, it seems to consist of narrow syntax generating CI structures that are expressions of thought, and a process of externalization that may (but need not) map them to one or another SM system.

If so, then externalization is not strictly speaking a property of language alone; rather of an amalgam of language and some language-independent system. We might anticipate then that the mapping should be complex and unstable, and able to take many forms; in short, that externalization would be the primary locus of the apparent complexity, diversity, and mutability of language. That seems increasingly to be a plausible thesis. It may even turn out that the core of language, constructing thought, is close to a common human possession. If true, it is an important insight into what kind of creatures we are.

Note that there is an important distinction between the linearity condition imposed by the SM system and the thought-related conditions at the CI interface. The former are true BOCs, conditions entirely independent of language, with properties that have nothing to do with language. I-language could exist without any means of externalization – and indeed may have at its earliest stages. In contrast, if the thought-related conditions are not satisfied, as in a system restricted to IM, we do not have an I-language at all.

If externalization is an ancillary property of language, then communication, which relies on externalization, is even more remote from the essential nature of language. That conclusion is supported by investigation of cases of conflict between communicative efficiency and computational efficiency, the latter an essential property of language, imposed by third-factor conditions. In every known case, the former is sacrificed, often leading to difficulties in communication: structural ambiguity, garden path sentences, islands, filler-gap problems. The conclusion conflicts with a widely held contemporary doctrine that communication is the function of language, which developed in some fashion from animal communication systems. The doctrine seems groundless, refuted by substantial evidence. Further evidence is provided by the minimal meaning-bearing elements of human language, which differ radically from the components of animal systems in the ways they relate to the world, a topic I will put aside here.

The traditional doctrine that language is primarily an instrument of thought is supported further by another significant problem that has a genuine Merge-based explanation: the puzzling property of structure-dependence of rules, holding for all languages and all constructions. Acquisition studies show that the property is operative as early as children can be tested; at 30 months according to some recent experiments. It is evident that this principle cannot be learned even by a single child, let alone by all children in all languages. Curiously, there have been many efforts to show how the property might be learned from massive data, all refuted, but all pointless in the first place given the nature of the problem.

The property is indeed puzzling. It means that children ignore everything they hear, all of which observes linear order, and attend only to what they never hear: structures. Their mental operations are limited to these, ignoring much simpler computations based on the linear order that constitutes their entire experience. In a Merge-based system, the problem doesn’t arise. The child has no option other than keeping to structure-dependence.
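The contrast between the two kinds of computation can be made concrete with a toy example of auxiliary fronting. Everything here is an assumption made for illustration: the hand-built bracketing of "the man who is tall can swim", the inventory `AUXILIARIES`, and the function names. Lists are used only so that the words can also be read left to right; the structural rule never consults that order, only depth of embedding, found by a breadth-first (minimal) search.

```python
from collections import deque

AUXILIARIES = {"is", "can"}  # toy inventory, an assumption of this sketch

# "the man who is tall can swim", bracketed by hand (hypothetical analysis).
SENTENCE = [["the", "man", ["who", ["is", "tall"]]], ["can", "swim"]]


def flatten(node):
    """Read the words off left to right."""
    if isinstance(node, str):
        return [node]
    return [word for child in node for word in flatten(child)]


def linear_rule(node):
    """Pick the first auxiliary in linear order."""
    return next(word for word in flatten(node) if word in AUXILIARIES)


def structural_rule(node):
    """Pick the least embedded auxiliary by breadth-first search."""
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if isinstance(current, str):
            if current in AUXILIARIES:
                return current
        else:
            queue.extend(current)
    return None


picked_by_order = linear_rule(SENTENCE)
picked_by_structure = structural_rule(SENTENCE)
```

The linear rule selects "is", which would yield the impossible question "is the man who tall can swim?"; the structural rule selects "can", giving "can the man who is tall swim?". In a system whose objects have hierarchy but no order, only the second computation can even be stated.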

Again, we have a genuine explanation for a deep and quite surprising principle of language, with many consequences. As discussed elsewhere, in this case there is also supporting neurolinguistic and psycholinguistic evidence.

To summarize, there is a strong case for genuine explanations for some fundamental and rather curious and puzzling principles of language:

(1) The Basic Property;

(2) The ubiquity of displacement;

(3) The basis for reconstruction;

(4) Structure-dependence;

(5) The circumstances when IM must, may, and cannot apply.

One particularly exciting prospect is the range of new questions and problems that are proliferating. The obvious ones – vast in scope – are to discover how far Merge-based optimal explanation can reach. A related problem is to reconcile the strong argument that externalization is ancillary to core language with the evidence that some of its properties, though not at the interface (not narrow phonetics), seem to affect the operation of narrow syntactic rules. Another task is to determine in what ways UG assumptions must be enriched to accommodate phenomena of language. I think there is strong evidence that at least the asymmetric relation pair-Merge is needed. In the case of any such proposal, a central question is to determine to what extent it makes use of such third-factor principles as minimal search, thus reducing the UG contribution.

Still another task, which turns out to be quite challenging, is to overcome serious defects of Merge, which allows legitimate generation of ungrammatical sentences, violating no internal or external conditions. That requires principled means to restrict accessibility, raising many intriguing questions of considerable import.

A related matter is a conceptual problem with the way Merge-based systems have been formulated. A computational system moves through successive states, with operations that map one state to the next one. But what are the states of a Merge-based system? The answer is WS, the workspace. It constitutes the current state of the computation. Merge therefore applies to a state WS, mapping it to WS’ (as do other operations). The formulation of Merge has therefore been defective, and interesting questions arise as to how to reformulate it properly as a mapping of one WS to another.
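What such a reformulation might look like can be sketched in a few lines, with the caveat that this is an assumption made for illustration and not the formulation arrived at in the work alluded to. Here a workspace is a frozenset of syntactic objects, only the lexical items actually in use are placed in it, and `merge_ws` (a hypothetical name) maps a state WS to a new state WS′. Removing the inputs from the new state is a design choice of the sketch.

```python
def terms(w):
    """All terms of W: the members of W and the members of any term of W."""
    found = set()
    if isinstance(w, frozenset):
        for z in w:
            found.add(z)
            found |= terms(z)
    return found


def accessible(ws):
    """Objects available to Merge: members of the workspace and their terms."""
    acc = set(ws)
    for obj in ws:
        acc |= terms(obj)
    return acc


def merge_ws(ws, x, y):
    """Map workspace WS to WS': add {x, y} and drop the members it was built from."""
    available = accessible(ws)
    if x not in available or y not in available:
        raise ValueError("both objects must be accessible in the workspace")
    return frozenset(ws - {x, y}) | {frozenset({x, y})}


ws0 = frozenset({"the", "book", "read"})
np = frozenset({"the", "book"})
ws1 = merge_ws(ws0, "the", "book")   # External Merge: state now holds "read" and np
vp = frozenset({"read", np})
ws2 = merge_ws(ws1, "read", np)      # External Merge: state now holds vp alone
ws3 = merge_ws(ws2, vp, np)          # Internal Merge: np is a term of vp, not a member of ws2
```

Each call takes a whole state and returns a whole state, so a derivation is the sequence ws0, ws1, ws2, ws3 rather than a sequence of isolated objects.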

A good deal of work has been done on these problems, with some interesting results, matters I cannot go into here.

In the early years of the generative enterprise questions of the kind we have been discussing could scarcely be envisioned. And until fairly recently, genuine explanation seemed a dream far out of reach. My own judgment, as mentioned earlier, is that we are entering a new and exciting phase in the millennia-long study of language.

How to Cite

CHOMSKY, N. Fundamental Operations of Language: Reflections on Optimal Design. Cadernos de Linguística, [S. l.], v. 1, n. 1, p. 01–13, 2020. DOI: 10.25189/2675-4916.2020.v1.n1.id271. Available at: Accessed on: 7 Dec. 2023.



© All Rights Reserved to the Authors
