“Structured heterogeneity”, a founding concept of variationist sociolinguistics, puts focus on the ordered social differentiation in language. We extend the notion of structured heterogeneity to formal phonological structure, i.e., representations based on contrasts, with implications for phonetic implementation. Phonology establishes parameters for what varies and how. Patterns of stability and variability with respect to a given feature’s relationship to representations allow us to ground variationist analysis in a framework that makes predictions about potential sound changes: more structure correlates to more stability; less structure corresponds to more variability. However, even though all change requires variability, not all variability leads to change. Two case studies illustrate this asymmetry, keeping a focus on phonetic change with phonological stability. First, Germanic rhotics (r-sounds) from prehistory to the present day are minimally specified. They show tremendous phonetic variability and change but phonological stability. Second, laryngeal contrasts (voicing or aspiration) vary and change in language contact. We track the accumulation of phonetic change in unspecified members of pairs of the type spelled <s> ≠ <z>, etc. This analysis makes predictions about the regularity of sound change, situating regularity in phonology and irregularity in phonetics and the lexicon. Structured heterogeneity involves the variation inherent within the system for various levels of phonetic and phonological representation. Phonological change, then, is about acquiring or learning different abstract representations based on heterogeneous and variable input.

The association between structure and homogeneity is an illusion. Linguistic structure includes the orderly differentiation of speakers and styles through rules which govern variation in the speech community; native command of the language includes the control of such heterogeneous structures. (WEINREICH; LABOV; HERZOG, 1968, p. 187–188)

Introduction: Structure and variation

Sound change is one of the oldest areas of linguistic inquiry, and is today an area of vibrant and rapid progress. Current work pursues several different lines of investigation, including formal phonology, phonetics and variationist theories and methods. One parameter along which we can consider this rich array of approaches is how they deal with structure and variation. Broadly speaking, linguists acknowledge the presence and importance of both, but little work addresses the relationship between them directly. We argue for a very tight inverse relationship (more structure correlates with less variation and vice versa) that is constrained by historical and social factors. In particular, (abstract, formal) structure predicts where we are more likely to see certain kinds of variation: the absence of structure (underspecification) corresponds to more variation, whereas phonological content limits the range of variation. We start from an old observation by Trubetzkoy (1977, originally published 1939, and cited by DRESHER 2009, p. 6–7, 45) that phonologically unspecified segments, by virtue of lacking positive content, “vary greatly with respect to [their] realization”. Because variation is a requirement for change, we pursue the idea that specification or lack thereof correlates with what changes and how. Moreover, we explore the idea that the lack of specification in some sense frees up properties of unspecified segments to be used to signal regional or social correlations (see SALMONS, 2020).

While the particular focus we place on structure and variation is new, to our knowledge, interest in the role of variation in phonological theory has been a topic of constant, albeit dramatically inconsistent, consideration. Relatively close to our perspective is SAPIR (1925), who shows how ‘sound patterns’ (= phonology) constrain variation, where the structural role of English /ʍ/ in words like when allows less variation than the articulatorily closely related sound associated with blowing out a candle. In the founding work on sociophonetics, Weinreich, Labov, and Herzog (1968) initiate the extremely productive move toward an explicit focus on the role of variation, especially in sound change but also in phonology, as followed on by Labov (1969), creating the notion of ‘variable rules’. From the perspective of formal phonological theory, Coetzee and Pater (2014) develop an account of phonological variation, building on earlier Optimality Theory work using what they call “Partially Ordered Constraints”. Probably the best known proposal for connecting phonetic variation with contrast comes from Dispersion Theory, according to which contrast correlates with the distribution of phonemes within phonetic space (LILJENCRANTS; LINDBLOM, 1972; but see HALL, 2011; FRUEHWALD, 2017).

Most recently, Fruehwald (2017) anticipates some aspects of our project, evaluating phonological bounding of phonetic change, defined specifically as changes in speakers’ knowledge of the implementation of continuous phonetic variables (FRUEHWALD, 2017, p. 26). We pursue this question as well, while also considering how changes in those continuous phonetic representations do or do not affect more abstract, discrete representations.

We generalize here a point from Dresher (2009, p. 180): “the phonetic properties of a vowel obviously influence its phonological representation; but this influence is not simply one way, and the phonological representation can in turn affect the phonetics, by delimiting the space within which the vowel can range.” That is, phonological specification establishes the boundaries within which phonetic variation occurs. Accordingly, variation in phonetic properties that correspond to non-contrastive features is greater than in those that correspond to contrastive features. For example, Natvig (2018) finds greater regional and social variation in formant trajectories of Norwegian vowels for non-contrastive features than for contrastive ones. Furthermore, Tanner, Sonderegger, and Stuart-Smith (2020) find parallel relationships for Japanese voicing cues, leading them to conclude that structured variation is bounded by language-specific contrasts. We seek a more explicit understanding of this relationship between phonological structure and variation, synchronically and diachronically. Specifically, we propose a move toward a model of phonology that captures the observation that “not all variability and heterogeneity in language structure involves change; but all change involves variability and heterogeneity” (WEINREICH; LABOV; HERZOG, 1968, p. 188). Having framed our project, we turn next to the theoretical context (section 1). We continue with one case study of rhotics (section 2) and another of laryngeal contrasts (section 3), before concluding in section 4.

1. Theoretical background: Phonetics, phonology and sound change

We traditionally think of phonological change in terms of alterations of the sound system, including the phonemic inventory, as sets of contrasts and the features used to define those contrasts. Outcomes of phonological change thus typically involve mergers and splits or chain shifts where classes of phonemes continue to exist but are defined by different featural characterizations (OXFORD, 2015). In order to examine how structure does and does not change over time, we now summarize our position on the role and content of phonological representations.

Our view of phonology uses Modified Contrastive Specification (MCS), which centers contrast as the fundamental role of phonological representations (DRESHER; PIGGOT; RICE, 1994). Building on observations that representations are based on inventory-specific contrasts (AVERY; RICE, 1989), MCS proposes that features are organized in a hierarchical structure. Dresher (2009) develops this position, arguing that contrastive features partition phonemic inventories into natural classes, one feature at a time, based on language-specific phonological patterns. This feature-by-feature assignment is formalized as the “Successive Division Algorithm”, or SDA (DRESHER, 2009, p. 16), which is a method for establishing which features are present in phonological representations and which features are absent, or redundant. These representations, then, are inherently underspecified for content. Following the contrastivist hypothesis (HALL, 2007), the features selected for specification via the SDA are those features that are active in phonological rule alternations.
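
The feature-by-feature division described above can be illustrated with a short Python sketch. It is a simplified reading of the SDA under privative assumptions, not Dresher's own formalization: the function name `successive_division`, the toy inventory, and the phonetic properties assigned to each segment are hypothetical and chosen only for illustration. A feature is recorded as contrastive for a segment only if it actually divides the set currently under consideration; every other property stays unspecified (redundant).

```python
def successive_division(inventory, ordering):
    """Assign contrastive (privative) features by successive division.

    inventory: dict mapping each phoneme to the set of phonetic
               properties it shows on the surface (toy values).
    ordering:  list of features, highest scope first.
    Returns a dict mapping each phoneme to its list of specified features.
    """
    specs = {phoneme: [] for phoneme in inventory}

    def divide(members, features):
        # Stop when the set is a single phoneme or no features remain.
        if len(members) <= 1 or not features:
            return
        feature, rest = features[0], features[1:]
        marked = [p for p in members if feature in inventory[p]]
        unmarked = [p for p in members if feature not in inventory[p]]
        if marked and unmarked:
            # The feature divides this set, so it is contrastive here.
            for p in marked:
                specs[p].append(feature)
            divide(marked, rest)
            divide(unmarked, rest)
        else:
            # The feature does not divide this set; try the next one.
            divide(members, rest)

    divide(list(inventory), ordering)
    return specs


# Hypothetical toy inventory; "voiced" and "coronal" are never selected
# because they are not in the ordering, so they remain redundant.
TOY_INVENTORY = {
    "t": {"consonant", "obstruent", "coronal"},
    "n": {"consonant", "nasal", "coronal", "voiced"},
    "l": {"consonant", "lateral", "coronal", "voiced"},
    "r": {"consonant", "coronal", "voiced"},
    "a": {"voiced"},
}
ORDER = ["consonant", "obstruent", "nasal", "lateral"]

specs = successive_division(TOY_INVENTORY, ORDER)
for phoneme, features in specs.items():
    print(phoneme, features)
```

In this toy run, /r/ receives no feature beyond [consonant], because the features that divide the sonorants are carried by /n/ and /l/. A different ordering or inventory would yield different specifications, which is the sense in which representations are inventory-specific.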

The set of ranked phonological features is a contrastive hierarchy built on the relative scope or order of phonological features (DRESHER, 2009, p. 17). As an example, the application of the SDA to German consonants results in the contrastive hierarchy [consonant] < [obstruent] < [nasal] < [lateral], as in (1) (with further specification to place for nasals and manner and place for obstruents). Here, the categories [(sonorant)], [(oral)], and [(rhotic)] define natural classes that contrast with the features in the contrastive hierarchy. These could also be marked as positive and negative feature values. As a framework, MCS is agnostic with respect to whether features are equipollent (binary features) or privative (presence vs. absence of features), but we adopt a privative feature model (see below).

(1) Contrastive Hierarchy of German Consonants

Figure 1.

Contrastive hierarchies define language-specific contrasts based on how a given language organizes its inventory into natural classes. A desirable outcome of this position is that the ‘same’ sound in two languages may have different phonological representations based on how that sound relates to the other phonemes in the language. For example, some German varieties have an alveolar trill /r/ (WIESE, 2001). Malayalam also has a trilled /r/, in a natural class with retroflex /ɭ/, both specified for [RTR] and contrasting with three other liquids /ʐ, l, ɾ/ (NATVIG, 2020, p. 14–15). In both German and Malayalam, surface [r] displays a rich collection of phonetic properties in common, yet how those features are codified into the phonological representations of a language is dependent upon the entire system of language-specific contrasts and phonological patterns (AVERY; RICE, 1989; DRESHER; PIGGOT; RICE, 1994; RICE, 1999; 2009). Put another way, structure in our sense is about more than a particular surface form and how that may change or vary, but rather also about how a given sound relates to the broader organization of contrastive features. We also include here processes that render underspecified phonological categories pronounceable, and turn now to those issues.

Phonological underspecification is critical for modelling variation of sound patterns (e.g., HALL, 2011). We see this not only in the MCS position of minimal specification in order to distinguish a language’s phonemes, but also in the content of features that are assigned through the SDA. Here, we argue for privative feature content in the phonological representations. Trubetzkoy (1977, p. 67, see also CHOMSKY; HALLE 1968, p. 409) defines privative features as those that are present or absent in some segment, such as [nasal], present for segments like /m, n/ but absent in /p, b, t, d/. Current work often motivates privativity with arguments drawn on capturing active and non-active features in synchronic analyses: phonological activity drives the specifications for phonemes and natural classes of phonemes and, in our view, also supports the privative nature of those features. That is, contrastive oppositions are marked by the presence vs. the absence of a given feature ([Feature] vs. [ ]), where the specified [Feature] is accessible for phonological processes and the corresponding unspecified contrast with [ ] is phonologically inert, where the empty brackets represent a literal lack of specification.

A great deal of support for privative features comes from the assimilation of laryngeal features (i.e., voicing, aspiration, etc.), in a framework known as ‘laryngeal realism’ (IVERSON; SALMONS, 1995; 1999; AVERY; IDSARDI, 2001; HONEYBONE, 2005; SALMONS, 2020). A pervasive pattern is this: in a laryngeal contrast the feature from one set of the pair is active, in that it spreads to neighboring phonemes, where the other set is either the target of this assimilation and/or shows no active spreading of its surface features. For example, in English and the majority of Germanic languages (see SALMONS, 2020 for a survey of Germanic laryngeal phonetics and phonology) what is traditionally referred to as ‘voiceless’ /p, t, k/ are sources for the active spreading of [glottis], a feature used to encode aspiration and other phenomena as detailed below. Take, for example, the distributions of [s] and [z] in the plurals of ca[ps], ha[ts], and pa[ks], compared to ca[bz], fa[dz], and ba[gz], where /z/ is the underlying form of the sibilant (IVERSON; SALMONS 1995). In these examples, the ‘voicelessness’ ([glottis]) of final /p, t, k/ spreads to the following /z/; the opposite pattern, where ‘voicing’ spreads, is unattested in English and other ‘aspirating’ languages in the Germanic family (SALMONS, 2020). Under a privative model, the asymmetry of this assimilation is a consequence of representations: /p, t, k/ are specified for [spread glottis], whereas /b, d, g/ are unspecified. That is, this laryngeal contrast is characterized by a [spread glottis] vs. [ ] opposition, i.e., a fortis vs. lenis contrast. However, in languages with active assimilation to voiced obstruents, e.g., Dutch, French, Polish, the contrast is encoded as [slack vocal folds] for voiced /b, d, g/ against [ ] for voiceless /p, t, k/. 
Work in a similar spirit is increasingly reaching beyond laryngeal distinctions to cover phonetic and phonological contact patterns (NATVIG, 2019) and English vowel diachrony (PURNELL; RAIMY 2015; PURNELL; RAIMY; SALMONS, 2019).
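
The asymmetry in the English plural pattern described above can be sketched in a few lines of Python. This is only an illustration of the privative logic, under the assumption that the suffix is an unspecified /z/ and that only a specified [spread glottis] can spread; the names `laryngeal_spec` and `plural_sibilant` are hypothetical, and stems ending in sibilants (which involve an extra vowel) are left out.

```python
# Fortis stops are specified [spread glottis]; lenis stops carry no
# laryngeal feature at all, represented here as None (the empty [ ]).
FORTIS = {"p", "t", "k"}
LENIS = {"b", "d", "g"}


def laryngeal_spec(segment):
    """Return the privative laryngeal feature of a stop, or None."""
    return "spread glottis" if segment in FORTIS else None


def plural_sibilant(stem_final):
    """Surface form of plural /z/ after a stem-final stop.

    Only a specified feature can spread. An unspecified neighbor has
    nothing to contribute, so the suffix keeps its default realization.
    """
    suffix_feature = None                  # underlying /z/ is unspecified
    spreading = laryngeal_spec(stem_final)
    if spreading is not None:
        suffix_feature = spreading         # [spread glottis] spreads rightward
    return "s" if suffix_feature == "spread glottis" else "z"


for final in ["p", "t", "k", "b", "d", "g"]:
    print(final, "->", plural_sibilant(final))
```

Because the lenis set has no feature to spread, the sketch contains no rule for 'voicing assimilation' in this type of system; the absence of that rule is the representational point.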

With respect to the MCS, the SDA models the assignment of a feature to a set of phonemes; the other phonemes are then unspecified for that specific feature, shown as the empty set [ ]. For laryngeal contrasts, the two types discussed here are represented in (2). We follow a simplified model based on Avery and Idsardi (2001), where phonological features combine mutually exclusive articulatory gestures (dimensions). Therefore, [glottis] subsumes aspiration [spread glottis] and glottalization [constricted glottis]. On the other hand, [vocal folds] classifies features for voicing [slack vocal folds] and voicelessness [stiff vocal folds]. These contrastive hierarchies model the privative oppositions in the different systems, accounting for the patterns of activity in each.

(2) Privative laryngeal contrasts: fortis-lenis vs. voicing systems.

Figure 2.

The makeup of a language’s phonological representations (what is and what is not specified for contrastive purposes) has implications for variation and change. As we have already suggested, more specification corresponds to less variation in the phonetics because there is more representational substance for discrete category marking. Conversely, surface forms that correspond to features that are phonologically unspecified are prone to a higher degree of variability. We illustrate this in our discussion of modularity, which outlines processes for rendering all categories, regardless of their underlying representations, into pronounceable objects.

Following one long and robust tradition in linguistics, we simply assume a model of grammar relying on modularity. One important recent paper argues for “phonology and phonetics that are modularly separate and qualitatively different in their representational and computational details” (FRUEHWALD, 2017, p. 29). Although different in specifics, we adopt a similar orientation through the framework put forward in Purnell and Raimy (2015), namely that the sound system consists of a minimum of three representational levels. The most abstract level consists of the features required to represent the contrastive segments in a language’s phonemic inventory. In our view, these representations are the result of the application of the SDA to the inventory, with phonological activity driving the content of representational features. This is the Phonological level of representation.

At an intermediary module between phonology and phonetics, abstract phonemic categories are assigned articulatory instructions. Here, the Phonetic-Phonological level of representation completes phonological features with articulatory gestures and provides additional non-contrastive (i.e., redundant or predictable) gestures through enhancement processes. These serve to increase the acoustic and perceptual distinctions of contrasts (STEVENS; KEYSER; KAWASAKI, 1989), which themselves tend to be prone to individual and regional variability (KEYSER; STEVENS, 2006; HALL, 2011; NATVIG, 2020). Yet, the Phonetic-Phonological level of representation’s primary function is to build structure into underspecified abstract, phonemic categories in order to render them pronounceable. These gestures are in turn implemented at the Phonetic level of representation, where they are converted into continuous properties in the speech signal. In sum, these modular processes render categorical constituents into gradient variables in a step-by-step manner, as represented in Figure 1, from Natvig (2019, p. 91). Table 1 demonstrates the conversions of English laryngeal representations in (2a) throughout these levels of representation.

Figure 1. Levels of representation in the sound system.

Level of Representation | /p, t, k/ | /b, d, g/
Phonological | [consonant] [glottis] | [consonant] [ ]
Phonetic-Phonological | [consonant] [spread gl.]/[constr. gl.] | [consonant] ([stiff vf.])/[slack vf.]
Phonetic | [pʰ, tʰ, kʰ] (coda [pʔ, tʔ, kʔ]) | [p, t, k, b, d, ɡ]
Table 1. Levels of representation of English laryngeal contrasts.

Fruehwald (2017, p. 30) distinguishes between phonetic processes, including those that compute knowledge of phonetic targets, and phonetic representations. In the present discussion, we consider these both to be Phonetic-level operations; this is the level of representation that operates on the gradient representations Fruehwald (2017) defines as phonetic in his evaluations of types of change. For his purposes, a change such as Montréal French [r] to [ʀ] (SANKOFF; BLONDEAU, 2007) is not a phonetic change because it does not involve a continuous change in phonetic space (FRUEHWALD, 2017, p. 26; we discuss a similar change in Germanic languages in section 2). We agree that this type of change is not phonetic in nature, but we do not see it as a phonological change either, in the relevant sense of a change in contrastive features. Rather, we view it as a change in the gestural assignment of a phonological category, i.e., a Phonetic-Phonological-level change. In section 2, we elaborate on this position, particularly with respect to why we do not consider this a Phonological-level change, that is, a change in abstract, contrastive feature assignment. In sum, the modular sound system we adopt here defines specific representational types with distinct purposes. For a given phenomenon, this allows us to make tightly constrained predictions for, and analyses of, what is changing in the sound system and the potential impact that the progression of that change may have on that system.

As already noted, this has implications for variation and, accordingly, diachronic processes of change: the positive phonological content that is encoded in the specified side of a privative contrast provides explicit instructions to the articulatory organs for the purposes of phonetic implementation. There is then a bilateral relationship between the phonetics and the phonology: while a segment’s phonetic properties restrict the set of possible phonological features, its range of phonetic variation is bounded by its phonological representations (DRESHER, 2009, p. 180). Once again, this model makes an easily testable prediction of an inverse relationship between phonological specification and variation: surface properties that correspond to the unspecified side of a privative contrast are more prone to variability than their specified counterparts. Considering variation with respect to specified and unspecified phonological features within a modular framework further allows for the modelling of what a given variation has the potential to change: phonological feature, completion or enhancement gesture, or phonetic implementation in terms of duration and timing of gestures in the speech signal.

We now turn to deploying this machinery on two very different examples of variation and change over time. Our examples are rhotics and laryngeal contrasts, as already noted, and we examine how they do or do not change in terms of modules and where different types of variation can occur. We use these to demonstrate different types of change and their potential effects on the system.

2. Germanic (and other) rhotics: Phonetic variation, phonological stability

In recent work, we have argued that rhotics show a persistent tendency to be unspecified consonant sonorants, particularly in Germanic languages, in the present and the past (NATVIG; SALMONS, 2020). This tendency appears to be strong cross-linguistically as well, although there are exceptions based on differences in rhotic inventories and potential phonological activity from rhotic sources (NATVIG, 2020). For example, variation and changes in surface forms of /r/ between apical and uvular are ubiquitous in the Germanic family. Phonologically conditioned variation between apicals and uvulars (e.g., GEBHARDT, 1907; SJÖSTEDT, 1936), changes in language contact (KING; BEECH, 1998), cross-generational change (WIESE, 2001), and variation in acquisition (SELÅS; NETELAND, 2019) converge to demonstrate that apical and uvular – and many other – variants of Germanic /r/ naturally belong to the pool of variation. That is, these languages’ and varieties’ phonological representations do not limit the types of /r/ that occur.

In terms of phonological patterns, Howell (1991) shows that Germanic /r/ demonstrates consistent behavior across the family and over time, regardless of its surface form. In his examination of breaking (i.e., conditioned diphthongization before coda liquids) in Old English, Howell (1991) situates the pattern within diachronic trends in the Germanic family. He finds that synchronic phonetics in modern German varieties mirror the historical pattern, both in conditioning and outcome. Concretely, he demonstrates that /r/ shows gradient variation in how consonantal or vocalic it is based on its position in the syllable (HOWELL, 1991, p. 59). The extent to which /r/ conditions breaking, then, is not a matter of its specific surface form or contrastive features, but in how reduced and more vowel-like it is in a syllable coda before a consonant. Here, we view this coda r-vocalization as a lenition process that is a direct consequence of its underspecified structure. It is a conditioned lack of completion or implementation of non-contrastive gestures that results in a range from full deletion, to r-colored vowels, to ‘r-approximation’, where the lenited allophone presents as an approximant (NATVIG, forthcoming).

Taking up this spirit of analysis, we find r-lenition in almost every Germanic language, regardless of the prototypical surface form. For example, r-variation and coda r-deletion or vocalization occur in at least some varieties of English [ɹ], German [r, ɽ, ʀ, ʁ], Danish [ʀ, ʁ], and Norwegian and Swedish [ɾ, r, ʀ, ʁ] (see NATVIG; SALMONS, 2020 for discussion). Furthermore, Dutch, with regional and social variation favoring either alveolar or uvular /r/ (VERSTRAETEN; VAN DE VELDE, 2001), shows high rates of r-approximation in codas, primarily as the retroflex or bunched [ɹ] phone (SEGBREGTS, 2014, p. 188). Accordingly, this activity indicates that /r/ is a unified abstract construct regardless of its surface form. The relevant conditions are the environments in which /r/ occurs in connection with its variable implementation, particularly through r-vocalization. We take the extreme variation of rhotics, both phonologically conditioned and unconditioned, to be a product of their un(der)specified phonological character. In most Germanic languages, for instance, which contrast sonorants for [nasal] and [lateral] (and even [retroflex] in some varieties of Norwegian; see below), /r/ is unspecified for place features. It is at most specified as [consonant, sonorant] (NATVIG, 2020; NATVIG; SALMONS, 2020).

We consider diachronic changes in American Norwegian /r/ and contact with English in this context. Many American Norwegian varieties, like the one discussed here, are Eastern Norwegian in origin (JOHANNESSEN; LAAKE, 2012). One of the features of the Eastern Norwegian sound system is that in addition to the nasals, a lateral, and an /r/ phoneme with alveolar tap and trill variants, these varieties also have a retroflex flap /ɽ/. It is generally agreed that the flap phoneme /ɽ/ phonologized from earlier /rð/ sequences and some /l/s, resulting in a contrast between /l/, /r/, and /ɽ/, but with some optional [r]-[ɽ] and [l]-[ɽ] alternations (KRISTOFFERSEN, 2000, p. 24). Although /ɽ/ is somewhat marginal as a phoneme, there are instances in Eastern Norwegian varieties where it is obligatory (KRISTOFFERSEN, 2000, p. 90). We therefore assume /ɽ/ is specified as a phonological category in Eastern Norwegian and propose the contrastive hierarchy in (3) for American Eastern Norwegian sonorants, following NATVIG (2020, forthcoming):

(3) Contrastive hierarchy of Eastern Norwegian sonorants.

Figure 4.

Over at least three to four generations in the United States, bilingualism and dialect contact have had numerous structural consequences for American Norwegian. For one, the retroflex flap /ɽ/ varies freely between [ɽ] and [ɹ] because the difference between these two sounds is not contrastive (HJELDE, 1996). Furthermore, Natvig (forthcoming) finds that Norwegian American speakers in western Wisconsin born between 1859 and 1957 produce [ɹ] for /r/ at increasing rates over time. On this view, [ɹ] is an allophone of both /ɽ/ and /r/, potentially obscuring the [retroflex] vs. [ ] contrast, which could in turn lead to phonological change via merger. However, Natvig (forthcoming) also shows that [ɹ] for /r/ is almost exclusively the product of retroflexion, where /r/ in coda preceding a coronal obstruent coalesces into a retroflex obstruent, for example sur [sʉːɾ] ‘sour’ and sur-t [sʉːʈ] ‘sour-neut’ (KRISTOFFERSEN, 2000, p. 96), with details just below. The relevant changes that result in higher rates of [ɹ] in American Norwegian are (1) the increase in the application of this retroflexion process, i.e., less r-deletion and more retroflexion, and (2) a phonetic change in the timing of the retroflex gesture, i.e., it occurs over a longer duration of the preceding vowel as in [sʉːʈ] > [sʉɹʈ]. Crucially, these changes in the surface form of /r/ do not result in new or different phonological representations.

The variable completion and implementations of /r/ as a phonological category for American Norwegian are presented in Table 2 (with /ɽ/ for comparison). The completion of a variable [aperture] gesture indicates whether /ɽ/ and /r/ occur as flaps and taps/trills, respectively; the presence of the null allophone [Ø] suggests full deletion of all features for /r/ regardless of level of representation. Importantly, the final column, describing /r/ in retroflex codas, is variably implemented with a [retroflex] gesture, resulting in [ɹ]. Full deletion is still a viable output of this environment; it is just one that is less common over time based on the data in Natvig (forthcoming).

Level of Representation | /ɽ/ | /r/ | /r/ ___ Coronal
Phonological | [consonant] [sonorant] [ ] [retroflex] | [consonant] [sonorant] [ ] [ ] | [consonant] [sonorant] [ ] [ ]
Phonetic-Phonological | [consonant] [sonorant] [retroflex] [aperture], [ ] | [consonant] [sonorant] [coronal] [aperture] | [consonant] [sonorant] [retroflex] [ ]
Phonetic | [ɽ, ɹ] | [ɾ/r, Ø] | [ɹ, Ø]
Table 2. Levels of representation and phonological processes of American Norwegian /ɽ, r/.

Like r-approximation in Dutch and r-vocalization that is and has been so pervasive in some Germanic varieties, r-approximation in American Norwegian is a conditioned lenition of /r/ in codas. In terms of the surface forms, [ɹ] is a new variant for /r/ in Norwegian, one that took root and expanded in the United States under contact with English. From this perspective, one might think that [ɹ] is the result of a transfer from English, an imposition of phonology of the socially dominant language. However, examining this [ɹ] variant in the context of both the Norwegian phonological system — its contrastive representations, gestural implementations, and variable processes — and the trajectories of coda /r/ in the history of the Germanic family reveals that American Norwegian r-approximation is the continuation of a very long trend. Although the specific social and phonological contexts in American Norwegian differ from Old English and Germanic breaking (HOWELL, 1991), the underlying process is the same, namely that /r/ is highly variable and subject to phonological conditions that are external to its representational structure (NATVIG; SALMONS, 2020).

3. The typology of laryngeal contrast and change: Accumulated phonetic variation, phonological change

Our second case study comes from Germanic laryngeal contrasts. We assume the view of ‘laryngeal realism’ already introduced in section 1, whereby most Germanic languages and many others of the world have a fortis-lenis opposition, contrasting [spread glottis] vs. [ ]. Previous work (some of it reviewed in SALMONS, 2020) has discussed a set of cases where laryngeal contrasts have changed phonologically. Languages such as Yiddish, Dutch, Frisian, and Scots appear to have adopted a system where the ‘voiced’ members of pairs are marked, presumably driven by imposition effects following language shift and contact in at least the first three cases.

Our focus here falls on another aspect that has only begun to attract attention, phonetic change without featural change. We follow a point from Sonderegger et al. (2020, p. 120), that “individual speakers control both linguistic and social-indexical contrasts”. Following our mantra that lack of phonological specification allows more phonetic variation, this opens the door to using that variation for social purposes. This means speakers may tend to control structural contrast by constraining variation in specified segments and social-indexical contrast by exploiting variation in unspecified segments.

We explore this briefly with three examples from different varieties of English. In each example, variation is structured around ‘trading relations’ (REPP 1982), namely that we use multiple cues to perceive contrasts, so that we can use more or less of one or another feature to signal a contrast. The first two examples involve changing realizations of laryngeal contrast and possible incipient phonological change, while the third turns to the social and regional implications of specification or lack thereof.

First, Purnell, Tepeli, and Salmons (2005) and Purnell et al. (2005) investigate possible effects of German on English spoken in Wisconsin, looking at word-final laryngeal contrast, words of the type bed vs. bet, over real and apparent time. German neutralizes this distinction word- or syllable-finally, while most varieties of English do not. Eastern Wisconsin was heavily settled by German immigrants and various effects of German have established themselves on English in the region.

In their data, the contrast was variably carried by duration of the preceding vowel (longer before laryngeally unmarked, lenis /d, z/ and shorter before [spread glottis], fortis /t, s/, etc.) and glottal pulsing on /d, z/ and related sounds. Figure 2 shows how those two acoustic manifestations are realized on such pairs for four groups of speakers by year of birth, with pulsing on the vertical axis and duration on the horizontal one. Fortis realizations are toward the bottom left corner, with less pulsing and/or shorter vowel duration, while the lenis ones are higher and to the right, showing more pulsing and/or longer preceding vowels. The tremendous phonetic variation across generations is immediately apparent, and the focus of their work is the apparent incipient neutralization of the laryngeal contrast in word-final position. Not remarked on until now, though, is what is shown by the two ovals: variation in the unmarked members of pairs (lenis) is greater than in the specified ones (fortis), reflected in the much smaller size of the bottom oval compared to the top one. In particular, on the vertical axis, a measure of phonetic voicing, the fortis realizations are far more compact than the lenis ones.

Figure 2. Changes in the realization of final laryngeal distinctions in eastern Wisconsin (PURNELL et al., 2005, p. 331).

This and other, still unpublished, work strongly suggests that younger speakers are indeed beginning to at least variably neutralize the distinction in final position. This is suggested in the figure by how closely the youngest generation’s (born 1966–1986) realization of /d, z/, etc. approaches the fortis realizations of earlier generations. For this youngest group, the line between the two series is shortest, indicating less phonetic distance between the series. We do not yet have data on perception, so we cannot comment on potential neutralization, but the unmarked member of the pair appears to be slipping into the bottom left. Further work is needed, but the feature appears to be a ‘sociolinguistic marker’ in the sense of LABOV (1972), namely a feature that is not generally subject to metalinguistic awareness but does show style shifting (see WALKER, 2020 and discussion below).
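The two observations drawn from Figure 2, that lenis tokens are more dispersed than fortis tokens and that the youngest cohort’s lenis series lies closer to the fortis series, can be made concrete with a small sketch. This is an illustration only and not the procedure used by Purnell et al. (2005): the token values below are invented, both cues are assumed to be normalized to a common 0–1 scale so that distances across the two axes are comparable, and the function names are hypothetical.

```python
import math
from statistics import mean

def covariance_ellipse_area(points, n_sd=2.0):
    """Area of the n_sd covariance ellipse for 2-D (duration, pulsing) tokens."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Sample variances and covariance of the two cues.
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # The ellipse area scales with the square root of the determinant.
    det = max(sxx * syy - sxy ** 2, 0.0)
    return math.pi * n_sd ** 2 * math.sqrt(det)

def centroid_distance(a, b):
    """Euclidean distance between the mean positions of two token sets."""
    ax, ay = mean(p[0] for p in a), mean(p[1] for p in a)
    bx, by = mean(p[0] for p in b), mean(p[1] for p in b)
    return math.hypot(ax - bx, ay - by)

# Invented tokens: (normalized vowel duration, normalized glottal pulsing).
fortis = [(0.30, 0.10), (0.32, 0.12), (0.28, 0.09), (0.31, 0.11)]
older_lenis = [(0.60, 0.40), (0.75, 0.70), (0.55, 0.20), (0.70, 0.55)]
younger_lenis = [(0.40, 0.15), (0.45, 0.20), (0.38, 0.12), (0.42, 0.18)]

# Dispersion: is the unspecified (lenis) series more variable?
lenis_more_dispersed = (
    covariance_ellipse_area(older_lenis) > covariance_ellipse_area(fortis)
)

# Distance: has the younger cohort's lenis series moved toward fortis?
younger_closer = (
    centroid_distance(younger_lenis, fortis)
    < centroid_distance(older_lenis, fortis)
)

print(lenis_more_dispersed, younger_closer)
```

With real measurements, the ellipse area would stand in for the size of each oval and the centroid distance for the length of the line joining the two series in a given cohort. Whether a shrinking distance amounts to neutralization remains a separate question that requires perception data.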

Second, Sonderegger et al. (2020), building directly on the work of Purnell, Tepeli, and Salmons and Purnell et al., show change in trading relationships over time, where contrast is maintained but with changes in phonetics. As noted above, Scots seems to differ from most kinds of English by virtue of having voice rather than spread glottis as the active feature (though see SONDEREGGER et al., 2020 on the complexity of this question). Key here is that the realization of aspiration vs. voicing may drift and that even highly aspirated stops may be unspecified for contrastive purposes. The same is true for corresponding decreases in voicing over time for the voiced series so long as the contrast between the two is maintained and the phonological activity remains the same. As a result, variation may occur on both sides of the contrast, as it does in their data, yet within the bounds of the specification for voiced stops, assuming that the representations have not changed over the timespan studied in their investigation (IVERSON; SALMONS, 1999; SALMONS, 2020). That is, although Sonderegger et al. (2020) show a decrease in the degree of voicing over time for all but the youngest generation in their study, the voiced stops (which we suspect are phonologically specified for [vocal folds]) still present with some voicing cues. Whether the representational features change, that is, from a system that marks /b, d, g/ with [vocal folds] to one that marks /p, t, k/ with [glottis], requires an investigation into assimilation patterns and other forms of phonological activity. Sonderegger et al. (2020, p. 119) see their results as consistent with the view that “the stop contrast is stable at a structural level and that its phonetic realization is changing over time”, and note that “changes in the phonetic realization like we observe here are a necessary, but not sufficient, condition for demonstrating such a shift” in phonological specification (2020, p. 120), which would require evidence from assimilations or other patterns. Patterns of regressive voice assimilation are in fact reported by Abercrombie (1967, p. 135-136) for “Educated Scots”, such that a word like blackboard is pronounced with a medial [gb] cluster (see also SALMONS, 2020, p. 126).

Third, Walker (2020) explores how Southern US English specifically exploits an enhancement for social/regional purposes. She gives clear arguments that Southern speakers have the same contrast of [spread glottis] vs. [ ] as most English speakers, but they have been shown to use more glottal pulsing (voicing) on the unmarked (lenis) member. This pattern is “subject to style-shifting but not to overt social commentary” (2020, p. 613), that is, it serves as a ‘marker’ of variation in the Labovian tradition, as we saw with Wisconsin final neutralization. Walker gathered data from Virginia speakers codeswitching between more Southern and more Standard English and from actors from various regions producing Southern and non-Southern speech. Both groups produced more glottal pulsing with Southern speech, pointing to active control of this variation and some level of awareness of the feature as marking Southernness.

In each of these instances we see phonetic change or change in the social meaning of phonetic realizations, but with phonological stability. This creates the groundwork for rich phonetic change, but it also means that structural change has a higher threshold. Purnell, Tepeli, and Salmons (2005) and Purnell et al. (2005) show that the acoustic traits used to realize a contrast can be highly structured and subject to phonetic change. Following traditional assumptions about phonological change, such structured changes in how a contrast is realized can open the door for learners to build new or different contrasts based on input. Interestingly, in each case we may be seeing the precursors to phonological change, with possible phonological neutralization in Wisconsin English and the beginnings of featural change in Scots. Tanner, Sonderegger, and Stuart-Smith (2020) show similar patterns outside of the Germanic family as well, finding a possible shift toward aspiration in at least some kinds of Japanese (which they note has been treated as a ‘hybrid’ system). While we are not aware of proposals of phonological change in Southern US English, the parallels seem clear and that is an issue to keep an eye on. Of course, structure takes on many forms, not just representations, but also processes. Finding clarity and explicitness here helps us understand the patterns we see and hone predictions for variationist and phonological research moving forward.

4. Summary and conclusions

With a focus on variation and sound change, we have argued for the need to integrate phonological theory and (often especially socio-) phonetics in new ways to understand sound change while taking seriously the notion of ‘structured heterogeneity’. Concretely, speech sounds have clear fundamental structures but also show vast variability, as successive generations of learners and speakers build grammars from rich and diverse input. In all, we find remarkable variability on the surface, but fundamental stability in abstract structure. A clearer picture of sound change emerges through investigation and analysis of both the ‘structure’ and the ‘heterogeneity’ of sound patterns.

For rhotics, contact-influenced variation in the Norwegian heritage language in the United States is consistent with the underspecified character of /r/. We find an English-like approximant [ɹ] that changes sub-contrastive domains of the sound system. Approximant allophones for /r/ in American Norwegian are generally restricted to retroflex environments, owing to [retroflex] being a feature in the American Norwegian contrastive inventory. Similar patterns obtain in heritage language rhotics more broadly, where surface properties often undergo changes, but they vary within the bounds of a language’s particular phonological structure (e.g., HENRIKSEN, 2015; AMENGUAL, 2016; KUPISCH, 2020). If we are correct that unspecified phonological categories correlate with increased phonetic variation, evidence from phonetic changes in rhotics shows that this variation is still constrained by the larger architecture of the phonological system at hand.

For laryngeal contrast in the examples discussed, on the assumption that one member of each pair is specified and the other is not, we see greater variation in the latter group. That phonetic variation, crucially, is structured and the changes appear to be systematic. If so, that means that new generations of learners receive structured, systematic input that differs from what earlier generations had, and are building their phonological system from that.

The perspective that variation is inversely related to phonological specification provides researchers with principled explanations for why some changes in sound patterns occur frequently, and even multiple times in the history of a language family, whereas some types of changes never occur or are exceedingly rare. Many other phenomena invite analysis from this vantage point, including rhotacism and zetacism (change of /r/ to sibilant) or dissimilation, as Rob Howell (p.c.) reminds us.

While the phenomena we discuss are not to that point yet, many before us have speculated that changes in phonetic realizations can reach a tipping point (SALMONS, 2021), triggering the acquisition of different contrasts or different featural representations. Work on phonological change typically more or less exclusively focuses on the latter, and variationists tend to focus more or less exclusively on the former — with notable exceptions such as Fruehwald (2017) and much work by Labov leading up to Labov (2020). We have argued that variationist work is valuable for formal phonology and that formal phonology makes predictions relevant to variationist research. Structured heterogeneity, a pillar of variationist work, informs phonological theory as well.

We have illustrated our case with sound patterns, but the basic phenomenon at hand, how variation shapes abstract grammar, is a fundamental issue about the nature of human grammar. Future work, if this direction proves fruitful, will need to integrate across traditions and subfields.

5. Acknowledgements

We are first and foremost deeply grateful to Abralin for the opportunity to participate in the remarkable Abralin ao vivo series and to submit to Cadernos. In addition to the ao vivo audience, we thank Sarah Holmstrom, Rob Howell, Monica Macaulay, Unn Røyneland, Jorunn Simonsen Thingnes, Maíra Sueco Maegava Córdula, and Alessandra Mara de Assis for comments and feedback on previous drafts of this manuscript. David Natvig is funded by the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement number 838164 and partly supported by the Research Council of Norway through its Centres of Excellence funding scheme, project number 223265.


ABERCROMBIE, David. Elements of general phonetics. Edinburgh: Edinburgh University Press, 1967.

AMENGUAL, Mark. Acoustic correlates of the Spanish tap-trill contrast: Heritage and L2 Spanish speakers. Heritage Language Journal, v. 13, n. 2, 88-112, 2016.

AVERY, Peter; IDSARDI, William J. “Laryngeal dimensions, completions and enhancement.” In: HALL, Tracy Alan. Distinctive Feature Theory, Berlin: Mouton de Gruyter, 2001, 47-70.

AVERY, Peter; RICE, Keren. Segment structure and coronal underspecification. Phonology, v. 6, n. 2, 179-200, 1989. DOI

CHOMSKY, Noam; HALLE, Morris. The sound pattern of English. Cambridge: MIT Press, 1968.

COETZEE, Andries W.; PATER, Joe. “The place of variation in phonological theory.” In: GOLDSMITH, John A.; RIGGLE, Jason; YU, Alan C.L. The Handbook of Phonological Theory, Chichester: Wiley Blackwell, 2014, p. 401-434.

DRESHER, B. Elan. The contrastive hierarchy in phonology. Cambridge: Cambridge University Press, 2009.

DRESHER, B. Elan; PIGGOTT, Glyne; RICE, Keren. “Contrast in Phonology: Overview.” In: DYCK, Carrie. Toronto Working Papers in Linguistics 13, Toronto: Department of Linguistics, University of Toronto, 1994, p. iii–xvii.

FRUEHWALD, Josef. The role of phonology in phonetic change. Annual Review of Linguistics, v. 3, 2017, p. 25-42. DOI

GEBHARDT, August. Grammatik der Nürnberger Mundart [Grammar of the Nuremberg dialect]. Leipzig: Breitkopf & Härtel, 1907.

HALL, Daniel Currie. The role and representation of contrast in phonological theory. 2007. Dissertation (Doctor of Philosophy in Linguistics) – University of Toronto, Toronto, 2007.

HALL, Daniel Currie. Phonological contrast and its phonetic enhancement: dispersedness without dispersion. Phonology, v. 28, n.1, p. 1–54, 2011. DOI

HENRIKSEN, Nicholas. Acoustic analysis of the rhotic contrast in Chicagoland Spanish. Linguistic Approaches to Bilingualism, v. 5, n. 3, 285-321, 2015.

HJELDE, Arnstein. “Some phonological changes in a Norwegian dialect in America.” In: URELAND, P. Sture; CLARKSON, Iain. Language Contact Across the North Atlantic, Tübingen: Max Niemeyer, 1996, p. 283–295.

HONEYBONE, Patrick. “Diachronic evidence in segmental phonology: The case of obstruent laryngeal specifications.” In: OOSTENDORP, Marc van; WEIJER, Jeroen van de. The Internal Organization of Phonological Segments, Berlin: Mouton de Gruyter, 2005, 317-352. DOI

HOWELL, Robert B. Old English breaking and its Germanic analogues. Tübingen: Max Niemeyer, 1991.

HOWELL, Robert B. Personal communication [email]. 2020, 24 September.

IVERSON, Gregory; SALMONS, Joseph. Aspiration and laryngeal representation in Germanic. Phonology, v. 12, n. 3, p. 369-396, 1995. DOI

IVERSON, Gregory; SALMONS, Joseph. Glottal spreading bias in Germanic. Linguistische Berichte, v. 178, p. 135-151, 1999.

JOHANNESSEN, Janne Bondi; LAAKE, Signe. Østnorsk som fellesdialekt i Midtvesten. [East Norwegian as a common dialect in the Midwest.] Norsk Lingvistisk Tidsskrift, v. 30, n. 2, p. 365–380, 2012.

KEYSER, Samuel; STEVENS, Kenneth. Enhancement and overlap in the speech chain. Language, v. 82, n. 1, p. 33-63, 2006. DOI

KING, Robert D.; BEECH, Stephanie A. On the origins of Germanic uvular [R]: The Yiddish Evidence. v. 10, n. 2, p. 279–290, 1998.

KRISTOFFERSEN, Gjert. The phonology of Norwegian. Oxford: Oxford University Press, 2000.

KUPISCH, Tanja. Towards modelling heritage speakers’ sound systems. Bilingualism: Language and Cognition, v. 23, n. 1, 29-30, 2020.

LABOV, William. Contraction, deletion and inherent variability of the English copula. Language v. 45, p. 715-762, 1969.

LABOV, William. Sociolinguistic patterns. Oxford: Blackwell, 1972.

LABOV, William. The regularity of regular sound change. Language, v. 96, n.1, p. 42-59, 2020. DOI

LILJENCRANTS, Johan; LINDBLOM, Björn. Numerical simulation of vowel quality systems: The role of perceptual contrast. Language, v. 48, n. 4, 839-862, 1972.

NATVIG, David. Contrast, variation, and change in Norwegian vowel systems. 2018. Dissertation (Doctor of Philosophy in Scandinavian Linguistics) – College of Letters and Science, University of Wisconsin–Madison, Madison, WI, 2018.

NATVIG, David. “Levels of representation in phonetic and phonological contact.” In: DARQUENNES, Jeroen; SALMONS, Joseph; VANDENBUSSCHE, Wim. Language Contact: An International Handbook, v. 1, Berlin, Boston: De Gruyter, 2019, p. 88-100. DOI

NATVIG, David. Rhotic underspecification: Deriving variability and arbitrariness through phonological representations. Glossa: A Journal of General Linguistics, v. 5, n. 1, 48, p. 1-28. 2020. DOI:

NATVIG, David. Variation and stability of American Norwegian /r/ in contact. Forthcoming.

NATVIG, David; SALMONS, Joseph. “Fully accepting variation in (pre)history: the pervasive heterogeneity of Germanic rhotics.” In: SUTCLIFFE, Patricia C. The Polymath Intellectual: A Festschrift in Honor of Robert D. King, Agarita Press, 2020, p. 81–102.

OXFORD, Will. Patterns of contrast in phonological change: evidence from Algonquian vowel systems. Language, v. 91, n. 2, p. 308-358, 2015. DOI

PURNELL, Thomas; TEPELI, Dilara; SALMONS, Joseph. German substrate effects in Wisconsin English: evidence for final fortition. American Speech, v. 80, p. 135–164, 2005. DOI

PURNELL, Thomas; SALMONS, Joseph; TEPELI, Dilara; MERCER, Jennifer. Structured heterogeneity and change in laryngeal phonetics: Upper Midwestern final obstruents. Journal of English Linguistics, v. 33, n. 4, p. 307–338, 2005. DOI

PURNELL, Thomas; RAIMY, Eric. “Distinctive features, levels of representation and historical phonology.” In: HONEYBONE, Patrick; SALMONS, Joseph. The Oxford Handbook of Historical Phonology, Oxford: Oxford University Press, 2015, p. 522-544. DOI

PURNELL, Thomas; RAIMY, Eric; SALMONS, Joseph. Old English vowels: diachrony, privativity, and phonological representations. Language, research reports, v. 94, n. 4, p. e447-e473, 2019. DOI

REPP, Bruno H. Phonetic trading relations and context effects: new experimental evidence for a speech mode of perception. Psychological Bulletin, v. 92, p. 81–110, 1982. DOI 10.1037/0033-2909.92.1.81

RICE, Keren. Featural markedness in phonology: variation. Glot International, Part 1, v. 4, n. 7, p. 3-6, Part 2, v. 4, n. 8, p. 3-7, 1999.

RICE, Keren. “Nuancing markedness: a place for contrast.” In: RAIMY, Eric; CAIRNS, Charles E. Contemporary views on architecture and representations in phonology, Cambridge, MA: MIT Press, 2009, 311-321. DOI

SALMONS, Joseph. “Germanic laryngeal phonetics and phonology.” In: PUTNAM, Michael T.; PAGE, B. Richard. The Cambridge Handbook of Germanic Linguistics, Cambridge: Cambridge University Press, 2020, p. 119-142.

SALMONS, Joseph. Sound Change. Edinburgh: Edinburgh University Press, 2021.

SANKOFF, Gillian; BLONDEAU, Hélène. Language change across the lifespan: /r/ in Montréal French. Language, v. 83, n. 3, p. 560-588, 2007.

SAPIR, Edward. Sound patterns in language. Language v. 1, n. 2, p. 37–51, 1925.

SJÖSTEDT, Gösta. Studier över r-ljuden i sydskandinaviska mål. [Studies of the r-sound in southern Scandinavian dialects]. Lund: Blom, 1936.

SONDEREGGER, Morgan; STUART-SMITH, Jane; KNOWLES, Thea; MACDONALD, Rachel; RATHCKE, Tamara. Structured heterogeneity of Scottish stops over the twentieth century. Language v. 96, n. 1, p. 94-125, 2020. DOI

SEBREGTS, Koen. The Sociophonetics and phonology of Dutch r. 2014. Dissertation (Doctor of Philosophy), University of Utrecht, Utrecht, 2014.

SELÅS, Magnhild; NETELAND, Randi. Norwegian children’s acquisition of the dialect feature r. RASK: Internationalt tidsskrift for sprog og kommunikation, v. 49, n. 1, p. 5–23, 2019.

STEVENS, Kenneth; KEYSER, Samuel; KAWASAKI, Haruko. “Toward a phonetic and phonological theory of redundant features.” In: PERKELL, Joseph; KLATT, Dennis. Invariance and Variability in Speech Processes, Hillsdale: Erlbaum, 1989, p. 426–449.

TANNER, James; SONDEREGGER, Morgan; STUART-SMITH, Jane. Structured speaker variability in Japanese stops: Relationships within versus across cues to stop voicing. The Journal of the Acoustical Society of America, v. 148, n. 2, p. 739-804, 2020.

TRUBETZKOY, Nikolai S. Grundzüge der Phonologie [Principles of phonology.]. Göttingen: Vandenhoeck & Ruprecht. 6th edition, 1977.

VERSTRAETEN, Bart; VAN DE VELDE, Hans. “Socio-geographical variation of /r/ in Standard Dutch.” In: VAN DE VELDE, Hans; HOUT, Roeland van. ‘r-atics: Sociolinguistic, phonetic and phonological characteristics of /r/. Brussels: Etudes & Travaux, 2001, p. 45–61.

WALKER, Abby. Voiced stops in the command performance of Southern US English. The Journal of the Acoustical Society of America, v. 147, n.1, 606-615, 2020.

WEINREICH, Uriel; LABOV, William; HERZOG, Marvin I. “Empirical foundations for a theory of language change.” In: LEHMANN, Winfred P.; MALKIEL, Yakov. Directions for Historical Linguistics: A symposium, Austin: University of Texas Press, 1968, p. 97–195.

WIESE, Richard. “The unity and variation of (German) /r/.” In: VAN DE VELDE, Hans; HOUT, Roeland van. ‘r-atics: Sociolinguistic, phonetic and phonological characteristics of /r/. Brussels: Etudes & Travaux, 2001, p. 11–26.