The 1980s was a period of great excitement about, and discovery of, the linguistic properties of certain pragmatic expressions that were given different names, as will be addressed below. SCHOURUP (2016) analyzed well, Oh, and like in conversation; SCHIFFRIN (1987) analyzed social markers (well), epistemic and inferential markers (I mean, y’know, now, Oh, then), and connectors (and, but, or, so, now, because), also in conversation. FRASER (1988) started his decades-long work categorizing such expressions, primarily in constructed examples.
All of these studies are synchronic. To date there have been extensive diachronic studies of the rise of epistemic pragmatic markers in English such as I mean, you see (e.g. BRINTON, 1996; 2008; 2019), but few studies have addressed the rise of pragmatic markers and especially of metatextual discourse markers such as after all, by the way from a constructionalist perspective (see, however, TRAUGOTT, 2018; 2020).1 Here I adopt a constructionist perspective on change (cf. TRAUGOTT and TROUSDALE, 2013), in which focus is on usage changes in textual data.
The purpose of this paper is two-fold. The main purpose is to argue that far more attention should be paid to discourse markers than is common in standard British and American grammars, especially in cognitive linguistics, including construction grammar. This argument is part of a larger enterprise dedicated to infusing more pragmatics into cognitive linguistics (see e.g. SCHMID, 2012) and construction grammar in particular (e.g. FINKBEINER, 2019). A secondary purpose is to exemplify the importance of context and the inferences it invites in the gradual development of discourse markers.
The terminology has been problematic. SCHOURUP (2016) used the term “pragmatic particle”, and this was adopted later by FOOLEN (1996), among others. SCHIFFRIN (1987) used the term “discourse marker”. FRASER (1988 et passim), however, distinguished between “pragmatic markers”, the larger set of markers such as Schiffrin discussed, and a sub-category of “discourse markers” which connect segment 1 with segment 2. AIJMER (2002) and FISCHER (2006) opted for the term “discourse particle”. I follow Fraser in referring to the umbrella set as “pragmatic markers” and the restricted subset of pragmatic markers that are connectors, including topic-orienting markers, as “discourse markers” (DMs). In other words, there is a taxonomy of markers, the most abstract domains of which are presented with examples in Figure 1 (“Ms” is short for “markers”):
A road map of the paper is as follows: the main characteristics of pragmatic markers, including discourse markers (section 2), why attention should be paid to discourse markers, especially by cognitive linguists (section 3), some basics of construction grammar (section 4), and some basics of a constructional approach to language change (section 5). A case study, the rise of the metatextual discourse marker by the way, is presented in section 6. A number of issues are discussed in section 7, including an alternative perspective on the rise of DMs: cooptation into thetical grammar, as proposed by e.g. KALTENBÖCK; HEINE; KUTEVA (2011) and HEINE (2013). Section 8 serves as a very brief conclusion.
2. The main characteristics of pragmatic markers
Pragmatic markers, including discourse markers, are crucial aspects of communication. Although mostly associated with conversation, they also occur in monologic talk and writing—one writes FOR someone after all! Typically they are not syntactically integrated with the host clause, so they are not part of “core syntax”, and are not truth-conditional. This has led some people to think of them as meaningless fillers, markers of disfluency, carelessness, or even the decline of the language (see BRINTON, 2019, p. 4). While it is true that they do not have contentful semantics, they have conventionalized pragmatic meaning, and can be used as powerful expressions of a speaker’s stance. Consider (1), an example from the Coronavirus Corpus. Here the combination of the DM cluster Oh, by the way plus the content of the following segment illustrates intentional rudeness (see TERKOURAFI, 2008).
(1) On March 12 in Hanover, N.J., local police said a woman, who was arrested on charges of driving under the influence, purposely coughed on an officer and said, “Oh, by the way, I have the coronavirus and so do you now.”
(3/05/2020 New York Times [Coronavirus Corpus])
Y’know, well, or so would not signal the same thing, and non-use of a discourse marker would be even less effective.
Prototypically, pragmatic markers are:
• multifunctional: they typically have several discourse functions, often associated with use in different positions in the clause, e.g., in pre-clausal position after all signals justification; in medial position, it signals epistemic ‘of course’, reminding Addressee/Reader (AD/R) of an obvious fact; and in post-clausal position it signals concessive/contrastive meaning (however, more recently justificational meaning has predominated in this position),
• subjective, expressing the Speaker/Writer’s (SP/W’s) attitudinal stance toward or evaluation of the content of the associated clause,
• some are intersubjective, paying attention to AD/R’s face, e.g. signaling a hedge or “in-your-face” aggression,
• often mobile: some may appear in several positions in the clause, e.g. actually (AIJMER, 1986),
• often, but far from always (DEHÉ and WICHMAN, 2010), set off by a prosodic envelope (or comma in writing),
• characterized by conventionalized pragmatics (HANSEN, 2012; FINKBEINER, 2019).
They are part of grammar, understood as the characterization of our knowledge of language (see FRASER, 1990).
The subset of pragmatic markers known in the narrow sense as discourse markers:
• provides contextualizing cues and processing instructions about how to interpret relationships between clauses, specifically discourse topic 1 (D1) and discourse topic 2 (D2),
• is characterized by the constructional form-meaning template:
[D1 DM D2] ↔ [SP/W signals the intended relationship between D1 and D2]
DMs may occur in clause-medial and post-final position as well as pre-clausal position, and may be preferred in post-final position (e.g. all the same ‘however’), but use in these positions will not be discussed here, given space constraints.
3. Why attention should be paid to pragmatic markers
I have found that neither pragmatic markers nor DMs are discussed as a class or in pragmatic terms in foundational British and American works of English grammar and cognitive linguistics. While exemplified in QUIRK et al. (1985), they are discussed mainly in syntactic terms as conjuncts and adjuncts, with little emphasis on pragmatics and discourse function. HUDDLESTON; PULLUM; PETERSON (2002) treat them as “supplements”, again with little attention to pragmatics. While BIBER et al. (1999) list several “linking adverbials” (called “metatextual DMs” here), the focus is on register, specifically how frequently the adverbials are used in conversation or academic prose. Cognitive linguists in the US and the UK likewise pay little attention to DMs. They are not discussed in LANGACKER (1987), although he says a usage-based approach2 to cognitive linguistics seeks to account for “a speaker’s knowledge of the full range of linguistic conventions” (LANGACKER, 1987, p. 494). Neither are they discussed in Frame Semantics (e.g. FILLMORE; KAY; O’CONNOR, 1988) or its offshoot, construction grammar (GOLDBERG, 1995; 2006; 2019). Connectives like and, but, or, so do of course appear in grammars, but as coordinators/connectives, not as pragmatic discourse linking expressions.
Absence of discussion of pragmatic discourse linking expressions in works on cognitive linguistics is surprising considering the enormous influence of SWEETSER’s (1990) book, From Etymology to Pragmatics, in which she discusses the multi-functionality of modals and of the connectives and, but, or, so (all of them DMs in SCHIFFRIN 1987). Sweetser’s hypothesis is that many expressions, particularly modals and connectives, are understood in three domains:
• socio-physical (real world)
• epistemic (world of reasoning and belief)
• speech act (textual/discourse world)
One of her examples is so (SWEETSER, 1990, p. 79; italics original):
(2) a. He heard me calling, so he came.
(The hearing causes the coming, in the real world.)
b. (You say he is deaf, but) he came, so he heard me calling.
(The knowledge of his arrival causes the conclusion that he heard me calling.)
c. Here we are in Paris, so what would you like to do on our first evening here?
(Our presence in Paris enables my act of asking what you would like to do.)
Sweetser’s work has been so important for metaphor studies that we have perhaps lost sight of her statement regarding interpretations in the three domains: “the choice of a ‘correct’ interpretation depends not on form, but on a pragmatically motivated choice between viewing the conjoined clauses as representing content units, logical entities, or speech acts” (SWEETSER, 1990, p. 78). This observation could have been the basis of cognitive work on DMs. The pragmatics of markers like so, however, and well is discussed at length in Relevance Theory (e.g. BLAKEMORE, 1987). But less frequent DMs like after all, all the same, by the way that are markers of topic-orientation receive almost no attention from cognitive linguists (or relevance theorists). Given GOLDBERG’s (2013, p. 16) view that “Semantics, information structure, and pragmatics are interrelated; all play a role in linguistic function”, it is appropriate to zero in on markers that give clues to discourse management and discourse structuring (see FRASER, 1996; 2006; 2009).
Some DMs serve primarily a topic-shifting function (FRASER, 2009). They signal whether D2 is meant as:
• a digression from the discourse topic in D1 (by the way)
• a justification for content of D1 (after all)
• continuation of the discourse topic (also)
• return to the discourse topic (anyway)
• an invitation to reevaluate inferences from D1 (but)
A constructed example of reevaluation of inference from D1 is:
(3) I liked the restaurant. But the food was terrible.
I liked the restaurant prompts the inference ‘the food/service was good’. But signals that SP/W will reevaluate in D2 the positive inference from D1. By contrast, and in (4) signals that SP/W will support the inference from D1 in D2:
(4) I liked the restaurant and the food was terrific.
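The pairing of the template [D1 DM D2] with the relationship that SP/W signals can be pictured as a simple lookup. The following is only a minimal sketch: the class name `DMConstruct` and the dictionary `DM_RELATIONS` are hypothetical labels introduced here for illustration, and the glosses are paraphrased from the list of topic-orienting markers above. It is not a claim about how such knowledge is actually stored or processed.

```python
from dataclasses import dataclass

# Relations signalled by the topic-orienting DMs listed above.
# The dictionary is illustrative only; glosses paraphrase the list.
DM_RELATIONS = {
    "by the way": "digression from the discourse topic in D1",
    "after all": "justification for the content of D1",
    "also": "continuation of the discourse topic",
    "anyway": "return to the discourse topic",
    "but": "invitation to reevaluate inferences from D1",
}


@dataclass(frozen=True)
class DMConstruct:
    """One use of the template [D1 DM D2] (hypothetical helper class)."""

    d1: str
    dm: str
    d2: str

    def meaning(self) -> str:
        # Look up the relation conventionally signalled by the DM.
        relation = DM_RELATIONS.get(self.dm.lower())
        if relation is None:
            raise KeyError(f"no relation recorded for DM {self.dm!r}")
        return f"SP/W signals that D2 is meant as: {relation}"


# Constructed example (3) from the text.
example_3 = DMConstruct("I liked the restaurant.", "But", "the food was terrible.")
print(example_3.meaning())
```

A flat lookup of this kind deliberately ignores multifunctionality and position in the clause; it captures only the one-to-one associations given in the list.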
4. Some basics of construction grammar
Several different varieties of construction grammar have been developed in the last fifty or so years (see HOFFMANN and TROUSDALE, 2013), but all varieties share the assumption that our knowledge of language is organized as a set of form-meaning pairings or “constructions”. The variety that has most frequently been adopted in historical work is that of GOLDBERG (1995; 2006).
Constructions are signs, and SAUSSURE’s (1983) concept of the arbitrariness of the sign is often invoked (e.g. CROFT, 2001, p. 364; GOLDBERG, 2006, p. 219). The sign in construction grammar is, however, not equivalent to Saussure’s sign, which is a holistic form-meaning pairing. Rather, the sign in construction grammar has several properties or features. These are formalized in sign-based construction grammar (e.g. BOAS and SAG, 2012; MICHAELIS, 2013). In this paper I draw on Croft’s much-cited model of constructions (CROFT, 2001, p. 18), reproduced in Figure 2. In this model, there are three form properties (syntactic, morphological, phonological) linked to three meaning properties (semantic, pragmatic, discourse-functional). These properties/components are considered to have the potential to overlap and are not discrete. Each is subject to change.
Components of a construction must be accessible, as they can be separately interpreted by language learners and can change separately (as will be illustrated below in section 6). A child who learns dog and applies it to any animal with four legs has learned a component of the meaning of dog, not the holistic construction. What a language learner learns is the particular character of the set of properties associated with a construction. Learners do not assemble the properties in the flow of speech but by hypothesis learn a “routinized chunk” (DE SMET and CUYCKENS, 2007, p. 188). These routinized chunks are gestalts that “are at once holistic and analyzable. They have parts but are not reducible to the parts” (LAKOFF, 1977, p. 246; ÖSTMAN and FRIED, 2004, p. 4-5). Nevertheless, accessibility is a matter of degree, and the more entrenched in memory a construction is, the less likely it is to be analyzable (DE SMET and CUYCKENS, 2007).
Construction grammar of the Goldberg and Croft varieties is “usage-based”. This means that knowledge of language is grounded in “instances of a speaker’s producing and understanding language” (KEMMER and BARLOW, 1999, p. viii) and that “[k]nowledge of language includes both items and generalizations, at varying levels of specificity” (GOLDBERG, 2013, p. 16) that are built bottom-up from experience.
5. Some basics of a constructional approach to language change
According to a usage-based perspective on change (see e.g. BYBEE, 2010: Chapter 6), change inevitably occurs because:
• speakers actively use language and negotiate meaning based on experience (not innate capabilities),
• speakers acquire knowledge of language (throughout life),
• perception and production are asymmetric; components of a construction may be mismatched as what someone says or writes is not necessarily understood in the way it was meant.
This means that the working hypothesis is that “usage changes”, not “grammars change” (the generative view first articulated in KIPARSKY, 1968).
A matter of some debate is the relationship between innovation and change. Individual speakers innovate constantly as they produce new utterances. Likewise, hearers frequently innovate because they may understand input in ways different from what was intended. But from a constructional perspective, innovation is not change; it is a factor that enables change. Change results from the conventionalization and sedimenting of new constructional characteristics that have been replicated across SP/Ws and AD/Rs. Change at the morphosyntactic level is typically manifested (“actualized”) gradually, through small shifts in the surface realization, including collocations, of a construction (DE SMET, 2012; DETGES et al., In preparation a).
It is a basic tenet of cognitive linguistics that meaning is not fixed. Rather, there is “meaning potential”: there is an “essentially unlimited number of ways in which an expression can prompt dynamic cognitive processes” (FAUCONNIER, 2008, p. 661). Morphosyntactic changes typically arise out of reinterpretations (“reanalyses”) that occur in replicated contexts, both linguistic and communicative, which ground interpretation (LANGACKER, 1987, p. 497; see also DETGES et al., In preparation b, and the papers in it by HANSEN and WINTER-FROEHMEL). My focus is on linguistic context (also known as “co-text”, STUBBS, 1995). An example of the role of linguistic context is presented below in connection with the rise of the discourse marker use of by the way.
6. A case study: the rise of the discourse marker by the way
In this brief case study I summarize part of TRAUGOTT (2020), in which the histories of the “digressive” markers by the way, by the by, incidentally, and parenthetically are analyzed. My data are drawn manually, mostly from three electronic corpora. One, Early English Books Online (EEBO), is a corpus of c. 755 million words of Early Modern English books published during the period 1475-1700, many of them theology, sermons, and history. This corpus is neither parsed nor balanced. The other two corpora are parsed and balanced: the Corpus of Historical American English (COHA), which gives access to c. 400 million words from 1810-2009, and the Corpus of Contemporary American English (COCA), c. 1 billion words as of March 2020, covering the years 1990-2019.
Nearly all pragmatic markers in English originate historically in discoverable lexical expressions. Some originate in adverbial adjuncts, e.g. by the way < ‘along the road’; but < ‘on the outside’; after all < ‘after everything’; so < ‘in that manner’; now < temporal adverb ‘at this time’. In English a large number of metatextual DMs are adverbial in origin. Some pragmatic markers originate in clauses, many of them epistemic pragmatic markers, e.g. albeit, I guess, I mean (see BRINTON, 1996; 2008; 2019), and a few in NPs, e.g. all the same, no doubt.
In contemporary English, by the way is usually said to signal “digression”, change to a new topic that is represented as less important than D1 (MITTWOCH; HUDDLESTON; COLLINS, 2002, p. 779). The online Google dictionary defines it as “used to introduce a minor topic not connected with what was being spoken about previously”.
(5) [talking of new online games] all of them have failed by staying way too close to the same formula. By the way, I also consider Zelda II as one of my favorites, since that's the most challenging game of the series so far.
(2012 Blog, The legend of Zelda needs to evolve [COCA])
Since the end of the 19thC by the way can also be used as a hedge introducing a possibly sensitive topic in D2. In such cases SP/W downplays the importance of the topic in D2, and introduces it performatively as if it was unimportant. A revised characterization of by the way therefore seems in order: “used to signal that the upcoming clause is to be understood as a minor topic not connected with what was being spoken about previously”:
(6) What do you love about her (Lady Gaga)? By the way, you’re in for a very rude awakening! her ice blue lipstick? The blood drinking?
(2012 Blog, Lady Gaga’s ‘Born this way’ [COCA])
In other words, by the way is used multifunctionally. Since the 1930s it has increasingly been used as a marker of aggression, as illustrated by the coronavirus example in (1) (Oh, by the way I have the coronavirus, and so do you now). In the remainder of this section, I present a hypothesis about how the development from a spatial adverbial phrase to DM came about.
By the way originates in an adverbial phrase that is now usually expressed as on/along the way. In Old English the phrase is often be wege as no article was required at that time. From about 1000 to the present, but decreasingly so since about 1800, this phrase is found used as a locative and directional adverbial:
(7) a. ac twegen his geferan feollon be wege
but two his companions fell be way
‘but two of his companions fell along the way’
(c.1000 ÆLS (Maur) B1.3.7 [DOEC])
b. And by the weye his wif Creusa he les.
‘And along the way he lost his wife Creusa’
(1386 Chaucer, Legend good women 945 [MED])3
In these examples, the road is the background for the event (possible fainting, death of Creusa).
Toward the end of the 15thC, by the way started to be used in a variety of contexts of talk. Earlier examples of this context are mostly found in homilies or versions of the Bible, in which Christ talks to his disciples on the road. The kinds of genres in which what I call ‘talk en route’ occurs were expanded. They include histories (8a), drama (8b), and works in which the style was relatively informal:
(8) a. it was told him by the way that his wyf was deed
‘it was told him on the road that his wife was dead’
(1482 Caxton, Prolicionycion [EEBO])
b. Why, then, we are awake: let's follow him
And by the way let us recount our dreams.
(1594-95 Shakespeare, Midsummer Night’s Dream, IV. i. 202 [OSS])
This kind of context is replicated with a number of different locutionary verbs, e.g. tell, observe ‘mention’, recount, say.
In the 16thC we find use in a new register and a new context.4 The route is a metaphor for argumentation, especially in religious and philosophical works. The metaphor is one characterized as ARGUMENT IS A JOURNEY in LAKOFF and JOHNSON (2003). It is akin to REDDY’s (1993) conduit metaphor and to the idea of a ‘road map’ for a talk such as I used in section 1 above. The argumentation journey is almost always introduced by say, mention or some other locutionary verb. The main differences from the earlier use are the meaning ‘in passing’ rather than literal ‘on the road’, the formal register, and the fact that many examples appear in translations from French or Latin, e.g.:
(9) likewise of many others the which at this time i omit: this much i will say by the way, that this straight passeth ouer the coast of afrike to the troppike of cancer,
‘likewise (details) of many other (rivers) which I omit at this time: I will say this much in passing, that this straight passes over the coast of Africa to the tropic of Cancer’
(1568 Hackett, The new found worlde, or Antarctike [EEBO]; trans. from French)
So from the 15thC on we find by the way used in represented ordinary talk and from the 16thC in academic argumentation as well as ordinary talk. In both cases it is used adverbially and as background to locutionary acts. The history of by the way is further evidence for the importance of verbs of speaking (“verba dicendi”) noted by LENKER (2010) in discussing the rise of connectives like however.
In the mid-17thC we find some potentially ambiguous examples as well as some unambiguous ones. In (10) by the way can be interpreted as being used either as the metaphorical adverbial ‘in passing’ or as a DM marking a topic-shift to a D2 of only partial relevance to D1:
(10) which city [llanbaderne the great, ECT] is now dwindled to nothing: reader, by the way, I observe that cities surnamed the great, come to little at last
(1662 Fuller, History of the worthies of England [EEBO])
Crucial to this analysis is that if by the way means ‘in passing’ in (10) it has been topicalized and appears in initial position in the clause, a position necessary for reanalysis of the adverbial as a DM.5 If it is inferred to be used as a topic-shifter, D1 is about a particular city whereas D2 is a generalization interpolated into this description of a particular city. Another possibly ambiguous example is (11):
(11) 't is natural for brave spirits, not to hold their tongues in the very face of danger, … bees turn not to droanes, nor courages ever abate or degenerate: by the way, i observe that none have ever arrived to an eminent grandeur, but who began very young
(1661 Argyll, Instructions to a son [EEBO])
An ‘in passing’ reading is perhaps preferred in light of i observe, but there is a distinct shift from a list of observations about general behaviors to a particular observation that is used as a piece of advice.
A second example from Fuller’s work provides an example in which an ‘in passing’ reading appears inappropriate. In (12) by the way signals that the comment about the naming habits of English writers is a topic-shift, and somewhat incidental to the encomium on Bullock.
(12) henry bullock … a good linguist, and general scholar, familiar with erasmus, … calling him bovillum in his epistles unto him: by the way our english writers, when rendring a Sirname in Latine which hath an appellative signification, content them to retein the body of the name, and only disguise the termination,
‘ … by the way our English writers, when rendering a surname in Latin which has a meaning that is descriptive of an object, content themselves by retaining the body of the name, and only disguise the ending’
(1662 Fuller, History of the worthies [EEBO])
(Presumably bovillus is bov ‘bull’ + ill ‘affectionate diminutive’ + the ‘disguising ending’ -us.) Similarly in (13) the fact that other authors have not discussed the particular type of fever described is somewhat incidental in this excerpt from a treatise on lung diseases.
(13) and worse then all these a spermatick (seedy) Feaver, in malignity and putrefaction transcending all others: by the way, this sort of Feaver is not mentioned by any Authour, because it's comprehended under continual humoral Feavers.
‘and worse than all these a reproductive fever, transcending all other fevers in malignancy and decay: by the way, this sort of fever is not mentioned by any author, because it is included in the category of incessant fevers of bodily fluid’
(1666 Harvey, Morbus anglicus [EEBO])
We can conclude that by the way was being used as a DM by the 1660s in relatively formal texts. By hypothesis, for two and a half centuries literal travel and, later, metaphorical argumentation journey were referred to as non-essential contexts for talk and argumentation. The three components of the phrase (by + the + way) came to be fixed or “chunked” as one. Talk and argumentation became associated with and conventionalized as part of the function of by the way and by the way came to be used as the SP/W’s comment on the nature of D2 (in this case, marking it as non-essential digression, background information). This is the kind of profile shift known as “context-absorption” in the grammaticalization literature (e.g. KUTEVA, 2001, p. 150).
A partial model of the differences between the two main stages of by the way (spatial adverbial and DM) is given in Figure 3 to illustrate how CROFT’s model in Figure 2 can be used to represent changes to a construction. It shows the outcomes of key changes by the 1660s only. Intermediary steps such as the development of the ‘in passing’ adverbial, are omitted. The replicated context (abbreviated as “ReplicContext”) is colored grey to show that it is an enabling context for the change, not part of the construction itself; the ‘etc.’ under ReplicContext is a cover term for contexts other than those described so far, including but not limited to one to be discussed below in section 7, use in relative clauses; “PM” is short for pragmatic marker, and “loc” for locative adverbial:
The differences between Stage 1 and Stage 2 suggest that by the 1660s by the way had undergone constructionalization, the creation of a form_new-meaning_new sign (TRAUGOTT and TROUSDALE, 2013, p. 22). Specifically, the form changes were: i) clause-initial > pre-clausal syntax, ii) morphological univerbation, iii) phonological changes that were part of the systemic change known as the Great Vowel Shift. The meaning changes were: i) loss of the semantic content ‘route’, ii) the rise of conventionalized pragmatic meanings, and iii) the rise of the metatextual discourse functions.
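The six-property model lends itself to a simple record structure, and comparing two stages of a construction then amounts to listing the properties whose values differ. The sketch below is illustrative only: the class `Construction`, the function `changed_properties`, and the property values are hypothetical labels that paraphrase the prose description of the changes, not the contents of Figure 3.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Construction:
    """Six-property sign: three form and three meaning properties."""

    # Form properties
    syntactic: str
    morphological: str
    phonological: str
    # Meaning properties
    semantic: str
    pragmatic: str
    discourse_functional: str


# Values paraphrase the prose; they are assumptions made for illustration.
stage1 = Construction(
    syntactic="adverbial, including clause-initial (topicalized) position",
    morphological="three-word phrase: by + the + way",
    phonological="pronunciation before the Great Vowel Shift changes",
    semantic="'along the road/route'",
    pragmatic="no conventionalized pragmatic meaning",
    discourse_functional="background to an event or locutionary act",
)
stage2 = Construction(
    syntactic="pre-clausal position",
    morphological="univerbated chunk",
    phonological="pronunciation after the Great Vowel Shift changes",
    semantic="content 'route' lost",
    pragmatic="conventionalized pragmatic meaning",
    discourse_functional="metatextual: marks D2 as a digression",
)


def changed_properties(old: Construction, new: Construction) -> list[str]:
    """Return the names of the properties whose values differ."""
    return [
        f.name for f in fields(old) if getattr(old, f.name) != getattr(new, f.name)
    ]


changes = changed_properties(stage1, stage2)
print(changes)
```

On these assumed values all six properties differ between the two stages, which is the sense in which both form and meaning are new. A comparison of this kind shows only outcomes; it says nothing about the gradual intermediate steps.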
The hedging function is not included in Figure 3 as it did not become entrenched until the end of the 19thC. Presumably, association of DM by the way with partial relevance and casualness, motivated by treating the route as background, enabled interpretation of by the way as an index of the hedging function as well as topic shift. D2 is presented as if it is not important even though it introduces some significantly face-threatening material. The hedging function is exploited in dramas at the turn of the 20thC:
(14) All the same, Pastor, I respect you more than I did before. By the way, did I hear, or did I not, that our late lamented Uncle Peter, though unmarried, was a father?
(1897 Shaw, The devil’s disciple [CLMET])
There is evidence in the corpora of association over the 20thC of by the way and especially the combination Oh, by the way with a routine way of introducing important directives or negative content as if they were unimportant. In (15) we find a negative comment about its being used as a routine:
(15) Everything the person says is suspect in my eyes from their [sic] on. It's that old “oh, by the way ….” routine.
(2012 It takes more than a pill [COCA BLOG])
Association of use of (Oh,) by the way with negative content by hypothesis enabled use as a marker of negative attitude to and disaffiliation from a perceived irresponsible turn-about by some Other individual or entity (the content of D2):
(16) We are just gorging on government spending and debt. And then the president turns around and says, Oh, by the way, we really have to be worried about this debt at the same time they are passing every new spending bill.
(2009 Fox_Susteren [COCA])
In statements like (16) SP/W disaffiliates (DREW and HERITAGE, 1992) from the alleged statement in D2 and invites the inference that SP/W also disaffiliates from the quoted source (the president in this case). The coronavirus example in (1) suggests that Oh, by the way has been extended from reported speech to direct address and use as a marker of aggressive face-threat (TRAUGOTT, In preparation).
The development of the DM function is a case of subjectification of meanings over time. Used as a circumstantial locative adverbial, by the way is objective and truth-conditional. So are after all ‘after everything’ and butan ‘on the outside of’ in their adverbial uses. But used as DMs, they signal SP/W’s perspective on the relationship between D1 and D2. Subjectification of this kind is typical of the histories of DMs. Some also show intersubjectification of meaning over time (paying attention to AD/R). These DMs include by the way, as evidenced by use as a hedge and later as an aggressive marker.
The first stage of by the way is akin to SWEETSER’s (1990) “socio-physical”, objective use, and the later DM use to her “speech act” use of connectives like but, so, because. As an adverbial, by the way could occur anywhere that an adverbial was licensed to do so, but an intermediate step is that it was replicated in topicalized initial position, a precursor to pre-clausal DM use. This speech act use arose in the contexts of:
• reports on speech acts (it was told him by the way that his wyf was deed, example (8a)),
• invitations to talk (let us follow him And by the way let us recount our dreams, example (8b)),
• expression of argumentation strategies (this much i will say by the way, example (9)),
• topicalized use of the adverbial (And by the way let us recount our dreams, example (8b)).
A further context was use in appositive relative clauses (examples (19) and (20) below). The DM use did not arise as a direct mapping from the real world to the discourse world, but by replicated interpretations of the relationship between D1 and D2. Nor did the DM arise instantaneously. Rather, constructionalization as a DM took about 250 years of small cumulative shifts in contextual uses.
There is, however, an alternative interpretation of how DMs arise. KALTENBÖCK; HEINE; KUTEVA (2011), HEINE (2013) and HEINE et al. (2017) argue for a two-level Discourse Grammar consisting of: a) sentence grammar: fairly traditional “core syntax”, and b) thetical grammar: a level where “asyntactic” material such as parentheticals, appositional relative clauses, imperatives, and DMs are accounted for (many of the expression-types called “supplements” in HUDDLESTON; PULLUM; PETERSON, 2002). The hypothesis is that expressions in thetical grammar are “coopted” instantaneously. They may have undergone changes before cooptation and are likely to undergo further changes, but the shift from sentence to thetical grammar is instantaneous in each case of the rise of a DM.
However, while there were no doubt instantaneous innovations by individual language-users, the change across contexts and in the community of speakers was gradual, taking over two centuries in the case of by the way, as discussed above. To date I have found only one development (all the same ‘however’) for which I do not have data showing gradual change over centuries. The nominal phrase ‘everything the same’ appears to have been reinterpreted with contrastive pragmatics between 1800 and 1820. The rapid development as a DM of all the same can be interpreted as a case of analogization to a preexisting DM like after all, especially as the new DM all the same meaning ‘however’ is preferred in post-final contexts, the position in which concessive/contrastive after all was preferred at the time.
In my view a dual-level grammar is not needed. As GOLDBERG (2013, p. 16) proposes, such functions as semantics, pragmatics and discourse function “are part of our overall conceptual system and not a separate modular component”. Construction grammar is a single-level grammar that does the job well of accounting for expressions that do not conform to standard syntax. In fact it was originally developed precisely to account for idiosyncratic expressions that could not be readily addressed in the syntax of the late 20thC, cf. KAY and FILLMORE’s (1999) analysis of What’s that fly doing in my soup? Parentheticals, appositional relatives, imperatives and DMs are all constructions in Goldberg’s sense because they show some element of idiosyncrasy and meet the criterion for a construction that “some aspect of its form or function is not strictly predictable from its component parts or from other constructions recognized to exist” (GOLDBERG, 2006, p. 5).
One of the interesting questions raised during discussion of my presentation at Abralin ao Vivo was what the relationship of by the way is to parentheticals. My response concerns only by the way and other digressive markers such as by the by and incidentally, not DMs in general. I am not using the extended notion of “theticals” here, but the subset known as “parentheticals”. Drawing on BIBER et al. (1999), I define parentheticals as expressions that “give additional information which is related to, but not part of the main message”. Parentheticals in this sense include appositive relative clauses and digressive clauses. As stated above, such expressions can readily be interpreted in terms of a monolevel construction grammar.
From a constructional perspective, we need to think not about the DM alone, but about the form-meaning template [D1 DM D2] ↔ [SP/W signals the intended relationship between D1 and D2]. “Digressive markers” are a category of DMs that mark D2 as meant to be taken as parenthetical: additional information that is to be taken as not very important, as was shown by example (5) above. I repeat it here as (17).
(17) = (5) [talking of new online games] all of them have failed by staying way too close to the same formula. By the way, I also consider Zelda II as one of my favorites, since that's the most challenging game of the series so far.
(2012 Blog, The legend of Zelda needs to evolve [COCA])
Here ‘by the way, I also consider…’ is presented as parenthetical information. By the way signals that D2 is additional information not to be taken as directly relevant and, by implication, not very important. If by the way is removed, as in (18), D2 is simply a statement that expands on what went before. The elaboration is strengthened by also, but it is not structurally parenthetical:
(18) all of them have failed by staying way too close to the same formula. I also consider Zelda II as one of my favorites, since that's the most challenging game of the series so far.
In what follows I distinguish three things: i) use of by the way within parenthetical clauses such as appositive relative clauses, as in (19) and (20) below; ii) use of by the way to introduce a clause and render it parenthetical, as in (21) below; and iii) punctuation of by the way within parentheses, as in (22) below.
As mentioned, one of the contexts for the rise of the DM use is appositional relative clauses. This context is attested in the 1620s, fairly late in the development of DM use. The earliest examples are best interpreted with the adverbial ‘in passing’ meaning, since they appear in the context of observe, note and show as in (19):
(19) an answer to a pamphlet, intituled, the fisher catched in his owne net: in which, by the way, is shewed, that the protestant church was not so visible in all ages, as the true church ought to bee;
‘an answer to a pamphlet entitled the fisher caught in his own net; in which, in passing, is shown that the Protestant church was not as visible in all ages as the true church ought to be’
(1624 Featley, The Romish fisher caught in his own net [EEBO])
The appositional relative is parenthetical and is used as a host for by the way. This use appears to have strengthened the inference that the metaphorical route for the textual journey is background to what is said. By the 1650s by the way appears to have been used in appositional relatives to evoke a speech act (‘which, I say’) as well. Here SP/W subjectively evaluates the content of the relative clause and strengthens its add-on and digressive nature:
(20) in case they should withdraw their obedience to their lawfull princes, as soone as they were become christians; which, by the way, laies (‘lays’) a very ill character upon those, who …
(1649 Hammond, Letter to Lord Fairfax [EEBO])
In the later part of the 17thC, once by the way had been constructionalized as a DM, it is found used independently of the relative clause to introduce a clearly syntactically parenthetical digression, as in (21):
(21) for says mr: collier, p: 62 the meaning must be (by the way, that must is a little hard upon me) that providence is a ridiculous supposition;
(1698 Congreve, Amendment of Mr. Collier’s false and imperfect citations [EEBO])
This continues to be one of its uses in contemporary English, as illustrated in (17) above.
As regards punctuation, 440 hits of by the way appear in parentheses in EEBO from the 1570s on, with the largest counts in the 1640s and 1650s, the time when constructionalization of the new DM was taking place.
(22) there hath beene noted (by the way) the portion appropriated to the priest,
(1631 Morton, Of the institution of the sacrament [EEBO])
Most of the 440 examples of (by the way) in EEBO appear clause-medially, as in (22), so most do not appear in prototypical pre-clausal DM position. Most appear to mean ‘in passing’, especially when used in the context of say or note as in (22). The large number of examples of by the way punctuated with parentheses in EEBO suggests that at least in the transitional period, both the adverbial ‘in passing’ and the DM uses were consciously thought of as parenthetical.
The number of instances of (by the way) is relatively large in EEBO (440 in a corpus of 775,000 words) compared to that in COCA (only 7 examples in the 1 billion word corpus). This may be because contemporary writers are less particular about punctuation, or because by the way is largely used as a hedge in contemporary English. As a hedge marker, and particularly as a marker of rudeness and aggression, it does not meet the definition of “parentheticals” given above (“give additional information which is related to, but not part of the main message”). In fact [by the way X] used as a hedge or insult has primarily speech act rather than information-transmission properties. While it is clear that [by the way D2] can be used as a parenthetical digression from the topic at hand (D1), further research is called for on such issues as whether [by the way D2] used as a hedge or insult is appropriately defined as a parenthetical and whether there are degrees of parentheticality.
Regardless of the answer to that question, pragmatic markers, including DMs, are very much part of the linear structure of language, as the difference between (17) and (18) shows. HASELOW (2016) and others have suggested that DMs serve to signal what is coming next, in turn-taking or in monologual discourse. Individual pragmatic markers are learned with constraints on where in the clause they may be used: by the way can be used pre-clausally, medially, and post-finally; until recently but was used only in initial position, but it has recently been found post-finally with concessive meaning, largely in Australia, where it may also be used to signal not only contrast but also Australianness (MULDER; THOMPSON; WILLIAMS, 2009). There are also constraints on combinations of pragmatic markers (e.g. FRASER, 2015; LOHMANN and KOOPS, 2016). Pragmatic markers therefore challenge CHOMSKY’s (2020) negative view of communication and linearity. However, a constructional approach to grammar focuses on schemas, patterns and sets (BOAS, 2013). Therefore, in addition to thinking about linearity, we need to think about paradigmatic choices. For example, what are the subclasses of DMs that can occur in the schema that I have characterized as [D1 DM D2] ↔ [SP/W signals the intended relationship between D1 and D2]? How did they develop over time? What were their functions, and how have those functions undergone modification? Has there been reorganization of subclasses?
Metatextual DMs of the type I have outlined are used to express SP/Ws’ cognitive intentions and perceptions regarding text management. They are used in negotiation of meaning in communicative discourse. We cannot do without them, whether in conversation, fictional representation of speech, or academic prose. They are part of our knowledge of grammar. Therefore they should be included in standard grammars and deserve attention from linguists of all persuasions.
Many thanks to Miguel Oliveira, Jr. for inviting me to participate in Abralin ao Vivo and to the virtual participants for thought-inspiring questions.