The use of the Italian subjunctive, particularly in embedded completive clauses, can be considered a normative linguistic stereotype par excellence, to which speakers should pay particular attention if they want to speak ‘properly’. However, despite the massive effort of the normative enterprise and the considerable scholarly attention the subjunctive has received from linguists, there is no consensus on what exactly constrains mood selection in discourse: whether the subjunctive makes a semantic contribution, which verbs should trigger it, and whether it signals a more careful style. Grammarians are also concerned with the attrition of the subjunctive and its productivity in speech, fearing the loss of its supposed semantic contribution. Several studies have addressed these issues, but only a small part of this body of work on the Italian subjunctive has used quantitative methods, and these assumptions have not been evaluated systematically under an accountable empirical methodology. The present variationist investigation brings new evidence to bear on the patterning of subjunctive use in community-based spontaneous speech data, refuting the claims that it is productive and semantically motivated. The analysis reveals a lexically motivated pattern of variation, i.e., the use of the subjunctive is mainly restricted to a handful of main clause verbs and a single embedded verb. Systematic analysis also shows a correlation between subjunctive choice and a higher level of education, a social meaning that further strengthens the idea that no semantic contribution is made when the speaker opts for the subjunctive over the indicative, a phenomenon that is inherently variable.


The use of the subjunctive mood, particularly in embedded clauses, has become a linguistic feature for speakers to parade as a sign of polished Italian and as a status symbol of higher social rank or a higher level of education. It is a bona fide feature of standard Italian and more specifically of the bon usage par excellence. Its ‘correct’ usage is at the center of the prescriptive enterprise. In both prescriptive and linguistic literature, it is very often associated with more careful speech (BONOMI, 1993; GATTA, 2002; POLETTO, 2000; SANTULLI, 2009; SCHNEIDER, 1999; VELAND, 1991) and its avoidance is often condemned as a feature of popular, uneducated, or careless speech (FOCHI, 1956, 1957; GONZÁLEZ DE SANDE, 2004; SCHMITT JENSEN, 1970). In spoken discourse, its variable use is often considered non-standard, and cases of the indicative where some would expect a subjunctive, e.g., with credere ‘to believe’, are sometimes deemed acceptable by grammarians and linguists, though in “substandard varieties” of Italian (GIORGI; PIANESI, 2004, p. 425), or they are construed as “characteristic of a low style” (translation mine, WANDRUSZKA, 1991, p. 425). What triggers these assumptions and reactions to non-standard use is the inherent variability characterising the use of the subjunctive, as exemplified in (1) and (2), in which the subjunctive alternates with the indicative under the same main clause verb, otherwise known as the governor, without apparent change in meaning.

[Examples (1) and (2)]

[1] The codes identify the corpus, the numeric speaker code, and the line number at which the utterance occurred. Examples from two corpora are shown throughout this paper: the C-ORAL-ROM (2005) (C) and the Lessico di frequenza dell’Italiano Parlato (1993) (L).

Nevertheless, in both prescriptive and theoretical linguistic accounts of subjunctive, the right context in which to use the subjunctive remains unclear. The last few decades have seen intensified interest from grammarians and linguists alike, in motivations underlying the use of the Italian subjunctive, particularly in the context of embedded clauses, where the subjunctive is said to be mandatory when governed by some classes of main clause verbs. Mood selection is then considered to be a mechanism implemented at a distance (SHLONSKY, 2006, p. 83), where the semantic characteristics of the main verb determine the selection of the embedded mood (POLETTO, 2000). Generally, verbs that connote volition, necessity, emotion and doubt are consistently deemed categorical subjunctive selecting contexts (TEKAVČIĆ, 1972; see also BRONZI, 1977; LEPSCHY; LEPSCHY, 1981; SERIANNI, 2006). Notwithstanding this assumption that the subjunctive is not freely selected but rather mandatory in a variety of contexts, it is also said to be dependent on the speaker’s degree of commitment vis-à-vis the truth of the embedded proposition. In fact, when variability is acknowledged, i.e., the alternation between subjunctive and indicative in the same syntactic context is governed by the same (type of) main clause verb, the motivations behind mood choice are said to be correlated with the speaker’s intent or state of mind at the time of the utterance (COSTANTINI, 2011, p. 39; GIANNAKIDOU; MARI, 2015; PORTNER; RUBINSTEIN, 2013; see also MANZINI, 2000). Additionally, the subjunctive and indicative alternation is seen as having two alternate semantic/pragmatic interpretations (SATTA, 1994), with the subjunctive marking less subjectivity (LEPSCHY; LEPSCHY, 1981; also SERIANNI, 2006). 
For instance, the prototypical verb credere ‘to believe’ is said to select the indicative instead when its choice supposedly corresponds to a more certain predication according to what the speaker believes, as in the example provided by Gatta (2002, p. 5) “credo che Dio esisteIND” (‘I believe that God exists’), in which the use of subjunctive would instead reveal some uncertainty or weakness of the speaker’s belief. There seems to be a general consensus that mood selection is semantically motivated, although no consensus is reached as to exactly which verbs or meanings should trigger the subjunctive in discourse. Treatments of the contemporary Italian subjunctive in academic literature and formal grammars invoke a kaleidoscopic range of meanings to explain its distribution in speech. It is said to be a mood fluctuating between opinion and perception (BINAZZI, 2015), projecting “a modality of uncertainty and doubt” of the event in question (translation mine, SIMONE, 1993, p. 80). It is also claimed to represent “the intense and emotive degree, the particular and the personal, doubt and the unreal, the unexpected and the surprising, the desired and the feared, the extraordinary and the exceptional” (translation mine, DORIGO, 1951, p. 322). Linguists and grammarians are also concerned with the supposed attrition or loss of the subjunctive and its productivity in speech. This assumption is linked to the hypothetical desemanticization of subjunctive morphology, a putative change that would purportedly result in the loss of the subjunctive. Few scholars claim that the Italian subjunctive is dead (MARCHI, 1984), although many subscribe to the idea that it is losing ground in favour of the indicative. This assumption is based on the observations that the indicative is taking over contexts that are traditionally associated in prescriptive accounts with the subjunctive (see credere and verbs of opinion in general; SERIANNI, 1986; SIMONE, 1993; TRIFONE, 2007). 
Italian sociolinguists seem to agree that the subjunctive is losing ground, particularly in vernacular speech, due to a preference for the more transparent, regular and frequent morphology associated with the indicative (BERRUTO, 1987; BINAZZI, 2015; DE MAURO, 2017; TAVONI, 2002). A few quantitative studies have been conducted to evaluate hypotheses regarding the supposed semantically motivated selection of the subjunctive and its productivity in contemporary Italian. Some striking and sometimes contradictory results surface: 1) the subjunctive remains quite productive in speech and is not threatened by the rise of the indicative mood in contexts traditionally associated with the subjunctive (e.g., volitive and opinion matrix verbs; see BONOMI, 1993; SANTULLI, 2009; VELAND, 1991); 2) its use is semantically motivated (VELAND, 1991); 3) the indicative is slowly intruding into subjunctive-selecting contexts, mainly verbs of opinion, supporting the hypothesis of a change in progress (LOENGAROV, 2006; LOMBARDI VALLAURI, 2003; SANTULLI, 2009; SCHNEIDER, 1999; VOGHERA, 1993, p. 3). As a result, the subjunctive is favoured with verbs of opinion, volition and hope while verbs of communication disfavour it (BONOMI, 1993; LOMBARDI VALLAURI, 2003); 4) when the indicative is chosen over the subjunctive, speakers tend to mark modality elsewhere, by using expressions such as a mio parere ‘in my opinion’, la mia sensazione ‘my feeling’, or a matrix verb in a conditional tense (GATTA, 2002, p. 88). However, methodological and analytical differences cause a lack of consistency between studies, and consequently raise the issue of the comparability of their results.
Despite a general assumption that the subjunctive is relatively productive in speech, when we calculate the overall rate of occurrence based on the results provided by the authors in their publications, we notice that the rate of subjunctive fluctuates considerably across studies that provided quantitative insights into the use of subjunctive.

Figure 1. Comparison of the rate of subjunctive selection in previous quantitative research.

Three main reasons underlie this lack of consistency from study to study: the nature of the data used as benchmark; the process of choosing the contexts of study (i.e., the circumscription of the variable context); and the lack of systematic quantitative comparison of the conditioning. Previous studies have used a disparate range of often incommensurable datasets as the basis for their quantitative analysis of subjunctive use, including newspapers (BONOMI, 1993; SANTULLI, 2009; VELAND, 1991), text messages (SANTULLI, 2009), on-line forum discussions (LOENGAROV, 2006; SANTULLI, 2009), as well as literature (SOLIMAN, 2002), with only a few studies relying on corpora of spontaneous speech data (LOENGAROV, 2006; LOMBARDI VALLAURI, 2003; SCHNEIDER, 1999). Some studies focus on verbs of opinion, motivated by the fact that these are said to be the semantic class that suffered the greatest loss of the subjunctive, although these studies lack systematic comparison with other semantic classes. The choice of the subjunctive-selecting contexts, i.e., the set of verbs that are supposed to trigger the subjunctive in discourse, is pre-determined and restricted to a limited number of main clause verbs frequently reported in the literature and in the grammars (e.g., credere ‘to believe’, pensare ‘to think’, sembrare ‘it seems’). This imposes a restriction on the context of variation and possibly precludes accountable reporting of all the variation found in the datasets. Additional restrictions include the selective testing of certain contexts at the expense of others. A few studies focussed exclusively on 3rd person subjects (LOENGAROV, 2006) or 1st person singular subjects (SANTULLI, 2009) with the goal of examining the hypothetical correlation between subjunctive selection and the degree of commitment of the speaker.
These studies analyzed contexts where the speaker is less committed (3rd person subjects) or more committed (1st person subjects) vis-à-vis the truth of the embedded proposition. However, as in the case of semantic classes, the analyst takes for granted that the subjectivity or the commitment of the speaker plays a role, without taking the necessary steps to empirically assess whether that is, in fact, the case. Finally, because analyses are based on raw numbers of subjunctive tokens, without any reference to rates of occurrence, it is difficult to establish whether a given context is actually more favourable to the subjunctive than another.

Despite considerable scholarly attention to the subjunctive mood in Italian embedded completive clauses, the topic of its selection and variable use in speech remains a point of debate: whether subjunctive use in Italian is determined by semantic factors and speech style, and whether its use is productive in speech, have yet to be established. This paper adduces new evidence to fill this gap by investigating the use of the subjunctive in Italian discourse through a systematic quantitative analysis of subjunctive selection in present-day community-based production data through the lens of Variationist Sociolinguistics.

1. The variationist framework

This research is conducted within the theoretical framework of Variationist Sociolinguistics (LABOV, 1972). Although most spoken Romance languages, such as French, Spanish and Portuguese, particularly their North and South American varieties (BERLINCK, 2019; KASTRONIC, 2016; POPLACK et al., 2018; POPLACK; LEALESS; DION, 2013; TORRES CACOULLOS et al., 2018), have been extensively studied using quantitative methods, there is a dearth of sociolinguistic variationist research focusing on linguistic variation in Italian. Previous variationist research demonstrated the restriction of the subjunctive to a handful of governors (POPLACK, 1992; POPLACK et al., 2018; POPLACK; LEALESS; DION, 2013), supporting the hypothesis of lexicalization, i.e., semantics makes little or no contribution to the choice of the subjunctive, which is instead dictated by particular contexts. Poplack and her colleagues showed that the use of the subjunctive in French is restricted to a few governors: of the 37 governors that triggered the subjunctive at least once in the 2.5M word Ottawa-Hull corpus (POPLACK, 1989), only three verbs (falloir ‘to be necessary’, vouloir ‘to want’ and aimer ‘to like’) accounted for 88% of all subjunctive morphology in discourse (POPLACK et al., 2018). Likewise, some previous quantitative research on Italian (GATTA, 2002; SCHNEIDER, 1999) has also hinted at a tendency for the subjunctive to appear in quasi-fixed expressions with credere, sembrare and pensare, which suggests lexicalization.

The theoretical approach of Variationist Sociolinguistics rests on the observation that speakers engage in choices to express a given meaning or grammatical function in discourse and recognizes variability as an inherent property of speech. The key theoretical construct is the linguistic variable (LABOV, 1972) that involves two or more variants used alternatively to express the same referential meaning or function in discourse (POPLACK, 2011; POPLACK; LEVEY, 2010). The linguistic variable studied here is the expression of the subjunctive and its major forms that alternate in the embedded context of completive clauses with no apparent change in meaning or function of subjunctive and indicative. The variationist perspective recognizes that variability is not random but rather structured and governed by multiple factors, be it linguistic or social. The underlying structure of subjunctive variability is discerned from examination of its distribution in discourse and its variable conditioning (POPLACK; TAGLIAMONTE, 2001). A key methodological tenet in variationist research requires adherence to the Principle of Accountability (LABOV, 1972). This requires that all tokens relevant to the variable under investigation must be taken into consideration, including tokens that did occur as well as those that did not but could have, in order to fully account for the envelope of variation. This implies the identification of an objectively-defined variable context in which the variants alternate without change in meaning. This work differs substantially from previous analyses in the way the variable context is delimited, and the principle of accountability is considered.

2. Circumscribing the variable context

A particular challenge for the study of the subjunctive in a variationist framework is related to the abstract construct of the linguistic variable outlined above and the consideration of the relevant tokens of the variable under investigation. The issue of morphosyntactic variation, and more specifically whether grammatical constructions may be analyzed as linguistic variables, has been the subject of a longstanding debate in linguistics, and some scholars have questioned the validity of extending the variationist framework beyond the level of phonology (e.g., LABOV, 1987; LAVANDERA, 1978). One may ask: how can we consider mood selection a linguistic variable if, theoretically, subjunctive and indicative are said to express different, perhaps opposing, meanings? Traditional accounts of mood variation appeal to semantic explanations to justify the use of the subjunctive and its variability with the indicative, invoking the doctrine of form-function symmetry, i.e., the desire to establish a one-to-one relationship between a form and its meaning. The analyst could identify the contexts that are supposed to trigger the subjunctive and subsequently ascertain the extent to which this does occur in naturalistic speech. However, the semantic function that is supposed to trigger the subjunctive in discourse remains a matter of some debate in the literature, with little prospect of imminent resolution. Moreover, the various meanings that purportedly accompany the use of the subjunctive in both prescriptive as well as descriptive and theoretical research (DIGESTO, 2019; POPLACK, 1992; POPLACK et al., 2018) are typically correlated with subjective, psychological or attitudinal motivations, including appeals to the speaker’s state of mind or their intent, hope, fear, emotions, etc. These are impossible to operationalize in running discourse (POPLACK et al., 2013). This issue precludes any attempt to circumscribe the variable context based on semantic considerations alone.
Previous variationist research has adopted a more pragmatic approach to this issue by circumscribing the variable context corpus-internally, i.e., by locating all the contexts where the subjunctive is actually used to determine where it could be used. Therefore, the subjunctive-selecting contexts were identified as “every tensed clause governed by a matrix verb, i.e., governor, which triggered the subjunctive at least once” (POPLACK et al., 2018, p. 229). This process yields the complete list of governors that selected the subjunctive at least once in the dataset, as the main clause verb credere ‘to believe’ shown in (3).

[Example (3)]

After identifying the lexical identities of the governors, the next step was to analyze the data again and extract all the variants that compete with the subjunctive, which, in the case of the current study, were the indicative (4) and the conditional (5).

[Examples (4) and (5)]

By adopting this method, we were able to account for the envelope of variation and to objectively identify the contexts that triggered the subjunctive in discourse. Whether the use of the subjunctive carries semantic meaning or not will be objectively assessed against a corpus of actual speech data. The study excluded all ambiguous tokens, such as forms that are homophonous between the subjunctive and the indicative in some grammatical contexts, which make it difficult to establish the mood of the verb. The homophonous forms are the following: the 2nd person singular of the present indicative and the present subjunctive of first (-are) group verbs (e.g., tu am-i ‘you love’); the 1st person plural of the present indicative and the present subjunctive of all three conjugation groups (e.g., noi amiamo [-are] ‘we love’, scriviamo [-ere] ‘we write’, sentiamo [-ire] ‘we feel’); and the 2nd person plural of the simple past indicative and the imperfect subjunctive of all three conjugation groups (e.g., voi amaste [-are] ‘you loved’). Other cases excluded from the study include those that do not license variability because they are fixed expressions or titles of movies (6). Further exclusions include interruptions, reformulations, and incomplete sentences (7).

[Examples (6) and (7)]
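The two-step extraction procedure described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Clause` record, its field names, and the toy clauses are hypothetical and are not taken from the LIP or C-ORAL data; the `ambiguous` flag stands in for all of the exclusions listed above (homophonous forms, fixed expressions, interruptions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """One embedded tensed clause (hypothetical record layout)."""
    governor: str            # lemma of the main clause verb
    mood: str                # "subjunctive", "indicative" or "conditional"
    ambiguous: bool = False  # homophonous form, fixed expression, false start...


def subjunctive_governors(clauses):
    """Step 1: governors that triggered the subjunctive at least once."""
    return {c.governor for c in clauses
            if c.mood == "subjunctive" and not c.ambiguous}


def variable_context(clauses):
    """Step 2: every unambiguous clause under one of those governors,
    whatever variant (subjunctive, indicative, conditional) it carries."""
    governors = subjunctive_governors(clauses)
    return [c for c in clauses
            if c.governor in governors and not c.ambiguous]


# Invented toy data, for illustration only.
toy = [
    Clause("credere", "subjunctive"),
    Clause("credere", "indicative"),
    Clause("sapere", "indicative"),   # no subjunctive in the toy data: outside the context
    Clause("pensare", "conditional"),
    Clause("pensare", "subjunctive"),
    Clause("credere", "indicative", ambiguous=True),  # excluded token
]

context = variable_context(toy)
print(sorted(subjunctive_governors(toy)))
print(len(context))
```

The design point is that the list of governors is derived from the corpus itself rather than from a pre-determined list of verbs, so that tokens where the subjunctive could have occurred but did not are counted alongside those where it did.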

3. The data

The data on which the current study is based were extracted from two corpora of spoken Italian: Lessico di frequenza dell’Italiano Parlato (DE MAURO et al., 1993; henceforth abbreviated as LIP) and the C-ORAL-ROM Integrated Reference Corpora for Spoken Romance Languages (CRESTI; MONEGLIA, 2005; henceforth abbreviated as C-ORAL). For a detailed account of the corpora, we refer the reader to Voghera (2001) and Cresti & Moneglia (2005) for the description of the LIP and C-ORAL, respectively. These two datasets provide a wide range of recordings of everyday speech, as well as a variety of speech styles and speakers from the four main Italian urban centres, i.e., Milan, Florence, Rome, and Naples. The characteristics of the subsample exploited here are summarized in Table 1.

| Corpus | N words | N speakers | Speaker age | Date data collected | Origin of speakers |
|---|---|---|---|---|---|
| C-ORAL | 255,138 | 61 | 10-60+ | 2000-2003 | Various |
| LIP | 352,709 | unknown | unknown | 1990-1992 | Various |
| Total | 607,487 | | | | |

Table 1. Number of words and speakers retained for each corpus of contemporary Italian.

Only recordings of spontaneous conversations in naturalistic settings were retained for the study; those that occurred in non-spontaneous or non-naturalistic contexts (e.g., news broadcasts, scientific press conferences, etc., all of which are scripted) were excluded. It is important to acknowledge that these datasets have several drawbacks in terms of sampling methodologies (DIGESTO, 2019, p. 44). Some geographical regions are disproportionately represented, i.e., the informal data from C-ORAL was mainly collected in Florence and surrounding areas; LIP provided recordings in the four main urban centres listed above, although no socio-demographic information was collected, and it is uncertain whether the speakers were bona fide members of the targeted speech communities. There is considerable variation in the length of recordings within and between corpora, and there is relatively little speech from a large number of participants. Only a portion of the data is accompanied by socio-demographic information, mainly C-ORAL data, and overall, the data is skewed towards speakers with a high level of education. Despite these limitations, the corpora are essential tools for quantitative analysis of naturalistic speech. Both represent spontaneous speech fairly, allowing us to investigate a number of external and internal factors which have never been systematically examined under an accountable empirical methodology. Moreover, the quantitative analysis will focus on both rates and conditioning. Rates of subjunctive could indeed be affected by an overrepresentation of a highly educated population, particularly given the prestige attached to the use of the subjunctive, although its distribution across contexts is more likely to remain consistent.
In fact, while rates can vary due to extralinguistic factors, the conditioning (i.e., the underlying grammar) is expected to reflect more stable constraints on variation (POPLACK; TAGLIAMONTE, 2001; see also POPLACK; LEVEY, 2010). All the retained data were concordanced, and every unambiguous occurrence of subjunctive morphology was identified and retained for analysis.

4. Operationalizing hypotheses

By adopting the coding protocol outlined in previous variationist studies on mood variation (e.g., POPLACK et al., 2013; POPLACK et al., 2018), we propose falsifiable criteria to assess hypotheses regarding the internal and external conditioning of the subjunctive extrapolated from the relevant literature, particularly the issue of a meaning-based alternation with the indicative, its productivity in speech and its stylistic conditioning. The hypotheses are operationalized and tested in a quantitative manner, shedding light on the constraints operating on the selection of the subjunctive in the variable context objectively defined above. All tokens of the variable were coded according to several internal and external factors said to influence the choice of the subjunctive in discourse, as described below.

Semantic class of the governor. A recurrent hypothesis is that what triggers the selection of the subjunctive in completive clauses is the nature of the governor. However, the task of coding each governor according to a given semantic class was an arduous one, since an exhaustive agreed-upon list of verbs for each semantic class is not available and there is no inter-analyst agreement on exactly which semantic classes or meanings should select the subjunctive. This study adopts the semantic classification suggested in previous variationist research (POPLACK, 1992; POPLACK; LEALESS; DION, 2013) and quantitative Italian research (e.g., BONOMI, 1993; LOMBARDI VALLAURI, 2003), and tests the main categories reported in normative treatments of subjunctive usage. We tested the assumption that the subjunctive is triggered categorically by verbs conveying meanings of emotion (8a), volition (8b) and necessity (8c), variably by opinion (8d) and evaluative (8e) verbs, and is disfavoured with verbs of communication (8f). It is important to note that in the classification adopted in the current research, opinion verbs denote a subjective or epistemic meaning while evaluative verbs rather denote an evaluative attitude and assessment of an event (e.g., verificare ‘to establish’, reputare ‘to deem’).

[Examples (8a)-(8f)]

Sentence type. Previous variationist research operationalized an approximate test of assertion vis-à-vis the predication, and this was done independently of the semantics conveyed by the governor. Therefore, negative (9a) and interrogative (9b) sentences were coded as less assertive contexts than the affirmative (9c) counterpart. If the subjunctive is associated with a less assertive reading, we can predict that negative and interrogative contexts would favour its selection.

[Examples (9a)-(9c)]

Presence of other indicators of non-factual modality. Beyond whatever meaning is supposedly embodied by the subjunctive morphology, we assess the role of elements in discourse that could contribute to a non-factual reading of the proposition. Every token was coded according to the presence of explicit cues in running discourse indicating a non-factual reading, i.e., a less assertive or uncertain predication, through the presence of adverbs, modals or expressions that help to objectively establish a doubtful interpretation of the proposition expressed. Tokens were coded according to the presence (10a) or absence (10b) of such indicators. The prediction is that the subjunctive is favoured in contexts objectively denoting uncertain or less certain predication.

[Examples (10a) and (10b)]

Lexical identity of the governor. Each token was coded according to the lexical identity of the governor that triggered the subjunctive, to empirically ascertain the contribution of the governors to subjunctive selection. If the lexical identity of the governor contributes to variability independently of any meaning to be expressed, we can take it as an indication that the subjunctive may not be productive. We can also measure productivity in terms of the number of governors that triggered the subjunctive in discourse and whether or not they correlate with other internal and/or external factors. To establish that the subjunctive is productive, we should observe that it is triggered by a great number of governors and that it makes a semantic contribution, and we should note a consistent effect when the governors are cross-tabulated with another factor group, e.g., semantic class of the governor.

Lexical identity and morphology of the embedded verb. Likewise, each token was coded according to the lexical identity of the embedded verb as well as its type of morphology. As argued for the lexical identity of the governor, to deem the subjunctive productive, we should detect a high number of embedded verbs carrying subjunctive morphology. Moreover, the type of morphology enabled us to ascertain whether speakers prefer more transparent and frequently occurring indicative morphology rather than dealing with the subjunctive. Furthermore, if the subjunctive is not semantically motivated but instead lexicalized in discourse, we can predict that it would be favoured with more frequent and irregular forms, since these forms are said to resist change and are more amenable to entrenchment in discourse (BYBEE, 1985, 2007; BYBEE; THOMPSON, 1997). Tokens were coded according to whether their morphological form was regular (11a), irregular (11b), or suppletive (11c).

[Examples (11a)-(11c)]
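The coding protocol described in this section amounts to attaching one value per factor group to every token. A minimal sketch of such a record is given below, assuming hypothetical field names and value labels (the paper does not specify how the coded data were stored), together with a helper that tabulates subjunctive counts per level of any factor group. The toy tokens are invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Value labels follow the factor groups named in the text; the exact
# strings are assumptions made for this sketch.
VARIANTS = {"subjunctive", "indicative", "conditional"}
SEMANTIC_CLASSES = {"emotive", "volition", "necessity",
                    "opinion", "evaluative", "communication"}
SENTENCE_TYPES = {"affirmative", "negative", "interrogative"}
MORPHOLOGY = {"regular", "irregular", "suppletive"}


@dataclass(frozen=True)
class CodedToken:
    variant: str
    governor: str          # lexical identity of the governor
    semantic_class: str    # semantic class of the governor
    sentence_type: str
    nonfactual_cue: bool   # other indicator of non-factual modality present?
    embedded_verb: str     # lexical identity of the embedded verb
    morphology: str        # morphological type of the embedded verb

    def __post_init__(self):
        # Reject values outside the closed coding categories.
        checks = [(self.variant, VARIANTS),
                  (self.semantic_class, SEMANTIC_CLASSES),
                  (self.sentence_type, SENTENCE_TYPES),
                  (self.morphology, MORPHOLOGY)]
        for value, allowed in checks:
            if value not in allowed:
                raise ValueError(f"unexpected code: {value!r}")


def subjunctive_rate_by(tokens, factor):
    """Return {level: (n_subjunctive, n_total)} for one factor group."""
    totals, subj = Counter(), Counter()
    for t in tokens:
        level = getattr(t, factor)
        totals[level] += 1
        subj[level] += int(t.variant == "subjunctive")
    return {level: (subj[level], totals[level]) for level in totals}


# Invented toy tokens, for illustration only.
toy_tokens = [
    CodedToken("subjunctive", "credere", "opinion", "affirmative", False, "essere", "suppletive"),
    CodedToken("indicative", "credere", "opinion", "negative", True, "parlare", "regular"),
    CodedToken("subjunctive", "volere", "volition", "affirmative", False, "venire", "irregular"),
    CodedToken("indicative", "dire", "communication", "interrogative", False, "essere", "suppletive"),
]

print(subjunctive_rate_by(toy_tokens, "semantic_class"))
print(subjunctive_rate_by(toy_tokens, "governor"))
```

Because the same helper works for any field, cross-tabulating the governor against its semantic class, as the productivity test requires, only involves calling it on different subsets of tokens.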

5. Overall results

The extraction method applied here yielded a dataset of 1713 tokens, of which 3% (N=50) were marked with the conditional. Due to its relative rarity, the conditional variant was excluded from the analysis, resulting in the overall distribution of the two competing variants: subjunctive and indicative, shown in Figure 2.

Figure 2. Overall distribution of the variants in the dataset.

Overall results show robust variability, with the subjunctive being selected more than two-thirds of the time, appearing quite productive, at least superficially. However, overall rates can be misleading and may hide internal conditioning that is indiscernible from the inspection of surface forms alone.

We can now turn to the main question: what constrains the observed variability? This study reports the results concerning the factors operationalized to test hypotheses that a meaning-based use of the subjunctive in discourse is at work: the semantic class of the governor, the sentence type, and the presence of indicators of non-factual modality. The effect of a given factor is inferred by comparing its individual rate of subjunctive to the overall rate for the pooled data (cf. Figure 2, overall subjunctive rate of 68%). If a factor shows a rate of subjunctive selection higher than the overall rate, its effect is deemed favouring. The greater the difference, the stronger the effect. Likewise, the same method is applied when deeming the effect disfavouring (rate of subjunctive selection of a given factor lower than the overall rate). A neutral effect is deemed when there is no substantial difference between the rate of subjunctive of a given factor and the overall rate.

6. Semantic tests

According to recurrent claims in the literature, the semantic classes of emotive, volitive and necessity verbs, categories usually indicating a modal interpretation related to permission, necessity, desire, or obligation, are predicted to select the subjunctive categorically. On the other hand, the subjunctive should occur variably with evaluative and opinion verbs, and be disfavoured with communicative verbs. Importantly, for every overall effect observed, we expect a consistent effect amongst the members of the semantic class before deeming the effect genuine, i.e., no single member of a semantic class should determine the effect of the entire class.

| Semantic class | % Subj | N |
|---|---|---|
| Emotive | 93% | 43/46 |
| Volition | 89% | 201/225 |
| Necessity | 81% | 76/94 |
| Evaluative | 71% | 17/24 |
| Opinion | 66% | 786/1190 |
| Communication | 19% | 16/84 |
| Overall | 68% | 1139/1663 |

Table 2. Subjunctive selection according to semantic class of the governor. Shading indicates a favouring effect.
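The comparison of each factor's rate with the overall rate can be reproduced from the counts in Table 2. The sketch below is an illustration under one loudly stated assumption: the text does not say what counts as a "substantial difference", so the 2-point tolerance applied to whole-number percentages (`TOLERANCE`) is hypothetical, chosen only so that the resulting labels agree with the way the classes are described in the discussion of Table 2.

```python
# Counts from Table 2: (subjunctive tokens, all tokens) per semantic class.
TABLE_2 = {
    "Emotive": (43, 46),
    "Volition": (201, 225),
    "Necessity": (76, 94),
    "Evaluative": (17, 24),
    "Opinion": (786, 1190),
    "Communication": (16, 84),
}

TOLERANCE = 2  # percentage points; hypothetical cut-off, not from the paper


def pct(subj, total):
    """Rate of subjunctive as a whole-number percentage, as reported."""
    return round(100 * subj / total)


def effect(factor_pct, overall_pct, tolerance=TOLERANCE):
    """Label a factor by comparing its rate with the overall rate."""
    diff = factor_pct - overall_pct
    if diff > tolerance:
        return "favouring"
    if diff < -tolerance:
        return "disfavouring"
    return "neutral"


overall_subj = sum(s for s, _ in TABLE_2.values())
overall_total = sum(n for _, n in TABLE_2.values())
overall_pct = pct(overall_subj, overall_total)

effects = {name: effect(pct(s, n), overall_pct)
           for name, (s, n) in TABLE_2.items()}

for name, (s, n) in TABLE_2.items():
    print(f"{name:<14}{pct(s, n):>4}%  {s}/{n}  {effects[name]}")
print(f"{'Overall':<14}{overall_pct:>4}%  {overall_subj}/{overall_total}")
```

The class counts sum to the overall figures in the last row of the table (1139/1663), which is a useful consistency check on the coding.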

Results in Table 2 show that some semantic classes favour the subjunctive more than others. All three classes of stronger modal determination, i.e., emotive (12), volitive (13) and necessity (14) governors, favour the subjunctive and select it most of the time, which is in line with theoretical and prescriptive assumptions.

[Examples (12)-(14)]

However, we do not observe the categorical selection of the subjunctive often assumed with such classes of verbs. Moreover, variability characterizes every semantic class, although to different degrees. The semantic class of evaluative verbs slightly favours the subjunctive (71%); opinion verbs have no effect on the selection of the subjunctive (66%); communicative verbs highly disfavour it (19%). As noted above, we expect every member of a given semantic class to behave the same way before concluding that the observed effect is genuine. However, we do not observe such uniform behaviour within semantic classes. Instead, we observe considerable discrepancies between the rate of the subjunctive for a semantic class as a whole and the rates of selection for its individual members. The class of opinion verbs is the most populated in terms of data and lexical types of governors. Eighty verbs were coded as verbs of opinion, although only four of them account for more than half (64%) of all the data within this semantic category. These four verbs show different patterns (Figure 3): credere ‘to believe’ (76%) and sembrare ‘it seems’ (74%) favour the subjunctive; pensare ‘to think’ (68%) has no effect; non è ‘it is not’ (32%) highly disfavours the subjunctive. The same can be said for all other governors coded as opinion verbs, as summarized in Figure 3. The observation is that not all opinion governors share the same pattern or direction of effect. The rate of subjunctive observed in this semantic class ranges from 14% to 100% (27 of these governors are singletons). Moreover, many governors show an exceptionally low or singleton token count, which precludes any substantial conclusion for those contexts. The semantic class of evaluative verbs contains little data dispersed across a few governors, and no consistent effect is observed (Figure 3).

Figure 3. Breakdown of subjunctive selection according to governors within the semantic classes of Evaluative, Opinion and Communicative verbs.

On the other hand, the class of communicative verbs shows a clear lexical effect: it is composed of only two lexical types, dire ‘to say’ and the construction non è da dire ‘it must not be said,’ and the former accounts for 99% of the data and highly disfavours the subjunctive (18%). Another semantic class showing a lexical effect is that of necessity verbs (Figure 4).

Figure 4. Breakdown of subjunctive selection according to governors within the semantic classes of Emotive, Volitive and Necessity verbs.

We observed that 64% of the data in this category (N=60/69) belongs to one governor, bisognare ‘it is necessary’, while the only other governor of moderate frequency disfavours the subjunctive (bastare, ‘to suffice; ≈ as long as’, 54%). If we exclude bisognare from the factor group, the favouring effect of necessity verbs is nullified, since the overall rate drops to 62% (N=21/34).

Summarizing, we observed the dominance of a single governor and its dissimilarity from the patterns attested for other members of this semantic class. Hence, this class shows a clear lexical effect rather than a genuine semantic effect. The only two semantic classes showing a more consistent pattern overall are volitive and emotive verbs. However, as shown in Figure 4, both classes display a fair number of singletons and highly infrequent governors, weakening the observed overall effect. In other words, the overall rates may be inflated by the presence of a few governors with a low token count rather than reflecting a true semantic effect. Moreover, with regard to the semantic class of volitive verbs, two governors, volere ‘to want’ and sperare ‘to hope,’ account for 55% of all the volitive verbs data, further weakening the observed overall effect.

Summarizing, the widespread assumption and the explicit prescription that some classes of verbs categorically select the subjunctive are not supported in these data. The only quantitative result that lends support to the semantic hypothesis was observed with volitive and emotive matrices. However, we must bear in mind that these two categories account altogether for only 16% of the dataset (N=271/1663) and are mostly constituted of infrequent and singleton governors, except for volere and sperare, preventing strong confirmation of a genuine semantic effect. The idiosyncratic lexical effects and within-category inconsistencies observed in the data suggest that the semantic motivation cannot account for mood selection in Italian subordinate clauses.

Further evidence of a lack of semantic contribution to subjunctive selection is drawn from the analysis of the contribution of sentence type. Recall that non-assertive contexts such as negative and interrogative sentences should favour the subjunctive. The results shown in Table 3 do not support this hypothesis.

Sentence type    %      N
Affirmative      72%    949/1310
Interrogative    68%    34/50
Negative         51%    156/303
Total            68%    1139/1663
Table 3. Rate of subjunctive according to sentence type.
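
The rates in Table 3 can be recomputed directly from the token counts. The following is a minimal sketch and not the procedure used in this study: it assumes a plain Pearson chi-square test without continuity correction on the aggregate affirmative vs. negative counts, and the names `rate` and `chi_square_2x2` are hypothetical. The tests reported in this paper were run governor by governor, so the aggregate statistic is illustrative only.

```python
import math

# Token counts from Table 3: (subjunctive tokens, total tokens).
TABLE_3 = {
    "Affirmative": (949, 1310),
    "Interrogative": (34, 50),
    "Negative": (156, 303),
}


def rate(subjunctive, total):
    """Return the subjunctive rate as a rounded percentage."""
    return round(100 * subjunctive / total)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table
    [[a, b], [c, d]]; returns (statistic, p-value) for 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


rates = {name: rate(s, n) for name, (s, n) in TABLE_3.items()}

# Rows: affirmative, negative. Columns: subjunctive, indicative.
aff_s, aff_n = TABLE_3["Affirmative"]
neg_s, neg_n = TABLE_3["Negative"]
stat, p_value = chi_square_2x2(aff_s, aff_n - aff_s, neg_s, neg_n - neg_s)

for name, pct in rates.items():
    print(f"{name}: {pct}%")
print(f"chi-square = {stat:.2f}, p = {p_value:.3g}")
```

An aggregate comparison of this kind cannot by itself separate an effect of negation from the effect of an individual governor, which is why the cross-tabulation by governor is needed.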

The assertive context of affirmative sentences slightly favours the subjunctive, while interrogative contexts have no effect and negative sentences disfavour the selection of the subjunctive. Further analysis showed several interesting results. Cross-tabulation of sentence type and lexical identity of the governor (Table 4) showed that 1) the effect of sentence type was not consistent across all governors that triggered the subjunctive, and 2) where cells were populated with data, no effect of negation was observed. It should be noted that the dispersion of data among dozens of different governors resulted in either empty cells or very low token counts in interrogative contexts, which made it impossible for us to ascertain how interrogative clauses affected mood choice, and they were therefore excluded. For the most frequent governors in the data set, shown in Table 4, Chi-square tests confirm that there is no significant difference between affirmative and negative contexts; some governors (bisognare, sperare and può darsi) only occur in affirmative contexts and non è necessarily occurs only in negative clauses, thus impeding the test of the non-assertive hypothesis.

Table 4. Subjunctive selection according to sentence type and governor (affirmative vs. negative sentence types only).

More interestingly, one single governor, non è, is responsible for the overall disfavouring effect of negation observed above in Table 4 since it accounts for 55% of all the negative contexts (N=168/303). If we exclude tokens of the outlier non è from calculating the overall rate of subjunctive in non-affirmative contexts, we observe no difference between affirmative and non-affirmative sentence types (Figure 5 below). The evidence suggests that the hypothesis according to which less assertive contexts should favour the subjunctive does not hold true in speech.

Figure 5. Subjunctive selection according to affirmative and non-affirmative data, with and without non è.

We also assessed the doubtful nature of the proposition by coding for the presence of explicit and objectively identified indicators of non-factual modality in the ambient discourse. The presence of such elements, sometimes referred to as subjunctive triggers (e.g., epistemic adverbs, a modal, or a main clause in the future tense or in the conditional or subjunctive mood), may be predicted to favour the use of the subjunctive. Overall, results show that the absence of indicators has no effect, while the presence of indicators favours the selection of the subjunctive (Table 5). Further investigation highlighted some important results with regard to the presence of indicators of non-factual modality.

First, it is important to note that indicators of non-factual modality are rare in discourse. In fact, most of the data, 86%, do not display any overt indicators. Second, if the presence of indicators had a genuine effect, we would expect the different indicators to behave the same way, i.e., to have a consistent effect across the categories. On the contrary, some indicators, such as cases of governors embedded under conditional se ‘if’, slightly disfavour the subjunctive; the presence of lexical indicators (e.g., forse) has no effect; future tense, conditional or subjunctive moods of the matrix clause favour the subjunctive; the combination of multiple factors categorically selects the subjunctive, though with only 3 tokens we cannot draw any substantial conclusion.

Indicators of Non-Factual Modality    %       N
Presence of indicators                75%     180/239
Absence of indicators                 67%     959/1424
Total                                 68%     1139/1663
Breakdown of the factor group
Combination of factors                100%    3/3
Auxiliary used modally                92%     23/25
Tense/mood of the matrix clause       79%     81/102
Lexical indicator                     68%     62/91
Absence of indicators                 67%     959/1424
Conditional se+governor               61%     11/18
Total                                 68%     1139/1663
Table 5.
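
The pooled rate for the presence of indicators can be read as a token-weighted average of sub-categories whose individual rates run from 61% to 100%. The sketch below, which uses hypothetical names, recomputes each rate from the counts in the breakdown above and checks that the sub-categories add up to the pooled row. It is illustrative arithmetic and not part of the original analysis.

```python
# Counts from the breakdown of indicators of non-factual modality:
# (subjunctive tokens, total tokens) for each type of overt indicator.
INDICATORS = {
    "Combination of factors": (3, 3),
    "Auxiliary used modally": (23, 25),
    "Tense/mood of the matrix clause": (81, 102),
    "Lexical indicator": (62, 91),
    "Conditional se+governor": (11, 18),
}


def percentage(subjunctive, total):
    """Return the subjunctive rate as a rounded percentage."""
    return round(100 * subjunctive / total)


per_indicator = {name: percentage(s, n) for name, (s, n) in INDICATORS.items()}

# Pooling the sub-categories gives the "presence of indicators" row.
pooled_subjunctive = sum(s for s, _ in INDICATORS.values())
pooled_total = sum(n for _, n in INDICATORS.values())
pooled_rate = percentage(pooled_subjunctive, pooled_total)

# Share of the pooled tokens contributed by each sub-category.
weights = {name: n / pooled_total for name, (_, n) in INDICATORS.items()}

for name, pct in per_indicator.items():
    print(f"{name}: {pct}% ({weights[name]:.0%} of tokens with an indicator)")
print(f"Pooled: {pooled_rate}% ({pooled_subjunctive}/{pooled_total})")
```

Because the pooled figure is dominated by the two largest sub-categories, it says little about whether the individual indicators behave alike.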

Overall, factors that are designed to capture a doubtful reading of the proposition through the presence of explicit cues in running discourse do not have consistent effects on the selection of the subjunctive. If semantics were genuinely an explanatory factor of subjunctive selection in completive clauses, we should not have observed such inconsistencies both across and within the factor groups tested. The exception to the rule seems to concern the contexts of emotive and volitive verbs, though they account for only a small portion of the dataset, and the low number of tokens, particularly for emotive verbs, suggests that we are not observing a genuine effect of semantics.

7. Lexicalized use of the subjunctive in discourse

The factor groups designed to assess the semantic contribution to the selection of the subjunctive failed to account for a meaning-based alternation in discourse. The results presented above demonstrate how certain governors play a key role in the overall outcomes observed. A total of 140 governors was extracted from the data. Such a high number may suggest a productive use of the subjunctive in discourse. Although a relatively high number of governors in Italian discourse selected a subjunctive at least once, more than a third (38% of the governor pool, N=53/140) are singletons and 3% of the governor pool (4 verbs) account for nearly half of the data (46% of all tokens, N=769/1663). One might suppose that frequency of use explains this pattern. That would imply that the more frequent the governor, the higher the rate of the subjunctive, which is not the case.
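
The concentration figures above follow from simple proportions. A minimal sketch, with hypothetical variable names and using only the counts reported in this section:

```python
# Counts reported in the text.
TOTAL_GOVERNORS = 140      # distinct governors extracted from the data
SINGLETON_GOVERNORS = 53   # governors attested with a single token
TOP_GOVERNORS = 4          # the most frequent governors
TOP_GOVERNOR_TOKENS = 769  # tokens governed by those four verbs
TOTAL_TOKENS = 1663        # all tokens in the dataset


def share(part, whole):
    """Return part/whole as a rounded percentage."""
    return round(100 * part / whole)


singleton_share = share(SINGLETON_GOVERNORS, TOTAL_GOVERNORS)
top_type_share = share(TOP_GOVERNORS, TOTAL_GOVERNORS)
top_token_share = share(TOP_GOVERNOR_TOKENS, TOTAL_TOKENS)

print(f"Singletons: {singleton_share}% of governor types")
print(f"Top {TOP_GOVERNORS} governors: {top_type_share}% of types, "
      f"{top_token_share}% of tokens")
```

The contrast between the share of types and the share of tokens is what makes the large governor pool a poor measure of productivity.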

Figure 6 shows that there is no correlation between the frequency of a governor and its rate of subjunctive selection: highly frequent verbs, the top four governors credere ‘to believe’, sembrare ‘it seems’, pensare ‘to think’ and non è ‘it is not’, do not behave the same way; the same can be said about medium-frequency governors, ranging from 18% to 100% rate of subjunctive selection, highlighting idiosyncratic inconsistencies and the absence of an apparent trend concerning frequency. In sum, the top three governors alone, credere, sembrare and pensare, are responsible for 38% of all the subjunctive morphology in this data set (N=436/1139). The addition of parere, volere, sperare and bisognare (the next four contexts favouring the selection of the subjunctive) brings the count to 55% of all the subjunctive morphology, which is a considerable discrepancy if we consider the overall high number of governors.

Figure 6. Rate of subjunctive according to the governors. The table indicates the analyst-imposed frequency divisions.

Each governor has the potential to trigger subjunctive morphology in completive clauses, though this option seems to fall under the purview of only three lexical governors. In addition to the role of the governor, we notice striking results with regard to the lexical identity of the embedded verb, i.e., the verb that carries subjunctive morphology.

Due to the high number of governors in running discourse, we can expect a high number of embedded verbs to receive subjunctive morphology. Moreover, every verb in the language is theoretically eligible to carry subjunctive morphology. However, when we examine the results for the lexical identity of the embedded verb, the effects observed are quite dramatic. Results displayed in Figure 7 show that only three verbs favour the selection of the subjunctive: essere ‘to be’ (73%), andare ‘to go’ (71%) and sapere ‘to know’ (81%). As observed for the governors, we do not detect a consistent trend with the lexical identities of the embedded verbs. Figure 7 shows that other relatively frequent verbs slightly disfavour the subjunctive, as is the case for the second most frequent verb avere ‘to have’ (65%), or have no effect, e.g., potere ‘can’ (67%), ruling out frequency as an explanatory factor for the variation observed.

Figure 7. Subjunctive selection according to lexical identity of the embedded verb.

More strikingly, of the 232 verbs coded as embedded lexical identities, a single verb, essere ‘to be’, accounts for 40% of the pool of embedded verbs and 42% of all the subjunctive morphology used in discourse in the context of completive clauses (N=481/1139). The addition of the next seven verbs (avere, potere, fare, dovere, andare, venire and stare in Figure 7 above) brings the count to 71% of all the subjunctive in this data, suggesting that most of the embedded verbs are extremely rare or singletons. Furthermore, a Chi-square test of essere vs. other embedded verbs shows that there is a significant difference in their selection of the subjunctive (p = .000228, significant at p < .01). On the one hand, with the governors, we observed an apparent productivity due to the large number of verbs, though only a few of them account for most of the variation in discourse; on the other hand, the actual marking of the subjunctive is done on the embedded verb. Thus, we observed a restriction of the subjunctive morphology to a handful of embedded verbs.

We also investigated whether the type of morphology of the embedded verb affects variant selection and found that suppletive morphological forms favoured the selection of subjunctive forms (Table 6).

Embedded Morphological Form    %      N
Suppletive                     75%    515/691
Regular                        65%    368/568
Irregular                      63%    256/404
Total                          68%    1139/1663
Table 6. Subjunctive selection according to morphological form of the embedded verb.

Further examination showed that the lexical identity of the verb overrides the effect of morphological form, since most suppletive forms (81%, N=560/691) belong to one lexical type, essere. The effect of essere in its suppletive forms confirms the restriction of the subjunctive mood to this specific lexical identity and hints at a lexicalized use in discourse. When we cross-tabulated the embedded lexical identity/morphology with the governors, we noted a substantially higher rate of occurrence for almost all the frequent governors (with 50+ tokens) with suppletive essere than with other lexical identities and morphological forms (Figure 8).

Figure 8. Subjunctive selection by governor according to lexical identity and morphology of the embedded verb.

We observed rates of subjunctive skyrocketing with suppletive essere morphology, for instance, with pensare (+24%), non è (+23%) and credere (+22%). Verbs such as volere and bisognare, which already show high rates of subjunctive selection, jump to 100% when the embedded verb is suppletive essere. In sum, we observe overall a high propensity to trigger the subjunctive with the very salient suppletive forms of essere. Furthermore, when we break down the tokens of essere, of the 431 total count, 330 belong to one single form: sia.

Summarizing, all these results detract from the assumption that the variable use of the subjunctive in completive clauses in Italian makes a semantic contribution and indicate instead a routinized use in discourse. Moreover, on the surface, the use of the subjunctive seems productive if we consider the contribution of the governors, but this productivity clearly does not extend to the embedded verbs. To further investigate these contrasts and examine the apparent productivity of the subjunctive in running discourse, we examined whether social factors contribute to subjunctive variability.

8. The social conditioning

Both speech style and level of education showed interesting results and enabled us to account for the apparent productivity and the lexical pattern reported above. The link between the subjunctive and more formal contexts has been widely reported and is well documented, though a systematic comparison of the pattern across categories has never been carried out. Cross-tabulation of the lexical identity of the governor and speech style reveals some interesting results.

Table 7. Subjunctive selection according to style and lexical identity of the governor.

First, the total shows an overall rate of subjunctive selection that is higher in more careful speech contexts (71%, N=763/1082) compared to more casual conversations (65%, N=376/581), lending support to the hypothesis that the subjunctive is indeed favoured in more formal contexts. Another important result is that, if we consider casual speech data only, credere, pensare, sembrare, and bisognare favour the subjunctive, but they also account for 32% of the casual speech data (N=184/581) and almost half of all the subjunctive used in informal situations (49%, N=184/376, cf. total subjunctives in casual speech in Table 7). This is an important result concerning the question of productivity since it shows an even less productive use of the subjunctive in casual speech. Moreover, the highly frequent governor non è constitutes a large portion of the casual speech subsample and since it highly disfavours the subjunctive, it lowers the overall rate of subjunctive selection in the context of casual speech.

Another important result is that, despite the fluctuations of the effect of speech style by governor, Chi-square tests indicated that no statistically significant difference at p < .01 is observed, whether the lexical types listed in Table 7 above are used in casual or careful speech. On the other hand, results for the two categories of infrequent and singleton governors highlighted a substantial difference according to speech style. First, we observed a greater number of singletons in careful speech. Second, the overall rate of subjunctive selection for the infrequent governors is considerably higher in more careful speech, and this difference is statistically significant at p < .01 (p-value=.008612). Third, not only is the rate of subjunctive selection higher for infrequent governors in more careful speech, but the amount of subjunctive used is almost three times greater in the least ‘natural’ way of speaking. These results suggest that speakers may be more sensitive to the overt prestige that the use of the subjunctive in discourse has acquired in contemporary Italian, triggering it more in formal than in casual conversations. It is now clear that the apparent productivity of the subjunctive in completive clauses is attributable to highly infrequent and singleton governors surfacing more frequently in careful speech. One might think that more careful speech calls for a richer vocabulary and that lexical diversity is a logical consequence of the several speech contexts instantiated in the formal subsample of the data. The formal subsamples were recorded in a relatively wide variety of contexts, e.g., political debates, television interviews, etc. Each context focusses on a different topic, though they were all characterized by more careful speech and therefore possibly more complex and florid argumentation.

Notwithstanding these considerations, we should bear in mind that most of the variation is not accounted for by a stylistic difference, since the bulk of the governors behave in the same way regardless of speech style. Further investigation showed an interesting correlation between the speaker’s level of education and speech style.

Table 8. Number of governors and rate of subjunctive according to speakers’ level of education.

It is clear from Table 8 that highly educated speakers are extensively contributing to the pool of governors with a rich set of lexical types, most of which are infrequent or singleton governors. In other words, these speakers are responsible for the apparent productivity of the subjunctive observed on the surface. This result is consistent with the observation that the subjunctive is a sociolinguistic stereotype in contemporary Italian society. We would expect that if there is one segment of the population that would be extremely sensitive to this highly normative and salient feature, it would be the one with a higher level of education. If the subjunctive were truly productive, as reported elsewhere or as it misleadingly appears to be on the surface, we should have observed more of it in the vernacular of the speakers regardless of their level of education. On the contrary, Table 8 shows that speakers with little or no formal education do not use the subjunctive in casual conversation to the extent that highly educated speakers do. We also notice a substantial difference in terms of the governor pool by level of education: highly educated speakers are single-handedly providing the rich set of governors observed in our analysis. Despite the general preference to use the subjunctive more in careful speech, speakers with a higher level of education contribute a great number of governors even in their casual speech. These observations suggest that speakers, mainly those with a higher level of formal education, make the effort to use the subjunctive, and therefore to convey the linguistic prestige that this morphology has gained in contemporary Italian society.

9. Discussion

By making use of the standard variationist methodology, we empirically evaluated several hypotheses regarding the nature of the linguistic and social factors conditioning the choice of the subjunctive in contemporary Italian discourse. We were able to address three main issues characterising contemporary accounts and ideology of the Italian subjunctive, i.e., the supposed semantic contribution, its social conditioning, and its status in terms of productivity. Results do not lend support to one of the major assumptions prevailing in the literature and in normative accounts of the subjunctive, i.e., the semantic/pragmatic motivation underlying the choice of the subjunctive mood in embedded completive clauses. On the contrary, objective and independent tests showed that rates of occurrence are mainly affected by the presence of numerous governors within each semantic category and most of the time showed fluctuation in subjunctive rates, i.e., an inconsistent effect, casting doubt on the supposed influence of such factors on subjunctive selection. Amongst the factors tested, only two showed a consistent effect, emotive and volitive governors, though the authenticity of this effect is heavily challenged by the relatively negligible number of tokens to which this purported semantic effect applied (N=271/1663, 16% of the entire dataset). Furthermore, most of the governors populating these semantic categories were very infrequent or singletons. On the other hand, although we are far from the dramatic situation observed in French reported above, we identified a lexically routinized use of the subjunctive in Italian discourse, with a handful of governors accounting for most of the variation observed in speech and the subjunctive marked to a great extent on a single embedded form, suppletive essere.

The subjunctive looks very productive on the surface, which apparently mitigates this lexicalization. Its overall rate is fairly high (68%), and many verbs triggered it in discourse (N=140). By means of systematic comparisons, we were able to pinpoint the nature of this apparent productivity: the overt prestige attached to subjunctive morphology in contemporary Italian speech. We first observed an apparent effect of style: overall rates of the subjunctive were higher in careful speech (71%) than in casual speech (65%), but the core set of shared governors showed similar behaviour across the two speech styles, and Chi-square tests showed no significant difference. More strikingly, the high saliency of the subjunctive is reflected in the strong tendency of highly educated speakers to make use of the subjunctive in embedded completive clauses, not only in careful speech but in casual speech as well. This might be explained by the massive effort of the prescriptive enterprise invested in making the subjunctive a hallmark of bon usage. Subjunctive morphology is indeed very salient (DELLA VALLE; PATOTA, 2014; STEWART, 2002) and its non-standard use often provokes vitriolic reactions from writers, journalists, teachers, intellectuals and the general public as well. The overt prestige associated with the use of the (standard) subjunctive is often lamented, to the extent that even some linguists claim that its existence is threatened by the rising use of the indicative, and its avoidance is often condemned as a feature of popular, uneducated, or careless speech (FOCHI, 1956; GONZÁLEZ DE SANDE, 2004; SCHMITT JENSEN, 1970; SIMONE, 1993). Our results showed that the subjunctive performs a social function.
The great contrast observed between highly educated speakers and those with little or no formal education in our dataset confirmed the overt prestige of subjunctive morphology and, more importantly, that its productivity is due to level of education rather than to genuine internal linguistic productivity. If the subjunctive were truly productive in speech, we would expect it to arise independently of education. In other words, speakers would not need formal education to learn it.

10. Conclusions

The findings of this study contribute to our understanding of the variability in mood selection in the context of completive clauses and further highlight the importance of adopting empirical quantitative methods for examining linguistic variation. Our results are further evidence that quantitative patterns of co-occurrence of variants are indiscernible on the surface, and they are only accessible through systematic examination of conditioning contexts. The discrepancies between the findings reported in the current study and the claims in the literature, as well as the results diverging from previous sociolinguistic accounts of the Italian subjunctive, are due to several reasons. To begin, we must take into account the inherent variability of speech, which arises from form-function asymmetries in language rather than simply matching a meaning to a form. The distinctions suggested by linguistic theory, as well as prescriptive dictates on how a given linguistic feature is conditioned, are often neutralised in discourse. Secondly, we adhered to an envelope of variation based on the delimitation of a variable context defined corpus-internally, as opposed to the standard ideology on how the subjunctive is supposed to function in discourse. Also, dismissing whatever diverges from the idealized standard language can lead to erroneous conclusions based on only a handful of possibly idiosyncratic occurrences, and therefore to losing sight of the pattern regulating variability. Besides, the chaotic state of grammatical prescription for the subjunctive, as also reported by a few metalinguistic analyses of the subjunctive (see DIGESTO, 2019, ch. 2; POPLACK et al., 2015; POPLACK; LEALESS; DION, 2013), suggests that it is virtually impossible to rely on normative expectations in order to identify the appropriate set of governors and meanings to define the variable context.
In fact, despite relatively high rates of subjunctive selection, the variability observed and reported above characterizes emotive, necessity and volitive matrices as well. Even necessity, a wellspring of subjunctive use, shows a clear lexical effect in our dataset, highlighting the key contribution of the lexical identity of the governors, rather than the semantics underlying each governor, and reinforcing the importance of conducting systematic multifactorial analysis of the factors likely to affect variant choice. The structure of variation is invisible to any but systematic and exhaustive quantitative analysis. Finally, our systematic analysis is based on natural production data. The use of speaker or analyst intuition, as well as the use of written, often highly formal, datasets prevents the analyst from accounting for actual linguistic performance and for the inherent social structure of variability, understood as the set of implicit rules on variant choice which constitute the longstanding community norm.

11. Acknowledgements

I would like to thank the reviewers for their enriching observations and suggestions, as well as Prof. Stephen Levey and Prof. Shana Poplack for useful discussions and feedback. I would like to thank the researchers who compiled the Italian corpora, LIP and C-ORAL-ROM, and made them publicly available to the community, which helped me carry out this research.


Salvio Digesto

Distribution of the subjunctive under verbal governors in spoken contemporary Italian. Governors are sorted by total number of tokens. Horizontal lines in the table indicate frequency divisions suggested by the data. %Data indicates the proportion of the data each governor accounts for. %S_Morpho indicates the proportion of the subjunctive morphology each governor accounts for.
