LEXICALIZATION AND SOCIAL MEANING OF THE ITALIAN SUBJUNCTIVE

The use of the Italian subjunctive, particularly in embedded completive clauses, can be considered a normative linguistic stereotype par excellence, to which speakers should pay particular attention if they want to speak ‘properly’. However, despite the massive effort of the normative enterprise, as well as the considerable scholarly attention from linguists, there is no consensus on what exactly constrains mood selection in discourse: whether the subjunctive makes a semantic contribution, which verbs should trigger it, and whether it signals a more careful style. Grammarians are also concerned with the attrition of the subjunctive and its productivity in speech, fearing the loss of its supposed semantic contribution. Several studies have addressed these issues, but only a small part of this body of work on the Italian subjunctive has used a quantitative method, and these assumptions have not been evaluated systematically under an accountable empirical methodology. The findings of the present variationist investigation bring to light new evidence on the patterning of subjunctive use in community-based spontaneous speech data and refute the claims that it is productive and semantically motivated. The analysis reveals a lexically motivated pattern of variation, i.e., the use of the subjunctive is mainly restricted to a handful of main clause verbs and a single embedded verb. Systematic analysis also shows a correlation between subjunctive choice and a higher level of education, a social meaning that further strengthens the idea that no semantic contribution is made when the speaker opts for the subjunctive over the indicative, a phenomenon that is inherently variable.

Published: 01/10/2022. cadernos.abralin.org. DOI 10.25189/2675-4916.2021.V2.N3.ID609. ISSN: 2675-4916. V. 2, N. 3, 2021.

[...] selecting contexts (TEKAVČIĆ, 1972; see also BRONZI, 1977; LEPSCHY; LEPSCHY, 1981; SERIANNI, 2006). Notwithstanding this assumption that the subjunctive is not freely selected but rather mandatory in a variety of contexts, it is also said to depend on the speaker's degree of commitment vis-à-vis the truth of the embedded proposition. In fact, when variability is acknowledged, i.e., when subjunctive and indicative alternate in the same syntactic context governed by the same (type of) main clause verb, the motivations behind mood choice are said to be correlated with the speaker's intent or state of mind at the time of the utterance (COSTANTINI, 2011, p. 39; GIANNAKIDOU; MARI, 2015; PORTNER; RUBINSTEIN, 2013; see also MANZINI, 2000). Additionally, the subjunctive and indicative alternation is seen as having two alternate semantic/pragmatic interpretations (SATTA, 1994), with the subjunctive marking less subjectivity (LEPSCHY; LEPSCHY, 1981; also SERIANNI, 2006). For instance, the prototypical verb credere 'to believe' is said to select the indicative instead when its choice supposedly corresponds to a more certain predication according to what the speaker believes, as in the example provided by Gatta (2002, p. 5) "credo che Dio esisteIND" ('I believe that God exists'), in which the use of the subjunctive would instead reveal some uncertainty or weakness of the speaker's belief.
There seems to be a general consensus that mood selection is semantically motivated, although no consensus is reached as to exactly which verbs or meanings should trigger the subjunctive in discourse.
Treatments of the contemporary Italian subjunctive in academic literature and formal grammars invoke a kaleidoscopic range of meanings to explain its distribution in speech. It is said to be a mood fluctuating between opinion and perception (BINAZZI, 2015), projecting "a modality of uncertainty and doubt" of the event in question (translation mine, SIMONE, 1993, p. 80). It is also claimed to represent "the intense and emotive degree, the particular and the personal, doubt and the unreal, the unexpected and the surprising, the desired and the feared, the extraordinary and the exceptional" (translation mine, DORIGO, 1951, p. 322).
Linguists and grammarians are also concerned with the supposed attrition or loss of the subjunctive and its productivity in speech. This assumption is linked to the hypothetical desemanticization of subjunctive morphology, a putative change that would purportedly result in the loss of the subjunctive. Few scholars claim that the Italian subjunctive is dead (MARCHI, 1984), although many subscribe to the idea that it is losing ground in favour of the indicative. This assumption is based on the observation that the indicative is taking over contexts that are traditionally associated in prescriptive accounts with the subjunctive (see credere and verbs of opinion in general; SERIANNI, 1986; SIMONE, 1993; TRIFONE, 2007).
Italian sociolinguists seem to agree that the subjunctive is losing ground, particularly in vernacular speech, due to a preference for the more transparent, regular and frequent morphology associated with the indicative (BERRUTO, 1987; BINAZZI, 2015; DE MAURO, 2017; TAVONI, 2002). A few quantitative studies have been conducted to evaluate hypotheses regarding the supposed semantically motivated selection of the subjunctive and its productivity in contemporary Italian. Some striking and sometimes contradictory results surface: 1) the subjunctive remains quite productive in speech and is not threatened by the rise of the indicative mood in contexts traditionally associated with the subjunctive mood (e.g., volitive and opinion matrix verbs; see BONOMI, 1993; SANTULLI, 2009; VELAND, 1991); 2) its use is semantically motivated (VELAND, 1991); 3) the indicative is slowly intruding into subjunctive-selecting contexts, mainly verbs of opinion, supporting the hypothesis of a change in progress (LOENGAROV, 2006; LOMBARDI VALLAURI, 2003; SANTULLI, 2009; SCHNEIDER, 1999; VOGHERA, 1993, p. 3). As a result, the subjunctive is favoured with verbs of opinion, volition and hope while verbs of communication disfavour it (BONOMI, 1993; LOMBARDI VALLAURI, 2003); 4) when the indicative is chosen over the subjunctive, speakers tend to mark modality elsewhere, by using expressions such as a mio parere 'in my opinion', la mia sensazione 'my feeling', or a matrix verb in a conditional tense (GATTA, 2002, p. 88). However, methodological and analytical differences lead to a lack of consistency between studies and consequently raise the issue of the comparability of their results.
Despite a general assumption that the subjunctive is relatively productive in speech, when we calculate the overall rate of occurrence from the results the authors provide in their publications, we notice that the rate of the subjunctive fluctuates considerably across the studies that offer quantitative insights into its use. Three main reasons underlie this lack of consistency from study to study: the nature of the data used as a benchmark; the process of choosing the contexts of study (i.e., the circumscription of the variable context); and the lack of systematic quantitative comparison of the conditioning. Previous studies have used a disparate range of often incommensurable datasets as the basis for their quantitative analysis of subjunctive use, including newspapers (BONOMI, 1993; SANTULLI, 2009; VELAND, 1991), text messages (SANTULLI, 2009), on-line forum discussions (LOENGAROV, 2006; SANTULLI, 2009), as well as literature (SOLIMAN, 2002), with only a few studies relying on corpora of spontaneous speech data (LOENGAROV, 2006; LOMBARDI VALLAURI, 2003; SCHNEIDER, 1999). Some studies focus on verbs of opinion, motivated by the fact that these are said to be the semantic class that suffered the greatest loss of the subjunctive, although these studies lack systematic comparison with other semantic classes. Santulli (2009) [...] i.e., the set of verbs that are supposed to trigger the subjunctive in discourse, is predetermined and restricted to a limited number of main clause verbs frequently reported in the literature and in the grammars (e.g., credere 'to believe', pensare 'to think', sembrare 'it seems'). This imposes a restriction on the context of variation and possibly precludes accountable reporting of all the variation found in the datasets. Additional restrictions include the selective testing of certain contexts at the expense of others.
A few studies focussed exclusively on 3rd person subjects (LOENGAROV, 2006) or 1st person singular subjects (SANTULLI, 2009) with the goal of examining the hypothetical correlation between subjunctive selection and the degree of commitment of the speaker. These studies analyzed contexts where the speaker is less committed (3rd person subjects) or more committed (1st person subjects) vis-à-vis the truth of the embedded proposition. However, as in the case of semantic classes, the analyst is taking for granted that the subjectivity or the commitment of the speaker plays a role, without taking the necessary steps to empirically assess whether that is, in fact, the case. Finally, because the analyses are based on raw numbers of subjunctive tokens, without any reference to rates of occurrence, the analyst weakens the possibility of establishing whether a given context is actually more favourable to the subjunctive than another.
Despite considerable scholarly attention to the subjunctive mood in Italian embedded completive clauses, the topic of its selection and variable use in speech remains a point of debate: whether subjunctive use in Italian is determined by semantic factors and speech style, and whether its use is productive in speech, have yet to be established. This paper adduces new evidence to fill this gap by investigating the use of the subjunctive in Italian discourse through a systematic quantitative analysis of subjunctive selection in present-day community-based production data through the lens of Variationist Sociolinguistics.

THE VARIATIONIST FRAMEWORK
This research is conducted within the theoretical framework of Variationist Sociolinguistics (LABOV, 1972) [...] (POPLACK et al., 2018). Likewise, some previous quantitative research on Italian (GATTA, 2002; SCHNEIDER, 1999) has also hinted at a tendency for the subjunctive to appear in quasi-fixed expressions, and therefore at lexicalization, with credere, sembrare and pensare.
The theoretical approach of Variationist Sociolinguistics rests on the observation that speakers engage in choices to express a given meaning or grammatical function in discourse, and it recognizes variability as an inherent property of speech. The key theoretical construct is the linguistic variable (LABOV, 1972), which involves two or more variants used alternately to express the same referential meaning or function in discourse (POPLACK, 2011; LEVEY, 2010). The linguistic variable studied here is the expression of mood in the embedded context of completive clauses, where subjunctive and indicative alternate with no apparent change in meaning or function. The variationist perspective recognizes that variability is not random but rather structured and governed by multiple factors, be they linguistic or social. The underlying structure of subjunctive variability is discerned from examination of its distribution in discourse and its variable conditioning (POPLACK; TAGLIAMONTE, 2001). A key methodological tenet in variationist research requires adherence to the Principle of Accountability (LABOV, 1972).
This requires that all tokens relevant to the variable under investigation must be taken into consideration, including tokens that did occur as well as those that did not but could have, in order to fully account for the envelope of variation. This implies the identification of an objectively-defined variable context in which the variants alternate without change in meaning. This work differs substantially from previous analyses in the way the variable context is delimited, and the principle of accountability is considered.
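The difference between accountable rates and raw counts can be illustrated with a minimal Python sketch. The token counts below are hypothetical and are not drawn from the study; the function name `subjunctive_rates` is likewise only illustrative. The point is that a context with many subjunctive tokens may still have a low rate once the cases where the subjunctive could have occurred but did not are counted.

```python
from collections import Counter

# Hypothetical coded tokens (context, variant); NOT data from the study.
tokens = (
    [("opinion", "SUBJ")] * 30 + [("opinion", "IND")] * 70
    + [("volition", "SUBJ")] * 9 + [("volition", "IND")] * 1
)

def subjunctive_rates(tokens):
    """Return {context: (n_subjunctive, n_total, rate)} over all coded tokens."""
    totals = Counter(context for context, _ in tokens)
    subj = Counter(context for context, variant in tokens if variant == "SUBJ")
    return {c: (subj[c], totals[c], subj[c] / totals[c]) for c in totals}

rates = subjunctive_rates(tokens)
for context, (n_subj, n_total, rate) in rates.items():
    # Raw counts favour "opinion" (30 vs 9); rates favour "volition".
    print(f"{context}: {n_subj}/{n_total} = {rate:.0%}")
```

In this toy dataset the opinion context yields more subjunctive tokens in absolute terms, yet the volition context selects the subjunctive at a much higher rate, which only becomes visible when non-occurrences are included in the denominator.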

CIRCUMSCRIBING THE VARIABLE CONTEXT
A particular challenge for the study of the subjunctive in a variationist framework is related to the abstract construct of the linguistic variable outlined above and the consideration of the relevant tokens of the variable under investigation. The issue of morphosyntactic variation, and more specifically whether grammatical constructions may be analyzed as linguistic variables, has been a longstanding debate in linguistics, and some scholars have questioned the validity of extending the variationist framework beyond the level of phonology (e.g., LABOV, 1987; LAVANDERA, 1978). One may ask how we can consider mood selection a linguistic variable if, theoretically, subjunctive and indicative are said to express different, perhaps opposing, meanings. Traditional accounts of mood variation appeal to semantic explanations for justifying the use of the subjunctive and its variability with the indicative, by invoking the doctrine of form-function symmetry, i.e., the desire to establish a one-to-one relationship between a form and its meaning. The analyst could identify the contexts that are supposed to trigger the subjunctive and subsequently ascertain the extent to which this does occur in naturalistic speech. However, the semantic function that is supposed to trigger the subjunctive in discourse remains a matter of some debate in the literature, with little prospect of imminent resolution. Moreover, the various meanings that purportedly accompany the use of the subjunctive in prescriptive as well as descriptive and theoretical research (DIGESTO, 2019; POPLACK, 1992; POPLACK et al., 2018) are typically correlated with subjective, psychological or attitudinal motivations, including appeals to the speaker's state of mind or their intent, hope, fear, emotions, etc. These are impossible to operationalize in running discourse (POPLACK et al., 2013). This issue precludes any attempt to circumscribe the variable context based on semantic considerations alone.
Previous variationist research has adopted a more pragmatic approach to this issue by circumscribing the variable context corpus-internally, i.e., by locating all the contexts where the subjunctive is actually used in order to determine where it could be used. Accordingly, the subjunctive-selecting contexts were identified as "every tensed clause governed by a matrix verb, i.e., governor, which triggered the subjunctive at least once" (POPLACK et al., 2018, p. 229). This process yields the complete list of governors that selected the subjunctive at least once in the dataset, such as the main clause verb credere 'to believe' shown in (3).
(3) Credo che tutti lo sappiateSUBJ. (C.438.218)
'I believe that everyone knows it.'
After identifying the lexical identities of the governors, the next step was to analyze the data again and extract all the variants that compete with the subjunctive, which, in the case of the current study, were the indicative (4) and the conditional (5).
(4) Credo che tutto ritornaIND. (C.511.264)
'I believe that everything comes back.'
(5) Io credo che le coscienze del nostro paese oggi sarebberoCOND in piazza. (C.438.253)
'I believe that our country's consciousness today would be protesting.'
By adopting this method, we were able to account for the envelope of variation and to objectively identify the contexts that triggered the subjunctive in discourse. Whether the use of the subjunctive carries semantic meaning or not will be objectively assessed against a
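The two-step, corpus-internal circumscription described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the extraction procedure actually used for the study: the clause list is hypothetical, and the function name `circumscribe` and the mood labels are invented for the example.

```python
# Hypothetical coded clauses (governor, mood of the embedded verb);
# NOT data from the corpora used in the study.
clauses = [
    ("credere", "SUBJ"),
    ("credere", "IND"),
    ("credere", "COND"),
    ("vedere", "IND"),
    ("ricordare", "IND"),
    ("sperare", "SUBJ"),
]

def circumscribe(clauses):
    """Return (governors, variable_context) following the two steps in the text."""
    # Step 1: governors that triggered the subjunctive at least once.
    governors = {gov for gov, mood in clauses if mood == "SUBJ"}
    # Step 2: every clause under those governors, whichever variant was chosen.
    context = [(gov, mood) for gov, mood in clauses if gov in governors]
    return governors, context

governors, context = circumscribe(clauses)
print(sorted(governors))
print(context)
```

Governors that never co-occur with the subjunctive in the data (here the hypothetical `vedere` and `ricordare`) fall outside the variable context, while all variants under the retained governors are kept, including the indicative and the conditional.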

THE DATA
The data on which the current study is based were extracted from two corpora of spoken [...] Only recordings of spontaneous conversations in naturalistic settings were retained for the study; therefore, those that occurred in non-spontaneous/non-naturalistic contexts (e.g., news broadcasts, scientific press conferences, etc., all of which are scripted) were [...] disproportionately represented, i.e., the informal data from C-ORAL was mainly collected in Florence and surrounding areas; LIP provided recordings in the four main urban centres listed above, although no socio-demographic information was collected, and it is uncertain whether the speakers were bona fide members of the targeted speech communities. There is considerable variation in the length of recordings within and between corpora, and there is relatively little speech from a large number of participants. Only a portion of the data is accompanied by socio-demographic information, mainly the C-ORAL data, and overall the data is skewed towards speakers with a high level of education. Despite these limitations, these corpora are essential tools for the quantitative analysis of naturalistic speech. Both corpora represent spontaneous speech fairly, allowing us to investigate a number of external and internal factors which have never been systematically examined under an accountable empirical methodology. Moreover, the quantitative analysis will focus on both rates and conditioning. Rates of the subjunctive could indeed be affected by an overrepresentation of a highly educated population, particularly given the prestige attached to the use of the subjunctive, although its distribution across contexts is more likely to remain consistent. In fact, while rates can vary due to extralinguistic factors, the conditioning (i.e., the underlying grammar) is expected to reflect more stable constraints on variation (POPLACK; TAGLIAMONTE, 2001; see also POPLACK; LEVEY, 2010).
All the retained data were concordanced, and every unambiguous occurrence of subjunctive morphology was identified and retained for analysis.

OPERATIONALIZING HYPOTHESES
By adopting the coding protocol outlined in previous variationist studies on mood variation (e.g., POPLACK et al., 2013; POPLACK et al., 2018), we propose falsifiable criteria to assess hypotheses regarding the internal and external conditioning of the subjunctive extrapolated from the relevant literature, particularly the issue of a meaning-based alternation with the indicative, its productivity in speech and its stylistic conditioning. The hypotheses are operationalized and tested in a quantitative manner, shedding light on the constraints operating on the selection of the subjunctive in the variable context objectively defined above.
All tokens of the variable were coded according to several internal and external factors said to influence the choice of the subjunctive in discourse by drawing on the protocols outlined in previous variationist studies (POPLACK et al., 2013; POPLACK et al., 2018).
Semantic class of the governor. A recurrent hypothesis is that what triggers the selection of the subjunctive in completive clauses is the nature of the governor.
However, the task of coding each governor according to a given semantic class was an arduous one, since an exhaustive agreed-upon list of verbs for each semantic class is not available and there is no inter-analyst agreement on exactly which semantic classes or meanings should select the subjunctive. This study adopts the semantic classification suggested in previous variationist research (POPLACK, 1992; POPLACK; LEALESS; DION, 2013) and quantitative Italian research (e.g., BONOMI, 1993; LOMBARDI VALLAURI, 2003), and tests the main categories reported in normative treatments of subjunctive usage. We tested the assumption that the subjunctive is supposed to be triggered categorically by verbs conveying meanings of emotion (8a), volition (8b) and necessity (8c), variably with opinion (8d) and evaluative (8e) verbs, and disfavoured with verbs of communication (8f). It is important to note that in the classification adopted in the current research, opinion verbs denote a subjective or epistemic meaning while evaluative verbs rather denote an evaluative attitude and assessment of an event (e.g., verificare "to verify", reputare "to deem").
(8) [...]
e. [...] 'I calculated that Metaponto, which is forty kilometers away, had a nice sea.'
f. Non si può dire che nel terzo mondo sianoSUBJ cattivi. (L.412.140)
'You cannot say that in the third world they're all mean.'
Sentence type. Previous variationist research operationalized an approximate test of assertion vis-à-vis the predication, and this was done independently of the semantics conveyed by the governor. Therefore, negative (9a) and interrogative (9b) sentences were coded as less assertive contexts than their affirmative (9c) counterpart. If the subjunctive is associated with a less assertive reading, we can predict that negative and interrogative contexts would favour its selection.
(9) [...]
c. Fra l'altro, a quell'epoca, credo che la siaSUBJ stata la meglio ditta di tutta l'Italia. (C.322.151)
'Also, at that time, I believe that it was the best firm in all of Italy.'
Presence of other indicators of non-factual modality. Beyond whatever meaning is supposedly embodied by the subjunctive morphology, we assess the role of the presence of elements in discourse that could contribute to a non-factual reading of the proposition.
Every token was coded according to the presence of explicit cues in running discourse indicating a non-factual reading, i.e., a less assertive or uncertain predication, signalled by adverbs, modals or expressions that help to objectively establish a doubtful interpretation of the proposition expressed. Tokens were coded according to the presence (10a) or absence (10b) of such indicators. The prediction is that the subjunctive is favoured in contexts objectively denoting uncertain/less-certain predication.
(10) a. Magari ci poteva stare che io andassiSUBJ a cambiarli! (C.224.61)
'Maybe it made sense that I could go to exchange them.'
b. Voglio stare bene con lui e spero che siaSUBJ per tutta la vita? (C.511.159)
'I want to be happy with him and I hope that it will be for a lifetime.'
Lexical identity of the governor. Each token was coded according to the lexical identity of the governor that triggered the subjunctive, to empirically ascertain the contribution of the governors to subjunctive selection. If the lexical identity of the governor contributes to variability independently of any meaning to be expressed, we can take it as an indication that the subjunctive may not be productive. We can also measure productivity in terms of the number of governors that triggered the subjunctive in discourse and how they correlate or not with other internal and/or external factors. To establish that the subjunctive is productive, we should observe that it is triggered by a great number of governors and that it makes a semantic contribution, and we should note a consistent effect when the governor is cross-tabulated with another factor group, e.g., semantic class of the governor.
Lexical identity and morphology of the embedded verb. Likewise, each token was coded according to the lexical identity of the embedded verb as well as its type of morphology. As argued for the lexical identity of the governor, to deem the subjunctive productive, we should detect a high number of embedded verbs carrying subjunctive morphology. Moreover, the type of morphology enabled us to ascertain whether speakers prefer more transparent and frequently occurring indicative morphology rather than dealing with the subjunctive.
Furthermore, if the subjunctive is not semantically motivated but instead lexicalized in discourse, we can predict that it would be favoured with more frequent and irregular forms, since these forms are said to resist change and are more amenable to entrenchment in discourse (BYBEE, 1985, 2007; THOMPSON, 1997). Tokens were coded according to whether their morphological form was regular (11a), irregular (11b), or suppletive (11c).
(11) a. [...]
'But it's inevitable that one gives a bit of oneself - on the contrary gives it their all.'
b. [...]
'If someone has nothing to do, it is also good that he be distracted.'
c. Comunque io penso che il grigio ghiaccio siaSUBJ (vs. èIND) bellino. (COR.6.215)
'Anyway I think that the light gray is also cute.'

OVERALL RESULTS
The extraction method applied here yielded a dataset of 1713 tokens, of which 3% (N=50) were marked with the conditional. Due to its relative rarity, the conditional variant was excluded from the analysis, resulting in the overall distribution of the two competing variants: subjunctive and indicative, shown in Figure 2. Overall results show robust variability, with the subjunctive being selected more than two-thirds of the time, appearing quite productive, at least superficially. However, overall rates can be misleading and may hide internal conditioning that is indiscernible from the inspection of surface forms alone.
We can now turn to the main question: what constrains the observed variability? This study reports the results concerning the factors operationalized to test hypotheses that a meaning-based use of the subjunctive in discourse is at work: the semantic class of the governor, the sentence type, and the presence of indicators of non-factual modality. The effect of a given factor is inferred by comparing its individual rate of subjunctive to the overall rate for the pooled data (cf. Figure 2, overall subjunctive rate of 68%). [...] applied when deeming the effect disfavouring (rate of subjunctive selection of a given factor lower than the overall rate). A neutral effect is deemed when there is no substantial difference between the rate of subjunctive of a given factor and the overall rate.
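The comparison of a factor's rate with the overall rate can be sketched as a small Python function. The overall rate of 68% is taken from the text; the margin that separates a "neutral" from a "favouring" or "disfavouring" effect is an assumption made for this sketch (the text speaks only of "no substantial difference"), as are the names `classify_effect` and `NEUTRAL_MARGIN`. The example rates are those reported for three semantic classes in the semantic tests section.

```python
OVERALL_RATE = 68    # overall subjunctive rate in %, as reported in the text
NEUTRAL_MARGIN = 2   # ASSUMPTION: a gap of up to 2 points counts as "neutral"

def classify_effect(rate, overall=OVERALL_RATE, margin=NEUTRAL_MARGIN):
    """Label a factor by comparing its subjunctive rate (%) with the overall rate."""
    diff = rate - overall
    if diff > margin:
        return "favouring"
    if diff < -margin:
        return "disfavouring"
    return "neutral"

# Rates reported for three semantic classes of governors.
reported = {"evaluative": 71, "opinion": 66, "communicative": 19}
for factor, rate in reported.items():
    print(factor, rate, classify_effect(rate))
```

With the assumed margin, the labels match the descriptions given for these classes (evaluative verbs favour, opinion verbs have no effect, communicative verbs disfavour); a different margin would shift the boundary cases, which is why the threshold is flagged as an assumption.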

SEMANTIC TESTS
According to recurrent claims in the literature, the semantic classes of emotive, volitive and necessity verbs, which are categories usually indicating a modal interpretation related to permission, necessity, desire, or obligation, are predicted to favour the subjunctive categorically. On the other hand, the subjunctive should occur variably with evaluative and opinion verbs, and be disfavoured with communicative verbs. Importantly, for every overall effect noticed, we expect to observe a consistent effect amongst the members of the semantic class to deem the effect genuine, i.e., no one member of any given semantic class can determine the effect of the entire class. Results in Table 2 show that some semantic classes favour the subjunctive more than others. All three classes of stronger modal determination, i.e., emotive (12), volitive (13) and necessity (14) governors, favour the subjunctive and select it most of the time, which is in line with theoretical and prescriptive assumptions.
(12) [...]
(13) [...]
(14) [...] 'Let me say first that a good craftsman needs to know lots of things.'
However, we do not observe the categorical selection of the subjunctive mood often assumed for such classes of verbs. Moreover, variability characterizes every semantic class, although to different degrees. The semantic class of evaluative verbs slightly favours the subjunctive (71%); opinion verbs have no effect on the selection of the subjunctive (66%); communicative verbs highly disfavour the selection of the subjunctive (19%). As noted above, we expect every member of a given semantic class to behave the same way to ascertain that the effect observed is genuine. However, we do not observe such consistent co-occurrence with the subjunctive within semantic classes. We observe, on the other hand, considerable discrepancies between the rate of the subjunctive for a semantic class and the rates of selection for each of its members. The class of opinion verbs is the most populated in terms of data and lexical types of governors. Eighty verbs were coded as verbs of opinion, although only four of them account for more than half (64%) of all the data within this semantic category. These four verbs show different patterns (Figure 2): credere 'to believe' (76%) and sembrare 'it seems' (74%) favour the subjunctive; pensare 'to think' (68%) has no effect; non è 'it is not' (32%) highly disfavours the subjunctive. The same can be said for all other governors coded as opinion verbs, as summarized in Figure 2 above. The observation is that not all opinion governors share the same pattern or direction of effect.
The rate of the subjunctive observed in this semantic class ranges from 14% to 100% (27 of these governors are singletons). Moreover, many governors show an exceptionally low or singleton token count, which precludes any substantial conclusion for those contexts. The semantic class of evaluative verbs contains little data dispersed across a few governors and no consistent effect is observed (Figure 3).
[Figure 3, labels only: non è da dire "it must not be said", dire "to say", essere convinto "to be certain", non sapere "not to know", immaginare "to imagine", essere sicuro "to be sure", 27 singletons, può darsi "it might be", ritenere "to consider", parere "it seems", sembrare "it seems", non è "it is not", 42 infrequent governors (2-10 tkn/gov), pensare "to think", credere "to believe", reputare "to deem", prevedere "to predict", calcolare "to calculate", verificare "to verify", stabilire "to establish", controllare "to control", ammettere "to admit"; class label: Communicative.]
[...] We observed that 64% of the data in this category (N=60/69) belong to one governor, bisognare 'it is necessary', while the only other governor of moderate frequency disfavours the subjunctive (bastare 'to suffice; ≈ as long as', 54%). If we exclude bisognare from the factor group, the favouring effect of necessity verbs is nullified, since the overall rate drops to 62% (N=21/34).
Summarizing, we observed the dominance of a single governor and its dissimilarity from the patterns attested for other members of this semantic class. Hence, subjunctive selection here reflects a clear lexical effect rather than a genuine semantic effect. The only two semantic classes showing a more consistent pattern overall are volitive and emotive verbs. However, as shown in Figure 4, both classes display a fair number of singletons and highly infrequent governors, weakening the observed overall effect. In other words, the overall rates may be inflated by the presence of a few governors with a low token count rather than reflecting a true semantic effect. Moreover, with regard to the semantic class of volitive verbs, [...]
The assertive context of affirmative sentences slightly favours the subjunctive, while interrogative contexts have no effect and negative sentences disfavour the selection of the subjunctive. Further analysis showed several interesting results. Cross-tabulation of sentence type and lexical identity of the governor (Table 4) showed that 1) the effect of sentence type was not consistent across all governors that triggered the subjunctive, and 2) where cells were populated with data, no effect of negation was observed. It should be noted that the dispersion of data among dozens of different governors resulted in either empty cells or very low token counts in interrogative contexts, which made it impossible for us to ascertain how interrogative clauses affected mood choice, and they were therefore excluded. If we consider the most frequent governors in the data set, as shown in Table 4, chi-square tests confirm that there is no significant difference between affirmative and negative contexts; some governors (bisognare, sperare and può darsi) only occur in affirmative contexts and non è necessarily occurs only in negative clauses, thus impeding the test of the non-assertive hypothesis.
More interestingly, a single governor, non è, is responsible for the overall disfavouring effect of negation observed above in Table 4, since it accounts for 55% of all the negative contexts (N=168/303). If we exclude tokens of the outlier non è when calculating the overall rate of subjunctive in non-affirmative contexts, we observe no difference between affirmative and non-affirmative sentence types (Figure 5 below). The evidence suggests that the hypothesis according to which less assertive contexts should favour the subjunctive does not hold true in speech.
We also assessed the doubtful nature of the proposition by coding for the presence of explicit and objectively identified indicators of non-factual modality in the ambient discourse. Such elements, sometimes referred to as subjunctive triggers (e.g., epistemic adverbs, the presence of a modal, or a main clause in the future, conditional, or subjunctive), may be predicted to favour the use of the subjunctive. Overall, results show that the absence of indicators has no effect, while the presence of indicators favours the selection of the subjunctive.
Overall, factors designed to capture a doubtful reading of the proposition through the presence of explicit cues in running discourse do not have consistent effects on the selection of the subjunctive. If semantics were genuinely an explanatory factor of subjunctive selection in completive clauses, we should not have observed such inconsistencies both across and within the factor groups tested. The exception seems to concern the contexts of emotive and volitive verbs, though they still account for a small portion of the dataset, and the low number of tokens, particularly for emotive verbs, suggests that we are not observing a genuine effect of semantics.

LEXICALIZED USE OF THE SUBJUNCTIVE IN DISCOURSE
The factor groups designed to assess the semantic contribution to the selection of the subjunctive failed to reveal a meaning-based alternation in discourse. The results presented above demonstrate how certain governors play a key role in the overall outcomes observed. A total of 140 governors was extracted from the data. Such a high number may suggest a productive use of the subjunctive in discourse. However, although a relatively high number of governors in Italian discourse selected a subjunctive at least once, more than a third (38% of the governor pool, N=53/140) are singletons, and 3% of the governor pool (4 verbs) account for nearly half of the data (46% of all tokens, N=769/1663). One might suppose that the frequency of verb use in discourse explains this pattern. That would imply that the more frequent the governor, the higher the rate of the subjunctive, which is not the case. Figure 6 shows that there is no correlation between governor frequency and rate of subjunctive: the top four governors, the highly frequent credere 'to believe', sembrare 'it seems', pensare 'to think' and non è 'it is not', do not behave the same way. The same can be said of medium-frequency governors, whose rates of subjunctive selection range from 18% to 100%, highlighting idiosyncratic inconsistencies and the absence of an apparent trend concerning frequency.
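
The concentration figures in this paragraph can be re-derived from the counts it reports. The sketch below is only an arithmetic check, not part of the study's analysis: the variable names are illustrative, and the two mean token counts are derived here from the reported totals rather than taken from the text.

```python
# Counts reported in the paragraph above; nothing is re-derived from the corpus.
TOTAL_GOVERNORS = 140
SINGLETON_GOVERNORS = 53
TOP_GOVERNORS = 4
TOTAL_TOKENS = 1663
TOP_GOVERNOR_TOKENS = 769


def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


singleton_share = share(SINGLETON_GOVERNORS, TOTAL_GOVERNORS)  # share of governor types
top_type_share = share(TOP_GOVERNORS, TOTAL_GOVERNORS)         # share of governor types
top_token_share = share(TOP_GOVERNOR_TOKENS, TOTAL_TOKENS)     # share of tokens

# Derived illustration (not a figure from the text): mean tokens per governor
# for the top four governors versus the remaining 136.
top_mean = TOP_GOVERNOR_TOKENS / TOP_GOVERNORS
rest_mean = (TOTAL_TOKENS - TOP_GOVERNOR_TOKENS) / (TOTAL_GOVERNORS - TOP_GOVERNORS)

print(singleton_share, top_type_share, top_token_share)
print(round(top_mean, 2), round(rest_mean, 2))
```

Contrasting type shares (governors) with token shares (occurrences) is what makes the skew visible: a large pool of governor types can coexist with a token distribution dominated by a handful of verbs.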

Highly frequent governors
Each governor has the potential to trigger subjunctive morphology in completive clauses, though this option seems to fall under the purview of only three lexical governors.
In addition to the role of the governor, we notice striking results with regard to the lexical identity of the embedded verb, i.e., the verb that carries subjunctive morphology.
Given the high number of governors in running discourse, we might expect a high number of embedded verbs to receive subjunctive morphology. Moreover, every verb in the language is theoretically eligible to carry subjunctive morphology. However, when we examine the results for the lexical identity of the embedded verb, the effects observed are quite dramatic. Results displayed in Figure 7 show that only three verbs favour the selection of the subjunctive: essere 'to be' (73%), andare 'to go' (71%) and sapere 'to know' (81%). As observed for the governors, we do not detect a consistent trend across the lexical identities of the embedded verbs. Figure 7 shows that other relatively frequent verbs either slightly disfavour the subjunctive, as is the case for the second most frequent verb avere 'to have' (65%), or have no effect, e.g., potere 'can' (67%), which rules out frequency as an explanatory factor for the variation observed. We also investigated whether the type of morphology of the embedded verb affects variant selection and found that suppletive morphological forms favoured the selection of subjunctive forms (Table 6). Further examination showed that the lexical identity of the verb overrides the effect of morphological form, since most suppletive forms (81%, N=560/591) belong to one lexical type, essere. The effect of essere in its suppletive forms confirms the restriction of the subjunctive mood to this specific lexical identity and hints at a lexicalized use in discourse. When we cross-tabulate the lexical identity and morphology of the embedded verb with the governors, we note a substantially higher rate of subjunctive for almost all the frequent governors (with 50+ tokens) with suppletive essere than with other lexical identities and morphological forms (Figure 8). Rates of subjunctive rise sharply with suppletive essere morphology, for instance with pensare (+24%), non è (+23%) and credere (+22%).
Verbs such as volere and bisognare, which already show high rates of subjunctive selection, jump to 100% when the embedded verb is suppletive essere. In sum, we observe an overall high propensity to trigger the subjunctive with the very salient suppletive forms of essere. Taken together, these results detract from the assumption that the variable use of the subjunctive in completive clauses in Italian makes a semantic contribution and indicate instead a routinized use in discourse. Moreover, on the surface, the use of the subjunctive seems productive if we consider the contribution of the governors, but this productivity clearly does not extend to the embedded verbs. To further investigate these contrasts and examine the apparent productivity of the subjunctive in running discourse, we examined whether social factors contribute to subjunctive variability.

THE SOCIAL CONDITIONING
Both speech style and level of education showed interesting results and enabled us to account for the apparent productivity and the lexical pattern reported above. The link between the use of the subjunctive and more formal contexts has been widely reported and is well documented, though a systematic comparison of the pattern across categories has never been carried out. Cross-tabulation of the lexical identity of the governor and speech style reveals some interesting results. First, the totals show an overall rate of subjunctive selection that is higher in more careful speech contexts (71%, N=763/1082) than in more casual conversations (65%, N=376/581), lending support to the hypothesis that the subjunctive is indeed favoured in more formal contexts. Second, if we consider casual speech data only, credere, pensare, sembrare, and bisognare favour the subjunctive, but they also account for 32% of the casual speech data (N=184/581) and almost half of all the subjunctives used in informal situations (49%, N=184/376, cf. total subjunctives in casual speech in Table 7). This is an important result concerning the question of productivity, since it shows an even less productive use of the subjunctive in casual speech. Moreover, the highly frequent governor non è constitutes a large portion of the casual speech subsample and, since it strongly disfavours the subjunctive, it lowers the overall rate of subjunctive selection in casual speech.
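
For readers who wish to see how a chi-square comparison of this kind works, the sketch below applies a Pearson chi-square test to the overall counts by speech style given above (763/1082 in careful speech, 376/581 in casual speech). It is a minimal illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, no continuity correction is applied, and the per-governor tests reported for Table 7 are not reproduced because their cell counts are not given here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Counts from the text: subjunctive tokens out of all tokens per speech style.
careful_subj, careful_total = 763, 1082
casual_subj, casual_total = 376, 581

statistic, p_value = chi_square_2x2(
    careful_subj, careful_total - careful_subj,  # careful: subjunctive, indicative
    casual_subj, casual_total - casual_subj,     # casual: subjunctive, indicative
)
print(f"chi2 = {statistic:.2f}, p = {p_value:.4f}")
```

The resulting p-value can be set against the p < .01 threshold used in this study. The same function could be applied governor by governor wherever all four cell counts are available.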
Another important result is that, despite the fluctuations in the effect of speech style by governor, chi-square tests indicated no statistically significant difference at p < .01, whether the lexical types listed in Table 7 above are used in casual or careful speech. On the other hand, results for the two categories of infrequent and singleton governors highlighted a substantial difference according to speech style. First, we observed a greater number of singletons in careful speech. Second, the overall rate of subjunctive selection for the infrequent governors is considerably higher in more careful speech, and this difference is statistically significant at p < .01 (p-value=.008612). Third, not only is the rate of subjunctive selection higher for infrequent governors in more careful speech, but the number of subjunctives used is almost three times greater in the least 'natural' way of speaking. These results suggest that speakers may be more sensitive to the overt prestige that the use of the subjunctive in discourse has acquired in contemporary Italian, triggering it more in formal than in casual conversations. It is now clear that the apparent productivity of the subjunctive in completive clauses is attributable to highly infrequent and singleton governors surfacing more frequently in careful speech.
One might think that a more careful speech calls for a richer vocabulary and that lexical diversity is a logical consequence of the several speech contexts instantiated in the formal subsample of the data. The formal subsamples were recorded in a relatively wide variety of contexts, e.g., political debates, television interviews, etc. Each context focusses on a different topic, though they were all characterized by a more careful speech and therefore possibly more complex and florid argumentation.
Notwithstanding these considerations, we should bear in mind that most of the variation is not accounted for by a stylistic difference, since the bulk of the governors behave in the same way regardless of speech style. Further investigation showed an interesting correlation between the speaker's level of education and speech style. It is clear from Table 8 that highly educated speakers contribute extensively to the pool of governors with a rich set of lexical types, most of which are infrequent or singleton governors. In other words, these speakers are responsible for the apparent productivity of the subjunctive observed on the surface. This result is consistent with the observation that the subjunctive is a sociolinguistic stereotype in contemporary Italian society. We would expect that if there is one segment of the population that is extremely sensitive to this highly normative and salient feature, it is the one with a higher level of education. If the subjunctive were truly productive, as reported elsewhere or as it misleadingly appears to be on the surface, we should have observed more of it in the vernacular of speakers regardless of their level of education. On the contrary, Table 8 shows that speakers with little or no formal education do not use the subjunctive in casual conversation to the extent that highly educated speakers do. We also notice a substantial difference in the governor pool by level of education: highly educated speakers alone provide the rich set of governors observed in our analysis. Despite the general preference to use the subjunctive more in careful speech, speakers with a higher level of education contribute a great number of governors even in their casual speech.
These observations suggest that speakers, mainly with a higher level of formal education, make the effort to use the subjunctive, and therefore to convey the linguistic prestige that this morphology has gained in contemporary Italian society.

DISCUSSION
By making use of the standard variationist methodology, we empirically evaluated several hypotheses regarding the nature of the linguistic and social factors conditioning the choice of the subjunctive in contemporary Italian discourse. We were able to address three main issues characterising contemporary accounts and ideology of the Italian subjunctive, i.e., its supposed semantic contribution, its social conditioning, and its status in terms of productivity. Results do not lend support to one of the major assumptions prevailing in the literature and in normative accounts of the subjunctive, i.e., the semantic/pragmatic motivation underlying the choice of the subjunctive mood in embedded completive clauses.
On the contrary, objective and independent tests showed that rates of occurrence are [...]
The subjunctive looks very productive on the surface, which apparently mitigates this lexicalization. Its overall rate is fairly high (68%), and many verbs triggered it in discourse (N=140). By means of systematic comparisons, we were able to pinpoint the nature of this apparent productivity: the overt prestige of subjunctive morphology in contemporary Italian speech. We first observed an apparent effect of style: overall rates of the subjunctive were higher in careful speech (71%) than in casual speech (65%), but the shared core set of governors showed similar behaviour across the two speech styles, and chi-square tests showed no significant difference. More strikingly, the high saliency of the subjunctive is reflected in the strong tendency of highly educated speakers to make use of the subjunctive in embedded completive clauses, not only in careful speech but in casual speech as well. This might be explained by the massive effort of the prescriptive enterprise invested in making the subjunctive a hallmark of bon usage. Subjunctive morphology is indeed very salient (DELLA VALLE; PATOTA, 2014; STEWART, 2002), and its non-standard use often provokes vitriolic reactions from writers, journalists, teachers, intellectuals and the general public as well. The overt prestige associated with the use of the (standard) subjunctive is often lamented, to the extent that even some linguists claim that its existence is threatened by the rising use of the indicative, and its avoidance is often condemned as a feature of popular, uneducated, or careless speech (FOCHI, 1956; GONZÁLEZ DE SANDE, 2004; SCHMITT JENSEN, 1970; SIMONE, 1993). Our results showed that the subjunctive performs a social function.
The great contrast observed between highly educated speakers and those with little or no formal education in our dataset confirmed that the overt prestige of subjunctive morphology and, more importantly, its apparent productivity are due to level of education rather than to genuine language-internal productivity. If the subjunctive were truly productive in speech, we would expect it to arise independently of education. In other words, speakers would not need formal education to learn it.

CONCLUSIONS
The findings of this study contribute to our understanding of the variability in mood selection in the context of completive clauses and further highlight the importance of adopting empirical quantitative methods for examining linguistic variation. Our results are further evidence that quantitative patterns of co-occurrence of variants are indiscernible on the surface and are only accessible through systematic examination of conditioning contexts. The discrepancies between the findings reported in the current study and the claims in the literature, as well as the results diverging from previous sociolinguistic accounts of the Italian subjunctive, are due to several reasons. To begin, we must take into account the inherent variability of speech, which arises from form-function asymmetries in language rather than from a simple matching of a meaning to a form. The distinctions suggested by linguistic theory, as well as prescriptive dictates on how a given linguistic feature is conditioned, are often neutralised in discourse. Secondly, we adhered to an envelope of variation based on a variable context defined corpus-internally, as opposed to the standard ideology of how the subjunctive is supposed to function in discourse. Dismissing whatever diverges from the idealized standard language can lead to erroneous conclusions based on only a handful of possibly idiosyncratic occurrences, and thereby to losing sight of the pattern regulating variability. Besides, the chaotic state of grammatical prescription for the subjunctive, as also reported by a few metalinguistic analyses of the subjunctive (see DIGESTO, 2019, ch. 2; POPLACK et al., 2015; LEALESS; DION, 2013), suggests that it is virtually impossible to rely on normative expectations in order to identify the appropriate set of governors and meanings to define the variable context.
In fact, despite relatively high rates of subjunctive selection, the variability observed and reported above characterizes emotive, necessity and volitive matrices as well. Even necessity, a wellspring of subjunctive use, shows a clear lexical effect in our dataset, highlighting the key contribution of the lexical identity of the governors, rather than the semantics underlying each governor, and reinforcing the importance of conducting systematic multifactorial analysis of the variables likely to affect variant choice. The structure of variation is invisible to any but systematic and exhaustive quantitative analysis. Finally, our systematic analysis is based on natural production data. The use of speaker or analyst intuition, as well as the use of written, often highly formal, datasets prevents the analyst from accounting for actual linguistic performance and for the inherent social structure of variability, understood as the set of implicit rules on variant choice which constitute the longstanding community norm.