The Use of the Cloze Test in Reading Comprehension Assessment in Brazil: Post-Pandemic Challenges

Flávia Oliveira Freitas; Gislane Evangelista dos Santos; Raquel Meister Ko Freitag

doi:10.25189/2675-4916.2025.v6.n2.id787

Literature Review

Flávia Oliveira Freitas Universidade Federal de Sergipe https://orcid.org/0009-0008-1795-6822
Gislane Evangelista dos Santos Universidade Federal de Sergipe https://orcid.org/0000-0002-5232-3405
Raquel Meister Ko Freitag Universidade Federal de Sergipe https://orcid.org/0000-0002-4972-4320

Abstract

The cloze test has been widely used for 70 years to assess proficiency in text comprehension in different languages, for both L1 and L2. The test is based on the systematic deletion of words from a text, and the scores for correct answers are associated with the participant's degree of comprehension. In Brazil, the cloze procedure has been productive as one of the tools for measuring reading comprehension in L1. It makes it possible to distinguish reader profiles (frustrated, instructional, and independent) by examining participants' performance in filling in the gaps. The survey by Almeida (2023) shows that this procedure was adopted in 75 of the 345 studies on reading assessment over the last 20 years. In the post-pandemic period, when reading asymmetries are even greater, adopting the exact-word criterion does not allow an appropriate diagnosis: post-pandemic applications carried out at the Colégio de Aplicação of the Universidade Federal de Sergipe in April 2023 showed that there are students concentrated at the level of insufficient comprehension (Machado, Santos, Cruz, 2019; Santos, Machado, 2022). This article explores the measures used to evaluate gap-filling both through exact answers and through answers appropriate to the context, which demonstrate students' comprehension of the text through the use of function or lexical words. The criteria for analyzing these responses are based on Taylor's (1953) initial proposal, such as exact answer, acceptable answer, multiple choice, and clozentropy (Brown, 1980, 2013), although only the first two are considered here, with a focus on the Brazilian scenario. Accordingly, this article carries out a systematic review whose method considers empirical studies on reading assessment with the cloze test, using the Portal de Periódicos Capes virtual library as the search tool.
We found that most studies adopt exact-word filling as the measurement criterion (Joly; Istome, 2008), and a few consider the familiarity (Oliveira et al., 2007) or grammatical class (Abreu et al., 2017) of the word. We point to the need to adopt natural language processing techniques, with lexical distance measures, which can help diagnose how much comprehension exists in low-reading scenarios. These distances can be found through a “minimum number of insertions, deletions, or substitutions of a single character needed to transform one word into the other” (Petroni; Serva, 2010). They will be measured following the same grammatical class or the same semantic field as the expected word.

Lay Summary

The cloze test has been in use for 70 years to assess reading comprehension in several languages. It involves removing words from a text and analyzing how well participants fill in the gaps, which helps identify their reading proficiency levels. In the post-pandemic period, reading difficulties increased, making traditional exact word matching methods insufficient for accurate assessment. A study conducted at the Colégio de Aplicação of the Federal University of Sergipe confirmed that many students struggled with comprehension. This article systematically reviews studies from 2009 to 2022 that have applied the traditional cloze test to elementary school students. The findings show that most studies rely on exact word matches, including computerized versions, rather than considering alternatives such as word familiarity or grammatical class. This research highlights the need for advanced assessment methods, such as natural language processing, which can analyze responses based on lexical distance—the number of small changes needed to transform one word into another while maintaining meaning. Incorporating natural language processing allows educators to understand students’ reading abilities, leading to improved teaching strategies and literacy development.

Introduction

The cloze test, also known as the cloze procedure, has been widely used to assess reading proficiency in first- and second-language contexts. Initially developed by Taylor in 1953, the cloze test was designed to evaluate the readability of texts for native English speakers. Researchers have since expanded its use by adopting it as an instrument to measure proficiency in English as a foreign language (Brown, 1980, 2002). It has also been integrated into exams such as the Test of English as a Foreign Language (TOEFL) and Teaching English to Speakers of Other Languages (TESOL) assessments.

The cloze test is currently used to evaluate how well a subject comprehends the meaning of a given text. In this procedure, the subject is required to read a text and then fill in blanks with a single word, which could be functional (e.g., prepositions, articles, conjunctions), lexical (e.g., nouns, adverbs, adjectives), or even a random term (Abreu et al., 2017; Cardoso, Menezes Freitas and Freitag, 2024; Colombo and Cárnio, 2017). The primary goal of this test is to allow participants to complete a passage by supplying omitted words, thereby obtaining meaning. The foundation of the cloze test lies in the systematic omission of words from a passage of prose (Bickley et al. 1970), with participants' scores reflecting their level of comprehension. In Brazil, the cloze procedure has been effectively utilized as a key tool for assessing L1 reading comprehension.

The cloze test can be a valuable tool for teachers and researchers to identify and distinguish three reader profiles (frustrated, instructional, and independent) based on performance (Bormuth, 1968). The frustrated level refers to participants who achieve up to 44% correct literal answers, the instructional level applies to participants who score between 45% and 57%, and the independent level is reached by those who score more than 57%, with correctness judged against the original passage of the text.

These levels are applied to the individual performance of each cloze test-taker. Once a participant has completed the test within the stipulated time, the number of correct responses is counted, multiplied by 100, and divided by the total number of deleted words (Cardoso et al. in press).
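As a rough illustration of the calculation and the three-level classification described above, the following Python sketch computes a percentage score and maps it to a reader level. The function names are hypothetical, and the treatment of scores that fall between 44% and 45% (counted here as frustrated) is an assumption, since the cut-offs are reported as whole percentages.

```python
def cloze_score(correct: int, total_gaps: int) -> float:
    """Percentage of gaps filled correctly: correct * 100 / total gaps."""
    if total_gaps <= 0:
        raise ValueError("total_gaps must be positive")
    if not 0 <= correct <= total_gaps:
        raise ValueError("correct must be between 0 and total_gaps")
    return correct * 100 / total_gaps


def reader_level(score: float) -> str:
    """Map a percentage score to the three levels attributed to Bormuth (1968)."""
    if score > 57:
        return "independent"
    if score >= 45:
        return "instructional"
    # Assumption: anything below 45% is treated as the frustrated level.
    return "frustrated"


score = cloze_score(correct=20, total_gaps=40)
print(score, reader_level(score))
```

In this example, 20 exact answers out of 40 gaps give a score of 50%, which falls in the instructional range.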

This study investigates methods to evaluate gap-filling through exact answers (EX) (Brown, 1980), which are considered the most appropriate to the context, thereby demonstrating students’ comprehension of the text. The criteria for analyzing these responses were based on Taylor’s initial proposal (1953), which prioritizes the exact answer. Although alternative methods exist, such as multiple-choice and Clozentropy (Brown, 1980; Darnell, 1968), this study focuses on the exact-answer method within the Brazilian context, where teachers must check their students’ responses manually and exact matching makes that process less time-consuming.

The primary objective of this systematic review was to identify the most validated cloze test assessment measures through empirical studies and published research, with a particular focus on Elementary School students. Almeida (2023) reported that the cloze procedure appeared in 75 of 345 studies on reading assessment over the past 29 years, highlighting the significance of investigating this procedure and its scoring methods. Three virtual databases (SciELO, Scopus, and Web of Science) were consulted for this study.

In the post-pandemic period, reading asymmetries have widened compared to the pre-pandemic era. Although the exact word criterion does not provide a fully adequate reading diagnostic, it remains the most appropriate available method. Post-pandemic applications conducted at Colégio de Aplicação of Universidade Federal de Sergipe revealed that some students were classified as having insufficient comprehension or frustration (Santos and Machado, 2022; Cardoso et al., 2024). These findings underscore the importance of encouraging the use of well-established tools such as the cloze test, which can aid education professionals in refining their practices and enhancing students’ reading proficiency levels.

The remainder of this paper is organized as follows. It begins with a brief introduction to the cloze procedure and the stages used to create this test. The second section provides a review of the relevant literature. Subsequently, this article outlines the methodology. The subsequent sections present the results, including the analysis of the papers selected for this study and a discussion supported by additional data. In the final section, the authors emphasize the applications of cloze tests, advocate for the broader dissemination of this practice across other regions of Brazil, and suggest the adoption of semantic similarity responses.

1. Background

In 1953, journalist Wilson Taylor developed the cloze test or cloze procedure to assess children’s reading comprehension and evaluate communication effectiveness in English (Brown, 2002; Bickley, Ellington and Bickley, 1970). Since then, various scientific fields, including psychology, phonology, and language teaching, have adopted it as a research method either independently or in conjunction with other assessment measurements.

The cloze procedure is grounded in the Gestalt Psychology principle of “closure”, which highlights the natural human inclination to complete patterns by filling in gaps (Taylor, 1957; Oller and Conrad, 1971). In language studies, this principle explains students' tendency to fill in the blanks of a written text, drawing on their ability to understand the material. This process involves various cognitive and linguistic factors, including comprehension, general language proficiency, vocabulary, learning ability, attention, motivation, and memory (Taylor, 1957; Morais and Kolinsky, 2015).

The cloze test typically uses a prose passage of 200 to 250 words (Bickley, Ellington and Bickley, 1970; Cardoso et al., 2024). The selected text may be informational, and the test designer can adapt the content from a school textbook or news report. Depending on the researcher’s objective, the text can be tailored to the social and economic background, age, and educational level of the target students. The passage may be taken from any section of a text (Bormuth, 1968), preserving its title and initial 15–17 words, which allows participants to read the introductory segment and contextualize the subject matter. Beginning from the sixteenth word, every fifth word is omitted, and the participant completes the text by filling in the blank spaces. This method helps students understand the task and restore the disrupted patterns (Oller and Conrad, 1971), thereby facilitating comprehension of the text.

A blank of uniform length is placed after every fourth word (Trassi et al., 2019), so that all blanks are the same size (Darnell, 1968; Brown, 1980). This standardization prevents participants from inferring the correct word from the size of the blank, which could skew responses toward longer or shorter answers. Consistent blank length minimizes such misleading cues, ensuring that the blank size does not influence the participant’s judgment of the most appropriate word.
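The construction steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, the intact introduction is fixed at 15 words, and the first deleted word is taken to be the fifth word after that introduction (so four words are kept before each blank). Trailing punctuation is kept next to the blank, which is also a choice made here rather than something the reviewed studies specify.

```python
def make_cloze(text, intro_words=15, step=5, blank="_____"):
    """Return (gapped_text, answer_key) for a fixed-ratio cloze passage."""
    words = text.split()
    gapped, answers = [], []
    for index, word in enumerate(words):
        position = index - intro_words  # 0-based position after the intact intro
        if position >= 0 and (position + 1) % step == 0:
            core = word.rstrip(".,;:!?")  # keep trailing punctuation visible
            answers.append(core)
            # Every blank has the same length, whatever the deleted word was.
            gapped.append(blank + word[len(core):])
        else:
            gapped.append(word)
    return " ".join(gapped), answers


# Hypothetical 30-word passage: w1 ... w30
passage = " ".join(f"w{i}" for i in range(1, 31))
gapped_text, answer_key = make_cloze(passage)
print(gapped_text)
print(answer_key)
```

With the 30-word placeholder passage, the first 19 words stay intact and the 20th, 25th, and 30th words are replaced by blanks of identical length.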

Taylor’s scoring criterion adheres to a binary correct/incorrect system, where participants must fill in the blanks with exact words from the original text (Bormuth, 1968). A variation of this approach offered two answer options per blank, with only one being correct. Any responses that deviated from the original wording were considered incorrect.

Brown (1980) identified four prominent cloze test scoring methods: (1) the exact word method (EX), which accepts only the original word from the text as correct; (2) the acceptable answer (AC), which allows any contextually appropriate word as the correct answer; (3) Clozentropy (CLNTZ; Darnell, 1968), a method that considers the frequency of native speaker responses in a pre-test and logarithmically weights acceptable answers; and (4) multiple-choice (MC), in which students are provided with alternative answers and must select the correct word for each blank. The following section outlines the methodology employed in this study.
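The difference between the first two scoring methods can be sketched as follows. The function names, the sample words, and the case-insensitive comparison are assumptions made for illustration; in practice, the list of acceptable alternatives for each gap would have to come from human judges, and this sketch does not attempt Clozentropy or multiple-choice scoring.

```python
def _norm(word):
    # Assumption: compare answers ignoring case and surrounding spaces.
    return word.strip().lower()


def score_exact(responses, answers):
    """EX: a response counts only if it matches the original word."""
    if len(responses) != len(answers):
        raise ValueError("one response per gap is required")
    return sum(_norm(r) == _norm(a) for r, a in zip(responses, answers))


def score_acceptable(responses, answers, acceptable):
    """AC: the original word or any judged-acceptable alternative counts.

    `acceptable` maps a gap index to a collection of alternative words.
    """
    if len(responses) != len(answers):
        raise ValueError("one response per gap is required")
    total = 0
    for i, (r, a) in enumerate(zip(responses, answers)):
        allowed = {_norm(a)} | {_norm(w) for w in acceptable.get(i, ())}
        total += _norm(r) in allowed
    return total


answers = ["house", "quickly", "the"]      # original words (hypothetical)
responses = ["home", "quickly", "a"]       # one student's answers
alternatives = {0: {"home", "dwelling"}}   # judged acceptable for gap 0

print(score_exact(responses, answers))
print(score_acceptable(responses, answers, alternatives))
```

Here the same set of responses earns one point under EX and two under AC, which shows how the choice of criterion alone can move a student's score.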

2. Method

The primary goal of this study was to identify the measurement procedures most commonly employed in Brazilian studies to assess gap-filling responses in cloze tests. The methodological approach to this study adheres to the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), as Galvão and Ricarte (2019) recommend.

The descriptors used for the database search were “test cloze” and “reading assessment”, combined with the Boolean operator AND. To validate these descriptors, the bibliometric tool Biblioshiny was used to generate a word cloud illustrating the frequency of terms in various cloze test studies. The search was conducted simultaneously and independently through the CAPES Journal Portal, which provides access to several databases. Based on the availability of articles related to the topic, the SciELO, Web of Science, and SCOPUS databases were selected as the primary databases for the literature search. The search focused on the title, abstract, and keywords, and only studies published as papers were considered.

The next step involved establishing inclusion and exclusion criteria. The inclusion criterion selected empirical studies that used the cloze test to assess reading ability among basic education students in Brazil. The exclusion criterion removed studies conducted with adults, graduate and undergraduate students, or atypical populations, such as children with dyslexia or Attention Deficit Hyperactivity Disorder (ADHD).

Two researchers independently reviewed all relevant papers on the topic, organizing the information in an Excel table to consolidate the data and facilitate content analysis. The corpus construction process was divided into three stages: 1) study identification, 2) screening, and 3) eligibility assessment. During the initial identification stage, 42 papers were retrieved: 27 from SciELO, three from Web of Science, and 12 from SCOPUS. The papers were screened for further evaluation.

The second stage of this review was screening, which determined the articles selected for this study. Applying the exclusion criteria to paper titles, keywords, and abstracts removed 15 texts, and an additional check identified eight duplicate entries. Ultimately, 19 articles met our inclusion criteria.

The eligibility stage involved full-text review of the remaining papers. The final selection of this review comprised 19 empirical studies that applied the cloze test to assess the reading abilities of basic-education students in Brazil. These studies are presented in a table containing detailed information on published papers across the selected databases.

3. Results

This systematic review corpus comprises 19 papers published over thirteen years, from 2009 to 2022. Table 1 offers an overview of this information, including the authors’ names, fields of study, number of participants, and the specific measures used in the cloze test assessments.

The analysis showed that most participants were boys and girls aged 6 to 14 years. In two studies, the participants were up to 17 years old. One paper does not specify the students’ school level, but addresses differences in textual genre (Brito et al., 2022). In addition, one study mentioned students’ social conditions, noting that the participants lived in socially at-risk conditions (Santos et al., 2009).

Ten studies were conducted in public schools, five in private schools, and two in industrial institutions (Cunha and Santos, 2009; 2010). Two studies did not specify the setting of the studies (Brito et al., 2016; Brito et al., 2022).

In two separate studies, Cunha and Santos (2009, 2010) investigated the validity of evidence based on the traditional cloze test, which focuses on literal or verbatim answers. Their first study (Cunha and Santos, 2009) examined the errors in students' responses to a cloze test to identify the validity evidence in the response process. They compared the results obtained from students’ responses with their school averages and academic performance, observing that students with higher grades made more lexical errors, and those with lower grades showed more semantic errors. The authors noted evidence of validity in the response process by assessing the error-type distribution homogeneity.

In the same study, the reliability of the cloze test measure was analyzed by conducting a pilot study with 314 children, both boys and girls, from 2nd to 4th grade. The study demonstrated satisfactory precision indices through internal consistency analysis (following the recommendations of CFP (2003), which prescribes an index above 0.70), indicating that the gap-filling text used in the research was suitable for the sample. Cronbach’s alpha was 0.83 for the participants. Internal consistency was also analyzed by grade level, with rates of 0.85 for the 2nd year, 0.70 for the 3rd year, and 0.72 for the 4th year (Cunha and Santos, 2009). There were no correlation tests between the cloze test and other instruments.
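For readers unfamiliar with the internal consistency index mentioned above, the sketch below computes Cronbach's alpha with its standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where each cloze gap is treated as one item scored 0 or 1. The reviewed studies do not describe their computation, so the function name and the small data set are hypothetical and serve only to show how the index is obtained.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of per-gap scores (one row per participant)."""
    if len(scores) < 2:
        raise ValueError("at least two participants are required")
    n_items = len(scores[0])
    if n_items < 2 or any(len(row) != n_items for row in scores):
        raise ValueError("each row needs the same number (>= 2) of items")
    # Variance of each gap across participants, and of the total scores.
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 4 participants, 3 gaps, 1 = exact word supplied.
data = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
alpha = cronbach_alpha(data)
print(round(alpha, 2), alpha > 0.70)
```

A value above 0.70, the threshold cited from CFP (2003), would be read as satisfactory internal consistency; real applications would of course involve far more participants and gaps than this toy example.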

In their second study, Cunha and Santos (2010) examined the same group of 266 students from a previous study to investigate the evidence of convergent validity. By relating writing acquisition to reading comprehension, they identified evidence of convergent validity in the assessment of difficulties in learning writing scales, word recognition scales, and cloze test instruments. Responses were measured by considering the exact words. However, they did not specify how the reliability of the cloze test was assessed. The correlation between the averages of this test and the other instruments indicated statistically significant differences according to the analysis of variance results. These results were as follows: [F(3, 262) = 105.942; p < 0.001] and [F(3, 262) = 49.734; p < 0.001] (Cunha and Santos, 2010).

Mota and Santos (2009, 2014) also explored the cloze test as a reading-assessment tool. Their first paper (Mota and Santos, 2009) focused on the relationship between phonological awareness and reading comprehension. The scores obtained in the phonological awareness tasks correlated with those of the cloze test, which were corrected using verbatim criteria, excluding synonyms or misspelled words. They argued that this correction method reduces the influence of the experimenter’s subjective judgment.

The cloze tests used by Mota and Santos (2009) demonstrated satisfactory internal consistency rates (α > 0.70) and were effective in assessing 1st-grade students in elementary education. There was a positive and significant correlation between reading comprehension and phonological awareness: [r = 0.37; p < 0.05] for alliteration and cloze test 1, [r = 0.37; p < 0.05] for alliteration and cloze test 2, [r = 0.40; p < 0.01] for rhyme and cloze test 1, and [r = 0.51; p < 0.01] for rhyme and cloze test 2 (Mota and Santos, 2009). To control for the influence of cognitive development on reading and writing acquisition, the researchers also employed the vocabulary and digit subtests of the Wechsler Intelligence Scale for Children (WISC-III; Wechsler, 1991). The digit memory measure showed a positive and significant correlation with the cloze test (cloze 1, p < 0.05; cloze 2, p < 0.01), whereas the vocabulary task did not correlate with the reading comprehension scores.

In their second study, Mota and Santos (2014) employed the same cloze tests as in the study mentioned above, thus maintaining the same psychometric properties, as a measure of reading ability in two primary school grades. They compared the results with those of the School Achievement Test (TDE), which focuses specifically on academic performance in reading and writing and was used as a standard measure. The authors found that higher performance on the School Achievement Test was associated with higher cloze test performance across grades and categories, with a positive, significant, and moderate correlation [r = 0.47; p < 0.001].

Using reading comprehension tests identical to those used by Mota and Santos (2009; 2014), Mota et al. (2009) conducted a study on 42 children from a private school and investigated the relationship between morphological awareness and text comprehension measured by the cloze test. The association between morphological awareness tasks and text comprehension measures showed that morphological awareness is related to contextual reading in Portuguese. There were positive and significant associations between the two cloze tests and the morphological awareness tasks (analogy task and morpheme-semantic association task): [r = 0.47; p < 0.01] for cloze 1 and analogy task, [r = 0.52; p < 0.01] for cloze 2 and analogy task, [r = 0.25; p < 0.01] for cloze 1 and morpheme-semantic association task, and [r = 0.58; p < 0.01] for cloze 2 and morpheme-semantic association task (Mota et al., 2009). Mota et al. (2012) found similar results using an identical methodology.

A study conducted by Santos et al. (2009) applied the cloze test to measure reading comprehension and relate it to intelligence, as assessed by the R1-Test Form B (Sisto, Santos and Noronha, 2004). The criterion for evaluating the participants’ responses to the cloze test was literal accuracy. The authors found a significant positive correlation between the results of the intelligence and reading tests [r = 0.39; p = 0.002], indicating that higher scores on the R1-Test Form B were associated with greater accuracy in completing cloze test gaps. However, the study does not address the reliability of the test.

Santos and Oliveira (2010) sought to identify the appropriateness of cloze techniques for assessing and developing text comprehension skills through an intervention program. In this study, they employed the traditional cloze method, considering only words that matched the original text, similar to the approach used by Joly and Piovezan (2012) and other researchers. They used the Computerized Basic-MAR Cloze test as both pre- and post-measures. This test was developed by Joly and Piovezan as a reading comprehension test using the cloze technique with psychometric characteristics (Joly and Istome, 2008). Reliability was evaluated through test-retest (pre- and post-assessment). The results of both studies demonstrated a significant positive effect on students’ reading comprehension levels.

Suehiro and Magalhães (2014) and Suehiro and Santos (2015) report results from studies conducted in different Brazilian states, namely Bahia and São Paulo. The first study aimed to examine the relationship between reading and writing among public elementary school students and to search for evidence of convergent-discriminant validity and concurrent criterion validity between the instruments used. The second paper also sought validity evidence, but this time through measures that assess related constructs such as reading comprehension and phonological awareness. Both studies demonstrated criterion validity through contrasting group methods, and the cloze test proved to be sensitive in capturing the relationship between learning to write and reading comprehension.

Santos and Fernandes (2016) conducted a comprehensive study across the five Brazilian regions. They aimed to explore the relationships between instruments assessing writing and reading comprehension, and to examine the predictive value of these instruments for school performance. They also investigated potential differences in children’s test performance based on gender and school grade. This study presented the reliability of the test by calculating the alpha coefficient (0.96). The measures of the instruments used (cloze, Writing Evaluation Scale (Sisto, 2005), and School Performance Test (Stein, 1994)) were correlated, indicating that cloze and writing evaluation scales predicted school performance in reading, as controlled by the school performance test in this study. The correlation coefficient between the cloze test and school performance (R = 0.73, with an adjusted R² of 0.53) demonstrates the predictive value of the cloze test for academic performance.

Although the study did not specify where data collection took place, Colombo and Cárnio (2018) aimed to develop an instrument to assess reading comprehension and examine the influence of receptive vocabulary on the reading comprehension of typical elementary school students. Although some students were impacted by their low socioeconomic status, the results indicated that most scored above average in receptive vocabulary. The students also performed better on explicit and implicit questions. Furthermore, cloze tests revealed more homogeneous and similar performance across different school years. The authors did not explain how the reliability of the cloze instrument was assessed, and they reported strong, significant, and positive associations between the results of the cloze procedure and the question-and-answer method [p < 0.001].

Santos et al. (2018) sought to identify the relationship between reading comprehension and learning motivation among elementary-school students. Two measurement instruments were applied: the exact-answer cloze test and Motivation Assessment Scale for Learning. This scale is based on three goals that represent students’ efforts, their concern about standing out to their peers, and their concern about not showing weakness in front of their peers (Santos et al., 2018).

The results revealed a positive correlation between cloze test and learning goal scores, and significant negative correlations with performance-approach and performance-avoidance goals. According to the statistical data, the higher the reading comprehension score, the lower the score for a student’s interest in excelling in the classroom (performance approach) and in avoiding appearing weak in front of classmates and teachers (performance avoidance). The correlations were as follows: [r = 0.167; p = 0.041] for cloze and learning goal; [r = -0.234; p = 0.003] for cloze and performance-approach goal; and [r = -0.224; p = 0.004] for cloze and performance-avoidance goal. However, the authors did not explicitly determine the reliability of the cloze test.

Trassi, Oliveira, and Inácio (2019) applied three measurement tools at different stages to achieve four objectives: to analyze the reading level of public elementary school students; compare their performance on the cloze test; investigate possible relationships between reading comprehension, learning strategies, and verbal reasoning; and determine whether verbal reasoning can predict other variables. A total of 470 students completed the test and assessment scales for learning strategies. Of these, only 45 were selected to complete the Wechsler Abbreviated Scales of Intelligence (WASI). This scale “briefly assesses the abilities in general, verbal and execution intelligence in 4 subtests: similarities, vocabulary, cubes, and matrix reasoning, the first two referring to the verbal area” (Trassi, Oliveira and Inácio, 2019, p. 618).

The results indicated that the sample exhibited an independent reading comprehension level, particularly among students in the 2nd, 3rd, and 4th years compared to students in other grades. Based on the WASI results, the students in this sample demonstrated good verbal reasoning performance as expected. No specific details were given regarding the reliability of the cloze test used. The research data indicated statistically significant correlations between the cloze test and metacognitive strategies [r = 0.217; p = 0.001], and between the cloze test and verbal IQ [r = 0.528; p = 0.001].

Cunha et al. (2020) applied the cloze test as evidence of validity through response processes and analyzed the types of semantic and syntactic errors. The highest and lowest scores on the cloze test were used to construct a scale to analyze the complexity of terms through morphosyntactic analysis. This study is similar to that of Cunha and Santos (2009), who identified five types of errors: blank space, phonological, lexical, syntactic, and semantic. Students with higher averages tended to make more syntactic and linguistic errors, whereas those with lower averages tended to make more semantic errors. Cunha and Santos (2009) demonstrated good test reliability, and an internal consistency analysis revealed a Cronbach’s alpha of 0.83.

Cunha et al. (2021) conducted a psychometric study using a questionnaire and two cloze tests to investigate the psychometric properties of the Metatextual Awareness Assessment Questionnaire, based on the “Theory of Human Information Processing and the precepts of cognitive psychology” (Cunha et al., 2021). The questionnaire responses revealed differences in performance according to the school year. The cloze tests were evaluated using traditional or literal correction methods and provided valid evidence. In this study, the internal consistency of the cloze tests was α = 0.82 for cloze 1 and α = 0.77 for cloze 2.

Fabri et al. (2022) applied the traditional cloze to verify the internal structure of the Inventory of Learning Self-regulation Processes (Polydor et al., 2011) and to assess self-regulation, learning strategies, and reading comprehension of students in early elementary school. This inventory, developed in Portugal, was designed to investigate self-regulatory processes in their different dimensions, consisting of a 9-item questionnaire, using a Likert scale, with responses ranging from “never” to “always” (Fabri et al., 2022). The authors reported significant use of self-regulatory strategies, demonstrating relationships between this construct, metacognitive strategies, and instructional reading comprehension levels. They also identified a dependent relationship between self-regulation and cognitive skill.

4. Discussion

The results of this systematic review reinforce the cloze test’s sensitivity in capturing reading and writing skills, making it a valuable indicator of overall academic performance. The cloze test is a well-established instrument that can be used independently or with other measurement tools.

Most of the analyzed studies adopted the exact word as the criterion for scoring cloze test responses (Cunha, Ferraz and Santos, 2021; Cunha et al., 2020; Cunha and Santos, 2009; Cunha and Santos, 2010; Fabri et al., 2022; Joly and Piovezan, 2012; Mota et al., 2012; Mota et al., 2009; Mota and Santos, 2009; Mota and Santos, 2014; Santos and Fernandes, 2016; Santos, Morais and Lima, 2018; Santos and Oliveira, 2010; Santos, Suehiro and Vendemiatto, 2009; Suehiro and Magalhães, 2014; Suehiro and Santos, 2015; Trassi, Oliveira and Inácio, 2019).

There is a tendency to adopt the exact word criterion for scoring; however, a recent study by Brito et al. (2022) differs from the others in that it does not specify the criterion used to score the responses. Instead, the researchers adopted a computerized analysis performed by Coh-Metrix-Port 2.0. This study aimed to analyze the characteristics of words, syntactic structure, and cohesion elements in the Cloze Reading Comprehension Test (TCCL), comparing its two parts: narrative text (TCCL-N) and expository text (TCCL-E), using the Coh-Metrix software. This program allows for the analysis of cohesion, coherence, and difficulty in understanding a text (Brito, Ribeiro and Seabra, 2022) using Natural Language Processing resources and tools. The results showed that TCCL-E is slightly more complex than TCCL-N, as supported by the literature, and the authors emphasized the importance of considering reader characteristics when determining text difficulty.

Although Santos and Fernandes’s (2016) study was the only one conducted across all five Brazilian regions, most participants were from the Southeast region, particularly São Paulo, the most populous state (IBGE, 2010). Of the 19 papers analyzed, eight were from São Paulo (Santos et al., 2009; Cunha and Santos, 2009; Santos and Oliveira, 2010; Cunha and Santos, 2010; Suehiro and Santos, 2015; Santos et al., 2018; Cunha et al., 2020), five from Minas Gerais (Mota et al., 2009; Mota and Santos, 2009; Mota et al., 2012; Mota and Santos, 2014; Cunha et al., 2021), and two from Paraná (Trassi et al., 2019; Fabri et al., 2022). Only one study was from Bahia (Suehiro and Magalhães, 2014), in the Northeast region, and none were from the North or Midwest regions. Three papers did not specify the location of their study (Joly and Piovezan, 2012; Colombo and Cárnio, 2018; Brito et al., 2022).

The field that produced the most studies on the cloze test was psychology, with 16 papers. One study was published in speech-language pathology (Colombo and Cárnio, 2018), one in education (Fabri et al., 2022), and one in the humanities (Joly and Piovezan, 2012).

Despite their specific objectives, each study used the cloze test as a reading assessment, applying it to examine the relationship between reading comprehension and other linguistic abilities such as morphological and phonological awareness (Suehiro and Santos, 2015; Mota and Santos, 2009; Mota et al., 2009; Mota et al., 2012) and metatextual awareness (Cunha et al., 2021).

The cloze test has also been used to assess writing ability (Cunha and Santos, 2010; Santos and Fernandes, 2016; Suehiro and Magalhães, 2014) as well as other factors related to education, such as learning motivation (Santos et al., 2018), learning strategies, verbal reasoning (Trassi et al., 2019), and school development (Colombo and Cárnio, 2018). Furthermore, the latter study aimed to create a tool for assessing textual reading comprehension. It was the only study to consult linguists to validate the cloze tests before applying them.

A few studies have used the cloze test exclusively to analyze reading without relating it to other skills, such as Joly and Piovezan’s (2012) study, which aimed to evaluate their Computerized Strategic Reading Program (Programa Informatizado de Leitura Estratégica, PILE) (Joly, 2008).

This systematic review revealed that the analyzed studies consistently adopted exact answer scoring criteria, including those using computerized tests. Clozentropy and multiple-choice scoring methods were not mentioned in the reviewed papers.

5. Conclusion

The findings of this systematic review reinforce the consolidation of the cloze test in Brazil as a tool for assessing reading proficiency, particularly in research that focuses on primary and elementary school students. They also underscore that the fields of psychology and phonology have produced more research on the cloze procedure than the field of language studies. Teachers, linguists, psychologists, and other professionals can work together to examine the development of these tests, ensuring that they are appropriately aligned with the participants’ age, educational level, and knowledge base.

Currently, such research is concentrated in southeastern Brazil. Expanding these studies to other Brazilian states could provide a broader understanding of reading difficulties across the country, particularly in the post-pandemic context, in which reading asymmetries have widened. Furthermore, the “exact word” criterion used in many cloze tests limits the accuracy of reading diagnosis, as it narrows the range of acceptable responses and excludes semantically similar answers, such as synonyms. Despite this limitation, the reviewed research indicates that the cloze test is a reliable measure of reading comprehension and that it predicts students' performance in other reading-related skills.
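The contrast between the exact word criterion and an acceptable answer criterion can be made concrete with a minimal sketch. It is an illustration only, not the scoring procedure of any reviewed study: the function names (`normalize`, `score_cloze`), the four-gap answer key, the list of acceptable alternatives, and the student responses are all hypothetical.

```python
from typing import Optional, Sequence, Set


def normalize(word: str) -> str:
    """Lowercase and trim a response so that 'Casa ' matches 'casa'."""
    return word.strip().lower()


def score_cloze(
    responses: Sequence[str],
    expected: Sequence[str],
    acceptable: Optional[Sequence[Set[str]]] = None,
) -> float:
    """Return the proportion of gaps scored as correct.

    With acceptable=None, only the exact word counts (exact word criterion).
    Otherwise, each gap also accepts the alternatives listed for it.
    """
    if not expected or len(responses) != len(expected):
        raise ValueError("responses and expected must be non-empty and aligned")
    if acceptable is None:
        acceptable = [set() for _ in expected]
    if len(acceptable) != len(expected):
        raise ValueError("acceptable must have one entry per gap")

    hits = 0
    for response, target, alternatives in zip(responses, expected, acceptable):
        allowed = {normalize(target)} | {normalize(a) for a in alternatives}
        if normalize(response) in allowed:
            hits += 1
    return hits / len(expected)


# Hypothetical answer key, acceptable alternatives, and one student's answers.
expected = ["casa", "bonito", "correu", "livro"]
alternatives = [{"residência"}, {"belo", "lindo"}, set(), set()]
student = ["Casa", "belo", "andou", "livro"]

exact_score = score_cloze(student, expected)
acceptable_score = score_cloze(student, expected, alternatives)
print(exact_score, acceptable_score)  # 0.5 0.75
```

In this invented example the same answer sheet moves from 50% to 75% once the synonym "belo" is accepted for "bonito", which is the kind of difference the exact word criterion hides.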

Natural language processing techniques that use measures of lexical distance could be employed to better assess comprehension in contexts with low reading proficiency. Lexical distance, defined as the “minimum number of insertions, deletions or substitutions of a single character needed to transform one word into the other”,[1] could be used to measure responses that follow the same grammatical class or semantic field as the expected word, revealing alternative response possibilities (Gois et al., 2025).
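The quoted definition corresponds to what is commonly called the Levenshtein (edit) distance. The sketch below shows one standard way to compute it and to turn it into a 0 to 1 similarity between a response and the expected word. The function names and the example words are assumptions made for illustration, and this is not presented as the method used by Gois et al. (2025).

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # deletion
                    current[j - 1] + 1,                   # insertion
                    previous[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity(response: str, expected: str) -> float:
    """Edit distance rescaled to [0, 1], where 1.0 means identical words."""
    r, e = response.strip().lower(), expected.strip().lower()
    longest = max(len(r), len(e))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(r, e) / longest


print(levenshtein("kitten", "sitting"))  # 3
print(similarity("caza", "casa"))        # 0.75
```

A character-level measure of this kind mainly rewards orthographic closeness, such as the misspelling "caza" for "casa". Capturing synonyms or words from the same semantic field would require additional resources that are not shown here.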

Previous studies reinforce the validity of the cloze test as an instrument for measuring reading comprehension. Due to its low cost and ease of application, it is a highly practical tool for diagnosing reading proficiency, especially in the post-pandemic context, where performance rates have declined, raising concerns among education networks and challenging teachers to address this issue. Mapping how the cloze technique has been applied in Brazil contributes to the development of new analytical approaches, particularly those based on natural language processing (Gois et al., 2025) and novel gap types, such as verbal aspect (Santos, 2025), ultimately aiding in the reduction of reading asymmetries.

Acknowledgments

We thank CAPES for the master’s scholarship granted to one of the authors, which partially funded this study. The study is part of the project “Impactos da Pandemia de COVID-19 na linguagem da criança e do adulto: foco no desenvolvimento e na aprendizagem da leitura” (Impacts of the COVID-19 Pandemic on the Language of Children and Adults: Focus on Development and Reading Learning), funded by Programa de Desenvolvimento da Pós-Graduação (PDPG) - Impactos da Pandemia (Graduate Development Program - Impacts of the Pandemic), CAPES 12/2021, and supported by other federal institutions. At the Universidade Federal de Sergipe, the project is developed through the subproject “Linguagem e Emoções no Cenário Educacional Pós-Pandêmico: Tecnologia de Avaliação e Monitoramento” (Language and Emotions in the Post-Pandemic Educational Scenario: Assessment and Monitoring Technology), funded by FAPITEC/SE/SEDUC 09/2022.

Additional Information

Conflict of Interest

The authors declare no competing interests.

Statement of Data Availability

Data sharing does not apply to this article, as no new data was created or analyzed in this study.

Funding Sources

This article is one of the activities developed during the master’s degree of one of the authors, which was carried out with financial support from CAPES through a scholarship.

References

ABREU, Kátia Nazareth Moura de; GARCIA, Daniela Cid de; HORA, Katharine de Freitas P. N. A. da; SOUZA, Cristiane Ramos de. O teste de Cloze como instrumento de medida da proficiência em leitura: Fatores linguísticos e não linguísticos. Revista de Estudos da Linguagem, Belo Horizonte, v. 25, n. 3, p. 1767-1799, 2017.

ALMEIDA, Luise Maria da Silva. Métodos de investigação em leitura no Brasil: o Teste Cloze e a pesquisa em compreensão leitora no Brasil. Universidade Federal de Santa Catarina, 2023. Disponível em: https://repositorio.ufsc.br/handle/123456789/250902. Acesso em: 12 mar. 2025.

BICKLEY, A. C.; ELLINGTON, Billie J.; BICKLEY, Rachel T. The Cloze procedure: a conspectus. Journal of Reading Behavior, v. 2, n. 3, p. 232-249, 1970.

BORMUTH, John R. Cloze test readability: Criterion references scores. University of Chicago. Journal of Educational Measurement. v. 5, n. 3, 1968.

BRITO, Gabriel Rodriguez; RIBEIRO, Camila Fragoso; SEABRA, Alessandra Gotuzo. Análise da inteligibilidade dos textos narrativo e expositivo do Teste Cloze de compreensão de leitura. Revista de Estudos e Investigação em Psicologia e Educação, v. 9, n. 2, p. 207-225, 2022. DOI https://doi.org/10.17979/reipe.2022.9.2.9101. Acesso em: 12 mar. 2025.

BROWN, James Dean. Do Cloze Tests work? Or, is it just an illusion? Second Language Studies, v. 21, n. 1, p. 79-125, 2002.

BROWN, James Dean. Relative merits of four methods for scoring Cloze Tests. The Modern Language Journal, Autumn, v. 64, n. 3, p. 311-317, 1980.

CARDOSO, Paloma Batista; MENEZES, Keila Vasconcelos. In press. Adivinhando palavras. p. 63-71. In: FREITAG, Raquel; TEJADA, Julian. Pra ler melhor. Manuscrito submetido ao edital 08/23 SEDUC.

CARDOSO, Paloma Batista; MENEZES, Keila Vasconcelos; FREITAS, Flávia Oliveira; FREITAG, Raquel Meister Ko. Eficiência na leitura: medidas de precisão e velocidade entre alunos do Colégio de Aplicação da Universidade Federal de Sergipe. Revista Científica Sigma, v. 5, n. 5, p. 120-143, 2024. Disponível em: https://iesap.edu.br/ojs/index.php/sigma/article/view/103. Acesso em: 14 mar. 2025.

COLOMBO, Renata Correia; CÁRNIO, Maria Silvia. Compreensão de leitura e vocabulário receptivo em escolares típicos do ensino fundamental I. CoDAS. Sociedade Brasileira de Fonoaudiologia, v. 30, n. 4, e 201700145, 2018. DOI:10.1590/2317-1782/20182017145.

Conselho Federal de Psicologia. Resolução nº 002/2003. Brasília: CFP, 2003. Disponível em: http://www.pol.org.br.

CUNHA, Neide de Brito; FERRAZ, Adriana Satico; SANTOS, Acácia Aparecida Angeli dos. Estudo psicométrico do questionário de avaliação da consciência metatextual. Avaliação Psicológica: Interamerican Journal of Psychological Assessment, v. 20, n. 4, p. 401-409, 2021.

CUNHA, Neide de Brito; LIMA, Thatiana Helena de; SANTOS, Acácia Aparecida Angeli dos; OLIVEIRA, Katya Luciane de. Teste de Cloze: evidência de validade por processo de resposta. Psicologia Escolar e Educacional, v. 24. 2020. DOI https://doi.org/10.1590/2175-35392020191537. Acesso em: 12 mar. 2025.

CUNHA, Neide de Brito; SANTOS, Acácia Aparecida Angeli dos. Estudos de validade entre instrumentos que avaliam habilidades linguísticas. Estudos de Psicologia, Campinas, v. 27, n. 3, p. 305-314, 2010.

CUNHA, Neide de Brito; SANTOS, Acácia Aparecida Angeli dos. Validade por processo de resposta no teste de Cloze. Fractal: Revista de Psicologia, v. 21, p. 549-562, 2009. DOI https://doi.org/10.1590/S1984-02922009000300010. Acesso em: 12 mar. 2025.

DARNELL, Donald K. The development of an English language proficiency for foreign students, using a Clozentropy procedure. U.S. Department of Health, Education, and Welfare. Office Education, Bureau of Research. Presented at the Annual Convention of the Association for Education in Journalism, San Diego, August, 1968.

FABRI, Nayla Beatriz; OLIVEIRA, Katya Luciane de; INÁCIO, Amanda Lays Monteiro; SCHIAVON, Andreza; BZUNECK, José Aloyseo. Autorregulação, estratégias de aprendizagem e compreensão de leitura no Ensino Fundamental I. Revista Brasileira de Educação, v. 27, e270068, 2022. DOI https://doi.org/10.1590/s1413-24782022270068. Acesso em: 12 mar. 2025.

FREITAG, Raquel Meister Ko.; SARMENTO, Victor Hugo Vitorino; COSTA, Camila Conceição; SANTOS, Katiana Leite. Teste Cloze e a competência em leitura de universitários: uma experiência no curso Química/Licenciatura da UFS/Itabaiana. InterSciencePlace, p. 1-13, 2014.

FREITAS, Flávia Oliveira; MENEZES, Keila Vasconcelos. Desenvolvimento e aplicação do teste cloze workshop: uma sugestão pedagógica para avaliar a proficiência em leitura dos alunos. Revista de Estudos de Cultura, v. 10, n. 25, p. 1–18, 2024. DOI https://doi.org/10.32748/revec.v10i25.21467. Acesso em: 13 mar. 2025.

GALVÃO, Maria Cristiane Barbosa; RICARTE, Ivan Luiz Marques. Revisão sistemática da literatura: conceituação, produção e publicação. Logeion: Filosofia da informação, v. 6, n. 1, p. 57–73, 2019. DOI https://doi.org/10.21728/logeion.2019v6n1.p57-73. Acesso em: 12 mar. 2025.

GOIS, Túlio Sousa; FREITAS, Flávia Oliveira; TEJADA, Julian; FREITAG, Raquel Meister Ko. NLP and education: using semantic similarity to evaluate filled gaps in a large-scale cloze test in the classroom. The Mental Lexicon, 2025. DOI https://doi.org/10.1075/ml.24027.deg. Acesso em: 14 mar. 2025.

JOLY, Maria Cristina Rodrigues Azevedo; ISTOME, Aline Christina. Compreensão em leitura e capacidade cognitiva: estudo de validade do Teste Cloze Básico-MAR. Psic: Revista da Vetor Editora, v. 9, n. 2, p. 219-228, 2008.

JOLY, Maria Cristina Rodrigues Azevedo. Programa Informatizado de Leitura Estratégica (PILE) – Pesquisa em desenvolvimento. Itatiba, SP: Universidade São Francisco, 2008.

JOLY, Maria Cristina Rodrigues Azevedo; PIOVEZAN, Nayane Martoni. Avaliação do Programa Informatizado de Leitura Estratégia para Estudantes do Ensino Fundamental. Paideia, v. 22, n. 51, p. 83-90, 2012.

MORAIS, José. A arte de ler. Tradução: Álvaro Lorencini. São Paulo: Editora da Universidade Estadual Paulista, 1996.

MORAIS, José; KOLINSKY, Régine. Psicolinguística e leitura. In: MAIA, Marcus. Psicolinguística, psicolinguísticas: uma introdução. São Paulo: Contexto, 2015, p. 86-93.

MOTA, Márcia Maria Peruzzi Elia da; SANTOS, Acácia Aparecida Angeli dos. O Cloze como instrumento de avaliação de leitura nas séries iniciais. Revista quadrimestral da Associação Brasileira de Psicologia Escolar e Educacional, São Paulo, v. 18, n. 1, p. 135-142, 2014.

MOTA, Márcia Maria Peruzzi Elia da; SANTOS, Acácia Aparecida Angeli dos. O papel da consciência fonológica na leitura contextual medida pelo teste de Cloze. Estudos de Psicologia, v. 14, n. 3, p. 207-212, 2009. DOI https://doi.org/10.1590/S1413-294X2009000300004. Acesso em: 12 mar. 2025.

MOTA, Márcia Maria Peruzzi Elia da; VIEIRA, Marcel de Toledo; BASTOS, Ronaldo Rocha; DIAS, Jaqueline; PAIVA, Nádia; MANSUR-LISBOA, Stella; ANDRADE-SILVA, Danielle. Leitura contextual e processamento metalinguístico no português do Brasil: um estudo longitudinal. Psicologia: Reflexão e Crítica, v. 25, n. 1, 2012. DOI https://doi.org/10.1590/S0102-79722012000100014. Acesso em: 12 mar. 2025.

MOTA, Márcia Maria Peruzzi Elia da; LISBOA, Rafaela; DIAS, Jaqueline; GONTIJO, Rhaisa; PAIVA, Nádia; MANSUR-LISBOA, Stella; SILVA, Danielle Andrade; SANTOS, Acácia Aparecida Angeli dos. Relação entre consciência morfológica e leitura contextual medida pelo Teste Cloze. Psicologia: Reflexão e Crítica, v. 22, n. 2, p. 223-229, 2009. DOI https://doi.org/10.1590/S0102-79722009000200008. Acesso em: 12 mar. 2025.

OLLER, John W. Jr.; CONRAD, Christine A. The Cloze technique and ESL proficiency. Language Learning, v. 21, n. 2, p. 183-194, 1971. DOI https://doi.org/10.1111/j.1467-1770.1971.tb00057.x. Acesso em: 12 mar. 2025.

PETRONI, Filippo; SERVA, Maurizio. Measures of lexical distance between languages. Physica A: Statistical Mechanics and its Applications, v. 389, n. 11, p. 2280-2283, 2010. DOI https://doi.org/10.1016/j.physa.2010.02.004. Acesso em: 12 mar. 2025.

POLYDORO, S. A. J; ROSÁRIO, P.; SAMPAIO, R. K. N.; FEITAS, F. A. de. Sucesso no ensino superior e variáveis envolvidas. In: Congresso Nacional de Psicologia Escolar e Educacional: caminhos trilhados, caminhos a percorrer, 2011.

SANTOS, Gislane Evangelista dos; MACHADO, Alessandra Pereira Gomes. Fluência em leitura oral como avaliação diagnóstica de leitura de estudantes do 6º ano. Língu@ Nostr@, v. 10, n. 2, p. 148-163, 2022. DOI https://doi.org/10.29327/232521.9.1-25. Acesso em: 12 mar. 2025.

SANTOS, Gislane Evangelista dos. O preenchimento de lacunas de aspecto verbal em teste cloze: pistas de compreensão em leitura. 2025. Dissertação (Mestrado em Estudos Linguísticos) – Universidade Federal de Sergipe, Sergipe, 2025.

SANTOS, Acácia Aparecida Angeli dos; MORAES, Mayara Salgado de; LIMA, Thatiana Helena. Compreensão de leitura e motivação para aprendizagem de alunos do ensino fundamental. Psicologia Escolar e Educacional, São Paulo, v. 22, n. 1, p. 93-101, 2018. DOI https://doi.org/10.1590/2175-35392018012208. Acesso em: 12 mar. 2025.

SANTOS, Acácia Aparecida Angeli dos; FERNANDES, Eliane Sousa de Oliveira. Habilidade de escrita e compreensão de leitura como preditores de desempenho escolar. Psicologia Escolar e Educacional, São Paulo, v. 20, n. 3, p. 465-473, 2016. DOI https://doi.org/10.1590/2175-3539201502031013. Acesso em: 12 mar. 2025.

SANTOS, Acácia Aparecida Angeli dos; OLIVEIRA, Evelin Zago de. Avaliação e desenvolvimento da compreensão em leitura no ensino fundamental. Psico-USF, v. 15, n.1, p. 81-91, 2010. DOI https://doi.org/10.1590/S1413-82712010000100009. Acesso em: 12 mar. 2025.

SANTOS, Acácia Aparecida Angeli dos; SUEHIRO, Adriana Cristina Boulhoça; VENDEMIATTO, Bianca Carolina. Inteligencia y comprensión en lectura de adolescentes en situación de riesgo social. Paradigma, v. 30, n. 2, p. 113-12, 2009.

SISTO, F. F.; SANTOS, A. A. A.; NORONHA, A. P. P. R1: Teste não verbal de avaliação da inteligência Forma B – Manual. São Paulo: Vetor Editora Psicopedagógica Ltda, 2004.

SISTO, F. F. Escala de Avaliação da Escrita (EAVE). Relatório Técnico, Universidade São Francisco, Itatiba-SP, 2005.

STEIN, L. M. TDE - Teste de Desempenho Escolar: manual para aplicação e interpretação. São Paulo: Casa do Psicólogo, 1994.

SUEHIRO, Adriana Cristina Boulhoça; SANTOS, Acácia Aparecida Angeli dos. Compreensão de leitura e consciência fonológica: evidências de validade de suas medidas. Estudos de Psicologia. Campinas, v. 32, n. 2, p. 201-211, 2015. DOI https://doi.org/10.1590/0103-166X2015000200005. Acesso em: 12 mar. 2025.

SUEHIRO, Adriana Cristina Boulhoça; MAGALHÃES, Marilene Moreira da Silva. Relação entre medidas de avaliação da linguagem escrita em estudantes do Ensino Fundamental. Psico-USF, Bragança Paulista, v. 19, n. 3, p. 489-498, 2014. DOI https://doi.org/10.1590/1413-8271201401900301. Acesso em: 12 mar. 2025.

TAYLOR, Wilson L. Cloze procedure: a new tool for measuring readability. Journalism quarterly, v. 30, n. 4, p. 415-433, 1953. DOI https://doi.org/10.1177/107769905303000401. Acesso em: 12 mar. 2025.

TRASSI, Angélica Polvani; OLIVEIRA, Katya Luciane de; INÁCIO, Amanda Lays Monteiro. Reading comprehension, learning strategies, and verbal reasoning: Possible relationships. Psico-USF, Bragança Paulista, v. 24, n. 4, p. 615-624, 2019. DOI https://doi.org/10.1590/1413-82712019240401. Acesso em: 12 mar. 2025.

WECHSLER, D. WISC-III: Escala de inteligência Wechsler para crianças. São Paulo: Casa do Psicólogo, 1991.

Review

DOI: https://doi.org/10.25189/2675-4916.2025.V6.N2.ID787.R

Editorial Decision

EDITOR 1: Tan Arda Gedik

ORCID: https://orcid.org/0000-0003-1429-9675

AFFILIATION: Friedrich-Alexander-Universität, Bavaria, Germany.

EDITOR 2: Leonarda Prela

AFFILIATION: Friedrich-Alexander-Universität, Bavaria, Germany.

EDITOR 3: Vania De la Garza

AFFILIATION: Friedrich-Alexander-Universität, Bavaria, Germany.

DECISION LETTER: The manuscript “The use of the cloze test in reading comprehension assessment in Brazil: post-pandemic challenges”, by Flávia Oliveira Freitas and colleagues, systematically reviews studies published across three virtual databases that applied the traditional cloze test to assess reading comprehension. The study highlights the cloze test’s sensitivity in capturing reading and writing skills, making it a valuable indicator of overall academic performance. This research aligns closely with the special issue “Beyond Letters: Perspectives on the Effects of Illiteracy from Linguistics and Beyond”, as it analyses the cloze test as a reading assessment. The study also emphasizes that it is essential to incorporate natural language processing techniques, such as lexical distance measurements, which can provide a more nuanced understanding of comprehension, especially in contexts with low reading proficiency. The manuscript contributes to discussions on the assessment of reading comprehension through the cloze test. The study also points out that most of the research on the cloze test has been produced in the field of psychology, promoting the possibility of collaboration between teachers, linguists, psychologists, and other professionals to research the development and application of these tests.

Rounds of Review

REVIEWER 1: Richenda Wright

AFFILIATION: Friedrich-Alexander-Universität, Bavaria, Germany.

REVIEWER 2: Kátia Nazareth Moura de Abreu

ORCID: https://orcid.org/0000-0002-8505-4512

AFFILIATION: Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil.

ROUND 1

REVIEWER 1

2024-07-29 | 04:41 PM

Assessment

The authors found that the most frequent way to score cloze tests is by only accepting answers that are an exact match to the source text. Beyond frequency in published studies, the validity or reliability of the scoring methods was not analysed in a systematic way and, thus, cannot be accurately judged. Comments on issues in individual articles and how cloze test performance relates to other linguistic skills are given. They also provide information about the geographic location and populations where this tool is most often used in research. This provides information about populations who have not yet been involved in research and could be in the future. The authors briefly discuss problems with using the exact answer scoring method but did not undertake a more in-depth analysis to support a discussion of these problems and possible solutions based on the current literature.

Overall, the paper provides a good overview of the current situation in measuring reading and reading-related skills using cloze tests in their specific context.

Public Review

Summary of the authors' objectives.

The authors aimed to summarise the current state and validity of the procedures used to evaluate responses in cloze tests or gap texts in the Brazilian context. Children in primary education were the target population in the literature review.

Evaluation of the major strengths and weaknesses of the methods and results.

The methods attempted to get a good overview of the use of cloze tests in the target population by consulting three different databases. It is, however, not clear how the databases were chosen. The method section contains ambiguity about which search terms in which languages formed part of the final search. A total of 144 hits are described in the search term section, but only 42 are said to be part of the initial corpus of pre-exclusion articles. This forms part of general comments on the clarity of the writing across the paper. The authors followed PRISMA guidelines in conducting the review. Including additional information from the PRISMA checklist (e.g. comments on robustness or reliability) in the review would strengthen the findings from the paper.

Assessment of whether the authors achieved their goals and if the results support their conclusions.

The authors provided a comprehensive summary of the way in which cloze tests are scored when used in studies with Brazilian primary school children and the validity of the methods. It is, however, difficult to interpret whether the most frequently-used measure is the best as there are no comments on the reliability of the measures themselves (test-retest or internal reliability such as split-half) or of the extent to which the cloze test predicts the outcomes they assume to measure (e.g. reading comprehension, morphological awareness etc.).

The authors did succeed in providing a summary of the current state of research in their chosen domain. They gave a good overview of the research fields that commonly use cloze tests, how they are most often scored, and the most frequently-included populations and geographical locations.

Discussion of the anticipated impact of the work on the field and the utility of the methods and data for the community.

It is important to investigate widely-used assessment measures, such as cloze tests in novel settings to ensure their validity in different contexts. Providing a clear overview of the use of such tools and the methods used to score them guides future research and practice in several domains, including education, psychology, and linguistics.

The current results provide an overview of the current state of use in terms of scoring method, geographic location, and population. They also provide practical suggestions for overcoming some of the issues with the most frequently used scoring method. The insight into possible uses of natural language processing techniques shows future-oriented thinking. The paper, thus, provides a starting point for future research with this assessment tool in the Brazilian context into investigating the validity and reliability of the tools, a systematic analysis of how the outcomes relate to or, possibly, predict other linguistic outcomes, and better scoring methods using technology.

Any additional context you believe would aid readers in interpreting or understanding the significance of the work.

The results are mostly qualitative in nature and should be interpreted as a general comment on the current state of research and not a quantitative analysis upon which to base the recommendation to use specific scoring methods for cloze tests or their general validity or reliability.

Recommendations for the authors

Suggestions for enhancing or adding experiments, data, or analyses, especially if they can enhance the impact of the work.

Your current analysis, in effect, can only comment on how frequently a specific scoring method is used. You are not able to systematically answer the question about how valid cloze tests themselves or the different scoring methods are.

I would propose two additional analyses that would strengthen your paper.

The first is an analysis of the reliability of the measure in each of the included papers. This can be inter-rater reliability, test-retest reliability in the cases where you have a pre- and post-assessment, or split-half reliability. As this is not very often reported, it is an important conclusion to draw at the end of your own paper that this is an aspect that needs further investigation in future research.
The second concerns the practical application of cloze tests in predicting other language outcomes. There are several themes that repeat in your included literature (e.g. reading comprehension, phonological awareness). You discuss some findings, but reporting the correlations between these measures systematically or, even better, significance in predicting other linguistic outcomes where regression analyses were done, would also provide valuable insight into how results from cloze tests can be interpreted and applied.

It would be beneficial if the results from the review (e.g. in spreadsheet format) were made publicly available.

Recommendations for improving the writing and presentation.

Initial suggestions for changes to the paper were made by means of comments. I would strongly suggest having the manuscript professionally proofread and corrected before resubmission. The language used is often unclear or redundant and takes away from the overall importance of the research, as does the punctuation.

The authors are also advised to carefully review their in-text referencing style. The placement of references obscures which information comes from the source in some cases. There are also several missing in-text references.

REVIEWER 2

2025-01-25 | 06:58 PM

The article addresses a relevant topic and makes a significant contribution to the humanities and social sciences, especially to studies in the areas of Language and Education. The assessment of reading comprehension has been discussed before; however, the systematic review presented here points to the importance of linguistic studies turning to this issue, since they could bring original and specific contributions about the levels of language involved in the comprehension process, capable of informing basic education teachers about the complexity involved in a reading assessment activity.

ROUND 2

REVIEWER 1

2025-02-03 | 08:30 AM

The manuscript aimed to investigate which of the scoring methods available for cloze tests is most valid and often used in research in Brazil with elementary school children. The manuscript would be of interest to teachers (both native and foreign languages) who routinely make use of cloze tests, along with psychologists, linguists, and researchers from other related fields. The authors report on the most frequent method of scoring cloze tests – the exact answer. They report on the reliability of the method and the relationship that cloze test results have with other measures. In addition to the population and geographic location of the included studies, they comment on the fields of research that make use of the test. Geographic location provides important information about understudied populations. Overall, the paper provides a good overview of the current situation in measuring reading and reading-related skills using cloze tests in their specific context and how test results relate to performance in reading and other academic skills.

Issue: v. 6 n. 2 (2025)
Submitted: 25/04/2024
Published: 02/05/2025
DOI: 10.25189/2675-4916.2025.v6.n2.id787

How to Cite

FREITAS, F. O.; SANTOS, G. E. dos; FREITAG, R. M. K. The Use of the Cloze Test in Reading Comprehension Assessment in Brazil: Post-Pandemic Challenges. Cadernos de Linguística, [S. l.], v. 6, n. 2, p. e787, 2025. DOI: 10.25189/2675-4916.2025.v6.n2.id787. Disponível em: https://cadernos.abralin.org/index.php/cadernos/article/view/787. Acesso em: 30 jul. 2025.
