The United Nations Sustainable Development Goals (SDGs) address the global challenges the planet faces, including those related to poverty, inequality and climate change. Researchers worldwide have taken the seventeen SDGs as their aim, and English has been the international lingua franca of knowledge transfer (Jenkins, 2014). To carry out a preliminary exploratory study of the SDG lexicon, we used the Brazilian Academic Corpus of English, BrACE (TAVARES PINTO; REES; FRANKENBERG-GARCIA, 2021; PINTO et al., 2021), which was compiled from articles in English published by Brazilian researchers, to observe how some of the SDG themes were discussed in 2017 and 2018. We adopted a corpus-based theoretical and methodological framework (SINCLAIR, 2001; BERBER SARDINHA, 2004; VIANA and TAGNIN, 2011) to analyse the general SDG themes in BrACE with the aid of the Sketch Engine tool (KILGARRIFF et al., 2004). The preliminary results showed that some themes, such as sustainability, were discussed more widely across the areas, while others, such as poverty, are still discussed far less. BrACE is currently being updated so that we can obtain a clearer picture of how the SDGs have been discussed in Brazilian research in recent years. We intend to contribute to a broader view of the SDGs in Brazil and, as a consequence, to publicize Brazilian studies to the scientific community and to civil society.


In 2015, the United Nations presented the Sustainable Development Goals (SDGs) to the world with the aim of achieving a better and more sustainable future for the whole planet. The SDGs “address the global challenges human beings face, including those related to poverty, inequality, climate change, environmental degradation, peace and justice”. The 17 Goals are all interconnected and should be achieved by 2030. They are: (1) No Poverty, (2) Zero Hunger, (3) Good Health and Well-being, (4) Quality Education, (5) Gender Equality, (6) Clean Water and Sanitation, (7) Affordable and Clean Energy, (8) Decent Work and Economic Growth, (9) Industry, Innovation and Infrastructure, (10) Reduced Inequalities, (11) Sustainable Cities and Communities, (12) Responsible Consumption and Production, (13) Climate Action, (14) Life Below Water, (15) Life On Land, (16) Peace, Justice and Strong Institutions, (17) Partnerships for the Goals.

Bearing those goals in mind, researchers in many countries have directed their studies towards finding solutions for the issues addressed by the SDGs. Different sectors of society are now mobilized for a decade of action at three levels (global action, local action and people action), in which academia and other stakeholders are included, with the aim of generating an “unstoppable movement pushing for the required transformations”.

With the outbreak of the Coronavirus pandemic, the world witnessed the importance of strengthening research on all matters concerning human survival. The health crisis highlighted the problems related to the unfair economic distribution of resources and spurred the race to find a vaccine against Covid-19. Consequently, scientific publications and studies on the theme have emerged not only in the area of Health, but also in Sociology, Economics, Chemistry, Mathematics and Linguistics, among others. In this context, English has been used as the lingua franca of scientific research (Jenkins, 2014). Anyone wishing to publish in the top international journals in their field is required to write up their research in English. Likewise, anyone wishing to disseminate their findings at major global conferences must be able to present their work in English. Writing in English is even necessary to apply for research funding.

Searching for the acronym SDG on open-access research platforms such as SciELO, PLOS and Google Scholar, we noticed that not all studies use the acronym in their texts, although some of them discuss the SDG themes and include the acronym in the article’s references. Therefore, a nation that wants to know how it is dealing with those themes will need specific search methods to obtain a view of how those issues have been dealt with nationwide. One way of analysing the SDGs in research papers is a corpus-based search that uses computational tools to explore large collections of texts. This methodology is discussed in this paper.

1. The Sustainable Development Goals in Brazilian studies

Continuing the studies in Tavares Pinto et al. (2021) and Pinto et al. (2021), in which collocations used by Brazilian researchers were observed, we now aim to analyse ‘if’ and ‘how’ the SDGs have been discussed in articles published by Brazilian researchers in a pre-compiled corpus subdivided into eight different areas. The aim is to establish a corpus-based methodology that can be used in the future to explore a larger corpus and to understand how the issues related to those themes are being discussed in Brazilian scientific studies. According to Sinclair (1991), a corpus is a collection of “naturally-occurring language text” that is chosen to characterize a state or variety of a language. In this study, as we discuss in more detail in the next section, the corpus is composed of research papers written in English by Brazilian researchers. The theoretical framework is that of Corpus Linguistics, whose aim is to analyse “naturally occurring language on the basis of computerized corpora” (NESSELHAUF, 2005). According to the author, studies in Corpus Linguistics are performed with the aid of computer software and take into account the frequency of the feature being discussed. We adopt a qualitative-quantitative approach with a written corpus of medium size, which contains approximately 906,000 words. Since this is a preliminary study, we selected two main words, “sustainability” and “poverty”, in order to obtain an overview of how they are used in BrACE.

2. Methodology

This section discusses (i) the data used for the compilation of the BrACE corpus and (ii) the WordSketch and Concordance tools used in the analysis of “poverty” and “sustainability”.

2.1. Data

We used a multidisciplinary electronic corpus of journal articles written in English by Brazilian researchers to look up the two search words selected for this study. The Brazilian Academic Corpus of English (BrACE) was first compiled based on Kuhn (2017) and was used for the analysis of academic collocations used by Brazilian researchers (TAVARES PINTO et al., 2021; PINTO et al., 2021). Its configuration is shown in Table 1:

| Areas | Journals | Papers | Words |
|---|---|---|---|
| Agricultural Sciences | 1. Acta Scientiarum. Agronomy (2018); 2. Arquivo Brasileiro de Medicina Veterinária e Zootecnia (2017) | 20 | 88,740 |
| Biological Sciences | 1. Acta Botanica Brasilica (2018); 2. Memórias do Instituto Oswaldo Cruz (2018) | 20 | 92,220 |
| Health Sciences | 1. Jornal Brasileiro de Pneumologia (2018); 2. Arquivos de Neuro-Psiquiatria (2018); 3. Brazilian Dental Journal (2017); 4. Brazilian Journal of Pharmaceutical Sciences (2017) | 20 | 74,254 |
| Physical and Earth Sciences | 1. Revista Brasileira de Meteorologia (2017); 2. Brazilian Journal of Oceanography (2017); 3. Boletim de Ciências Geodésicas (2017) | 20 | 82,440 |
| Engineering | 1. Journal of Aerospace Technology and Management (2017), (2018); 2. Journal of Microwaves, Optoelectronics and Electromagnetic Applications (2017); 3. Latin American Journal of Solids and Structures (2017); 4. Revista IBRACON de Estruturas e Materiais (2017) | 20 | 109,236 |
| Humanities & Applied Social Sciences | 1. Ambiente and Sociedade (2017); 2. Brazilian Journal of Political Economy (2017); 3. Cadernos Pagu (2017) | 40 | 294,882 |
| Languages, Linguistics and Arts | 1. Alfa: Revista de Linguística (2017); 2. Revista Brasileira de Estudos da Presença (2017); 3. Ilha do Desterro (2017) | 20 | 164,263 |
| TOTAL | | | 906,035 |

Table 1: Journal articles mined from SciELO for BrACE (Tavares Pinto et al., 2021)

BrACE is a multidisciplinary corpus of 906,035 running words drawn from 160 research articles written in English and published in Brazilian peer-reviewed journals. The articles were selected from open-access journals available from the Scientific Electronic Library Online (SciELO), a platform that hosts articles in English, Portuguese and Spanish.
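As a quick consistency check on these totals, the per-area figures in Table 1 can be summed. The sketch below simply transcribes the table; the dictionary and function names are illustrative and are not part of BrACE or Sketch Engine.

```python
# Figures transcribed from Table 1: area -> (number of papers, running words).
# The variable and function names are illustrative assumptions.
TABLE_1 = {
    "Agricultural Sciences": (20, 88_740),
    "Biological Sciences": (20, 92_220),
    "Health Sciences": (20, 74_254),
    "Physical and Earth Sciences": (20, 82_440),
    "Engineering": (20, 109_236),
    "Humanities & Applied Social Sciences": (40, 294_882),
    "Languages, Linguistics and Arts": (20, 164_263),
}


def corpus_totals(table):
    """Return (total papers, total running words) summed over all areas."""
    papers = sum(n_papers for n_papers, _ in table.values())
    words = sum(n_words for _, n_words in table.values())
    return papers, words


if __name__ == "__main__":
    papers, words = corpus_totals(TABLE_1)
    print(f"{papers} papers, {words:,} running words")
```

The sums agree with the 160 articles and 906,035 running words reported for the corpus.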

When choosing the articles, we selected those from journals that had the highest quality scores in their areas and that published in English, since not all journals had papers in this language. When subareas overlapped, as in Agronomy and Biology, we assigned the journal to the area in which it had the higher score in the Brazilian Qualis ranking.

The number of tokens in each area differed because the accepted length of articles depends on the area. After downloading the articles into separate folders, we saved them with filenames indicating the area and file number, such as SOC1 and SOC2 for Applied Social Studies. We then uploaded the articles to Sketch Engine to analyse the corpus.

As previously mentioned, we are now updating the corpus, since articles from 2019, 2020 and 2021 are still to be analysed. The results presented here therefore refer to the years 2017 and 2018.

2.2. The tools for the SDGs Analysis

We chose three tools from the Sketch Engine platform (KILGARRIFF et al., 2004) that had been used in our previous studies: Keywords, WordSketch and Concordance. The Keywords tool “compares corpora and identifies what is unique or typical” (SKETCH ENGINE). The study corpus is compared with a larger reference corpus in order to identify the words that are statistically more salient in the study corpus. WordSketch is a “one-page summary of a word’s grammatical and collocational behaviour” (KILGARRIFF et al., 2014, p. 9). Concordance shows every occurrence of a search word in the corpus with its co-text, that is, one line per occurrence with the search word highlighted. Researchers who want a larger context around the search word can expand each line.
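To make the idea of comparing a study corpus with a reference corpus more concrete, the sketch below computes a smoothed frequency ratio in the spirit of the “simple maths” keyness score documented for Sketch Engine. It is an illustration only: the function names and the example counts are hypothetical, and the platform's own computation may differ in its details.

```python
def per_million(count, corpus_size):
    """Normalise a raw frequency to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def keyness(focus_count, focus_size, ref_count, ref_size, smoothing=1.0):
    """Smoothed ratio of normalised frequencies (focus vs. reference corpus).

    A score well above 1 suggests the word is more typical of the focus
    corpus. The smoothing constant avoids division by zero and damps the
    effect of very rare words.
    """
    focus_fpm = per_million(focus_count, focus_size)
    ref_fpm = per_million(ref_count, ref_size)
    return (focus_fpm + smoothing) / (ref_fpm + smoothing)


if __name__ == "__main__":
    # Hypothetical counts, not taken from BrACE or any real reference corpus.
    score = keyness(focus_count=50, focus_size=1_000_000,
                    ref_count=1_000, ref_size=100_000_000)
    print(f"keyness score: {score:.2f}")
```

Raising the smoothing value shifts the ranking towards more frequent words, which is one reason keyword lists from the same corpora can look different under different settings.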

In the next section we will show examples of the analysis with each tool.

3. Analysis

In this analysis we discuss the searches for “sustainability” and “poverty” with each tool in order to observe how each theme has been discussed by the researchers represented in the corpus.

3.1. Preliminary analysis of “Sustainability” in BrACE

Sketch Engine returned 117 hits for the word “sustainability”. WordSketch shows how this word combines with other nouns, prepositions, adverbs and modifiers, as we see in the list below:

Table 2: WordSketch results for “sustainability” in BrACE

We can see that the theme has been discussed from a geographical perspective, since we find terms such as “regional sustainability”, “coastal sustainability” and “urban sustainability”. We also find “social sustainability” and “ecological sustainability”, which point to a discussion based on social status and Ecology.

Our attention was drawn to the definition of sustainability as a “cure”, as a “phenomenon” and as a “concept”, which shows different perspectives on this noun.

For a broader view of this search word in different positions, such as subject, object or modifier, WordSketch generates an image showing all combinations of the search word, as we see next:

Figure 1: WordSketch of “sustainability” in BrACE

The image shows sustainability as an environmental issue, as we could expect. It also shows the possibility of rethinking and inserting sustainability, since the search word is the object of “rethink” and “insert”, and the combination “sky-rocketted sustainability”, which suggests a rapid increase and could be a positive sign.

In order to know more about the context in which the search word is being discussed, we can generate concordance lines, as we see in Fig. 2:

Figure 2: Concordance lines with the search word “sustainability” in BrACE

The concordance lines pointed to the following topics: “sustainability in viticulture”, “sustainability at non-timber forest product harvest”, “sustainability and cost saving”, “sustainability in decentralized proposals” and “sustainability into the international agenda”. All these segments are related to the SDGs that concern the environment, which means that Brazilian researchers published papers on this theme in the articles included in BrACE.
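The logic behind a concordance display can be sketched in a few lines: each occurrence of the search word is shown with a fixed window of co-text on either side (a keyword-in-context, or KWIC, view). The function below is a minimal illustration and not Sketch Engine's implementation; the tokenizer is deliberately naive and the sample sentences are invented for the example.

```python
import re

# Naive tokenizer: words (optionally hyphenated) or single punctuation marks.
TOKEN = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def kwic(text, node, width=5):
    """Return (left co-text, node, right co-text) for each hit of `node`."""
    tokens = TOKEN.findall(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Invented sample text, used only to exercise the function.
SAMPLE = ("Producers discuss sustainability in viticulture. "
          "Sustainability and cost saving matter.")

if __name__ == "__main__":
    for left, node, right in kwic(SAMPLE, "sustainability", width=3):
        print(f"{left:>30} | {node} | {right}")
```

Widening the `width` parameter corresponds to expanding a concordance line to see more of its context.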

To see a larger context and learn more about the theme, we can generate n-grams of three or four words and click on the three dots on the right to obtain concordance lines, which can then be expanded, as we see in Figure 3:

Figure 3: Expanded concordance lines with the search word “sustainability” in BrACE

With a longer stretch of text, we can see that the author was discussing the biogeographical model of “islands of diversity”, which are protected natural spaces related to environmental policy strategies. By searching for keywords from the SDGs, we can learn how Brazilian researchers were dealing with the theme and how they used academic language, which is also the focus of this analysis.
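The n-gram step mentioned above can be illustrated with a short sketch that counts every sequence of n consecutive tokens containing the search word. The function name and the sample tokens are hypothetical, and the sketch ignores options that a full tool would offer, such as minimum frequency thresholds.

```python
from collections import Counter


def ngrams_with(tokens, node, n=3):
    """Count the n-grams (as lower-cased tuples) that contain `node`."""
    lowered = [t.lower() for t in tokens]
    node = node.lower()
    counts = Counter()
    for i in range(len(lowered) - n + 1):
        gram = tuple(lowered[i:i + n])
        if node in gram:
            counts[gram] += 1
    return counts


if __name__ == "__main__":
    # Invented token sequence, used only to exercise the function.
    tokens = ("sustainability in viticulture and "
              "sustainability in viticulture again").split()
    for gram, freq in ngrams_with(tokens, "sustainability").most_common():
        print(freq, " ".join(gram))
```

Frequent n-grams point to recurring phrasings, which can then be inspected in their concordance lines.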

Although the term is present, only two texts in BrACE discuss “sustainable development”. The reason is that the corpus contains 20 papers from each SciELO area, so the discussion on this matter is limited to those texts. We believe more papers regarding the SDGs have been published since 2018. We will therefore enlarge BrACE to have a better look at how the SDGs have been dealt with in Brazilian research.

3.2. Preliminary analysis of “Poverty” in BrACE

Another keyword we wanted to look up was “poverty”, which returned only 21 hits in BrACE. After generating a WordSketch for this word, we could see some of its co-occurrences, as shown in Table 3:

Table 3: WordSketch results for “poverty” in BrACE

We see that some articles related poverty to topics such as “race” and “health promotion”, and discussed poverty with regard to “rural” and “urban” areas. One term that caught our attention was “commodifying poverty”, which is exactly the opposite of what the SDGs suggest, so we believe there might be a relevant discussion on the topic in the studies of Brazilian researchers. It is important to point out that, although the WordSketch lists show relevant information, the researcher must read them carefully. In Table 3, for example, “insecurity” cannot be considered a modifier of “poverty”: it is simply a word that was listed before it, with a comma between them.
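The caveat about “insecurity” can be illustrated with a simple adjacency check that separates words standing directly before the search word from words that precede it across a punctuation mark. This is only a heuristic sketch with invented sample text. It does not reproduce the grammar rules behind WordSketch, and direct adjacency alone does not prove that a word is a modifier.

```python
import re

WORD = re.compile(r"\w+(?:-\w+)*")
TOKEN = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def left_neighbours(text, node):
    """Split the words to the left of `node` into two lists:
    directly adjacent words and words separated from it by punctuation."""
    tokens = TOKEN.findall(text.lower())
    node = node.lower()
    adjacent, separated = [], []
    for i, tok in enumerate(tokens):
        if tok != node or i == 0:
            continue
        previous = tokens[i - 1]
        if WORD.fullmatch(previous):
            adjacent.append(previous)
        elif i >= 2 and WORD.fullmatch(tokens[i - 2]):
            # A punctuation mark stands between the two words.
            separated.append(tokens[i - 2])
    return adjacent, separated


if __name__ == "__main__":
    # Invented sentences, used only to exercise the function.
    sample = ("They face food insecurity, poverty and exclusion. "
              "Urban poverty persists.")
    adjacent, separated = left_neighbours(sample, "poverty")
    print("adjacent:", adjacent)
    print("separated by punctuation:", separated)
```

Items in the second list are candidates for manual inspection in the concordance lines before they are reported as collocates.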

As we did with “sustainability”, we wanted to see the search word “poverty” in different positions in the word cloud image, as shown in Fig. 4:

Figure 4: WordSketch of “poverty” in BrACE

Although there are few positions and combinations with the noun “poverty”, the image is useful to show in which areas and how it has been discussed, as in “human poverty” and in the verbs “reduce” and “fight” used with “poverty”, which show that there has been some discussion regarding poverty in different contexts. “Insecurity” also appears in connection with “poverty” and, although few terms are related to the search word, we do get a general idea of the topics related to the theme.

In the same way, we can access the concordance lines to learn more about the context in which the word has been discussed, as we see in Fig. 5:

Figure 5: Concordance lines with the search word “poverty” in BrACE

The lines show us that sometimes what people believe to be environmentalism is, in reality, “poverty and social marginalization”. The authors also question policies to reduce poverty and discuss the difference between specific communities such as “quilombolas”, where there is still poverty, and “ecovillages” in the Global South, where they fight against poverty.

Finally, if the researcher wants to know more about the topic with a larger context, the lines can be expanded as we did in the previous section. We can see the discussion of “urban poverty” in Figure 6:

Figure 6: Expanded concordance lines with the search word “poverty” in BrACE

By reading the context, we observe a discussion related to economic liberalism and to how industrial capitalism was “characterized by massive urban poverty and social dislocation”. The author is probably providing background for the reader before presenting the research data.

In this section we showed how some of the tools of corpus linguistics can be used to analyse the themes related to the SDGs by carrying out a bottom-up interpretation, in which we start from a keyword or theme and analyse its co-occurrences, co-text and context, as well as concordance lines that bring other views of the same search word.
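To compare the two search words on a common scale, the raw hit counts reported in Sections 3.1 and 3.2 can be normalised by corpus size. The sketch below divides each count by the 906,035 running words of BrACE. This is an approximation, since Sketch Engine may normalise by tokens rather than words, and the names used are illustrative.

```python
CORPUS_SIZE = 906_035  # running words in BrACE (Table 1)
HITS = {"sustainability": 117, "poverty": 21}  # hits reported in Section 3


def per_million(hits, corpus_size):
    """Normalise a raw hit count to hits per million running words."""
    return hits * 1_000_000 / corpus_size


if __name__ == "__main__":
    for word, hits in HITS.items():
        rate = per_million(hits, CORPUS_SIZE)
        print(f"{word}: {hits} hits, about {rate:.1f} per million words")
```

Under this assumption, “sustainability” occurs roughly 129 times per million words and “poverty” roughly 23 times, a ratio of about 5.6 to 1 that matches the ratio of the raw counts.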

4. Final Remarks

In this paper we presented the themes related to the United Nations 2030 Agenda, whose intention is to find solutions to major global problems. Among those themes we can cite “sustainability” and “poverty”. We used the Brazilian Academic Corpus of English (BrACE), which is a medium-size corpus, to start an exploratory study of how the SDGs have been discussed in the research papers of Brazilian authors. Although there are only 117 hits for “sustainability” and even fewer for “poverty” (only 21 hits), the corpus linguistics tools were very useful in retrieving information that showed us how these two themes have been discussed in Brazil. We learned that sustainability has been presented as a “concept”, as a “phenomenon” and even as a “cure”. The theme is also discussed from a geographical point of view, since the authors mention “coastal” and “sub-regional” sustainability. At the same time, we see verbs such as “insert” and “rethink” related to the topic, which show a positive intention towards it. We do not see the same enthusiasm in relation to the theme “poverty”. However, we do see an interesting discussion of the differences between groups such as “quilombolas” and “ecovillages”, which shows that this theme is being described so that it can be addressed. We also see the verb “fight” used with “poverty”, which shows that some action is being taken against it.

In this study, we explored a medium-size corpus of research papers published in the SciELO open-access library. We used three tools that are part of the Sketch Engine platform: Keywords, WordSketch and Concordance. By using these tools, we proposed steps that researchers can take to find out if and how the SDG themes have been discussed in the studies included in the BrACE corpus.

Finally, we need to point out that it is not our intention to offer a broad view of the SDGs in Brazil since, as we have mentioned, our study corpus is still too small to do so. We have been working on updating the corpus with new articles so that we can have a better understanding of the themes discussed. We will also be able to observe which journals have discussed the SDGs the most and which ones do not address the theme in their published volumes. Our aim here was to propose methodological steps in a preliminary study to evaluate whether a bottom-up approach would work for future investigations.

5. Acknowledgments

The author would like to acknowledge research funding from the São Paulo Research Foundation (FAPESP), grant #2016/25198-6.


BERBER SARDINHA, Tony. Linguística de Corpus. Barueri, SP: Editora Manole Ltda, 2004.

KILGARRIFF, A., et al. The Sketch Engine. In: EURALEX, 2004. Proceedings of Euralex. Lorient: France, 2004. p.105–116.

KILGARRIFF, A.; BAISA, V.; BUŠTA, J.; JAKUBÍČEK, M.; KOVÁŘ, V.; MICHELFEIT, J.; RYCHLÝ, P.; SUCHOMEL, V. The Sketch Engine: ten years on. Lexicography: Journal of ASIALEX, [S. l.], v. 1, n. 1, p. 7–36, 2014. DOI: 10.1007/s40607-014-0009-9. Date accessed: 15 July 2021.

KUHN, Tanara Zingano. A design proposal of an online corpus-driven dictionary of Portuguese for University Students. 2017. (Doctoral Thesis) – University of Lisbon, Lisbon, 2017.

NESSELHAUF, Nadja. Collocations in a learner corpus. John Benjamins Publishing, 2005.

PINTO, Paula Tavares et al. Analysing the behaviour of academic collocations in a corpus of research-papers: a data-driven study / Analisando o comportamento de colocações acadêmicas em um corpus de artigos científicos: um estudo dirigido por dados. Revista de Estudos da Linguagem, [S.l.], v. 29, n. 2, p. 1229-1252, mar. 2021. ISSN 2237-2083. Date accessed: 16 Sep. 2021.

SciELO (n.d.)

SINCLAIR, J. Corpus, Concordance, Collocation. Oxford: Oxford University Press, 1991.



VIANA, Vander; TAGNIN, Stella E. O. Corpora no ensino de línguas estrangeiras. Hub Editorial, 2011.

TAVARES PINTO, Paula; REES, Geraint; FRANKENBERG-GARCIA, Ana. “Identifying collocation issues in English L2 research article writing”. In: CHARLES, Maggie; FRANKENBERG-GARCIA, Ana. Corpora in ESP/EAP Writing Instruction: Preparation, Exploitation, Analysis. 01ed. London: Routledge, 2021, p. 01-20.