Journal articles: 'Literary corpora'

1

Herrera, J. P., and P. A. Pury. "Statistical keyword detection in literary corpora." European Physical Journal B 63, no. 1 (May 2008): 135–46. http://dx.doi.org/10.1140/epjb/e2008-00206-x.

2

Pavan, Luca. "Comparing Lexicons Diachronically in Italian Literary Corpora." International Journal of Linguistics, Literature and Translation 3, no. 8 (August 30, 2021): 85–89. http://dx.doi.org/10.32996/ijllt.2021.4.8.13.

Abstract:

The goal of the article is to compare several words from the Florentine vernacular and modern Italian, using software written by the author. The paper focuses on two corpora: the first is a selection of Florentine vernacular literature, and the second is a group of literary books written in modern Italian from the end of the 19th century to the present. The article demonstrates how some features of the software can be used to compare the two corpora, ranking the lexicographic entries by different strategies. The lexicon can be analysed under different sort orders using only three parameters: the word frequency, the frequency as a percentage of the number of words in the corpus, and the percentage of texts in the corpus in which the word is found. From these parameters a fourth arises: the level of persistence of words in each corpus. The software makes it possible to observe differences in the use of the lexicon across historical periods, comparing the Florentine vernacular, which was used in the Italian peninsula until the beginning of the 19th century, with modern Italian.
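The three parameters named in this abstract can be illustrated with a minimal sketch. This is not the author's software: the function name `lexicon_stats`, the whitespace tokenization, and the toy corpus are all assumptions made for illustration, and the fourth parameter (persistence) is not sketched because the abstract does not define it.

```python
from collections import Counter

def lexicon_stats(texts):
    """Return {word: (frequency, pct_of_tokens, pct_of_texts)} for a corpus.

    `texts` is a list of strings, one per text in the corpus.
    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    tokenized = [t.lower().split() for t in texts]
    # Raw frequency of each word across the whole corpus.
    freq = Counter(w for doc in tokenized for w in doc)
    # Number of texts in which each word appears at least once.
    doc_freq = Counter(w for doc in tokenized for w in set(doc))
    total_tokens = sum(freq.values())
    n_texts = len(tokenized)
    return {
        w: (freq[w],
            100.0 * freq[w] / total_tokens,
            100.0 * doc_freq[w] / n_texts)
        for w in freq
    }

# Hypothetical two-text corpus, for illustration only.
corpus = ["la donna e la casa", "la casa nuova"]
stats = lexicon_stats(corpus)
```

Sorting the resulting entries by any one of the three tuple fields gives one of the ranking strategies the abstract mentions.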

3

Změlík, Richard. "The Use of Authorial Corpora Beyond Linguistics." Journal of Linguistics/Jazykovedný casopis 68, no. 2 (December 1, 2017): 404–14. http://dx.doi.org/10.1515/jazcas-2017-0050.

Abstract:

The study concentrates on the issue of quantitative and qualitative methods within the context of literary theory. Specifically, it aims to present the concept of a literary corpus of Czech prose and to define the main parameters of the corpus. Besides the project of a specialized corpus, primarily intended for use in the field of literary theory, the study deals with current stochastic and corpus methods applied by foreign scholars in the analysis of literary prose texts. The study tries to situate the original project of a Czech prose literary corpus in this contemporary context, which represents one form of a recently flourishing discipline called Digital Humanities (Digital Literary Studies).

4

Bystrova-Mcintyre, Tatyana. "Looking at the overlooked: A corpora study of punctuation use in Russian and English." Translation and Interpreting Studies 2, no. 1 (January 1, 2007): 137–62. http://dx.doi.org/10.1075/tis.2.1.04bys.

Abstract:

This study was designed to analyze the comparative use of punctuation marks in Russian and English newspaper editorials. The study was conducted using corpora of English-language editorials, taken from the New York Times in 2005, and of Russian-language editorials, taken from Izvestiia in the same year. Results indicated that the comma, colon, and the em-dash were used more often in the Russian corpus. The difference was determined to be statistically significant. The author then compared these results to the results of punctuation use in corpora of Russian and English literary texts. Again these punctuation marks were used more frequently in the Russian literary corpus than in the English one. At the same time, in both the Russian and English literary corpora these marks were used much more frequently than in the corpora of Russian and English editorials. In the second part of the article, the author attempts to isolate the reasons for the discrepancy in use of the colon by examining rules for its use as elaborated in authoritative Russian and English style guides. On the basis of this, the author suggests guidelines for the translation of the colon into English.

5

MONTEMURRO, MARCELO A., and PEDRO A. PURY. "LONG-RANGE FRACTAL CORRELATIONS IN LITERARY CORPORA." Fractals 10, no. 04 (December 2002): 451–61. http://dx.doi.org/10.1142/s0218348x02001257.

Abstract:

In this paper, we analyze the fractal structure of long human language records by mapping large samples of texts onto time series. The particular mapping set up in this work is linguistically motivated in the sense that it retains the word as the fundamental unit of communication. The results confirm that beyond the short-range correlations resulting from syntactic rules acting at sentence level, long-range structures emerge in large written language samples that give rise to long-range correlations in the use of words.

6

Bhan, Jaemi, Sowoon Kim, Jongkwang Kim, Younghun Kwon, Sung-il Yang, and Kunsang Lee. "Long-range correlations in Korean literary corpora." Chaos, Solitons & Fractals 29, no. 1 (July 2006): 69–81. http://dx.doi.org/10.1016/j.chaos.2005.08.214.

7

Gemeinböck, Iris. "Representativeness in Corpora of Literary Texts: Introducing the C18P Project." Matlit Revista do Programa de Doutoramento em Materialidades da Literatura 4, no. 2 (July 11, 2016): 29–48. http://dx.doi.org/10.14195/2182-8830_4-2_2.

Abstract:

Currently there are very few specialised corpora of literary texts that are tailored to the needs of literary critics who are interested in corpus stylistic analyses of prose fiction. Many existing corpora including literary texts were compiled for linguistic research interests and are often unsuitable for corpus stylistic purposes. The paper addresses three of the main problems: the absence of labelling of the texts for literary genre, the use of extracts, and the prevalence of linguistic periodisation schemes. C18P is a corpus of prose fiction designed specifically to address these issues. It traces the early development of the novel from 1700 up until the Victorian era. It can, for instance, be used for an analysis of the characteristic linguistic features of individual literary genres and forms. The following paper introduces the design of the corpus as well as some of its potential uses.

8

Crisostomo, C. Jay. "Writing Sumerian, Creating Texts: Reflections on Text-building Practices in Old Babylonian Schools." Journal of Ancient Near Eastern Religions 15, no. 2 (March 18, 2016): 121–42. http://dx.doi.org/10.1163/15692124-12341271.

Abstract:

Sumerian lexical and literary compositions both emerged from the same social sphere, namely scribal education. The complexities of inter-compositional dependence in these two corpora have not been thoroughly explored, particularly as relevant to questions of text-building during the Old Babylonian period (c. 1800–1600 bce). Copying practices evident in lexical texts indicate that students and scholars adopted various methods of replication, including visual copying, copying from memory, and ad hoc innovation. They were not confined to reproducing a received text. Such practices extend to copying literary compositions. A study of compositions from Advanced Lexical Education in comparison with several literary compositions shows a complex inter-dialectic between the corpora, in which lexical compositions demonstrate dependence on literary compositions and vice versa. Thus, Old Babylonian students and scholars could experiment with multiple text-building practices, drawing on their knowledge of the lexical and the literary, regularly creating new versions of familiar compositions.

9

O’Connell, Daniel C., Sabine Kowal, and Scott P. King. "Interjections in literary readings and artistic performance." Pragmatics. Quarterly Publication of the International Pragmatics Association (IPrA) 17, no. 3 (September 1, 2007): 417–38. http://dx.doi.org/10.1075/prag.17.3.04con.

Abstract:

Numerosity and privileges of occurrence of various types of interjections (primary conventional, primary non-conventional, secondary, and onomatopoeic) were investigated in three different literary readings of Winnie-the-Pooh (Milne 1926), in one reading of Ulysses (Joyce 1960), and in an artistic performance by actors (the film The third man, Korda, Selznik, & Reed 1949). The spoken corpora, based on printed texts as source, consisted of 667 interjections. Ameka’s (1992b, 1994) hypothesis that, parallel to their independence from ambient grammar, interjections would also be isolated temporally by preceding and following pauses, was not confirmed; for the entire corpus, only 39% of all interjections were thus isolated. However, an alternative hypothesis, that interjections serve an initializing function, was confirmed: altogether, 77% of the interjections were found to be initializing, i.e., were preceded by a pause, introduced a speaking turn, introduced an utterance, and/or introduced a citation. Primary conventional interjections constituted the majority of interjections (overall 56%), but only two of these were common to all the corpora (oh and ah). By far the highest percentage (28%) of primary non-conventional interjections occurred in the artistic performance of The third man. None of these occurred in either the novel or the screenplay of The third man, unlike the primary non-conventional interjections throughout the text of the literary readings. Functions of interjections are discussed in terms of Goffman’s (1981: 226) animators (literary readers, 26% of whose spoken interjections were added to those in the printed text) and principals (actors, 79% of whose spoken interjections were added to those in the printed text), in terms of literacy and orality, and in terms of the emotional stance and perspective of a speaker at the very moment of utterance.

10

Meldrum, Yukari Fukuchi. "Translationese in Japanese Literary Translation." TTR 22, no. 1 (October 21, 2010): 93–118. http://dx.doi.org/10.7202/044783ar.

Abstract:

Translationese in Japanese, despite its distinct characteristics when compared to natural Japanese, has so far been systematically studied by only one researcher (Furuno, 2005). In addition to this general lack of scholarly interest, the translational situations in Japan are not well-known in the West. In this paper, the notions of translationese in Japan are investigated from the perspective of Translation Studies and of Kokugogaku (studies of Japanese language). In addition, this study provides reasons for conducting systematic studies of translationese in Japan, where Translation Studies is still in its initial stages. Finally, the results of a preliminary examination of small comparable corpora using a translation and a non-translation are presented.

11

Dobson, Teresa M., Marlene Asselin, and Alemu Abebe. "Considerations for Design and Production of Digital Books for Early Literacy in Ethiopia." Language and Literacy 20, no. 3 (July 19, 2018): 134–58. http://dx.doi.org/10.20360/langandlit29414.

Abstract:

This paper considers the implications of digital text production models for the development of reading materials for emergent and early readers in the Ethiopian context. We draw from several theoretical frameworks and also from comments of Ethiopian academics, writers, and publishers to ground descriptions of Ethiopian contexts of language and literacy. We then present three different models for the production and curation of digital stories for children and contemplate how these models align with existing literacy traditions and practices. We also raise questions about the potential effects on the development of literary culture and children’s literature in Ethiopia of projects aimed at rapidly producing large corpora of literature for children. Ultimately, we pose complicated cultural and linguistic questions that need to be taken into consideration to provide appropriate and original early literacy materials in Ethiopia.

12

Novakova, Iva, and Marion Gymnich. "Extended phraseological units and literary genres." Varia, no. 28 (July 1, 2021): 87–112. http://dx.doi.org/10.54563/lexique.624.

Abstract:

The present paper is based on the assumption that the language of the novel as well as that of its various subgenres is characterized by a statistically relevant overrepresentation of certain linguistic phenomena (e.g., lexemes, key words, collocations and colligations, Siepmann, 2015, 2016). Applying state-of-the-art lexicometric methods to extract recurring polylexical units in two large corpora of contemporary French and English novels, we explore the role of phraseological motifs in distinguishing literary subgenres. Unlike traditional corpus-stylistic analyses, which frequently focus on the style of a single author, our corpus-driven approach identifies features of literary (prose) genres on the basis of automatically extracted lexico-syntactic constructions (LSCs) that are statistically specific to a certain subgenre.

13

Nurieva, Fanuza Shakurovna, Gulshat Raisovna Galiullina, and Airat Faikovich Yusupov. "Research Perspectives on the Tatar language based on the LingvoDoc platform." Proceedings of the Institute for System Programming of the RAS 34, no. 6 (2022): 173–78. http://dx.doi.org/10.15514/ispras-2022-34(6)-13.

Abstract:

The article discusses research perspectives on the Tatar language based on the LingvoDoc platform. Digitalization of language study in modern linguistics allows us to move to a new level of describing language structure. Large corpora containing millions of word forms have been created for all European languages since the 1990s. This has now been done not only for Russian but also for many national languages of Russia, such as Tatar, Bashkir, Udmurt, Mari, Moksha, Komi, etc. One of the recognized platforms in modern national linguistics is the LingvoDoc virtual laboratory, created by ISP RAS. This platform makes it possible to create, store and analyze multilayer dictionaries, language materials and dialects. The main functionality of LingvoDoc is used by more than 250 linguists who process their materials online; more than 1000 dictionaries and 300 text corpora in the national languages of the Russian Federation have already been collected. We consider the possibilities of this platform for studying the Tatar language. We believe that electronic corpora allow us to solve a variety of theoretical and practical problems of the language. At present, when the Tatar literary and everyday spoken language is actively used in all fields, it is very important to make a complete description of its features, which will help create more accurate grammars and dictionaries. The relevance of the study is due to the need to use a glossed corpus of texts in the Tatar language. As modern studies in linguistics show, without such corpora it is now impossible to describe the state of the language or analyze its grammatical structure in a way that meets the world standards of modern science.
The LingvoDoc platform makes it possible to process a significant amount of material in a short time and create corpora with glossing and removed homonymy based on samples of the Tatar literary, business, colloquial and dialect languages.

14

Lin, Yan, Hazlina Abdul Halim, and Farhana Muslim Mohd Jalis. "Building a Parallel Corpus for Chinese Folk Songs Translation Studies: A Case Study of Northern Shaanxi and Hua’er Folk Songs." Theory and Practice in Language Studies 14, no. 2 (February 1, 2024): 454–68. http://dx.doi.org/10.17507/tpls.1402.17.

Abstract:

Folk songs, collaboratively created by the public and transmitted orally, have gained widespread popularity. The translation of folk songs primarily centers on lyrics translation, a subset of literary translation. Recent advancements in corpus technology have highlighted the significance of corpus-based research approaches for the analysis of literary translation. The corpus method, now employed as a hybrid research approach, enables the generation of quantitative data for descriptive translation studies. Scholars are increasingly using parallel corpora containing both the source text (ST) and the target text (TT) to explore translation universals across diverse texts. Despite the growing body of literature on the translation of Chinese folk songs, most studies have involved straightforward analyses of a limited number of translated texts without the utilization of quantitative approaches. This article aims to bridge this gap by presenting a prospective study on the creation of the Chinese-English Parallel Corpus of Northern Shaanxi and Hua’er Folk Songs (CEPCNSHFS). The study covers essential aspects such as sampling, corpus structure, corpora selection, and corpora processing. Moreover, to assess the practical utility of the CEPCNSHFS, a pilot study was conducted. The primary contributions of this article reside in the potential of the CEPCNSHFS to support diverse research topics, including the exploration of translation language characteristics, styles, and methods employed in translating Northern Shaanxi and Hua’er folk songs, both of which hold significant positions within Chinese folk song traditions.

15

MONTEMURRO, MARCELO A., and DAMIÁN H. ZANETTE. "ENTROPIC ANALYSIS OF THE ROLE OF WORDS IN LITERARY TEXTS." Advances in Complex Systems 05, no. 01 (March 2002): 7–17. http://dx.doi.org/10.1142/s0219525902000493.

Abstract:

Beyond the local constraints imposed by grammar, words concatenated in long sequences carrying a complex message show statistical regularities that may reflect their linguistic role in the message. In this paper, we perform a systematic statistical analysis of the use of words in literary English corpora. We show that there is a quantitative relation between the role of content words in literary English and the Shannon information entropy defined over an appropriate probability distribution. Without assuming any previous knowledge about the syntactic structure of language, we are able to cluster certain groups of words according to their specific role in the text.

16

Lalić, Ana. "On the Sources of Spoken (Italian) Language in a Historical Perspective." Društvene i humanističke studije (Online) 8, no. 3(24) (December 31, 2023): 135–50. http://dx.doi.org/10.51558/2490-3647.2023.8.3.135.

Abstract:

In this paper, we examine the importance of corpora in historical pragmatics. Since the origins of historical pragmatics as a discipline, the question of choosing an appropriate corpus has been at the center of attention. The problems addressed by historical pragmatics demand a corpus that faithfully reflects the spoken language. However, no such recorded sources exist. Researchers therefore face difficulties when choosing corpora that must be written but also reflect speech. For this reason, we review frequently used corpora and present their advantages and disadvantages. We apply the theory of Giovanni Nencioni (1976), which distinguishes between spoken language in the true meaning of the word (parlato parlato), written spoken language (parlato scritto), and recited spoken language (parlato recitato). Based on his research, we can claim that some of the privileged sources in historical pragmatics are literary works, theater pieces, religious scriptures, and speech transcripts. We thus examine the advantages and disadvantages of the aforementioned corpora and aim to determine whether they are suitable for use in historical pragmatics. The research shows that it is impossible to find a completely adequate corpus, because none of them reflects the spoken language in all of its characteristics; each approaches it only partially.

17

Orrequia-Barea, Aroa, and Cristian Marín-Honor. "Building a parallel corpus of literary texts featuring onomatopoeias: ONPACOR." Research in Corpus Linguistics 8, no. 2 (2020): 46–62. http://dx.doi.org/10.32714/ricl.08.02.03.

Abstract:

Onomatopoeias constitute a much neglected subject in linguistics. The rather scarce literature on onomatopoeias is derived from a lack of reliable empirical data on the topic. In order to bridge this gap, we have compiled a parallel corpus of literary texts featuring onomatopoeias: the Onomatopoeia Parallel Corpus (ONPACOR). The corpus consists of onomatopoeias in English, Spanish and French extracted from comics and representative corpora of each language. ONPACOR has been built on the basis of existing translations to the languages of reference. This article describes the methodology used to compile the corpus, as well as the applications that it can have.

18

Mischke, Dennis, and Christopher Ohge. "Digital Melville and Computational Methods in Literary Studies." Leviathan 25, no. 2 (June 2023): 35–60. http://dx.doi.org/10.1353/lvn.2023.a904374.

Abstract:

The use of computational tools for the study of literature can facilitate new perspectives and avenues for critical work. Reflecting on the recent emergence of (and increasing hype around) large language models such as GPT, this essay argues that the creation of "smart" data sets and corpora as new forms of literary objects requires and enables the development of computational methods and tools that can create "data stories". Smart data sets continue a humanistic tradition of textual scholarship and bibliography while also preparing text data to be "read" by machines. In telling "data stories" about Herman Melville, we bridge the gap from "numbers to meaning" in a variety of examples from Billy Budd. The essay closes with a broader reflection on reading Melville in the age of machine learning and artificial intelligence.

19

Leonardi, Letizia. "Literature in and through Translation: Literary Translation as a Pedagogical Resource." International Journal of Linguistics, Literature and Translation 7, no. 3 (March 15, 2024): 93–102. http://dx.doi.org/10.32996/ijllt.2024.7.3.11.

Abstract:

This article is the revised version of the paper that I presented at the 5th APTIS (Association of Programmes in Translation and Interpreting Studies) 2023 conference (“The teaching and learning that matter today”), whose proceedings were never published. As a result of globalisation, the number of books requiring translation considerably increased. Nevertheless, readers do not always acknowledge translations as such, and literary translators do not generally obtain the recognition they deserve. Academia may be partly responsible for that: on the one side, indeed, literary translation is not as discussed as other topics within the broader field of Translation Studies; on the other, whilst teaching texts in translation is becoming increasingly common, translated literature is not generally considered as an academic discipline on its own. To promote a wider circulation and appreciation of translated literature in and beyond academia, translated literary texts could be systematically introduced into the curricula of courses in literature and literary translation. This could be achieved through the compilation and use of parallel corpora, namely collections of source texts and respective translations. In this light, this paper has two main objectives: explaining how courses in literature and literary translation could be taught using parallel corpora; showcasing the pedagogical advantages that such an approach may have on different levels. As for courses in literature it would provide students with an understanding of the mechanisms behind the production of literary translations and their relevance within the broader literary system. On what concerns courses in literary translation, it may represent a compromise between theory and practice, and between the research-orientated environment of academic settings and the commercially-orientated publishing industry. 
The study was conducted through the review of pedagogical practices and contexts where literary texts are taught in translation. The paper concludes with the observation that this corpus-based teaching approach may have some positive repercussions outside academia: it would not only contribute to a broader appreciation of translated literary texts among the general public but also foster a broader recognition of the role of the literary translator in shaping and constructing foreign literature.

20

Vázquez, Nila, Laura Esteban Segura, and Teresa Marqués Aguado. "A descriptive approach to computerised English historical corpora in the 21st century." International Journal of English Studies 11, no. 2 (December 1, 2011): 119. http://dx.doi.org/10.6018/ijes/2011/2/149671.

Abstract:

Historical corpora offer many potentialities for linguistic research. Thus, the present article provides an overview of the major English historical corpora compiled or being compiled both in Spain and abroad. They include different types such as tagged and parsed corpora, and their main features will be outlined. As for the organisation of the article, after the introductory section, the historical corpora created abroad will be presented. Then, those being constructed in Spain (Coruña, Las Palmas, Málaga, Salamanca, Santiago and Sevilla) will be discussed. Some final remarks and the references close the article.

21

Novakova, Iva. "Phraseological motifs for Distinguishing Between Literary Genres. A Case Study on the Motifs of Verbal and Non-Verbal Communication." Kalbotyra 74 (September 15, 2021): 160–81. http://dx.doi.org/10.15388/kalbotyra.2021.74.9.

Abstract:

The present paper is based on the assumption that the language of the novel is characterized by a statistically relevant overrepresentation of certain linguistic units (e.g. lexemes, key words, collocations and colligations, Siepmann 2015). First steps towards checking the validity of this hypothesis had been undertaken in pioneering works in the 1990s/2000s (e.g. Stubbs & Barth 2003). These studies were however limited by the small size of their (exclusively English) corpora. The present study explores the role of some patterns (phraseological motifs) in distinguishing French literary subgenres. It also proposes a case study of some motifs related to the verbal (dire avec sourire ‘to say with a smile’) and non-verbal communication (adresser un sourire ‘to send a smile’). Unlike traditional corpus-stylistic analyses, which frequently focus on the style of a single author, our corpus-driven approach identifies lexico-syntactic constructions in literary genres which are automatically extracted from the corpora. The main purpose is to show the relevance of the notion of phraseological motif (Legallois 2012; Longrée & Mellet 2013; Novakova & Siepmann 2020) for the distinction of literary subgenres. Linking form and meaning, these ‘multidimensional units’ fulfil pragmatic as well as discursive functions. The data has been extracted from large French corpora of the PhraseoRom research project https://phraseorom.univ-grenoble-alpes.fr. They are accessible on http://phraseotext.univ-grenoble-alpes.fr/phraseobase/index.html and contain 1000 novels (published from the 1950s to the present), partitioned into six sub-corpora: general literature (GEN), crime fiction (CRIM), romances (ROM), historical novels (HIST), science fiction (SF) and fantasy (FY). The results of our study reveal some unexpected differences between the literary subgenres: e.g. the motif dire d’une voix ‘to say in a voice’ in HIST compared to GEN.
In FY, expressions of verbal communication are related to shouting and screaming. Expressions related to the non-verbal communication (prendre dans ses bras ‘to take in one’s arms’) are specific to ROM, where body language is overrepresented. In SF, there is a very limited number of these types of expressions. More generally, the motifs provide the link between the micro level (phraseological recurrences) and the macro level (the fictional script).

22

Henc, Adriana. "Kanon historiograficzny w literaturoznawstwie polsko-ukraińskim XVIII–XX wieku." Studia Ukrainica Posnaniensia 10, no. 1 (November 15, 2022): 105–17. http://dx.doi.org/10.14746/sup.2022.10.1.06.

Abstract:

The article investigates the theoretical and practical principles of creating a canon in Polish and Ukrainian historical and literary syntheses in the context of European historiographical traditions. The core criteria of selection of historical and literary material are singled out, and taxonomic tendencies in the histories of literature from the 18th–20th centuries are considered. The main aspects of the canonization of texts in Ukrainian and Polish historiographical corpora are clarified, which are primarily reflected in the formal characteristics, criteria for selecting literary texts, the author’s interpretations, conceptual approaches and the methodological basis. The article articulates the main achievements of literary historians of the 18th–20th centuries, as well as comprehensively defining the methodology for creating a holistic and scientifically sound corpus of the history of Ukrainian and Polish literature.

23

Freidson, Olga Aleksandrovna, and Ekaterina Evgen'evna Verezubova. "Corpus methods in research and study/teaching of the French language." Litera, no. 2 (February 2023): 22–33. http://dx.doi.org/10.25136/2409-8698.2023.2.37471.

Abstract:

The aim of the work is to identify the possibilities and specifics of using corpus methods in conducting research on the material of the French language and in teaching French. The growing interest in the methods of corpus research based on specific language data and the insufficient development of the issue on the material of the French language determine the relevance of the work. The analysis has shown that today there are various resources for conducting corpus research on the material of the French language, including literary text corpora, parallel corpora, oral speech corpora, which create a specially organized multidimensional infrastructure of the language space, giving a comprehensive idea of language units, their compatibility, semantics and functions. The authors have demonstrated that the existing corpus managers can be successfully applied in teaching French at the initial level, from the very beginning forming important linguistic and methodological competencies among linguist students. The scientific novelty of the research consists in a comprehensive review of the existing French corpus resources and the possibilities of their use in research and in teaching French. The results of the study can be used both for further development of research in the field of history, grammar, lexicology, stylistics of the French language based on corpora, and for the development of tasks for teaching French using corpus data, which is of practical significance of the study.

24

Lindqvist, Christina. "Élaboration d’une liste de vocabulaire dans un contexte FLE." Studia Romanica Posnaniensia 49, no. 4 (January 9, 2023): 65–74. http://dx.doi.org/10.14746/strop.2022.494.004.

Abstract:

In this article, we discuss the elaboration and use of vocabulary lists aimed at learners of French as a foreign language. These lists are commonly based on corpora, which, in the ideal case, are representative, relevant and large. For the English language, corpora of this kind have been available for a long time (e.g. the COCA and the BNC). Vocabulary lists, which are often used in learning contexts, have been based on these corpora. The situation is, however, less favorable when it comes to the French language, with fewer corpora meeting the mentioned criteria, and, thus, fewer possibilities to create vocabulary lists that are useful for learners. In this contribution, we present work that has been done in order to create a vocabulary list, Riksprovsordlistan, containing about 4,000 words and used at all Swedish universities. The discussion focuses on methodological challenges such as choice of counting unit – lemma vs. word family –, the role of frequency, thematic vocabulary, as well as characteristics of written vs. spoken corpora.

25

Hasselgård, Hilde. "Attribution in novice academic writing." English Text Construction 14, no. 2 (December 31, 2021): 203–30. http://dx.doi.org/10.1075/etc.00047.has.

Abstract:

Academic attribution, the direct acknowledgement of external sources, is investigated in two corpora of novice academic English, representing first and second language writing in linguistics. The forms and uses of attribution are analysed in a formal-functional framework. There is an overall underrepresentation of attribution in the learner corpus. However, the corpora have a similar proportional distribution of integral and non-integral attribution, but a difference in subtypes of these. Undated attributions are discussed as a special case. They occur in specific contexts, of which reference to course reading is peculiar to novice writing. Comparisons with expert corpora in Norwegian and English indicate that some, but not all, of the differences between the novice corpora may be linked to influence from the learners’ first language and culture.

26

Bjerring-Hansen, Jens, and Sebastian Ørtoft Rasmussen. "Litteratursociologi og kvantitative litteraturstudier." Passage - Tidsskrift for litteratur og kritik 38, no. 89 (June 5, 2023): 171–89. http://dx.doi.org/10.7146/pas.v38i89.137987.

Abstract:

This article builds a case for how literary historiography can be given a crucial sociological perspective through the analytical possibilities offered by digital corpora and methods. Based on a corpus of almost 900 Danish and Norwegian novels published in Denmark between 1870 and 1900, we will test and illustrate how combining a field-analytical approach with more quantitative and data-driven analyses brings new perspectives to traditional literary sociology. Our case is the historical novel and its status and development in a period, the so-called "modern breakthrough" in Scandinavian literature, where, at first sight, it did not belong at all.

27

Oostdijk, Nelleke. "Corpora and language learners." English Studies 88, no. 3 (June 2007): 368–69. http://dx.doi.org/10.1080/00138380701270499.

28

Norrick, Neal R. "Swearing in literary prose fiction and conversational narrative." Narrative Inquiry 22, no. 1 (December 31, 2012): 24–49. http://dx.doi.org/10.1075/ni.22.1.03nor.

Abstract:

This article compares swearing in novels with swearing in everyday talk based on a representative sample of British and American prose fiction and several large corpora of natural conversation. Swearing allegedly makes fictional dialogue more realistic, but up till now no one has attempted a systematic comparison of fictional and natural conversational swearing. Fiction writers incorporate swearing into their dialogue to delineate characters and to signal emotions, sometimes setting it off from non-swearing talk and commenting on it in various ways. Traditionally, the author’s own voice contained no swearing. By contrast, in conversational narratives, tellers use swearing to obtain the floor, to evaluate action, to mark climaxes and closings, in addition to portraying their characters as swearing. Moreover, in conversation, tellers may hear their listeners swearing along with them, not only to support and evaluate, but also to oppose and even complain about their telling performance.

29

Cummins, Sarah, and Geneviève Parent. "Translating maman and papa: A corpus-based survey." Translation and Interpreting Studies 2, no. 1 (January 1, 2007): 3–45. http://dx.doi.org/10.1075/tis.2.1.01cum.

Abstract:

This study examines the translation of the French terms maman and papa by English-language translators from the nineteenth century to the present. Following a comparative analysis of the semantics of the French terms and of their most typical English translations, the authors of the study isolate trends in the translation of these terms through analysis of corpora of French and Quebecois literary texts and their translations.

30

Pořízka, Petr. "CapekDraCor: A New Contribution to the European Programable Drama Corpora." Journal of Linguistics/Jazykovedný casopis 74, no. 1 (June 1, 2023): 244–53. http://dx.doi.org/10.2478/jazcas-2023-0042.

Abstract:

The aim of this paper is to present the new CapekDraCor corpus and the DraCor project with its research-oriented concept of programmable corpora focused on quantitative analyses within the framework of computational literary studies. This digital platform extends the possibilities of large-scale drama analysis with a focus on the dramatic character(s). The basic operationalisation is the interaction within a dramatic configuration, i.e., the scenic co-presence of two speakers, from which network data are automatically extracted, both global networks of interactions of dramas and data characterising individual actors, i.e., literary characters. The paper demonstrates the CapekDraCor corpus, a new contribution to the extensive DraCor database, and presents the way the data are processed with respect to their specific multi-layered structure. The corpus contains all the plays written by Karel and Josef Čapek and the data are processed in a standardized format based on XML and general TEI guidelines for processing drama with a defined basic drama tagset. CapekDraCor also uses the newly created EZdrama format for data processing, a lightweight YAML-like markup language that works as an intermediate step from a .txt to an .xml file. A file in this format can be automatically converted into a DraCor-ready XML file with a TEI header. The advantage of the programmable corpora concept is the possibility to use suitably structured data for drama research outside the DraCor platform and with other methods or tools for textual analysis. Simultaneously, this approach moves the researcher from the technical requirements of the analysis to operationalised computational analysis based on research questions and pre-prepared and flexible tools. DraCor is a unique open infrastructure (both in terms of data and tools) for the analysis of European drama, currently comprising 15 corpora in 10 different languages with a total of about 3,000 plays from a wide range of periods.

31

Jacobs, Arthur M. "(Neuro-)Cognitive poetics and computational stylistics." Empirical Studies of Literariness 8, no. 1 (December 31, 2018): 165–208. http://dx.doi.org/10.1075/ssol.18002.jac.

Abstract:

This perspective paper discusses four general desiderata of current computational stylistics and (neuro-)cognitive poetics concerning the development of (a) appropriate databases/training corpora, (b) advanced qualitative-quantitative narrative analysis (Q2NA) and machine learning tools for feature extraction, (c) ecologically valid literary test materials, and (d) open-access reader-response data banks. In six explorative computational stylistics studies, it introduces a number of tools that provide QNA indices of the foregrounding potential at the sublexical, lexical, inter- and supralexical levels for poems by Shakespeare, Blake, or Dickens. These concern lexical diversity and aesthetic potential, sentiment analysis, sublexical sonority scores or phrase structure, and topics analysis. The results illustrate the complex interplay of stylistic features and the necessity for theoretical guidance and interdisciplinary cooperation in selecting adequate training corpora, QNA tools, test texts, and response measures.

32

Cholewa, Joanna, and Vita Valiukienė. "A parallel corpus-based study of the French verb tomber ‘to fall’: Its semantic plurivocity and equivalents in Polish and Lithuanian." Kalbotyra 75 (December 30, 2022): 7–26. http://dx.doi.org/10.15388/kalbotyra.2022.75.1.

Abstract:

The analysis presented in this article is based on the trilingual parallel corpus CTLFR-PL-LT, composed of original French literary texts and their translations into Polish and Lithuanian. The undeniable usefulness of parallel corpora for studies in contrastive linguistics has already been established (Teubert 1996; Kraif 2011; Altenberg and Granger 2002 et al.). The corpora offer rich and reliable data, they [...] “allow us to see meaning through translation” (Johansson 2007, 57). The purpose of this study is twofold: we first aim to analyse the semantic plurivocity of the French verb tomber ‘to fall’ in literary texts. Secondly, we propose to examine the heterogeneity of the equivalents of tomber ‘to fall’ in Polish and Lithuanian, which will allow us to specify which strategies (verbal or other) are used to translate the meanings of the chosen verb. We also check whether the selected verbs in the Lithuanian and Polish translations express the same meaning element of tomber ‘to fall’. It will be particularly interesting to be able to observe whether strategies adopted by translators are specific to each of the target languages, genetically different, or common to both. Thus, we hope to contribute to contrastive French-Polish-Lithuanian research, which has so far been quite scarce.

33

Kavanagh, Barry. "Bridging the Gap from the Other Side: How Corpora Are Used by English Teachers in Norwegian Schools." Nordic Journal of English Studies 20, no. 1 (May 28, 2021): 1–35. http://dx.doi.org/10.35360/njes.522.

Abstract:

Researchers have written of ‘bridging the gap’ between corpus linguistics and teaching practice. This study focuses on in-service English teacher informants from Norwegian schools, to try to address the ‘gap’ from the teaching practice ‘side’, rather than from the linguist ‘side’ engaged in spreading corpus linguistics. The study collects data on teachers’ familiarity with corpus linguistics, what corpora are used for and how, and teachers’ views on the obstacles to corpus use. The research question is How are corpora used by in-service English teachers in Norwegian schools? The research design consists of an online questionnaire and follow-up interviews. The questionnaire was answered by 210 teachers, 34 of whom answered they had done some work with corpora. The interviews were with three corpus-using teachers. The corpora they used were GloWbE, SkELL, Netspeak and COCA. Teacher-corpus interaction was for reference and for creating vocabulary and varieties of English exercises, and pupil-corpus interaction was encouraged by two of the teachers. The obstacles to the use of corpora were identified as differences between school levels, usability, and lack of teacher need. In concluding remarks, it is suggested that a starting point for corpus use among teachers may be to teach the tools and methods that seem to be already working for in-service teachers.

34

Al Fajri, Muchamad Sholakhuddin, and Ikmi Nur Oktavianti. "Multifunctionality in English: Corpora, Language and Academic Literacy Pedagogy." 3L The Southeast Asian Journal of English Language Studies 29, no. 2 (June 27, 2023): 216–18. http://dx.doi.org/10.17576/3l-2023-2902-15.

35

Halawachy, Huda, and Nawar Alobaidy. "“Let us call it a truthful hyperbole!” A Semantic Perspective on Hyperbole in War Poetry on Iraq (2003)." International Journal of Language and Literary Studies 2, no. 4 (December 26, 2020): 151–66. http://dx.doi.org/10.36892/ijlls.v2i4.439.

Abstract:

As has long been known, though prevalent in everyday discourse across cultures, hyperbole is a neglected figure of speech in the linguistic and/or literary sphere. In this talk, we propose a semantic taxonomy of hyperbole in American and British modern war poetry showing how this taxonomy helps readers figure out the poet’s meaning on a deeper level via a variety of hyperboles. The main objectives are to (1) identify the elements of such a trope in the corpora, (2) approach a semantic taxonomy of hyperbolic elements, and (3) come up with the true hidden messages and nature of the trope in accordance with the typology of the semantic field under which the trope is embraced. The corpora consist of two impressive poems – ‘Abu Ghraib’ by Curtis D. Bennett (American), and ‘A Message from Tony Blair to the People of Iraq’ by David Roberts (British). Findings indicate that both the evaluative and the quantitative dimensions are key characteristics that often coincide and should, therefore, be included in every interpretation of the figurative hyperbolic language in war poetry. A strong preference is also observed for negative effects, auxesis, and absolute savage in the corpora, though the trope sounds positive on the surface.

36

Schnelle, Gohar, Mathilde Hennig, Carolin Odebrecht, and Anke Lüdeling. "Historische Korpora in sprachhistorisch orientierter germanistischer Hochschullehre." Beiträge zur Geschichte der deutschen Sprache und Literatur 145, no. 2 (June 1, 2023): 175–217. http://dx.doi.org/10.1515/bgsl-2023-0012.

Abstract:

This paper argues for incorporating corpus data into the teaching of historical linguistics. While deeply annotated historical corpora are becoming available and corpus data is already widely used to answer various research questions, corpora are as yet rarely used in teaching. We believe they are ideally suited to make the variation in historical data transparent and help students to explore contexts and parameters. In our first study, we show how the KaJuK corpus and its more elaborated version, the GiesKaNe corpus, can be exploited to study adverbial sentences. Using the RIDGES corpus, the second study deals with phrasal and lexical development. Both studies focus on explaining the method and its extension to other corpora and research questions.

37

Siegali, Michal Bar-Asher. "Ifra Hormiz and the Use of Mini-Corpora in the Study of the Babylonian Talmud." Jewish Quarterly Review 113, no. 4 (September 2023): 615–38. http://dx.doi.org/10.1353/jqr.2023.a913347.

Abstract:

Ifra Hormiz is mentioned in five short stories in the Babylonian Talmud, each time accompanied by a description: "the mother of Shapur Malka," Shapur the king. This essay examines the stories about Ifra Hormiz from a literary angle. It suggests that an intratalmudic, comparative, literary analysis of these stories can offer a new perspective on their creation, and that we can better reveal and highlight the differing agendas and motifs of stories that have similar literary nuclei when we examine them as part of a broader corpus of similar stories. This analysis will shed light on the stories' anonymous authors, their intended audiences, and the ways they chose to address gender issues and their attitude toward the Persian rulers of their times.

38

Milojkovic, Marija. "Bill Louw’s Contextual Prosodic Theory as the basis of (foreign language) classroom corpus stylistics research." Research in Corpus Linguistics 1 (2013): 47–63. http://dx.doi.org/10.32714/ricl.01.05.

Abstract:

Corpus empiricism may alter the act of reading. This began as the reader searched a reference corpus for individual words and phrases. With the admission of lexicographers that intuition no longer suffices in providing a definition, corpus stylistics must go further by showing that a literary text can no longer be properly interpreted if not seen against the background of the wealth of recorded textual experience. This by no means suggests that a literary text may not have a satisfying impact on an individual reader; rather, corpus stylistics enhances our interpretation by means that are easily available. The core of Bill Louw’s stylistic approach is his claim that prior knowledge is no longer perceived as concepts (unsatisfyingly intuitive). Therefore, reference corpora may serve to enhance our stylistic interpretation of a literary text that was clearly written to be appreciated as a unique textual experience. Roughly, a large reference corpus will provide many parallel textual experiences, so that ‘events’ in the studied text are augmented by their counterparts in corpora. Thus, our understanding of the text will be augmented by what is absent from it, but present in the reference corpora. If, furthermore, our classroom is a foreign language one, the reference corpus will serve as missing language experience in the foreign language learner, even if the learner is very proficient. After giving a brief overview of Louw’s Contextual Prosodic Theory (CPT) and its implications for classroom corpus stylistics, the paper describes a study conducted with second-year students of English from the University of Belgrade. The aims of the study are to verify Louw’s principle that text reads text and to test the proposed CPT-based methodology. The study consists of a quantitative part (where the learning phase is followed by a final test) and a qualitative part (questionnaire).
The proposed methodology relies on confronting the subjects with concordance lines as a means of interpreting a collocation in a given short excerpt, with an absolute minimum of theoretical background. The subjects are tested on semantic prosodies, absent collocates and auras of grammatical strings, through tasks that vary in format. The results obtained are encouraging for CPT, despite the study’s limitations, which are also discussed.

39

Natan-Yulzary, Shirly. "Contrast and Meaning in the ʾAqhat Story." Vetus Testamentum 62, no. 3 (2012): 433–49. http://dx.doi.org/10.1163/156853312x645254.

Abstract:

Creating contrast between different elements in the narrative is one of the Ugaritic poet’s main poetic devices. This literary tool is employed to encourage the audience to elicit and produce narrative meaning. In ʾAqhat it is a prominent technique, abundant in the lexical makeup and stylistic texture of the narrative, in its content, as well as in the narrative structure. The examples analyzed in the article represent only a sampling of the Ugaritic poet’s elaborate and complex range of literary creativity. They illustrate the prominence of this device and demonstrate that its use is akin to that familiar from biblical narrative. Thus, this essay also indirectly supports the thesis that the literary precursors and background of biblical narrative poetics are reflected in the Ugaritic epics, and that these two corpora are representative of the same literary tradition, not only with regard to thematics and language, but also in respect to their poetics.

40

Corretger, Montserrat. "The Literature of Exile after 1939: A Bridge between Catalan Collective Memory and Identity." Journal of Catalan Intellectual History 1, no. 11 (October 1, 2017): 68–82. http://dx.doi.org/10.1515/jocih-2016-0006.

Abstract:

The present article reflects on and emphasises the importance of the still-unrecognised work by Catalan writers who bore witness to the exile of 1939 and the preceding historical period of the Second Spanish Republic (1931–1939) and the Civil War (1936–1939). The article explores how these exiled writers and their literary corpora played a fundamental role in recovering Catalan historical collective memory and identity. In particular, it focusses on two writers, Domènec Guansé and Vicenç Riera Llorca, in the light of recent studies of literary history, which have begun this process of re-evaluating the literature of exile, and thereafter relates their work to the theories of Lowenthal, Ricoeur and Traverso regarding the past and memory.

41

Egbert, Jesse. "Style in nineteenth century fiction." Scientific Study of Literature 2, no. 2 (December 31, 2012): 167–98. http://dx.doi.org/10.1075/ssol.2.2.01egb.

Abstract:

Recent years have seen substantial advances in ‘corpus stylistics’, which is the use of corpora and computational techniques to study literary style. Corpus stylistics has produced analyses of otherwise imperceptible features of literary style. However, studies in corpus stylistics have rarely considered the full set of core linguistic features. The present study explores literary style through the application of Multi-Dimensional analysis. Stylistic variation along three dimensions is accounted for using a large, principled corpus of fiction. The dimensions of variation are interpreted as ‘Thought Presentation versus Description’, ‘Abstract Exposition versus Concrete Action’, and ‘Dialogue versus Narrative’. These three dimensions are then used to compare the styles of nineteenth-century fiction between authors, and the range of stylistic variation among the novels of individual authors. The findings are interpreted qualitatively and with reference to previous analyses of author style.

42

Christou, Despina, and Grigorios Tsoumakas. "Extracting Semantic Relationships in Greek Literary Texts." Sustainability 13, no. 16 (August 21, 2021): 9391. http://dx.doi.org/10.3390/su13169391.

Abstract:

In the era of Big Data, the digitization of texts and the advancements in Artificial Intelligence (AI) and Natural Language Processing (NLP) are enabling the automatic analysis of literary works, allowing us to delve into the structure of artifacts and to compare, explore, manage and preserve the richness of our written heritage. This paper proposes a deep-learning-based approach to discovering semantic relationships in literary texts (19th century Greek Literature) facilitating the analysis, organization and management of collections through the automation of metadata extraction. Moreover, we provide a new annotated dataset used to train our model. Our proposed model, REDSandT_Lit, recognizes six distinct relationships, extracting the richest set of relations up to now from literary texts. It efficiently captures the semantic characteristics of the investigated time period by finetuning the state-of-the-art transformer-based Language Model (LM) for Modern Greek in our corpora. Extensive experiments and comparisons with existing models on our dataset reveal that REDSandT_Lit has superior performance (90% accuracy), manages to capture infrequent relations (100%F in long-tail relations) and can also correct mislabelled sentences. Our results suggest that our approach efficiently handles the peculiarities of literary texts, and it is a promising tool for managing and preserving cultural information in various settings.

43

Bratić, Vesna, and Milica Vuković Stamatović. "Lexical profile of literary academic articles." Ibérica, no. 42 (December 31, 2021): 115–38. http://dx.doi.org/10.17398/2340-2784.42.115.

Abstract:

In this paper, we examine the lexical profile of literary academic articles with a view to determining how they differ from research articles in other disciplines and how the vocabulary level and complexity affect reading comprehension, particularly for non-native speakers of English. For this purpose, a corpus of 110 literary articles from reputable journals was compiled and compared against two corpora featuring the same number of articles: one consisting of research articles from Science, Technology and Medicine (STM), and the other comprising research articles from social sciences and other humanities. The results reveal that the lexical profile of literary academic papers is, as expected, more similar to social sciences and other humanities than to the STM field when it comes to the coverage of general-purpose vocabulary, vocabulary level and vocabulary diversity. Despite the lexical similarities to social sciences and other humanities, the vocabulary of literary academic papers is somewhat more complex and diverse than that found in them. The largest differences were noted with respect to the level of academic vocabulary, whose use is much sparser in literary studies than in all other fields. The pedagogical implications include advocating for refraining from reading literary academic articles earlier than postgraduate studies for non-native-speakers of English (with some exceptions), as their vocabulary level will generally be insufficient for those purposes. We also point to the limited value of teaching academic vocabulary to students of literary studies.

44

Babić-Antić, Jelena, and Nikola Dančetović. "Women's language in literary discourse: The focus on modality." Bastina, no. 51 (2020): 125–42. http://dx.doi.org/10.5937/bastina30-26947.

Abstract:

Various linguistic means characterizing women's language were presented to the public in the 1970s. The first systematic and critical approach to linguistic features of women's language was presented in a pioneering study by Robin Lakoff in 1975, which has since been considered the beginning of a new interdisciplinary field of "gender studies". Nowadays, women's language is an important issue in sociolinguistic studies of language and gender. This paper analyzes the means for expressing modality in women's language found in literary discourse. The corpora comprise the seven Harry Potter novels in English and Serbian. According to some theoreticians of language and gender, among whom are Coates, Mills, Livia and others, frequent use of epistemic modality is a significant trait of women's linguistic pattern. On the assumption that attitudes and ideologies can be expressed by linguistic features such as modality, this paper aims to compare the modal forms and their meaning in English and Serbian by using contrastive linguistic analysis, determine the differences in linguistic patterns between the two, and attempt to explain possible ideological implications from the CDA perspective.

45

Cresti, Emanuela, and Massimo Moneglia. "The definition of the TOPIC within Language into Act Theory and its identification in spontaneous speech corpora." Revue Romane / Langue et littérature. International Journal of Romance Languages and Literatures 53, no. 1 (August 10, 2018): 30–62. http://dx.doi.org/10.1075/rro.00005.cre.

Abstract:

The paper presents the definition of the TOPIC information unit within the Language into Act Theory (L-AcT) and the prosodic and informational criteria used for its recovery in spontaneous speech corpora: Italian, Brazilian Portuguese, Spanish and American English. The TOPIC develops the specific function of field of application of the illocutionary force accomplished by the COMMENT unit, it is performed through a prefix prosodic unit and precedes the Comment. The TOPIC must be coherent with the set of requirements determined by the illocutionary force of the Comment and adequate to the speaker-addressee relation. TOPIC mostly correlates in spoken corpora with NP and ADVP and must be functionally distinguished from “postponed Topic” (APPENDIX in the L-ACT framework). However, corpora also show a good percentage of modal expressions filling its prosodic and distributional conditions.

46

Ibrahim, Wesam Mohamed Abdelkhalek. "UTILISING CORPUS STYLISTICS TO FACILITATE LITERARY ANALYSIS: AN ASSESSMENT OF THE EFFECTIVENESS OF SEMANTIC DOMAINS IN IDENTIFYING MAJOR LITERARY THEMES IN A SELECTION OF CHARLES DICKENS’ NOVELS." Malaysian Journal of Applied Linguistics (MJoAL) 1 (January 2, 2024): 55–82. http://dx.doi.org/10.32890/mjoal2023.1.5.

Abstract:

Though still in an early stage of development, corpus-assisted literary analysis is becoming increasingly popular as having the full potential of corpus linguistics methodology for literary stylistics. This paper argues that corpus linguistic procedures can be considered an addition to the analytical inventory of traditional stylistics. It aims to explore how corpus linguistic procedures, particularly semantic domains, can be effective in detecting major literary themes in fiction. In order to do so, five corpora were compiled: a corpus for each of the four selected novels by Charles Dickens (i.e., Oliver Twist, David Copperfield, Great Expectations and Our Mutual Friend) and a compiled corpus combining all four novels. Wmatrix 5, with the BNC Sampler-Written as a reference corpus, was used to extract the key semantic domains in each corpus respectively. The literature on the selected novels was consulted to identify the major themes. Then, it was verified whether these themes were reflected in the corpus analysis, and, finally, the extent to which the procedure was effective in reflecting the major literary themes was also explored. The findings confirmed the effectiveness of the procedure of analysing semantic domains in studying literary texts, particularly in relation to their themes.

47

Çelik, Hülya, and Ani Sargsyan. "Introducing Transcription Standards for Armeno-Turkish Literary Studies." DIYÂR 3, no. 2 (2022): 161–89. http://dx.doi.org/10.5771/2625-9842-2022-2-161.

Abstract:

Turkish literature in Armenian script comprises a large corpus of manuscripts dating from the 14th century together with printed material published between the 18th and 20th centuries. Books were printed in a wide geographical area and their contents were produced by mono- and bilingual Turkish- (and Armenian)-speaking Ottoman Armenians. Therefore, Armeno-Turkish text production represents the textual output enabled through Armenian and Turkish cross-cultural interactions, including various genres and different types of text. Although the scope of Armeno-Turkish text production is extensive, scholarly engagement with Armeno-Turkish texts at universities has only been markedly evident since the 2000s. The most significant reason for this late and limited engagement may lie in the obstacle of the hybrid nature of the script and the language, whereby Armeno-Turkish literature has a place neither in Turkish nor in Armenian literary studies. The aims of this article are therefore (1) to give a short overview of hitherto scholarly work with Armeno-Turkish text corpora and (2) to propose a standard for the transcription of Turkish texts in Armenian script. In a longue durée perspective, we aim to conduct inclusive literary studies and examine Armeno-Turkish literature within the greater framework of (Ottoman) Turkish literature.

48

Grabowski, Lukasz. "Interfacing corpus linguistics and computational stylistics." International Journal of Corpus Linguistics 18, no. 2 (September 27, 2013): 254–80. http://dx.doi.org/10.1075/ijcl.18.2.04gra.

Abstract:

This study attempts to examine the potential of selected corpus linguistics and computational stylistics methods in the investigation of translation universals in translational literary Polish. It deals with T-universals (Chesterman 2004), with emphasis on the simplification hypothesis, as manifested in the core patterns of lexical use (Laviosa 1998) and the levelling out hypothesis (Baker 1996). To that end, the purpose-designed corpora, each with approximately 350,000 tokens, of contemporary translational and non-translational literary Polish were compiled. The results confirm the simplification and the levelling out hypotheses but only with reference to the mean sentence length and variance for the mean sentence length. On the other hand, the results of multivariate analyses (Principal Components Analysis and Cluster Analysis) confirm the levelling out hypothesis that translations are more alike as compared with native texts.

49

Robin, Edina, Andrea Götz, Éva Pataky, and Henriette Szegh. "Translation Studies and Corpus Linguistics: Introducing the Pannonia Corpus." Acta Universitatis Sapientiae, Philologica 9, no. 3 (December 1, 2017): 99–116. http://dx.doi.org/10.1515/ausp-2017-0032.

Abstract:

The tools of corpus linguistics have become indispensable for research in descriptive translation studies (DTS), which aims to describe the characteristics of the translation process, and translational texts. Machine-readable corpora of translated texts are crucially important since they can yield statistically significant results that underpin the findings of empirical studies. Baker’s (1993) seminal paper gave new impetus to translation research as it has re-calibrated the goals of DTS to study and uncover the particular properties of the so-called “third code” (Frawley 1984), i.e. the language of translated texts, with the help of computerized corpora. The present study, after providing a brief overview of international and Hungarian corpus linguistic research, introduces the Pannonia Corpus Project developed by Eötvös Loránd University’s Translation Studies Doctoral Programme, which was created to make a Hungarian translation corpus, containing millions of words, available for translation researchers. The Pannonia Corpus (PC) is a multi-modal corpus: it contains translated, interpreted, and audiovisual texts. It represents a diverse array of texts of specialized and literary genres, reflecting modern language use and the current state of the translation industry. The PC provides researchers with a vital opportunity as its multimodality, diverse textual make-up, and substantial size are unparalleled in the Hungarian context. Until now, there were no large corpora available to researchers that could have facilitated qualitative as well as quantitative research, satisfying the demands of modern translation studies research in Hungary.

50

Lewis, Derek, Jenny Thomas, and Mick Short. "Using Corpora for Language Research." Modern Language Review 93, no. 3 (July 1998): 763. http://dx.doi.org/10.2307/3736497.
