Academic literature on the topic 'News text corpus'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'News text corpus.'

Journal articles on the topic "News text corpus"

1

Sharjeel, Muhammad, Rao Muhammad Adeel Nawab, and Paul Rayson. "COUNTER: corpus of Urdu news text reuse." Language Resources and Evaluation 51, no. 3 (September 10, 2016): 777–803. http://dx.doi.org/10.1007/s10579-016-9367-2.

2

Zhang, Yiqiong. "Retailing science: genre hybridization in online science news stories." Text & Talk 38, no. 2 (February 23, 2018): 243–65. http://dx.doi.org/10.1515/text-2017-0040.

Abstract:
This study explores how marketing and science rhetoric have become entrenched in online science news stories. The schematic structures of a corpus of 270 news stories from three types of website (university websites, the websites of Futurity.org and MSNBC.com) have been analyzed and compared. An eight-move structure identified from the corpus suggests that the genre of news stories is a hybridization of promotional discourse for marketization and science discourse for explanation. Hybridization is first evident in university press releases, which are then spread by the mass media without significant changes. From the perspective of intertextual chains, the emerging discourse practices can be attributed to the power shifting of news production from journalists to science institutions and further from journalistic to scientific norms. In turn, the discourse practices accelerate the shift of power, which could ultimately lead to the loss of independent and critical science journalism.
3

Watanabe, Chiaki, and Ichiro Kobayashi. "Intelligent Information Presentation Corresponding to User Request Based on Collaboration Between Text and 2D Charts." Journal of Advanced Computational Intelligence and Intelligent Informatics 12, no. 1 (January 20, 2008): 10–15. http://dx.doi.org/10.20965/jaciii.2008.p0010.

Abstract:
We discuss intelligent information provision involving different modal information collaboratively presented, with an example of news articles about stock prices summarized based on 2D chart representation on stock prices. We use the MuST corpus, an annotated corpus for easily extracting trends in information, e.g., statistical values, etc., as the news article corpus to be summarized. We associate the MuST corpus with numerical data on the stock prices, and propose a way to provide people with a summarized text about news articles on prices corresponding to 2D chart representation.
4

Dong, Min, and Mengfei Gao. "Appraisal as co-selection and media performativity: 5G technology imaged in German news discourse." Text & Talk 42, no. 2 (November 2, 2021): 177–208. http://dx.doi.org/10.1515/text-2020-0012.

Abstract:
This article views appraisal as co-selection patterns of target, source and evaluative parameters and investigates the ways in which news discourse retells news stories and reproduces truthful reality. We combined the corpus-assisted method and quantitative/qualitative analysis of the data, i.e., 904 sentences which were extracted from the corpus of German 5G news reports by selecting the top 5 items from each of the noun keywords lists of the three subcorpora of economics, politics and technology news reports. It was found that the German media restage the necessity and desirability to promote the development of German communication facilities/technology through international cooperation, particularly Germany-Sino cooperation. In addition, a hesitant image was evoked as to the high-profile 5G development in Germany with an awareness of the potential security risks and economic losses. On the intersubjective dimension, our findings suggest that journalists make full exploitation of different dialogistic positioning strategies for closing down or opening up the dialogic space to a greater or lesser degree. More specifically, they tend to acknowledge and endorse the positive/negative attitudes attributed to the non-authorial voices towards particular targets in the fields of economics, politics or technology. A future comparison with the genre of news comments or editorials would deepen our understanding of the performativity of media.
5

Hou, Zhide. "The American Dream meets the Chinese Dream: a corpus-driven phraseological analysis of news texts." Text & Talk 38, no. 3 (April 25, 2018): 317–40. http://dx.doi.org/10.1515/text-2018-0006.

Abstract:
This study is a corpus-driven examination of frequent lexical words and keywords in the news texts related to the American Dream and the Chinese Dream. Based on Sinclair’s (Sinclair, John McHardy. 2004. Trust the Text. Routledge: London) five categories of co-selection as framework, it discusses the patterns of co-selection across the corpora of news texts, with a particular focus on the cumulative effects of the co-construction of situated meanings and establishment of ideological positions associated with the two dreams. The corpus linguistic tool Wordsmith is used to generate frequent words and keywords for detailed concordance analysis along both syntagmatic and paradigmatic relations in order to indicate collocation, colligation, semantic preference, and semantic prosody. The findings demonstrate the individualistic home, work and education associations of the American Dream versus the collectivistic attributions of the Chinese Dream of national rejuvenation. The study not only confirms different cultural practices, but also reveals different social-historical conditions, and political influences associated with media representations of the American Dream and the Chinese Dream.
6

Ho, Janet. "An earthquake or a category 4 financial storm? A corpus study of disaster metaphors in the media framing of the 2008 financial crisis." Text & Talk 39, no. 2 (March 26, 2019): 191–212. http://dx.doi.org/10.1515/text-2019-2024.

Abstract:
This study investigates the use of disaster metaphors in the American media coverage of the 2008 global financial crisis. More specifically, it aims to examine the role of different sub-metaphors in performing various pragmatic and rhetorical functions in financial news discourse. Using the Metaphor Identification Procedure, this study identifies key words from the 1-million-word corpus which comprised the news articles published from September 15, 2008 to March 15, 2009, and examines the associated concordance lines to discern their metaphorical connotations. The findings show that a wide range of sub-source domains of disaster metaphors (namely wind, storm, and water) was deployed by journalists to capture the various negative impacts of the financial crisis. These findings suggest that the salient extension and mixing of metaphors could enhance the popularization of specialist financial news discourse. The findings also indicate that the news media was complicit in constructing the collective illusion that the financial crisis was unavoidable and not caused by anyone.
7

Best, Michael L. "An Ecology of Text: Using Text Retrieval to Study Alife on the Net." Artificial Life 3, no. 4 (October 1997): 261–87. http://dx.doi.org/10.1162/artl.1997.3.4.261.

Abstract:
I introduce a new alife model, an ecology based on a corpus of text, and apply it to the analysis of posts to USENET News. In this corporal ecology posts are organisms, the newsgroups of NetNews define an environment, and human posters situated in their wider context make up a scarce resource. I apply latent semantic indexing (LSI), a text retrieval method based on principal component analysis, to distill from the corpus those replicating units of text. LSI arrives at suitable replicators because it discovers word co-occurrences that segregate and recombine with appreciable frequency. I argue that natural selection is necessarily in operation because sufficient conditions for its occurrence are met: replication, mutagenicity, and trait/fitness covariance. I describe a set of experiments performed on a static corpus of over 10,000 posts. In these experiments I study average population fitness, a fundamental element of population ecology. My study of fitness arrives at the unhappy discovery that a flame-war, centered around an overly prolific poster, is the king of the jungle.
8

Cenek, Martin, Rowan Bulkow, Eric Pak, Levi Oyster, Boyd Ching, and Ashika Mulagada. "Semantic Network Analysis Pipeline—Interactive Text Mining Framework for Exploration of Semantic Flows in Large Corpus of Text." Applied Sciences 9, no. 24 (December 5, 2019): 5302. http://dx.doi.org/10.3390/app9245302.

Abstract:
Historical topic modeling and semantic concepts exploration in a large corpus of unstructured text remains a hard, opened problem. Despite advancements in natural languages processing tools, statistical linguistics models, graph theory and visualization, there is no framework that combines these piece-wise tools under one roof. We designed and constructed a Semantic Network Analysis Pipeline (SNAP) that is available as an open-source web-service that implements work-flow needed by a data scientist to explore historical semantic concepts in a text corpus. We define a graph theoretic notion of a semantic concept as a flow of closely related tokens through the corpus of text. The modular work-flow pipeline processes text using natural language processing tools, statistical content narrowing, creates semantic networks from lexical token chaining, performs social network analysis of token networks and creates a 3D visualization of the semantic concept flows through corpus for interactive concept exploration. Finally, we illustrate the framework’s utility to extract the information from a text corpus of Herman Melville’s novel Moby Dick, the transcript of the 2015–2016 United States (U.S.) Senate Hearings on Environment and Public Works, and the Australian Broadcast Corporation’s short news articles on rural and science topics.
9

Yulita, Winda, Sigit Priyanta, and Azhari SN. "Automatic Text Summarization Based on Semantic Networks and Corpus Statistics." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 13, no. 2 (April 30, 2019): 137. http://dx.doi.org/10.22146/ijccs.38261.

Abstract:
One simple automatic text summarization method that can minimize redundancy, in summary, is the Maximum Marginal Relevance (MMR) method. The MMR method has the disadvantage of having parts that are separated from each other in summary results that are not semantically connected. Therefore, this study aims to compare summary results using the MMR method based on semantic and non-semantic based MMR. Semantic-based MMR methods utilize WordNet Bahasa and corpus in processing text summaries. The MMR method is non-semantic based on the TF-IDF method. This study also carried out summary compression of 30%, 20%, and 10%. The research data used is 50 online news texts. Testing of the summary text results is done using the ROUGE toolkit. The results of the study state that the best value of the f-score in the semantic-based MMR method is 0.561, while the best f-score in the non-semantic MMR method is 0.598. This value is generated by adding a preprocessing process in the form of stemming and compression of a 30% summary result. The difference in value obtained is due to incomplete WordNet Bahasa and there are several words in the news title that are not in accordance with EYD (KBBI).
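The non-semantic variant described in this abstract (MMR over TF-IDF) can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the default `lam` weight, and the function names `tfidf_vectors`, `cosine`, and `mmr_select` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict: term -> weight) per text.

    Assumption: lowercase whitespace tokenization and a smoothed IDF.
    """
    tokenized = [t.lower().split() for t in texts]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(sentences, query, k, lam=0.5):
    """Greedily pick up to k sentence indices by Maximal Marginal Relevance.

    Each step maximizes
    lam * sim(sentence, query) - (1 - lam) * max sim(sentence, already selected).
    """
    vectors = tfidf_vectors(list(sentences) + [query])
    sent_vecs, query_vec = vectors[:-1], vectors[-1]
    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        relevance = cosine(sent_vecs[i], query_vec)
        redundancy = max(
            (cosine(sent_vecs[i], sent_vecs[j]) for j in selected),
            default=0.0,
        )
        return lam * relevance - (1.0 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    news = [
        "stocks fell sharply today",
        "stocks fell sharply today again",
        "rain is expected tomorrow",
    ]
    print(mmr_select(news, "stocks fell today", k=2, lam=0.5))
```

Lowering `lam` penalizes sentences that repeat what has already been selected; setting `lam=1.0` reduces the procedure to plain relevance ranking against the query.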
10

Pryzant, Reid, Richard Diehl Martinez, Nathan Dass, Sadao Kurohashi, Dan Jurafsky, and Diyi Yang. "Automatically Neutralizing Subjective Bias in Text." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 01 (April 3, 2020): 480–89. http://dx.doi.org/10.1609/aaai.v34i01.5385.

Abstract:
Texts like news, encyclopedias, and some social media strive for objectivity. Yet bias in the form of inappropriate subjectivity — introducing attitudes via framing, presupposing truth, and casting doubt — remains ubiquitous. This kind of bias erodes our collective trust and fuels social conflict. To address this issue, we introduce a novel testbed for natural language generation: automatically bringing inappropriately subjective text into a neutral point of view (“neutralizing” biased text). We also offer the first parallel corpus of biased language. The corpus contains 180,000 sentence pairs and originates from Wikipedia edits that removed various framings, presuppositions, and attitudes from biased sentences. Last, we propose two strong encoder-decoder baselines for the task. A straightforward yet opaque concurrent system uses a BERT encoder to identify subjective words as part of the generation process. An interpretable and controllable modular algorithm separates these steps, using (1) a BERT-based classifier to identify problematic words and (2) a novel join embedding through which the classifier can edit the hidden states of the encoder. Large-scale human evaluation across four domains (encyclopedias, news headlines, books, and political speeches) suggests that these algorithms are a first step towards the automatic identification and reduction of bias.

Dissertations / Theses on the topic "News text corpus"

1

Alruily, Meshrif. "Using text mining to identify crime patterns from Arabic crime news report corpus." Thesis, De Montfort University, 2012. http://hdl.handle.net/2086/7584.

Abstract:
Most text mining techniques have been proposed only for English text, and even here, most research has been conducted on specific texts related to special contexts within the English language, such as politics, medicine and crime. In contrast, although Arabic is a widely spoken language, few mining tools have been developed to process Arabic text, and some Arabic domains have not been studied at all. In fact, Arabic is a language with a very complex morphology because it is highly inflectional, and therefore, dealing with texts written in Arabic is highly complicated. This research studies the crime domain in the Arabic language, exploiting unstructured text using text mining techniques. Developing a system for extracting important information from crime reports would be useful for police investigators, for accelerating the investigative process (instead of reading entire reports) as well as for conducting further or wider analyses. We propose the Crime Profiling System (CPS) to extract crime-related information (crime type, crime location and nationality of persons involved in the event), automatically construct dictionaries for the existing information, cluster crime documents based on certain attributes and utilize visualisation techniques to assist in crime data analysis. The proposed information extraction approach is novel, and it relies on computational linguistic techniques to identify the abovementioned information, i.e. without using predefined dictionaries (e.g. lists of location names) and annotated corpus. The language used in crime reporting is studied to identify patterns of interest using a corpus-based approach. Frequency analysis, collocation analysis and concordance analysis are used to perform the syntactic analysis in order to discover the local grammar. Moreover, the Self Organising Map (SOM) approach is adopted in order to perform the clustering and visualisation tasks for crime documents based on crime type, location or nationality. 
This clustering technique is improved because only refined data containing meaningful keywords extracted through the information extraction process are inputted into it, i.e. the data is cleaned by removing noise. As a result, a huge reduction in the quantity of data fed into the SOM is obtained, consequently, saving memory, data loading time and the execution time needed to perform the clustering. Therefore, the computation of the SOM is accelerated. Finally, the quantization error is reduced, which leads to high quality clustering. The outcome of the clustering stage is also visualised and the system is able to provide statistical information in the form of graphs and tables about crimes committed within certain periods of time and within a particular area.
2

Arevian, Garen Zohrab. "Recurrent neural networks for text classification of news articles from the Reuters Corpus." Thesis, University of Sunderland, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.439972.

3

Hassel, Martin. "Resource Lean and Portable Automatic Text Summarization." Doctoral thesis, Stockholm : Numerisk analys och datalogi Numerical Analysis and Computer Science, Kungliga Tekniska högskolan, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-4414.

4

Walle, Spencer Benjamin. "Quantified characteristics of easy-to-read Finnish news texts." Thesis, Uppsala universitet, Institutionen för moderna språk, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-412607.

Abstract:
In this study, news in easy-to-read Finnish is analyzed to find out how the texts are quantitatively characterized by guidelines for easy-to-read Finnish. The corpora were collected from news articles written in standard Finnish and easy-to-read Finnish, and the comparative analysis was aimed at establishing certain quantitative parameters, including average sentence length and average word length as well as lexical density, which together with lexical features can characterize easy-to-read writing. The analysis of the material showed that both the sentence length and the length of the texts themselves, consistent with previous research, were considerably shorter in easy-to-read texts than in standard Finnish texts, but the sentence length was also shorter than the upper limit specified in the guidelines for easy-to-read Finnish. A surprising result was that both corpora had about the same average word length. Also, lexical density was at approximately the same level between the corpora. The results of this study support previous conclusions on sentence length but reveal unexpected similarities regarding word length and lexicons.
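The three quantitative parameters named in this abstract can be computed roughly as follows. This is only a sketch under stated assumptions, not the thesis's procedure: sentences are split on `.`, `!` and `?`, words are matched with `\w+`, and lexical density is approximated as the share of words outside a small, hypothetical English function-word list (`FUNCTION_WORDS`), whereas a real study would normally rely on part-of-speech tagging for the language in question.

```python
import re

# Hypothetical, deliberately tiny function-word list used only for illustration.
FUNCTION_WORDS = frozenset(
    {"the", "a", "an", "at", "in", "on", "of", "and", "or", "to", "is"}
)


def text_stats(text, function_words=FUNCTION_WORDS):
    """Compute simple readability-style statistics for a text.

    Returns a dict with:
      avg_sentence_length: words per sentence
      avg_word_length:     characters per word
      lexical_density:     share of words not in function_words
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    content_words = [w for w in words if w not in function_words]
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "lexical_density": len(content_words) / len(words),
    }


if __name__ == "__main__":
    print(text_stats("The cat sat. The dog barked loudly at the cat."))
```

Running the same function over a standard-language corpus and an easy-to-read corpus gives directly comparable figures, which is the kind of contrast the abstract reports.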
5

Stephans, Christie L. "Assessing the Reproducibility of Coral-based Climate Records: A Multi-proxy Replication Test using Three Porites lutea Coral Heads from New Caledonia." [Tampa, Fla.] : University of South Florida, 2003. http://purl.fcla.edu/fcla/etd/SFE0000165.

6

Wasserman, Gertruida Petronella. "Modality on trek : diachronic changes in written South African English across text and context / G.P. Wasserman." Thesis, North West University, 2014. http://hdl.handle.net/10394/13042.

Abstract:
This study describes the diachronic development of modality in South African English (henceforth SAfE) from the early 19th century up to its contemporary state (1820s to 1990s) in the registers of letters, news, fiction/narrative and non-fiction, on the basis of the theoretical framework of socio historical linguistics and the empirical approach of corpus linguistics. Both quantitative and qualitative analyses are conducted for modal and quasi-modal verbs, by means of the newly compiled historical corpus of SAfE and ICE-SA (with the addition of Afrikaans corpora for comparison). The study explores general frequency changes, register-internal changes and macro- and micro semantic changes, with the focus of the main semantic analysis more strongly on the obligation and necessity cluster. A set of parameters is compiled for analysing the strength of obligation in the modals must and should, and the quasi-modal HAVE to, and is applied in the micro semantic analyses. The findings are compared with the trends for modality in other native Englishes, such as American, British and Australian English (cf. e.g. Mair & Leech, 2006; Collins, 2009a; Leech, 2011), in an attempt to present a complete and comprehensive description of SAfE modality, as opposed to the traditional approach of focusing on peculiar features. It is reported that the trends of modality in SAfE correspond to those of other native varieties in some cases, but do not correspond in others. The modals of SAfE for example have declined more and the quasi-modals have increased less over the 20th century than in other native varieties of English. One particular case, in which SAfE is reported to be unique among other varieties, is the quantitative and qualitative trends for must, which has some implications for the manifestation of the democratisation process.
Must in SAfE has not declined significantly over the 20th century (as it has in other native varieties) and has become less face threatening, since uses with a median (weaker) degree of force are just as frequent as those with a higher degree of force by the 1990s (unlike in other native varieties, where must has become restricted to high-degree obligative contexts). Based on socio historical, as well as linguistic evidence (on both quantitative and qualitative levels), language contact with Afrikaans is posited as the main influence for the increased use of must in contexts that are not face threatening. Extrapolating from the semantic findings, some new insights are offered regarding the phase in which SAfE finds itself within Schneider’s (2003) model of the evolution of New Englishes, and some support is offered for Bekker’s (2012:143) argument that “SAfE is ...the youngest of the colonial varieties of English”, especially in the Southern Hemisphere. Ultimately, this thesis offers a piece in the larger puzzle that is SAfE, both in terms of linguistic (textual) and socio historical (contextual) aspects.
PhD (English), North-West University, Vaal Triangle Campus, 2014
7

Manuilov, Illia, and S. V. Petrasova. "Method for Paraphrase Extraction from the News Text Corpus." Thesis, 2019. http://repository.kpi.kharkov.ua/handle/KhPI-Press/46384.

Abstract:
The paper discusses the process of automatic extraction of paraphrases used in rewriting. The researchers propose the method for extracting paraphrases from English news text corpora. The method is based on both the developed syntactic rules to define phrases and synsets to identify synonymous words in the designed text corpus of BBC news. In order to implement the method, Natural Language Toolkit, Universal Dependencies parser and WordNet are used.
8

Komrsková, Zuzana. "Souvýskyt primárních předložek v soudobých publicistických textech (korpusová sonda)." Master's thesis, 2013. http://www.nusl.cz/ntk/nusl-321493.

Abstract:
This diploma thesis concentrates on the phenomenon called co-occurrence of primary prepositions. It is a construction of two prepositions standing directly next to each other, where each of them is part of a different prepositional phrase. The first preposition does not stand directly in front of its phrase; instead, the second prepositional phrase is inserted between the first preposition and its phrase. The aim of the thesis was to find these structures in the corpus SYN2009PUB and to describe their semantic and formal aspects. The inserted phrase is in most cases syntactically incorporated into the sentence, where it functions as an adverbial or an object. A comparison of the length (in number of words) of inserted phrases with that of phrases that could be inserted showed that insertion depends on the length of the phrase being moved. The thesis also outlines other factors affecting co-occurrence, e.g. pronounceability.

Books on the topic "News text corpus"

1

Nicolas Martinez, Maria Carlota, ed. Ricerche sul Corpus del parlato romanzo C-ORAL-ROM. Florence: Firenze University Press, 2007. http://dx.doi.org/10.36253/978-88-8453-568-9.

Abstract:
This book is the first initiative of its kind aimed at underscoring the importance of the C-ORAL-ROM project, and proposing new methods of utilisation and study of the corpora comprised within it, especially in the framework of language teaching. The objective of the project is that of creating a corpus of the spontaneous spoken language for the principal Romance languages: French, Italian, Portuguese and Spanish. The publication includes both written and audio texts, considered as the most appropriate manner of utilising and studying the oral corpora. The texts of the authors hosted in the volume dwell on the various aspects that enable C-ORAL-ROM to be used as a container of information, as a teaching instrument and also as a means of analysing the formal, structural and prosodic characteristics of the texts. The last part of the book presents a teaching unit that proposes a direct application to the teaching of the oral corpora.
2

A lost edition of the Letters of Paul: A reassessment of the text of the Pauline corpus attested by Marcion. Washington, DC: Catholic Biblical Association of America, 1989.

3

Goul, Pauline, and Usher, eds. Early Modern Écologies. Amsterdam: Amsterdam University Press, 2020. http://dx.doi.org/10.5117/9789462985971.

Abstract:
Early Modern Écologies is the first collective volume to offer perspectives on the relationship between contemporary ecological thought and early modern French literature. If Descartes spoke of humans as being ‘masters and possessors of Nature’ in the seventeenth century, the writers taken up in this volume arguably demonstrated a more complex and urgent understanding of the human relationship to our shared planet. Opening up a rich archive of literary and non-literary texts produced by Montaigne and his contemporaries, this volume foregrounds not how ecocriticism renews our understanding of a literary corpus, but rather how that corpus causes us to re-think or to nuance contemporary eco-theory. The sparsely bilingual title (an acute accent on écologies) denotes the primary task at hand: to pluralize (i.e. de-Anglophone-ize) the Environmental Humanities. Featuring established and emerging scholars from Europe and the United States, Early Modern Écologies opens up new dialogues between ecotheorists such as Timothy Morton, Gilles Deleuze, and Bruno Latour and Montaigne, Ronsard, Du Bartas, and Olivier de Serres.
4

Mazzoni, Stefania, and Franca Pecchioli, eds. The Uşaklı Höyük Survey Project (2008-2012). Florence: Firenze University Press, 2016. http://dx.doi.org/10.36253/978-88-6655-902-3.

Abstract:
This book presents the results of the survey conducted by the University of Florence, in the years 2008-2012, at the site and in the surrounding territory of Uşaklı Höyük on the central Anatolian plateau in Turkey. Geological, geomorphological, topographic and geophysical research have provided new information and data relating to the environment and the settlement landscape, as well as producing new maps of the area and indicating the presence of large buried buildings on the site. Analysis of the rich corpus of pottery collected from the surface indicates that the site and its territory were continuously settled from the late Early Bronze Age through the Iron Age and down to the Late Roman and Byzantine periods. A few fragments of cuneiform tablets with Hittite texts, a sealing with two impressions of a stamp seal, and pottery stamps illustrate the importance of Uşaklı Höyük and support the hypothesis of its identification with the town of Zippalanda, known from the Hittite sources as a seat of the cult of the Storm God.
5

Spencer-Hall, Alicia. Medieval Saints and Modern Screens. Amsterdam: Amsterdam University Press, 2017. http://dx.doi.org/10.5117/9789462982277.

Abstract:
This ground-breaking book brings theoretical perspectives from twenty-first century media, film, and cultural studies to medieval hagiography. Medieval Saints and Modern Screens stakes the claim for a provocative new methodological intervention: consideration of hagiography as media. More precisely, hagiography is most productively understood as cinematic media. Medieval mystical episodes are made intelligible to modern audiences through reference to the filmic - the language, form, and lived experience of cinema. Similarly, reference to the realm of the mystical affords a means to express the disconcerting physical and emotional effects of watching cinema. Moreover, cinematic spectatorship affords, at times, a (more or less) secular experience of visionary transcendence: an 'agape-ic encounter'. The medieval saint's visions of God are but one pole of a spectrum of visual experience which extends into our present multi-media moment. We too conjure godly visions: on our smartphones, on the silver screen, and on our TVs and laptops. This book places contemporary pop-culture media - such as blockbuster movie The Dark Knight, Kim Kardashian West's social media feeds, and the outputs of online role-players in Second Life - in dialogue with a corpus of thirteenth-century Latin biographies, 'Holy Women of Liège'. In these texts, holy women see God, and see God often. Their experiences fundamentally orient their life, and offer the women new routes to knowledge, agency, and belonging. For the holy visionaries of Liège, as with us modern 'seers', visions are physically intimate, ideologically overloaded spaces. Through theoretically informed close readings, Medieval Saints and Modern Screens reveals the interconnection of decidedly 'old' media - medieval textualities - and artefacts of our 'new media' ecology, which all serve as spaces in which altogether human concerns are brought before the contemporary culture's eyes.
6

Bons, Eberhard. Textual Criticism of the Prophetic Corpus. Edited by Carolyn J. Sharp. Oxford University Press, 2016. http://dx.doi.org/10.1093/oxfordhb/9780199859559.013.7.

Abstract:
This chapter provides an introduction to the essential issues, questions, and methods of textual criticism of the prophetic books of the Hebrew Bible (Isaiah, Jeremiah, Ezekiel, the Twelve Minor Prophets). Particular focus is put on their major textual witnesses, i.e. the Masoretic Text (MT), the Septuagint as the oldest pre-Christian translation of the biblical text (LXX), the Qumran fragments of the prophetic corpus, and the Vulgate. The chapter confines itself to presenting basic text-critical issues of each of the books of Isaiah, Jeremiah, Ezekiel, and the Twelve Minor Prophets. Attention is paid to new methods and procedures using a number of selected examples, each of which illustrates a specific category of problems.
7

Alcorn, Rhona, Joanna Kopaczyk, Bettelou Los, and Benjamin Molineaux, eds. Historical Dialectology in the Digital Age. Edinburgh University Press, 2019. http://dx.doi.org/10.3366/edinburgh/9781474430531.001.0001.

Abstract:
Drawing on the resources created by the Institute of Historical Dialectology at the University of Edinburgh (now the Angus McIntosh Centre for Historical Linguistics), such as eLALME (the electronic version of A Linguistic Atlas of Late Medieval English), LAEME (A Linguistic Atlas of Early Middle English) and LAOS (A Linguistic Atlas of Older Scots), this volume illustrates how traditional methods of historical dialectology can benefit from new methods of corpus data-collection to test out theoretical and empirical claims. In showcasing the results that these digital text resources can yield, the book highlights novel methods for presenting, mapping and analysing the quantitative data of historical dialects, and sets the research agenda for future work in this field. Bringing together a range of distinguished researchers, the book sets out the key corpus-building strategies for working with regional manuscript data at different levels of linguistic analysis including syntax, morphology, phonetics and phonology. The chapters also show the ways in which the geographical spread of phonological, morphological and lexical features of a language can be used to improve our assessment of the geographical provenance of historical texts.
8

Polis, Stéphane. The Scribal Repertoire of Amennakhte Son of Ipuy. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780198768104.003.0005.

Abstract:
This chapter investigates linguistic variation in the texts written by the Deir el-Medina scribe Amennakhte son of Ipuy in New Kingdom Egypt (Twentieth Dynasty; c. 1150 BCE). After a discussion of the challenge posed by the identification of scribes and authors in this sociocultural setting, I provide an overview of the corpus of texts that can tentatively be linked to this individual and justify the selection that has been made for the present study. The core of this paper is then devoted to a multidimensional analysis of Amennakhte’s linguistic registers. By combining the results of this section with a description of Amennakhte’s scribal habits—both at the graphemo-morphological and constructional levels—I test the possibility of using ‘idiolectal’ features to identify the scribe (or the author) of other texts stemming from the community of Deir el-Medina and closely related to Amennakhte.
9

Baltussen, Han. The Aristotelian Tradition. Edited by Daniel S. Richter and William A. Johnson. Oxford University Press, 2017. http://dx.doi.org/10.1093/oxfordhb/9780199837472.013.42.

Abstract:
This chapter examines the relationship between the Aristotelian philosophers (30 BCE to 200 CE) and the so-called Second Sophistic. It discusses how the study of Aristotle’s works experienced a revival, leading to a new text-based approach to his corpus. The evidence for the main protagonists of those interested in Aristotle is fragmentary. Some were leading thinkers of the school (Andronicus of Rhodes), others eclectic readers of Aristotle (Xenarchus of Seleucia, Galen of Pergamum). The views of both styles of scholar on Aristotle arose mostly in a didactic context, clarifying the texts to students. Thus philosophers began to engage in scholarly commentary as a standard way to practice philosophy. This trend quickly culminated in the running commentary, the prime example of which is the work of Alexander of Aphrodisias (ca. 200 CE), who also had connections to the imperial court.
10

Singer, Julie. Lyrical Humor(s) in the “Fumeur” Songs. Edited by Blake Howe, Stephanie Jensen-Moulton, Neil Lerner, and Joseph Straus. Oxford University Press, 2016. http://dx.doi.org/10.1093/oxfordhb/9780199331444.013.26.

Abstract:
A small corpus of late fourteenth-century comical lyrics, most composed by the poet Eustache Deschamps, presents the lyricist or performer as a fumeur (literally, “smoker”): a creative but volatile artist with a melancholic nature. These texts and their musical settings engage in sophisticated play on late medieval medical understandings of madness and its cures. A new reading of these lyrics in light of Deschamps’s theorization of “natural” and “artificial” music reveals that a particularly disabled embodiment underpins that author’s poetic vision; and the existence of a corpus of closely related fumeur songs composed by multiple authors invites us to revisit received ideas about disabled identity in the Middle Ages.

Book chapters on the topic "News text corpus"

1

Moe, Richard Elling. "Clustering in a News Corpus." In Text, Speech and Dialogue, 301–7. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-10816-2_37.

2

Caled, Danielle, Paula Carvalho, and Mário J. Silva. "MINT - Mainstream and Independent News Text Corpus." In Lecture Notes in Computer Science, 26–36. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-98305-5_3.

3

Kim, Dongwoo, and Alice Oh. "Topic Chains for Understanding a News Corpus." In Computational Linguistics and Intelligent Text Processing, 163–76. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-19437-5_13.

4

Šilić, Artur, and Bojana Dalbelo Bašić. "Exploring Classification Concept Drift on a Large News Text Corpus." In Computational Linguistics and Intelligent Text Processing, 428–37. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-28604-9_35.

5

Borms, Samuel, Kris Boudt, Frederiek Van Holle, and Joeri Willems. "Semi-supervised Text Mining for Monitoring the News About the ESG Performance of Companies." In Data Science for Economics and Finance, 217–39. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-66891-4_10.

Abstract:
We present a general monitoring methodology to summarize news about predefined entities and topics into tractable time-varying indices. The approach embeds text mining techniques to transform news data into numerical data, which entails the querying and selection of relevant news articles and the construction of frequency- and sentiment-based indicators. Word embeddings are used to achieve maximally informative news selection and scoring. We apply the methodology from the viewpoint of a sustainable asset manager wanting to actively follow news covering environmental, social, and governance (ESG) aspects. In an empirical analysis, using a Dutch-written news corpus, we create news-based ESG signals for a large list of companies and compare these to scores from an external data provider. We find preliminary evidence of abnormal news dynamics leading up to downward score adjustments and of efficient portfolio screening.
6

Altuncu, M. Tarik, Sophia N. Yaliraki, and Mauricio Barahona. "Graph-Based Topic Extraction from Vector Embeddings of Text Documents: Application to a Corpus of News Articles." In Complex Networks & Their Applications IX, 154–66. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-65351-4_13.

7

Jelínek, Tomáš, Jan Křivan, Vladimír Petkevič, Hana Skoumalová, and Jana Šindlerová. "SYN2020: A New Corpus of Czech with an Innovated Annotation." In Text, Speech, and Dialogue, 48–59. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-83527-9_4.

8

Duszkin, Maksim, Danuta Roszko, and Roman Roszko. "New Parallel Corpora of Baltic and Slavic Languages — Assumptions of Corpus Construction." In Text, Speech, and Dialogue, 172–83. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-83527-9_15.

9

Svoboda, Lukáš, and Tomáš Brychcín. "New Word Analogy Corpus for Exploring Embeddings of Czech Words." In Computational Linguistics and Intelligent Text Processing, 103–14. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-75477-2_6.

10

Christodoulides, George. "A New Corpus of Collaborative Dialogue Produced Under Cognitive Load Using a Driving Simulator." In Text, Speech, and Dialogue, 380–92. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-64206-2_43.


Conference papers on the topic "News text corpus"

1

Zgank, Andrej, Darinka Verdonik, Aleksandra Zögling Markus, and Zdravko Kacic. "BNSI Slovenian broadcast news database - speech and text corpus." In Interspeech 2005. ISCA, 2005. http://dx.doi.org/10.21437/interspeech.2005-451.

2

Chouigui, Amina, Oussama Ben Khiroun, and Bilel Elayeb. "ANT Corpus: An Arabic News Text Collection for Textual Classification." In 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA). IEEE, 2017. http://dx.doi.org/10.1109/aiccsa.2017.22.

3

Wolska, Magdalena, and Yulia Clausen. "Simplifying metaphorical language for young readers: A corpus study on news text." In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications. Stroudsburg, PA, USA: Association for Computational Linguistics, 2017. http://dx.doi.org/10.18653/v1/w17-5035.

4

Ramisa, Arnau. "Multimodal News Article Analysis." In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/737.

Abstract:
The intersection of Computer Vision and Natural Language Processing has been a hot topic of research in recent years, with results that were unthinkable only a few years ago. In view of this progress, we want to highlight online news articles as a potential next step for this area of research. The rich interrelations of text, tags, images or videos, as well as a vast corpus of general knowledge, are an exciting benchmark for high-capacity models such as deep neural networks. In this paper we present a series of tasks and baseline approaches to leverage corpora such as the BreakingNews dataset.
5

Mukherjee, Sumanta, and Kamal Sarkar. "Analyzing Large News Corpus Using Text Mining Techniques for Recognizing High Crime Prone Areas." In 2020 IEEE Calcutta Conference (CALCON). IEEE, 2020. http://dx.doi.org/10.1109/calcon49167.2020.9106554.

6

Wu, Xingsu, and Jiang He. "Character-level Recurrent Neural Network for Text Classification Applied to Large Scale Chinese News Corpus." In MLMI '20: 2020 The 3rd International Conference on Machine Learning and Machine Intelligence. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3426826.3426842.

7

Lohar, Pintu, and Andy Way. "Parallel Data Extraction using Word Embeddings." In 10th International Conference on Advances in Computing and Information Technology (ACITY 2020). AIRCC Publishing Corporation, 2020. http://dx.doi.org/10.5121/csit.2020.101521.

Abstract:
Building a robust MT system requires a sufficiently large parallel corpus to be available as training data. In this paper, we propose to automatically extract parallel sentences from comparable corpora without using any MT system or even any parallel corpus at all. Instead, we use crosslingual information retrieval (CLIR), average word embeddings, text similarity and a bilingual dictionary, thus saving a significant amount of time and effort as no MT system is involved in this process. We conduct experiments on two different kinds of data: (i) formal texts from news domain, and (ii) user-generated content (UGC) from hotel reviews. The automatically extracted sentence pairs are then added to the already available parallel training data and the extended translation models are built from the concatenated data sets. Finally, we compare the performance of our new extended models against the baseline models built from the available data. The experimental evaluation reveals that our proposed approach is capable of improving the translation outputs for both the formal texts and UGC.
8

Gomes, Laerth, and Hilário Oliveira. "A Multi-document Summarization System for News Articles in Portuguese using Integer Linear Programming." In Encontro Nacional de Inteligência Artificial e Computacional. Sociedade Brasileira de Computação - SBC, 2019. http://dx.doi.org/10.5753/eniac.2019.9320.

Abstract:
Automatic Text Summarization (ATS) has been the subject of intense research in recent years. Its importance stems from the fact that ATS systems can aid in the processing of large amounts of textual documents. The ATS task aims to create a summary of one or more documents by extracting their most relevant information. Despite the existence of several works, research involving the development of ATS systems for documents written in Brazilian Portuguese is still scarce. In this paper, we propose a multi-document summarization system following a concept-based approach using Integer Linear Programming for the generation of summaries from news articles written in Portuguese. Experiments using the CSTNews corpus were performed to evaluate different aspects of the proposed system. The experimental results obtained on the ROUGE measures demonstrate that the developed system presents encouraging results, outperforming other works in the literature.
9

Hassanzadeh, Oktie, Debarun Bhattacharjya, Mark Feblowitz, Kavitha Srinivas, Michael Perrone, Shirin Sohrabi, and Michael Katz. "Answering Binary Causal Questions Through Large-Scale Text Mining: An Evaluation Using Cause-Effect Pairs from Human Experts." In Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19). California: International Joint Conferences on Artificial Intelligence Organization, 2019. http://dx.doi.org/10.24963/ijcai.2019/695.

Abstract:
In this paper, we study the problem of answering questions of type "Could X cause Y?" where X and Y are general phrases without any constraints. Answering such questions will assist with various decision analysis tasks such as verifying and extending presumed causal associations used for decision making. Our goal is to analyze the ability of an AI agent built using state-of-the-art unsupervised methods in answering causal questions derived from collections of cause-effect pairs from human experts. We focus only on unsupervised and weakly supervised methods due to the difficulty of creating a large enough training set with a reasonable quality and coverage. The methods we examine rely on a large corpus of text derived from news articles, and include methods ranging from large-scale application of classic NLP techniques and statistical analysis to the use of neural network based phrase embeddings and state-of-the-art neural language models.
10

Vajjala, Sowmya, and Ivana Lucic. "OneStopEnglish corpus: A new corpus for automatic readability assessment and text simplification." In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/w18-0535.


Reports on the topic "News text corpus"

1

Haring, Christopher, and David Biedenharn. Channel assessment tools for rapid watershed assessment. Engineer Research and Development Center (U.S.), April 2021. http://dx.doi.org/10.21079/11681/40379.

Abstract:
Existing Delta Headwaters Project (DHP) watershed stabilization studies are focused on restoration and stabilization of degraded stream systems. The original watershed studies, formerly under the Demonstration Erosion Control (DEC) Project, started in the mid-1980s. The watershed stabilization activities are continuing, and because of the vast number of degraded watersheds and limited amount of yearly funding, there is a need for developing a rapid watershed assessment approach to determine which watersheds to prioritize for further work. The goal of this project is to test the FluvialGeomorph (FG) toolkit to determine if the Rapid Geomorphic Assessment approach can identify channel stability trends in Campbell Creek and its main tributary. The FG toolkit (Haring et al. 2019; Haring et al. 2020) is a new rapid watershed assessment approach using high-resolution terrain data (Light Detection and Ranging [LiDAR]) to support U.S. Army Corps of Engineers (USACE) watershed planning. One of the principal goals of the USACE SMART (Specific Measurable Attainable Risk-Informed Timely) Planning is to leverage existing data and resources to complete studies. The FG approach uses existing LiDAR to rapidly assess either reach-specific analysis for smaller more focused studies or larger watersheds or ecosystems. The rapid assessment capability can reduce the time and cost of planning by using existing information to complete a preliminary watershed assessment and provide rapid results regarding where to focus more detailed study efforts.
2

Pedersen, Gjertrud. Symphonies Reframed. Norges Musikkhøgskole, August 2018. http://dx.doi.org/10.22501/nmh-ar.481294.

Abstract:
Symphonies Reframed recreates symphonies as chamber music. The project aims to capture the features that are unique for chamber music, at the juncture between the “soloistic small” and the “orchestral large”. A new ensemble model, the “triharmonic ensemble” with 7-9 musicians, has been created to serve this purpose. By choosing this size range, we are looking to facilitate group interplay without the need of a conductor. We also want to facilitate a richness of sound colours by involving piano, strings and winds. The exact combination of instruments is chosen in accordance with the features of the original score. The ensemble setup may take two forms: nonet with piano, wind quartet and string quartet (with double bass) or septet with piano, wind trio and string trio. As a group, these instruments have a rich tonal range with continuous and partly overlapping registers. This paper will illuminate three core questions: What artistic features emerge when changing from large orchestral structures to mid-sized chamber groups? How do the performers reflect on their musical roles in the chamber ensemble? What educational value might the reframing unfold? Since its inception in 2014, the project has evolved to include works with vocal, choral and soloistic parts, as well as sonata literature. Ensembles of students and professors have rehearsed, interpreted and performed our transcriptions of works by Brahms, Schumann and Mozart. We have also carried out interviews and critical discussions with the students, on their experiences of the concrete projects and on their reflections on own learning processes in general. Chamber ensembles and orchestras are exponents of different original repertoire. The difference in artistic output thus hinges upon both ensemble structure and the composition at hand. Symphonies Reframed seeks to enable an assessment of the qualities that are specific to the performing corpus and not beholden to any particular piece of music. 
Our transcriptions have enabled comparisons and reflections, using original compositions as a reference point. Some of our ensemble musicians have had first-hand experience with performing the original works as well. Others have encountered the works for the first time through our productions. This has enabled a multi-angled approach to the three central themes of our research. This text is produced in 2018.