Dissertations / Theses on the topic 'Text linguistics'
Consult the top 50 dissertations / theses for your research on the topic 'Text linguistics.'
Atwell, Eric Steven. "Corpus linguistics and language learning : bootstrapping linguistic knowledge and resources from text." Thesis, University of Leeds, 2008. http://etheses.whiterose.ac.uk/7504/.
Clough, Paul D. "Measuring text reuse." Thesis, University of Sheffield, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.275023.
Boer, Maria Ângela de Sousa. "Systemic linguistics and the grammar of the text." reponame:Repositório Institucional da UFPR, 2010. http://hdl.handle.net/1884/24322.
Tagg, Caroline. "A corpus linguistics study of SMS text messaging." Thesis, University of Birmingham, 2009. http://etheses.bham.ac.uk//id/eprint/253/.
Roloff, Vera Lucia Posnik. "Foreign language reading comprehension: Text representation and the effects of text explicitness and reading ability." Thesis, University of Ottawa (Canada), 1999. http://hdl.handle.net/10393/8791.
Maisto, Alessandro. "A Hybrid Framework for Text Analysis." Doctoral thesis, Universita degli studi di Salerno, 2017. http://hdl.handle.net/10556/2481.
In Computational Linguistics there is an essential dichotomy between linguists and computer scientists. The former, with a strong knowledge of language structures, lack engineering skills. The latter, expert in computing and mathematics, assign little value to the basic mechanisms and structures of language. Moreover, this discrepancy has grown in recent decades with the growth of computational resources and the gradual computerization of the world; Machine Learning technologies for Artificial Intelligence problem solving, which allow machines to learn from manually generated examples, have been used more and more often in Computational Linguistics to overcome the obstacle represented by language structures and their formal representation. The dichotomy has given rise to two main approaches to Computational Linguistics: rule-based methods, which try to imitate the way humans use and understand language by reproducing the syntactic structures on which understanding is based and by building lexical resources such as electronic dictionaries, taxonomies or ontologies; and statistics-based methods, which instead treat language as a group of elements, quantifying words mathematically and trying to extract information without identifying syntactic structures or, in some algorithms, trying to give the machine the ability to learn these structures. One of the main problems is the lack of communication between these two approaches, owing to the substantial differences between them: on the one hand there is a strong focus on how language works and on its characteristics, with a tendency towards analytical and manual work; on the other hand, the engineering perspective sees language as an obstacle and regards algorithms as the fastest way to overcome it.
However, the lack of communication is not an incompatibility: following Harris, the best way to approach natural language may be to take the best of both. At present there are many open-source tools that perform text analysis and Natural Language Processing. A large share of these tools are based on statistical models and consist of separate modules that can be combined into a text-processing pipeline. Many of these resources are code packages without a GUI (Graphical User Interface), which makes them impossible to use for people without programming skills. Furthermore, the vast majority of these open-source tools support only English and, when Italian is included, their performance decreases significantly. Open-source tools for Italian, on the other hand, are very few. In this work we aim to fill this gap by presenting a new hybrid framework for the analysis of Italian texts. It is not intended as a commercial tool; it was built to help linguists and other scholars perform rapid text analysis and produce linguistic data. The framework, which performs both statistical and rule-based analysis, is called LG-Starship. The idea is to build modular software that initially includes the basic algorithms for different kinds of analysis. The modules perform the following tasks: Preprocessing Module: a module with which a text can be loaded, normalized or stripped of stop-words. As output, the module presents the list of tokens and letters that compose the text, with their respective occurrence counts, and the processed text. Mr. Ling Module: a module that performs POS tagging and lemmatization. The module also returns the table of lemmas with their occurrence counts and the table quantifying the grammatical tags.
Statistic Module: a module that calculates Term Frequency and TF-IDF of tokens or lemmas, extracts bigrams and trigrams, and exports the results as tables. Semantic Module: a module that uses the Hyperspace Analogue to Language algorithm to calculate semantic similarity between words. It returns word-by-word similarity matrices that can be exported and analyzed. Syntactic Module: a module that analyzes the syntactic structures of a selected sentence and tags the verbs and their arguments with semantic labels. The objective of the Framework is to build an all-in-one platform for NLP that allows any kind of user to perform basic and advanced text analysis. To make the Framework accessible to users without specific computer science and programming skills, the modules have been provided with an intuitive GUI. The framework can be considered hybrid in a double sense: as explained above, it uses both statistical and rule-based methods, relying on standard statistical algorithms or techniques and, at the same time, on Lexicon-Grammar syntactic theory. In addition, it has been written in both Java and Python. The LG-Starship Framework has a simple Graphical User Interface but will also be released as separate modules that can be included independently in any NLP pipeline. There are many resources of this kind, but the large majority work for English. There are very few free resources for Italian, and this work tries to meet this need by proposing a tool that can be used both by linguists or other scientists interested in language and text analysis who have no knowledge of programming languages, and by computer scientists, who can use the free modules in their own code or in combination with different NLP algorithms. The Framework starts from a text or corpus written directly by the user or loaded from an external resource.
The LG-Starship Framework workflow is described in the flowchart shown in Fig. 1. The pipeline shows that the Preprocessing Module is applied to the original imported or generated text in order to produce a clean, normalized preprocessed text. This module includes a function for text splitting, a stop-word list and a tokenization method. Either the Statistic Module or the Mr. Ling Module can be applied to the preprocessed text. The first, which includes basic statistical algorithms such as Term Frequency, TF-IDF and n-gram extraction, outputs databases of lexical and numerical data that can be used to produce charts or to perform further external analysis. The second is divided into two main tasks: a POS tagger, based on the Averaged Perceptron Tagger [?] and trained on the Paisà Corpus [Lyding et al., 2014], performs Part-of-Speech tagging and produces an annotated text; a lemmatization method, which relies on a set of electronic dictionaries developed at the University of Salerno [Elia, 1995, Elia et al., 2010], takes the POS-tagged text as input and produces a new lemmatized version of the original text with information about syntactic and semantic properties. This lemmatized text, which can also be processed with the Statistic Module, serves as input for two deeper levels of text analysis carried out by the Syntactic Module and the Semantic Module. The first rests on Lexicon-Grammar Theory [Gross, 1971, 1975] and uses a database of predicate structures under development at the Department of Political, Social and Communication Science. Its objective is to produce a dependency graph of the sentences that compose the text. The Semantic Module uses the Hyperspace Analogue to Language distributional semantics algorithm [Lund and Burgess, 1996], trained on the Paisà Corpus, to produce a semantic network of the words of the text. This workflow has been used in two different experiments involving two user-generated corpora.
The first experiment is a statistical study of the language of rap music in Italy, through the analysis of a large corpus of rap song lyrics downloaded from online databases of user-generated lyrics. The second experiment is a Feature-Based Sentiment Analysis project performed on user product reviews. For this project we integrated a large domain database of linguistic resources for Sentiment Analysis, developed in past years by the Department of Political, Social and Communication Science of the University of Salerno, which consists of polarized dictionaries of verbs, adjectives, adverbs and nouns. These two experiments underline how the linguistic framework can be applied to different levels of analysis and can produce both qualitative and quantitative data. As for the results obtained, the Framework, which is only at a beta version, achieves fair results in terms of both processing time and precision. Nevertheless, the work is far from complete. More algorithms will be added to the Statistic Module and the Syntactic Module will be completed. The GUI will be improved and made more attractive and modern and, in addition, an open-source online version of the modules will be published. [edited by author]
XV n.s.
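The kind of computation the Maisto abstract attributes to its Preprocessing and Statistic Modules (stop-word removal, Term Frequency, TF-IDF, bigram extraction) can be sketched roughly as follows. This is only an illustrative sketch, not the LG-Starship code: the function names, the tiny stop-word list, the regular-expression tokenizer and the unsmoothed logarithmic IDF are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (the thesis's own list is not
# given in the abstract).
STOP_WORDS = {"il", "la", "di", "e", "un", "una", "che"}


def preprocess(text):
    """Lowercase the text, split it into runs of letters, drop stop-words."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequency(tokens):
    """Relative frequency of each term within one tokenized document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}


def tf_idf(docs):
    """TF-IDF per document, assuming idf = log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = term_frequency(doc)
        scores.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return scores


def bigrams(tokens):
    """Adjacent token pairs, in order of appearance."""
    return list(zip(tokens, tokens[1:]))


docs = [
    preprocess("Il gatto mangia il pesce"),
    preprocess("Il cane mangia la carne"),
]
scores = tf_idf(docs)
```

With this unsmoothed IDF, a term that occurs in every document (here "mangia") scores zero, while terms unique to one document score highest; a production tool would likely use a smoothed variant and a full stop-word list.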
Kof, Leonid. "Text analysis for requirements engineering : application of computational linguistics /." Saarbrücken : VDM Verl. Dr. Müller, 2007. http://deposit.d-nb.de/cgi-bin/dokserv?id=3021639&prov=M&dok_var=1&dok_ext=htm.
Dawson, David Allan. "Text-linguistics and Biblical Hebrew : an examination of methodologies." Thesis, University of Edinburgh, 1994. http://hdl.handle.net/1842/19674.
Fulford, Heather. "Term acquisition : a text-probing approach." Thesis, University of Surrey, 1997. http://epubs.surrey.ac.uk/843700/.
Laffling, John D. "Machine disambiguation and translation of polysemous nouns : a lexicon-driven model for text-semantic analysis and parallel text-dependent transfer in German-English translation of party political texts." Thesis, University of Wolverhampton, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.254466.
Wilson, Christin M. L. "Variation and Text Type in Old Occitan Texts." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1331136026.
Law, Yee Wah Mary. "The study of register differentiation of two types of press text : opinion article & feature news." HKBU Institutional Repository, 2003. http://repository.hkbu.edu.hk/etd_ra/488.
Cheng, Chi Wa. "Probabilistic topic modeling and classification probabilistic PCA for text corpora." HKBU Institutional Repository, 2011. http://repository.hkbu.edu.hk/etd_ra/1263.
Whitelaw, Casey. "Systemic features for text classification." Thesis, The University of Sydney, 2005. https://hdl.handle.net/2123/28097.
J'Fellers, J., and Theresa McGarry. "Language and Linguistics." Digital Commons @ East Tennessee State University, 2009. https://dc.etsu.edu/etsu-works/6151.
Heister, Julian, and Reinhold Kliegl. "Comparing word frequencies from different German text corpora." Universität Potsdam, 2012. http://opus.kobv.de/ubp/volltexte/2012/6234/.
Keenan, Francis Gerard. "Large vocabulary syntactic analysis for text recognition." Thesis, Nottingham Trent University, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.334311.
Unaldi, Aylin. "Investigating reading for academic purposes : sentence, text and multiple texts." Thesis, University of Bedfordshire, 2010. http://hdl.handle.net/10547/279255.
Fournier, Christopher. "Evaluating Text Segmentation." Thèse, Université d'Ottawa / University of Ottawa, 2013. http://hdl.handle.net/10393/24064.
Scott, Sam. "Feature engineering for a symbolic approach to text classification." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ36741.pdf.
Mohamed, Muhidin Abdullahi. "Automatic text summarisation using linguistic knowledge-based semantics." Thesis, University of Birmingham, 2016. http://etheses.bham.ac.uk//id/eprint/6659/.
Teich, Elke, and Peter Fankhauser. "Exploring lexical patterns in text : lexical cohesion analysis with WordNet." Universität Potsdam, 2005. http://opus.kobv.de/ubp/volltexte/2006/868/.
Using an electronic thesaurus-like resource, Princeton WordNet, and the Brown Corpus of English, we have implemented a process of annotating text with lexical chains and a graphical user interface for inspection of the annotated text.
We describe the system and report on some sample linguistic analyses carried out using the combined thesaurus-corpus resource.
Calderon, de Bolivar Adriana. "Interaction through written text : a discourse analysis of newspaper editorials." Thesis, University of Birmingham, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.312040.
af, Geijerstam Åsa. "Att skriva i naturorienterande ämnen i skolan." Doctoral thesis, Uppsala University, Department of Linguistics and Philology, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-7352.
When children encounter new subjects in school, they are also faced with new ways of using language. Learning science thus means learning the language of science, and writing is one of the ways this is accomplished. The present study investigates writing in natural sciences in grades 5 and 8 in Swedish schools. Major theoretical influences for these investigations are found within the socio-cultural, dialogical and social semiotic perspectives on language use.
The study is based on texts written by 97 students, interviews around these texts and observations from 16 different classroom practices. Writing is seen as a situated practice; therefore analysis is carried out of the activities surrounding the texts. The student texts are analysed in terms of genre and in relation to their abstraction, density and use of expansions. This analysis shows among other things that the texts show increasing abstraction and density with increasing age, whereas the text structure and the use of expansions do not increase.
It is also argued that a central point in school writing must be the students’ way of talking about their texts. Analysis of interviews with the students is thus carried out in terms of text movability. The results from this analysis indicate that students find it difficult to talk about their texts. They find it hard to express the main content of the text, as well as to discuss its function and potential readers.
Previous studies argue that writing constitutes a potential for learning. In the material studied in this thesis, this potential learning tool is not used to any large extent. To be able to participate in natural sciences in higher levels, students need to take part in practices where the specialized language of natural science is used in writing as well as in speech.
Vajjala, Balakrishna Sowmya [Verfasser], and Detmar [Akademischer Betreuer] Meurers. "Analyzing Text Complexity and Text Simplification : Connecting Linguistics, Processing and Educational Applications / Sowmya Vajjala Balakrishna ; Betreuer: Detmar Meurers." Tübingen : Universitätsbibliothek Tübingen, 2015. http://d-nb.info/1163397652/34.
McGarry, Theresa, and J. Mwinyelle. "Adverbial Clauses and Gender in English and Spanish." Digital Commons @ East Tennessee State University, 2014. https://dc.etsu.edu/etsu-works/6155.
Pindi, Makaya ma Kimvwela. "Schematic structure and the modulation of propositions in economics forecasting text." Thesis, Online version, 1988. http://ethos.bl.uk/OrderDetails.do?did=1&uin=uk.bl.ethos.3821053.
Mason, Oliver Jan. "The automatic extraction of linguistic information from text corpora." Thesis, University of Birmingham, 2006. http://etheses.bham.ac.uk//id/eprint/116/.
Danielsson, Benjamin. "A Study on Text Classification Methods and Text Features." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-159992.
Doyle, Paul G. "Replicating corpus linguistics : a corpus-driven investigation of lexical networks in text." Thesis, Lancaster University, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.418685.
Baka, Farida. "The discourse of biology lectures : aspects of its mode and text structure." Thesis, Aston University, 1989. http://publications.aston.ac.uk/14815/.
Ewert, Doreen Elizabeth. "The expression of temporality in the written discourse of L2 learners of English : distinguishing text-types and text passages /." [Bloomington, Ind.] : Indiana University, 2006. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3220175.
Source: Dissertation Abstracts International, Volume: 67-05, Section: A, page: 1710. Adviser: Kathleen Bardovi-Harlig. "Title from dissertation home page (viewed June 20, 2007)."
Lindén, Johannes. "Extracting Text into Meta-Data : Improving machine text-understanding of news-media articles." Licentiate thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-41775.
At the time of the public defence the following papers were unpublished: paper 4 submitted.
Folkeryd, Jenny W. "Writing with an Attitude : Appraisal and student texts in the school subject of Swedish." Doctoral thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-7410.
Zhang, Yaxi. "Named Entity Recognition for Social Media Text." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-395978.
Brewer, C. D. "Some implications of the Z-text for the textual tradition of Piers Plowman." Thesis, University of Oxford, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.371610.
Edling, Agnes. "Abstraction and authority in textbooks : The textual paths towards specialized language." Doctoral thesis, Uppsala University, Department of Linguistics and Philology, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6989.
During a few hours of a school day, a student might read textbook texts which are highly diversified in terms of abstraction. Abstraction is a central feature of specialized language and the transition from everyday language to specialized language is one of the most important things formal education can offer students. That transition is the focus of this thesis.
This study introduces a new three-graded classification of abstraction including the levels of specificity, generalization and abstraction, based on a discussion of the concept of abstraction. The investigations performed, based on this classification, show that texts from different subject areas display distinct patterns of abstraction. The Swedish literary texts had the lowest degree of abstraction, the social science texts had an intermediate degree and the natural science texts were the most generalized and abstract. The results also show that the degree of abstraction in the textbook texts increases in later grade levels.
The thesis presents a new way of analyzing shifts between levels of abstraction and their functions. Interestingly, the texts with a medium degree of abstraction, the social science texts, are the ones with the greatest variety in shifts. The functions of the shifts differ with respect to cultural domains. The shifts in the Swedish literary texts in general belong to the everyday domain while the shifts in the natural science texts belong to a specialized domain. The shifts in the social science texts had features of both domains.
A secondary aim of the thesis is to develop the understanding of the relationship between author and reader in the texts. The results from my investigation of modality in the Swedish textbook texts confirm the earlier findings from English and Spanish textbooks. In comparison to other text types, textbook texts present knowledge in a more authoritative and less modalized way.
From time to time, abstraction is described as a feature that hinders students accessing texts. Some researchers even suggest a removal of features of specialized language in textbook texts, in order to increase students’ understanding. However, in a society where specialized knowledge is necessary, the access to specialized texts is important. A democratic view of education and school mandates that children and adolescents have the opportunity to encounter and learn to encounter specialized language in school. In analyzing the texts special attention is paid to the relationship between the texts, the contexts of use and the student readers.
Paun, Silviu. "Topic models for short text data." Thesis, University of Essex, 2017. http://repository.essex.ac.uk/19715/.
Williams, Ken. "A framework for text categorization." Thesis, The University of Sydney, 2003. https://hdl.handle.net/2123/27951.
Mills, Jon. "Computer assisted lemmatisation of a Cornish text corpus for lexicographical purposes." Thesis, University of Kent, 2002. http://kar.kent.ac.uk/8301/.
Micallef, Paul. "A text to speech synthesis system for Maltese." Thesis, University of Surrey, 1997. http://epubs.surrey.ac.uk/842702/.
Forsyth, Richard. "Stylistic structures : a computational approach to text classification." Thesis, University of Nottingham, 1996. http://eprints.nottingham.ac.uk/13445/.
Delisle, Sylvain. "Text processing without a priori domain knowledge: Semi-automatic linguistic analysis for incremental knowledge acquisition." Thesis, University of Ottawa (Canada), 1994. http://hdl.handle.net/10393/6574.
Xu, Jingguo. "A study of the reading process in Chinese through detecting errors in a meaningful text." Diss., The University of Arizona, 1998. http://hdl.handle.net/10150/282855.
Plum, Guenter Arnold. "Text and Contextual Conditioning in Spoken English: A genre approach." Thesis, The University of Sydney, 1988. http://hdl.handle.net/2123/608.
Plum, Guenter Arnold. "Text and Contextual Conditioning in Spoken English: A genre approach." University of Sydney. Linguistics, 1988. http://hdl.handle.net/2123/608.
Santos, Rodrigo Maia Theodoro dos. "Procedimentos e operações de reconstrução textual." Pontifícia Universidade Católica de São Paulo, 2012. https://tede2.pucsp.br/handle/handle/14248.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
This thesis studies textual revision procedures that rest not only on grammatical criteria but also on the factors of textuality, looking for operational rearticulations in the process of reconstructing a text. The main objective is to identify, in the selected corpus, items that exhibit the possible methodological revision operations. The intention is to develop a diagram of procedures to guide professionals who work with text. The theoretical framework draws on Text Linguistics and the criteria of textuality, which are identified as chiefly responsible for articulating the text as a meaningful unit. Accordingly, the corpus consists of academic abstracts of final course projects. This genre was chosen because an abstract is a succinct way of recovering a longer text, which makes it a retextualization procedure. In this way, the search for textual reconstruction procedures can be carried out with greater quality and clarity. The thesis further shows, objectively and with examples, that the teacher or reviser needs to consider items that go beyond grammatical aspects. From the procedures and operations developed in the thesis, it became clear that adaptation to the genre and to the factors of textuality is fundamental to producing a competent text.
Al-Jubouri, Adnan J. R. "Computer-aided categorisation and quantification of connectives in English and Arabic (based on newspaper text corpora)." Thesis, Aston University, 1987. http://publications.aston.ac.uk/10283/.
Cash, Cash Phillip E. "Timnakni Timat (writing from the heart): Sahaptin discourse and text in the speaker writing of Xiluxin." Thesis, The University of Arizona, 2000. http://hdl.handle.net/10150/278750.
Rennes, Evelina. "Improved Automatic Text Simplification by Manual Training." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-120001.