Dissertations on the topic "Discourse analysis, Literary – Data processing"
Browse the top 18 dissertations for research on the topic "Discourse analysis, Literary – Data processing".
李嘉雯 and Ka-man Carmen Lee. "Chinese and English computer-mediated communication in the context of New Literacy Studies." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2002. http://hub.hku.hk/bib/B29872959.
Stephens, Maegan R. "A computerized content analysis of Oprah Winfrey's discourse during the James Frey controversy." Virtual Press, 2008. http://liblink.bsu.edu/uhtbin/catkey/1397651.
Department of Communication Studies
Caines, Andrew Paul. "You talking to me? : zero auxiliary constructions in British English." Thesis, University of Cambridge, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.609153.
Paterson, Kimberly Laurel Ms. "TSPOONS: Tracking Salience Profiles Of Online News Stories." DigitalCommons@CalPoly, 2014. https://digitalcommons.calpoly.edu/theses/1222.
Rickly, Rebecca J. "Exploring the dimensions of discourse : a multi-model analysis of electronic and oral discussions in developmental English." Virtual Press, 1995. http://liblink.bsu.edu/uhtbin/catkey/1001179.
Department of English
Mazidi, Karen. "Infusing Automatic Question Generation with Natural Language Understanding." Thesis, University of North Texas, 2016. https://digital.library.unt.edu/ark:/67531/metadc955021/.
Faruque, Md Ehsanul. "A Minimally Supervised Word Sense Disambiguation Algorithm Using Syntactic Dependencies and Semantic Generalizations." Thesis, University of North Texas, 2005. https://digital.library.unt.edu/ark:/67531/metadc4969/.
Sinha, Ravi Som. "Graph-based Centrality Algorithms for Unsupervised Word Sense Disambiguation." Thesis, University of North Texas, 2008. https://digital.library.unt.edu/ark:/67531/metadc9736/.
Silveira, Gabriela. "Narrativas produzidas por indivíduos afásicos e indivíduos cognitivamente sadios: análise computadorizada de macro e micro estrutura." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/5/5170/tde-01112018-101055/.
INTRODUCTION: Analysis of aphasic discourse provides important information about the phonological, morphological, syntactic, semantic and pragmatic aspects of the language of patients who have suffered a stroke. Discourse evaluation, along with other methods, can help track the evolution of language and communication in aphasic patients; however, manual analysis is laborious and error-prone. OBJECTIVES: (1) to analyze, using computerized technologies, macro- and microstructural aspects of the discourse of cognitively healthy individuals and of individuals with Broca's and anomic aphasia; (2) to explore discourse as an indicator of the evolution of aphasia; (3) to analyze the contribution of single photon emission computed tomography (SPECT) in verifying the correlation between behavioral and neuroimaging evolution data. METHOD: Two groups of patients were studied: GA1, consisting of eight individuals with Broca's aphasia and anomic aphasia, assessed longitudinally in the sub-acute phase of the lesion and again after three and six months; and GA2, composed of 15 individuals with Broca's and anomic aphasia, with varying times since stroke onset. A control group (GC) consisted of 30 cognitively healthy participants. Computerized technologies were explored for analyzing metrics related to the micro- and macrostructure of discourse elicited with the Cinderella story and the Cookie Theft picture. RESULTS: Comparing GC and GA2 on discourse macrostructure, the GA2 aphasics differed significantly from GC in the total number of propositions produced; on microstructure, seven metrics differentiated the two groups. The macro- and microstructure of discourse differed significantly between subjects with Broca's aphasia and those with anomic aphasia. In GA1, macro- and microstructure measurements changed with time since injury.
In GA1, comparing parameters in the sub-acute phase and six months after stroke revealed differences in macrostructure: an increase in the number of propositions in the orientation block and in total propositions. Regarding microstructure, the initial measures of syllables per content word, incidence of nouns, and incidence of content words differed after six months of intervention. The variable "incidence of words missing from the dictionary" showed a significantly lower value three months after stroke. The Cinderella story provided more complete microstructure data than the Cookie Theft picture. SPECT showed no change over time and thus did not track the evolution of aphasia. CONCLUSION: The discourse elicited with the Cinderella story and the Cookie Theft picture generated material for macrostructure and microstructure analysis of cognitively healthy and aphasic individuals, made it possible to quantify and qualify the evolution of language in different phases of stroke recovery, and distinguished the behavior of healthy individuals from that of individuals with Broca's and anomic aphasia in macro- and microstructural aspects. Computerized tools facilitated the analysis of the microstructure data but were not applicable to the macrostructure, which shows that the tools need adjustment for the discourse analysis of patients. SPECT data did not reflect the behavioral improvement in the language of aphasic subjects.
Pienaar, Cheryl Leelavathie. "Towards a corpus of Indian South African English (ISAE) : an investigation of lexical and syntactic features in a spoken corpus of contemporary ISAE." Thesis, Rhodes University, 2008. http://hdl.handle.net/10962/d1002640.
Barakat, Arian. "What makes an (audio)book popular?" Thesis, Linköpings universitet, Statistik och maskininlärning, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-152871.
López, del Castillo Wilderbeek Francisco Leslie. "El Discurso social en España." Doctoral thesis, Universitat Pompeu Fabra, 2018. http://hdl.handle.net/10803/663746.
This research sets out to understand the entire discursive flow of a society at a given time. It takes as its reference the work of the historian of ideas Marc Angenot, who in The social discourse (2010) sought to interpret all the written material produced in France in 1889. The scale of current content production has forced an extension of the discursive fields in which the voice of a society can be observed, from mass media to social media. This quantitative change has required applying diverse methodologies, from models for processing large volumes of text to the semio-discursive analysis of significant samples obtained computationally. The final result, both in the conclusions obtained and in the path taken, is an attempt to turn a concept with broad limits into a concrete and fruitful method.
Oyerinde, Oyeyinka Dantala. "Creating public value in information and communication technology: a learning analytics approach." Thesis, 2019. http://hdl.handle.net/10500/26446.
School of Computing
Ph.D. (Information Systems)
"Chinese readability analysis and its applications on the internet." 2007. http://library.cuhk.edu.hk/record=b5893108.
Thesis submitted in: October 2006.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2007.
Includes bibliographical references (leaves 110-122).
Abstracts in English and Chinese.
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 1.1 --- Motivation and Major Contributions
Chapter 1.1.1 --- Chinese Readability Analysis
Chapter 1.1.2 --- Web Readability Analysis
Chapter 1.2 --- Thesis Chapter Organization
Chapter 2 --- Related Work
Chapter 2.1 --- Readability Assessment
Chapter 2.1.1 --- Assessment for Text Document
Chapter 2.1.2 --- Assessment for Web Page
Chapter 2.2 --- Support Vector Machine
Chapter 2.2.1 --- Characteristics and Advantages
Chapter 2.2.2 --- Applications
Chapter 2.3 --- Chinese Word Segmentation
Chapter 2.3.1 --- Difficulty in Chinese Word Segmentation
Chapter 2.3.2 --- Approaches for Chinese Word Segmentation
Chapter 3 --- Chinese Readability Analysis
Chapter 3.1 --- Chinese Readability Factor Analysis
Chapter 3.1.1 --- Systematic Analysis
Chapter 3.1.2 --- Feature Extraction
Chapter 3.1.3 --- Limitation of Our Analysis and Possible Extension
Chapter 3.2 --- Research Methodology
Chapter 3.2.1 --- Definition of Readability
Chapter 3.2.2 --- Data Acquisition and Sampling
Chapter 3.2.3 --- Text Processing and Feature Extraction
Chapter 3.2.4 --- Regression Analysis using Support Vector Regression
Chapter 3.2.5 --- Evaluation
Chapter 3.3 --- Introduction to Support Vector Regression
Chapter 3.3.1 --- Basic Concept
Chapter 3.3.2 --- Non-Linear Extension using Kernel Technique
Chapter 3.4 --- Implementation Details
Chapter 3.4.1 --- Chinese Word Segmentation
Chapter 3.4.2 --- Building Basic Chinese Character / Word Lists
Chapter 3.4.3 --- Pull Sentence Detection
Chapter 3.4.4 --- Feature Selection Using Genetic Algorithm
Chapter 3.5 --- Experiments
Chapter 3.5.1 --- Experiment 1: Evaluation on Chinese Word Segmentation using the LMR-RC Tagging Scheme
Chapter 3.5.2 --- Experiment 2: Initial SVR Parameters Searching with Different Kernel Functions
Chapter 3.5.3 --- Experiment 3: Feature Selection Using Genetic Algorithm
Chapter 3.5.4 --- Experiment 4: Training and Cross-validation Performance using the Selected Feature Subset
Chapter 3.5.5 --- Experiment 5: Comparison with Linear Regression
Chapter 3.6 --- Summary and Future Work
Chapter 4 --- Web Readability Analysis
Chapter 4.1 --- Web Page Readability
Chapter 4.1.1 --- Readability as Comprehension Difficulty
Chapter 4.1.2 --- Readability as Grade Level
Chapter 4.2 --- Web Site Readability
Chapter 4.3 --- Experiments
Chapter 4.3.1 --- Experiment 1: Web Page Readability Analysis - Comprehension Difficulty
Chapter 4.3.2 --- Experiment 2: Web Page Readability Analysis - Grade Level
Chapter 4.3.3 --- Experiment 3: Web Site Readability Analysis
Chapter 4.4 --- Summary and Future Work
Chapter 5 --- Conclusion
Chapter A --- List of Symbols and Notations
Chapter B --- List of Publications
Bibliography
"Towards discourse classification for Chinese, a resource-poor language." 2014. http://repository.lib.cuhk.edu.hk/en/item/cuhk-1290645.
First, we propose a novel unsupervised bootstrapping method for discourse classification based on semantic sequential representation (SSR). SSR is a new representation for discourse instances that integrates basic bag-of-words information with lexical, semantic and word-sequence information. Our method starts with a small set of cue-phrase-based patterns to collect a large number of discourse instances, which are then converted to SSRs. We then propose an unsupervised SSR learner that generates, weighs and filters new SSRs without cue phrases for recognizing discourse relations. Experimental results show that our method outperforms the previous unsupervised method by 7% in F-score. We also show that SSRs are effective features for supervised learning methods.
The SSR-based method (F-score = 0.63) ignores the ambiguities of discourse connectives. As a result, it suffers from low recall (Recall = 0.49). To discover and eliminate these ambiguities, we further propose a cross-language framework for discourse classification. In our framework, discourse classification for Chinese is achieved in two steps: (1) discourse connective/trigger identification and (2) sense classification. The English Penn Discourse Treebank 2 (PDTB2) and Chinese-English parallel data are coupled to provide the training data for a co-training-based framework. Experimental results show that our method achieves a significant improvement compared to the SSR-based method. The proposed framework is practical and effective, especially in coping with the inter community problem, which is common in cross-language discourse classification. Moreover, the proposed framework does not integrate any language-specific features, making it theoretically applicable to other languages.
Every language has unique characteristics, and our cross-language framework, which focuses on the characteristics that languages share, is ineffective at detecting those specific to Chinese. We therefore package the corpus used in this research as the Discourse Treebank for Chinese (DTBC). DTBC adopts the principles of PDTB2 and, at the same time, incorporates the linguistic characteristics of Chinese. The annotation work adds a discourse layer to 890 articles from the Penn Chinese Tree Bank 5 (CTB5). DTBC is the first open Chinese discourse treebank and will be an invaluable linguistic resource for future research on Chinese discourse.
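Discourse raises questions of semantic understanding, in particular the cohesion and coherence of text. Like lexical and syntactic analysis, discourse classification is one of the fundamental problems of computational linguistics. Compared with other problems in the field, research on discourse classification is still at an early stage. For the vast majority of languages other than English, the lack of discourse-annotated data severely limits such research. Annotating discourse data is well known to be complex and time-consuming. One way around this difficulty is to explore unsupervised discourse classification methods. However, earlier work on English shows that unsupervised methods suffer from low accuracy and can handle only coarse-grained discourse relations. Another approach is to transfer discourse classification techniques from a source language with abundant annotated data to other target languages. However, current cross-language discourse classification techniques are still immature. Taking Chinese as the target language, this thesis pioneers the study of discourse classification for Chinese when local annotated data are very limited (resource-poor). In addition, we annotated the first open Chinese discourse treebank, covering 890 news articles.
To overcome the shortcomings of earlier unsupervised methods, we first propose a novel unsupervised method based on Semantic Sequential Representation (SSR). SSR is a new way of representing discourse instances that integrates bag-of-words, lexical, semantic and word-order information. Our method starts from a small set of patterns based on discourse connectives and collects a large number of discourse instances from raw Chinese text, which we then represent as SSRs. We then propose an unsupervised method that mines, scores and filters SSRs without considering discourse connectives. Experimental results show that the proposed method improves the F-score by 7% over the previous method. We also show that SSRs can serve as effective features for supervised discourse classification.
The unsupervised SSR-mining method (F-score = 0.63) ignores the ambiguity of discourse connectives, so its recall is low. To resolve this ambiguity, we further propose a cross-language framework for discourse classification. In our framework, Chinese discourse classification consists of two steps: (1) identification of discourse connectives/triggers; (2) discourse relation classification. We combine the English discourse treebank (PDTB2: Penn Discourse TreeBank 2.0) and the Chinese treebank (CTB5: Chinese TreeBank 5.0) as training data, which serve as input to a co-training framework. Experimental results show that the proposed cross-language method achieves a very significant F-score improvement over the method that uses SSRs alone. This indicates that the framework can effectively identify what discourse classification has in common across languages, using bilingual parallel data as a bridge. Notably, the framework requires no language-specific features, so it can readily be extended and applied to other languages.
Every language has its own characteristics. Our cross-language method focuses on what languages share, and therefore cannot effectively capture the characteristics unique to Chinese discourse classification. We consolidated the Chinese discourse data annotated in our experiments into the Discourse TreeBank for Chinese (DTBC). DTBC follows the construction principles of the English discourse treebank while being extensively adapted to the characteristics of Chinese. Our annotation adds a discourse layer to all 890 news articles of the Chinese treebank (CTB5: The Chinese TreeBank 5.0). DTBC is the first open, large-scale Chinese discourse treebank corpus, and provides essential foundational annotated data for future research on Chinese discourse analysis.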
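The first step described in this abstract, collecting discourse instances from raw text with a small set of cue-phrase patterns, can be illustrated with a minimal Python sketch. The seed cue phrases, the relation labels and the function name `collect_instances` are hypothetical choices made for illustration (in English, for readability); they are not taken from the thesis, and the SSR conversion and learner are not modeled here.

```python
import re
from collections import defaultdict

# Hypothetical seed patterns (assumption): cue phrase -> discourse relation label.
SEED_CUES = {
    "because": "Contingency",
    "however": "Comparison",
    "for example": "Expansion",
    "afterwards": "Temporal",
}

def collect_instances(sentences):
    """Collect (arg1, arg2) pairs per relation using the seed cue phrases.

    Each sentence is assigned to the first seed cue (in dictionary order)
    that it contains; sentences without any seed cue are skipped.
    """
    instances = defaultdict(list)
    for sentence in sentences:
        for cue, relation in SEED_CUES.items():
            match = re.search(r"\b" + re.escape(cue) + r"\b", sentence, re.IGNORECASE)
            if match:
                # Split the sentence into the two arguments around the cue phrase.
                arg1 = sentence[:match.start()].strip(" ,;")
                arg2 = sentence[match.end():].strip(" ,;.")
                instances[relation].append((arg1, arg2))
                break
    return dict(instances)

if __name__ == "__main__":
    sample = [
        "He stayed home because it rained.",
        "She tried; however, it failed.",
        "No cue here.",
    ]
    for relation, pairs in collect_instances(sample).items():
        print(relation, pairs)
```

In a bootstrapping setup of the kind the abstract outlines, the pairs collected this way would be the starting pool from which cue-free representations are later learned; that later stage is outside this sketch.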
Zhou, Lanjun.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2014.
Includes bibliographical references (leaves 98-104).
Abstracts also in Chinese.
Title from PDF title page (viewed on 20, December, 2016).
Detailed summary in vernacular field only.
Stewart, Graham Douglas James. "The implications of e-text resource development for Southern African literary studies in terms of analysis and methodology." Thesis, 1999. http://hdl.handle.net/10413/9002.
Thesis (Ph.D)-University of Durban-Westville, Durban, 1999.
Mak, King Tong. "The dynamics of collocation: a corpus-based study of the phraseology and pragmatics of the introductory-it construction." Thesis, 2005. http://hdl.handle.net/2152/1776.
Akova, Ferit. "A nonparametric Bayesian perspective for machine learning in partially-observed settings." Thesis, 2014. http://hdl.handle.net/1805/4825.
Robustness and generalizability of supervised learning algorithms depend on the quality of the labeled data set in representing the real-life problem. In many real-world domains, however, we may not have full knowledge of the underlying data-generating mechanism, which may even have an evolving nature, introducing new classes continually. This constitutes a partially-observed setting, where it would be impractical to obtain a labeled data set exhaustively defined by a fixed set of classes. Traditional supervised learning algorithms, assuming an exhaustive training library, would misclassify a future sample of an unobserved class with probability one, leading to an ill-defined classification problem. Our goal is to address situations where this assumption is violated by a non-exhaustive training library, which is a very realistic yet overlooked issue in supervised learning. In this dissertation we pursue a new direction for supervised learning by defining self-adjusting models to relax the fixed model assumption imposed on classes and their distributions. We let the model adapt itself to the prospective data by dynamically adding new classes/components as the data demand, which in turn gradually makes the model more representative of the entire population. In this framework, we first employ suitably chosen nonparametric priors to model class distributions for observed as well as unobserved classes and then utilize new inference methods to classify samples from observed classes and discover/model novel classes for those from unobserved classes. This thesis presents the initiating steps of an ongoing effort to address one of the most overlooked bottlenecks in supervised learning and indicates the potential for taking new perspectives in some of the most heavily studied areas of machine learning: novelty detection, online class discovery and semi-supervised learning.
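The idea of letting a model add new classes as the data demand can be illustrated with a minimal Python sketch of the Chinese restaurant process, one common nonparametric prior. This is an assumption for illustration only: the abstract does not name the prior it uses, and the function name `crp_assignment_probabilities` and the concentration parameter `alpha` are hypothetical. The sketch shows only the prior probabilities; a full classifier would combine them with class likelihoods.

```python
def crp_assignment_probabilities(class_counts, alpha):
    """Prior probabilities under a Chinese restaurant process (illustrative assumption).

    class_counts: number of samples already assigned to each known class.
    alpha: concentration parameter; larger values favor opening a new class.
    Returns (probabilities for the existing classes, probability of a new class).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(class_counts) + alpha
    existing = [count / total for count in class_counts]
    new_class = alpha / total
    return existing, new_class

if __name__ == "__main__":
    existing, new_class = crp_assignment_probabilities([3, 1], alpha=1.0)
    print(existing, new_class)
```

Because the new-class probability is never zero, a sample from a previously unobserved class is not forced into one of the known classes, which is the failure mode of a fixed, exhaustive class set that the abstract describes.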