Dissertations / Theses on the topic 'Spoken language'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Spoken language.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Ryu, Koichiro, and Shigeki Matsubara. "SIMULTANEOUS SPOKEN LANGUAGE TRANSLATION." INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE, 2006. http://hdl.handle.net/2237/10466.
Jones, J. M. "Iconicity and spoken language." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/1559788/.
Dinarelli, Marco. "Spoken Language Understanding: from Spoken Utterances to Semantic Structures." Doctoral thesis, Università degli studi di Trento, 2010. https://hdl.handle.net/11572/367830.
Dinarelli, Marco. "Spoken Language Understanding: from Spoken Utterances to Semantic Structures." Doctoral thesis, University of Trento, 2010. http://eprints-phd.biblio.unitn.it/280/1/PhD-Thesis-Dinarelli.pdf.
Melander, Linda. "Language attitudes : Evaluational Reactions to Spoken Language." Thesis, Högskolan Dalarna, Engelska, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:du-2282.
Harwath, David F. (David Frank). "Learning spoken language through vision." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/118081.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 145-159).
Humans learn language at an early age by simply observing the world around them. Why can't computers do the same? Conventional automatic speech recognition systems have a long history and have recently made great strides thanks to the revival of deep neural networks. However, their reliance on highly supervised (and therefore expensive) training paradigms has restricted their application to the major languages of the world, accounting for a small fraction of the more than 7,000 human languages spoken worldwide. This thesis introduces datasets, models, and methodologies for grounding continuous speech signals at the raw waveform level to natural image scenes. The context and constraint provided by the visual information enables our models to efficiently learn linguistic units, such as words, along with their visual semantics. For example, our models are able to recognize instances of the spoken word "water" within spoken captions and associate them with image regions containing bodies of water. Further, we demonstrate that our models are capable of learning cross-lingual semantics by using the visual space as an interlingua to perform speech-to-speech retrieval between English and Hindi. In all cases, this learning is done without linguistic transcriptions or conventional speech recognition - yet we show that our methods achieve retrieval scores close to what is possible when transcriptions are available. This offers a promising new direction for speech processing that only requires speakers to provide narrations of what they see.
by David Frank Harwath.
Ph. D.
Lainio, Jarmo. "Spoken Finnish in urban Sweden." Uppsala : Centre for multiethnic research, 1989. http://catalogue.bnf.fr/ark:/12148/cb35513801d.
Kanda, Naoyuki. "Open-ended Spoken Language Technology: Studies on Spoken Dialogue Systems and Spoken Document Retrieval Systems." Kyoto University (京都大学), 2014. http://hdl.handle.net/2433/188874.
Intilisano, Antonio Rosario. "Spoken dialog systems: from automatic speech recognition to spoken language understanding." Doctoral thesis, Università di Catania, 2016. http://hdl.handle.net/10761/3920.
Zámečník, Jiří. "Disfluency prediction in natural spoken language." Supervised by Christian Mair and John A. Nerbonne. Freiburg: Universität, 2019. http://d-nb.info/1238517714/34.
Alhanai, Tuka (Tuka Waddah Talib Ali Al Hanai). "Detecting cognitive impairment from spoken language." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122724.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 141-165).
Dementia comes second only to spinal cord injuries in terms of its debilitating effects, from memory loss to physical disability. The standard approach to evaluating cognitive conditions is the neuropsychological exam, conducted via in-person interviews to measure memory, thinking, language, and motor skills. Work is ongoing to determine biomarkers of cognitive impairment, yet one modality that has been relatively less explored is speech. Speech has the advantage of being easy to record, and it carries the majority of the information transmitted during neuropsychological exams. To determine the viability of speech-based biomarkers, we utilize data from the Framingham Heart Study, which contains hour-long audio recordings of neuropsychological exams for over 5,000 individuals. The data is representative of a population and of the real-world prevalence of cognitive conditions (3-4%). We first explore modeling cognitive impairment from a relatively small set of 92 subjects with complete information on audio, transcripts, and speaker turns. We then loosen these constraints by modeling with only a fraction of the audio (~2-3 minutes), in which the speaker segments are defined through text-based diarization. We next apply this diarization method to extract audio features from all 7,000+ recordings (most of which have no transcripts) to model cognitive impairment (AUC 0.83, spec. 78%, sens. 79%). Finally, we eliminate the need for feature engineering by training a neural network to learn higher-order representations from filterbank features (AUC 0.85, spec. 81%, sens. 82%). Our speech models exhibit strong performance and are comparable to the baseline demographic model (AUC 0.85, spec. 93%, sens. 65%). Further analysis shows that our neural network model automatically learns to detect specific speech activity which clusters according to: pause followed by onset of speech, short bursts of speech, speech activity in high-frequency spectral energy bands, and silence.
by Tuka Alhanai.
Ph. D.
Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Nguyen, Tu Anh. "Spoken Language Modeling from Raw Audio." Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS089.
Speech has always been a dominant mode of social connection and communication. However, speech processing and modeling have been challenging due to its variability. Classic speech technologies rely on cascade modeling, i.e. transcribing speech to text with an Automatic Speech Recognition (ASR) system, processing the transcribed text using Natural Language Processing (NLP) methods, and converting the text back to speech with a Speech Synthesis model. This method eliminates speech variability but requires large textual datasets, which are not always available for all languages. In addition, it removes all the expressivity contained in the speech itself. Recent advancements in self-supervised speech learning (SpeechSSL) have enabled the learning of good discrete speech representations from raw audio, bridging the gap between speech and text technologies. This makes it possible to train language models on discrete representations (discrete units, or pseudo-text) obtained from the speech, and has given rise to a new domain called TextlessNLP, where the task is to learn the language directly from audio signals, bypassing the need for ASR systems. The resulting Spoken Language Models (Speech Language Models, or SpeechLMs) have been shown to work and offer new possibilities for speech processing compared to cascade systems. The objective of this thesis is thus to explore and improve this newly formed domain. We analyze why these discrete representations work, discover new applications of SpeechLMs to spoken dialogues, extend TextlessNLP to more expressive speech, and improve the performance of SpeechLMs to reduce the gap between SpeechLMs and TextLMs.
Goldie, Anna Darling. "CHATTER : a spoken language dialogue system for language learning applications." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/66420.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 110).
The goal of this thesis is to build a Computer Aided Language Learning game that simulates a casual conversation in Mandarin Chinese. In the envisioned system, users will chat with a computer on topics ranging from relationship status to favorite Chinese dish. I hope to provide learners with more opportunities to practice speaking and reading foreign languages. The system was designed with generality in mind. The framework allows developers to easily implement dialogue systems to allow students to practice communications in a variety of situations, such as in a street market, at a restaurant, or in a hospital. A user simulator was also implemented, which was useful for the code development, as a tutor for the student, and as an evaluation tool. All of the 18 topics were covered within the 20 sample dialogues, no two dialogues took the same path, questions and remarks were worded differently, and no two users had the same profile, resulting in high variety, coherence, and natural language quality.
by Anna Darling Goldie.
M.Eng.
Inagaki, Yasuyoshi, Katsuhiko Toyama, Shigeki Matsubara, and Yoshihide Kato. "SPOKEN LANGUAGE PARSING BASED ON INCREMENTAL DISAMBIGUATION." ISCA(International Speech Communication Association), 2000. http://hdl.handle.net/2237/15103.
Inagaki, Yasuyoshi, Katsuhiko Toyama, Nobuo Kawaguchi, Shigeki Matsubara, and Yasuyuki Aizawa. "Spoken Language Corpus for Machine Interpretation Research." ISCA (International Speech Communication Association), 2000. http://hdl.handle.net/2237/15104.
He, Y. "A statistical approach to spoken language understanding." Thesis, University of Cambridge, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.603917.
Molle, Jo. "Shallow semantic processing in spoken language comprehension." Thesis, University of Strathclyde, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.442012.
McGraw, Ian C. (Ian Carmichael). "Crowd-supervised training of spoken language systems." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/75641.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 155-166).
Spoken language systems are often deployed with static speech recognizers. Only rarely are parameters in the underlying language, lexical, or acoustic models updated on-the-fly. In the few instances where parameters are learned in an online fashion, developers traditionally resort to unsupervised training techniques, which are known to be inferior to their supervised counterparts. These realities make the development of spoken language interfaces a difficult and somewhat ad-hoc engineering task, since models for each new domain must be built from scratch or adapted from a previous domain. This thesis explores an alternative approach that makes use of human computation to provide crowd-supervised training for spoken language systems. We explore human-in-the-loop algorithms that leverage the collective intelligence of crowds of non-expert individuals to provide valuable training data at a very low cost for actively deployed spoken language systems. We also show that in some domains the crowd can be incentivized to provide training data for free, as a byproduct of interacting with the system itself. Through the automation of crowdsourcing tasks, we construct and demonstrate organic spoken language systems that grow and improve without the aid of an expert. Techniques that rely on collecting data remotely from non-expert users, however, are subject to the problem of noise. This noise can sometimes be heard in audio collected from poor microphones or muddled acoustic environments. Alternatively, noise can take the form of corrupt data from a worker trying to game the system - for example, a paid worker tasked with transcribing audio may leave transcripts blank in hopes of receiving a speedy payment. We develop strategies to mitigate the effects of noise in crowd-collected data and analyze their efficacy. 
This research spans a number of different application domains of widely-deployed spoken language interfaces, but maintains the common thread of improving the speech recognizer's underlying models with crowd-supervised training algorithms. We experiment with three central components of a speech recognizer: the language model, the lexicon, and the acoustic model. For each component, we demonstrate the utility of a crowd-supervised training framework. For the language model and lexicon, we explicitly show that this framework can be used hands-free, in two organic spoken language systems.
by Ian C. McGraw.
Ph.D.
Den, Yasuharu. "A Uniform Approach to Spoken Language Analysis." Kyoto University, 1996. http://hdl.handle.net/2433/154673.
Kyoto University (京都大学)
Doctor of Engineering (thesis doctorate), 1996. Examining committee: Professor Makoto Nagao (chair), Professor Shuji Doshita, Professor Toru Ishida.
Hall, Mica. "Russian as spoken by the Crimean Tatars." Thesis, Connect to this title online; UW restricted, 1997. http://hdl.handle.net/1773/7163.
Hoffiz, Benjamin Theodore III. "Morphology of United Arab Emirates Arabic, Dubai dialect." Diss., The University of Arizona, 1995. http://hdl.handle.net/10150/187179.
Mac Eoin, Gearóid. "What language was spoken in Ireland before Irish?" Universität Potsdam, 2007. http://opus.kobv.de/ubp/volltexte/2008/1923/.
Hillard, Dustin Lundring. "Automatic sentence structure annotation for spoken language processing." Thesis, Connect to this title online; UW restricted, 2008. http://hdl.handle.net/1773/6080.
Inagaki, Yasuyoshi, Nobuo Kawaguchi, Shigeki Matsubara, and Tomohiro Ohno. "SPIRAL CONSTRUCTION OF SYNTACTICALLY ANNOTATED SPOKEN LANGUAGE CORPUS." IEEE, 2003. http://hdl.handle.net/2237/15085.
Inagaki, Yasuyoshi, Nobuo Kawaguchi, Takahisa Murase, and Shigeki Matsubara. "Stochastic Dependency Parsing of Spontaneous Japanese Spoken Language." ACL (Association for Computational Linguistics), 2002. http://aclweb.org/anthology/.
Ohno, Tomohiro, Shigeki Matsubara, Nobuo Kawaguchi, and Yasuyoshi Inagaki. "Robust Dependency Parsing of Spontaneous Japanese Spoken Language." IEICE, 2005. http://hdl.handle.net/2237/7824.
Moss, Helen Elizabeth. "Access to word meanings during spoken language comprehension." Thesis, University of Cambridge, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.334148.
Kuo, Chen-Li. "Interpreting intonation in English-Chinese spoken language translation." Thesis, University of Manchester, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.492917.
Pon-Barry, Heather Roberta. "Inferring Speaker Affect in Spoken Natural Language Communication." Thesis, Harvard University, 2012. http://dissertations.umi.com/gsas.harvard:10710.
Engineering and Applied Sciences
Kirk, Steven J. "Second language spoken fluency in monologue and dialogue." Thesis, University of Nottingham, 2016. http://eprints.nottingham.ac.uk/38421/.
Lee, Vivienne C. (Vivienne Catherine). "LanguageLand : a multimodal conversational spoken language learning system." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/33143.
Includes bibliographical references (leaves 96-98).
LanguageLand is a multimodal conversational spoken language learning system whose purpose is to help native English users learn Mandarin Chinese. The system is centered on a game that involves navigation on a simulated map. It consists of an edit and play mode. In edit mode, users can set up the street map with whichever objects (such as a house, a church, a policeman) they see in the toolbar. This can be done through spoken conversation over the telephone or typed text along with mouse clicks. In play mode, users are given a start and end corner and the goal is to get from the start to the end on the map. While the system only responds actively to accurate Mandarin phrases, the user can speak or type in English to obtain Mandarin translations of those English words or phrases. The LanguageLand application is built using Java and Swing. The overall system is constructed using the Galaxy Communicator architecture and existing SLS technologies including Summit for speech recognition, Tina for NL understanding, Genesis for NL generation, and Envoice for speech synthesis.
by Vivienne C. Lee.
M.Eng.
Cowan, Brooke A. (Brooke Alissa) 1972. "PLUTO : a preprocessor for multilingual spoken language generation." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/30102.
Includes bibliographical references (p. 111-115).
Surface realization, a subtask of natural language generation, maps a meaning representation to a natural language string. This thesis presents an architecture for a surface realization component in a spoken dialogue system. The architecture divides the surface realization task into two parts: (1) modification of the meaning representation to adhere to the constraints of the target language, and (2) string production. Each subtask is handled by a separate module. PLUTO is a new module, responsible for meaning representation modification, that has been added to the Spoken Language Systems group's surface realization component. PLUTO acts as a preprocessor to the main processor, GENESIS, which is responsible for string production. We show how this new, decoupled architecture is amenable to a hybrid approach to machine translation that combines transfer and interlingua. We also present a policy for generation that specifies the roles of PLUTO, GENESIS, and the lexicon they share. This policy formalizes a way of writing robust, reusable grammars. The primary contribution of this work is to simplify the development of such grammars in multilingual speech-based applications.
by Brooke A. Cowan.
S.M.
Korpusik, Mandy B. "Spoken language understanding in a nutrition dialogue system." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/99860.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 105-111).
Existing approaches for the prevention and treatment of obesity are hampered by the lack of accurate, low-burden methods for self-assessment of food intake, especially for hard-to-reach, low-literate populations. For this reason, we propose a novel approach to diet tracking that utilizes speech understanding and dialogue technology in order to enable efficient self-assessment of energy and nutrient consumption. We are interested in studying whether speech can lower user workload compared to existing self-assessment methods, whether spoken language descriptions of meals can accurately quantify caloric and nutrient absorption, and whether dialogue can efficiently and effectively be used to ascertain and clarify food properties, perhaps in conjunction with other modalities. In this thesis, we explore the core innovation of our nutrition system: the language understanding component which relies on machine learning methods to automatically detect food concepts in a user's spoken meal description. In particular, we investigate the performance of conditional random field (CRF) models for semantic labeling and segmentation of spoken meal descriptions. On a corpus of 10,000 meal descriptions, we achieve an average F1 test score of 90.7 for semantic tagging and 86.3 for associating foods with properties. In a study of users interacting with an initial prototype of the system, semantic tagging achieved an accuracy of 83%, which was sufficiently high to satisfy users.
by Mandy B. Korpusik.
S.M.
Lau, Tien-Lok Jonathan 1980. "SLLS : an online conversational spoken language learning system." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/29684.
Includes bibliographical references (leaves 75-77).
The Spoken Language Learning System (SLLS) is intended to be an engaging, educational, and extensible spoken language learning system showcasing the multilingual capabilities of the Spoken Language Systems Group's (SLS) systems. The motivation behind SLLS is to satisfy both the demand for spoken language learning in an increasingly multi-cultural society and the desire for continued development of the multilingual systems at SLS. SLLS is an integration of an Internet presence with augmentations to SLS's Mandarin systems built within the Galaxy architecture, focusing on the situation of an English speaker learning Mandarin. We offer language learners the ability to listen to spoken phrases and simulated conversations online, engage in interactive dynamic conversations over the telephone, and review audio and visual feedback of their conversations. We also provide a wide array of administration and maintenance features online for teachers and administrators to facilitate continued system development and user interaction, such as lesson plan creation, vocabulary management, and a requests forum. User studies have shown that there is an appreciation for the potential of the system and that the core operation is intuitive and entertaining. The studies have also helped to illuminate the vast array of future work necessary to further polish the language learning experience and reduce the administrative burden. The focus of this thesis is the creation of the first iteration of SLLS; we believe we have taken the first step down the long but hopeful path towards helping people speak a foreign language.
by Tien-Lok Jonathan Lau.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science
Mrkšić, Nikola. "Data-driven language understanding for spoken dialogue systems." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/276689.
Coria, Juan Manuel. "Continual Representation Learning in Written and Spoken Language." Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG025.
Although machine learning has recently witnessed major breakthroughs, today's models are mostly trained once on a target task and then deployed, rarely (if ever) revisiting their parameters. This problem affects performance after deployment, as task specifications and data may evolve with user needs and distribution shifts. To solve this, continual learning proposes to train models over time as new data becomes available. However, models trained in this way suffer from significant performance loss on previously seen examples, a phenomenon called catastrophic forgetting. Although many studies have proposed different strategies to prevent forgetting, they often rely on labeled data, which is rarely available in practice. In this thesis, we study continual learning for written and spoken language. Our main goal is to design autonomous and self-learning systems able to leverage scarce on-the-job data to adapt to the new environments they are deployed in. Contrary to recent work on learning general-purpose representations (or embeddings), we propose to leverage representations that are tailored to a downstream task. We believe the latter may be easier to interpret and exploit by unsupervised training algorithms like clustering, which are less prone to forgetting. Throughout our work, we improve our understanding of continual learning in a variety of settings, such as the adaptation of a language model to new languages for sequence labeling tasks, or the adaptation to a live conversation in the context of speaker diarization. We show that task-specific representations allow for effective low-resource continual learning, and that a model's own predictions can be exploited for full self-learning.
Tan, Chengzhu (譚成珠). "Sentence structure in spoken modern standard Chinese." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1995. http://hub.hku.hk/bib/B31213649.
Toivanen, Juhani H. "Perspectives on intonation: English, Finnish, and English spoken by Finns." Frankfurt am Main; New York: Peter Lang, 2001. http://catalog.hathitrust.org/api/volumes/oclc/47142055.html.
Jensen, Marie-Thérèse, 1949. "Corrective feedback to spoken errors in adult ESL classrooms." Monash University, Faculty of Education, 2001. http://arrow.monash.edu.au/hdl/1959.1/8620.
Cheepen, C. "The interactive basis of spoken dialogue." Thesis, University of Hertfordshire, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.376103.
Christensen, Matthew B. "Variation in spoken and written Mandarin narrative discourse." The Ohio State University, 1994. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487859313344186.
Masud, Rabia. "Language spoken around the world: lessons from Le Corbusier." Thesis, Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/33952.
Inagaki, Yasuyoshi, Shigeki Matsubara, Atsushi Mizuno, and Koichiro Ryu. "Incremental Japanese Spoken Language Generation in Simultaneous Machine Interpretation." IEICE, 2004. http://hdl.handle.net/2237/15091.
Yao, Huan, 1976. "Utterance verification in large vocabulary spoken language understanding system." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/47633.
Reimers, Stian John. "Representations of phonology in spoken language comprehension and production." Thesis, University of Cambridge, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.620381.
Zhuang, Jie. "Lexical, semantic, and syntactic processes in spoken language comprehension." Thesis, University of Cambridge, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.608554.
Hatchard, Rachel. "A construction-based approach to spoken language in aphasia." Thesis, University of Sheffield, 2015. http://etheses.whiterose.ac.uk/10385/.
Hoshino, Takane Noda Mari. "An analysis of Hosii in modern spoken Japanese." Connect to this title online, 1991. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1116617297.
Thomson, Blaise Roger Marie. "Statistical methods for spoken dialogue management." Thesis, University of Cambridge, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.609054.
Pijitra, Dissawarotham David Thomas. "The phonology of Plang as spoken in Banhuaynamkhum Chiengrai province." Abstract, 1986. http://mulinet3.li.mahidol.ac.th/thesis/2529/29E-Pijitra-D.pdf.