Dissertations on the topic "Cloud speech recognition adaptation"
Browse the top 50 dissertations for research on the topic "Cloud speech recognition adaptation".
Chan, Carlos Chun Ming. "Speaker model adaptation in automatic speech recognition." Thesis, Robert Gordon University, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.339307.
Ho, Ka-Lung. "Kernel eigenvoice speaker adaptation /." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?COMP%202003%20HOK.
Includes bibliographical references (leaves 56-61). Also available in electronic version. Access restricted to campus users.
Humphries, J. J. "Accent modelling and adaptation in automatic speech recognition." Thesis, University of Cambridge, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.604784.
Cox, S. J. "Techniques for rapid speaker adaptation in speech recognition." Thesis, University of East Anglia, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.267271.
Nieuwoudt, Christoph. "Cross-language acoustic adaptation for automatic speech recognition." Thesis, Pretoria : [s.n.], 2000. http://upetd.up.ac.za/thesis/available/etd-01062005-071829.
Hewett, Andrew John. "Training and speaker adaptation in template-based speech recognition." Thesis, University of Cambridge, 1989. https://www.repository.cam.ac.uk/handle/1810/250961.
Clarkson, P. R. "Adaptation of statistical language models for automatic speech recognition." Thesis, University of Cambridge, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.597745.
Ge, Zhenhao. "Mispronunciation detection for language learning and speech recognition adaptation." Thesis, Purdue University, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=3613127.
The areas of "mispronunciation detection" (or, more specifically, "accent detection") within the speech recognition community are receiving increased attention. Two application areas, namely language learning and speech recognition adaptation, largely drive this research interest and are the focal points of this work.
A number of Computer Aided Language Learning (CALL) systems with Computer Aided Pronunciation Training (CAPT) techniques have been developed. In this thesis, a new HMM-based text-dependent mispronunciation system is introduced using text Adaptive Frequency Cepstral Coefficients (AFCCs). It is shown that this system outperforms the conventional HMM method based on Mel Frequency Cepstral Coefficients (MFCCs). In addition, a mispronunciation detection and classification algorithm based on Principal Component Analysis (PCA) is introduced to help language learners identify and correct their pronunciation errors at the word and syllable levels.
To improve speech recognition by adaptation, two projects have been explored. The first improves name recognition by learning acceptable variations in name pronunciations, as one approach to making grammar-based name recognition adaptive. The second project is accent detection by examining the shifting of fundamental vowels in accented speech. This approach uses both acoustic and phonetic information to detect accents and is shown to be beneficial with accented English. These applications can be integrated into an automated international calling system to improve recognition of callers' names and speech. The system determines the caller's accent from a short period of speech. Once the accent type is detected, it switches from the standard speech recognition engine to an accent-adaptive one for better recognition results.
Uebel, Luís Felipe. "Speaker normalisation and adaptation in large vocabulary speech recognition." Thesis, University of Cambridge, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.616207.
McInnes, Fergus Robert. "Adaptation of reference patterns in word-based speech recognition." Thesis, University of Edinburgh, 1988. http://hdl.handle.net/1842/12618.
Wang, Chien-Jen. "Joint-space adaptation technique for robust continuous speech recognition /." Thesis, Connect to this title online; UW restricted, 1997. http://hdl.handle.net/1773/5860.
Wu, Jian, and 武健. "Discriminative speaker adaptation and environmental robustness in automatic speech recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B31246138.
Vipperla, Ravichander. "Automatic Speech Recognition for ageing voices." Thesis, University of Edinburgh, 2011. http://hdl.handle.net/1842/5725.
Hsiao, Roger Wend Huu. "Kernel eigenspace-based MLLR adaptation /." View abstract or full-text, 2004. http://library.ust.hk/cgi/db/thesis.pl?COMP%202004%20HSIAO.
Lu, Jianhua. "Missing feature decoding and model adaptation for noisy speech recognition." Thesis, Queen's University Belfast, 2010. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.527938.
Shou-Chun, Yin 1980. "Speaker adaptation in joint factor analysis based text independent speaker verification." Thesis, McGill University, 2006. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=100735.
Mandal, Arindam. "Transformation sharing strategies for MLLR speaker adaptation /." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/6087.
Stokes-Rees, Ian James. "A Study of the Automatic Speech Recognition Process and Speaker Adaptation." Thesis, University of Waterloo, 2000. http://hdl.handle.net/10012/840.
Stokes-Rees, Ian. "A study of the automatic speech recognition process and speaker adaptation." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0018/MQ56683.pdf.
Nolazco, Flores Juan Arturo. "Spectral subtraction and model adaptation for robust speech recognition in noise." Thesis, University of Cambridge, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.318436.
He, Xiaodong. "Model selection based speaker adaptation and its application to nonnative speech recognition /." free to MU campus, to others for purchase, 2003. http://wwwlib.umi.com/cr/mo/fullcit?p3115555.
Haque, Serajul. "Perceptual features for speech recognition." University of Western Australia. School of Electrical, Electronic and Computer Engineering, 2008. http://theses.library.uwa.edu.au/adt-WU2008.0187.
Ahadi-Sarkani, Seyed Mohammad. "Bayesian and predictive techniques for speaker adaptation." Thesis, University of Cambridge, 1996. https://www.repository.cam.ac.uk/handle/1810/273100.
Gabriel, Naveen. "Automatic Speech Recognition in Somali." Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166216.
Ma, Bin, and 馬斌. "A study on acoustic modeling and adaptation in HMM-based speech recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2000. http://hub.hku.hk/bib/B31242145.
Ma, Bin. "A study on acoustic modeling and adaptation in HMM-based speech recognition." Hong Kong : University of Hong Kong, 2000. http://sunzi.lib.hku.hk/hkuto/record.jsp?B22142423.
Swietojanski, Paweł. "Learning representations for speech recognition using artificial neural networks." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/22835.
Li, Wei. "A study of an active approach to speaker and task adaptation based on automatic analysis of vocabulary confusability." Click to view the E-thesis via HKUTO, 2007. http://sunzi.lib.hku.hk/hkuto/record/B39634073.
Li, Wei, and 李威. "A study of an active approach to speaker and task adaptation based on automatic analysis of vocabulary confusability." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2007. http://hub.hku.hk/bib/B39634073.
Fraga, Da Silva Thiago. "Reducing development costs of large vocabulary speech recognition systems." Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112232/document.
One of the outstanding challenges in large vocabulary automatic speech recognition (ASR) is the reduction of development costs required to build a new recognition system or adapt an existing one to a new task, language or dialect. State-of-the-art ASR systems are based on the principles of the statistical learning paradigm, using information provided by two stochastic models, an acoustic model (AM) and a language model (LM). The standard methods used to estimate the parameters of such models are founded on two main assumptions: the training data sets are large enough, and the training data match the target task well. It is well known that a large part of system development costs is due to the construction of corpora that fulfill these requirements. In particular, manually transcribing the audio data is the most expensive and time-consuming endeavor. For some applications, such as the recognition of low-resourced languages or dialects, finding and collecting data is also a hard (and expensive) task. As a means to lower the cost required for ASR system development, this thesis proposes and studies methods that aim to alleviate the need for manually transcribing audio data for a given target task. Two axes of research are explored. First, unsupervised training methods are explored in order to build three of the main components of ASR systems: the acoustic model, the multi-layer perceptron (MLP) used to extract acoustic features, and the language model. The unsupervised training methods aim to estimate the model parameters using a large amount of automatically (and inaccurately) transcribed audio data, obtained with an existing recognition system. A novel method for unsupervised AM training that copes well with the automatic audio transcripts is proposed: the use of multiple recognition hypotheses (rather than the best one) leads to consistent gains in performance over the standard approach.
Unsupervised MLP training is proposed as an alternative way to build efficient acoustic models in a fully unsupervised manner. Compared to cross-lingual MLPs trained in a supervised manner, the unsupervised MLP leads to competitive performance levels even when trained on only about half the amount of data. Unsupervised LM training approaches are proposed to estimate standard back-off n-gram and neural network language models. It is shown that unsupervised LM training leads to additive gains in performance on top of unsupervised AM training. Second, this thesis proposes the use of model interpolation as a rapid and flexible way to build task-specific acoustic models. In reported experiments, models obtained via interpolation outperform the baseline pooled models and equivalent maximum a posteriori (MAP) adapted models. Interpolation proves to be especially useful for low-resourced dialect ASR. When only a few (2 to 3) hours of acoustic data truly matching the target dialect, or none at all, are available for AM training, model interpolation leads to substantial performance gains compared to the standard training methods.
Amdal, Ingunn. "Learning pronunciation variation : A data-driven approach to rule-based lexicon adaptation for automatic speech recognition." Doctoral thesis, Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, 2002. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-1560.
To achieve a robust system, the variation seen for different speaking styles must be handled. An investigation of standard automatic speech recognition techniques for different speaking styles showed that lexical modelling using general-purpose variants gave small improvements, but the errors differed from those made when using only one canonical pronunciation per word. Modelling the variation using the acoustic models (using context dependency and/or speaker-dependent adaptation) gave a significant improvement, but the resulting performance for non-native and spontaneous speech was still far from that for read speech.
In this dissertation a complete data-driven approach to rule-based lexicon adaptation is presented, where the effect of the acoustic models is incorporated in the rule pruning metric. Reference and alternative transcriptions were aligned by dynamic programming, with a data-driven method used to derive the phone-to-phone substitution costs. The costs were based on the statistical co-occurrence of phones (association strength). Rules for pronunciation variation were derived from this alignment. The rules were pruned using a new metric based on acoustic log likelihood. Well-trained acoustic models are capable of modelling much of the variation seen, and using the acoustic log likelihood to assess the pronunciation rules prevents the lexical modelling from adding variation already accounted for, as shown for direct pronunciation variation modelling.
For the non-native task, data-driven pronunciation modelling by learning pronunciation rules gave a significant performance gain. Acoustic log likelihood rule pruning performed better than rule probability pruning.
For spontaneous dictation, the pronunciation variation experiments did not improve the performance. The answer to how to better model the variation for spontaneous speech seems to lie in neither the acoustic nor the lexical modelling. The main differences between read and spontaneous speech are the grammar used and disfluencies like restarts and long pauses. The language model may thus be the best starting point for more research to achieve better performance for this speaking style.
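The dynamic-programming alignment of reference and alternative transcriptions mentioned in the abstract above can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `align_phones`, the `sub_cost` table, the fixed insertion/deletion cost, and the example phone symbols are all assumptions made for illustration. In the thesis the substitution costs are derived from phone co-occurrence statistics, whereas here they are simply passed in.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]


def align_phones(
    ref: List[str],
    alt: List[str],
    sub_cost: Dict[Tuple[str, str], float],
    indel_cost: float = 1.0,
) -> Tuple[float, List[Pair]]:
    """Align two phone sequences; None in a pair marks a gap.

    sub_cost is a hypothetical table of phone-to-phone substitution
    costs; pairs missing from it default to a cost of 1.0.
    """
    n, m = len(ref), len(alt)

    def sub(a: str, b: str) -> float:
        return 0.0 if a == b else sub_cost.get((a, b), 1.0)

    # cost[i][j] = minimal cost of aligning ref[:i] with alt[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + indel_cost
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + indel_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub(ref[i - 1], alt[j - 1]),  # substitution or match
                cost[i - 1][j] + indel_cost,  # phone deleted in alt
                cost[i][j - 1] + indel_cost,  # phone inserted in alt
            )

    # Trace back from the end to recover the aligned phone pairs.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + sub(ref[i - 1], alt[j - 1]):
            pairs.append((ref[i - 1], alt[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or cost[i][j] == cost[i - 1][j] + indel_cost):
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, alt[j - 1]))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs
```

Pairs in which the two phones differ, or in which one side is a gap, are the raw material from which pronunciation rules could then be collected.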
Kleynhans, Neil Taylor. "Automatic speech recognition for resource-scarce environments / N.T. Kleynhans." Thesis, North-West University, 2013. http://hdl.handle.net/10394/9668.
Thesis (PhD (Computer and Electronic Engineering))--North-West University, Potchefstroom Campus, 2013.
Sooful, Jayren Jugpal. "Automated phoneme mapping for cross-language speech recognition." Diss., Pretoria [s.n.], 2004. http://upetd.up.ac.za/thesis/available/etd-01112005-131128.
Fanner, Robert M. "Analysis and implementation of the speaker adaptation techniques : MAP, MLLR, and MLED." Thesis, Stellenbosch : Stellenbosch University, 2002. http://hdl.handle.net/10019.1/52653.
ENGLISH ABSTRACT: The topic of this thesis is speaker adaptation, whereby speaker-independent speech models are adapted to more closely match individual speakers by utilising a small amount of data from the targeted individual. Speaker adaptation methods - specifically, the MAP, MLLR and MLED speaker adaptation methods - are critically evaluated and compared. Two novel extensions of the MLED adaptation method are introduced, derived and evaluated. The first incorporates the explicit modelling of the mean speaker model in the speaker-space into the MLED framework. The second extends MLED to use basis vectors modelling inter-class variance for classes of speech models, instead of basis vectors modelling inter-speaker variance. An evaluation of the effect of two different types of feature vector - PLP-cepstra and LPCCs - on the performance of speaker adaptation is made, to determine which feature vector is optimal for speaker-independent systems and the adaptation thereof.
AFRIKAANS SUMMARY (translated): The topic of this thesis is speaker adaptation, that is, changing a speaker-independent speech model so that it is closer to a speaker-dependent model for an individual, given a small amount of speech data from that individual. The following speaker adaptation methods are evaluated: MAP, MLLR and MLED. Two new extensions of the MLED method are described, derived and evaluated. The first incorporates the explicit modelling of the mean speaker model of the speaker space into the MLED method. The second extension uses basis vectors for MLED that are derived from the inter-class variance between a set of speaker classes instead of the inter-speaker variance. The effect of two types of feature vector - PLP-cepstra and LPCCs - on the performance of speaker adaptation methods is investigated, so that the optimal type of feature vector for speaker-independent models and their adaptation can be found.
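Of the three methods named in the abstract above, MAP adaptation of a Gaussian mean has a compact textbook form: the adapted mean is a weighted average of the speaker-independent prior mean and the adaptation frames assigned to that Gaussian. The sketch below shows that standard update only. It is not taken from the thesis, and the function name `map_adapt_mean`, the prior weight `tau`, and its default value are assumptions for illustration.

```python
from typing import List, Sequence


def map_adapt_mean(
    prior_mean: Sequence[float],
    frames: Sequence[Sequence[float]],
    posteriors: Sequence[float],
    tau: float = 10.0,
) -> List[float]:
    """Standard MAP update of one Gaussian mean.

    prior_mean : speaker-independent mean of the Gaussian
    frames     : adaptation feature vectors
    posteriors : occupation probability of this Gaussian for each frame
    tau        : hypothetical prior weight; larger values trust the prior more
    """
    occupancy = sum(posteriors)
    dim = len(prior_mean)
    # Posterior-weighted sum of the adaptation frames, per dimension.
    weighted = [
        sum(g * x[d] for g, x in zip(posteriors, frames)) for d in range(dim)
    ]
    return [
        (tau * prior_mean[d] + weighted[d]) / (tau + occupancy)
        for d in range(dim)
    ]
```

With no adaptation data the update returns the prior mean unchanged, and as the occupancy grows the result moves toward the sample mean of the speaker's own frames.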
Shin, Sung-Hwan. "Objective-driven discriminative training and adaptation based on an MCE criterion for speech recognition and detection." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50255.
Gangireddy, Siva Reddy. "Recurrent neural network language models for automatic speech recognition." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/28990.
Pinet, M. "Accent effects on the recognition of speech in noise : second-language proficiency, accent similarity and adaptation." Thesis, University College London (University of London), 2012. http://discovery.ucl.ac.uk/1370645/.
Martirosian, Olga Meruzhanovna. "Adapting a pronunciation dictionary to Standard South African English for automatic speech recognition / Olga Meruzhanovna Martirosian." Thesis, North-West University, 2009. http://hdl.handle.net/10394/4902.
Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
Tomashenko, Natalia. "Speaker adaptation of deep neural network acoustic models using Gaussian mixture model framework in automatic speech recognition systems." Thesis, Le Mans, 2017. http://www.theses.fr/2017LEMA1040/document.
Differences between training and testing conditions may significantly degrade recognition accuracy in automatic speech recognition (ASR) systems. Adaptation is an efficient way to reduce the mismatch between models and data from a particular speaker or channel. There are two dominant types of acoustic models (AMs) used in ASR: Gaussian mixture models (GMMs) and deep neural networks (DNNs). The GMM hidden Markov model (GMM-HMM) approach has been one of the most common techniques in ASR systems for many decades. Speaker adaptation is very effective for these AMs, and various adaptation techniques have been developed for them. On the other hand, DNN-HMM AMs have recently achieved big advances and outperformed GMM-HMM models for various ASR tasks. However, speaker adaptation is still very challenging for these AMs. Many adaptation algorithms that work well for GMM systems cannot be easily applied to DNNs because of the different nature of these models. The main purpose of this thesis is to develop a method for efficient transfer of adaptation algorithms from the GMM framework to DNN models. A novel approach for speaker adaptation of DNN AMs is proposed and investigated. The idea of this approach is based on using so-called GMM-derived features as input to a DNN. The proposed technique provides a general framework for transferring adaptation algorithms, developed for GMMs, to DNN adaptation. It is explored for various state-of-the-art ASR systems and is shown to be effective in comparison with other speaker adaptation techniques and complementary to them.
Ravindran, Sourabh. "Physiologically Motivated Methods For Audio Pattern Classification." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14066.
ITAKURA, Fumitada, Kazuya TAKEDA, Katsunobu ITOU, and Weifeng LI. "Single-Channel Multiple Regression for In-Car Speech Enhancement." Institute of Electronics, Information and Communication Engineers, 2006. http://hdl.handle.net/2237/15051.
Borges, Liselene de Abreu. "Sistemas de adaptação ao locutor utilizando autovozes." Universidade de São Paulo, 2001. http://www.teses.usp.br/teses/disponiveis/3/3142/tde-05052003-104044/.
This work describes two speaker adaptation techniques that use a small amount of adaptation data for a speech recognition system: Maximum Likelihood Linear Regression (MLLR) and Eigenvoices. Both re-estimate the means of a continuous-density Hidden Markov Model system. The MLLR technique estimates a set of linear transformations for the mean parameters of a Gaussian system. The eigenvoice technique is based on prior knowledge about speaker variation; to obtain this prior knowledge, which is retained in the eigenvoices, it is necessary to apply principal component analysis (PCA). Adaptation tests were run on an isolated-word recognition system with a restricted vocabulary. When a large amount of adaptation data is available (up to 70% of the vocabulary), the Eigenvoices technique does not appear to be a good choice compared with the MLLR technique. When only a small amount of adaptation data is available (less than 15% of the vocabulary), the Eigenvoices technique gives better results than the MLLR technique.
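For orientation, the two mean re-estimation schemes contrasted in the abstract above are usually written as follows. This is the common textbook notation, not necessarily the notation of the thesis, and the symbols ($A$, $b$, $W$, $\xi_m$, $e_k$, $w_k$, $K$) are introduced here only for illustration.

```latex
% MLLR: an affine transform shared by a class of Gaussians,
% with extended mean vector \xi_m = [1, \mu_m^\top]^\top
\hat{\mu}_m = A\,\mu_m + b = W\,\xi_m

% Eigenvoices: the speaker's mean supervector is constrained to the
% space spanned by K PCA-derived eigenvoices e_1, \dots, e_K around
% the mean voice e_0, so only the K weights w_k are estimated
\hat{\mu} = e_0 + \sum_{k=1}^{K} w_k\, e_k
```

The eigenvoice form has only $K$ free parameters per speaker, which is consistent with the abstract's finding that it does better than MLLR when very little adaptation data is available.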
Paliesek, Jakub. "Vliv akustiky prostředí na úspěšnost rozpoznávače řeči." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2021. http://www.nusl.cz/ntk/nusl-445549.
Kalantari, Shahram. "Improving spoken term detection using complementary information." Thesis, Queensland University of Technology, 2015. https://eprints.qut.edu.au/90074/1/Shahram_Kalantari_Thesis.pdf.
Švec, Ján. "Adaptace rozpoznávače řeči na datech bez přepisu." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2015. http://www.nusl.cz/ntk/nusl-234944.
Shukla, Saurabh. "Development of a Human-AI Teaming Based Mobile Language Learning Solution for Dual Language Learners in Early and Special Educations." Wright State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=wright1547127943126526.
Lelong, Amélie. "Convergence phonétique en interaction Phonetic convergence in interaction." Thesis, Grenoble, 2012. http://www.theses.fr/2012GRENT079/document.
The work presented in this manuscript studies a phenomenon called phonetic convergence, which postulates that two people in interaction tend to adapt the way they talk to their partner for communicative purposes. We developed a paradigm called "Verbal Dominoes" to collect a large corpus to characterize this phenomenon, the ultimate goal being to give a conversational agent this adaptability in order to improve the quality of human-machine interactions. We conducted several studies to investigate the phenomenon between pairs of strangers, good friends, and members of the same family. We expected the amplitude of convergence to be proportional to the social distance between the two speakers, and we found this result. We then studied the impact of knowledge of the linguistic target on adaptation. To characterize phonetic convergence, we developed two methods: the first is based on a linear discriminant analysis between the MFCC coefficients of each speaker, and the second uses speech recognition techniques. The latter method will allow us to study the phenomenon in less controlled conditions. Finally, we characterized phonetic convergence with a subjective measurement using a new perceptual test called speaker switching. The test was performed using signals coming from real interactions but also with synthetic data obtained with the harmonic plus
Sam, Sethserey. "Vers une adaptation autonome des modèles acoustiques multilingues pour le traitement automatique de la parole." Phd thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00685204.
Hromádko, Michal. "Jednoduchý diktovací systém." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2011. http://www.nusl.cz/ntk/nusl-237008.
Tran, David, and Jonathan Böcker. "Virtual office assistant on Magic Mirror." Thesis, Malmö högskola, Fakulteten för teknik och samhälle (TS), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20231.