Theses on the topic "Speech"
Create an accurate citation in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 theses for your research on the topic "Speech".
Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Vancouver, Chicago, etc.
You can also download the full text of the scholarly publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse theses on a wide variety of disciplines and organize your bibliography correctly.
Sun, Felix (Felix W.). "Speech Representation Models for Speech Synthesis and Multimodal Speech Recognition". Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106378.
Full text
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 59-63).
The field of speech recognition has seen steady advances over the last two decades, leading to the accurate, real-time recognition systems available on mobile phones today. In this thesis, I apply speech modeling techniques developed for recognition to two other speech problems: speech synthesis and multimodal speech recognition with images. In both problems, there is a need to learn a relationship between speech sounds and another source of information. For speech synthesis, I show that using a neural network acoustic model results in a synthesizer that is more tolerant of noisy training data than previous work. For multimodal recognition, I show how information from images can be effectively integrated into the recognition search framework, resulting in improved accuracy when image data is available.
by Felix Sun.
M. Eng.
Alcaraz Meseguer, Noelia. "Speech Analysis for Automatic Speech Recognition". Thesis, Norwegian University of Science and Technology, Department of Electronics and Telecommunications, 2009. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9092.
Full text
The classical front-end analysis in speech recognition is a spectral analysis which parametrizes the speech signal into feature vectors; the most popular set of these is the Mel Frequency Cepstral Coefficients (MFCC). They are based on a standard power spectrum estimate which is first subjected to a log-based transform of the frequency axis (mel-frequency scale), and then decorrelated by using a modified discrete cosine transform. Following a focused introduction on speech production, perception and analysis, this paper gives a study of the implementation of a speech generative model, whereby the speech is synthesized and recovered back from its MFCC representations. The work has been developed in two steps: first, the computation of the MFCC vectors from the source speech files by using the HTK software; and second, the implementation of the generative model itself, which represents the conversion chain from HTK-generated MFCC vectors to speech reconstruction. In order to assess the quality of the speech coding into feature vectors and to evaluate the generative model, the spectral distance between the original speech signal and the one produced from the MFCC vectors has been computed. For that, spectral models based on Linear Prediction Coding (LPC) analysis have been used. During the implementation of the generative model, some results have been obtained in terms of the reconstruction of the spectral representation and the quality of the synthesized speech.
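The MFCC front end described in this abstract (windowed power spectrum, mel-scale triangular filterbank, log compression, then a discrete cosine transform for decorrelation) can be sketched in plain NumPy. This is an illustrative reimplementation, not the HTK code the thesis used, and the parameter values (16 kHz sampling, 26 filters, 13 coefficients) are conventional defaults rather than the thesis's settings:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centers spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=26, n_ceps=13):
    fb = mel_filterbank(sr, n_fft, n_mels)
    # DCT-II basis: decorrelates the log filterbank energies
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    window = np.hamming(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    ceps = np.empty((n_frames, n_ceps))
    for t in range(n_frames):
        frame = signal[t * hop:t * hop + n_fft] * window
        power = np.abs(np.fft.rfft(frame)) ** 2 / n_fft
        logmel = np.log(fb @ power + 1e-10)  # log mel-filterbank energies
        ceps[t] = dct @ logmel
    return ceps
```

A 16 kHz signal framed with a 512-sample window and 160-sample hop yields one 13-dimensional cepstral vector per 10 ms frame, the representation the thesis inverts back to speech.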
Kleinschmidt, Tristan Friedrich. "Robust speech recognition using speech enhancement". Thesis, Queensland University of Technology, 2010. https://eprints.qut.edu.au/31895/1/Tristan_Kleinschmidt_Thesis.pdf.
Full text
Blank, Sarah Catrin. "Speech comprehension, speech production and recovery of propositional speech following aphasic stroke". Thesis, Imperial College London, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.407772.
Full text
Price, Moneca C. "Interactions between speech coders and disordered speech". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ28640.pdf.
Full text
Chong, Fong Loong. "Objective speech quality measurement for Chinese speech". Thesis, University of Canterbury. Computer Science and Software Engineering, 2005. http://hdl.handle.net/10092/9607.
Full text
Stedmon, Alexander Winstan. "Putting speech in, taking speech out : human factors in the use of speech interfaces". Thesis, University of Nottingham, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.420342.
Full text
Miyajima, C., D. Negi, Y. Ninomiya, M. Sano, K. Mori, K. Itou, K. Takeda and Y. Suenaga. "Audio-Visual Speech Database for Bimodal Speech Recognition". INTELLIGENT MEDIA INTEGRATION NAGOYA UNIVERSITY / COE, 2005. http://hdl.handle.net/2237/10460.
Full text
Tang, Lihong. "Nonsensical speech : speech acts in postsocialist Chinese culture /". Thesis, Connect to this title online; UW restricted, 2008. http://hdl.handle.net/1773/6662.
Full text
Itakura, Fumitada, Tetsuya Shinde, Kiyoshi Tatara, Taisuke Ito, Ikuya Yokoo, Shigeki Matsubara, Kazuya Takeda and Nobuo Kawaguchi. "CIAIR speech corpus for real world speech recognition". The oriental chapter of COCOSDA (The International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques), 2002. http://hdl.handle.net/2237/15462.
Full text
Wang, Peidong. "Robust Automatic Speech Recognition By Integrating Speech Separation". The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1619099401042668.
Full text
Limbu, Sireesh Haang. "Direct Speech to Speech Translation Using Machine Learning". Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-439141.
Full text
Hu, Ke. "Speech Segregation in Background Noise and Competing Speech". The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1339018952.
Full text
Al-Otaibi, Abdulhadi S. "Arabic speech processing : syllabic segmentation and speech recognition". Thesis, Aston University, 1988. http://publications.aston.ac.uk/8064/.
Full text
Smith, Peter Wilfred Hesling. "Speech act theory, discourse structure and indirect speech". Thesis, University of Leeds, 1991. http://etheses.whiterose.ac.uk/734/.
Full text
Tran, Viet Anh. "Silent communication : whispered speech-to-clear speech conversion". Grenoble INPG, 2010. http://www.theses.fr/2010INPG0006.
Full text
In recent years, advances in wireless communication technology have led to the widespread use of cellular phones. Because of noisy environmental conditions and competing surrounding conversations, users tend to speak loudly. As a consequence, private policies and public legislation tend to restrain the use of cellular phones in public places. Silent speech, which can only be heard by a limited set of listeners close to the speaker, is an attractive solution to this problem if it can effectively be used for quiet and private communication. The motivation of this research thesis was to investigate ways of improving the naturalness and the intelligibility of synthetic speech obtained from the conversion of silent or whispered speech. A Non-Audible Murmur (NAM) condenser microphone, together with signal-based Gaussian Mixture Model (GMM) mapping, was chosen because promising results had already been obtained with this sensor and this approach, and because the size of the NAM sensor is well adapted to mobile communication technology. Several improvements to the speech conversion obtained with this sensor were considered. A first set of improvements concerns characteristics of the voiced source. One of the features missing in whispered or silent speech with respect to loud or modal speech is F0, which is crucial in conveying linguistic (question vs. statement, syntactic grouping, etc.) as well as paralinguistic (attitudes, emotions) information. The proposed estimation of voicing and F0 for converted speech by separate predictors improves both predictions. The naturalness of the converted speech was then further improved by extending the context window of the input features from phoneme size to syllable size and by using Linear Discriminant Analysis (LDA) instead of Principal Component Analysis (PCA) for the dimension reduction of the input feature vector.
The positive influence of this new approach on the quality of the output converted speech was confirmed by perceptual tests. Another approach investigated in this thesis consisted in integrating visual information as a complement to the acoustic information in both input and output data. Lip movements, which significantly contribute to the intelligibility of visual speech in face-to-face human interaction, were explored by using an accurate lip motion capture system based on the 3D positions of coloured beads glued on the speaker's face. The visual parameters are represented by five components related to the rotation of the jaw, to lip rounding, to upper and lower lip vertical movements, and to movements of the throat, which are associated with the underlying movements of the larynx and hyoid bone. Including these visual features in the input data significantly improved the quality of the output converted speech, in terms of F0 and spectral features. In addition, the audio output was replaced by an audio-visual output. Subjective perceptual tests confirmed that including the visual modality in the input data, the output data, or both improves the intelligibility of the whispered speech conversion. Both of these improvements were confirmed by subjective tests. Finally, we investigated a technique using a phonetic pivot, combining Hidden Markov Model (HMM)-based speech recognition and HMM-based speech synthesis to convert whispered speech to audible speech, in order to compare the performance of the two state-of-the-art approaches. Audiovisual features were used in the input data and audiovisual speech was produced as an output. The objective performance of the HMM-based system was inferior to that of the direct signal-to-signal system based on a GMM. A few interpretations of this result are proposed, together with future lines of research.
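The signal-based GMM mapping at the core of this abstract estimates target (audible-speech) features from source (whispered/NAM) features as a responsibility-weighted sum of per-component linear regressions, the standard conditional-expectation form of joint-density GMM conversion. A minimal NumPy sketch, assuming already-trained joint-GMM parameters; the weights, means, and covariance blocks below are hypothetical placeholders, not values from the thesis:

```python
import numpy as np

def gaussian_pdf(x, mu, cov):
    # Multivariate normal density N(x; mu, cov)
    d = mu.shape[0]
    diff = x - mu
    norm = np.sqrt(((2.0 * np.pi) ** d) * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm)

def gmm_convert(x, w, mu_x, mu_y, S_xx, S_yx):
    """Conditional-expectation mapping:
    y_hat = sum_m P(m|x) * (mu_y[m] + S_yx[m] @ inv(S_xx[m]) @ (x - mu_x[m]))
    where P(m|x) are the responsibilities of the source-space Gaussians."""
    M = len(w)
    resp = np.array([w[m] * gaussian_pdf(x, mu_x[m], S_xx[m]) for m in range(M)])
    resp /= resp.sum()
    y = np.zeros(mu_y.shape[1])
    for m in range(M):
        y += resp[m] * (mu_y[m] + S_yx[m] @ np.linalg.inv(S_xx[m]) @ (x - mu_x[m]))
    return y
```

With a single component the mapping reduces to one linear regression from source to target features; more components let the conversion behave piecewise-linearly across the acoustic space.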
Chuchilina, L. M. and I. E. Yeskov. "Speech recognition". Thesis, Видавництво СумДУ, 2008. http://essuir.sumdu.edu.ua/handle/123456789/15995.
Full text
Windchy, Eli. "Keynote Speech". Digital Commons @ East Tennessee State University, 2018. https://dc.etsu.edu/dcseug/2018/schedule/9.
Full text
Chua, W. W. "Speech recognition predictability of a Cantonese speech intelligibility index". Click to view the E-thesis via HKUTO, 2004. http://sunzi.lib.hku.hk/hkuto/record/B30509737.
Full text
Overton, Katherine. "Perceptual Differences in Natural Speech and Personalized Synthetic Speech". Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6921.
Full text
Mailend, Marja-Liisa. "Speech Motor Planning in Apraxia of Speech and Aphasia". Diss., The University of Arizona, 2017. http://hdl.handle.net/10150/625882.
Full text
Mak, Cheuk-yan Charin. "Effects of speech and noise on Cantonese speech intelligibility". Click to view the E-thesis via HKUTO, 2006. http://sunzi.lib.hku.hk/hkuto/record/B37989790.
Full text
Evans, N. W. D. "Spectral subtraction for speech enhancement and automatic speech recognition". Thesis, Swansea University, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.636935.
Full text
Chua, W. W. and 蔡蕙慧. "Speech recognition predictability of a Cantonese speech intelligibility index". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B30509737.
Full text
Mak, Cheuk-yan Charin and 麥芍欣. "Effects of speech and noise on Cantonese speech intelligibility". Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2006. http://hub.hku.hk/bib/B37989790.
Full text
Le Cornu, Thomas. "Reconstruction of intelligible audio speech from visual speech information". Thesis, University of East Anglia, 2016. https://ueaeprints.uea.ac.uk/67012/.
Full text
Jett, Brandi. "The role of coarticulation in speech-on-speech recognition". Case Western Reserve University School of Graduate Studies / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=case1554498179209764.
Full text
Bi, Ning. "Speech conversion and its application to alaryngeal speech enhancement". Diss., The University of Arizona, 1995. http://hdl.handle.net/10150/187290.
Full text
Gordon, Jane S. "Use of synthetic speech in tests of speech discrimination". PDXScholar, 1985. https://pdxscholar.library.pdx.edu/open_access_etds/3443.
Full text
Mukherjee, Sankar. "Sensorimotor processes in speech listening and speech-based interaction". Doctoral thesis, Università degli studi di Genova, 2019. http://hdl.handle.net/11567/941827.
Full text
Kong, Jessica Lynn. "The Effect Of Mean Fundamental Frequency Normalization Of Masker Speech For A Speech-In-Speech Recognition Task". Case Western Reserve University School of Graduate Studies / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=case1588949121900459.
Full text
Schramm, Hauke. "Modeling spontaneous speech variability for large vocabulary continuous speech recognition". [S.l.] : [s.n.], 2006. http://deposit.ddb.de/cgi-bin/dokserv?idn=97968479X.
Full text
Lidstone, Jane Stephanie May. "Private speech and inner speech in typical and atypical development". Thesis, Durham University, 2010. http://etheses.dur.ac.uk/526/.
Full text
Howard, John Graham. "Temporal aspects of auditory-visual speech and non-speech perception". Thesis, University of Reading, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.553127.
Full text
Simm, William Alexander. "Dysarthric speech measures for use in evidence-based speech therapy". Thesis, Lancaster University, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531724.
Full text
Lebart, Katia. "Speech dereverberation applied to automatic speech recognition and hearing aids". Thesis, University of Sussex, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.285064.
Full text
Alghamdi, Najwa. "Visual speech enhancement and its application in speech perception training". Thesis, University of Sheffield, 2017. http://etheses.whiterose.ac.uk/19667/.
Full text
Mwanyoha, Sadiki Pili, 1974-. "A speech recognition module for speech-to-text language translation". Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/9862.
Texto completoIncludes bibliographical references (leaves 47-48).
by Sadiki Pili Mwanyoha.
S.B. and M.Eng.
Moers-Prinz, Donata [author]. "Fast Speech in Unit Selection Speech Synthesis / Donata Moers-Prinz". Bielefeld : Universitätsbibliothek Bielefeld, 2020. http://d-nb.info/1219215201/34.
Full text
Lebart, Katia. "Speech dereverberation applied to automatic speech recognition and hearing aids". Rennes 1, 1999. http://www.theses.fr/1999REN10033.
Full text
Söderberg, Hampus. "Engaging Speech UI's - How to address a speech recognition interface". Thesis, Malmö högskola, Fakulteten för teknik och samhälle (TS), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:mau:diva-20591.
Full text
Shuster, Linda Irene. "Speech perception and speech production : between and within modal adaptation /". The Ohio State University, 1986. http://rave.ohiolink.edu/etdc/view?acc_num=osu148726754698296.
Full text
Kim, Hyo-Jong. "Stephen's speech : missiological implications of Stephen's speech in Luke-Acts /". Online full text .pdf document, available to Fuller patrons only, 1999. http://www.tren.com.
Full text
Vescovi, Federico <1993>. "Understanding Speech Acts: Towards the Automated Detection of Speech Acts". Master's Degree Thesis, Università Ca' Foscari Venezia, 2019. http://hdl.handle.net/10579/15644.
Texto completoEriksson, Mattias. "Speech recognition availability". Thesis, Linköping University, Department of Computer and Information Science, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-2651.
Texto completoThis project investigates the importance of availability in the scope of dictation programs. Using speech recognition technology for dictating has not reached the public, and that may very well be a result of poor availability in today’s technical solutions.
I have constructed a persona character, Johanna, who personalizes the target user. I have also developed a solution that streams audio into a speech recognition server and sends back interpreted text. Johanna affirmed that the solution was successful in theory.
I then recruited test users who tried out the solution in practice. Half of them report that their usage has increased, and will continue to increase, thanks to the new level of availability.
Øygarden, Jon. "Norwegian Speech Audiometry". Doctoral thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for språk- og kommunikasjonsstudier, 2009. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-5409.
Full text
Nilsson, Mattias. "Entropy and Speech". Doctoral thesis, Stockholm : Sound and Image Processing Laboratory, School of Electrical Engineering, Royal Institute of Technology, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3990.
Full text
Janardhanan, Deepa. "Wideband speech enhancement". Aachen : Shaker, 2008. http://d-nb.info/989298310/04.
Full text
Donovan, R. E. "Trainable speech synthesis". Thesis, University of Cambridge, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.598598.
Full text
Oliver, Richard George. "Malocclusion and speech". Thesis, Cardiff University, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.390247.
Full text