Academic literature on the topic 'Perceptual features for speech recognition'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Perceptual features for speech recognition.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.
Journal articles on the topic "Perceptual features for speech recognition"
Li, Guan Yu, Hong Zhi Yu, Yong Hong Li, and Ning Ma. "Features Extraction for Lhasa Tibetan Speech Recognition." Applied Mechanics and Materials 571-572 (June 2014): 205–8. http://dx.doi.org/10.4028/www.scientific.net/amm.571-572.205.
Haque, Serajul, Roberto Togneri, and Anthony Zaknich. "Perceptual features for automatic speech recognition in noisy environments." Speech Communication 51, no. 1 (January 2009): 58–75. http://dx.doi.org/10.1016/j.specom.2008.06.002.
Trabelsi, Imen, and Med Salim Bouhlel. "Comparison of Several Acoustic Modeling Techniques for Speech Emotion Recognition." International Journal of Synthetic Emotions 7, no. 1 (January 2016): 58–68. http://dx.doi.org/10.4018/ijse.2016010105.
Dua, Mohit, Rajesh Kumar Aggarwal, and Mantosh Biswas. "Optimizing Integrated Features for Hindi Automatic Speech Recognition System." Journal of Intelligent Systems 29, no. 1 (October 1, 2018): 959–76. http://dx.doi.org/10.1515/jisys-2018-0057.
Al Mahmud, Nahyan, and Shahfida Amjad Munni. "Qualitative Analysis of PLP in LSTM for Bangla Speech Recognition." International Journal of Multimedia & Its Applications 12, no. 5 (October 30, 2020): 1–8. http://dx.doi.org/10.5121/ijma.2020.12501.
Kamińska, Dorota. "Emotional Speech Recognition Based on the Committee of Classifiers." Entropy 21, no. 10 (September 21, 2019): 920. http://dx.doi.org/10.3390/e21100920.
Dmitrieva, E., V. Gelman, K. Zaitseva, and A. Orlov. "Psychophysiological features of perceptual learning in the process of speech emotional prosody recognition." International Journal of Psychophysiology 85, no. 3 (September 2012): 375. http://dx.doi.org/10.1016/j.ijpsycho.2012.07.034.
Seyedin, Sanaz, Seyed Mohammad Ahadi, and Saeed Gazor. "New Features Using Robust MVDR Spectrum of Filtered Autocorrelation Sequence for Robust Speech Recognition." Scientific World Journal 2013 (2013): 1–11. http://dx.doi.org/10.1155/2013/634160.
Kaur, Gurpreet, Mohit Srivastava, and Amod Kumar. "Genetic Algorithm for Combined Speaker and Speech Recognition using Deep Neural Networks." Journal of Telecommunications and Information Technology 2 (June 29, 2018): 23–31. http://dx.doi.org/10.26636/jtit.2018.119617.
Trabelsi, Imen, and Med Salim Bouhlel. "Feature Selection for GUMI Kernel-Based SVM in Speech Emotion Recognition." International Journal of Synthetic Emotions 6, no. 2 (July 2015): 57–68. http://dx.doi.org/10.4018/ijse.2015070104.
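Several of the journal articles above rely on perceptually motivated frequency warping, e.g. the mel scale behind MFCC features or the auditory scales behind PLP. As a minimal illustration of the idea (a sketch for orientation only, not drawn from any of the cited works), the standard mel mapping and a set of mel-spaced filterbank centre frequencies can be computed as:

```python
import math

def hz_to_mel(f_hz):
    """Standard mel-scale mapping (the common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse mel mapping."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_centers(n_filters, f_min_hz, f_max_hz):
    """Centre frequencies (Hz) of n_filters triangular filters
    spaced evenly on the mel scale between f_min_hz and f_max_hz."""
    lo, hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

# Filters end up densely packed at low frequencies, sparse at high ones,
# mimicking the ear's frequency resolution.
centers = mel_filter_centers(26, 0.0, 8000.0)
print(len(centers), round(centers[0], 1), round(centers[-1], 1))
```

The perceptual gain comes from the spacing: equal steps in mel correspond to progressively wider steps in Hz, so low-frequency detail is resolved more finely than high-frequency detail.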
Dissertations / Theses on the topic "Perceptual features for speech recognition"
Haque, Serajul. "Perceptual features for speech recognition." University of Western Australia. School of Electrical, Electronic and Computer Engineering, 2008. http://theses.library.uwa.edu.au/adt-WU2008.0187.
Gu, Y. "Perceptually-based features in automatic speech recognition." Thesis, Swansea University, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.637182.
Chu, Kam Keung. "Feature extraction based on perceptual non-uniform spectral compression for noisy speech recognition." Access full text, abstract and table of contents, 2005. http://libweb.cityu.edu.hk/cgi-bin/ezdb/thesis.pl?mphil-ee-b19887516a.pdf.
Note: "Submitted to Department of Electronic Engineering in partial fulfillment of the requirements for the degree of Master of Philosophy." Includes bibliographical references (leaves 143–147).
Koniaris, Christos. "Perceptually motivated speech recognition and mispronunciation detection." Doctoral thesis, KTH, Tal-kommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-102321.
Note: European Union FP6-034362 research project ACORNS; Computer-Animated language Teachers (CALATea).
Koniaris, Christos. "A study on selecting and optimizing perceptually relevant features for automatic speech recognition." Licentiate thesis, Stockholm : Kungliga Tekniska högskolan, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-11470.
Sklar, Alexander Gabriel. "Channel Modeling Applied to Robust Automatic Speech Recognition." Scholarly Repository, 2007. http://scholarlyrepository.miami.edu/oa_theses/87.
Atassi, Hicham. "Rozpoznání emočního stavu z hrané a spontánní řeči [Recognition of emotional state from acted and spontaneous speech]." Doctoral thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2014. http://www.nusl.cz/ntk/nusl-233665.
Temko, Andriy. "Acoustic event detection and classification." Doctoral thesis, Universitat Politècnica de Catalunya, 2007. http://hdl.handle.net/10803/6880.
Full textsortides de diversos sistemes de classificació. Els sistemes de classificació d'events acústics
desenvolupats s'han testejat també mitjançant la participació en unes quantes avaluacions d'àmbit
internacional, entre els anys 2004 i 2006. La segona principal contribució d'aquest treball de tesi consisteix en el desenvolupament de sistemes de detecció d'events acústics. El problema de la detecció és més complex, ja que inclou tant la classificació dels sons com la determinació dels intervals temporals on tenen lloc. Es desenvolupen dues versions del sistema i es proven amb els conjunts de dades de les dues campanyes d'avaluació internacional CLEAR que van tenir lloc els anys 2006 i 2007, fent-se servir dos tipus de bases de dades: dues bases d'events acústics aïllats, i una base d'enregistraments de seminaris interactius, les quals contenen un nombre relativament elevat d'ocurrències dels events acústics especificats. Els sistemes desenvolupats, que consisteixen en l'ús de classificadors basats en SVM que operen dins
d'una finestra lliscant més un post-processament, van ser els únics presentats a les avaluacions
esmentades que no es basaven en models de Markov ocults (Hidden Markov Models) i cada un d'ells
va obtenir resultats competitius en la corresponent avaluació. La detecció d'activitat oral és un altre dels objectius d'aquest treball de tesi, pel fet de ser un cas particular de detecció d'events acústics especialment important. Es desenvolupa una tècnica de millora de l'entrenament dels SVM per fer front a la necessitat de reducció de l'enorme conjunt de dades existents. El sistema resultant, basat en SVM, és testejat amb uns quants conjunts de dades de l'avaluació NIST RT (Rich Transcription), on mostra puntuacions millors que les del sistema basat en GMM, malgrat que aquest darrer va quedar entre els primers en l'avaluació NIST RT de 2006.
Per acabar, val la pena esmentar alguns resultats col·laterals d'aquest treball de tesi. Com que s'ha dut a terme en l'entorn del projecte europeu CHIL, l'autor ha estat responsable de l'organització de les avaluacions internacionals de classificació i detecció d'events acústics abans esmentades, liderant l'especificació de les classes d'events, les bases de dades, els protocols d'avaluació i, especialment, proposant i implementant les diverses mètriques utilitzades. A més a més, els sistemes de detecció
s'han implementat en la sala intel·ligent de la UPC, on funcionen en temps real a efectes de test i demostració.
The human activity that takes place in meeting-rooms or class-rooms is reflected in a rich variety of acoustic events, either produced by the human body or by objects handled by humans, so the determination of both the identity of sounds and their position in time may help to detect and describe that human activity.
Additionally, detection of sounds other than speech may be useful to enhance the robustness of speech technologies like automatic speech recognition. Automatic detection and classification of acoustic events is the objective of this thesis work. It aims at processing the acoustic signals collected by distant microphones in meeting-room or classroom environments to convert them into symbolic descriptions corresponding to a listener's perception of the different sound events that are present in the signals and their sources.

First, the task of acoustic event classification is faced using Support Vector Machine (SVM) classifiers, which are motivated by the scarcity of training data. A confusion-matrix-based variable-feature-set clustering scheme is developed for the multiclass recognition problem and tested on the gathered database. With it, a higher classification rate than the GMM-based technique is obtained, arriving at a large relative average error reduction with respect to the best result from the conventional binary tree scheme. Moreover, several ways to extend SVMs to sequence processing are compared, in an attempt to avoid the drawback of SVMs when dealing with audio data, i.e. their restriction to fixed-length vectors; the dynamic time warping kernels are observed to work well for sounds that show a temporal structure. Furthermore, concepts and tools from fuzzy theory are used to investigate, first, the importance of and degree of interaction among features, and second, ways to fuse the outputs of several classification systems. The developed AEC systems are also tested by participating in several international evaluations from 2004 to 2006, and the results are reported.

The second main contribution of this thesis work is the development of systems for detection of acoustic events. The detection problem is more complex, since it includes both classification and determination of the time intervals where the sound takes place. Two system versions are developed and tested on the datasets of the two CLEAR international evaluation campaigns in 2006 and 2007. Two kinds of databases are used: two databases of isolated acoustic events, and a database of interactive seminars containing a significant number of acoustic events of interest. Our developed systems, which consist of SVM-based classification within a sliding window plus post-processing, were the only submissions not using HMMs, and each of them obtained competitive results in the corresponding evaluation.

Speech activity detection was also pursued in this thesis since it is an especially important particular case of acoustic event detection. An enhanced SVM training approach for the speech activity detection task is developed, mainly to cope with the problem of dataset reduction. The resulting SVM-based system is tested with several NIST Rich Transcription (RT) evaluation datasets, and it shows better scores than our GMM-based system, which ranked among the best systems in the RT06 evaluation.

Finally, it is worth mentioning a few side outcomes of this thesis work. As it has been carried out in the framework of the CHIL EU project, the author has been responsible for the organization of the above-mentioned international evaluations in acoustic event classification and detection, taking a leading role in the specification of acoustic event classes, databases, and evaluation protocols, and, especially, in the proposal and implementation of the various metrics that have been used. Moreover, the detection systems have been implemented in the UPC's smart-room, where they work in real time for purposes of testing and demonstration.
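The "SVM-based classification within a sliding window plus post-processing" scheme described in this abstract can be sketched roughly as follows. This is an illustrative toy, not the thesis implementation: a mean-score threshold stands in for the SVM decision on each window, and the post-processing simply merges overlapping positive windows into event intervals.

```python
def detect_events(frame_scores, win=10, hop=5, threshold=0.5):
    """Toy sliding-window acoustic event detector.

    A real system would run an SVM on features extracted from each
    window; here a mean-score threshold stands in for that decision.
    """
    hits = []
    for start in range(0, max(1, len(frame_scores) - win + 1), hop):
        window = frame_scores[start:start + win]
        if sum(window) / len(window) > threshold:
            hits.append((start, start + win))
    # Post-processing: merge overlapping or adjacent positive windows
    # into contiguous (start_frame, end_frame) event intervals.
    events = []
    for s, e in hits:
        if events and s <= events[-1][1]:
            events[-1] = (events[-1][0], max(events[-1][1], e))
        else:
            events.append((s, e))
    return events

# A burst of high scores in the middle of the signal yields one event.
scores = [0.0] * 20 + [0.9] * 30 + [0.0] * 20
print(detect_events(scores))  # [(20, 50)]
```

The merge step is what turns per-window decisions into the time intervals that the detection task requires, which is exactly the extra complexity the abstract attributes to detection over classification.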
Lileikytė, Rasa. "Quality estimation of speech recognition features." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2012. http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2012~D_20120302_090132-92071.
The accuracy of speech recognition systems depends on the features that describe the speech signals and on the properties of the classifiers that use those features. In the traditional approach, recognition accuracy has to be computed anew for every chosen feature set and every classifier type. The amount of such work can be reduced by first evaluating the quality of the candidate features, so research into the quality evaluation of speech signal features was carried out. A method for evaluating the quality of speech recognition features, based on the use of three metrics, was investigated. It is shown that features selected in this way describe the quality of recognition systems in Euclidean space and reduce the amount of work required to evaluate recognition system quality. It is also shown that the complexity of the feature quality evaluation algorithm is O(2R log2 R), whereas the complexity of evaluating the recognition quality of a system that uses a dynamic time warping classifier is O(R^2), where R is the number of speech signal reference vectors. Experimental results confirmed the validity of the presented method for evaluating the quality of speech recognition features.
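The O(R^2) figure quoted in this abstract is the classic dynamic time warping (DTW) bound: comparing two sequences means filling a full alignment cost matrix. A minimal sketch of DTW (illustrative only, not the dissertation's implementation):

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Filling the (len(a)+1) x (len(b)+1) cost matrix takes
    O(len(a) * len(b)) time, i.e. O(R^2) when both sequences
    contain R reference vectors."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Best of: insertion, deletion, or match/substitution.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

# Time-stretched copies of the same contour align with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Because every template comparison costs O(R^2), a cheaper O(2R log2 R) feature-quality screen, as proposed in the dissertation, can substantially cut the total evaluation workload.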
Matthews, Iain. "Features for audio-visual speech recognition." Thesis, University of East Anglia, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.266736.
Books on the topic "Perceptual features for speech recognition"
Rao, K. Sreenivasa, and Shashidhar G. Koolagudi. Emotion Recognition using Speech Features. New York, NY: Springer New York, 2013. http://dx.doi.org/10.1007/978-1-4614-5143-3.
Rao, K. Sreenivasa, and Manjunath K. E. Speech Recognition Using Articulatory and Excitation Source Features. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-49220-9.
Gabsdil, Malte. Automatic classification of speech recognition hypotheses using acoustic and pragmatic features. Saarbrücken: DFKI & Universität des Saarlandes, 2005.
Rao, K. Sreenivasa. Robust Emotion Recognition using Spectral and Prosodic Features. New York, NY: Springer New York, 2013.
Kulshreshtha, Manisha. Dialect Accent Features for Establishing Speaker Identity: A Case Study. Boston, MA: Springer US, 2012.
Rao, K. Sreenivasa, and Manjunath K. E. Speech Recognition Using Articulatory and Excitation Source Features. Springer, 2017.
Leibo, Joel Z., and Tomaso Poggio. Perception. Oxford University Press, 2018. http://dx.doi.org/10.1093/oso/9780199674923.003.0025.
Lee, Lisa. The role of the structure of the lexicon in perceptual word learning. 1993.
Rao, K. Sreenivasa, and Shashidhar G. Koolagudi. Robust Emotion Recognition using Spectral and Prosodic Features. Springer, 2013.
Book chapters on the topic "Perceptual features for speech recognition"
Revathi, A., R. Nagakrishnan, D. Vishnu Vashista, Kuppa Sai Sri Teja, and N. Sasikaladevi. "Emotion Recognition from Speech Using Perceptual Features and Convolutional Neural Networks." In Lecture Notes in Electrical Engineering, 355–65. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-3992-3_29.
Zhang, Linjuan, Longbiao Wang, Jianwu Dang, Lili Guo, and Haotian Guan. "Convolutional Neural Network with Spectrogram and Perceptual Features for Speech Emotion Recognition." In Neural Information Processing, 62–71. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-04212-7_6.
Grau, Antoni, Joan Aranda, and Joan Climent. "Stepwise selection of perceptual texture features." In Advances in Pattern Recognition, 837–44. Berlin, Heidelberg: Springer Berlin Heidelberg, 1998. http://dx.doi.org/10.1007/bfb0033309.
Kaur, Gurpreet, Mohit Srivastava, and Amod Kumar. "Speech Recognition Fundamentals and Features." In Cognitive Computing Systems, 327–48. First edition. Apple Academic Press, 2021. http://dx.doi.org/10.1201/9781003082033-18.
Frasconi, Paolo, Marco Gori, and Giovanni Soda. "Automatic speech recognition with neural networks: Beyond nonparametric models." In Intelligent Perceptual Systems, 104–21. Berlin, Heidelberg: Springer Berlin Heidelberg, 1993. http://dx.doi.org/10.1007/3-540-57379-8_6.
Potapova, Rodmonga, and Liliya Komalova. "Auditory-Perceptual Recognition of the Emotional State of Aggression." In Speech and Computer, 89–95. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-23132-7_11.
Sendlmeier, Walter F. "Primary Perceptual Units in Word Recognition." In Recent Advances in Speech Understanding and Dialog Systems, 165–69. Berlin, Heidelberg: Springer Berlin Heidelberg, 1988. http://dx.doi.org/10.1007/978-3-642-83476-9_16.
So, Stephen, and Kuldip K. Paliwal. "Quantization of Speech Features: Source Coding." In Advances in Pattern Recognition, 131–61. London: Springer London, 2008. http://dx.doi.org/10.1007/978-1-84800-143-5_7.
Karlos, Stamatis, Nikos Fazakis, Katerina Karanikola, Sotiris Kotsiantis, and Kyriakos Sgarbas. "Speech Recognition Combining MFCCs and Image Features." In Speech and Computer, 651–58. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-43958-7_79.
Bimbot, Frédéric, Gérard Chollet, and Jean-Pierre Tubach. "Phonetic features extraction using Time-Delay Neural Networks." In Speech Recognition and Understanding, 299–304. Berlin, Heidelberg: Springer Berlin Heidelberg, 1992. http://dx.doi.org/10.1007/978-3-642-76626-8_31.
Conference papers on the topic "Perceptual features for speech recognition"
Revathi, A., and C. Jeyalakshmi. "Robust speech recognition in noisy environment using perceptual features and adaptive filters." In 2017 2nd International Conference on Communication and Electronics Systems (ICCES). IEEE, 2017. http://dx.doi.org/10.1109/cesys.2017.8321168.
Umakanthan, Padmalochini, and Kaliappan Gopalan. "A Perceptual Masking based Feature Set for Speech Recognition." In Modelling and Simulation. Calgary, AB, Canada: ACTAPRESS, 2013. http://dx.doi.org/10.2316/p.2013.804-024.
Revathi, A., and Y. Venkataramani. "Perceptual Features Based Isolated Digit and Continuous Speech Recognition Using Iterative Clustering Approach." In 2009 First International Conference on Networks & Communications. IEEE, 2009. http://dx.doi.org/10.1109/netcom.2009.32.
Nguyen Quoc Trung and Phung Trung Nghia. "The perceptual wavelet feature for noise robust Vietnamese speech recognition." In 2008 Second International Conference on Communications and Electronics (ICCE). IEEE, 2008. http://dx.doi.org/10.1109/cce.2008.4578968.
Alatwi, Aadel, Stephen So, and Kuldip K. Paliwal. "Perceptually motivated linear prediction cepstral features for network speech recognition." In 2016 10th International Conference on Signal Processing and Communication Systems (ICSPCS). IEEE, 2016. http://dx.doi.org/10.1109/icspcs.2016.7843309.
Biswas, Astik, P. K. Sahu, Anirban Bhowmick, and Mahesh Chandra. "Acoustic feature extraction using ERB like wavelet sub-band perceptual Wiener filtering for noisy speech recognition." In 2014 Annual IEEE India Conference (INDICON). IEEE, 2014. http://dx.doi.org/10.1109/indicon.2014.7030474.
Frolova, Olga, and Elena Lyakso. "Perceptual Features of Speech and Vocalizations of 5–8 Years Old Children with Autism Spectrum Disorders and Intellectual Disabilities: Recognition of the Child's Gender, Age and State." In XVI International interdisciplinary congress "Neuroscience for Medicine and Psychology". LLC MAKS Press, 2020. http://dx.doi.org/10.29003/m1310.sudak.ns2020-16/485-486.
Wu, Chung-Hsien, Yu-Hsien Chiu, and Huigan Lim. "Perceptual speech modeling for noisy speech recognition." In Proceedings of ICASSP '02. IEEE, 2002. http://dx.doi.org/10.1109/icassp.2002.5743735.
Chung-Hsien Wu, Yu-Hsien Chiu, and Huigan Lim. "Perceptual speech modeling for noisy speech recognition." In IEEE International Conference on Acoustics Speech and Signal Processing ICASSP-02. IEEE, 2002. http://dx.doi.org/10.1109/icassp.2002.1005757.
Sezgin, Cenk, Bilge Gunsel, and Canberk Hacioglu. "Audio emotion recognition by perceptual features." In 2012 20th Signal Processing and Communications Applications Conference (SIU). IEEE, 2012. http://dx.doi.org/10.1109/siu.2012.6204799.
Reports on the topic "Perceptual features for speech recognition"
Nahamoo, David. Robust Models and Features for Speech Recognition. Fort Belvoir, VA: Defense Technical Information Center, March 1998. http://dx.doi.org/10.21236/ada344834.