A ready-made bibliography on "Automatic speech recognition"

Create accurate references in APA, MLA, Chicago, Harvard, and many other styles

Select a source type:

Browse lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on "Automatic speech recognition".

Next to every work in the bibliography you will find an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference for the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the publication as a ".pdf" file and read its abstract online, where these are available in the metadata.

Journal articles on "Automatic speech recognition"

1. Fried, Louis. "AUTOMATIC SPEECH RECOGNITION". Information Systems Management 13, no. 1 (January 1996): 29–37. http://dx.doi.org/10.1080/10580539608906969.
2. Chigier, Benjamin. "Automatic speech recognition". Journal of the Acoustical Society of America 103, no. 1 (January 1998): 19. http://dx.doi.org/10.1121/1.423151.
3. Hovell, Simon Alexander. "Automatic speech recognition". Journal of the Acoustical Society of America 107, no. 5 (2000): 2325. http://dx.doi.org/10.1121/1.428610.
4. Espy‐Wilson, Carol. "Automatic speech recognition". Journal of the Acoustical Society of America 117, no. 4 (April 2005): 2403. http://dx.doi.org/10.1121/1.4786105.
5. Merrill, John W. "Automatic speech recognition". Journal of the Acoustical Society of America 121, no. 1 (2007): 29. http://dx.doi.org/10.1121/1.2434334.
6. Rao, P. V. S., and K. K. Paliwal. "Automatic speech recognition". Sadhana 9, no. 2 (September 1986): 85–120. http://dx.doi.org/10.1007/bf02747521.
7. SAYEM, Asm. "Speech Analysis for Alphabets in Bangla Language: Automatic Speech Recognition". International Journal of Engineering Research 3, no. 2 (February 1, 2014): 88–93. http://dx.doi.org/10.17950/ijer/v3s2/211.
8. Carlson, Gloria Stevens, and Jared Bernstein. "Automatic speech recognition of impaired speech". International Journal of Rehabilitation Research 11, no. 4 (December 1988): 396–97. http://dx.doi.org/10.1097/00004356-198812000-00013.
9. SAGISAKA, Yoshinori. "AUTOMATIC SPEECH RECOGNITION MODELS". Kodo Keiryogaku (The Japanese Journal of Behaviormetrics) 22, no. 1 (1995): 40–47. http://dx.doi.org/10.2333/jbhmk.22.40.
10. Receveur, Simon, Robin Weiss, and Tim Fingscheidt. "Turbo Automatic Speech Recognition". IEEE/ACM Transactions on Audio, Speech, and Language Processing 24, no. 5 (May 2016): 846–62. http://dx.doi.org/10.1109/taslp.2016.2520364.

Doctoral dissertations on "Automatic speech recognition"

1. Alcaraz Meseguer, Noelia. "Speech Analysis for Automatic Speech Recognition". Thesis, Norwegian University of Science and Technology, Department of Electronics and Telecommunications, 2009. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-9092.

Abstract:

The classical front-end analysis in speech recognition is a spectral analysis which parametrizes the speech signal into feature vectors; the most popular set of these is the Mel Frequency Cepstral Coefficients (MFCC). They are based on a standard power spectrum estimate which is first subjected to a log-based transform of the frequency axis (mel-frequency scale), and then decorrelated using a modified discrete cosine transform. Following a focused introduction to speech production, perception and analysis, this thesis studies the implementation of a speech generative model, whereby speech is synthesized and recovered back from its MFCC representation. The work was developed in two steps: first, the computation of the MFCC vectors from the source speech files using HTK software; and second, the implementation of the generative model itself, which represents the conversion chain from HTK-generated MFCC vectors back to speech. To assess the fidelity of the speech coding into feature vectors and to evaluate the generative model, the spectral distance between the original speech signal and the one produced from the MFCC vectors was computed. For that, spectral models based on Linear Prediction Coding (LPC) analysis were used. During the implementation of the generative model, results were obtained on the reconstruction of the spectral representation and the quality of the synthesized speech.
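The MFCC pipeline this abstract describes (power spectrum, mel-scale warping, log compression, DCT decorrelation) can be sketched in a few lines of NumPy. This is a minimal illustration, not the HTK implementation the thesis uses; the frame length, filter count, and number of coefficients are arbitrary choices.

```python
import numpy as np

def hz_to_mel(f):
    # Standard mel-scale warping of the frequency axis.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mfcc(frame, sr, n_filters=26, n_ceps=13):
    # Power spectrum -> mel filterbank -> log -> DCT (decorrelation).
    spec = np.abs(np.fft.rfft(frame)) ** 2
    energies = mel_filterbank(n_filters, len(frame), sr) @ spec
    log_e = np.log(energies + 1e-10)
    # DCT-II basis written out directly from its definition.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_filters)))
    return basis @ log_e
```

Applied to a single 512-sample frame at 16 kHz, this yields a 13-dimensional feature vector per frame, the representation the thesis then inverts back to speech.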

2. Gabriel, Naveen. "Automatic Speech Recognition in Somali". Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166216.

Abstract:
The field of speech recognition has, during the last decade, left the research stage and found its way into the public market; today, speech recognition software is ubiquitous around us. An automatic speech recognizer understands human speech and represents it as text. Most current speech recognition software employs variants of deep neural networks. Before the deep learning era, the hybrid of hidden Markov model and Gaussian mixture model (HMM-GMM) was a popular statistical model for speech recognition. In this thesis, an automatic speech recognizer using HMM-GMM was trained on Somali data consisting of voice recordings and their transcriptions. HMM-GMM is a hybrid system whose framework is composed of an acoustic model and a language model. The acoustic model represents the time-variant aspect of the speech signal, and the language model determines how probable the observed sequence of words is. The thesis begins with background on speech recognition; the literature survey covers some of the work that has been done in this field. The thesis evaluates how different language models and discounting methods affect the performance of speech recognition systems. Log scores were also calculated for the top 5 predicted sentences, along with confidence measures of the predicted sentences. The model was trained on 4.5 hours of voiced data and its corresponding transcription, and evaluated on 3 minutes of test data. The performance of the trained model on the test set was good, given that the data was devoid of background noise and lacked variability. Performance is measured using word error rate (WER) and sentence error rate (SER), and is compared with the results of other research work. The thesis also discusses why the log and confidence scores of a sentence might not be a good way to measure the performance of the resulting model, the shortcomings of the HMM-GMM model, how the existing model can be improved, and different alternatives to solve the problem.
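The word error rate used in this abstract is the edit (Levenshtein) distance between reference and hypothesis, counted in words and normalised by the reference length. A minimal, dependency-free sketch:

```python
def word_error_rate(reference, hypothesis):
    # Levenshtein distance over words: substitutions + insertions + deletions,
    # normalised by the number of reference words.
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
    return d[len(ref)][len(hyp)] / len(ref)
```

Sentence error rate (SER), also reported in the thesis, is simply the fraction of sentences whose WER is non-zero.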
3. Al-Shareef, Sarah. "Conversational Arabic Automatic Speech Recognition". Thesis, University of Sheffield, 2015. http://etheses.whiterose.ac.uk/10145/.

Abstract:
Colloquial Arabic (CA) is the set of spoken variants of modern Arabic that exist in the form of regional dialects and are generally considered to be mother tongues in those regions. CA has limited textual resources because it exists only as a spoken language, without a standardised written form. Normally the modern standard Arabic (MSA) writing convention is employed, which has limitations in phonetically representing CA. Without phonetic dictionaries, the pronunciation of CA words is ambiguous and can only be obtained through word and/or sentence context. Moreover, CA inherits the complex MSA word structure, where words can be created by attaching affixes to a word. In automatic speech recognition (ASR), commonly used approaches to model acoustic, pronunciation and word variability are language independent. However, one can observe significant differences in performance between English and CA, with the latter yielding up to three times higher error rates. This thesis investigates the main issues behind the under-performance of CA ASR systems. The work focuses on two directions: first, the impact on language modelling of limited lexical coverage and insufficient training data for written CA; second, obtaining better models for the acoustics and pronunciations by learning to transfer between written and spoken forms. Several original contributions result from each direction. Data-driven classes derived from decomposed text are shown to reduce the out-of-vocabulary rate; a novel colloquialisation system to import additional data is introduced; automatic diacritisation to restore the missing short vowels was found to yield good performance; and a new acoustic set for describing CA was defined. The proposed methods improved ASR performance in terms of word error rate on a CA conversational telephone speech ASR task.
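The out-of-vocabulary (OOV) rate that the thesis reduces through text decomposition is the fraction of running words missing from the recogniser's vocabulary. The sketch below uses invented toy tokens and a naive prefix-stripping rule purely to illustrate the idea; the thesis itself uses data-driven classes, not a hand-written affix list.

```python
def oov_rate(vocabulary, words):
    # Fraction of running words not covered by the vocabulary.
    vocab = set(vocabulary)
    return sum(w not in vocab for w in words) / len(words)

def decompose(word, prefixes):
    # Naive prefix stripping: a toy stand-in for the data-driven
    # morphological decomposition described in the thesis.
    for p in prefixes:
        if word.startswith(p) and len(word) > len(p):
            return [p, word[len(p):]]
    return [word]
```

With an invented vocabulary of decomposed units {"al", "kitab", "wa", "bayt"}, the surface forms ["alkitab", "wa", "albayt"] have an OOV rate of 2/3, which drops to 0 once each word is split into its known sub-units.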
4. Jalalvand, Shahab. "Automatic Speech Recognition Quality Estimation". Doctoral thesis, Università degli studi di Trento, 2017. https://hdl.handle.net/11572/368743.

Abstract:
Evaluation of automatic speech recognition (ASR) systems is difficult and costly, since it requires manual transcriptions. This evaluation is usually done by computing the word error rate (WER), the most popular metric in the ASR community. Such computation is possible only if manual references are available, which in real-life applications is too rigid a condition. A reference-free metric for evaluating ASR performance is the confidence measure provided by the ASR decoder. However, the confidence measure is not always available, especially in commercial ASR usage. Even when available, this measure is usually biased towards the decoder and is therefore not suitable for comparison purposes, for example between two ASR systems. These issues motivate the need for an automatic quality estimation system for ASR outputs. This thesis explores ASR quality estimation (ASR QE) from different perspectives: feature engineering, learning algorithms and applications. On the feature engineering side, a wide range of features extractable from the input signal and the output transcription are studied. These features represent the quality of the recognition from different aspects and are divided into four groups: signal, textual, hybrid and word-based features. On the learning side, two main approaches are addressed: i) QE via regression, suitable for the single-hypothesis scenario; ii) QE via machine-learned ranking (MLR), suitable for the multiple-hypotheses scenario. In the former, a regression model is used to predict the WER of each single hypothesis created through a single automatic transcription channel. In the latter, a ranking model is used to predict the order of multiple hypotheses with respect to their quality; multiple hypotheses are mainly generated by several ASR systems or several recording microphones. On the application side, two applications are introduced in which ASR QE yields salient improvements in terms of WER: i) QE-informed data selection for acoustic model adaptation; ii) QE-informed system combination. In the former, single-hypothesis ASR QE methods are exploited to select the best adaptation data for upgrading the acoustic model. In the latter, multiple-hypotheses ASR QE methods are exploited to rank and combine the automatic transcriptions in a supervised manner. The experiments are mostly conducted on the CHiME-3 English dataset, which consists of Wall Street Journal utterances recorded by multiple distant microphones in noisy environments. The results show that QE-informed acoustic model adaptation leads to 1.8% absolute WER reduction and QE-informed system combination leads to 1.7% absolute WER reduction on the CHiME-3 task. The outcomes of this thesis are packaged in an open-source toolkit named TranscRater (transcription rating toolkit, https://github.com/hlt-mt/TranscRater), developed on the basis of the aforementioned studies. TranscRater can be used to extract informative features, train QE models and predict the quality of reference-less recognitions in a variety of ASR tasks.
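QE via regression, as described here, fits a model mapping hypothesis-level features to a WER score so that quality can be estimated without a reference. A toy sketch with ordinary least squares; the feature values (SNR, word count, mean decoder confidence) and WER targets are invented for illustration, and the thesis uses far richer signal, textual, hybrid and word-based feature sets.

```python
import numpy as np

# Invented toy features per hypothesis: (SNR in dB, word count,
# mean decoder confidence) -- placeholders for the real feature sets.
X = np.array([[18.0, 12, 0.91],
              [ 6.0,  9, 0.55],
              [12.0, 15, 0.78],
              [ 3.0,  7, 0.40],
              [20.0, 11, 0.95]])
y = np.array([0.08, 0.35, 0.18, 0.52, 0.05])  # reference WER per hypothesis

# Ordinary least squares with a bias column: w maps features -> WER.
A = np.hstack([X, np.ones((len(X), 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict_wer(features):
    # Reference-free quality estimate for a new hypothesis.
    return float(np.append(np.asarray(features, dtype=float), 1.0) @ w)
```

In the multiple-hypotheses (MLR) scenario, the same predicted scores could be used to rank competing transcriptions before combining them.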
5. Jalalvand, Shahab. "Automatic Speech Recognition Quality Estimation". Doctoral thesis, University of Trento, 2017. http://eprints-phd.biblio.unitn.it/2058/1/PhD_Thesis.pdf.

6. Wang, Peidong. "Robust Automatic Speech Recognition By Integrating Speech Separation". The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1619099401042668.
7. Seward, Alexander. "Efficient Methods for Automatic Speech Recognition". Doctoral thesis, KTH, Tal, musik och hörsel, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-3675.

Abstract:
This thesis presents work in the area of automatic speech recognition (ASR). The thesis focuses on methods for increasing the efficiency of speech recognition systems and on techniques for efficient representation of different types of knowledge in the decoding process. In this work, several decoding algorithms and recognition systems have been developed, aimed at various recognition tasks. The thesis presents the KTH large vocabulary speech recognition system. The system was developed for online (live) recognition with large vocabularies and complex language models. The system utilizes weighted transducer theory for efficient representation of different knowledge sources, with the purpose of optimizing the recognition process. A search algorithm for efficient processing of hidden Markov models (HMMs) is presented. The algorithm is an alternative to the classical Viterbi algorithm for fast computation of shortest paths in HMMs. It is part of a larger decoding strategy aimed at reducing the overall computational complexity in ASR. In this approach, all HMM computations are completely decoupled from the rest of the decoding process. This enables the use of larger vocabularies and more complex language models without an increase of HMM-related computations. Ace is another speech recognition system developed within this work. It is a platform aimed at facilitating the development of speech recognizers and new decoding methods. A real-time system for low-latency online speech transcription is also presented. The system was developed within a project with the goal of improving the possibilities for hard-of-hearing people to use conventional telephony by providing speech-synchronized multimodal feedback. This work addresses several additional requirements implied by this special recognition task.
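The thesis positions its search algorithm as an alternative to the classical Viterbi algorithm for computing shortest paths in HMMs. For reference, the classical baseline it improves upon can be sketched as follows (log-space dynamic programming over states; the model values below are an arbitrary toy example, not from the thesis):

```python
import numpy as np

def viterbi(obs, pi, A, B):
    # Classical Viterbi decoding: the most likely state path through an
    # HMM, computed in log space to avoid numerical underflow.
    T, N = len(obs), len(pi)
    logA, logB, logpi = np.log(A), np.log(B), np.log(pi)
    delta = np.zeros((T, N))          # best log-score ending in each state
    psi = np.zeros((T, N), dtype=int)  # back-pointers
    delta[0] = logpi + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA  # scores[i, j]: i -> j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    # Backtrack from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]
```

The cost of this recursion grows with the number of states per frame, which is why decoupling HMM computations from the rest of the decoding, as the thesis proposes, pays off for large vocabularies.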
8. Vipperla, Ravichander. "Automatic Speech Recognition for ageing voices". Thesis, University of Edinburgh, 2011. http://hdl.handle.net/1842/5725.

Abstract:
With ageing, human voices undergo several changes which are typically characterised by increased hoarseness, breathiness, changes in articulatory patterns and a slower speaking rate. The focus of this thesis is to understand the impact of ageing on Automatic Speech Recognition (ASR) performance and to improve ASR accuracy for older voices. Baseline results on three corpora indicate that word error rates (WER) for older adults are significantly higher than those of younger adults, and that the decrease in accuracy is greater for male speakers than for females. Acoustic parameters such as jitter and shimmer, which measure glottal source disfluencies, were found to be significantly higher for older adults. However, the hypothesis that these changes explain the differences in WER between the two age groups is proven incorrect: experiments with artificial introduction of glottal source disfluencies into speech from younger adults do not show a significant impact on WER. Changes in fundamental frequency, observed quite often in older voices, have a marginal impact on ASR accuracy. Analysis of phoneme errors between younger and older speakers shows a pattern of certain phonemes, especially low vowels, being more affected by ageing; these changes, however, vary across speakers. Another factor strongly associated with ageing voices is a decrease in the rate of speech. Experiments analysing the impact of slower speaking rate on ASR accuracy indicate that insertion errors increase when decoding slower speech with models trained on relatively faster speech. A way to characterise speakers in acoustic space based on speaker adaptation transforms is then proposed, and it is observed that speakers (especially males) can be segregated by age with reasonable accuracy. Inspired by this, supervised hierarchical acoustic models based on gender and age are examined, and significant improvements in word accuracy over the baseline are achieved with such models. The idea is then extended to construct unsupervised hierarchical models, which also outperform the baseline models by a good margin. Finally, it is hypothesized that ASR accuracy can be improved by augmenting the adaptation data with speech from acoustically closest speakers, and a strategy to select the augmentation speakers is proposed. Experimental results on two corpora indicate that the hypothesis holds true only when the amount of available adaptation data is limited to a few seconds. The efficacy of such a speaker selection strategy is analysed for both younger and older adults.
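Jitter and shimmer, the glottal-source measures used in the baseline analysis above, are commonly defined in their local form as the mean absolute difference between consecutive cycle periods (or cycle peak amplitudes) relative to the mean. A minimal sketch of these local variants, assuming period and amplitude sequences have already been extracted from the signal (the thesis does not specify which exact variants it computes):

```python
def jitter(periods):
    # Local jitter: mean absolute difference between consecutive glottal
    # cycle periods, relative to the mean period.
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def shimmer(amplitudes):
    # Local shimmer: the same measure applied to cycle peak amplitudes.
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    return (sum(diffs) / len(diffs)) / (sum(amplitudes) / len(amplitudes))
```

A perfectly periodic voice yields zero for both measures; the elevated values reported for older speakers reflect cycle-to-cycle irregularity of the glottal source.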
9. Guzy, Julius Jonathan. "Automatic speech recognition: a refutation approach". Thesis, De Montfort University, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.254196.
10. Deterding, David Henry. "Speaker normalisation for automatic speech recognition". Thesis, University of Cambridge, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.359822.

Books on "Automatic speech recognition"

1. Yu, Dong, and Li Deng. Automatic Speech Recognition. London: Springer London, 2015. http://dx.doi.org/10.1007/978-1-4471-5779-3.
2. Lee, Kai-Fu. Automatic Speech Recognition. Boston, MA: Springer US, 1989. http://dx.doi.org/10.1007/978-1-4615-3650-5.
3. Woelfel, Matthias. Distant speech recognition. Chichester, West Sussex, U.K.: Wiley, 2009.
4. Junqua, Jean-Claude, and Jean-Paul Haton. Robustness in Automatic Speech Recognition. Boston, MA: Springer US, 1996. http://dx.doi.org/10.1007/978-1-4613-1297-0.
5. Lee, Chin-Hui, Frank K. Soong, and Kuldip K. Paliwal, eds. Automatic Speech and Speaker Recognition. Boston, MA: Springer US, 1996. http://dx.doi.org/10.1007/978-1-4613-1367-0.
6. Keshet, Joseph, and Samy Bengio, eds. Automatic Speech and Speaker Recognition. Chichester, UK: John Wiley & Sons, Ltd, 2009. http://dx.doi.org/10.1002/9780470742044.
7. Huang, X. D. Hidden Markov models for speech recognition. Edinburgh: Edinburgh University Press, 1990.
8. Markowitz, Judith A. Using speech recognition. Upper Saddle River, N.J.: Prentice Hall PTR, 1996.
9. Ainsworth, W. A. Speech recognition by machine. London, U.K.: P. Peregrinus on behalf of the Institution of Electrical Engineers, 1988.
10. Ainsworth, W. A. Speech recognition by machine. London: Peregrinus on behalf of the Institution of Electrical Engineers, 1987.

Book chapters on "Automatic speech recognition"

1. Kurematsu, Akira, and Tsuyoshi Morimoto. "Speech Recognition". In Automatic Speech Translation, 9–41. London: CRC Press, 2023. http://dx.doi.org/10.1201/9780429333385-2.
2. Lu, Xugang, Sheng Li, and Masakiyo Fujimoto. "Automatic Speech Recognition". In SpringerBriefs in Computer Science, 21–38. Singapore: Springer Singapore, 2019. http://dx.doi.org/10.1007/978-981-15-0595-9_2.
3. Owens, F. J. "Automatic Speech Recognition". In Signal Processing of Speech, 138–73. London: Macmillan Education UK, 1993. http://dx.doi.org/10.1007/978-1-349-22599-6_7.
4. Schäuble, Peter. "Automatic Speech Recognition". In Multimedia Information Retrieval, 61–120. Boston, MA: Springer US, 1997. http://dx.doi.org/10.1007/978-1-4615-6163-7_4.
5. Soltau, Hagen, George Saon, Lidia Mangu, Hong-Kwang Kuo, Brian Kingsbury, Stephen Chu, and Fadi Biadsy. "Automatic Speech Recognition". In Natural Language Processing of Semitic Languages, 409–59. Berlin, Heidelberg: Springer Berlin Heidelberg, 2014. http://dx.doi.org/10.1007/978-3-642-45358-8_13.
6. Chowdhary, K. R. "Automatic Speech Recognition". In Fundamentals of Artificial Intelligence, 651–68. New Delhi: Springer India, 2020. http://dx.doi.org/10.1007/978-81-322-3972-7_20.
7. Gruhn, Rainer E., Wolfgang Minker, and Satoshi Nakamura. "Automatic Speech Recognition". In Signals and Communication Technology, 5–17. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-19586-0_2.
8. Kamath, Uday, John Liu, and James Whitaker. "Automatic Speech Recognition". In Deep Learning for NLP and Speech Recognition, 369–404. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-14596-5_8.
9. Weik, Martin H. "automatic speech recognition". In Computer Science and Communications Dictionary, 88. Boston, MA: Springer US, 2000. http://dx.doi.org/10.1007/1-4020-0613-6_1147.
10. Potamianos, Gerasimos, Lori Lamel, Matthias Wölfel, Jing Huang, Etienne Marcheret, Claude Barras, Xuan Zhu, et al. "Automatic Speech Recognition". In Computers in the Human Interaction Loop, 43–59. London: Springer London, 2009. http://dx.doi.org/10.1007/978-1-84882-054-8_6.

Conference papers on "Automatic speech recognition"

1. O'Shaughnessy, Douglas. "Automatic speech recognition". In 2015 Chilean Conference on Electrical, Electronics Engineering, Information and Communication Technologies (CHILECON). IEEE, 2015. http://dx.doi.org/10.1109/chilecon.2015.7400411.
2. Glasser, Abraham. "Automatic Speech Recognition Services". In CHI '19: CHI Conference on Human Factors in Computing Systems. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3290607.3308461.
3. Catariov, Alexandru. "Automatic speech recognition systems". In Chisinau - DL tentative, edited by Andrei M. Andriesh and Veacheslav L. Perju. SPIE, 2005. http://dx.doi.org/10.1117/12.612047.
4. Paulik, M., S. Stuker, C. Fugen, T. Schultz, T. Schaaf, and A. Waibel. "Speech translation enhanced automatic speech recognition". In IEEE Workshop on Automatic Speech Recognition and Understanding, 2005. IEEE, 2005. http://dx.doi.org/10.1109/asru.2005.1566488.
5. Ahmed, Basem H. A., and Ayman S. Ghabayen. "Arabic Automatic Speech Recognition Enhancement". In 2017 Palestinian International Conference on Information and Communication Technology (PICICT). IEEE, 2017. http://dx.doi.org/10.1109/picict.2017.12.
6. Adi, Derry Pramono, Agustinus Bimo Gumelar, and Ralin Pramasuri Arta Meisa. "Interlanguage of Automatic Speech Recognition". In 2019 International Seminar on Application for Technology of Information and Communication (iSemantic). IEEE, 2019. http://dx.doi.org/10.1109/isemantic.2019.8884310.
7. Anoop, C. S., and A. G. Ramakrishnan. "Automatic Speech Recognition for Sanskrit". In 2019 2nd International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT). IEEE, 2019. http://dx.doi.org/10.1109/icicict46008.2019.8993283.
8. Munteanu, Cosmin, Gerald Penn, Ron Baecker, and Yuecheng Zhang. "Automatic speech recognition for webcasts". In the 8th international conference. New York, New York, USA: ACM Press, 2006. http://dx.doi.org/10.1145/1180995.1181005.
9. Potamianos, Alexandros, Shrikanth Narayanan, and Sungbok Lee. "Automatic speech recognition for children". In 5th European Conference on Speech Communication and Technology (Eurospeech 1997). ISCA, 1997. http://dx.doi.org/10.21437/eurospeech.1997-623.
10. Chen, C. Julian. "Speech recognition with automatic punctuation". In 6th European Conference on Speech Communication and Technology (Eurospeech 1999). ISCA, 1999. http://dx.doi.org/10.21437/eurospeech.1999-115.

Reports on "Automatic speech recognition"

1. Clements, Mark A., John H. Hansen, Kathleen E. Cummings, and Sungjae Lim. Automatic Recognition of Speech in Stressful Environments. Fort Belvoir, VA: Defense Technical Information Center, August 1991. http://dx.doi.org/10.21236/ada242917.
2. Brown, Peter F. The Acoustic-Modeling Problem in Automatic Speech Recognition. Fort Belvoir, VA: Defense Technical Information Center, December 1987. http://dx.doi.org/10.21236/ada188529.
3. Vergyri, Dimitra, and Katrin Kirchhoff. Automatic Diacritization of Arabic for Acoustic Modeling in Speech Recognition. Fort Belvoir, VA: Defense Technical Information Center, January 2004. http://dx.doi.org/10.21236/ada457846.
4. Bass, James D. Advancing Noise Robust Automatic Speech Recognition for Command and Control Applications. Fort Belvoir, VA: Defense Technical Information Center, March 2006. http://dx.doi.org/10.21236/ada461436.
5. Stevenson, G. Analysis of Pre-Trained Deep Neural Networks for Large-Vocabulary Automatic Speech Recognition. Office of Scientific and Technical Information (OSTI), July 2016. http://dx.doi.org/10.2172/1289367.
6. Fatehifar, Mohsen, Josef Schlittenlacher, David Wong, and Kevin Munro. Applications of Automatic Speech Recognition and Text-to-Speech Models to Detect Hearing Loss: A Scoping Review Protocol. INPLASY - International Platform of Registered Systematic Review and Meta-analysis Protocols, January 2023. http://dx.doi.org/10.37766/inplasy2023.1.0029.

Abstract:
Review question / Objective: This scoping review aims to identify published methods that have used automatic speech recognition or text-to-speech technologies to detect hearing loss, and to report on their accuracy and limitations. Condition being studied: Hearing enables us to communicate with the surrounding world. According to reports by the World Health Organization, 1.5 billion people suffer from some degree of hearing loss, of whom 430 million require medical attention. It is estimated that by 2050, 1 in every 4 people will experience some sort of hearing disability. Hearing loss can significantly impact people's ability to communicate and makes social interactions a challenge. In addition, it can result in anxiety, isolation, depression, hindrance of learning, and a decrease in general quality of life. A hearing assessment is usually done in hospitals and clinics with special equipment and trained staff. However, these services are not always available in less developed countries. Even in developed countries, like the UK, access to these facilities can be a challenge in rural areas. Moreover, during a crisis like the Covid-19 pandemic, accessing the required healthcare can become dangerous and challenging even in large cities.
7

Oran, D. Requirements for Distributed Control of Automatic Speech Recognition (ASR), Speaker Identification/Speaker Verification (SI/SV), and Text-to-Speech (TTS) Resources. RFC Editor, December 2005. http://dx.doi.org/10.17487/rfc4313.

8

Tao, Yang, Amos Mizrach, Victor Alchanatis, Nachshon Shamir and Tom Porter. Automated imaging broiler chick sexing for gender-specific and efficient production. United States Department of Agriculture, December 2014. http://dx.doi.org/10.32747/2014.7594391.bard.

Abstract:
Extending the previous two years of research results (Mizrach et al., 2012; Tao, 2011, 2012), the third year’s efforts in both Maryland and Israel were directed towards the engineering of the system. The activities included the development of a robust chick-handling conveyor system, optical system improvement, online dynamic motion imaging of chicks, optimal feather extraction and detection from multi-image sequences, and pattern recognition. Mechanical System Engineering: The third model of the mechanical chick-handling system with a high-speed imaging system was built as shown in Fig. 1. This system has improved chick-holding cups and motion mechanisms that enable chicks to open their wings through the viewing section. The mechanical system achieved a speed of 4 chicks per second, which exceeds the design specification of 3 chicks per second. In the center of the conveyor, a high-speed camera with a UV-sensitive optical system, shown in Fig. 2, was installed; it captures multiple frames per chick (45 images, system-selectable) as the chick passes through the viewing area. Through intensive discussions and efforts, the PIs of Maryland and ARO created a joint hardware and software protocol that uses sequential images of the chick during its fall to capture the opening wings and extract the optimal opening positions. This approach enables reliable feather-feature extraction in dynamic motion and reliable pattern recognition. Improving Chick Wing Deployment: The mechanical system for chick conveying, and especially the section that causes chicks to deploy their wings wide open under the fast video camera and the UV light, was investigated during the third study year. As a natural behavior, chicks tend to deploy their wings as a means of balancing their body when a sudden change in vertical movement is applied. In the previous two years, this was achieved by causing the chicks to move in free fall, under earth gravity (g), along a short vertical distance.
The chicks always tended to deploy their wings, but not always in a wide, horizontally open position. Such a position is required in order to obtain a successful image under the video camera. In addition, the cells carrying the chicks bumped suddenly at the end of the free-fall path, which caused the chicks’ legs to collapse inside the cells and the image of the wings to become blurred. To improve the movement and prevent the chicks’ legs from collapsing, a slowing-down mechanism was designed and tested. This was done by installing a plastic block, printed with a predesigned variable slope (Fig. 3), at the end of the path of the falling cells (Fig. 4). The cells move down at a variable velocity according to the block slope and reach zero velocity at the end of the path. The slope was designed so that the deceleration becomes 0.8g instead of the free-fall gravity (g) that acts without the block. The tests showed better deployment and wider opening of the chicks’ wings, as well as better balance along the movement. The design of additional block-slope sizes is under investigation: slopes that create accelerations of 0.7g, 0.9g, and variable accelerations are being designed to improve the movement path and images.
Style APA, Harvard, Vancouver, ISO itp.
9

Issues in Data Processing and Relevant Population Selection. OSAC Speaker Recognition Subcommittee, November 2022. http://dx.doi.org/10.29325/osac.tg.0006.

Abstract:
In Forensic Automatic Speaker Recognition (FASR), forensic examiners typically compare audio recordings of a speaker whose identity is in question with recordings of known speakers to assist investigators and triers of fact in a legal proceeding. The performance of automated speaker recognition (SR) systems used for this purpose depends largely on the characteristics of the speech samples being compared. Examiners must understand the requirements of the specific systems in use, as well as the audio characteristics that impact system performance. Mismatch conditions between the known and questioned data samples are of particular importance, but the need for, and impact of, audio pre-processing must also be understood. The data selected for use in a relevant population can also be critical to the performance of the system. This document describes issues that arise in the processing of case data and in the selection of a relevant population for purposes of conducting an examination using a human-supervised automatic speaker recognition approach in a forensic context. The document is intended to comply with the Organization of Scientific Area Committees (OSAC) for Forensic Science Technical Guidance Document requirements.