Journal articles on the topic "Speech processing systems"

To see other types of publications on this topic, follow the link: Speech processing systems.

Consult the top 50 journal articles for your research on the topic "Speech processing systems".

Browse journal articles across diverse disciplines and organize your bibliography correctly.

1

Ibragimova, Sayora. « THE ADVANTAGE OF THE WAVELET TRANSFORM IN PROCESSING OF SPEECH SIGNALS ». Technical Sciences 4, no 3 (30 mars 2021) : 37–41. http://dx.doi.org/10.26739/2181-9696-2021-3-6.

Abstract:
This work covers the basic theory of the wavelet transform and multi-scale analysis of speech signals, and briefly reviews the main differences between the wavelet transform and the Fourier transform in speech-signal analysis. It discusses the possibilities of applying wavelet analysis in speech recognition systems and its main advantages. In most existing recognition and analysis systems, the speech sound is treated as a stream of vectors whose elements are frequency responses; real-time speech processing with sequential algorithms therefore requires high-performance computing resources. Examples are given of how this method can be used to process speech signals and to build reference patterns for recognition systems. Key words: digital signal processing, Fourier transform, wavelet analysis, speech signal, wavelet transform
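To illustrate the multi-scale idea the abstract describes (a generic sketch, not code from the paper), one level of the Haar wavelet transform splits a signal into coarse approximation and detail coefficients; unlike a Fourier coefficient, each detail coefficient is localized in time, so a transient only perturbs coefficients near its position:

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar wavelet transform.

    Returns (approximation, detail) coefficient lists; the input
    length is assumed even for simplicity.
    """
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

# Recursively decomposing the approximation gives the multi-scale analysis.
a, d = haar_dwt([4.0, 6.0, 10.0, 12.0])
```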
2

Dasarathy, Belur V. « Robust speech processing ». Information Fusion 5, no 2 (juin 2004) : 75. http://dx.doi.org/10.1016/j.inffus.2004.02.002.

3

Thompson, Laura A., et William C. Ogden. « Visible speech improves human language understanding : Implications for speech processing systems ». Artificial Intelligence Review 9, no 4-5 (octobre 1995) : 347–58. http://dx.doi.org/10.1007/bf00849044.

4

Scott, Sophie K., et Carolyn McGettigan. « The neural processing of masked speech ». Hearing Research 303 (septembre 2013) : 58–66. http://dx.doi.org/10.1016/j.heares.2013.05.001.

5

Tasbolatov, M., N. Mekebayev, O. Mamyrbayev, M. Turdalyuly et D. Oralbekova. « Algorithms and architectures of speech recognition systems ». Psychology and Education Journal 58, no 2 (20 février 2021) : 6497–501. http://dx.doi.org/10.17762/pae.v58i2.3182.

Abstract:
Digital processing of the speech signal and the voice recognition algorithm are very important for fast and accurate automatic speech recognition. A voice is a signal carrying a wealth of information, and direct analysis and synthesis of the complex speech signal is difficult because the information is contained in the signal itself. Speech is the most natural way for people to communicate. The task of speech recognition is to convert speech into a sequence of words using a computer program. This article presents an algorithm for extracting MFCCs for speech recognition. The MFCC algorithm reduces the required processing power by 53% compared to the conventional algorithm. Automatic speech recognition is implemented using Matlab.
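The filterbank at the heart of MFCC extraction is built on the standard mel scale; the following is a minimal sketch of that frequency warping (illustrative only, not the article's Matlab implementation, and the 8 kHz upper edge and 26-filter count are assumed values):

```python
import math

def hz_to_mel(f_hz):
    """Standard mel-scale mapping used when spacing MFCC filterbanks."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse mapping, used to place triangular filters back on the Hz axis."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# Filter center frequencies are spaced uniformly in mel, not in Hz,
# which concentrates resolution at low frequencies as the ear does.
low, high, n_filters = hz_to_mel(0.0), hz_to_mel(8000.0), 26
centers_hz = [mel_to_hz(low + i * (high - low) / (n_filters + 1))
              for i in range(1, n_filters + 1)]
```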
6

Funakoshi, Kotaro, Takenobu Tokunaga et Hozumi Tanaka. « Processing Japanese Self-correction in Speech Dialog Systems ». Journal of Natural Language Processing 10, no 4 (2003) : 33–53. http://dx.doi.org/10.5715/jnlp.10.4_33.

7

Delic, Vlado, Darko Pekar, Radovan Obradovic et Milan Secujski. « Speech signal processing in ASR&TTS algorithms ». Facta universitatis - series : Electronics and Energetics 16, no 3 (2003) : 355–64. http://dx.doi.org/10.2298/fuee0303355d.

Abstract:
Speech signal processing and modeling in systems for continuous speech recognition and text-to-speech synthesis in the Serbian language are described in this paper. Both systems were fully developed by the authors and do not use any third-party software. The accuracy of the speech recognizer and the intelligibility of the TTS system are in the range of the best solutions in the world, and all conditions are met for commercial use of these solutions.
8

Hills, A., et K. Scott. « Perceived degradation effects in packet speech systems ». IEEE Transactions on Acoustics, Speech, and Signal Processing 35, no 5 (mai 1987) : 699–701. http://dx.doi.org/10.1109/tassp.1987.1165187.

9

Gransier, Robin, et Jan Wouters. « Neural auditory processing of parameterized speech envelopes ». Hearing Research 412 (décembre 2021) : 108374. http://dx.doi.org/10.1016/j.heares.2021.108374.

10

Chen, Yinchun. « A hidden Markov optimization model for processing and recognition of English speech feature signals ». Journal of Intelligent Systems 31, no 1 (1 janvier 2022) : 716–25. http://dx.doi.org/10.1515/jisys-2022-0057.

Abstract:
Abstract Speech recognition plays an important role in human–computer interaction. The higher the accuracy and efficiency of speech recognition are, the larger the improvement of human–computer interaction performance. This article briefly introduced the hidden Markov model (HMM)-based English speech recognition algorithm and combined it with a back-propagation neural network (BPNN) to further improve the recognition accuracy and reduce the recognition time of English speech. Then, the BPNN-combined HMM algorithm was simulated and compared with the HMM algorithm and the BPNN algorithm. The results showed that increasing the number of test samples increased the word error rate and recognition time of the three speech recognition algorithms, among which the word error rate and recognition time of the BPNN-combined HMM algorithm were the lowest. In conclusion, the BPNN-combined HMM can effectively recognize English speeches, which provides a valid reference for intelligent recognition of English speeches by computers.
11

Moon, Todd K., Jacob H. Gunther, Cortnie Broadus, Wendy Hou et Nils Nelson. « Turbo Processing for Speech Recognition ». IEEE Transactions on Cybernetics 44, no 1 (janvier 2014) : 83–91. http://dx.doi.org/10.1109/tcyb.2013.2247593.

12

Kai, Atsuhiko, et Seiichi Nakagawa. « Comparison of continuous speech recognition systems with unknown-word processing for speech disfluencies ». Systems and Computers in Japan 29, no 9 (août 1998) : 43–53. http://dx.doi.org/10.1002/(sici)1520-684x(199808)29:9<43::aid-scj5>3.0.co;2-j.

13

Cecinati, Riccardo. « Integrated processing unit, particularly for connected speech recognition systems ». Journal of the Acoustical Society of America 92, no 2 (août 1992) : 1199–200. http://dx.doi.org/10.1121/1.403986.

14

Marshall, Stephen. « Processing of audio and visual speech for telecommunication systems ». Journal of Electronic Imaging 8, no 3 (1 juillet 1999) : 263. http://dx.doi.org/10.1117/1.482675.

15

Arnold, Tim, et Helen J. A. Fuller. « An Ergonomic Framework for Researching and Designing Speech Recognition Technologies in Health Care with an Emphasis on Safety ». Proceedings of the International Symposium on Human Factors and Ergonomics in Health Care 8, no 1 (septembre 2019) : 279–83. http://dx.doi.org/10.1177/2327857919081067.

Abstract:
Automatic speech recognition (ASR) systems and speech interfaces are becoming increasingly prevalent, including expanded use of these technologies to support work in health care. Computer-based speech processing has been extensively studied and developed over decades, and speech processing tools have been fine-tuned through the work of speech and language researchers. Researchers have described, and continue to describe, speech processing errors in medicine. The discussion provided in this paper proposes an ergonomic framework for speech recognition that expands and further describes this view of speech processing in supporting clinical work. With this end in mind, we hope to build on previous work and emphasize the need for increased human factors involvement in this area, while also facilitating the discussion of speech recognition in contexts that have been explored in the human factors domain. Human factors expertise can contribute by proactively describing and designing these critical interconnected socio-technical systems with error tolerance in mind.
16

Polifroni, Joseph, Imre Kiss et Stephanie Seneff. « Speech for Content Creation ». International Journal of Mobile Human Computer Interaction 3, no 2 (avril 2011) : 35–49. http://dx.doi.org/10.4018/jmhci.2011040103.

Abstract:
This paper proposes a paradigm for using speech to interact with computers, one that complements and extends traditional spoken dialogue systems: speech for content creation. The literature in automatic speech recognition (ASR), natural language processing (NLP), sentiment detection, and opinion mining is surveyed to argue that the time has come to use mobile devices to create content on-the-fly. Recent work in user modelling and recommender systems is examined to support the claim that using speech in this way can result in a useful interface to uniquely personalizable data. A data collection effort recently undertaken to help build a prototype system for spoken restaurant reviews is discussed. This vision critically depends on mobile technology, for enabling the creation of the content and for providing ancillary data to make its processing more relevant to individual users. This type of system can be of use where only limited speech processing is possible.
17

Järvinen, Kari. « Digital speech processing : Speech coding, synthesis, and recognition ». Signal Processing 30, no 1 (janvier 1993) : 133–34. http://dx.doi.org/10.1016/0165-1684(93)90056-g.

18

Auti, Dr Nisha, Atharva Pujari, Anagha Desai, Shreya Patil, Sanika Kshirsagar et Rutika Rindhe. « Advanced Audio Signal Processing for Speaker Recognition and Sentiment Analysis ». International Journal for Research in Applied Science and Engineering Technology 11, no 5 (31 mai 2023) : 1717–24. http://dx.doi.org/10.22214/ijraset.2023.51825.

Abstract:
Abstract: Automatic Speech Recognition (ASR) technology has revolutionized human-computer interaction by allowing users to communicate with computer interfaces using their voice in a natural way. Speaker recognition is a biometric recognition method that identifies individuals based on their unique speech signal, with potential applications in security, communication, and personalization. Sentiment analysis is a statistical method that analyzes unique acoustic properties of the speaker's voice to identify emotions or sentiments in speech. This allows for automated speech recognition systems to accurately categorize speech as Positive, Neutral, or Negative. While sentiment analysis has been developed for various languages, further research is required for regional languages. This project aims to improve the accuracy of automatic speech recognition systems by implementing advanced audio signal processing and sentiment analysis detection. The proposed system will identify the speaker's voice and analyze the audio signal to detect the context of speech, including the identification of foul language and aggressive speech. The system will be developed for the Marathi Language dataset, with potential for further development in other languages.
19

Varga, A., et F. Fallside. « A technique for using multipulse linear predictive speech synthesis in text-to-speech type systems ». IEEE Transactions on Acoustics, Speech, and Signal Processing 35, no 4 (avril 1987) : 586–87. http://dx.doi.org/10.1109/tassp.1987.1165151.

20

Moore, Thomas J., et Richard L. McKinley. « Research on Speech Processing for Military Avionics ». Proceedings of the Human Factors Society Annual Meeting 30, no 13 (septembre 1986) : 1331–35. http://dx.doi.org/10.1177/154193128603001321.

Abstract:
The Biological Acoustics Branch of the Armstrong Aerospace Medical Research Laboratory (AAMRL) is engaged in research in a number of speech related areas. This paper will describe the approach used to conduct research in the development and evaluation of military speech communication systems, mention the types of studies done using this approach and give examples of the types of data generated by these studies. Representative data will also be provided describing acoustic-phonetic changes that occur when speech is produced under acceleration.
21

Puder, Henning, et Gerhard Schmidt. « Applied speech and audio processing ». Signal Processing 86, no 6 (juin 2006) : 1121–23. http://dx.doi.org/10.1016/j.sigpro.2005.07.034.

22

Romeu, E. S., et V. I. Syryamkin. « Possibilities for applied joint speech processing and computer vision systems ». IOP Conference Series : Materials Science and Engineering 516 (26 avril 2019) : 012044. http://dx.doi.org/10.1088/1757-899x/516/1/012044.

23

Bonte, Milene, Anke Ley, Wolfgang Scharke et Elia Formisano. « Developmental refinement of cortical systems for speech and voice processing ». NeuroImage 128 (mars 2016) : 373–84. http://dx.doi.org/10.1016/j.neuroimage.2016.01.015.

24

Savchenko, L. V., et A. V. Savchenko. « Fuzzy Phonetic Encoding of Speech Signals in Voice Processing Systems ». Journal of Communications Technology and Electronics 64, no 3 (mars 2019) : 238–44. http://dx.doi.org/10.1134/s1064226919030173.

25

Chen, Tsuhan. « Video signal processing systems and methods utilizing automated speech analysis ». Journal of the Acoustical Society of America 112, no 2 (2002) : 368. http://dx.doi.org/10.1121/1.1507005.

26

Weinstein, C. J. « Opportunities for advanced speech processing in military computer-based systems ». Proceedings of the IEEE 79, no 11 (1991) : 1626–41. http://dx.doi.org/10.1109/5.118986.

27

Fadel, Wiam, Toumi Bouchentouf, Pierre-André Buvet et Omar Bourja. « Adapting Off-the-Shelf Speech Recognition Systems for Novel Words ». Information 14, no 3 (13 mars 2023) : 179. http://dx.doi.org/10.3390/info14030179.

Abstract:
Current speech recognition systems with fixed vocabularies have difficulties recognizing Out-of-Vocabulary words (OOVs) such as proper nouns and new words. This leads to misunderstandings or even failures in dialog systems. Ensuring effective speech recognition is crucial for the proper functioning of robot assistants. Non-native accents, new vocabulary, and aging voices can cause malfunctions in a speech recognition system. If this task is not executed correctly, the assistant robot will inevitably produce false or random responses. In this paper, we used a statistical approach based on distance algorithms to improve OOV correction. We developed a post-processing algorithm to be combined with a speech recognition model. In this sense, we compared two distance algorithms: Damerau–Levenshtein and Levenshtein distance. We validated the performance of the two distance algorithms in conjunction with five off-the-shelf speech recognition models. Damerau–Levenshtein, as compared to the Levenshtein distance algorithm, succeeded in minimizing the Word Error Rate (WER) when using the MoroccanFrench test set with five speech recognition systems, namely VOSK API, Google API, Wav2vec2.0, SpeechBrain, and Quartznet pre-trained models. Our post-processing method works regardless of the architecture of the speech recognizer, and its results on our MoroccanFrench test set outperformed the five chosen off-the-shelf speech recognizer systems.
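A minimal sketch of the distance the authors favor, in its restricted "optimal string alignment" form of Damerau–Levenshtein (a standard dynamic-programming formulation, not their exact implementation; the vocabulary below is hypothetical):

```python
def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance:
    edits are insertion, deletion, substitution, and adjacent transposition."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]

# Post-processing idea: snap a misrecognized word to the closest
# in-vocabulary entry (hypothetical word list).
vocab = ["casablanca", "rabat", "tangier"]
correction = min(vocab, key=lambda w: osa_distance("casablanac", w))
```

Counting an adjacent transposition as one edit instead of two is what lets this variant outperform plain Levenshtein on swapped-letter recognition errors.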
28

Salman, Hayder Mahmood, Vian S. Al Al-Doori, Hayder Sharif, Wasfi Hameed et Rusul S. Bader. « Accurate Recognition of Natural language Using Machine Learning and Feature Fusion Processing ». Fusion : Practice and Applications 10, no 1 (2023) : 128–42. http://dx.doi.org/10.54216/fpa.100108.

Abstract:
To enhance the performance of Chinese language pronunciation evaluation and speech recognition systems, researchers are focusing on developing intelligent techniques for multilevel fusion processing of data, features, and decisions using deep learning-based computer-aided systems. With a combination of score-level, rank-level, and hybrid-level fusion, as well as fusion optimization and fusion score improvement, these systems can effectively combine multiple models and sensors to improve the accuracy of information fusion. Additionally, intelligent systems for information fusion, including those used in robotics and decision-making, can benefit from techniques such as multimedia data fusion and machine learning for data fusion. Furthermore, optimization algorithms and fuzzy approaches can be applied to data fusion applications in cloud environments and e-systems, while spatial data fusion can be used to enhance the quality of image and feature data. In this paper, a new approach is presented to identify the tonal language in continuous speech. This study proposes the Machine learning-assisted automatic speech recognition framework (ML-ASRF) for Chinese character and language prediction. Our focus is on extracting highly robust features and combining various speech signal sequences of deep models. The experimental results demonstrated that the machine learning neural network recognition rate is considerably higher than that of the conventional speech recognition algorithm, enabling more accurate human-computer interaction and increasing the efficiency of determining Chinese language pronunciation accuracy.
29

Jamieson, Donald G., Vijay Parsa, Moneca C. Price et James Till. « Interaction of Speech Coders and Atypical Speech II ». Journal of Speech, Language, and Hearing Research 45, no 4 (août 2002) : 689–99. http://dx.doi.org/10.1044/1092-4388(2002/055).

Abstract:
We investigated how standard speech coders, currently used in modern communication systems, affect the quality of the speech of persons who have common speech and voice disorders. Three standardized speech coders (GSM 6.10 RPELTP, FS1016 CELP, and FS1015 LPC) and two speech coders based on subband processing were evaluated for their performance. Coder effects were assessed by measuring the quality of speech samples both before and after processing by the speech coders. Speech quality was rated by 10 listeners with normal hearing on 28 different scales representing pitch and loudness changes, speech rate, laryngeal and resonatory dysfunction, and coder-induced distortions. Results showed that (a) nine scale items were consistently and reliably rated by the listeners; (b) all coders degraded speech quality on these nine scales, with the GSM and CELP coders providing the better quality speech; and (c) interactions between coders and individual voices did occur on several voice quality scales.
30

Ali Abumalloh, Rabab, Hasan Muaidi Al-Serhan, Othman Bin Ibrahim et Waheeb Abu-Ulbeh. « Arabic Part-of-Speech Tagger, an Approach Based on Neural Network Modelling ». International Journal of Engineering & Technology 7, no 2.29 (22 mai 2018) : 742. http://dx.doi.org/10.14419/ijet.v7i2.29.14009.

Abstract:
POS-tagging has gained the interest of researchers in computational linguistics in recent years. Part-of-speech tagging systems automatically assign the proper grammatical tag or morpho-syntactic category label to every word in a corpus according to its appearance in the text. POS-tagging serves as a fundamental and preliminary step in linguistic analysis that can help in developing many natural language processing applications, such as word processing systems, spell checking systems, dictionary building, and parsing systems. The Arabic language has attracted researchers' interest, leading to increasing demand for Arabic natural language processing systems. Artificial neural networks have been applied in many applications, such as speech recognition and part-of-speech prediction, but they are considered a new approach in part-of-speech tagging. In this research, we developed an Arabic POS-tagger using an artificial neural network. A corpus of 20,620 words, manually assigned the appropriate tags, was developed and used to train the artificial neural network and to test the tagger's overall performance. The accuracy of the developed tagger reaches 89.04% on the testing dataset and 98.94% on the training dataset. Combining the two datasets, the accuracy rate for the whole system is 96.96%.
31

de Abreu, Caio Cesar Enside, Marco Aparecido Queiroz Duarte, Bruno Rodrigues de Oliveira, Jozue Vieira Filho et Francisco Villarreal. « Regression-Based Noise Modeling for Speech Signal Processing ». Fluctuation and Noise Letters 20, no 03 (30 janvier 2021) : 2150022. http://dx.doi.org/10.1142/s021947752150022x.

Abstract:
Speech processing systems are very important in different applications involving speech and voice quality such as automatic speech recognition, forensic phonetics and speech enhancement, among others. In most of them, the acoustic environmental noise is added to the original signal, decreasing the signal-to-noise ratio (SNR) and the speech quality by consequence. Therefore, estimating noise is one of the most important steps in speech processing, whether to reduce it before processing or to design robust algorithms. In this paper, a new approach to estimate noise from speech signals is presented and its effectiveness is tested in the speech enhancement context. For this purpose, partial least squares (PLS) regression is used to model the acoustic environment (AE) and a Wiener filter based on a priori SNR estimation is implemented to evaluate the proposed approach. Six noise types are used to create seven acoustically modeled noises. The basic idea is to consider the AE model to identify the noise type and estimate its power to be used in a speech processing system. Speech signals processed using the proposed method and classical noise estimators are evaluated through objective measures. Results show that the proposed method produces better speech quality than state-of-the-art noise estimators, enabling it to be used in real-time applications in the fields of robotics, telecommunications and acoustic analysis.
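The Wiener filter mentioned in the abstract has a simple closed form: given an a priori SNR estimate ξ for a frequency bin, the gain applied to that bin's noisy amplitude is ξ/(1+ξ). The sketch below is illustrative only (it uses a crude spectral-subtraction SNR estimate, not the authors' PLS-based noise model, and real systems smooth ξ over time, e.g. with decision-directed estimation):

```python
def wiener_gain(xi):
    """Wiener gain for a priori SNR xi (linear scale, xi >= 0)."""
    return xi / (1.0 + xi)

def enhance_spectrum(noisy_mags, noise_powers):
    """Scale noisy magnitude bins by the Wiener gain.

    xi is approximated per bin as (noisy power - estimated noise power),
    floored at zero, divided by the noise power.
    """
    out = []
    for mag, n_pow in zip(noisy_mags, noise_powers):
        xi = max(mag * mag - n_pow, 0.0) / n_pow if n_pow > 0 else 1e6
        out.append(wiener_gain(xi) * mag)
    return out
```

The gain goes to 1 where speech dominates (large ξ) and to 0 where noise dominates, which is why an accurate noise-power estimate matters so much.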
32

Hu, J., C. C. Cheng et W. H. Liu. « Processing of speech signals using a microphone array for intelligent robots ». Proceedings of the Institution of Mechanical Engineers, Part I : Journal of Systems and Control Engineering 219, no 2 (1 mars 2005) : 133–43. http://dx.doi.org/10.1243/095965105x9461.

Abstract:
For intelligent robots to interact with people, an efficient human-robot communication interface is very important (e.g. voice command). However, recognizing voice command or speech represents only part of speech communication. The physics of speech signals includes other information, such as speaker direction. Secondly, a basic element of processing the speech signal is recognition at the acoustic level. However, the performance of recognition depends greatly on the reception. In a noisy environment, the success rate can be very poor. As a result, prior to speech recognition, it is important to process the speech signals to extract the needed content while rejecting others (such as background noise). This paper presents a speech purification system for robots to improve the signal-to-noise ratio of reception and an algorithm with a multidirection calibration beamformer.
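The core of a microphone-array front end like the one described can be illustrated with delay-and-sum beamforming (a toy integer-delay sketch, not the paper's multidirection calibration beamformer): delaying each channel so that the target direction lines up, then averaging, reinforces the speech while averaging down uncorrelated noise.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: equal-length lists of samples; delays[k] is the number of
    samples by which channel k lags the reference direction.
    """
    n = len(channels[0])
    out = [0.0] * n
    for ch, d in zip(channels, delays):
        for i in range(n):
            j = i + d
            if 0 <= j < n:
                out[i] += ch[j]
    return [v / len(channels) for v in out]

# Two microphones hear the same impulse one sample apart;
# after alignment the impulse adds coherently.
mic1 = [0.0, 0.0, 1.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 0.0, 1.0, 0.0]
beam = delay_and_sum([mic1, mic2], [0, 1])
```

Real systems estimate the delays from the speaker direction (fractional delays need interpolation), which is exactly the calibration problem the paper addresses.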
33

Ungureanu, Dan, Stefan-Adrian Toma, Ion-Dorinel Filip, Bogdan-Costel Mocanu, Iulian Aciobăniței, Bogdan Marghescu, Titus Balan, Mihai Dascalu, Ion Bica et Florin Pop. « ODIN112–AI-Assisted Emergency Services in Romania ». Applied Sciences 13, no 1 (3 janvier 2023) : 639. http://dx.doi.org/10.3390/app13010639.

Abstract:
The evolution of Natural Language Processing technologies transformed them into viable choices for various accessibility features and for facilitating interactions between humans and computers. A subset of them consists of speech processing systems, such as Automatic Speech Recognition, which became more accurate and more popular as a result. In this article, we introduce an architecture built around various speech processing systems to enhance Romanian emergency services. Our system is designed to help the operator evaluate various situations with the end goal of reducing the response times of emergency services. We also release the largest high-quality speech dataset of more than 150 h for Romanian. Our architecture includes an Automatic Speech Recognition model to transcribe calls automatically and augment the operator's notes, as well as a speech emotion recognition model to classify the caller's emotions. We achieve state-of-the-art results on both tasks, while our demonstrator is designed to be integrated with the Romanian emergency system.
34

Chien, Jen-Tzung, et Man-Wai Mak. « Guest Editorial : Modern Speech Processing and Learning ». Journal of Signal Processing Systems 92, no 8 (9 juillet 2020) : 775–76. http://dx.doi.org/10.1007/s11265-020-01577-4.

35

Yu, Sabrina, Sherryse Corrow, Jason JS Barton et Andrea Albonico. « Facial Identity And Facial Speech Processing In Developmental Prosopagnosia ». Journal of Vision 22, no 14 (5 décembre 2022) : 3422. http://dx.doi.org/10.1167/jov.22.14.3422.

36

Smither, Janan Al-Awar. « The Processing of Synthetic Speech by Older and Younger Adults ». Proceedings of the Human Factors Society Annual Meeting 36, no 2 (octobre 1992) : 190–92. http://dx.doi.org/10.1177/154193129203600211.

Abstract:
This experiment investigated the demands synthetic speech places on short term memory by comparing performance of old and young adults on an ordinary short term memory task. Items presented were generated by a human speaker or by a text-to-speech computer synthesizer. Results were consistent with the idea that the comprehension of synthetic speech imposes increased resource demands on the short term memory system. Older subjects performed significantly more poorly than younger subjects, and both groups performed more poorly with synthetic than with human speech. Findings suggest that short term memory demands imposed by the processing of synthetic speech should be investigated further, particularly regarding the implementation of voice response systems in devices for the elderly.
37

Kosarev, Y. « Synergetics and 'insight' strategy for speech processing ». Literary and Linguistic Computing 12, no 2 (1 juin 1997) : 113–18. http://dx.doi.org/10.1093/llc/12.2.113.

38

Islam, Rumana, Esam Abdel-Raheem et Mohammed Tarique. « A Novel Pathological Voice Identification Technique through Simulated Cochlear Implant Processing Systems ». Applied Sciences 12, no 5 (25 février 2022) : 2398. http://dx.doi.org/10.3390/app12052398.

Abstract:
This paper presents a pathological voice identification system employing signal processing techniques through cochlear implant models. The fundamentals of the biological process for speech perception are investigated to develop this technique. Two cochlear implant models are considered in this work: one uses a conventional bank of bandpass filters, and the other one uses a bank of optimized gammatone filters. The critical center frequencies of those filters are selected to mimic the human cochlear vibration patterns caused by audio signals. The proposed system processes the speech samples and applies a CNN for final pathological voice identification. The results show that the two proposed models adopting bandpass and gammatone filterbanks can discriminate the pathological voices from healthy ones, resulting in F1 scores of 77.6% and 78.7%, respectively, with speech samples. The obtained results of this work are also compared with those of other related published works.
39

Yelle, Serena K., et Gina M. Grimshaw. « Hemispheric Specialization for Linguistic Processing of Sung Speech ». Perceptual and Motor Skills 108, no 1 (février 2009) : 219–28. http://dx.doi.org/10.2466/pms.108.1.219-228.

40

Murthy, Hema A., et B. Yegnanarayana. « Speech processing using group delay functions ». Signal Processing 22, no 3 (mars 1991) : 259–67. http://dx.doi.org/10.1016/0165-1684(91)90014-a.

41

Finke, Mareike, Pascale Sandmann, Hanna Bönitz, Andrej Kral et Andreas Büchner. « Consequences of Stimulus Type on Higher-Order Processing in Single-Sided Deaf Cochlear Implant Users ». Audiology and Neurotology 21, no 5 (2016) : 305–15. http://dx.doi.org/10.1159/000452123.

Abstract:
Single-sided deaf subjects with a cochlear implant (CI) provide the unique opportunity to compare central auditory processing of the electrical input (CI ear) and the acoustic input (normal-hearing, NH, ear) within the same individual. In these individuals, sensory processing differs between their two ears, while cognitive abilities are the same irrespectively of the sensory input. To better understand perceptual-cognitive factors modulating speech intelligibility with a CI, this electroencephalography study examined the central-auditory processing of words, the cognitive abilities, and the speech intelligibility in 10 postlingually single-sided deaf CI users. We found lower hit rates and prolonged response times for word classification during an oddball task for the CI ear when compared with the NH ear. Also, event-related potentials reflecting sensory (N1) and higher-order processing (N2/N4) were prolonged for word classification (targets versus nontargets) with the CI ear compared with the NH ear. Our results suggest that speech processing via the CI ear and the NH ear differs both at sensory (N1) and cognitive (N2/N4) processing stages, thereby affecting the behavioral performance for speech discrimination. These results provide objective evidence for cognition to be a key factor for speech perception under adverse listening conditions, such as the degraded speech signal provided from the CI.
42

Wingfield, Arthur, and Kimberly C. Lindfield. "Multiple Memory Systems in the Processing of Speech: Evidence from Aging." Experimental Aging Research 21, no. 2 (April 1995): 101–21. http://dx.doi.org/10.1080/03610739508254272.

Full text
APA, Harvard, Vancouver, ISO, etc. styles
43

Jamal, Marwa, and Tariq A. Hassan. "Speech Coding Using Discrete Cosine Transform and Chaotic Map." Ingénierie des systèmes d'information 27, no. 4 (August 31, 2022): 673–77. http://dx.doi.org/10.18280/isi.270419.

Full text
APA, Harvard, Vancouver, ISO, etc. styles
Abstract:
Multimedia data has recently grown at an exponential rate, saturating everyday human life. Various data modalities, including images, text, and video, play important roles in many domains and have wide application. However, the key problem in utilizing large-scale data is the cost of processing and the massive storage required. Efficient communication and economical storage therefore demand effective data compression techniques that reduce the volume of data. Speech coding, the process of converting voice signals into a more compressed form, is a central problem in digital speech processing. In this work, we demonstrate that a DCT combined with a chaotic system and run-length coding can implement very low bit-rate speech coding with high reconstruction quality. Experimental results show a compression ratio of about 13% when the method is applied to the LibriSpeech dataset.
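The entry above describes combining the DCT with run-length coding for low bit-rate speech coding. As a rough illustration of that general idea (not the authors' codec: the chaotic-map stage is omitted, and the frame size and threshold below are arbitrary), a frame can be transformed with an orthonormal DCT-II, small coefficients zeroed, and the resulting zero runs collapsed by run-length encoding:

```python
import numpy as np

def dct2(x):
    # Orthonormal DCT-II of a 1-D frame
    N = len(x)
    n = np.arange(N)
    X = np.array([np.sum(x * np.cos(np.pi * k * (2 * n + 1) / (2 * N)))
                  for k in range(N)]) * np.sqrt(2.0 / N)
    X[0] /= np.sqrt(2.0)
    return X

def idct2(X):
    # Inverse of the orthonormal DCT-II above
    N = len(X)
    k = np.arange(N)
    scale = np.full(N, np.sqrt(2.0 / N))
    scale[0] = np.sqrt(1.0 / N)
    return np.array([np.sum(scale * X * np.cos(np.pi * k * (2 * n + 1) / (2 * N)))
                     for n in range(N)])

def rle_zeros(coeffs):
    # Run-length encode as (value, run) pairs; long zero runs compress well
    out, i = [], 0
    while i < len(coeffs):
        j = i
        while j < len(coeffs) and coeffs[j] == coeffs[i]:
            j += 1
        out.append((coeffs[i], j - i))
        i = j
    return out

# Toy example: one 32-sample frame of a low-frequency tone
t = np.arange(32)
frame = np.sin(2 * np.pi * t / 16)
X = dct2(frame)
X[np.abs(X) < 0.1] = 0.0      # keep only significant coefficients
code = rle_zeros(X)           # zero runs collapse into single pairs
recon = idct2(X)
print(len(code), "RLE pairs for", len(X), "coefficients")
print("max reconstruction error:", float(np.max(np.abs(recon - frame))))
```

Zeroing small coefficients is what creates the long zero runs that run-length encoding exploits; raising the threshold increases compression at the cost of reconstruction error.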
44

Resende, Natália, and Andy Way. "Can Google Translate Rewire Your L2 English Processing?" Digital 1, no. 1 (March 4, 2021): 66–85. http://dx.doi.org/10.3390/digital1010006.

Full text
APA, Harvard, Vancouver, ISO, etc. styles
Abstract:
In this article, we address the question of whether exposure to the translated output of MT systems could result in changes in the cognitive processing of English as a second language (L2 English). To answer this question, we first conducted a survey with 90 Brazilian Portuguese L2 English speakers with the aim of understanding how and for what purposes they use web-based MT systems. To investigate whether MT systems are capable of influencing L2 English cognitive processing, we carried out a syntactic priming experiment with 32 Brazilian Portuguese speakers. We wanted to test whether speakers re-use in their subsequent speech in English the same syntactic alternative previously seen in the MT output, when using the popular Google Translate system to translate sentences from Portuguese into English. The results of the survey show that Brazilian Portuguese L2 English speakers use Google Translate as a tool supporting their speech in English as well as a source of English vocabulary learning. The results of the syntactic priming experiment show that exposure to an English syntactic alternative through GT can lead to the re-use of the same syntactic alternative in subsequent speech even if it is not the speaker’s preferred syntactic alternative in English. These findings suggest that GT is being used as a tool for language learning purposes and so is indeed capable of rewiring the processing of L2 English syntax.
45

Ito, Takayuki, Alexis R. Johns, and David J. Ostry. "Left Lateralized Enhancement of Orofacial Somatosensory Processing Due to Speech Sounds." Journal of Speech, Language, and Hearing Research 56, no. 6 (December 2013): 1875–81. http://dx.doi.org/10.1044/1092-4388(2013/12-0226).

Full text
APA, Harvard, Vancouver, ISO, etc. styles
Abstract:
Purpose Somatosensory information associated with speech articulatory movements affects the perception of speech sounds and vice versa, suggesting an intimate linkage between speech production and perception systems. However, it is unclear which cortical processes are involved in the interaction between speech sounds and orofacial somatosensory inputs. The authors examined whether speech sounds modify orofacial somatosensory cortical potentials that were elicited using facial skin perturbations. Method Somatosensory event-related potentials in EEG were recorded in 3 background sound conditions (pink noise, speech sounds, and nonspeech sounds) and also in a silent condition. Facial skin deformations that are similar in timing and duration to those experienced in speech production were used for somatosensory stimulation. Results The authors found that speech sounds reliably enhanced the first negative peak of the somatosensory event-related potential when compared with the other 3 sound conditions. The enhancement was evident at electrode locations above the left motor and premotor area of the orofacial system. The result indicates that speech sounds interact with somatosensory cortical processes that are produced by speech-production-like patterns of facial skin stretch. Conclusion Neural circuits in the left hemisphere, presumably in left motor and premotor cortex, may play a prominent role in the interaction between auditory inputs and speech-relevant somatosensory processing.
46

Krizman, Jennifer, Erika Skoe, and Nina Kraus. "Stimulus Rate and Subcortical Auditory Processing of Speech." Audiology and Neurotology 15, no. 5 (2010): 332–42. http://dx.doi.org/10.1159/000289572.

Full text
APA, Harvard, Vancouver, ISO, etc. styles
47

Stork, David G. "Sources of Neural Structure in Speech and Language Processing." International Journal of Neural Systems 02, no. 03 (January 1991): 159–67. http://dx.doi.org/10.1142/s0129065791000157.

Full text
APA, Harvard, Vancouver, ISO, etc. styles
Abstract:
Because of the complexity and high dimensionality of the problem, speech recognition—perhaps more than any other problem of current interest in network research—will profit from human neurophysiology, psychoacoustics and psycholinguistics: approaches based exclusively on engineering principles will provide only limited benefits. Despite the great power of current learning algorithms in homogeneous or unstructured networks, a number of difficulties in speech recognition seem to indicate that homogeneous networks taken alone will be insufficient for the task, and that structure—representing constraints—will also be required. In the biological system, the sources of such structure include developmental and evolutionary effects. Recent considerations of the evolutionary sources of neural structure in the human speech and language systems, including models of the interrelationship between speech motor system and auditory system, are analyzed with special reference to neural network approaches.
48

Abdusalomov, Akmalbek Bobomirzaevich, Furkat Safarov, Mekhriddin Rakhimov, Boburkhon Turaev, and Taeg Keun Whangbo. "Improved Feature Parameter Extraction from Speech Signals Using Machine Learning Algorithm." Sensors 22, no. 21 (October 24, 2022): 8122. http://dx.doi.org/10.3390/s22218122.

Full text
APA, Harvard, Vancouver, ISO, etc. styles
Abstract:
Speech recognition refers to the capability of software or hardware to receive a speech signal, identify the speaker’s features in the speech signal, and recognize the speaker thereafter. In general, the speech recognition process involves three main steps: acoustic processing, feature extraction, and classification/recognition. The purpose of feature extraction is to illustrate a speech signal using a predetermined number of signal components. This is because all information in the acoustic signal is excessively cumbersome to handle, and some information is irrelevant in the identification task. This study proposes a machine learning-based approach that performs feature parameter extraction from speech signals to improve the performance of speech recognition applications in real-time smart city environments. Moreover, the principle of mapping a block of main memory to the cache is used efficiently to reduce computing time. The block size of cache memory is a parameter that strongly affects the cache performance. In particular, the implementation of such processes in real-time systems requires a high computation speed. Processing speed plays an important role in speech recognition in real-time systems. It requires the use of modern technologies and fast algorithms that increase the acceleration in extracting the feature parameters from speech signals. Problems with overclocking during the digital processing of speech signals have yet to be completely resolved. The experimental results demonstrate that the proposed method successfully extracts the signal features and achieves seamless classification performance compared to other conventional speech recognition algorithms.
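The abstract above centers on extracting feature parameters from speech signals. Below is a generic sketch of the framing, windowing, and spectral-energy steps common to such front ends; this is not the paper's algorithm, the cache-mapping optimization is not modeled, and the sample rate, frame length, and band count are illustrative:

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    # Split a 1-D signal into overlapping frames
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def spectral_features(x, sr=8000, frame_len=256, hop=128, n_bands=16):
    x = np.append(x[0], x[1:] - 0.97 * x[:-1])        # pre-emphasis filter
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # per-frame power spectrum
    # Pool FFT bins into coarse bands and take log band energies
    bins = np.array_split(np.arange(power.shape[1]), n_bands)
    feats = np.stack([power[:, b].sum(axis=1) for b in bins], axis=1)
    return np.log(feats + 1e-10)

# Toy usage: 1 s of a synthetic "speech-like" signal
sr = 8000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 180 * t) + 0.3 * np.sin(2 * np.pi * 1200 * t)
F = spectral_features(x, sr)
print(F.shape)   # (n_frames, n_bands)
```

Each row of the returned matrix is one frame's log band energies, the kind of fixed-length vector that a downstream classifier consumes.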
49

Ghezaiel, Wajdi, Amel Ben Slimane, and Ezzedine Ben Braiek. "On Usable Speech Detection by Linear Multi-Scale Decomposition for Speaker Identification." International Journal of Electrical and Computer Engineering (IJECE) 6, no. 6 (December 1, 2016): 2766. http://dx.doi.org/10.11591/ijece.v6i6.9844.

Full text
APA, Harvard, Vancouver, ISO, etc. styles
Abstract:
<p>Usable speech is a novel concept for processing co-channel speech data: it proposes to extract minimally corrupted speech that is considered useful for various speech processing systems. In this paper, we are interested in co-channel speaker identification (SID). We employ a newly proposed usable speech extraction method based on pitch information obtained from linear multi-scale decomposition by the discrete wavelet transform. The idea is to retain the speech segments in which only one pitch is detected and to remove the others. The detected usable speech was then used as input to a speaker identification system. The system is evaluated on co-channel speech, and the results show a significant improvement for the speaker identification system across various target-to-interferer ratios (TIR).</p>
50

Ghezaiel, Wajdi, Amel Ben Slimane, and Ezzedine Ben Braiek. "On Usable Speech Detection by Linear Multi-Scale Decomposition for Speaker Identification." International Journal of Electrical and Computer Engineering (IJECE) 6, no. 6 (December 1, 2016): 2766. http://dx.doi.org/10.11591/ijece.v6i6.pp2766-2772.

Full text
APA, Harvard, Vancouver, ISO, etc. styles
Abstract:
<p>Usable speech is a novel concept for processing co-channel speech data: it proposes to extract minimally corrupted speech that is considered useful for various speech processing systems. In this paper, we are interested in co-channel speaker identification (SID). We employ a newly proposed usable speech extraction method based on pitch information obtained from linear multi-scale decomposition by the discrete wavelet transform. The idea is to retain the speech segments in which only one pitch is detected and to remove the others. The detected usable speech was then used as input to a speaker identification system. The system is evaluated on co-channel speech, and the results show a significant improvement for the speaker identification system across various target-to-interferer ratios (TIR).</p>
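The two entries above retain co-channel segments in which exactly one pitch is detected. Below is a simplified sketch of that selection idea, substituting a plain autocorrelation pitch detector for the authors' wavelet-based method; the thresholds, frame sizes, and function names are all illustrative:

```python
import numpy as np

def pitch_candidates(frame, sr=8000, fmin=60, fmax=400, thresh=0.4):
    # Strong local maxima of the normalized autocorrelation in the pitch range
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] <= 0:
        return []
    ac = ac / ac[0]
    lo, hi = sr // fmax, sr // fmin
    return [l for l in range(lo, min(hi, len(ac) - 1))
            if ac[l] > thresh and ac[l] >= ac[l - 1] and ac[l] >= ac[l + 1]]

def distinct_pitches(lags, tol=2):
    # Collapse lags that are integer multiples of an already-found period,
    # so harmonically related peaks count as a single talker
    roots = []
    for l in sorted(lags):
        if not any(abs(l - round(l / r) * r) <= tol for r in roots):
            roots.append(l)
    return roots

def usable_frames(x, sr=8000, frame_len=400, hop=200):
    # Keep frame start indices where exactly one pitch period dominates
    keep = []
    for start in range(0, len(x) - frame_len + 1, hop):
        cands = pitch_candidates(x[start:start + frame_len], sr)
        if len(distinct_pitches(cands)) == 1:
            keep.append(start)
    return keep
```

Frames with no strong pitch candidate (silence or noise) and frames with two distinct periods (overlapped talkers) are both discarded; only single-pitch frames would be passed on to the speaker identification stage.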

To the bibliography