Selected scholarly literature on the topic "Speech Activity Detection (SAD)"
Browse the lists of current articles, books, dissertations, reports, and other scholarly sources on the topic "Speech Activity Detection (SAD)".
Journal articles on the topic "Speech Activity Detection (SAD)"
Kaur, Sukhvinder, and J. S. Sohal. "Speech Activity Detection and its Evaluation in Speaker Diarization System." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 16, no. 1 (March 13, 2017): 7567–72. http://dx.doi.org/10.24297/ijct.v16i1.5893.
Dutta, Satwik, Prasanna Kothalkar, Johanna Rudolph, Christine Dollaghan, Jennifer McGlothlin, Thomas Campbell, and John H. Hansen. "Advancing speech activity detection for automatic speech assessment of pre-school children prompted speech using COMBO-SAD." Journal of the Acoustical Society of America 148, no. 4 (October 2020): 2469–67. http://dx.doi.org/10.1121/1.5146831.
Mahalakshmi, P. "A REVIEW ON VOICE ACTIVITY DETECTION AND MEL-FREQUENCY CEPSTRAL COEFFICIENTS FOR SPEAKER RECOGNITION (TREND ANALYSIS)." Asian Journal of Pharmaceutical and Clinical Research 9, no. 9 (December 1, 2016): 360. http://dx.doi.org/10.22159/ajpcr.2016.v9s3.14352.
Zhao, Hui, Yu Tai Wang, and Xing Hai Yang. "Emotion Detection System Based on Speech and Facial Signals." Advanced Materials Research 459 (January 2012): 483–87. http://dx.doi.org/10.4028/www.scientific.net/amr.459.483.
Gelly, Gregory, and Jean-Luc Gauvain. "Optimization of RNN-Based Speech Activity Detection." IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, no. 3 (March 2018): 646–56. http://dx.doi.org/10.1109/taslp.2017.2769220.
Koh, Min‐sung, and Margaret Mortz. "Improved voice activity detection of noisy speech." Journal of the Acoustical Society of America 107, no. 5 (May 2000): 2907–8. http://dx.doi.org/10.1121/1.428823.
Quan, Changqin, Bin Zhang, Xiao Sun, and Fuji Ren. "A combined cepstral distance method for emotional speech recognition." International Journal of Advanced Robotic Systems 14, no. 4 (July 1, 2017): 172988141771983. http://dx.doi.org/10.1177/1729881417719836.
Dash, Debadatta, Paul Ferrari, Satwik Dutta, and Jun Wang. "NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals." Sensors 20, no. 8 (April 16, 2020): 2248. http://dx.doi.org/10.3390/s20082248.
Mattys, Sven L., and Jamie H. Clark. "Lexical activity in speech processing: evidence from pause detection." Journal of Memory and Language 47, no. 3 (October 2002): 343–59. http://dx.doi.org/10.1016/s0749-596x(02)00037-2.
Potamitis, I., and E. Fishler. "Speech activity detection of moving speaker using microphone arrays." Electronics Letters 39, no. 16 (2003): 1223. http://dx.doi.org/10.1049/el:20030726.
Dissertations on the topic "Speech Activity Detection (SAD)"
Näslund, Anton, and Charlie Jeansson. "Robust Speech Activity Detection and Direction of Arrival Using Convolutional Neural Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297756.
Social robots are becoming more and more common in our everyday lives. In conversational robotics, development is moving toward socially engaging robots that can hold human-like conversations. This project looks at one of the technical aspects of speech recognition, namely speech activity detection. The presented solution uses a system based on a convolutional neural network (CNN) to detect speech in a forward-facing azimuth range. The project used a dataset from FestVox called CMU Arctic, supplemented with a number of recorded noise sounds. A library called Pyroomacoustics was used to simulate a real-world environment in order to create a robust system. A simplified model that detected only speech activity was constructed and achieved an accuracy of 95%. The finished system achieved an accuracy of 93%. It was compared with a similar approach, the WebRTC voice activity detection (VAD) algorithm combined with beamforming, since no previously published solutions for our problem were found. Our solutions turned out to have higher accuracy than WebRTC achieved on our dataset.
Bachelor's degree project in electrical engineering 2020, KTH, Stockholm
Wejdelind, Marcus, and Nils Wägmark. "Multi-speaker Speech Activity Detection From Video." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297701.
A social robot will in many cases be forced to handle conversations with several interlocutors in which different people speak at the same time. To achieve this, it is important that the robot can identify the speaker so that it can subsequently assist or interact with that person. This project investigated the problem from a visual standpoint: a convolutional neural network (CNN) was implemented and trained on video input from an existing dataset (AVA-Speech). The goal of the network was to estimate, for every face and at every point in time, the probability that the person is speaking. Our best result using optical flow was 0.753, while we reached 0.781 with a different type of data preprocessing. These results matched the existing scientific literature in the field surprisingly well, where 0.77 has proven to be a suitable reference value.
Bachelor's degree project in electrical engineering 2020, KTH, Stockholm
Murrin, Paul. "Objective measurement of voice activity detectors." Thesis, University of York, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.325647.
Laverty, Stephen William. "Detection of Nonstationary Noise and Improved Voice Activity Detection in an Automotive Hands-free Environment." Link to electronic thesis, 2005. http://www.wpi.edu/Pubs/ETD/Available/etd-051105-110646/.
Minotto, Vicente Peruffo. "Audiovisual voice activity detection and localization of simultaneous speech sources." Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/77231.
Given the tendency of creating interfaces between humans and machines that increasingly allow simple ways of interaction, it is only natural that research effort is put into techniques that seek to simulate the most conventional means of communication humans use: speech. In the human auditory system, voice is automatically processed by the brain in an effortless and effective way, also commonly aided by visual cues, such as mouth movement and location of the speakers. This processing done by the brain includes two important components that speech-based communication requires: Voice Activity Detection (VAD) and Sound Source Localization (SSL). Consequently, VAD and SSL also serve as mandatory preprocessing tools for high-end Human Computer Interface (HCI) applications in a computing environment, as in the case of automatic speech recognition and speaker identification. However, VAD and SSL are still challenging problems when dealing with realistic acoustic scenarios, particularly in the presence of noise, reverberation and multiple simultaneous speakers. In this work we propose some approaches for tackling these problems using audiovisual information, both for the single source and the competing sources scenario, exploiting distinct ways of fusing the audio and video modalities. Our work also employs a microphone array for the audio processing, which allows the spatial information of the acoustic signals to be explored through the state-of-the-art method Steered Response Power (SRP). As an additional consequence, a very fast GPU version of the SRP is developed, so that real-time processing is achieved. Our experiments show an average accuracy of 95% when performing VAD of up to three simultaneous speakers and an average error of 10 cm when locating such speakers.
Ent, Petr. "Voice Activity Detection." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2009. http://www.nusl.cz/ntk/nusl-235483.
Cho, Yong Duk. "Speech detection, enhancement and compression for voice communications." Thesis, University of Surrey, 2001. http://epubs.surrey.ac.uk/842991/.
Doukas, Nikolaos. "Voice activity detection using energy based measures and source separation." Thesis, Imperial College London, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.245220.
Sinclair, Mark. "Speech segmentation and speaker diarisation for transcription and translation." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/20970.
Thorell, Hampus. "Voice Activity Detection in the Tiger Platform." Thesis, Linköping University, Department of Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-6586.
Sectra Communications AB has developed a terminal for encrypted communication called the Tiger platform. During voice communication, delays have sometimes been experienced, resulting in conversational complications.
A solution to this problem, as was proposed by Sectra, would be to introduce voice activity detection, which means a separation of speech parts and non-speech parts of the input signal, to the Tiger platform. By only transferring the speech parts to the receiver, the bandwidth needed should be dramatically decreased. A lower bandwidth needed implies that the delays slowly should disappear. The problem is then to come up with a method that manages to distinguish the speech parts from the input signal. Fortunately a lot of theory on the subject has been done and numerous voice activity methods exist today.
In this thesis the theory of voice activity detection has been studied. A review of voice activity detectors that exist on the market today followed by an evaluation of some of these was performed in order to select a suitable candidate for the Tiger platform. This evaluation would later become the foundation for the selection of a voice activity detector for implementation.
Finally, the implementation of the chosen voice activity detector, including a comfort noise generator, was done on the platform. This implementation was based on the special requirements of the platform. Tests of the implementation in office environments show that possible delays are steadily being reduced during periods of speech inactivity, while the active speech quality is preserved.
Book chapters on the topic "Speech Activity Detection (SAD)"
Alam, Tanvirul, and Akib Khan. "Lightweight CNN for Robust Voice Activity Detection." In Speech and Computer, 1–12. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60276-5_1.
Solé-Casals, Jordi, Pere Martí-Puig, Ramon Reig-Bolaño, and Vladimir Zaiats. "Score Function for Voice Activity Detection." In Advances in Nonlinear Speech Processing, 76–83. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-11509-7_10.
Pertilä, Pasi, Alessio Brutti, Piergiorgio Svaizer, and Maurizio Omologo. "Multichannel Source Activity Detection, Localization, and Tracking." In Audio Source Separation and Speech Enhancement, 47–64. Chichester, UK: John Wiley & Sons Ltd, 2018. http://dx.doi.org/10.1002/9781119279860.ch4.
Málek, Jiří, and Jindřich Žďánský. "Voice-Activity and Overlapped Speech Detection Using x-Vectors." In Text, Speech, and Dialogue, 366–76. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58323-1_40.
Huang, Zhongqiang, and Mary P. Harper. "Speech Activity Detection on Multichannels of Meeting Recordings." In Machine Learning for Multimodal Interaction, 415–27. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11677482_35.
Zelinka, Jan. "Deep Learning and Online Speech Activity Detection for Czech Radio Broadcasting." In Text, Speech, and Dialogue, 428–35. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-00794-2_46.
Chu, Stephen M., Etienne Marcheret, and Gerasimos Potamianos. "Automatic Speech Recognition and Speech Activity Detection in the CHIL Smart Room." In Machine Learning for Multimodal Interaction, 332–43. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11677482_29.
Górriz, J. M., C. G. Puntonet, J. Ramírez, and J. C. Segura. "Bispectrum Estimators for Voice Activity Detection and Speech Recognition." In Nonlinear Analyses and Algorithms for Speech Processing, 174–85. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11613107_15.
Macho, Dušan, Climent Nadeu, and Andrey Temko. "Robust Speech Activity Detection in Interactive Smart-Room Environments." In Machine Learning for Multimodal Interaction, 236–47. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11965152_21.
Honarmandi Shandiz, Amin, and László Tóth. "Voice Activity Detection for Ultrasound-Based Silent Speech Interfaces Using Convolutional Neural Networks." In Text, Speech, and Dialogue, 499–510. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-83527-9_43.
Conference papers on the topic "Speech Activity Detection (SAD)"
Abdulla, Waleed H., Zhou Guan, and Hou Chi Sou. "Noise robust speech activity detection." In 2009 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2009. http://dx.doi.org/10.1109/isspit.2009.5407509.
Matic, A., V. Osmani, and O. Mayora. "Speech activity detection using accelerometer." In 2012 34th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, 2012. http://dx.doi.org/10.1109/embc.2012.6346377.
Tsai, TJ, and Nelson Morgan. "Speech activity detection: An economics approach." In ICASSP 2013 - 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2013. http://dx.doi.org/10.1109/icassp.2013.6638987.
Khoury, Elie, and Matt Garland. "I-Vectors for speech activity detection." In Odyssey 2016. ISCA, 2016. http://dx.doi.org/10.21437/odyssey.2016-48.
K, Punnoose A. "New Features for Speech Activity Detection." In SMM19, Workshop on Speech, Music and Mind 2019. ISCA, 2019. http://dx.doi.org/10.21437/smm.2019-6.
Laskowski, Kornel, Qin Jin, and Tanja Schultz. "Crosscorrelation-based multispeaker speech activity detection." In Interspeech 2004. ISCA, 2004. http://dx.doi.org/10.21437/interspeech.2004-350.
Sarfjoo, Seyyed Saeed, Srikanth Madikeri, and Petr Motlicek. "Speech Activity Detection Based on Multilingual Speech Recognition System." In Interspeech 2021. ISCA, 2021. http://dx.doi.org/10.21437/interspeech.2021-1058.
Harsha, B. V. "A noise robust speech activity detection algorithm." In Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004. IEEE, 2004. http://dx.doi.org/10.1109/isimp.2004.1434065.
Shahsavari, Sajad, Hossein Sameti, and Hossein Hadian. "Speech activity detection using deep neural networks." In 2017 Iranian Conference on Electrical Engineering (ICEE). IEEE, 2017. http://dx.doi.org/10.1109/iraniancee.2017.7985293.
Heese, Florian, Markus Niermann, and Peter Vary. "Speech-codebook based soft Voice Activity Detection." In ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015. http://dx.doi.org/10.1109/icassp.2015.7178789.