Academic literature on the topic 'Speech Activity Detection (SAD)'
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Speech Activity Detection (SAD).'
Journal articles on the topic "Speech Activity Detection (SAD)"
Kaur, Sukhvinder, and J. S. Sohal. "Speech Activity Detection and its Evaluation in Speaker Diarization System." INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY 16, no. 1 (March 13, 2017): 7567–72. http://dx.doi.org/10.24297/ijct.v16i1.5893.
Dutta, Satwik, Prasanna Kothalkar, Johanna Rudolph, Christine Dollaghan, Jennifer McGlothlin, Thomas Campbell, and John H. Hansen. "Advancing speech activity detection for automatic speech assessment of pre-school children prompted speech using COMBO-SAD." Journal of the Acoustical Society of America 148, no. 4 (October 2020): 2469–67. http://dx.doi.org/10.1121/1.5146831.
Mahalakshmi, P. "A REVIEW ON VOICE ACTIVITY DETECTION AND MEL-FREQUENCY CEPSTRAL COEFFICIENTS FOR SPEAKER RECOGNITION (TREND ANALYSIS)." Asian Journal of Pharmaceutical and Clinical Research 9, no. 9 (December 1, 2016): 360. http://dx.doi.org/10.22159/ajpcr.2016.v9s3.14352.
Zhao, Hui, Yu Tai Wang, and Xing Hai Yang. "Emotion Detection System Based on Speech and Facial Signals." Advanced Materials Research 459 (January 2012): 483–87. http://dx.doi.org/10.4028/www.scientific.net/amr.459.483.
Gelly, Gregory, and Jean-Luc Gauvain. "Optimization of RNN-Based Speech Activity Detection." IEEE/ACM Transactions on Audio, Speech, and Language Processing 26, no. 3 (March 2018): 646–56. http://dx.doi.org/10.1109/taslp.2017.2769220.
Koh, Min‐sung, and Margaret Mortz. "Improved voice activity detection of noisy speech." Journal of the Acoustical Society of America 107, no. 5 (May 2000): 2907–8. http://dx.doi.org/10.1121/1.428823.
Quan, Changqin, Bin Zhang, Xiao Sun, and Fuji Ren. "A combined cepstral distance method for emotional speech recognition." International Journal of Advanced Robotic Systems 14, no. 4 (July 1, 2017): 172988141771983. http://dx.doi.org/10.1177/1729881417719836.
Dash, Debadatta, Paul Ferrari, Satwik Dutta, and Jun Wang. "NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals." Sensors 20, no. 8 (April 16, 2020): 2248. http://dx.doi.org/10.3390/s20082248.
Mattys, Sven L., and Jamie H. Clark. "Lexical activity in speech processing: evidence from pause detection." Journal of Memory and Language 47, no. 3 (October 2002): 343–59. http://dx.doi.org/10.1016/s0749-596x(02)00037-2.
Potamitis, I., and E. Fishler. "Speech activity detection of moving speaker using microphone arrays." Electronics Letters 39, no. 16 (2003): 1223. http://dx.doi.org/10.1049/el:20030726.
Dissertations / Theses on the topic "Speech Activity Detection (SAD)"
Näslund, Anton, and Charlie Jeansson. "Robust Speech Activity Detection and Direction of Arrival Using Convolutional Neural Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297756.
Social robots are becoming increasingly common in our everyday lives. In conversational robotics, development is moving toward socially engaging robots that can hold human-like conversations. This project looks at one of the technical aspects of speech recognition, namely speech activity detection. The presented solution uses a system based on a convolutional neural network (CNN) to detect speech in a forward-facing azimuth range. The project used a dataset from FestVox called CMU Arctic, supplemented with a number of recorded noise sounds. A library called Pyroomacoustics was used to simulate a real environment in order to create a robust system. A simplified model that detected only speech activity was constructed and reached an accuracy of 95%. The finished machine achieved an accuracy of 93%. It was compared with a similar approach, the WebRTC voice activity detection (VAD) algorithm with beamforming, since no previously published solutions for our project were found. Our solutions turned out to have higher accuracy than WebRTC achieved on our dataset.
Bachelor's degree project in electrical engineering 2020, KTH, Stockholm
Wejdelind, Marcus, and Nils Wägmark. "Multi-speaker Speech Activity Detection From Video." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-297701.
A social robot will in many cases have to handle conversations with several interlocutors in which different people speak at the same time. To achieve this, it is important that the robot can identify the speaker so that it can then assist or interact with that person. This project investigated the problem from a visual standpoint: a convolutional neural network (CNN) was implemented and trained on video input from an existing dataset (AVA-Speech). The goal of the network was to estimate, for each face and at each point in time, the probability that the person is speaking. Our best result using optical flow was 0.753, while we reached 0.781 with a different type of data preprocessing. These results matched the existing scientific literature in the field surprisingly well, where 0.77 has proven to be a suitable reference value.
Bachelor's degree project in electrical engineering 2020, KTH, Stockholm
Murrin, Paul. "Objective measurement of voice activity detectors." Thesis, University of York, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.325647.
Laverty, Stephen William. "Detection of Nonstationary Noise and Improved Voice Activity Detection in an Automotive Hands-free Environment." Link to electronic thesis, 2005. http://www.wpi.edu/Pubs/ETD/Available/etd-051105-110646/.
Minotto, Vicente Peruffo. "Audiovisual voice activity detection and localization of simultaneous speech sources." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2013. http://hdl.handle.net/10183/77231.
Given the tendency of creating interfaces between human and machines that increasingly allow simple ways of interaction, it is only natural that research effort is put into techniques that seek to simulate the most conventional means of communication humans use: speech. In the human auditory system, voice is automatically processed by the brain in an effortless and effective way, also commonly aided by visual cues, such as mouth movement and location of the speakers. This processing done by the brain includes two important components that speech-based communication requires: Voice Activity Detection (VAD) and Sound Source Localization (SSL). Consequently, VAD and SSL also serve as mandatory preprocessing tools for high-end Human Computer Interface (HCI) applications in a computing environment, as in the case of automatic speech recognition and speaker identification. However, VAD and SSL are still challenging problems when dealing with realistic acoustic scenarios, particularly in the presence of noise, reverberation and multiple simultaneous speakers. In this work we propose some approaches for tackling these problems using audiovisual information, both for the single source and the competing sources scenario, exploiting distinct ways of fusing the audio and video modalities. Our work also employs a microphone array for the audio processing, which allows the spatial information of the acoustic signals to be explored through the state-of-the-art method Steered Response Power (SRP). As an additional consequence, a very fast GPU version of the SRP is developed, so that real-time processing is achieved. Our experiments show an average accuracy of 95% when performing VAD of up to three simultaneous speakers and an average error of 10 cm when locating such speakers.
Ent, Petr. "Voice Activity Detection." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2009. http://www.nusl.cz/ntk/nusl-235483.
Cho, Yong Duk. "Speech detection, enhancement and compression for voice communications." Thesis, University of Surrey, 2001. http://epubs.surrey.ac.uk/842991/.
Doukas, Nikolaos. "Voice activity detection using energy based measures and source separation." Thesis, Imperial College London, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.245220.
Sinclair, Mark. "Speech segmentation and speaker diarisation for transcription and translation." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/20970.
Thorell, Hampus. "Voice Activity Detection in the Tiger Platform." Thesis, Linköping University, Department of Electrical Engineering, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-6586.
Sectra Communications AB has developed a terminal for encrypted communication called the Tiger platform. During voice communication delays have sometimes been experienced resulting in conversational complications.
A solution to this problem, as proposed by Sectra, would be to introduce voice activity detection, that is, a separation of the speech and non-speech parts of the input signal, to the Tiger platform. By transferring only the speech parts to the receiver, the required bandwidth should decrease dramatically. Lower bandwidth requirements imply that the delays should gradually disappear. The problem is then to come up with a method that distinguishes the speech parts of the input signal. Fortunately, a lot of theoretical work has been done on the subject, and numerous voice activity detection methods exist today.
In this thesis the theory of voice activity detection has been studied. A review of voice activity detectors that exist on the market today followed by an evaluation of some of these was performed in order to select a suitable candidate for the Tiger platform. This evaluation would later become the foundation for the selection of a voice activity detector for implementation.
Finally, the implementation of the chosen voice activity detector, including a comfort noise generator, was done on the platform. This implementation was based on the special requirements of the platform. Tests of the implementation in office environments show that possible delays are steadily being reduced during periods of speech inactivity, while the active speech quality is preserved.
Book chapters on the topic "Speech Activity Detection (SAD)"
Alam, Tanvirul, and Akib Khan. "Lightweight CNN for Robust Voice Activity Detection." In Speech and Computer, 1–12. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-60276-5_1.
Solé-Casals, Jordi, Pere Martí-Puig, Ramon Reig-Bolaño, and Vladimir Zaiats. "Score Function for Voice Activity Detection." In Advances in Nonlinear Speech Processing, 76–83. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-11509-7_10.
Pertilä, Pasi, Alessio Brutti, Piergiorgio Svaizer, and Maurizio Omologo. "Multichannel Source Activity Detection, Localization, and Tracking." In Audio Source Separation and Speech Enhancement, 47–64. Chichester, UK: John Wiley & Sons Ltd, 2018. http://dx.doi.org/10.1002/9781119279860.ch4.
Málek, Jiří, and Jindřich Žďánský. "Voice-Activity and Overlapped Speech Detection Using x-Vectors." In Text, Speech, and Dialogue, 366–76. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58323-1_40.
Huang, Zhongqiang, and Mary P. Harper. "Speech Activity Detection on Multichannels of Meeting Recordings." In Machine Learning for Multimodal Interaction, 415–27. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11677482_35.
Zelinka, Jan. "Deep Learning and Online Speech Activity Detection for Czech Radio Broadcasting." In Text, Speech, and Dialogue, 428–35. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-00794-2_46.
Chu, Stephen M., Etienne Marcheret, and Gerasimos Potamianos. "Automatic Speech Recognition and Speech Activity Detection in the CHIL Smart Room." In Machine Learning for Multimodal Interaction, 332–43. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11677482_29.
Górriz, J. M., C. G. Puntonet, J. Ramírez, and J. C. Segura. "Bispectrum Estimators for Voice Activity Detection and Speech Recognition." In Nonlinear Analyses and Algorithms for Speech Processing, 174–85. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11613107_15.
Macho, Dušan, Climent Nadeu, and Andrey Temko. "Robust Speech Activity Detection in Interactive Smart-Room Environments." In Machine Learning for Multimodal Interaction, 236–47. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11965152_21.
Honarmandi Shandiz, Amin, and László Tóth. "Voice Activity Detection for Ultrasound-Based Silent Speech Interfaces Using Convolutional Neural Networks." In Text, Speech, and Dialogue, 499–510. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-83527-9_43.
Conference papers on the topic "Speech Activity Detection (SAD)"
Abdulla, Waleed H., Zhou Guan, and Hou Chi Sou. "Noise robust speech activity detection." In 2009 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2009. http://dx.doi.org/10.1109/isspit.2009.5407509.
Matic, A., V. Osmani, and O. Mayora. "Speech activity detection using accelerometer." In 2012 34th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, 2012. http://dx.doi.org/10.1109/embc.2012.6346377.
Tsai, TJ, and Nelson Morgan. "Speech activity detection: An economics approach." In ICASSP 2013 - 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2013. http://dx.doi.org/10.1109/icassp.2013.6638987.
Full textKhoury, Elie, and Matt Garland. "I-Vectors for speech activity detection." In Odyssey 2016. ISCA, 2016. http://dx.doi.org/10.21437/odyssey.2016-48.
Full textK, Punnoose A. "New Features for Speech Activity Detection." In SMM19, Workshop on Speech, Music and Mind 2019. ISCA: ISCA, 2019. http://dx.doi.org/10.21437/smm.2019-6.
Full textLaskowski, Kornel, Qin Jin, and Tanja Schultz. "Crosscorrelation-based multispeaker speech activity detection." In Interspeech 2004. ISCA: ISCA, 2004. http://dx.doi.org/10.21437/interspeech.2004-350.
Full textSarfjoo, Seyyed Saeed, Srikanth Madikeri, and Petr Motlicek. "Speech Activity Detection Based on Multilingual Speech Recognition System." In Interspeech 2021. ISCA: ISCA, 2021. http://dx.doi.org/10.21437/interspeech.2021-1058.
Full textHarsha, B. V. "A noise robust speech activity detection algorithm." In Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, 2004. IEEE, 2004. http://dx.doi.org/10.1109/isimp.2004.1434065.
Full textShahsavari, Sajad, Hossein Sameti, and Hossein Hadian. "Speech activity detection using deep neural networks." In 2017 Iranian Conference on Electrical Engineering (ICEE). IEEE, 2017. http://dx.doi.org/10.1109/iraniancee.2017.7985293.
Full textHeese, Florian, Markus Niermann, and Peter Vary. "Speech-codebook based soft Voice Activity Detection." In ICASSP 2015 - 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015. http://dx.doi.org/10.1109/icassp.2015.7178789.