Academic literature on the topic 'Active speaker detection'
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Active speaker detection.'
Journal articles on the topic "Active speaker detection"
Assunção, Gustavo, Nuno Gonçalves, and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection." Applied Sciences 11, no. 8 (April 10, 2021): 3397. http://dx.doi.org/10.3390/app11083397.
Pu, Jie, Yannis Panagakis, and Maja Pantic. "Active Speaker Detection and Localization in Videos Using Low-Rank and Kernelized Sparsity." IEEE Signal Processing Letters 27 (2020): 865–69. http://dx.doi.org/10.1109/lsp.2020.2996412.
Lindstrom, Fredric, Keni Ren, Kerstin Persson Waye, and Haibo Li. "A comparison of two active‐speaker‐detection methods suitable for usage in noise dosimeter measurements." Journal of the Acoustical Society of America 123, no. 5 (May 2008): 3527. http://dx.doi.org/10.1121/1.2934471.
Zhu, Ying-Xin, and Hao-Ran Jin. "Speaker Localization Based on Audio-Visual Bimodal Fusion." Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 3 (May 20, 2021): 375–82. http://dx.doi.org/10.20965/jaciii.2021.p0375.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially Aware Language Acquisition." IEEE Transactions on Cognitive and Developmental Systems 12, no. 2 (June 2020): 250–59. http://dx.doi.org/10.1109/tcds.2019.2927941.
DAI, Hai, Kean CHEN, Yang WANG, and Haoxin YU. "Fault detection method of secondary sound source in ANC system based on impedance characteristics." Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 40, no. 6 (December 2022): 1242–49. http://dx.doi.org/10.1051/jnwpu/20224061242.
Ahmad, Zubair, Alquhayz, and Ditta. "Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model." Sensors 19, no. 23 (November 25, 2019): 5163. http://dx.doi.org/10.3390/s19235163.
Wang, Shaolei, Zhongyuan Wang, Wanxiang Che, Sendong Zhao, and Ting Liu. "Combining Self-supervised Learning and Active Learning for Disfluency Detection." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (May 31, 2022): 1–25. http://dx.doi.org/10.1145/3487290.
Maltezou-Papastylianou, Constantina, Riccardo Russo, Denise Wallace, Chelsea Harmsworth, and Silke Paulmann. "Different stages of emotional prosody processing in healthy ageing–evidence from behavioural responses, ERPs, tDCS, and tRNS." PLOS ONE 17, no. 7 (July 21, 2022): e0270934. http://dx.doi.org/10.1371/journal.pone.0270934.
Lahemer, Elfituri S. F., and Ahmad Rad. "An Audio-Based SLAM for Indoor Environments: A Robotic Mixed Reality Presentation." Sensors 24, no. 9 (April 27, 2024): 2796. http://dx.doi.org/10.3390/s24092796.
Dissertations / Theses on the topic "Active speaker detection"
Li, Yi. "Speaker Diarization System for Call-center data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286677.
To answer the question of who spoke when, speaker diarization (SD) is a critical step for many speech applications in practice. The goal of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, an existing call-center application for checking a customer's identity from a phone call. Our speaker diarization system uses 13-dimensional MFCCs as features and performs voice activity detection (VAD), segmentation, linear clustering, and hierarchical clustering based on GMM and BIC scores. By applying it, we reduce the EER (equal error rate) from 18.1% in the baseline experiment to 3.26% on the general call-center calls. To better analyze and evaluate the system, we also simulated a set of call-center data based on the public ICSI corpus audio database.
Pouthier, Baptiste. "Apprentissage profond et statistique sur données audiovisuelles dédié aux systèmes embarqués pour l'interface homme-machine." Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4019.
In the rapidly evolving landscape of human-machine interfaces, deep learning has been nothing short of revolutionary. It has ushered in a new era of audio-visual algorithms, which, in turn, have expanded the horizons of potential applications and strengthened the performance of traditional systems. However, these remarkable advancements come with a caveat: many of these algorithms are computationally demanding, rendering their integration onto embedded devices a formidable task. The primary focus of this thesis is to surmount this limitation through a comprehensive optimization effort, addressing the critical factors of latency and accuracy in audio-visual algorithms. Our approach entails a meticulous examination and enhancement of key components in the audio-visual human-machine interaction pipeline; we investigate and make contributions to fundamental aspects of audio-visual technology in Active Speaker Detection and Audio-visual Speech Recognition tasks. By tackling these critical building blocks, we aim to bridge the gap between the vast potential of audio-visual algorithms and their practical application in embedded systems. Our research introduces efficient models in Active Speaker Detection. On the one hand, our novel audio-visual fusion strategy yields significant improvements over other state-of-the-art systems, featuring a relatively simpler model. On the other hand, we explore neural architecture search, resulting in the development of a compact yet efficient architecture for the Active Speaker Detection problem. Furthermore, we present our work on audio-visual speech recognition, with a specific emphasis on keyword spotting. Our main contribution targets the visual aspect of speech recognition with a graph-based approach designed to streamline the visual processing pipeline, promising simpler audio-visual recognition systems.
Books on the topic "Active speaker detection"
Kelly, Carla. Miss Milton speaks her mind. Seattle, WA: Camel Press, 2014.
Francis, Dick. Banker. Oxford: Heinemann, 1992.
Escott, John. Girl on a motorcycle. Oxford: Oxford University Press, 2008.
Doyle, Arthur Conan. The Sign of Four. Peterborough, Ont: Broadview Press, 2001.
Doyle, Arthur Conan. The sign of four =: El signo de cuatro. London: Heinemann Educational, 1985.
Chee, Traci. Speaker. Penguin Young Readers Group, 2018.
Chee, Traci. The speaker. 2017.
Bankir: Detektivnyĭ roman. [Moskva]: EKSMO-Press, 1999.
Doyle, Arthur Conan. Sign of the Four (1890), Also Called the Sign of Four, Is the Second Novel Featuring Sherlock Holmes Written by Sir Arthur Conan Doyle. Doyle Wrote Four Novels and 56 Short Stories Featuring the Fictional Detective. Independently Published, 2021.
Book chapters on the topic "Active speaker detection"
Alcázar, Juan León, Moritz Cordes, Chen Zhao, and Bernard Ghanem. "End-to-End Active Speaker Detection." In Lecture Notes in Computer Science, 126–43. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19836-6_8.
Chakravarty, Punarjay, and Tinne Tuytelaars. "Cross-Modal Supervision for Learning Active Speaker Detection in Video." In Computer Vision – ECCV 2016, 285–301. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-46454-1_18.
Min, Kyle, Sourya Roy, Subarna Tripathi, Tanaya Guha, and Somdeb Majumdar. "Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection." In Lecture Notes in Computer Science, 371–87. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19833-5_22.
Yang, Yatao, and Siyu Yan. "Cross-Modal Active Speaker Detection Algorithm in Video and End-To-End Landing Solution." In Lecture Notes in Electrical Engineering, 313–23. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2200-6_29.
Pallavi, C., Girija R, Vedhapriyavadhana R, Barnali Dey, and Rajiv Vincent. "A Relative Investigation of Various Algorithms for Online Financial Fraud Detection Techniques." In Recent Trends in Intensive Computing. IOS Press, 2021. http://dx.doi.org/10.3233/apc210174.
Dalton, Gene, and Ann Devitt. "Gaeilge Gaming." In Computer-Assisted Language Learning, 1093–110. IGI Global, 2019. http://dx.doi.org/10.4018/978-1-5225-7663-1.ch052.
Wilkins, Heidi. "Talking Back: Voice in Screwball Comedy." In Talkies, Road Movies and Chick Flicks. Edinburgh University Press, 2016. http://dx.doi.org/10.3366/edinburgh/9781474406895.003.0002.
Cripps, Yvonne. "The Public Interest Disclosure Act 1998." In Freedom Of Expression And Freedom Of Information, 275–87. Oxford University Press, 2000. http://dx.doi.org/10.1093/oso/9780198268390.003.0018.
Skulacheva, Tatyana, Natalia Slioussar, Alexander Kostyuk, Anna Lipina, Emil Latypov, and Varvara Koroleva. "The Influence of Verse on Cognitive Processes: A Psycholinguistic Experiment." In Tackling the Toolkit: Plotting Poetry through Computational Literary Studies, 155–66. Institute of Czech Literature of the Czech Academy of Sciences, 2022. http://dx.doi.org/10.51305/icl.cz.9788076580336.10.
Conference papers on the topic "Active speaker detection"
Roth, Joseph, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy, et al. "Ava Active Speaker: An Audio-Visual Dataset for Active Speaker Detection." In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020. http://dx.doi.org/10.1109/icassp40776.2020.9053900.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Vision-based Active Speaker Detection in Multiparty Interaction." In GLU 2017 International Workshop on Grounding Language Understanding. ISCA: ISCA, 2017. http://dx.doi.org/10.21437/glu.2017-10.
Huang, Chong, and Kazuhito Koishida. "Improved Active Speaker Detection based on Optical Flow." In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2020. http://dx.doi.org/10.1109/cvprw50498.2020.00483.
Chakravarty, Punarjay, Jeroen Zegers, Tinne Tuytelaars, and Hugo Van hamme. "Active speaker detection with audio-visual co-training." In ICMI '16: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2993148.2993172.
Kheradiya, Jatin, Sandeep Reddy C, and Rajesh Hegde. "Active Speaker Detection using audio-visual sensor array." In 2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2014. http://dx.doi.org/10.1109/isspit.2014.7300636.
Wuerkaixi, Abudukelimu, You Zhang, Zhiyao Duan, and Changshui Zhang. "Rethinking Audio-Visual Synchronization for Active Speaker Detection." In 2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2022. http://dx.doi.org/10.1109/mlsp55214.2022.9943352.
Jiang, Yidi, Ruijie Tao, Zexu Pan, and Haizhou Li. "Target Active Speaker Detection with Audio-visual Cues." In INTERSPEECH 2023. ISCA: ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-574.
Liao, Junhua, Haihan Duan, Kanghui Feng, Wanbing Zhao, Yanbing Yang, and Liangyin Chen. "A Light Weight Model for Active Speaker Detection." In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2023. http://dx.doi.org/10.1109/cvpr52729.2023.02196.
Alcazar, Juan Leon, Fabian Caba Heilbron, Ali K. Thabet, and Bernard Ghanem. "MAAS: Multi-modal Assignation for Active Speaker Detection." In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00033.
Madrigal, Francisco, Frederic Lerasle, Lionel Pibre, and Isabelle Ferrane. "Audio-Video detection of the active speaker in meetings." In 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021. http://dx.doi.org/10.1109/icpr48806.2021.9412681.
Reports on the topic "Active speaker detection"
Mizrach, Amos, Michal Mazor, Amots Hetzroni, Joseph Grinshpun, Richard Mankin, Dennis Shuman, Nancy Epsky, and Robert Heath. Male Song as a Tool for Trapping Female Medflies. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7586535.bard.