A selection of scholarly literature on the topic "Active speaker detection"
Browse the lists of relevant articles, books, dissertations, conference papers, and other scholarly sources on the topic "Active speaker detection".
Journal articles on the topic "Active speaker detection"
Assunção, Gustavo, Nuno Gonçalves, and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection." Applied Sciences 11, no. 8 (April 10, 2021): 3397. http://dx.doi.org/10.3390/app11083397.
Pu, Jie, Yannis Panagakis, and Maja Pantic. "Active Speaker Detection and Localization in Videos Using Low-Rank and Kernelized Sparsity." IEEE Signal Processing Letters 27 (2020): 865–69. http://dx.doi.org/10.1109/lsp.2020.2996412.
Lindstrom, Fredric, Keni Ren, Kerstin Persson Waye, and Haibo Li. "A comparison of two active‐speaker‐detection methods suitable for usage in noise dosimeter measurements." Journal of the Acoustical Society of America 123, no. 5 (May 2008): 3527. http://dx.doi.org/10.1121/1.2934471.
Zhu, Ying-Xin, and Hao-Ran Jin. "Speaker Localization Based on Audio-Visual Bimodal Fusion." Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 3 (May 20, 2021): 375–82. http://dx.doi.org/10.20965/jaciii.2021.p0375.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially Aware Language Acquisition." IEEE Transactions on Cognitive and Developmental Systems 12, no. 2 (June 2020): 250–59. http://dx.doi.org/10.1109/tcds.2019.2927941.
DAI, Hai, Kean CHEN, Yang WANG, and Haoxin YU. "Fault detection method of secondary sound source in ANC system based on impedance characteristics." Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 40, no. 6 (December 2022): 1242–49. http://dx.doi.org/10.1051/jnwpu/20224061242.
Ahmad, Zubair, Alquhayz, and Ditta. "Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model." Sensors 19, no. 23 (November 25, 2019): 5163. http://dx.doi.org/10.3390/s19235163.
Wang, Shaolei, Zhongyuan Wang, Wanxiang Che, Sendong Zhao, and Ting Liu. "Combining Self-supervised Learning and Active Learning for Disfluency Detection." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (May 31, 2022): 1–25. http://dx.doi.org/10.1145/3487290.
Maltezou-Papastylianou, Constantina, Riccardo Russo, Denise Wallace, Chelsea Harmsworth, and Silke Paulmann. "Different stages of emotional prosody processing in healthy ageing–evidence from behavioural responses, ERPs, tDCS, and tRNS." PLOS ONE 17, no. 7 (July 21, 2022): e0270934. http://dx.doi.org/10.1371/journal.pone.0270934.
Lahemer, Elfituri S. F., and Ahmad Rad. "An Audio-Based SLAM for Indoor Environments: A Robotic Mixed Reality Presentation." Sensors 24, no. 9 (April 27, 2024): 2796. http://dx.doi.org/10.3390/s24092796.
Dissertations on the topic "Active speaker detection"
Li, Yi. "Speaker Diarization System for Call-center data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286677.
To answer the question of who spoke when, speaker diarization (SD) is a critical step for many speech applications in practice. The goal of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, which is an existing call-center application for checking a customer's identity from a phone call. Our speaker diarization system uses 13-dimensional MFCCs as features and performs voice activity detection (VAD), segmentation, linear clustering, and hierarchical clustering based on GMM and BIC scores. By applying it, we reduce the EER (Equal Error Rate) from 18.1% in the baseline experiment to 3.26% for the general call-center conversations. To better analyze and evaluate the system, we also simulated a set of call-center data based on the public ICSI corpus audio databases.
Pouthier, Baptiste. "Apprentissage profond et statistique sur données audiovisuelles dédié aux systèmes embarqués pour l'interface homme-machine." Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4019.
In the rapidly evolving landscape of human-machine interfaces, deep learning has been nothing short of revolutionary. It has ushered in a new era of audio-visual algorithms, which, in turn, have expanded the horizons of potential applications and strengthened the performance of traditional systems. However, these remarkable advancements come with a caveat: many of these algorithms are computationally demanding, rendering their integration onto embedded devices a formidable task. The primary focus of this thesis is to surmount this limitation through a comprehensive optimization effort, addressing the critical factors of latency and accuracy in audio-visual algorithms. Our approach entails a meticulous examination and enhancement of key components in the audio-visual human-machine interaction pipeline; we investigate and make contributions to fundamental aspects of audio-visual technology in Active Speaker Detection and Audio-visual Speech Recognition tasks. By tackling these critical building blocks, we aim to bridge the gap between the vast potential of audio-visual algorithms and their practical application in embedded systems. Our research introduces efficient models in Active Speaker Detection. On the one hand, our novel audio-visual fusion strategy yields significant improvements over other state-of-the-art systems, featuring a relatively simpler model. On the other hand, we explore neural architecture search, resulting in the development of a compact yet efficient architecture for the Active Speaker Detection problem. Furthermore, we present our work on audio-visual speech recognition, with a specific emphasis on keyword spotting. Our main contribution targets the visual aspect of speech recognition with a graph-based approach designed to streamline the visual processing pipeline, promising simpler audio-visual recognition systems.
Books on the topic "Active speaker detection"
Kelly, Carla. Miss Milton speaks her mind. Seattle, WA: Camel Press, 2014.
Francis, Dick. Banker. Oxford: Heinemann, 1992.
Escott, John. Girl on a motorcycle. Oxford: Oxford University Press, 2008.
Doyle, Arthur Conan. The Sign of Four. Peterborough, Ont: Broadview Press, 2001.
Doyle, Arthur Conan. The sign of four =: El signo de cuatro. London: Heinemann Educational, 1985.
Chee, Traci. Speaker. Penguin Young Readers Group, 2018.
Chee, Traci. The speaker. 2017.
Bankir: Detektivnyĭ roman. [Moskva]: EKSMO-Press, 1999.
Doyle, A. Conan. Sign of the Four (1890), Also Called the Sign of Four, Is the Second Novel Featuring Sherlock Holmes Written by Sir Arthur Conan Doyle. Doyle Wrote Four Novels and 56 Short Stories Featuring the Fictional Detective. Independently Published, 2021.
Book chapters on the topic "Active speaker detection"
Alcázar, Juan León, Moritz Cordes, Chen Zhao, and Bernard Ghanem. "End-to-End Active Speaker Detection." In Lecture Notes in Computer Science, 126–43. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19836-6_8.
Chakravarty, Punarjay, and Tinne Tuytelaars. "Cross-Modal Supervision for Learning Active Speaker Detection in Video." In Computer Vision – ECCV 2016, 285–301. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-46454-1_18.
Min, Kyle, Sourya Roy, Subarna Tripathi, Tanaya Guha, and Somdeb Majumdar. "Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection." In Lecture Notes in Computer Science, 371–87. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19833-5_22.
Yang, Yatao, and Siyu Yan. "Cross-Modal Active Speaker Detection Algorithm in Video and End-To-End Landing Solution." In Lecture Notes in Electrical Engineering, 313–23. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2200-6_29.
Pallavi, C., Girija R, Vedhapriyavadhana R, Barnali Dey, and Rajiv Vincent. "A Relative Investigation of Various Algorithms for Online Financial Fraud Detection Techniques." In Recent Trends in Intensive Computing. IOS Press, 2021. http://dx.doi.org/10.3233/apc210174.
Dalton, Gene, and Ann Devitt. "Gaeilge Gaming." In Computer-Assisted Language Learning, 1093–110. IGI Global, 2019. http://dx.doi.org/10.4018/978-1-5225-7663-1.ch052.
Wilkins, Heidi. "Talking Back: Voice in Screwball Comedy." In Talkies, Road Movies and Chick Flicks. Edinburgh University Press, 2016. http://dx.doi.org/10.3366/edinburgh/9781474406895.003.0002.
Cripps, Yvonne. "The Public Interest Disclosure Act 1998." In Freedom Of Expression And Freedom Of Information, 275–87. Oxford: Oxford University Press, 2000. http://dx.doi.org/10.1093/oso/9780198268390.003.0018.
Skulacheva, Tatyana, Natalia Slioussar, Alexander Kostyuk, Anna Lipina, Emil Latypov, and Varvara Koroleva. "The Influence of Verse on Cognitive Processes: A Psycholinguistic Experiment." In Tackling the Toolkit: Plotting Poetry through Computational Literary Studies, 155–66. Institute of Czech Literature of the Czech Academy of Sciences, 2022. http://dx.doi.org/10.51305/icl.cz.9788076580336.10.
Conference papers on the topic "Active speaker detection"
Roth, Joseph, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy, et al. "Ava Active Speaker: An Audio-Visual Dataset for Active Speaker Detection." In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020. http://dx.doi.org/10.1109/icassp40776.2020.9053900.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Vision-based Active Speaker Detection in Multiparty Interaction." In GLU 2017 International Workshop on Grounding Language Understanding. ISCA, 2017. http://dx.doi.org/10.21437/glu.2017-10.
Huang, Chong, and Kazuhito Koishida. "Improved Active Speaker Detection based on Optical Flow." In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2020. http://dx.doi.org/10.1109/cvprw50498.2020.00483.
Chakravarty, Punarjay, Jeroen Zegers, Tinne Tuytelaars, and Hugo Van hamme. "Active speaker detection with audio-visual co-training." In ICMI '16: International Conference on Multimodal Interaction. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2993148.2993172.
Kheradiya, Jatin, Sandeep Reddy C, and Rajesh Hegde. "Active Speaker Detection using audio-visual sensor array." In 2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2014. http://dx.doi.org/10.1109/isspit.2014.7300636.
Wuerkaixi, Abudukelimu, You Zhang, Zhiyao Duan, and Changshui Zhang. "Rethinking Audio-Visual Synchronization for Active Speaker Detection." In 2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2022. http://dx.doi.org/10.1109/mlsp55214.2022.9943352.
Jiang, Yidi, Ruijie Tao, Zexu Pan, and Haizhou Li. "Target Active Speaker Detection with Audio-visual Cues." In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-574.
Liao, Junhua, Haihan Duan, Kanghui Feng, Wanbing Zhao, Yanbing Yang, and Liangyin Chen. "A Light Weight Model for Active Speaker Detection." In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2023. http://dx.doi.org/10.1109/cvpr52729.2023.02196.
Alcazar, Juan Leon, Fabian Caba Heilbron, Ali K. Thabet, and Bernard Ghanem. "MAAS: Multi-modal Assignation for Active Speaker Detection." In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00033.
Madrigal, Francisco, Frederic Lerasle, Lionel Pibre, and Isabelle Ferrane. "Audio-Video detection of the active speaker in meetings." In 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021. http://dx.doi.org/10.1109/icpr48806.2021.9412681.
Organizational reports on the topic "Active speaker detection"
Mizrach, Amos, Michal Mazor, Amots Hetzroni, Joseph Grinshpun, Richard Mankin, Dennis Shuman, Nancy Epsky, and Robert Heath. Male Song as a Tool for Trapping Female Medflies. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7586535.bard.