Selected scholarly literature on the topic "Active speaker detection"
Browse the lists of current articles, books, dissertations, reports, and other scholarly sources on the topic "Active speaker detection".
Journal articles on the topic "Active speaker detection"
Assunção, Gustavo, Nuno Gonçalves, and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection." Applied Sciences 11, no. 8 (April 10, 2021): 3397. http://dx.doi.org/10.3390/app11083397.
Pu, Jie, Yannis Panagakis, and Maja Pantic. "Active Speaker Detection and Localization in Videos Using Low-Rank and Kernelized Sparsity." IEEE Signal Processing Letters 27 (2020): 865–69. http://dx.doi.org/10.1109/lsp.2020.2996412.
Lindstrom, Fredric, Keni Ren, Kerstin Persson Waye, and Haibo Li. "A comparison of two active-speaker-detection methods suitable for usage in noise dosimeter measurements." Journal of the Acoustical Society of America 123, no. 5 (May 2008): 3527. http://dx.doi.org/10.1121/1.2934471.
Zhu, Ying-Xin, and Hao-Ran Jin. "Speaker Localization Based on Audio-Visual Bimodal Fusion." Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 3 (May 20, 2021): 375–82. http://dx.doi.org/10.20965/jaciii.2021.p0375.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially Aware Language Acquisition." IEEE Transactions on Cognitive and Developmental Systems 12, no. 2 (June 2020): 250–59. http://dx.doi.org/10.1109/tcds.2019.2927941.
Dai, Hai, Kean Chen, Yang Wang, and Haoxin Yu. "Fault detection method of secondary sound source in ANC system based on impedance characteristics." Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 40, no. 6 (December 2022): 1242–49. http://dx.doi.org/10.1051/jnwpu/20224061242.
Ahmad, Zubair, Alquhayz, and Ditta. "Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model." Sensors 19, no. 23 (November 25, 2019): 5163. http://dx.doi.org/10.3390/s19235163.
Wang, Shaolei, Zhongyuan Wang, Wanxiang Che, Sendong Zhao, and Ting Liu. "Combining Self-supervised Learning and Active Learning for Disfluency Detection." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (May 31, 2022): 1–25. http://dx.doi.org/10.1145/3487290.
Maltezou-Papastylianou, Constantina, Riccardo Russo, Denise Wallace, Chelsea Harmsworth, and Silke Paulmann. "Different stages of emotional prosody processing in healthy ageing–evidence from behavioural responses, ERPs, tDCS, and tRNS." PLOS ONE 17, no. 7 (July 21, 2022): e0270934. http://dx.doi.org/10.1371/journal.pone.0270934.
Lahemer, Elfituri S. F., and Ahmad Rad. "An Audio-Based SLAM for Indoor Environments: A Robotic Mixed Reality Presentation." Sensors 24, no. 9 (April 27, 2024): 2796. http://dx.doi.org/10.3390/s24092796.
Dissertations on the topic "Active speaker detection"
Li, Yi. "Speaker Diarization System for Call-center data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286677.
To answer the question of who spoke when, speaker diarization (SD) is a critical step for many speech applications in practice. The goal of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, an existing call-center application that checks a customer's identity from a phone call. Our diarization system uses 13-dimensional MFCCs as features and performs voice activity detection (VAD), segmentation, linear clustering, and hierarchical clustering based on GMM and BIC scores. By applying it, we reduce the EER (Equal Error Rate) from 18.1% in the baseline experiment to 3.26% on the general call-center conversations. To better analyze and evaluate the system, we also simulated a set of call-center data based on the public ICSI corpus audio databases.
Pouthier, Baptiste. "Apprentissage profond et statistique sur données audiovisuelles dédié aux systèmes embarqués pour l'interface homme-machine." Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4019.
In the rapidly evolving landscape of human-machine interfaces, deep learning has been nothing short of revolutionary. It has ushered in a new era of audio-visual algorithms, which, in turn, have expanded the horizons of potential applications and strengthened the performance of traditional systems. However, these remarkable advancements come with a caveat: many of these algorithms are computationally demanding, rendering their integration onto embedded devices a formidable task. The primary focus of this thesis is to surmount this limitation through a comprehensive optimization effort, addressing the critical factors of latency and accuracy in audio-visual algorithms. Our approach entails a meticulous examination and enhancement of key components in the audio-visual human-machine interaction pipeline; we investigate and make contributions to fundamental aspects of audio-visual technology in Active Speaker Detection and Audio-visual Speech Recognition tasks. By tackling these critical building blocks, we aim to bridge the gap between the vast potential of audio-visual algorithms and their practical application in embedded systems. Our research introduces efficient models in Active Speaker Detection. On the one hand, our novel audio-visual fusion strategy yields significant improvements over other state-of-the-art systems, featuring a relatively simpler model. On the other hand, we explore neural architecture search, resulting in the development of a compact yet efficient architecture for the Active Speaker Detection problem. Furthermore, we present our work on audio-visual speech recognition, with a specific emphasis on keyword spotting. Our main contribution targets the visual aspect of speech recognition with a graph-based approach designed to streamline the visual processing pipeline, promising simpler audio-visual recognition systems.
Books on the topic "Active speaker detection"
Kelly, Carla. Miss Milton speaks her mind. Seattle, WA: Camel Press, 2014.
Francis, Dick. Banker. Oxford: Heinemann, 1992.
Escott, John. Girl on a motorcycle. Oxford: Oxford University Press, 2008.
Doyle, Arthur Conan. The Sign of Four. Peterborough, Ont: Broadview Press, 2001.
Doyle, Arthur Conan. The sign of four = El signo de cuatro. London: Heinemann Educational, 1985.
Chee, Traci. Speaker. Penguin Young Readers Group, 2018.
Chee, Traci. The speaker. 2017.
Bankir: Detektivnyĭ roman. [Moskva]: EKSMO-Press, 1999.
Doyle, Arthur Conan. Sign of the Four (1890), Also Called the Sign of Four, Is the Second Novel Featuring Sherlock Holmes Written by Sir Arthur Conan Doyle. Doyle Wrote Four Novels and 56 Short Stories Featuring the Fictional Detective. Independently Published, 2021.
Book chapters on the topic "Active speaker detection"
Alcázar, Juan León, Moritz Cordes, Chen Zhao, and Bernard Ghanem. "End-to-End Active Speaker Detection." In Lecture Notes in Computer Science, 126–43. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19836-6_8.
Chakravarty, Punarjay, and Tinne Tuytelaars. "Cross-Modal Supervision for Learning Active Speaker Detection in Video." In Computer Vision – ECCV 2016, 285–301. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-46454-1_18.
Min, Kyle, Sourya Roy, Subarna Tripathi, Tanaya Guha, and Somdeb Majumdar. "Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection." In Lecture Notes in Computer Science, 371–87. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19833-5_22.
Yang, Yatao, and Siyu Yan. "Cross-Modal Active Speaker Detection Algorithm in Video and End-To-End Landing Solution." In Lecture Notes in Electrical Engineering, 313–23. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2200-6_29.
Pallavi, C., Girija R, Vedhapriyavadhana R, Barnali Dey, and Rajiv Vincent. "A Relative Investigation of Various Algorithms for Online Financial Fraud Detection Techniques." In Recent Trends in Intensive Computing. IOS Press, 2021. http://dx.doi.org/10.3233/apc210174.
Dalton, Gene, and Ann Devitt. "Gaeilge Gaming." In Computer-Assisted Language Learning, 1093–110. IGI Global, 2019. http://dx.doi.org/10.4018/978-1-5225-7663-1.ch052.
Wilkins, Heidi. "Talking Back: Voice in Screwball Comedy." In Talkies, Road Movies and Chick Flicks. Edinburgh University Press, 2016. http://dx.doi.org/10.3366/edinburgh/9781474406895.003.0002.
Cripps, Yvonne. "The Public Interest Disclosure Act 1998." In Freedom Of Expression And Freedom Of Information, 275–87. Oxford University Press, 2000. http://dx.doi.org/10.1093/oso/9780198268390.003.0018.
Skulacheva, Tatyana, Natalia Slioussar, Alexander Kostyuk, Anna Lipina, Emil Latypov, and Varvara Koroleva. "The Influence of Verse on Cognitive Processes: A Psycholinguistic Experiment." In Tackling the Toolkit: Plotting Poetry through Computational Literary Studies, 155–66. Institute of Czech Literature of the Czech Academy of Sciences, 2022. http://dx.doi.org/10.51305/icl.cz.9788076580336.10.
Conference papers on the topic "Active speaker detection"
Roth, Joseph, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy et al. "Ava Active Speaker: An Audio-Visual Dataset for Active Speaker Detection." In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020. http://dx.doi.org/10.1109/icassp40776.2020.9053900.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Vision-based Active Speaker Detection in Multiparty Interaction." In GLU 2017 International Workshop on Grounding Language Understanding. ISCA, 2017. http://dx.doi.org/10.21437/glu.2017-10.
Huang, Chong, and Kazuhito Koishida. "Improved Active Speaker Detection based on Optical Flow." In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2020. http://dx.doi.org/10.1109/cvprw50498.2020.00483.
Chakravarty, Punarjay, Jeroen Zegers, Tinne Tuytelaars, and Hugo Van hamme. "Active speaker detection with audio-visual co-training." In ICMI '16: International Conference on Multimodal Interaction. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2993148.2993172.
Kheradiya, Jatin, Sandeep Reddy C, and Rajesh Hegde. "Active Speaker Detection using audio-visual sensor array." In 2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2014. http://dx.doi.org/10.1109/isspit.2014.7300636.
Wuerkaixi, Abudukelimu, You Zhang, Zhiyao Duan, and Changshui Zhang. "Rethinking Audio-Visual Synchronization for Active Speaker Detection." In 2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2022. http://dx.doi.org/10.1109/mlsp55214.2022.9943352.
Jiang, Yidi, Ruijie Tao, Zexu Pan, and Haizhou Li. "Target Active Speaker Detection with Audio-visual Cues." In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-574.
Liao, Junhua, Haihan Duan, Kanghui Feng, Wanbing Zhao, Yanbing Yang, and Liangyin Chen. "A Light Weight Model for Active Speaker Detection." In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2023. http://dx.doi.org/10.1109/cvpr52729.2023.02196.
Alcazar, Juan Leon, Fabian Caba Heilbron, Ali K. Thabet, and Bernard Ghanem. "MAAS: Multi-modal Assignation for Active Speaker Detection." In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00033.
Madrigal, Francisco, Frederic Lerasle, Lionel Pibre, and Isabelle Ferrane. "Audio-Video detection of the active speaker in meetings." In 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021. http://dx.doi.org/10.1109/icpr48806.2021.9412681.
Reports from organizations on the topic "Active speaker detection"
Mizrach, Amos, Michal Mazor, Amots Hetzroni, Joseph Grinshpun, Richard Mankin, Dennis Shuman, Nancy Epsky, and Robert Heath. Male Song as a Tool for Trapping Female Medflies. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7586535.bard.