Selected scientific literature on the topic "Active speaker detection"
Browse the list of current articles, books, theses, conference proceedings, and other scholarly sources relevant to the topic "Active speaker detection".
Journal articles on the topic "Active speaker detection"
Assunção, Gustavo, Nuno Gonçalves, and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection". Applied Sciences 11, no. 8 (10 April 2021): 3397. http://dx.doi.org/10.3390/app11083397.
Pu, Jie, Yannis Panagakis, and Maja Pantic. "Active Speaker Detection and Localization in Videos Using Low-Rank and Kernelized Sparsity". IEEE Signal Processing Letters 27 (2020): 865–69. http://dx.doi.org/10.1109/lsp.2020.2996412.
Lindstrom, Fredric, Keni Ren, Kerstin Persson Waye, and Haibo Li. "A comparison of two active‐speaker‐detection methods suitable for usage in noise dosimeter measurements". Journal of the Acoustical Society of America 123, no. 5 (May 2008): 3527. http://dx.doi.org/10.1121/1.2934471.
Zhu, Ying-Xin, and Hao-Ran Jin. "Speaker Localization Based on Audio-Visual Bimodal Fusion". Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 3 (20 May 2021): 375–82. http://dx.doi.org/10.20965/jaciii.2021.p0375.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially Aware Language Acquisition". IEEE Transactions on Cognitive and Developmental Systems 12, no. 2 (June 2020): 250–59. http://dx.doi.org/10.1109/tcds.2019.2927941.
Dai, Hai, Kean Chen, Yang Wang, and Haoxin Yu. "Fault detection method of secondary sound source in ANC system based on impedance characteristics". Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 40, no. 6 (December 2022): 1242–49. http://dx.doi.org/10.1051/jnwpu/20224061242.
Ahmad, Zubair, Alquhayz, and Ditta. "Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model". Sensors 19, no. 23 (25 November 2019): 5163. http://dx.doi.org/10.3390/s19235163.
Wang, Shaolei, Zhongyuan Wang, Wanxiang Che, Sendong Zhao, and Ting Liu. "Combining Self-supervised Learning and Active Learning for Disfluency Detection". ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (31 May 2022): 1–25. http://dx.doi.org/10.1145/3487290.
Maltezou-Papastylianou, Constantina, Riccardo Russo, Denise Wallace, Chelsea Harmsworth, and Silke Paulmann. "Different stages of emotional prosody processing in healthy ageing–evidence from behavioural responses, ERPs, tDCS, and tRNS". PLOS ONE 17, no. 7 (21 July 2022): e0270934. http://dx.doi.org/10.1371/journal.pone.0270934.
Lahemer, Elfituri S. F., and Ahmad Rad. "An Audio-Based SLAM for Indoor Environments: A Robotic Mixed Reality Presentation". Sensors 24, no. 9 (27 April 2024): 2796. http://dx.doi.org/10.3390/s24092796.
Theses on the topic "Active speaker detection"
Li, Yi. "Speaker Diarization System for Call-center data". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286677.
To answer the question of who spoke when, speaker diarization (SD) is a critical step for many speech applications in practice. The goal of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, an existing call-center application that checks a customer's identity from a phone call. Our speaker diarization system uses 13-dimensional MFCCs as features and performs voice activity detection (VAD), segmentation, linear clustering, and hierarchical clustering based on GMM and BIC scores. By applying it, we reduce the EER (Equal Error Rate) from 18.1% in the baseline experiment to 3.26% in the general call-center setting. To better analyze and evaluate the system, we also simulated a set of call-center data based on the public ICSI corpus audio database.
Pouthier, Baptiste. "Apprentissage profond et statistique sur données audiovisuelles dédié aux systèmes embarqués pour l'interface homme-machine". Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4019.
In the rapidly evolving landscape of human-machine interfaces, deep learning has been nothing short of revolutionary. It has ushered in a new era of audio-visual algorithms, which, in turn, have expanded the horizons of potential applications and strengthened the performance of traditional systems. However, these remarkable advancements come with a caveat: many of these algorithms are computationally demanding, rendering their integration onto embedded devices a formidable task. The primary focus of this thesis is to surmount this limitation through a comprehensive optimization effort, addressing the critical factors of latency and accuracy in audio-visual algorithms. Our approach entails a meticulous examination and enhancement of key components in the audio-visual human-machine interaction pipeline; we investigate and make contributions to fundamental aspects of audio-visual technology in Active Speaker Detection and Audio-visual Speech Recognition tasks. By tackling these critical building blocks, we aim to bridge the gap between the vast potential of audio-visual algorithms and their practical application in embedded systems. Our research introduces efficient models in Active Speaker Detection. On the one hand, our novel audio-visual fusion strategy yields significant improvements over other state-of-the-art systems, featuring a relatively simpler model. On the other hand, we explore neural architecture search, resulting in the development of a compact yet efficient architecture for the Active Speaker Detection problem. Furthermore, we present our work on audio-visual speech recognition, with a specific emphasis on keyword spotting. Our main contribution targets the visual aspect of speech recognition with a graph-based approach designed to streamline the visual processing pipeline, promising simpler audio-visual recognition systems.
Books on the topic "Active speaker detection"
Kelly, Carla. Miss Milton speaks her mind. Seattle, WA: Camel Press, 2014.
Francis, Dick. Banker. Oxford: Heinemann, 1992.
Escott, John. Girl on a motorcycle. Oxford: Oxford University Press, 2008.
Doyle, Arthur Conan. The Sign of Four. Peterborough, Ont: Broadview Press, 2001.
Doyle, Arthur Conan. The sign of four = El signo de cuatro. London: Heinemann Educational, 1985.
Chee, Traci. Speaker. Penguin Young Readers Group, 2018.
Chee, Traci. The speaker. 2017.
Bankir: Detektivnyĭ roman. [Moskva]: EKSMO-Press, 1999.
Doyle, Arthur Conan. Sign of the Four (1890), Also Called the Sign of Four, Is the Second Novel Featuring Sherlock Holmes Written by Sir Arthur Conan Doyle. Doyle Wrote Four Novels and 56 Short Stories Featuring the Fictional Detective. Independently Published, 2021.
Book chapters on the topic "Active speaker detection"
Alcázar, Juan León, Moritz Cordes, Chen Zhao, and Bernard Ghanem. "End-to-End Active Speaker Detection". In Lecture Notes in Computer Science, 126–43. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19836-6_8.
Chakravarty, Punarjay, and Tinne Tuytelaars. "Cross-Modal Supervision for Learning Active Speaker Detection in Video". In Computer Vision – ECCV 2016, 285–301. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-46454-1_18.
Min, Kyle, Sourya Roy, Subarna Tripathi, Tanaya Guha, and Somdeb Majumdar. "Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection". In Lecture Notes in Computer Science, 371–87. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19833-5_22.
Yang, Yatao, and Siyu Yan. "Cross-Modal Active Speaker Detection Algorithm in Video and End-To-End Landing Solution". In Lecture Notes in Electrical Engineering, 313–23. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2200-6_29.
Pallavi, C., Girija R, Vedhapriyavadhana R, Barnali Dey, and Rajiv Vincent. "A Relative Investigation of Various Algorithms for Online Financial Fraud Detection Techniques". In Recent Trends in Intensive Computing. IOS Press, 2021. http://dx.doi.org/10.3233/apc210174.
Dalton, Gene, and Ann Devitt. "Gaeilge Gaming". In Computer-Assisted Language Learning, 1093–110. IGI Global, 2019. http://dx.doi.org/10.4018/978-1-5225-7663-1.ch052.
Wilkins, Heidi. "Talking Back: Voice in Screwball Comedy". In Talkies, Road Movies and Chick Flicks. Edinburgh University Press, 2016. http://dx.doi.org/10.3366/edinburgh/9781474406895.003.0002.
Cripps, Yvonne. "The Public Interest Disclosure Act 1998". In Freedom Of Expression And Freedom Of Information, 275–87. Oxford: Oxford University Press, 2000. http://dx.doi.org/10.1093/oso/9780198268390.003.0018.
Skulacheva, Tatyana, Natalia Slioussar, Alexander Kostyuk, Anna Lipina, Emil Latypov, and Varvara Koroleva. "The Influence of Verse on Cognitive Processes: A Psycholinguistic Experiment". In Tackling the Toolkit: Plotting Poetry through Computational Literary Studies, 155–66. Institute of Czech Literature of the Czech Academy of Sciences, 2022. http://dx.doi.org/10.51305/icl.cz.9788076580336.10.
Conference papers on the topic "Active speaker detection"
Roth, Joseph, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy et al. "Ava Active Speaker: An Audio-Visual Dataset for Active Speaker Detection". In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020. http://dx.doi.org/10.1109/icassp40776.2020.9053900.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Vision-based Active Speaker Detection in Multiparty Interaction". In GLU 2017 International Workshop on Grounding Language Understanding. ISCA, 2017. http://dx.doi.org/10.21437/glu.2017-10.
Huang, Chong, and Kazuhito Koishida. "Improved Active Speaker Detection based on Optical Flow". In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2020. http://dx.doi.org/10.1109/cvprw50498.2020.00483.
Chakravarty, Punarjay, Jeroen Zegers, Tinne Tuytelaars, and Hugo Van hamme. "Active speaker detection with audio-visual co-training". In ICMI '16: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2993148.2993172.
Kheradiya, Jatin, Sandeep Reddy C, and Rajesh Hegde. "Active Speaker Detection using audio-visual sensor array". In 2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2014. http://dx.doi.org/10.1109/isspit.2014.7300636.
Wuerkaixi, Abudukelimu, You Zhang, Zhiyao Duan, and Changshui Zhang. "Rethinking Audio-Visual Synchronization for Active Speaker Detection". In 2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2022. http://dx.doi.org/10.1109/mlsp55214.2022.9943352.
Jiang, Yidi, Ruijie Tao, Zexu Pan, and Haizhou Li. "Target Active Speaker Detection with Audio-visual Cues". In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-574.
Liao, Junhua, Haihan Duan, Kanghui Feng, Wanbing Zhao, Yanbing Yang, and Liangyin Chen. "A Light Weight Model for Active Speaker Detection". In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2023. http://dx.doi.org/10.1109/cvpr52729.2023.02196.
Alcazar, Juan Leon, Fabian Caba Heilbron, Ali K. Thabet, and Bernard Ghanem. "MAAS: Multi-modal Assignation for Active Speaker Detection". In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00033.
Madrigal, Francisco, Frederic Lerasle, Lionel Pibre, and Isabelle Ferrane. "Audio-Video detection of the active speaker in meetings". In 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021. http://dx.doi.org/10.1109/icpr48806.2021.9412681.
Reports of organizations on the topic "Active speaker detection"
Mizrach, Amos, Michal Mazor, Amots Hetzroni, Joseph Grinshpun, Richard Mankin, Dennis Shuman, Nancy Epsky, and Robert Heath. Male Song as a Tool for Trapping Female Medflies. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7586535.bard.