Selected scientific literature on the topic "Active speaker detection"
Journal articles on the topic "Active speaker detection"
Assunção, Gustavo, Nuno Gonçalves and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection". Applied Sciences 11, no. 8 (April 10, 2021): 3397. http://dx.doi.org/10.3390/app11083397.
Pu, Jie, Yannis Panagakis and Maja Pantic. "Active Speaker Detection and Localization in Videos Using Low-Rank and Kernelized Sparsity". IEEE Signal Processing Letters 27 (2020): 865–69. http://dx.doi.org/10.1109/lsp.2020.2996412.
Lindstrom, Fredric, Keni Ren, Kerstin Persson Waye and Haibo Li. "A comparison of two active‐speaker‐detection methods suitable for usage in noise dosimeter measurements". Journal of the Acoustical Society of America 123, no. 5 (May 2008): 3527. http://dx.doi.org/10.1121/1.2934471.
Zhu, Ying-Xin, and Hao-Ran Jin. "Speaker Localization Based on Audio-Visual Bimodal Fusion". Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 3 (May 20, 2021): 375–82. http://dx.doi.org/10.20965/jaciii.2021.p0375.
Stefanov, Kalin, Jonas Beskow and Giampiero Salvi. "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially Aware Language Acquisition". IEEE Transactions on Cognitive and Developmental Systems 12, no. 2 (June 2020): 250–59. http://dx.doi.org/10.1109/tcds.2019.2927941.
DAI, Hai, Kean CHEN, Yang WANG and Haoxin YU. "Fault detection method of secondary sound source in ANC system based on impedance characteristics". Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 40, no. 6 (December 2022): 1242–49. http://dx.doi.org/10.1051/jnwpu/20224061242.
Ahmad, Zubair, Alquhayz and Ditta. "Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model". Sensors 19, no. 23 (November 25, 2019): 5163. http://dx.doi.org/10.3390/s19235163.
Wang, Shaolei, Zhongyuan Wang, Wanxiang Che, Sendong Zhao and Ting Liu. "Combining Self-supervised Learning and Active Learning for Disfluency Detection". ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (May 31, 2022): 1–25. http://dx.doi.org/10.1145/3487290.
Maltezou-Papastylianou, Constantina, Riccardo Russo, Denise Wallace, Chelsea Harmsworth and Silke Paulmann. "Different stages of emotional prosody processing in healthy ageing–evidence from behavioural responses, ERPs, tDCS, and tRNS". PLOS ONE 17, no. 7 (July 21, 2022): e0270934. http://dx.doi.org/10.1371/journal.pone.0270934.
Lahemer, Elfituri S. F., and Ahmad Rad. "An Audio-Based SLAM for Indoor Environments: A Robotic Mixed Reality Presentation". Sensors 24, no. 9 (April 27, 2024): 2796. http://dx.doi.org/10.3390/s24092796.
Theses / dissertations on the topic "Active speaker detection"
Li, Yi. "Speaker Diarization System for Call-center data". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286677.
To answer the question of who spoke when, speaker diarization (SD) is a critical step for many speech applications in practice. The goal of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, an existing call-center application that verifies a customer's identity from a phone call. Our diarization system uses 13-dimensional MFCCs as features and performs voice activity detection (VAD), segmentation, linear clustering, and hierarchical clustering based on GMM and BIC scores. By applying it, we reduce the EER (Equal Error Rate) from 18.1% in the baseline experiment to 3.26% on the general call-center conversations. To better analyze and evaluate the system, we also simulated a set of call-center data based on the public ICSI corpus audio database.
Pouthier, Baptiste. "Apprentissage profond et statistique sur données audiovisuelles dédié aux systèmes embarqués pour l'interface homme-machine". Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4019.
In the rapidly evolving landscape of human-machine interfaces, deep learning has been nothing short of revolutionary. It has ushered in a new era of audio-visual algorithms, which, in turn, have expanded the horizons of potential applications and strengthened the performance of traditional systems. However, these remarkable advancements come with a caveat: many of these algorithms are computationally demanding, rendering their integration onto embedded devices a formidable task. The primary focus of this thesis is to surmount this limitation through a comprehensive optimization effort, addressing the critical factors of latency and accuracy in audio-visual algorithms. Our approach entails a meticulous examination and enhancement of key components in the audio-visual human-machine interaction pipeline; we investigate and make contributions to fundamental aspects of audio-visual technology in Active Speaker Detection and Audio-visual Speech Recognition tasks. By tackling these critical building blocks, we aim to bridge the gap between the vast potential of audio-visual algorithms and their practical application in embedded systems. Our research introduces efficient models in Active Speaker Detection. On the one hand, our novel audio-visual fusion strategy yields significant improvements over other state-of-the-art systems, featuring a relatively simpler model. On the other hand, we explore neural architecture search, resulting in the development of a compact yet efficient architecture for the Active Speaker Detection problem. Furthermore, we present our work on audio-visual speech recognition, with a specific emphasis on keyword spotting. Our main contribution targets the visual aspect of speech recognition with a graph-based approach designed to streamline the visual processing pipeline, promising simpler audio-visual recognition systems.
Books on the topic "Active speaker detection"
Kelly, Carla. Miss Milton speaks her mind. Seattle, WA: Camel Press, 2014.
Francis, Dick. Banker. Oxford: Heinemann, 1992.
Escott, John. Girl on a motorcycle. Oxford: Oxford University Press, 2008.
Conan, Doyle Arthur. The Sign of Four. Peterborough, Ont: Broadview Press, 2001.
Conan, Doyle Arthur. The sign of four =: El signo de cuatro. London: Heinemann Educational, 1985.
Chee, Traci. Speaker. Penguin Young Readers Group, 2018.
Chee, Traci. The speaker. 2017.
Bankir: Detektivnyĭ roman. [Moskva]: EKSMO-Press, 1999.
Conan, Doyle A. Sign of the Four (1890), Also Called the Sign of Four, Is the Second Novel Featuring Sherlock Holmes Written by Sir Arthur Conan Doyle. Doyle Wrote Four Novels and 56 Short Stories Featuring the Fictional Detective. Independently Published, 2021.
Book chapters on the topic "Active speaker detection"
Alcázar, Juan León, Moritz Cordes, Chen Zhao and Bernard Ghanem. "End-to-End Active Speaker Detection". In Lecture Notes in Computer Science, 126–43. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19836-6_8.
Chakravarty, Punarjay, and Tinne Tuytelaars. "Cross-Modal Supervision for Learning Active Speaker Detection in Video". In Computer Vision – ECCV 2016, 285–301. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-46454-1_18.
Min, Kyle, Sourya Roy, Subarna Tripathi, Tanaya Guha and Somdeb Majumdar. "Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection". In Lecture Notes in Computer Science, 371–87. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19833-5_22.
Yang, Yatao, and Siyu Yan. "Cross-Modal Active Speaker Detection Algorithm in Video and End-To-End Landing Solution". In Lecture Notes in Electrical Engineering, 313–23. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2200-6_29.
Pallavi, C., Girija R, Vedhapriyavadhana R, Barnali Dey and Rajiv Vincent. "A Relative Investigation of Various Algorithms for Online Financial Fraud Detection Techniques". In Recent Trends in Intensive Computing. IOS Press, 2021. http://dx.doi.org/10.3233/apc210174.
Dalton, Gene, and Ann Devitt. "Gaeilge Gaming". In Computer-Assisted Language Learning, 1093–110. IGI Global, 2019. http://dx.doi.org/10.4018/978-1-5225-7663-1.ch052.
Wilkins, Heidi. "Talking Back: Voice in Screwball Comedy". In Talkies, Road Movies and Chick Flicks. Edinburgh University Press, 2016. http://dx.doi.org/10.3366/edinburgh/9781474406895.003.0002.
Cripps, Yvonne. "The Public Interest Disclosure Act 1998". In Freedom Of Expression And Freedom Of Information, 275–87. Oxford: Oxford University Press, 2000. http://dx.doi.org/10.1093/oso/9780198268390.003.0018.
Skulacheva, Tatyana, Natalia Slioussar, Alexander Kostyuk, Anna Lipina, Emil Latypov and Varvara Koroleva. "The Influence of Verse on Cognitive Processes: A Psycholinguistic Experiment". In Tackling the Toolkit: Plotting Poetry through Computational Literary Studies, 155–66. Institute of Czech Literature of the Czech Academy of Sciences, 2022. http://dx.doi.org/10.51305/icl.cz.9788076580336.10.
Conference papers on the topic "Active speaker detection"
Roth, Joseph, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy et al. "Ava Active Speaker: An Audio-Visual Dataset for Active Speaker Detection". In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020. http://dx.doi.org/10.1109/icassp40776.2020.9053900.
Stefanov, Kalin, Jonas Beskow and Giampiero Salvi. "Vision-based Active Speaker Detection in Multiparty Interaction". In GLU 2017 International Workshop on Grounding Language Understanding. ISCA, 2017. http://dx.doi.org/10.21437/glu.2017-10.
Huang, Chong, and Kazuhito Koishida. "Improved Active Speaker Detection based on Optical Flow". In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2020. http://dx.doi.org/10.1109/cvprw50498.2020.00483.
Chakravarty, Punarjay, Jeroen Zegers, Tinne Tuytelaars and Hugo Van hamme. "Active speaker detection with audio-visual co-training". In ICMI '16: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2993148.2993172.
Kheradiya, Jatin, Sandeep Reddy C and Rajesh Hegde. "Active Speaker Detection using audio-visual sensor array". In 2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2014. http://dx.doi.org/10.1109/isspit.2014.7300636.
Wuerkaixi, Abudukelimu, You Zhang, Zhiyao Duan and Changshui Zhang. "Rethinking Audio-Visual Synchronization for Active Speaker Detection". In 2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2022. http://dx.doi.org/10.1109/mlsp55214.2022.9943352.
Jiang, Yidi, Ruijie Tao, Zexu Pan and Haizhou Li. "Target Active Speaker Detection with Audio-visual Cues". In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-574.
Liao, Junhua, Haihan Duan, Kanghui Feng, Wanbing Zhao, Yanbing Yang and Liangyin Chen. "A Light Weight Model for Active Speaker Detection". In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2023. http://dx.doi.org/10.1109/cvpr52729.2023.02196.
Alcazar, Juan Leon, Fabian Caba Heilbron, Ali K. Thabet and Bernard Ghanem. "MAAS: Multi-modal Assignation for Active Speaker Detection". In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00033.
Madrigal, Francisco, Frederic Lerasle, Lionel Pibre and Isabelle Ferrane. "Audio-Video detection of the active speaker in meetings". In 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021. http://dx.doi.org/10.1109/icpr48806.2021.9412681.
Reports of organizations on the topic "Active speaker detection"
Mizrach, Amos, Michal Mazor, Amots Hetzroni, Joseph Grinshpun, Richard Mankin, Dennis Shuman, Nancy Epsky and Robert Heath. Male Song as a Tool for Trapping Female Medflies. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7586535.bard.