Ready-made bibliography on the topic "Active speaker detection"
Lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Active speaker detection".
Journal articles on the topic "Active speaker detection"
Assunção, Gustavo, Nuno Gonçalves, and Paulo Menezes. "Bio-Inspired Modality Fusion for Active Speaker Detection." Applied Sciences 11, no. 8 (April 10, 2021): 3397. http://dx.doi.org/10.3390/app11083397.
Pu, Jie, Yannis Panagakis, and Maja Pantic. "Active Speaker Detection and Localization in Videos Using Low-Rank and Kernelized Sparsity." IEEE Signal Processing Letters 27 (2020): 865–69. http://dx.doi.org/10.1109/lsp.2020.2996412.
Lindstrom, Fredric, Keni Ren, Kerstin Persson Waye, and Haibo Li. "A comparison of two active-speaker-detection methods suitable for usage in noise dosimeter measurements." Journal of the Acoustical Society of America 123, no. 5 (May 2008): 3527. http://dx.doi.org/10.1121/1.2934471.
Zhu, Ying-Xin, and Hao-Ran Jin. "Speaker Localization Based on Audio-Visual Bimodal Fusion." Journal of Advanced Computational Intelligence and Intelligent Informatics 25, no. 3 (May 20, 2021): 375–82. http://dx.doi.org/10.20965/jaciii.2021.p0375.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially Aware Language Acquisition." IEEE Transactions on Cognitive and Developmental Systems 12, no. 2 (June 2020): 250–59. http://dx.doi.org/10.1109/tcds.2019.2927941.
DAI, Hai, Kean CHEN, Yang WANG, and Haoxin YU. "Fault detection method of secondary sound source in ANC system based on impedance characteristics." Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 40, no. 6 (December 2022): 1242–49. http://dx.doi.org/10.1051/jnwpu/20224061242.
Ahmad, Zubair, Alquhayz, and Ditta. "Multimodal Speaker Diarization Using a Pre-Trained Audio-Visual Synchronization Model." Sensors 19, no. 23 (November 25, 2019): 5163. http://dx.doi.org/10.3390/s19235163.
Wang, Shaolei, Zhongyuan Wang, Wanxiang Che, Sendong Zhao, and Ting Liu. "Combining Self-supervised Learning and Active Learning for Disfluency Detection." ACM Transactions on Asian and Low-Resource Language Information Processing 21, no. 3 (May 31, 2022): 1–25. http://dx.doi.org/10.1145/3487290.
Maltezou-Papastylianou, Constantina, Riccardo Russo, Denise Wallace, Chelsea Harmsworth, and Silke Paulmann. "Different stages of emotional prosody processing in healthy ageing–evidence from behavioural responses, ERPs, tDCS, and tRNS." PLOS ONE 17, no. 7 (July 21, 2022): e0270934. http://dx.doi.org/10.1371/journal.pone.0270934.
Lahemer, Elfituri S. F., and Ahmad Rad. "An Audio-Based SLAM for Indoor Environments: A Robotic Mixed Reality Presentation." Sensors 24, no. 9 (April 27, 2024): 2796. http://dx.doi.org/10.3390/s24092796.
Dissertations on the topic "Active speaker detection"
Li, Yi. "Speaker Diarization System for Call-center data". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286677.
To answer the question of who spoke when, speaker diarization (SD) is a critical step for many speech applications in practice. The aim of our project is to build an MFCC-vector-based speaker diarization system on top of a speaker verification (SV) system, an existing call-center application that checks a customer's identity from a phone call. Our speaker diarization system uses 13-dimensional MFCCs as features and performs voice activity detection (VAD), segmentation, linear clustering, and hierarchical clustering based on GMM and BIC scores. By applying it, we reduce the EER (Equal Error Rate) from 18.1% in the baseline experiment to 3.26% for the general call-center conversations. To better analyze and evaluate the system, we also simulated a set of call-center data based on the public ICSI corpus audio databases.
Pouthier, Baptiste. "Apprentissage profond et statistique sur données audiovisuelles dédié aux systèmes embarqués pour l'interface homme-machine". Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4019.
In the rapidly evolving landscape of human-machine interfaces, deep learning has been nothing short of revolutionary. It has ushered in a new era of audio-visual algorithms, which, in turn, have expanded the horizons of potential applications and strengthened the performance of traditional systems. However, these remarkable advancements come with a caveat - many of these algorithms are computationally demanding, rendering their integration onto embedded devices a formidable task. The primary focus of this thesis is to surmount this limitation through a comprehensive optimization effort, addressing the critical factors of latency and accuracy in audio-visual algorithms. Our approach entails a meticulous examination and enhancement of key components in the audio-visual human-machine interaction pipeline; we investigate and make contributions to fundamental aspects of audio-visual technology in Active Speaker Detection and Audio-visual Speech Recognition tasks. By tackling these critical building blocks, we aim to bridge the gap between the vast potential of audio-visual algorithms and their practical application in embedded systems. Our research introduces efficient models in Active Speaker Detection. On the one hand, our novel audio-visual fusion strategy yields significant improvements over other state-of-the-art systems, featuring a relatively simpler model. On the other hand, we explore neural architecture search, resulting in the development of a compact yet efficient architecture for the Active Speaker Detection problem. Furthermore, we present our work on audio-visual speech recognition, with a specific emphasis on keyword spotting. Our main contribution targets the visual aspect of speech recognition with a graph-based approach designed to streamline the visual processing pipeline, promising simpler audio-visual recognition systems
Books on the topic "Active speaker detection"
Kelly, Carla. Miss Milton speaks her mind. Seattle, WA: Camel Press, 2014.
Francis, Dick. Banker. Oxford: Heinemann, 1992.
Escott, John. Girl on a motorcycle. Oxford: Oxford University Press, 2008.
Conan, Doyle Arthur. The Sign of Four. Peterborough, Ont: Broadview Press, 2001.
Conan, Doyle Arthur. The sign of four =: El signo de cuatro. London: Heinemann Educational, 1985.
Chee, Traci. Speaker. Penguin Young Readers Group, 2018.
Chee, Traci. The speaker. 2017.
Bankir: Detektivnyĭ roman. [Moskva]: EKSMO-Press, 1999.
Conan, Doyle A. Sign of the Four (1890), Also Called the Sign of Four, Is the Second Novel Featuring Sherlock Holmes Written by Sir Arthur Conan Doyle. Doyle Wrote Four Novels and 56 Short Stories Featuring the Fictional Detective. Independently Published, 2021.
Book chapters on the topic "Active speaker detection"
Alcázar, Juan León, Moritz Cordes, Chen Zhao, and Bernard Ghanem. "End-to-End Active Speaker Detection." In Lecture Notes in Computer Science, 126–43. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19836-6_8.
Chakravarty, Punarjay, and Tinne Tuytelaars. "Cross-Modal Supervision for Learning Active Speaker Detection in Video." In Computer Vision – ECCV 2016, 285–301. Cham: Springer International Publishing, 2016. http://dx.doi.org/10.1007/978-3-319-46454-1_18.
Min, Kyle, Sourya Roy, Subarna Tripathi, Tanaya Guha, and Somdeb Majumdar. "Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection." In Lecture Notes in Computer Science, 371–87. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-19833-5_22.
Yang, Yatao, and Siyu Yan. "Cross-Modal Active Speaker Detection Algorithm in Video and End-To-End Landing Solution." In Lecture Notes in Electrical Engineering, 313–23. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2200-6_29.
Pallavi, C., Girija R, Vedhapriyavadhana R, Barnali Dey, and Rajiv Vincent. "A Relative Investigation of Various Algorithms for Online Financial Fraud Detection Techniques." In Recent Trends in Intensive Computing. IOS Press, 2021. http://dx.doi.org/10.3233/apc210174.
Dalton, Gene, and Ann Devitt. "Gaeilge Gaming." In Computer-Assisted Language Learning, 1093–110. IGI Global, 2019. http://dx.doi.org/10.4018/978-1-5225-7663-1.ch052.
Wilkins, Heidi. "Talking Back: Voice in Screwball Comedy." In Talkies, Road Movies and Chick Flicks. Edinburgh University Press, 2016. http://dx.doi.org/10.3366/edinburgh/9781474406895.003.0002.
Cripps, Yvonne. "The Public Interest Disclosure Act 1998." In Freedom Of Expression And Freedom Of Information, 275–87. Oxford University Press, 2000. http://dx.doi.org/10.1093/oso/9780198268390.003.0018.
Skulacheva, Tatyana, Natalia Slioussar, Alexander Kostyuk, Anna Lipina, Emil Latypov, and Varvara Koroleva. "The Influence of Verse on Cognitive Processes: A Psycholinguistic Experiment." In Tackling the Toolkit: Plotting Poetry through Computational Literary Studies, 155–66. Institute of Czech Literature of the Czech Academy of Sciences, 2022. http://dx.doi.org/10.51305/icl.cz.9788076580336.10.
Conference abstracts on the topic "Active speaker detection"
Roth, Joseph, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy, et al. "Ava Active Speaker: An Audio-Visual Dataset for Active Speaker Detection." In ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020. http://dx.doi.org/10.1109/icassp40776.2020.9053900.
Stefanov, Kalin, Jonas Beskow, and Giampiero Salvi. "Vision-based Active Speaker Detection in Multiparty Interaction." In GLU 2017 International Workshop on Grounding Language Understanding. ISCA, 2017. http://dx.doi.org/10.21437/glu.2017-10.
Huang, Chong, and Kazuhito Koishida. "Improved Active Speaker Detection based on Optical Flow." In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2020. http://dx.doi.org/10.1109/cvprw50498.2020.00483.
Chakravarty, Punarjay, Jeroen Zegers, Tinne Tuytelaars, and Hugo Van hamme. "Active speaker detection with audio-visual co-training." In ICMI '16: International Conference on Multimodal Interaction. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2993148.2993172.
Kheradiya, Jatin, Sandeep Reddy C, and Rajesh Hegde. "Active Speaker Detection using audio-visual sensor array." In 2014 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT). IEEE, 2014. http://dx.doi.org/10.1109/isspit.2014.7300636.
Wuerkaixi, Abudukelimu, You Zhang, Zhiyao Duan, and Changshui Zhang. "Rethinking Audio-Visual Synchronization for Active Speaker Detection." In 2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing (MLSP). IEEE, 2022. http://dx.doi.org/10.1109/mlsp55214.2022.9943352.
Jiang, Yidi, Ruijie Tao, Zexu Pan, and Haizhou Li. "Target Active Speaker Detection with Audio-visual Cues." In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-574.
Liao, Junhua, Haihan Duan, Kanghui Feng, Wanbing Zhao, Yanbing Yang, and Liangyin Chen. "A Light Weight Model for Active Speaker Detection." In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2023. http://dx.doi.org/10.1109/cvpr52729.2023.02196.
Alcazar, Juan Leon, Fabian Caba Heilbron, Ali K. Thabet, and Bernard Ghanem. "MAAS: Multi-modal Assignation for Active Speaker Detection." In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00033.
Madrigal, Francisco, Frederic Lerasle, Lionel Pibre, and Isabelle Ferrane. "Audio-Video detection of the active speaker in meetings." In 2020 25th International Conference on Pattern Recognition (ICPR). IEEE, 2021. http://dx.doi.org/10.1109/icpr48806.2021.9412681.
Reports on the topic "Active speaker detection"
Mizrach, Amos, Michal Mazor, Amots Hetzroni, Joseph Grinshpun, Richard Mankin, Dennis Shuman, Nancy Epsky, and Robert Heath. Male Song as a Tool for Trapping Female Medflies. United States Department of Agriculture, December 2002. http://dx.doi.org/10.32747/2002.7586535.bard.