Academic literature on the topic "Automated audio captioning"
Create an accurate citation in APA, MLA, Chicago, Harvard, and other styles
Consult the lists of relevant articles, books, theses, conference papers, and other academic sources on the topic "Automated audio captioning".
Journal articles on the topic "Automated audio captioning"
Bokhove, Christian, and Christopher Downey. "Automated generation of ‘good enough’ transcripts as a first step to transcription of audio-recorded data". Methodological Innovations 11, no. 2 (May 2018): 205979911879074. http://dx.doi.org/10.1177/2059799118790743.
Koenecke, Allison, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Zion Mengesha, Connor Toups, John R. Rickford, Dan Jurafsky, and Sharad Goel. "Racial disparities in automated speech recognition". Proceedings of the National Academy of Sciences 117, no. 14 (March 23, 2020): 7684–89. http://dx.doi.org/10.1073/pnas.1915768117.
Mirzaei, Maryam Sadat, Kourosh Meshgi, Yuya Akita, and Tatsuya Kawahara. "Partial and synchronized captioning: A new tool to assist learners in developing second language listening skill". ReCALL 29, no. 2 (March 2, 2017): 178–99. http://dx.doi.org/10.1017/s0958344017000039.
Guo, Rundong. "Advancing real-time close captioning: blind source separation and transcription for hearing impairments". Applied and Computational Engineering 30, no. 1 (January 22, 2024): 125–30. http://dx.doi.org/10.54254/2755-2721/30/20230084.
Prabhala, Jagat Chaitanya, Venkatnareshbabu K, and Ragoju Ravi. "Optimizing Similarity Threshold for Abstract Similarity Metric in Speech Diarization Systems: A Mathematical Formulation". Applied Mathematics and Sciences An International Journal (MathSJ) 10, no. 1/2 (June 26, 2023): 1–10. http://dx.doi.org/10.5121/mathsj.2023.10201.
Nam, Somang, and Deborah Fels. "Simulation of Subjective Closed Captioning Quality Assessment Using Prediction Models". International Journal of Semantic Computing 13, no. 01 (March 2019): 45–65. http://dx.doi.org/10.1142/s1793351x19400038.
Gotmare, Abhay, Gandharva Thite, and Laxmi Bewoor. "A multimodal machine learning approach to generate news articles from geo-tagged images". International Journal of Electrical and Computer Engineering (IJECE) 14, no. 3 (June 1, 2024): 3434. http://dx.doi.org/10.11591/ijece.v14i3.pp3434-3442.
Verma, Neeta. "Assistive Vision Technology using Deep Learning Techniques". International Journal for Research in Applied Science and Engineering Technology 9, no. VII (July 31, 2021): 2695–704. http://dx.doi.org/10.22214/ijraset.2021.36815.
Eren, Aysegul Ozkaya, and Mustafa Sert. "Automated Audio Captioning with Topic Modeling". IEEE Access, 2023, 1. http://dx.doi.org/10.1109/access.2023.3235733.
Xiao, Feiyang, Jian Guan, Qiaoxi Zhu, and Wenwu Wang. "Graph Attention for Automated Audio Captioning". IEEE Signal Processing Letters, 2023, 1–5. http://dx.doi.org/10.1109/lsp.2023.3266114.
Texto completoTesis sobre el tema "Automated audio captioning"
Labbé, Etienne. "Description automatique des événements sonores par des méthodes d'apprentissage profond". Electronic Thesis or Diss., Université de Toulouse (2023-....), 2024. http://www.theses.fr/2024TLSES054.
In the audio research field, the majority of machine learning systems focus on recognizing a limited number of sound events. However, when a machine interacts with real data, it must be able to handle much more varied and complex situations. To tackle this problem, annotators use natural language, which allows any sound information to be summarized. Automated Audio Captioning (AAC) was introduced recently to develop systems capable of automatically producing a description of any type of sound in text form. This task concerns all kinds of sound events such as environmental, urban, domestic sounds, sound effects, music or speech. This type of system could be used by people who are deaf or hard of hearing, and could improve the indexing of large audio databases. In the first part of this thesis, we present the state of the art of the AAC task through a global description of public datasets, learning methods, architectures and evaluation metrics. Using this knowledge, we then present the architecture of our first AAC system, which obtains encouraging scores on the main AAC metric, SPIDEr: 24.7% on the Clotho corpus and 40.1% on the AudioCaps corpus. In the second part, we explore many aspects of AAC systems. We first focus on evaluation methods through a study of SPIDEr. For this, we propose a variant called SPIDEr-max, which considers several candidates for each audio file, and which shows that the SPIDEr metric is very sensitive to the predicted words. We then improve our reference system by exploring different architectures and numerous hyper-parameters to exceed the state of the art on AudioCaps (SPIDEr of 49.5%). Next, we explore a multi-task learning method aimed at improving the semantics of the sentences generated by our system. Finally, we build a general and unbiased AAC system called CONETTE, which can generate different types of descriptions that approximate those of the target datasets.
In the third and last part, we study the capabilities of an AAC system to automatically search for audio content in a database. Our approach achieves scores competitive with systems dedicated to this task, while using fewer parameters. We also introduce semi-supervised methods to improve our system using new unlabeled audio data, and we show how pseudo-label generation can impact an AAC model. Finally, we study AAC systems in languages other than English: French, Spanish and German. In addition, we propose a system capable of producing all four languages at the same time, and we compare it with systems specialized in each language.
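The SPIDEr-max variant described in the abstract above can be illustrated with a short sketch: instead of scoring only the single caption the decoder returns, every candidate (e.g., all beam-search hypotheses) is scored against the references, and the best score per clip is kept before averaging over the corpus. The scorer below is a deliberately toy stand-in for a real SPIDEr implementation (which averages SPICE and CIDEr-D); the function names and the overlap-based `toy_score` are hypothetical, for illustration only.

```python
from typing import Callable, Sequence


def spider_max(
    candidates_per_audio: Sequence[Sequence[str]],
    references_per_audio: Sequence[Sequence[str]],
    score_fn: Callable[[str, Sequence[str]], float],
) -> float:
    """Corpus-level SPIDEr-max: mean over clips of the best candidate's score."""
    per_clip_best = [
        max(score_fn(cand, refs) for cand in candidates)
        for candidates, refs in zip(candidates_per_audio, references_per_audio)
    ]
    return sum(per_clip_best) / len(per_clip_best)


def toy_score(candidate: str, references: Sequence[str]) -> float:
    # Toy proxy metric (NOT SPIDEr): fraction of reference tokens the
    # candidate covers. A real system would call a SPIDEr scorer here.
    cand_tokens = set(candidate.split())
    ref_tokens = {word for ref in references for word in ref.split()}
    return len(cand_tokens & ref_tokens) / len(ref_tokens)


# One clip with two beam-search candidates and two reference captions:
best = spider_max(
    [["a dog barks loudly", "a cat meows"]],
    [["a dog barks", "loud barking of a dog"]],
    toy_score,
)
print(best)  # the better candidate covers 3 of 6 reference tokens -> 0.5
```

Taking the max over candidates is what reveals the sensitivity the thesis reports: a system whose top-1 caption scores poorly may still have a near-perfect caption among its other hypotheses.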
Book chapters on the topic "Automated audio captioning"
M., Nivedita, AsnathVictyPhamila Y., Umashankar Kumaravelan, and Karthikeyan N. "Voice-Based Image Captioning System for Assisting Visually Impaired People Using Neural Networks". In Principles and Applications of Socio-Cognitive and Affective Computing, 177–99. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-6684-3843-5.ch011.
Venturini, Shamira, Michaela Mae Vann, Martina Pucci, and Giulia M. L. Bencini. "Towards a More Inclusive Learning Environment: The Importance of Providing Captions That Are Suited to Learners’ Language Proficiency in the UDL Classroom". In Studies in Health Technology and Informatics. IOS Press, 2022. http://dx.doi.org/10.3233/shti220884.
Texto completoActas de conferencias sobre el tema "Automated audio captioning"
Kim, Minkyu, Kim Sung-Bin, and Tae-Hyun Oh. "Prefix Tuning for Automated Audio Captioning". In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023. http://dx.doi.org/10.1109/icassp49357.2023.10096877.
Drossos, Konstantinos, Sharath Adavanne, and Tuomas Virtanen. "Automated audio captioning with recurrent neural networks". In 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). IEEE, 2017. http://dx.doi.org/10.1109/waspaa.2017.8170058.
Chen, Chen, Nana Hou, Yuchen Hu, Heqing Zou, Xiaofeng Qi, and Eng Siong Chng. "Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning". In Interspeech 2022. ISCA, 2022. http://dx.doi.org/10.21437/interspeech.2022-10510.
Kim, Jaeyeon, Jaeyoon Jung, Jinjoo Lee, and Sang Hoon Woo. "EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning". In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024. http://dx.doi.org/10.1109/icassp48485.2024.10446672.
Ye, Zhongjie, Yuqing Wang, Helin Wang, Dongchao Yang, and Yuexian Zou. "FeatureCut: An Adaptive Data Augmentation for Automated Audio Captioning". In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980325.
Koh, Andrew, Soham Tiwari, and Chng Eng Siong. "Automated Audio Captioning with Epochal Difficult Captions for curriculum learning". In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980242.
Wijngaard, Gijs, Elia Formisano, Bruno L. Giordano, and Michel Dumontier. "ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds". In 2023 31st European Signal Processing Conference (EUSIPCO). IEEE, 2023. http://dx.doi.org/10.23919/eusipco58844.2023.10289793.
Jain, Arushi, Navaneeth B. R, Shelly Mohanty, R. Sujatha, Sujatha R, Sourabh Tiwari, and Rashmi T. Shankarappa. "Web Framework for Enhancing Automated Audio Captioning Performance for Domestic Environment". In 2022 13th International Conference on Computing Communication and Networking Technologies (ICCCNT). IEEE, 2022. http://dx.doi.org/10.1109/icccnt54827.2022.9984255.
Sun, Jianyuan, Xubo Liu, Xinhao Mei, Volkan Kılıç, Mark D. Plumbley, and Wenwu Wang. "Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning". In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-943.
Xu, Xuenan, Heinrich Dinkel, Mengyue Wu, Zeyu Xie, and Kai Yu. "Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning". In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413982.