Ready-made bibliography on the topic "Automated audio captioning"
Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles
See lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Automated audio captioning".
An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically create a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a .pdf file and read the abstract of the work online, if the relevant details are available in the source's metadata.
Journal articles on the topic "Automated audio captioning"
Bokhove, Christian, and Christopher Downey. "Automated generation of ‘good enough’ transcripts as a first step to transcription of audio-recorded data". Methodological Innovations 11, no. 2 (May 2018): 205979911879074. http://dx.doi.org/10.1177/2059799118790743.
Koenecke, Allison, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Zion Mengesha, Connor Toups, John R. Rickford, Dan Jurafsky, and Sharad Goel. "Racial disparities in automated speech recognition". Proceedings of the National Academy of Sciences 117, no. 14 (March 23, 2020): 7684–89. http://dx.doi.org/10.1073/pnas.1915768117.
Mirzaei, Maryam Sadat, Kourosh Meshgi, Yuya Akita, and Tatsuya Kawahara. "Partial and synchronized captioning: A new tool to assist learners in developing second language listening skill". ReCALL 29, no. 2 (March 2, 2017): 178–99. http://dx.doi.org/10.1017/s0958344017000039.
Guo, Rundong. "Advancing real-time close captioning: blind source separation and transcription for hearing impairments". Applied and Computational Engineering 30, no. 1 (January 22, 2024): 125–30. http://dx.doi.org/10.54254/2755-2721/30/20230084.
Prabhala, Jagat Chaitanya, Venkatnareshbabu K, and Ragoju Ravi. "OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIARIZATION SYSTEMS: A MATHEMATICAL FORMULATION". Applied Mathematics and Sciences An International Journal (MathSJ) 10, no. 1/2 (June 26, 2023): 1–10. http://dx.doi.org/10.5121/mathsj.2023.10201.
Nam, Somang, and Deborah Fels. "Simulation of Subjective Closed Captioning Quality Assessment Using Prediction Models". International Journal of Semantic Computing 13, no. 01 (March 2019): 45–65. http://dx.doi.org/10.1142/s1793351x19400038.
Gotmare, Abhay, Gandharva Thite, and Laxmi Bewoor. "A multimodal machine learning approach to generate news articles from geo-tagged images". International Journal of Electrical and Computer Engineering (IJECE) 14, no. 3 (June 1, 2024): 3434. http://dx.doi.org/10.11591/ijece.v14i3.pp3434-3442.
Verma, Dr Neeta. "Assistive Vision Technology using Deep Learning Techniques". International Journal for Research in Applied Science and Engineering Technology 9, no. VII (July 31, 2021): 2695–704. http://dx.doi.org/10.22214/ijraset.2021.36815.
Eren, Aysegul Ozkaya, and Mustafa Sert. "Automated Audio Captioning with Topic Modeling". IEEE Access, 2023, 1. http://dx.doi.org/10.1109/access.2023.3235733.
Xiao, Feiyang, Jian Guan, Qiaoxi Zhu, and Wenwu Wang. "Graph Attention for Automated Audio Captioning". IEEE Signal Processing Letters, 2023, 1–5. http://dx.doi.org/10.1109/lsp.2023.3266114.
Doctoral dissertations on the topic "Automated audio captioning"
Labbé, Etienne. "Description automatique des événements sonores par des méthodes d'apprentissage profond". Electronic Thesis or Diss., Université de Toulouse (2023-....), 2024. http://www.theses.fr/2024TLSES054.
In the audio research field, the majority of machine learning systems focus on recognizing a limited number of sound events. However, when a machine interacts with real data, it must be able to handle much more varied and complex situations. To tackle this problem, annotators use natural language, which allows any sound information to be summarized. Automated Audio Captioning (AAC) was introduced recently to develop systems capable of automatically producing a description of any type of sound in text form. This task concerns all kinds of sound events such as environmental, urban, and domestic sounds, sound effects, music, and speech. Such a system could be used by people who are deaf or hard of hearing, and could improve the indexing of large audio databases. In the first part of this thesis, we present the state of the art of the AAC task through a global description of public datasets, learning methods, architectures, and evaluation metrics. Using this knowledge, we then present the architecture of our first AAC system, which obtains encouraging scores on the main AAC metric, SPIDEr: 24.7% on the Clotho corpus and 40.1% on the AudioCaps corpus. In the second part, we explore many aspects of AAC systems. We first focus on evaluation methods through a study of SPIDEr. For this, we propose a variant called SPIDEr-max, which considers several candidate captions for each audio file, and which shows that the SPIDEr metric is very sensitive to the predicted words. Then, we improve our reference system by exploring different architectures and numerous hyperparameters to exceed the state of the art on AudioCaps (SPIDEr of 49.5%). Next, we explore a multi-task learning method aimed at improving the semantics of the sentences generated by our system. Finally, we build a general and unbiased AAC system called CONETTE, which can generate different types of descriptions that approximate those of the target datasets.
In the third and last part, we study the capability of an AAC system to automatically search for audio content in a database. Our approach obtains scores competitive with systems dedicated to this task, while using fewer parameters. We also introduce semi-supervised methods to improve our system using new unlabeled audio data, and we show how pseudo-label generation can affect an AAC model. Finally, we study AAC systems in languages other than English: French, Spanish, and German. In addition, we propose a system capable of producing captions in all four languages at the same time, and we compare it with systems specialized in each language.
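The SPIDEr-max variant described in the abstract (taking the best of several candidate captions per audio file rather than scoring only the top-1 prediction) can be sketched as follows. This is an illustrative reconstruction, not the thesis's actual implementation, and `toy_score` is a hypothetical stand-in for the real SPIDEr metric (which averages SPICE and CIDEr-D):

```python
def spider_max(candidates_per_audio, score_fn):
    """SPIDEr-max sketch: for each audio item, score every candidate caption
    against the references and keep the best score; average over the corpus."""
    per_audio_best = []
    for candidates, references in candidates_per_audio:
        best = max(score_fn(c, references) for c in candidates)
        per_audio_best.append(best)
    return sum(per_audio_best) / len(per_audio_best)

def toy_score(candidate, references):
    """Hypothetical stand-in metric: fraction of reference words that appear
    in the candidate caption (NOT the real SPIDEr computation)."""
    ref_words = {w for r in references for w in r.split()}
    cand_words = set(candidate.split())
    return len(cand_words & ref_words) / max(len(ref_words), 1)

# Two candidate captions for one audio file, one human reference caption.
data = [(["a dog barks loudly", "rain falls"], ["a dog barking loudly outside"])]
print(round(spider_max(data, toy_score), 3))  # prints 0.6
```

The point of the variant is visible even in this toy: a system whose top-1 caption is poor but whose second candidate is good is penalized by plain SPIDEr, while SPIDEr-max credits the best candidate in the beam.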
Book chapters on the topic "Automated audio captioning"
M., Nivedita, AsnathVictyPhamila Y., Umashankar Kumaravelan, and Karthikeyan N. "Voice-Based Image Captioning System for Assisting Visually Impaired People Using Neural Networks". In Principles and Applications of Socio-Cognitive and Affective Computing, 177–99. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-6684-3843-5.ch011.
Venturini, Shamira, Michaela Mae Vann, Martina Pucci, and Giulia M. L. Bencini. "Towards a More Inclusive Learning Environment: The Importance of Providing Captions That Are Suited to Learners’ Language Proficiency in the UDL Classroom". In Studies in Health Technology and Informatics. IOS Press, 2022. http://dx.doi.org/10.3233/shti220884.
Pełny tekst źródłaStreszczenia konferencji na temat "Automated audio captioning"
Kim, Minkyu, Kim Sung-Bin, and Tae-Hyun Oh. "Prefix Tuning for Automated Audio Captioning". In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023. http://dx.doi.org/10.1109/icassp49357.2023.10096877.
Drossos, Konstantinos, Sharath Adavanne, and Tuomas Virtanen. "Automated audio captioning with recurrent neural networks". In 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). IEEE, 2017. http://dx.doi.org/10.1109/waspaa.2017.8170058.
Chen, Chen, Nana Hou, Yuchen Hu, Heqing Zou, Xiaofeng Qi, and Eng Siong Chng. "Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning". In Interspeech 2022. ISCA, 2022. http://dx.doi.org/10.21437/interspeech.2022-10510.
Kim, Jaeyeon, Jaeyoon Jung, Jinjoo Lee, and Sang Hoon Woo. "EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning". In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024. http://dx.doi.org/10.1109/icassp48485.2024.10446672.
Ye, Zhongjie, Yuqing Wang, Helin Wang, Dongchao Yang, and Yuexian Zou. "FeatureCut: An Adaptive Data Augmentation for Automated Audio Captioning". In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980325.
Koh, Andrew, Soham Tiwari, and Chng Eng Siong. "Automated Audio Captioning with Epochal Difficult Captions for curriculum learning". In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980242.
Wijngaard, Gijs, Elia Formisano, Bruno L. Giordano, and Michel Dumontier. "ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds". In 2023 31st European Signal Processing Conference (EUSIPCO). IEEE, 2023. http://dx.doi.org/10.23919/eusipco58844.2023.10289793.
Jain, Arushi, Navaneeth B. R, Shelly Mohanty, R. Sujatha, Sujatha R, Sourabh Tiwari, and Rashmi T. Shankarappa. "Web Framework for Enhancing Automated Audio Captioning Performance for Domestic Environment". In 2022 13th International Conference on Computing Communication and Networking Technologies (ICCCNT). IEEE, 2022. http://dx.doi.org/10.1109/icccnt54827.2022.9984255.
Sun, Jianyuan, Xubo Liu, Xinhao Mei, Volkan Kılıç, Mark D. Plumbley, and Wenwu Wang. "Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning". In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-943.
Xu, Xuenan, Heinrich Dinkel, Mengyue Wu, Zeyu Xie, and Kai Yu. "Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning". In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413982.