Academic literature on the topic "Automated audio captioning"
Consult the thematic lists of journal articles, books, dissertations, conference proceedings and other academic sources on the topic "Automated audio captioning".
Journal articles on the topic "Automated audio captioning"
Bokhove, Christian, and Christopher Downey. "Automated generation of 'good enough' transcripts as a first step to transcription of audio-recorded data". Methodological Innovations 11, no. 2 (May 2018): 205979911879074. http://dx.doi.org/10.1177/2059799118790743.
Koenecke, Allison, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Zion Mengesha, Connor Toups, John R. Rickford, Dan Jurafsky and Sharad Goel. "Racial disparities in automated speech recognition". Proceedings of the National Academy of Sciences 117, no. 14 (23 March 2020): 7684–89. http://dx.doi.org/10.1073/pnas.1915768117.
Mirzaei, Maryam Sadat, Kourosh Meshgi, Yuya Akita and Tatsuya Kawahara. "Partial and synchronized captioning: A new tool to assist learners in developing second language listening skill". ReCALL 29, no. 2 (2 March 2017): 178–99. http://dx.doi.org/10.1017/s0958344017000039.
Guo, Rundong. "Advancing real-time close captioning: blind source separation and transcription for hearing impairments". Applied and Computational Engineering 30, no. 1 (22 January 2024): 125–30. http://dx.doi.org/10.54254/2755-2721/30/20230084.
Prabhala, Jagat Chaitanya, Venkatnareshbabu K and Ragoju Ravi. "Optimizing Similarity Threshold for Abstract Similarity Metric in Speech Diarization Systems: A Mathematical Formulation". Applied Mathematics and Sciences: An International Journal (MathSJ) 10, no. 1/2 (26 June 2023): 1–10. http://dx.doi.org/10.5121/mathsj.2023.10201.
Nam, Somang, and Deborah Fels. "Simulation of Subjective Closed Captioning Quality Assessment Using Prediction Models". International Journal of Semantic Computing 13, no. 01 (March 2019): 45–65. http://dx.doi.org/10.1142/s1793351x19400038.
Gotmare, Abhay, Gandharva Thite and Laxmi Bewoor. "A multimodal machine learning approach to generate news articles from geo-tagged images". International Journal of Electrical and Computer Engineering (IJECE) 14, no. 3 (1 June 2024): 3434. http://dx.doi.org/10.11591/ijece.v14i3.pp3434-3442.
Verma, Dr Neeta. "Assistive Vision Technology using Deep Learning Techniques". International Journal for Research in Applied Science and Engineering Technology 9, no. VII (31 July 2021): 2695–704. http://dx.doi.org/10.22214/ijraset.2021.36815.
Eren, Aysegul Ozkaya, and Mustafa Sert. "Automated Audio Captioning with Topic Modeling". IEEE Access, 2023, 1. http://dx.doi.org/10.1109/access.2023.3235733.
Xiao, Feiyang, Jian Guan, Qiaoxi Zhu and Wenwu Wang. "Graph Attention for Automated Audio Captioning". IEEE Signal Processing Letters, 2023, 1–5. http://dx.doi.org/10.1109/lsp.2023.3266114.
Texte intégralThèses sur le sujet "Automated audio captioning"
Labbé, Etienne. « Description automatique des événements sonores par des méthodes d'apprentissage profond ». Electronic Thesis or Diss., Université de Toulouse (2023-....), 2024. http://www.theses.fr/2024TLSES054.
In the audio research field, the majority of machine learning systems focus on recognizing a limited number of sound events. However, when a machine interacts with real data, it must be able to handle much more varied and complex situations. To tackle this problem, annotators use natural language, which allows any sound information to be summarized. Automated Audio Captioning (AAC) was introduced recently to develop systems capable of automatically producing a description of any type of sound in text form. This task concerns all kinds of sound events such as environmental, urban and domestic sounds, sound effects, music and speech. Such a system could be used by people who are deaf or hard of hearing, and could improve the indexing of large audio databases. In the first part of this thesis, we present the state of the art of the AAC task through a global description of public datasets, learning methods, architectures and evaluation metrics. Using this knowledge, we then present the architecture of our first AAC system, which obtains encouraging scores on the main AAC metric, named SPIDEr: 24.7% on the Clotho corpus and 40.1% on the AudioCaps corpus. In the second part, we explore many aspects of AAC systems. We first focus on evaluation methods through a study of SPIDEr. For this, we propose a variant called SPIDEr-max, which considers several candidates for each audio file and shows that the SPIDEr metric is very sensitive to the predicted words. We then improve our reference system by exploring different architectures and numerous hyper-parameters to exceed the state of the art on AudioCaps (SPIDEr of 49.5%). Next, we explore a multi-task learning method aimed at improving the semantics of the sentences generated by our system. Finally, we build a general and unbiased AAC system called CONETTE, which can generate different types of descriptions approximating those of the target datasets.
In the third and last part, we study the capabilities of an AAC system to automatically search for audio content in a database. Our approach obtains scores competitive with systems dedicated to this task, while using fewer parameters. We also introduce semi-supervised methods to improve our system using new unlabeled audio data, and we show how pseudo-label generation can impact an AAC model. Finally, we study AAC systems in languages other than English: French, Spanish and German. In addition, we propose a system capable of producing captions in all four languages at once, and we compare it with systems specialized in each language.
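The aggregation behind the SPIDEr-max variant described in this abstract can be sketched as follows. This is an illustrative reconstruction from the abstract only, not code from the thesis: SPIDEr is the mean of SPICE and CIDEr, and SPIDEr-max keeps the best-scoring of several candidate captions per audio file before averaging over the corpus. The function names and the numeric scores are invented for the example.

```python
def spider(spice: float, cider: float) -> float:
    """SPIDEr is defined as the average of the SPICE and CIDEr scores."""
    return 0.5 * (spice + cider)

def spider_max(candidate_scores: list[list[float]]) -> float:
    """SPIDEr-max (as described in the abstract): for each audio file,
    keep the best-scoring candidate caption, then average over the corpus."""
    per_file_best = [max(scores) for scores in candidate_scores]
    return sum(per_file_best) / len(per_file_best)

# Two hypothetical audio files, three candidate captions each,
# each already scored with SPIDEr.
scores = [[0.21, 0.35, 0.28],
          [0.40, 0.18, 0.33]]
print(spider(0.2, 0.5))    # 0.35
print(spider_max(scores))  # (0.35 + 0.40) / 2 = 0.375
```

Because the best candidate is selected per file, SPIDEr-max is never lower than the standard single-candidate score, which is what makes it useful for probing how sensitive SPIDEr is to individual word choices.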
Book chapters on the topic "Automated audio captioning"
M., Nivedita, AsnathVictyPhamila Y., Umashankar Kumaravelan and Karthikeyan N. "Voice-Based Image Captioning System for Assisting Visually Impaired People Using Neural Networks". In Principles and Applications of Socio-Cognitive and Affective Computing, 177–99. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-6684-3843-5.ch011.
Venturini, Shamira, Michaela Mae Vann, Martina Pucci and Giulia M. L. Bencini. "Towards a More Inclusive Learning Environment: The Importance of Providing Captions That Are Suited to Learners' Language Proficiency in the UDL Classroom". In Studies in Health Technology and Informatics. IOS Press, 2022. http://dx.doi.org/10.3233/shti220884.
Texte intégralActes de conférences sur le sujet "Automated audio captioning"
Kim, Minkyu, Kim Sung-Bin and Tae-Hyun Oh. "Prefix Tuning for Automated Audio Captioning". In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023. http://dx.doi.org/10.1109/icassp49357.2023.10096877.
Drossos, Konstantinos, Sharath Adavanne and Tuomas Virtanen. "Automated audio captioning with recurrent neural networks". In 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). IEEE, 2017. http://dx.doi.org/10.1109/waspaa.2017.8170058.
Chen, Chen, Nana Hou, Yuchen Hu, Heqing Zou, Xiaofeng Qi and Eng Siong Chng. "Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning". In Interspeech 2022. ISCA, 2022. http://dx.doi.org/10.21437/interspeech.2022-10510.
Kim, Jaeyeon, Jaeyoon Jung, Jinjoo Lee and Sang Hoon Woo. "EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning". In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024. http://dx.doi.org/10.1109/icassp48485.2024.10446672.
Ye, Zhongjie, Yuqing Wang, Helin Wang, Dongchao Yang and Yuexian Zou. "FeatureCut: An Adaptive Data Augmentation for Automated Audio Captioning". In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980325.
Koh, Andrew, Soham Tiwari and Chng Eng Siong. "Automated Audio Captioning with Epochal Difficult Captions for curriculum learning". In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980242.
Wijngaard, Gijs, Elia Formisano, Bruno L. Giordano and Michel Dumontier. "ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds". In 2023 31st European Signal Processing Conference (EUSIPCO). IEEE, 2023. http://dx.doi.org/10.23919/eusipco58844.2023.10289793.
Jain, Arushi, Navaneeth B. R, Shelly Mohanty, R. Sujatha, Sujatha R, Sourabh Tiwari and Rashmi T. Shankarappa. "Web Framework for Enhancing Automated Audio Captioning Performance for Domestic Environment". In 2022 13th International Conference on Computing Communication and Networking Technologies (ICCCNT). IEEE, 2022. http://dx.doi.org/10.1109/icccnt54827.2022.9984255.
Sun, Jianyuan, Xubo Liu, Xinhao Mei, Volkan Kılıç, Mark D. Plumbley and Wenwu Wang. "Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning". In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-943.
Xu, Xuenan, Heinrich Dinkel, Mengyue Wu, Zeyu Xie and Kai Yu. "Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning". In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413982.