Selected scientific literature on the topic "Automated audio captioning"
Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles
Consult the list of current articles, books, theses, conference proceedings, and other relevant scientific sources on the topic "Automated audio captioning".
Next to each source in the reference list there is an "Add to bibliography" button. Click it and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scientific publication in .pdf format and read the abstract of the work online, if it is present in the metadata.
Journal articles on the topic "Automated audio captioning"
Bokhove, Christian, and Christopher Downey. "Automated generation of ‘good enough’ transcripts as a first step to transcription of audio-recorded data". Methodological Innovations 11, no. 2 (May 2018): 205979911879074. http://dx.doi.org/10.1177/2059799118790743.
Koenecke, Allison, Andrew Nam, Emily Lake, Joe Nudell, Minnie Quartey, Zion Mengesha, Connor Toups, John R. Rickford, Dan Jurafsky, and Sharad Goel. "Racial disparities in automated speech recognition". Proceedings of the National Academy of Sciences 117, no. 14 (March 23, 2020): 7684–89. http://dx.doi.org/10.1073/pnas.1915768117.
Mirzaei, Maryam Sadat, Kourosh Meshgi, Yuya Akita, and Tatsuya Kawahara. "Partial and synchronized captioning: A new tool to assist learners in developing second language listening skill". ReCALL 29, no. 2 (March 2, 2017): 178–99. http://dx.doi.org/10.1017/s0958344017000039.
Guo, Rundong. "Advancing real-time close captioning: blind source separation and transcription for hearing impairments". Applied and Computational Engineering 30, no. 1 (January 22, 2024): 125–30. http://dx.doi.org/10.54254/2755-2721/30/20230084.
Prabhala, Jagat Chaitanya, Venkatnareshbabu K, and Ragoju Ravi. "OPTIMIZING SIMILARITY THRESHOLD FOR ABSTRACT SIMILARITY METRIC IN SPEECH DIARIZATION SYSTEMS: A MATHEMATICAL FORMULATION". Applied Mathematics and Sciences An International Journal (MathSJ) 10, no. 1/2 (June 26, 2023): 1–10. http://dx.doi.org/10.5121/mathsj.2023.10201.
Nam, Somang, and Deborah Fels. "Simulation of Subjective Closed Captioning Quality Assessment Using Prediction Models". International Journal of Semantic Computing 13, no. 01 (March 2019): 45–65. http://dx.doi.org/10.1142/s1793351x19400038.
Gotmare, Abhay, Gandharva Thite, and Laxmi Bewoor. "A multimodal machine learning approach to generate news articles from geo-tagged images". International Journal of Electrical and Computer Engineering (IJECE) 14, no. 3 (June 1, 2024): 3434. http://dx.doi.org/10.11591/ijece.v14i3.pp3434-3442.
Verma, Dr Neeta. "Assistive Vision Technology using Deep Learning Techniques". International Journal for Research in Applied Science and Engineering Technology 9, no. VII (July 31, 2021): 2695–704. http://dx.doi.org/10.22214/ijraset.2021.36815.
Eren, Aysegul Ozkaya, and Mustafa Sert. "Automated Audio Captioning with Topic Modeling". IEEE Access, 2023, 1. http://dx.doi.org/10.1109/access.2023.3235733.
Xiao, Feiyang, Jian Guan, Qiaoxi Zhu, and Wenwu Wang. "Graph Attention for Automated Audio Captioning". IEEE Signal Processing Letters, 2023, 1–5. http://dx.doi.org/10.1109/lsp.2023.3266114.
Theses / dissertations on the topic "Automated audio captioning"
Labbé, Etienne. "Description automatique des événements sonores par des méthodes d'apprentissage profond". Electronic Thesis or Diss., Université de Toulouse (2023-....), 2024. http://www.theses.fr/2024TLSES054.
In the audio research field, the majority of machine learning systems focus on recognizing a limited number of sound events. However, when a machine interacts with real data, it must be able to handle much more varied and complex situations. To tackle this problem, annotators use natural language, which allows any sound information to be summarized. Automated Audio Captioning (AAC) was introduced recently to develop systems capable of automatically producing a description of any type of sound in text form. This task concerns all kinds of sound events such as environmental, urban, domestic sounds, sound effects, music or speech. This type of system could be used by people who are deaf or hard of hearing, and could improve the indexing of large audio databases. In the first part of this thesis, we present the state of the art of the AAC task through a global description of public datasets, learning methods, architectures and evaluation metrics. Using this knowledge, we then present the architecture of our first AAC system, which obtains encouraging scores on the main AAC metric named SPIDEr: 24.7% on the Clotho corpus and 40.1% on the AudioCaps corpus. Subsequently, we explore many aspects of AAC systems in the second part. We first focus on evaluation methods through the study of SPIDEr. For this, we propose a variant called SPIDEr-max, which considers several candidates for each audio file, and which shows that the SPIDEr metric is very sensitive to the predicted words. Then, we improve our reference system by exploring different architectures and numerous hyper-parameters to exceed the state of the art on AudioCaps (SPIDEr of 49.5%). Next, we explore a multi-task learning method aimed at improving the semantics of sentences generated by our system. Finally, we build a general and unbiased AAC system called CONETTE, which can generate different types of descriptions that approximate those of the target datasets.
In the third and last part, we study the capabilities of an AAC system to automatically search for audio content in a database. Our approach achieves scores competitive with systems dedicated to this task, while using fewer parameters. We also introduce semi-supervised methods to improve our system using new unlabeled audio data, and we show how pseudo-label generation can impact an AAC model. Finally, we study AAC systems in languages other than English: French, Spanish and German. In addition, we propose a system capable of producing captions in all four languages at the same time, and we compare it with systems specialized in each language.
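The abstract above describes SPIDEr (the mean of the CIDEr and SPICE scores) and the proposed SPIDEr-max variant, which scores several decoded candidate captions per audio clip and keeps the best one. A minimal sketch of that relationship, using a toy unigram-overlap stand-in for the real CIDEr/SPICE implementations (all function names here are illustrative, not from the thesis):

```python
def unigram_overlap(candidate, references):
    """Toy stand-in metric: best unigram F1 against any reference caption."""
    cand = set(candidate.lower().split())
    best = 0.0
    for ref in references:
        r = set(ref.lower().split())
        if not cand or not r:
            continue
        inter = len(cand & r)
        prec, rec = inter / len(cand), inter / len(r)
        if prec + rec:
            best = max(best, 2 * prec * rec / (prec + rec))
    return best

def spider(candidate, references, cider=unigram_overlap, spice=unigram_overlap):
    # SPIDEr is the arithmetic mean of the CIDEr and SPICE scores.
    return 0.5 * (cider(candidate, references) + spice(candidate, references))

def spider_max(candidates, references, **metrics):
    # SPIDEr-max: score every candidate caption and keep the highest value.
    return max(spider(c, references, **metrics) for c in candidates)

refs = ["a dog barks while rain falls"]
cands = ["rain falls on a roof", "a dog barks in the rain"]
print(round(spider_max(cands, refs), 3))  # → 0.667
```

The point of the max is that a system penalized by SPIDEr for one unlucky word choice can still be credited when any of its beam-search candidates matches the references well, which is how the abstract motivates SPIDEr's sensitivity to predicted words.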
Book chapters on the topic "Automated audio captioning"
M., Nivedita, AsnathVictyPhamila Y., Umashankar Kumaravelan, and Karthikeyan N. "Voice-Based Image Captioning System for Assisting Visually Impaired People Using Neural Networks". In Principles and Applications of Socio-Cognitive and Affective Computing, 177–99. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-6684-3843-5.ch011.
Venturini, Shamira, Michaela Mae Vann, Martina Pucci, and Giulia M. L. Bencini. "Towards a More Inclusive Learning Environment: The Importance of Providing Captions That Are Suited to Learners’ Language Proficiency in the UDL Classroom". In Studies in Health Technology and Informatics. IOS Press, 2022. http://dx.doi.org/10.3233/shti220884.
Texto completo da fonteTrabalhos de conferências sobre o assunto "Automated audio captioning"
Kim, Minkyu, Kim Sung-Bin, and Tae-Hyun Oh. "Prefix Tuning for Automated Audio Captioning". In ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023. http://dx.doi.org/10.1109/icassp49357.2023.10096877.
Drossos, Konstantinos, Sharath Adavanne, and Tuomas Virtanen. "Automated audio captioning with recurrent neural networks". In 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). IEEE, 2017. http://dx.doi.org/10.1109/waspaa.2017.8170058.
Chen, Chen, Nana Hou, Yuchen Hu, Heqing Zou, Xiaofeng Qi, and Eng Siong Chng. "Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning". In Interspeech 2022. ISCA, 2022. http://dx.doi.org/10.21437/interspeech.2022-10510.
Kim, Jaeyeon, Jaeyoon Jung, Jinjoo Lee, and Sang Hoon Woo. "EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning". In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024. http://dx.doi.org/10.1109/icassp48485.2024.10446672.
Ye, Zhongjie, Yuqing Wang, Helin Wang, Dongchao Yang, and Yuexian Zou. "FeatureCut: An Adaptive Data Augmentation for Automated Audio Captioning". In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980325.
Koh, Andrew, Soham Tiwari, and Chng Eng Siong. "Automated Audio Captioning with Epochal Difficult Captions for curriculum learning". In 2022 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2022. http://dx.doi.org/10.23919/apsipaasc55919.2022.9980242.
Wijngaard, Gijs, Elia Formisano, Bruno L. Giordano, and Michel Dumontier. "ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds". In 2023 31st European Signal Processing Conference (EUSIPCO). IEEE, 2023. http://dx.doi.org/10.23919/eusipco58844.2023.10289793.
Jain, Arushi, Navaneeth B. R, Shelly Mohanty, R. Sujatha, Sujatha R, Sourabh Tiwari, and Rashmi T. Shankarappa. "Web Framework for Enhancing Automated Audio Captioning Performance for Domestic Environment". In 2022 13th International Conference on Computing Communication and Networking Technologies (ICCCNT). IEEE, 2022. http://dx.doi.org/10.1109/icccnt54827.2022.9984255.
Sun, Jianyuan, Xubo Liu, Xinhao Mei, Volkan Kılıç, Mark D. Plumbley, and Wenwu Wang. "Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning". In INTERSPEECH 2023. ISCA, 2023. http://dx.doi.org/10.21437/interspeech.2023-943.
Xu, Xuenan, Heinrich Dinkel, Mengyue Wu, Zeyu Xie, and Kai Yu. "Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning". In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. http://dx.doi.org/10.1109/icassp39728.2021.9413982.