Scientific literature on the topic "Multimodal Transformers"
Browse the topical lists of journal articles, books, theses, conference proceedings, and other academic sources on the topic "Multimodal Transformers".
Journal articles on the topic "Multimodal Transformers"
Jaiswal, Sushma, Harikumar Pallthadka, Rajesh P. Chinchewadi, and Tarun Jaiswal. "Optimized Image Captioning: Hybrid Transformers Vision Transformers and Convolutional Neural Networks: Enhanced with Beam Search." International Journal of Intelligent Systems and Applications 16, no. 2 (April 8, 2024): 53–61. http://dx.doi.org/10.5815/ijisa.2024.02.05.
Bayat, Nasrin, Jong-Hwan Kim, Renoa Choudhury, Ibrahim F. Kadhim, Zubaidah Al-Mashhadani, Mark Aldritz Dela Virgen, Reuben Latorre, Ricardo De La Paz, and Joon-Hyuk Park. "Vision Transformer Customized for Environment Detection and Collision Prediction to Assist the Visually Impaired." Journal of Imaging 9, no. 8 (August 15, 2023): 161. http://dx.doi.org/10.3390/jimaging9080161.
Hendricks, Lisa Anne, John Mellor, Rosalia Schneider, Jean-Baptiste Alayrac, and Aida Nematzadeh. "Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers." Transactions of the Association for Computational Linguistics 9 (2021): 570–85. http://dx.doi.org/10.1162/tacl_a_00385.
Shao, Zilei. "A literature review on multimodal deep learning models for detecting mental disorders in conversational data: Pre-transformer and transformer-based approaches." Applied and Computational Engineering 18, no. 1 (October 23, 2023): 215–24. http://dx.doi.org/10.54254/2755-2721/18/20230993.
Wang, LeiChen, Simon Giebenhain, Carsten Anklam, and Bastian Goldluecke. "Radar Ghost Target Detection via Multimodal Transformers." IEEE Robotics and Automation Letters 6, no. 4 (October 2021): 7758–65. http://dx.doi.org/10.1109/lra.2021.3100176.
Salin, Emmanuelle, Badreddine Farah, Stéphane Ayache, and Benoit Favre. "Are Vision-Language Transformers Learning Multimodal Representations? A Probing Perspective." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 10 (June 28, 2022): 11248–57. http://dx.doi.org/10.1609/aaai.v36i10.21375.
Sun, Qixuan, Nianhua Fang, Zhuo Liu, Liang Zhao, Youpeng Wen, and Hongxiang Lin. "HybridCTrm: Bridging CNN and Transformer for Multimodal Brain Image Segmentation." Journal of Healthcare Engineering 2021 (October 1, 2021): 1–10. http://dx.doi.org/10.1155/2021/7467261.
Tian, Yu, Qiyang Zhao, Zine el abidine Kherroubi, Fouzi Boukhalfa, Kebin Wu, and Faouzi Bader. "Multimodal transformers for wireless communications: A case study in beam prediction." ITU Journal on Future and Evolving Technologies 4, no. 3 (September 5, 2023): 461–71. http://dx.doi.org/10.52953/jwra8095.
Chen, Yu, Ming Yin, Yu Li, and Qian Cai. "CSU-Net: A CNN-Transformer Parallel Network for Multimodal Brain Tumour Segmentation." Electronics 11, no. 14 (July 16, 2022): 2226. http://dx.doi.org/10.3390/electronics11142226.
Wang, Zhaokai, Renda Bao, Qi Wu, and Si Liu. "Confidence-aware Non-repetitive Multimodal Transformers for TextCaps." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 4 (May 18, 2021): 2835–43. http://dx.doi.org/10.1609/aaai.v35i4.16389.
Theses on the topic "Multimodal Transformers"
Greco, Claudio. "Transfer Learning and Attention Mechanisms in a Multimodal Setting." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/341874.
Vazquez Rodriguez, Juan Fernando. "Transformateurs multimodaux pour la reconnaissance des émotions" [Multimodal Transformers for Emotion Recognition]. Electronic Thesis or Diss., Université Grenoble Alpes, 2023. http://www.theses.fr/2023GRALM057.
Abstract: Mental health and emotional well-being have a significant influence on physical health and are especially important for healthy aging. Continued progress on sensors and microelectronics has provided a number of new technologies that can be deployed in homes and used to monitor health and well-being. These can be combined with recent advances in machine learning to provide services that enhance the physical and emotional well-being of individuals to promote healthy aging. In this context, an automatic emotion recognition system can provide a tool to help assure the emotional well-being of frail people. Therefore, it is desirable to develop a technology that can draw information about human emotions from multiple sensor modalities and can be trained without the need for large labeled training datasets.
This thesis addresses the problem of emotion recognition using the different types of signals that a smart environment may provide, such as visual, audio, and physiological signals. To do this, we develop different models based on the Transformer architecture, which has useful characteristics such as its capacity to model long-range dependencies and its capability to discern the relevant parts of the input. We first propose a model to recognize emotions from individual physiological signals. We propose a self-supervised pre-training technique that uses unlabeled physiological signals and show that this pre-training helps the model perform better. This approach is then extended to take advantage of the complementarity of information that may exist in different physiological signals. For this, we develop a model that combines different physiological signals and also uses self-supervised pre-training to improve its performance. We propose a method for pre-training that does not require a dataset with the complete set of target signals but can instead be trained on individual datasets from each target signal.
To further take advantage of the different modalities that a smart environment may provide, we also propose a model that uses as inputs multimodal signals such as video, audio, and physiological signals. Since these signals are of a different nature, they cover different ways in which emotions are expressed; they should therefore provide complementary information concerning emotions, which makes it appealing to use them together. However, in real-world scenarios, there might be cases where a modality is missing. Our model is flexible enough to continue working when a modality is missing, albeit with a reduction in its performance. To address this problem, we propose a training strategy that reduces the drop in performance when a modality is missing.
The methods developed in this thesis are evaluated using several datasets, obtaining results that demonstrate the effectiveness of our approach to pre-train Transformers to recognize emotions from physiological signals. The results also show the efficacy of our Transformer-based solution to aggregate multimodal information and to accommodate missing modalities. These results demonstrate the feasibility of the proposed approaches to recognizing emotions from multiple environmental sensors. This opens new avenues for deeper exploration of using Transformer-based approaches to process information from environmental sensors and allows the development of emotion recognition technologies robust to missing modalities. The results of this work can contribute to better care for the mental health of frail people.
Mills, Kathy Ann. "Multiliteracies: a critical ethnography: pedagogy, power, discourse and access to multiliteracies." Thesis, Queensland University of Technology, 2006. https://eprints.qut.edu.au/16244/1/Kathy_Mills_Thesis.pdf.
Mills, Kathy Ann. "Multiliteracies: a critical ethnography: pedagogy, power, discourse and access to multiliteracies." Queensland University of Technology, 2006. http://eprints.qut.edu.au/16244/.
Book chapters on the topic "Multimodal Transformers"
Revanur, Ambareesh, Ananyananda Dasari, Conrad S. Tucker, and László A. Jeni. "Instantaneous Physiological Estimation Using Video Transformers." In Multimodal AI in Healthcare, 307–19. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-14771-5_22.
Kant, Yash, Dhruv Batra, Peter Anderson, Alexander Schwing, Devi Parikh, Jiasen Lu, and Harsh Agrawal. "Spatially Aware Multimodal Transformers for TextVQA." In Computer Vision – ECCV 2020, 715–32. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58545-7_41.
Mojtahedi, Ramtin, Mohammad Hamghalam, Richard K. G. Do, and Amber L. Simpson. "Towards Optimal Patch Size in Vision Transformers for Tumor Segmentation." In Multiscale Multimodal Medical Imaging, 110–20. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-18814-5_11.
Ramesh, Krithik, and Yun Sing Koh. "Investigation of Explainability Techniques for Multimodal Transformers." In Communications in Computer and Information Science, 90–98. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-8746-5_7.
Bucur, Ana-Maria, Adrian Cosma, Paolo Rosso, and Liviu P. Dinu. "It’s Just a Matter of Time: Detecting Depression with Time-Enriched Multimodal Transformers." In Lecture Notes in Computer Science, 200–215. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-28244-7_13.
Sun, Zhengxiao, Feiyu Chen, and Jie Shao. "Synesthesia Transformer with Contrastive Multimodal Learning." In Neural Information Processing, 431–42. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-30105-6_36.
Xie, Long-Fei, and Xu-Yao Zhang. "Gate-Fusion Transformer for Multimodal Sentiment Analysis." In Pattern Recognition and Artificial Intelligence, 28–40. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59830-3_3.
Wang, Wenxuan, Chen Chen, Meng Ding, Hong Yu, Sen Zha, and Jiangyun Li. "TransBTS: Multimodal Brain Tumor Segmentation Using Transformer." In Medical Image Computing and Computer Assisted Intervention – MICCAI 2021, 109–19. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87193-2_11.
Liu, Dan, Wei Song, and Xiaobing Zhao. "Pedestrian Attribute Recognition Based on Multimodal Transformer." In Pattern Recognition and Computer Vision, 422–33. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-8429-9_34.
Reyes, Abel A., Sidike Paheding, Makarand Deo, and Michel Audette. "Gabor Filter-Embedded U-Net with Transformer-Based Encoding for Biomedical Image Segmentation." In Multiscale Multimodal Medical Imaging, 76–88. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-18814-5_8.
Conference papers on the topic "Multimodal Transformers"
Parthasarathy, Srinivas, and Shiva Sundaram. "Detecting Expressions with Multimodal Transformers." In 2021 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2021. http://dx.doi.org/10.1109/slt48900.2021.9383573.
Chua, Watson W. K., Lu Li, and Alvina Goh. "Classifying Multimodal Data Using Transformers." In KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2022. http://dx.doi.org/10.1145/3534678.3542634.
Wang, Yikai, Xinghao Chen, Lele Cao, Wenbing Huang, Fuchun Sun, and Yunhe Wang. "Multimodal Token Fusion for Vision Transformers." In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022. http://dx.doi.org/10.1109/cvpr52688.2022.01187.
Tang, Wenzhuo, Hongzhi Wen, Renming Liu, Jiayuan Ding, Wei Jin, Yuying Xie, Hui Liu, and Jiliang Tang. "Single-Cell Multimodal Prediction via Transformers." In CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management. New York, NY, USA: ACM, 2023. http://dx.doi.org/10.1145/3583780.3615061.
Liu, Yicheng, Jinghuai Zhang, Liangji Fang, Qinhong Jiang, and Bolei Zhou. "Multimodal Motion Prediction with Stacked Transformers." In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2021. http://dx.doi.org/10.1109/cvpr46437.2021.00749.
Bhargava, Prajjwal. "Adaptive Transformers for Learning Multimodal Representations." In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.acl-srw.1.
Vazquez-Rodriguez, Juan. "Using Multimodal Transformers in Affective Computing." In 2021 9th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW). IEEE, 2021. http://dx.doi.org/10.1109/aciiw52867.2021.9666396.
Shang, Xindi, Zehuan Yuan, Anran Wang, and Changhu Wang. "Multimodal Video Summarization via Time-Aware Transformers." In MM '21: ACM Multimedia Conference. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3474085.3475321.
Wu, Zhengtao, Lingbo Liu, Yang Zhang, Mingzhi Mao, Liang Lin, and Guanbin Li. "Multimodal Crowd Counting with Mutual Attention Transformers." In 2022 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2022. http://dx.doi.org/10.1109/icme52920.2022.9859777.
Ma, Mengmeng, Jian Ren, Long Zhao, Davide Testuggine, and Xi Peng. "Are Multimodal Transformers Robust to Missing Modality?" In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022. http://dx.doi.org/10.1109/cvpr52688.2022.01764.
Organizational reports on the topic "Multimodal Transformers"
Glushko, E. Ya, and A. N. Stepanyuk. The multimode island kind photonic crystal resonator: states classification. SME Burlaka, 2017. http://dx.doi.org/10.31812/0564/1561.