A selection of scholarly literature on the topic "Transformers Multimodaux"
Format your source in APA, MLA, Chicago, Harvard, and other citation styles
Consult the lists of current articles, books, dissertations, conference papers, and other scholarly sources on the topic "Transformers Multimodaux".
Next to every entry in the list there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of a publication as a .pdf file and read its abstract online, provided the relevant details are present in the record's metadata.
Journal articles on the topic "Transformers Multimodaux"
Jaiswal, Sushma, Harikumar Pallthadka, Rajesh P. Chinchewadi, and Tarun Jaiswal. "Optimized Image Captioning: Hybrid Transformers Vision Transformers and Convolutional Neural Networks: Enhanced with Beam Search." International Journal of Intelligent Systems and Applications 16, no. 2 (April 8, 2024): 53–61. http://dx.doi.org/10.5815/ijisa.2024.02.05.
Bayat, Nasrin, Jong-Hwan Kim, Renoa Choudhury, Ibrahim F. Kadhim, Zubaidah Al-Mashhadani, Mark Aldritz Dela Virgen, Reuben Latorre, Ricardo De La Paz, and Joon-Hyuk Park. "Vision Transformer Customized for Environment Detection and Collision Prediction to Assist the Visually Impaired." Journal of Imaging 9, no. 8 (August 15, 2023): 161. http://dx.doi.org/10.3390/jimaging9080161.
Shao, Zilei. "A literature review on multimodal deep learning models for detecting mental disorders in conversational data: Pre-transformer and transformer-based approaches." Applied and Computational Engineering 18, no. 1 (October 23, 2023): 215–24. http://dx.doi.org/10.54254/2755-2721/18/20230993.
Hendricks, Lisa Anne, John Mellor, Rosalia Schneider, Jean-Baptiste Alayrac, and Aida Nematzadeh. "Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers." Transactions of the Association for Computational Linguistics 9 (2021): 570–85. http://dx.doi.org/10.1162/tacl_a_00385.
Chen, Yu, Ming Yin, Yu Li, and Qian Cai. "CSU-Net: A CNN-Transformer Parallel Network for Multimodal Brain Tumour Segmentation." Electronics 11, no. 14 (July 16, 2022): 2226. http://dx.doi.org/10.3390/electronics11142226.
Sun, Qixuan, Nianhua Fang, Zhuo Liu, Liang Zhao, Youpeng Wen, and Hongxiang Lin. "HybridCTrm: Bridging CNN and Transformer for Multimodal Brain Image Segmentation." Journal of Healthcare Engineering 2021 (October 1, 2021): 1–10. http://dx.doi.org/10.1155/2021/7467261.
Tian, Yu, Qiyang Zhao, Zine el abidine Kherroubi, Fouzi Boukhalfa, Kebin Wu, and Faouzi Bader. "Multimodal transformers for wireless communications: A case study in beam prediction." ITU Journal on Future and Evolving Technologies 4, no. 3 (September 5, 2023): 461–71. http://dx.doi.org/10.52953/jwra8095.
Xu, Yifan, Huapeng Wei, Minxuan Lin, Yingying Deng, Kekai Sheng, Mengdan Zhang, Fan Tang, Weiming Dong, Feiyue Huang, and Changsheng Xu. "Transformers in computational visual media: A survey." Computational Visual Media 8, no. 1 (October 27, 2021): 33–62. http://dx.doi.org/10.1007/s41095-021-0247-3.
Zhong, Enmin, Carlos R. del-Blanco, Daniel Berjón, Fernando Jaureguizar, and Narciso García. "Real-Time Monocular Skeleton-Based Hand Gesture Recognition Using 3D-Jointsformer." Sensors 23, no. 16 (August 10, 2023): 7066. http://dx.doi.org/10.3390/s23167066.
Nia, Zahra Movahedi, Ali Ahmadi, Bruce Mellado, Jianhong Wu, James Orbinski, Ali Asgary, and Jude D. Kong. "Twitter-based gender recognition using transformers." Mathematical Biosciences and Engineering 20, no. 9 (2023): 15957–77. http://dx.doi.org/10.3934/mbe.2023711.
Повний текст джерелаДисертації з теми "Transformers Multimodaux"
Vazquez Rodriguez, Juan Fernando. "Transformateurs multimodaux pour la reconnaissance des émotions." Electronic Thesis or Diss., Université Grenoble Alpes, 2023. http://www.theses.fr/2023GRALM057.
Повний текст джерелаMental health and emotional well-being have significant influence on physical health, and are especially important for healthy aging. Continued progress on sensors and microelectronics has provided a number of new technologies that can be deployed in homes and used to monitor health and well-being. These can be combined with recent advances in machine learning to provide services that enhance the physical and emotional well-being of individuals to promote healthy aging. In this context, an automatic emotion recognition system can provide a tool to help assure the emotional well-being of frail people. Therefore, it is desirable to develop a technology that can draw information about human emotions from multiple sensor modalities and can be trained without the need for large labeled training datasets.This thesis addresses the problem of emotion recognition using the different types of signals that a smart environment may provide, such as visual, audio, and physiological signals. To do this, we develop different models based on the Transformer architecture, which has useful characteristics such as their capacity to model long-range dependencies, as well as their capability to discern the relevant parts of the input. We first propose a model to recognize emotions from individual physiological signals. We propose a self-supervised pre-training technique that uses unlabeled physiological signals, showing that that pre-training technique helps the model to perform better. This approach is then extended to take advantage of the complementarity of information that may exist in different physiological signals. For this, we develop a model that combines different physiological signals and also uses self-supervised pre-training to improve its performance. We propose a method for pre-training that does not require a dataset with the complete set of target signals, but can rather, be trained on individual datasets from each target signal.To further take advantage of the different modalities that a smart environment may provide, we also propose a model that uses as inputs multimodal signals such as video, audio, and physiological signals. Since these signals are of a different nature, they cover different ways in which emotions are expressed, thus they should provide complementary information concerning emotions, and therefore it is appealing to use them together. However, in real-world scenarios, there might be cases where a modality is missing. Our model is flexible enough to continue working when a modality is missing, albeit with a reduction in its performance. To address this problem, we propose a training strategy that reduces the drop in performance when a modality is missing.The methods developed in this thesis are evaluated using several datasets, obtaining results that demonstrate the effectiveness of our approach to pre-train Transformers to recognize emotions from physiological signals. The results also show the efficacy of our Transformer-based solution to aggregate multimodal information, and to accommodate missing modalities. These results demonstrate the feasibility of the proposed approaches to recognizing emotions from multiple environmental sensors. This opens new avenues for deeper exploration of using Transformer-based approaches to process information from environmental sensors and allows the development of emotion recognition technologies robust to missing modalities. The results of this work can contribute to better care for the mental health of frail people
Greco, Claudio. "Transfer Learning and Attention Mechanisms in a Multimodal Setting." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/341874.
Mills, Kathy Ann. "Multiliteracies : a critical ethnography : pedagogy, power, discourse and access to multiliteracies." Thesis, Queensland University of Technology, 2006. https://eprints.qut.edu.au/16244/1/Kathy_Mills_Thesis.pdf.
Book chapters on the topic "Transformers Multimodaux"
Revanur, Ambareesh, Ananyananda Dasari, Conrad S. Tucker, and László A. Jeni. "Instantaneous Physiological Estimation Using Video Transformers." In Multimodal AI in Healthcare, 307–19. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-14771-5_22.
Kant, Yash, Dhruv Batra, Peter Anderson, Alexander Schwing, Devi Parikh, Jiasen Lu, and Harsh Agrawal. "Spatially Aware Multimodal Transformers for TextVQA." In Computer Vision – ECCV 2020, 715–32. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58545-7_41.
Mojtahedi, Ramtin, Mohammad Hamghalam, Richard K. G. Do, and Amber L. Simpson. "Towards Optimal Patch Size in Vision Transformers for Tumor Segmentation." In Multiscale Multimodal Medical Imaging, 110–20. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-18814-5_11.
Sun, Zhengxiao, Feiyu Chen, and Jie Shao. "Synesthesia Transformer with Contrastive Multimodal Learning." In Neural Information Processing, 431–42. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-30105-6_36.
Ramesh, Krithik, and Yun Sing Koh. "Investigation of Explainability Techniques for Multimodal Transformers." In Communications in Computer and Information Science, 90–98. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-8746-5_7.
Xie, Long-Fei, and Xu-Yao Zhang. "Gate-Fusion Transformer for Multimodal Sentiment Analysis." In Pattern Recognition and Artificial Intelligence, 28–40. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59830-3_3.
Wang, Wenxuan, Chen Chen, Meng Ding, Hong Yu, Sen Zha, and Jiangyun Li. "TransBTS: Multimodal Brain Tumor Segmentation Using Transformer." In Medical Image Computing and Computer Assisted Intervention – MICCAI 2021, 109–19. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87193-2_11.
Liu, Dan, Wei Song, and Xiaobing Zhao. "Pedestrian Attribute Recognition Based on Multimodal Transformer." In Pattern Recognition and Computer Vision, 422–33. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-8429-9_34.
Reyes, Abel A., Sidike Paheding, Makarand Deo, and Michel Audette. "Gabor Filter-Embedded U-Net with Transformer-Based Encoding for Biomedical Image Segmentation." In Multiscale Multimodal Medical Imaging, 76–88. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-18814-5_8.
Santhirasekaram, Ainkaran, Karen Pinto, Mathias Winkler, Eric Aboagye, Ben Glocker, and Andrea Rockall. "Multi-scale Hybrid Transformer Networks: Application to Prostate Disease Classification." In Multimodal Learning for Clinical Decision Support, 12–21. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-89847-2_2.
Повний текст джерелаТези доповідей конференцій з теми "Transformers Multimodaux"
Yao, Shaowei, and Xiaojun Wan. "Multimodal Transformer for Multimodal Machine Translation." In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.acl-main.400.
Tang, Jiajia, Kang Li, Ming Hou, Xuanyu Jin, Wanzeng Kong, Yu Ding, and Qibin Zhao. "MMT: Multi-way Multi-modal Transformer for Multimodal Learning." In Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22). California: International Joint Conferences on Artificial Intelligence Organization, 2022. http://dx.doi.org/10.24963/ijcai.2022/480.
Parthasarathy, Srinivas, and Shiva Sundaram. "Detecting Expressions with Multimodal Transformers." In 2021 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2021. http://dx.doi.org/10.1109/slt48900.2021.9383573.
Chua, Watson W. K., Lu Li, and Alvina Goh. "Classifying Multimodal Data Using Transformers." In KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2022. http://dx.doi.org/10.1145/3534678.3542634.
Tsai, Yao-Hung Hubert, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, and Ruslan Salakhutdinov. "Multimodal Transformer for Unaligned Multimodal Language Sequences." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-1656.
He, Xuehai, and Xin Wang. "Multimodal Graph Transformer for Multimodal Question Answering." In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.eacl-main.15.
Jin, Tao, Siyu Huang, Ming Chen, Yingming Li, and Zhongfei Zhang. "SBAT: Video Captioning with Sparse Boundary-Aware Transformer." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence (IJCAI-PRICAI-20). California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/88.
Wang, Yikai, Xinghao Chen, Lele Cao, Wenbing Huang, Fuchun Sun, and Yunhe Wang. "Multimodal Token Fusion for Vision Transformers." In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022. http://dx.doi.org/10.1109/cvpr52688.2022.01187.
Tang, Wenzhuo, Hongzhi Wen, Renming Liu, Jiayuan Ding, Wei Jin, Yuying Xie, Hui Liu, and Jiliang Tang. "Single-Cell Multimodal Prediction via Transformers." In CIKM '23: The 32nd ACM International Conference on Information and Knowledge Management. New York, NY, USA: ACM, 2023. http://dx.doi.org/10.1145/3583780.3615061.
Liu, Yicheng, Jinghuai Zhang, Liangji Fang, Qinhong Jiang, and Bolei Zhou. "Multimodal Motion Prediction with Stacked Transformers." In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2021. http://dx.doi.org/10.1109/cvpr46437.2021.00749.