Journal articles on the topic 'Video Vision Transformer'
Consult the top 50 journal articles for your research on the topic 'Video Vision Transformer.'
Naikwadi, Sanket Shashikant. "Video Summarization Using Vision and Language Transformer Models." International Journal of Research Publication and Reviews 6, no. 6 (January 2025): 5217–21. https://doi.org/10.55248/gengpi.6.0125.0654.
Moutik, Oumaima, Hiba Sekkat, Smail Tigani, Abdellah Chehri, Rachid Saadane, Taha Ait Tchakoucht, and Anand Paul. "Convolutional Neural Networks or Vision Transformers: Who Will Win the Race for Action Recognitions in Visual Data?" Sensors 23, no. 2 (January 9, 2023): 734. http://dx.doi.org/10.3390/s23020734.
Yuan, Hongchun, Zhenyu Cai, Hui Zhou, Yue Wang, and Xiangzhi Chen. "TransAnomaly: Video Anomaly Detection Using Video Vision Transformer." IEEE Access 9 (2021): 123977–86. http://dx.doi.org/10.1109/access.2021.3109102.
Sarraf, Saman, and Milton Kabia. "Optimal Topology of Vision Transformer for Real-Time Video Action Recognition in an End-To-End Cloud Solution." Machine Learning and Knowledge Extraction 5, no. 4 (September 29, 2023): 1320–39. http://dx.doi.org/10.3390/make5040067.
Zhao, Hong, Zhiwen Chen, Lan Guo, and Zeyu Han. "Video captioning based on vision transformer and reinforcement learning." PeerJ Computer Science 8 (March 16, 2022): e916. http://dx.doi.org/10.7717/peerj-cs.916.
Im, Heeju, and Yong Suk Choi. "A Full Transformer Video Captioning Model via Vision Transformer." KIISE Transactions on Computing Practices 29, no. 8 (August 31, 2023): 378–83. http://dx.doi.org/10.5626/ktcp.2023.29.8.378.
Ugile, Tukaram, and Dr Nilesh Uke. "Transformer Architectures for Computer Vision: A Comprehensive Review and Future Research Directions." Journal of Dynamics and Control 9, no. 3 (March 15, 2025): 70–79. https://doi.org/10.71058/jodac.v9i3005.
Wu, Pengfei, Le Wang, Sanping Zhou, Gang Hua, and Changyin Sun. "Temporal Correlation Vision Transformer for Video Person Re-Identification." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 6 (March 24, 2024): 6083–91. http://dx.doi.org/10.1609/aaai.v38i6.28424.
Jin, Yanxiu, and Rulin Ma. "Applications of transformers in computer vision." Applied and Computational Engineering 16, no. 1 (October 23, 2023): 234–41. http://dx.doi.org/10.54254/2755-2721/16/20230898.
Pei, Pengfei, Xianfeng Zhao, Jinchuan Li, Yun Cao, and Xuyuan Lai. "Vision Transformer-Based Video Hashing Retrieval for Tracing the Source of Fake Videos." Security and Communication Networks 2023 (June 28, 2023): 1–16. http://dx.doi.org/10.1155/2023/5349392.
Wang, Hao, Wenjia Zhang, and Guohua Liu. "TSNet: Token Sparsification for Efficient Video Transformer." Applied Sciences 13, no. 19 (September 24, 2023): 10633. http://dx.doi.org/10.3390/app131910633.
Kim, Dahyun, and Myung Hwan Na. "Rice yield prediction and self-attention visualization using Video Vision Transformer." Korean Data Analysis Society 25, no. 4 (August 31, 2023): 1249–59. http://dx.doi.org/10.37727/jkdas.2023.25.4.1249.
Lee, Jaewoo, Sungjun Lee, Wonki Cho, Zahid Ali Siddiqui, and Unsang Park. "Vision Transformer-Based Tailing Detection in Videos." Applied Sciences 11, no. 24 (December 7, 2021): 11591. http://dx.doi.org/10.3390/app112411591.
Abdlrazg, Bassma A. Awad, Sumaia Masoud, and Mnal M. Ali. "Human Action Detection Using A hybrid Architecture of CNN and Transformer." International Science and Technology Journal 34, no. 1 (January 25, 2024): 1–15. http://dx.doi.org/10.62341/bsmh2119.
Li, Xue, Huibo Zhou, and Ming Zhao. "Transformer-based cascade networks with spatial and channel reconstruction convolution for deepfake detection." Mathematical Biosciences and Engineering 21, no. 3 (2024): 4142–64. http://dx.doi.org/10.3934/mbe.2024183.
Zhou, Siyuan, Chunru Zhan, Biao Wang, Tiezheng Ge, Yuning Jiang, and Li Niu. "Video Object of Interest Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3805–13. http://dx.doi.org/10.1609/aaai.v37i3.25493.
Huo, Hua, and Bingjie Li. "MgMViT: Multi-Granularity and Multi-Scale Vision Transformer for Efficient Action Recognition." Electronics 13, no. 5 (February 29, 2024): 948. http://dx.doi.org/10.3390/electronics13050948.
Kumar, Pavan. "Revolutionizing Deepfake Detection and Realtime Video Vision with CNN-based Deep Learning Model." International Journal of Innovative Research in Information Security 10, no. 04 (May 8, 2024): 173–77. http://dx.doi.org/10.26562/ijiris.2024.v1004.10.
Reddy, Sai Krishna. "Advancements in Video Deblurring: A Comprehensive Review." International Journal of Scientific Research in Engineering and Management 08, no. 05 (May 7, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem32759.
Im, Heeju, and Yong-Suk Choi. "UAT: Universal Attention Transformer for Video Captioning." Sensors 22, no. 13 (June 25, 2022): 4817. http://dx.doi.org/10.3390/s22134817.
Yamazaki, Kashu, Khoa Vo, Quang Sang Truong, Bhiksha Raj, and Ngan Le. "VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3081–90. http://dx.doi.org/10.1609/aaai.v37i3.25412.
Choksi, Sarah, Sanjeev Narasimhan, Mattia Ballo, Mehmet Turkcan, Yiran Hu, Chengbo Zang, Alex Farrell, et al. "Automatic assessment of robotic suturing utilizing computer vision in a dry-lab simulation." Artificial Intelligence Surgery 5, no. 2 (April 1, 2025): 160–9. https://doi.org/10.20517/ais.2024.84.
Narsina, Deekshith, Nicholas Richardson, Arjun Kamisetty, Jaya Chandra Srikanth Gummadi, and Krishna Devarapu. "Neural Network Architectures for Real-Time Image and Video Processing Applications." Engineering International 10, no. 2 (2022): 131–44. https://doi.org/10.18034/ei.v10i2.735.
Han, Xiao, Yongbin Wang, Shouxun Liu, and Cong Jin. "Online Multiplayer Tracking by Extracting Temporal Contexts with Transformer." Wireless Communications and Mobile Computing 2022 (October 11, 2022): 1–10. http://dx.doi.org/10.1155/2022/6177973.
Zhang, Fan, Jiawei Tian, Jianhao Wang, Guanyou Liu, and Ying Liu. "ECViST: Mine Intelligent Monitoring Based on Edge Computing and Vision Swin Transformer-YOLOv5." Energies 15, no. 23 (November 29, 2022): 9015. http://dx.doi.org/10.3390/en15239015.
Mardani, Konstantina, Nicholas Vretos, and Petros Daras. "Transformer-Based Fire Detection in Videos." Sensors 23, no. 6 (March 11, 2023): 3035. http://dx.doi.org/10.3390/s23063035.
Peng, Pengfei, Guoqing Liang, and Tao Luan. "Multi-View Inconsistency Analysis for Video Object-Level Splicing Localization." International Journal of Emerging Technologies and Advanced Applications 1, no. 3 (April 24, 2024): 1–5. http://dx.doi.org/10.62677/ijetaa.2403111.
Wang, Jing, and ZongJu Yang. "Transformer-Guided Video Inpainting Algorithm Based on Local Spatial-Temporal joint." EAI Endorsed Transactions on e-Learning 8, no. 4 (August 15, 2023): e2. http://dx.doi.org/10.4108/eetel.3156.
Le, Viet-Tuan, Kiet Tran-Trung, and Vinh Truong Hoang. "A Comprehensive Review of Recent Deep Learning Techniques for Human Activity Recognition." Computational Intelligence and Neuroscience 2022 (April 20, 2022): 1–17. http://dx.doi.org/10.1155/2022/8323962.
Hong, Jiuk, Chaehyeon Lee, and Heechul Jung. "Late Fusion-Based Video Transformer for Facial Micro-Expression Recognition." Applied Sciences 12, no. 3 (January 23, 2022): 1169. http://dx.doi.org/10.3390/app12031169.
D, Mrs Srivalli, and Divya Sri V. "Video Inpainting with Local and Global Refinement." International Journal of Scientific Research in Engineering and Management 08, no. 03 (March 17, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem29385.
Habeb, Mohamed H., May Salama, and Lamiaa A. Elrefaei. "Enhancing Video Anomaly Detection Using a Transformer Spatiotemporal Attention Unsupervised Framework for Large Datasets." Algorithms 17, no. 7 (July 1, 2024): 286. http://dx.doi.org/10.3390/a17070286.
Usmani, Shaheen, Sunil Kumar, and Debanjan Sadhya. "Spatio-temporal knowledge distilled video vision transformer (STKD-VViT) for multimodal deepfake detection." Neurocomputing 620 (March 2025): 129256. https://doi.org/10.1016/j.neucom.2024.129256.
Kumar, Yulia, Kuan Huang, Chin-Chien Lin, Annaliese Watson, J. Jenny Li, Patricia Morreale, and Justin Delgado. "Applying Swin Architecture to Diverse Sign Language Datasets." Electronics 13, no. 8 (April 16, 2024): 1509. http://dx.doi.org/10.3390/electronics13081509.
Li, Yixiao, Lixiang Li, Zirui Zhuang, Yuan Fang, Haipeng Peng, and Nam Ling. "Transformer-Based Data-Driven Video Coding Acceleration for Industrial Applications." Mathematical Problems in Engineering 2022 (September 27, 2022): 1–11. http://dx.doi.org/10.1155/2022/1440323.
Nikulina, Olena, Valerii Severyn, Oleksii Kondratov, and Oleksii Olhovoy. "Models of Remote Identification of Parameters of Dynamic Objects Using Detection Transformers and Optical Flow." Bulletin of National Technical University "KhPI". Series: System Analysis, Control and Information Technologies, no. 1 (11) (July 30, 2024): 52–57. http://dx.doi.org/10.20998/2079-0023.2024.01.08.
El Moaqet, Hisham, Rami Janini, Tamer Abdulbaki Alshirbaji, Nour Aldeen Jalal, and Knut Möller. "Using Vision Transformers for Classifying Surgical Tools in Computer Aided Surgeries." Current Directions in Biomedical Engineering 10, no. 4 (December 1, 2024): 232–35. https://doi.org/10.1515/cdbme-2024-2056.
Jang, Hee-Deok, Seokjoon Kwon, Hyunwoo Nam, and Dong Eui Chang. "Chemical Gas Source Localization with Synthetic Time Series Diffusion Data Using Video Vision Transformer." Applied Sciences 14, no. 11 (May 23, 2024): 4451. http://dx.doi.org/10.3390/app14114451.
Mozaffari, M. Hamed, Yuchuan Li, Niloofar Hooshyaripour, and Yoon Ko. "Vision-Based Prediction of Flashover Using Transformers and Convolutional Long Short-Term Memory Model." Electronics 13, no. 23 (December 3, 2024): 4776. https://doi.org/10.3390/electronics13234776.
Geng, Xiaozhong, Cheng Chen, Ping Yu, Baijin Liu, Weixin Hu, Qipeng Liang, and Xintong Zhang. "OM-VST: A video action recognition model based on optimized downsampling module combined with multi-scale feature fusion." PLOS ONE 20, no. 3 (March 6, 2025): e0318884. https://doi.org/10.1371/journal.pone.0318884.
Kim, Nayeon, Sukhee Cho, and Byungjun Bae. "SMaTE: A Segment-Level Feature Mixing and Temporal Encoding Framework for Facial Expression Recognition." Sensors 22, no. 15 (August 1, 2022): 5753. http://dx.doi.org/10.3390/s22155753.
Lai, Derek Ka-Hei, Ethan Shiu-Wang Cheng, Bryan Pak-Hei So, Ye-Jiao Mao, Sophia Ming-Yan Cheung, Daphne Sze Ki Cheung, Duo Wai-Chi Wong, and James Chung-Wai Cheung. "Transformer Models and Convolutional Networks with Different Activation Functions for Swallow Classification Using Depth Video Data." Mathematics 11, no. 14 (July 12, 2023): 3081. http://dx.doi.org/10.3390/math11143081.
Liu, Yuqi, Luhui Xu, Pengfei Xiong, and Qin Jin. "Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (June 26, 2023): 1781–89. http://dx.doi.org/10.1609/aaai.v37i2.25267.
Lorenzo, Javier, Ignacio Parra Alonso, Rubén Izquierdo, Augusto Luis Ballardini, Álvaro Hernández Saz, David Fernández Llorca, and Miguel Ángel Sotelo. "CAPformer: Pedestrian Crossing Action Prediction Using Transformer." Sensors 21, no. 17 (August 24, 2021): 5694. http://dx.doi.org/10.3390/s21175694.
Guo, Zizhao, and Sancong Ying. "Whole-Body Keypoint and Skeleton Augmented RGB Networks for Video Action Recognition." Applied Sciences 12, no. 12 (June 18, 2022): 6215. http://dx.doi.org/10.3390/app12126215.
Zhang, Renhong, Tianheng Cheng, Shusheng Yang, Haoyi Jiang, Shuai Zhang, Jiancheng Lyu, Xin Li, et al. "MobileInst: Video Instance Segmentation on the Mobile." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 7 (March 24, 2024): 7260–68. http://dx.doi.org/10.1609/aaai.v38i7.28555.
Zang, Chengbo, Mehmet Kerem Turkcan, Sanjeev Narasimhan, Yuqing Cao, Kaan Yarali, Zixuan Xiang, Skyler Szot, et al. "Surgical Phase Recognition in Inguinal Hernia Repair—AI-Based Confirmatory Baseline and Exploration of Competitive Models." Bioengineering 10, no. 6 (May 27, 2023): 654. http://dx.doi.org/10.3390/bioengineering10060654.
Liu, Hao, Jiwen Lu, Jianjiang Feng, and Jie Zhou. "Two-Stream Transformer Networks for Video-Based Face Alignment." IEEE Transactions on Pattern Analysis and Machine Intelligence 40, no. 11 (November 1, 2018): 2546–54. http://dx.doi.org/10.1109/tpami.2017.2734779.
Khan, Salman, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, and Mubarak Shah. "Transformers in Vision: A Survey." ACM Computing Surveys, January 6, 2022. http://dx.doi.org/10.1145/3505244.
Hsu, Tzu-Chun, Yi-Sheng Liao, and Chun-Rong Huang. "Video Summarization With Spatiotemporal Vision Transformer." IEEE Transactions on Image Processing, 2023, 1. http://dx.doi.org/10.1109/tip.2023.3275069.