Journal articles on the topic "Video Vision Transformer"
Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 journal articles for your research on the topic "Video Vision Transformer".
Next to each source in the list of references, there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scientific publication in .pdf format and read its abstract online, when it is available in the metadata.
Browse journal articles on a wide variety of disciplines and organize your bibliography correctly.
Naikwadi, Sanket Shashikant. "Video Summarization Using Vision and Language Transformer Models". International Journal of Research Publication and Reviews 6, no. 6 (January 2025): 5217–21. https://doi.org/10.55248/gengpi.6.0125.0654.
Moutik, Oumaima, Hiba Sekkat, Smail Tigani, Abdellah Chehri, Rachid Saadane, Taha Ait Tchakoucht, and Anand Paul. "Convolutional Neural Networks or Vision Transformers: Who Will Win the Race for Action Recognitions in Visual Data?" Sensors 23, no. 2 (January 9, 2023): 734. http://dx.doi.org/10.3390/s23020734.
Yuan, Hongchun, Zhenyu Cai, Hui Zhou, Yue Wang, and Xiangzhi Chen. "TransAnomaly: Video Anomaly Detection Using Video Vision Transformer". IEEE Access 9 (2021): 123977–86. http://dx.doi.org/10.1109/access.2021.3109102.
Sarraf, Saman, and Milton Kabia. "Optimal Topology of Vision Transformer for Real-Time Video Action Recognition in an End-To-End Cloud Solution". Machine Learning and Knowledge Extraction 5, no. 4 (September 29, 2023): 1320–39. http://dx.doi.org/10.3390/make5040067.
Zhao, Hong, Zhiwen Chen, Lan Guo, and Zeyu Han. "Video captioning based on vision transformer and reinforcement learning". PeerJ Computer Science 8 (March 16, 2022): e916. http://dx.doi.org/10.7717/peerj-cs.916.
Im, Heeju, and Yong Suk Choi. "A Full Transformer Video Captioning Model via Vision Transformer". KIISE Transactions on Computing Practices 29, no. 8 (August 31, 2023): 378–83. http://dx.doi.org/10.5626/ktcp.2023.29.8.378.
Ugile, Tukaram, and Nilesh Uke. "TRANSFORMER ARCHITECTURES FOR COMPUTER VISION: A COMPREHENSIVE REVIEW AND FUTURE RESEARCH DIRECTIONS". Journal of Dynamics and Control 9, no. 3 (March 15, 2025): 70–79. https://doi.org/10.71058/jodac.v9i3005.
Wu, Pengfei, Le Wang, Sanping Zhou, Gang Hua, and Changyin Sun. "Temporal Correlation Vision Transformer for Video Person Re-Identification". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 6 (March 24, 2024): 6083–91. http://dx.doi.org/10.1609/aaai.v38i6.28424.
Jin, Yanxiu, and Rulin Ma. "Applications of transformers in computer vision". Applied and Computational Engineering 16, no. 1 (October 23, 2023): 234–41. http://dx.doi.org/10.54254/2755-2721/16/20230898.
Pei, Pengfei, Xianfeng Zhao, Jinchuan Li, Yun Cao, and Xuyuan Lai. "Vision Transformer-Based Video Hashing Retrieval for Tracing the Source of Fake Videos". Security and Communication Networks 2023 (June 28, 2023): 1–16. http://dx.doi.org/10.1155/2023/5349392.
Wang, Hao, Wenjia Zhang, and Guohua Liu. "TSNet: Token Sparsification for Efficient Video Transformer". Applied Sciences 13, no. 19 (September 24, 2023): 10633. http://dx.doi.org/10.3390/app131910633.
Kim, Dahyun, and Myung Hwan Na. "Rice yield prediction and self-attention visualization using Video Vision Transformer". Korean Data Analysis Society 25, no. 4 (August 31, 2023): 1249–59. http://dx.doi.org/10.37727/jkdas.2023.25.4.1249.
Lee, Jaewoo, Sungjun Lee, Wonki Cho, Zahid Ali Siddiqui, and Unsang Park. "Vision Transformer-Based Tailing Detection in Videos". Applied Sciences 11, no. 24 (December 7, 2021): 11591. http://dx.doi.org/10.3390/app112411591.
Abdlrazg, Bassma A. Awad, Sumaia Masoud, and Mnal M. Ali. "Human Action Detection Using A hybrid Architecture of CNN and Transformer". International Science and Technology Journal 34, no. 1 (January 25, 2024): 1–15. http://dx.doi.org/10.62341/bsmh2119.
Li, Xue, Huibo Zhou, and Ming Zhao. "Transformer-based cascade networks with spatial and channel reconstruction convolution for deepfake detection". Mathematical Biosciences and Engineering 21, no. 3 (2024): 4142–64. http://dx.doi.org/10.3934/mbe.2024183.
Zhou, Siyuan, Chunru Zhan, Biao Wang, Tiezheng Ge, Yuning Jiang, and Li Niu. "Video Object of Interest Segmentation". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3805–13. http://dx.doi.org/10.1609/aaai.v37i3.25493.
Huo, Hua, and Bingjie Li. "MgMViT: Multi-Granularity and Multi-Scale Vision Transformer for Efficient Action Recognition". Electronics 13, no. 5 (February 29, 2024): 948. http://dx.doi.org/10.3390/electronics13050948.
Kumar, Pavan. "Revolutionizing Deepfake Detection and Realtime Video Vision with CNN-based Deep Learning Model". International Journal of Innovative Research in Information Security 10, no. 04 (May 8, 2024): 173–77. http://dx.doi.org/10.26562/ijiris.2024.v1004.10.
Reddy, Sai Krishna. "Advancements in Video Deblurring: A Comprehensive Review". INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 05 (May 7, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem32759.
Im, Heeju, and Yong-Suk Choi. "UAT: Universal Attention Transformer for Video Captioning". Sensors 22, no. 13 (June 25, 2022): 4817. http://dx.doi.org/10.3390/s22134817.
Yamazaki, Kashu, Khoa Vo, Quang Sang Truong, Bhiksha Raj, and Ngan Le. "VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3081–90. http://dx.doi.org/10.1609/aaai.v37i3.25412.
Choksi, Sarah, Sanjeev Narasimhan, Mattia Ballo, Mehmet Turkcan, Yiran Hu, Chengbo Zang, Alex Farrell et al. "Automatic assessment of robotic suturing utilizing computer vision in a dry-lab simulation". Artificial Intelligence Surgery 5, no. 2 (April 1, 2025): 160–69. https://doi.org/10.20517/ais.2024.84.
Narsina, Deekshith, Nicholas Richardson, Arjun Kamisetty, Jaya Chandra Srikanth Gummadi, and Krishna Devarapu. "Neural Network Architectures for Real-Time Image and Video Processing Applications". Engineering International 10, no. 2 (2022): 131–44. https://doi.org/10.18034/ei.v10i2.735.
Han, Xiao, Yongbin Wang, Shouxun Liu, and Cong Jin. "Online Multiplayer Tracking by Extracting Temporal Contexts with Transformer". Wireless Communications and Mobile Computing 2022 (October 11, 2022): 1–10. http://dx.doi.org/10.1155/2022/6177973.
Zhang, Fan, Jiawei Tian, Jianhao Wang, Guanyou Liu, and Ying Liu. "ECViST: Mine Intelligent Monitoring Based on Edge Computing and Vision Swin Transformer-YOLOv5". Energies 15, no. 23 (November 29, 2022): 9015. http://dx.doi.org/10.3390/en15239015.
Mardani, Konstantina, Nicholas Vretos, and Petros Daras. "Transformer-Based Fire Detection in Videos". Sensors 23, no. 6 (March 11, 2023): 3035. http://dx.doi.org/10.3390/s23063035.
Peng, Pengfei, Guoqing Liang, and Tao Luan. "Multi-View Inconsistency Analysis for Video Object-Level Splicing Localization". International Journal of Emerging Technologies and Advanced Applications 1, no. 3 (April 24, 2024): 1–5. http://dx.doi.org/10.62677/ijetaa.2403111.
Wang, Jing, and ZongJu Yang. "Transformer-Guided Video Inpainting Algorithm Based on Local Spatial-Temporal joint". EAI Endorsed Transactions on e-Learning 8, no. 4 (August 15, 2023): e2. http://dx.doi.org/10.4108/eetel.3156.
Le, Viet-Tuan, Kiet Tran-Trung, and Vinh Truong Hoang. "A Comprehensive Review of Recent Deep Learning Techniques for Human Activity Recognition". Computational Intelligence and Neuroscience 2022 (April 20, 2022): 1–17. http://dx.doi.org/10.1155/2022/8323962.
Hong, Jiuk, Chaehyeon Lee, and Heechul Jung. "Late Fusion-Based Video Transformer for Facial Micro-Expression Recognition". Applied Sciences 12, no. 3 (January 23, 2022): 1169. http://dx.doi.org/10.3390/app12031169.
D, Srivalli, and Divya Sri V. "Video Inpainting with Local and Global Refinement". INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 03 (March 17, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem29385.
Habeb, Mohamed H., May Salama, and Lamiaa A. Elrefaei. "Enhancing Video Anomaly Detection Using a Transformer Spatiotemporal Attention Unsupervised Framework for Large Datasets". Algorithms 17, no. 7 (July 1, 2024): 286. http://dx.doi.org/10.3390/a17070286.
Usmani, Shaheen, Sunil Kumar, and Debanjan Sadhya. "Spatio-temporal knowledge distilled video vision transformer (STKD-VViT) for multimodal deepfake detection". Neurocomputing 620 (March 2025): 129256. https://doi.org/10.1016/j.neucom.2024.129256.
Kumar, Yulia, Kuan Huang, Chin-Chien Lin, Annaliese Watson, J. Jenny Li, Patricia Morreale, and Justin Delgado. "Applying Swin Architecture to Diverse Sign Language Datasets". Electronics 13, no. 8 (April 16, 2024): 1509. http://dx.doi.org/10.3390/electronics13081509.
Li, Yixiao, Lixiang Li, Zirui Zhuang, Yuan Fang, Haipeng Peng, and Nam Ling. "Transformer-Based Data-Driven Video Coding Acceleration for Industrial Applications". Mathematical Problems in Engineering 2022 (September 27, 2022): 1–11. http://dx.doi.org/10.1155/2022/1440323.
Nikulina, Olena, Valerii Severyn, Oleksii Kondratov, and Oleksii Olhovoy. "MODELS OF REMOTE IDENTIFICATION OF PARAMETERS OF DYNAMIC OBJECTS USING DETECTION TRANSFORMERS AND OPTICAL FLOW". Bulletin of National Technical University "KhPI". Series: System Analysis, Control and Information Technologies, no. 1 (11) (July 30, 2024): 52–57. http://dx.doi.org/10.20998/2079-0023.2024.01.08.
El Moaqet, Hisham, Rami Janini, Tamer Abdulbaki Alshirbaji, Nour Aldeen Jalal, and Knut Möller. "Using Vision Transformers for Classifying Surgical Tools in Computer Aided Surgeries". Current Directions in Biomedical Engineering 10, no. 4 (December 1, 2024): 232–35. https://doi.org/10.1515/cdbme-2024-2056.
Jang, Hee-Deok, Seokjoon Kwon, Hyunwoo Nam, and Dong Eui Chang. "Chemical Gas Source Localization with Synthetic Time Series Diffusion Data Using Video Vision Transformer". Applied Sciences 14, no. 11 (May 23, 2024): 4451. http://dx.doi.org/10.3390/app14114451.
Mozaffari, M. Hamed, Yuchuan Li, Niloofar Hooshyaripour, and Yoon Ko. "Vision-Based Prediction of Flashover Using Transformers and Convolutional Long Short-Term Memory Model". Electronics 13, no. 23 (December 3, 2024): 4776. https://doi.org/10.3390/electronics13234776.
Geng, Xiaozhong, Cheng Chen, Ping Yu, Baijin Liu, Weixin Hu, Qipeng Liang, and Xintong Zhang. "OM-VST: A video action recognition model based on optimized downsampling module combined with multi-scale feature fusion". PLOS ONE 20, no. 3 (March 6, 2025): e0318884. https://doi.org/10.1371/journal.pone.0318884.
Kim, Nayeon, Sukhee Cho, and Byungjun Bae. "SMaTE: A Segment-Level Feature Mixing and Temporal Encoding Framework for Facial Expression Recognition". Sensors 22, no. 15 (August 1, 2022): 5753. http://dx.doi.org/10.3390/s22155753.
Lai, Derek Ka-Hei, Ethan Shiu-Wang Cheng, Bryan Pak-Hei So, Ye-Jiao Mao, Sophia Ming-Yan Cheung, Daphne Sze Ki Cheung, Duo Wai-Chi Wong, and James Chung-Wai Cheung. "Transformer Models and Convolutional Networks with Different Activation Functions for Swallow Classification Using Depth Video Data". Mathematics 11, no. 14 (July 12, 2023): 3081. http://dx.doi.org/10.3390/math11143081.
Liu, Yuqi, Luhui Xu, Pengfei Xiong, and Qin Jin. "Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (June 26, 2023): 1781–89. http://dx.doi.org/10.1609/aaai.v37i2.25267.
Lorenzo, Javier, Ignacio Parra Alonso, Rubén Izquierdo, Augusto Luis Ballardini, Álvaro Hernández Saz, David Fernández Llorca, and Miguel Ángel Sotelo. "CAPformer: Pedestrian Crossing Action Prediction Using Transformer". Sensors 21, no. 17 (August 24, 2021): 5694. http://dx.doi.org/10.3390/s21175694.
Guo, Zizhao, and Sancong Ying. "Whole-Body Keypoint and Skeleton Augmented RGB Networks for Video Action Recognition". Applied Sciences 12, no. 12 (June 18, 2022): 6215. http://dx.doi.org/10.3390/app12126215.
Zhang, Renhong, Tianheng Cheng, Shusheng Yang, Haoyi Jiang, Shuai Zhang, Jiancheng Lyu, Xin Li et al. "MobileInst: Video Instance Segmentation on the Mobile". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 7 (March 24, 2024): 7260–68. http://dx.doi.org/10.1609/aaai.v38i7.28555.
Zang, Chengbo, Mehmet Kerem Turkcan, Sanjeev Narasimhan, Yuqing Cao, Kaan Yarali, Zixuan Xiang, Skyler Szot et al. "Surgical Phase Recognition in Inguinal Hernia Repair—AI-Based Confirmatory Baseline and Exploration of Competitive Models". Bioengineering 10, no. 6 (May 27, 2023): 654. http://dx.doi.org/10.3390/bioengineering10060654.
Liu, Hao, Jiwen Lu, Jianjiang Feng, and Jie Zhou. "Two-Stream Transformer Networks for Video-Based Face Alignment". IEEE Transactions on Pattern Analysis and Machine Intelligence 40, no. 11 (November 1, 2018): 2546–54. http://dx.doi.org/10.1109/tpami.2017.2734779.
Khan, Salman, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, and Mubarak Shah. "Transformers in Vision: A Survey". ACM Computing Surveys, January 6, 2022. http://dx.doi.org/10.1145/3505244.
Hsu, Tzu-Chun, Yi-Sheng Liao, and Chun-Rong Huang. "Video Summarization With Spatiotemporal Vision Transformer". IEEE Transactions on Image Processing, 2023, 1. http://dx.doi.org/10.1109/tip.2023.3275069.