Journal articles on the topic "Video Vision Transformer"
Create accurate references in APA, MLA, Chicago, Harvard, and many other styles
Browse the top 50 scholarly journal articles on the topic "Video Vision Transformer".
An "Add to bibliography" button is available next to each work in the list. Use it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the publication as a ".pdf" file and read its abstract online, whenever the relevant details are available in the metadata.
Browse journal articles from a wide variety of disciplines and compile accurate bibliographies.
Naikwadi, Sanket Shashikant. "Video Summarization Using Vision and Language Transformer Models." International Journal of Research Publication and Reviews 6, no. 6 (January 2025): 5217–21. https://doi.org/10.55248/gengpi.6.0125.0654.
Moutik, Oumaima, Hiba Sekkat, Smail Tigani, Abdellah Chehri, Rachid Saadane, Taha Ait Tchakoucht, and Anand Paul. "Convolutional Neural Networks or Vision Transformers: Who Will Win the Race for Action Recognitions in Visual Data?" Sensors 23, no. 2 (January 9, 2023): 734. http://dx.doi.org/10.3390/s23020734.
Yuan, Hongchun, Zhenyu Cai, Hui Zhou, Yue Wang, and Xiangzhi Chen. "TransAnomaly: Video Anomaly Detection Using Video Vision Transformer." IEEE Access 9 (2021): 123977–86. http://dx.doi.org/10.1109/access.2021.3109102.
Sarraf, Saman, and Milton Kabia. "Optimal Topology of Vision Transformer for Real-Time Video Action Recognition in an End-To-End Cloud Solution." Machine Learning and Knowledge Extraction 5, no. 4 (September 29, 2023): 1320–39. http://dx.doi.org/10.3390/make5040067.
Zhao, Hong, Zhiwen Chen, Lan Guo, and Zeyu Han. "Video captioning based on vision transformer and reinforcement learning." PeerJ Computer Science 8 (March 16, 2022): e916. http://dx.doi.org/10.7717/peerj-cs.916.
Im, Heeju, and Yong Suk Choi. "A Full Transformer Video Captioning Model via Vision Transformer." KIISE Transactions on Computing Practices 29, no. 8 (August 31, 2023): 378–83. http://dx.doi.org/10.5626/ktcp.2023.29.8.378.
Ugile, Tukaram, and Nilesh Uke. "TRANSFORMER ARCHITECTURES FOR COMPUTER VISION: A COMPREHENSIVE REVIEW AND FUTURE RESEARCH DIRECTIONS." Journal of Dynamics and Control 9, no. 3 (March 15, 2025): 70–79. https://doi.org/10.71058/jodac.v9i3005.
Wu, Pengfei, Le Wang, Sanping Zhou, Gang Hua, and Changyin Sun. "Temporal Correlation Vision Transformer for Video Person Re-Identification." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 6 (March 24, 2024): 6083–91. http://dx.doi.org/10.1609/aaai.v38i6.28424.
Jin, Yanxiu, and Rulin Ma. "Applications of transformers in computer vision." Applied and Computational Engineering 16, no. 1 (October 23, 2023): 234–41. http://dx.doi.org/10.54254/2755-2721/16/20230898.
Pei, Pengfei, Xianfeng Zhao, Jinchuan Li, Yun Cao, and Xuyuan Lai. "Vision Transformer-Based Video Hashing Retrieval for Tracing the Source of Fake Videos." Security and Communication Networks 2023 (June 28, 2023): 1–16. http://dx.doi.org/10.1155/2023/5349392.
Wang, Hao, Wenjia Zhang, and Guohua Liu. "TSNet: Token Sparsification for Efficient Video Transformer." Applied Sciences 13, no. 19 (September 24, 2023): 10633. http://dx.doi.org/10.3390/app131910633.
Kim, Dahyun, and Myung Hwan Na. "Rice yield prediction and self-attention visualization using Video Vision Transformer." Korean Data Analysis Society 25, no. 4 (August 31, 2023): 1249–59. http://dx.doi.org/10.37727/jkdas.2023.25.4.1249.
Lee, Jaewoo, Sungjun Lee, Wonki Cho, Zahid Ali Siddiqui, and Unsang Park. "Vision Transformer-Based Tailing Detection in Videos." Applied Sciences 11, no. 24 (December 7, 2021): 11591. http://dx.doi.org/10.3390/app112411591.
Abdlrazg, Bassma A. Awad, Sumaia Masoud, and Mnal M. Ali. "Human Action Detection Using A hybrid Architecture of CNN and Transformer." International Science and Technology Journal 34, no. 1 (January 25, 2024): 1–15. http://dx.doi.org/10.62341/bsmh2119.
Li, Xue, Huibo Zhou, and Ming Zhao. "Transformer-based cascade networks with spatial and channel reconstruction convolution for deepfake detection." Mathematical Biosciences and Engineering 21, no. 3 (2024): 4142–64. http://dx.doi.org/10.3934/mbe.2024183.
Zhou, Siyuan, Chunru Zhan, Biao Wang, Tiezheng Ge, Yuning Jiang, and Li Niu. "Video Object of Interest Segmentation." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3805–13. http://dx.doi.org/10.1609/aaai.v37i3.25493.
Huo, Hua, and Bingjie Li. "MgMViT: Multi-Granularity and Multi-Scale Vision Transformer for Efficient Action Recognition." Electronics 13, no. 5 (February 29, 2024): 948. http://dx.doi.org/10.3390/electronics13050948.
Kumar, Pavan. "Revolutionizing Deepfake Detection and Realtime Video Vision with CNN-based Deep Learning Model." International Journal of Innovative Research in Information Security 10, no. 04 (May 8, 2024): 173–77. http://dx.doi.org/10.26562/ijiris.2024.v1004.10.
Reddy, Sai Krishna. "Advancements in Video Deblurring: A Comprehensive Review." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 05 (May 7, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem32759.
Im, Heeju, and Yong-Suk Choi. "UAT: Universal Attention Transformer for Video Captioning." Sensors 22, no. 13 (June 25, 2022): 4817. http://dx.doi.org/10.3390/s22134817.
Yamazaki, Kashu, Khoa Vo, Quang Sang Truong, Bhiksha Raj, and Ngan Le. "VLTinT: Visual-Linguistic Transformer-in-Transformer for Coherent Video Paragraph Captioning." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 3 (June 26, 2023): 3081–90. http://dx.doi.org/10.1609/aaai.v37i3.25412.
Choksi, Sarah, Sanjeev Narasimhan, Mattia Ballo, Mehmet Turkcan, Yiran Hu, Chengbo Zang, Alex Farrell, et al. "Automatic assessment of robotic suturing utilizing computer vision in a dry-lab simulation." Artificial Intelligence Surgery 5, no. 2 (April 1, 2025): 160–9. https://doi.org/10.20517/ais.2024.84.
Narsina, Deekshith, Nicholas Richardson, Arjun Kamisetty, Jaya Chandra Srikanth Gummadi, and Krishna Devarapu. "Neural Network Architectures for Real-Time Image and Video Processing Applications." Engineering International 10, no. 2 (2022): 131–44. https://doi.org/10.18034/ei.v10i2.735.
Han, Xiao, Yongbin Wang, Shouxun Liu, and Cong Jin. "Online Multiplayer Tracking by Extracting Temporal Contexts with Transformer." Wireless Communications and Mobile Computing 2022 (October 11, 2022): 1–10. http://dx.doi.org/10.1155/2022/6177973.
Zhang, Fan, Jiawei Tian, Jianhao Wang, Guanyou Liu, and Ying Liu. "ECViST: Mine Intelligent Monitoring Based on Edge Computing and Vision Swin Transformer-YOLOv5." Energies 15, no. 23 (November 29, 2022): 9015. http://dx.doi.org/10.3390/en15239015.
Mardani, Konstantina, Nicholas Vretos, and Petros Daras. "Transformer-Based Fire Detection in Videos." Sensors 23, no. 6 (March 11, 2023): 3035. http://dx.doi.org/10.3390/s23063035.
Peng, Pengfei, Guoqing Liang, and Tao Luan. "Multi-View Inconsistency Analysis for Video Object-Level Splicing Localization." International Journal of Emerging Technologies and Advanced Applications 1, no. 3 (April 24, 2024): 1–5. http://dx.doi.org/10.62677/ijetaa.2403111.
Wang, Jing, and ZongJu Yang. "Transformer-Guided Video Inpainting Algorithm Based on Local Spatial-Temporal joint." EAI Endorsed Transactions on e-Learning 8, no. 4 (August 15, 2023): e2. http://dx.doi.org/10.4108/eetel.3156.
Le, Viet-Tuan, Kiet Tran-Trung, and Vinh Truong Hoang. "A Comprehensive Review of Recent Deep Learning Techniques for Human Activity Recognition." Computational Intelligence and Neuroscience 2022 (April 20, 2022): 1–17. http://dx.doi.org/10.1155/2022/8323962.
Hong, Jiuk, Chaehyeon Lee, and Heechul Jung. "Late Fusion-Based Video Transformer for Facial Micro-Expression Recognition." Applied Sciences 12, no. 3 (January 23, 2022): 1169. http://dx.doi.org/10.3390/app12031169.
D, Srivalli, and Divya Sri V. "Video Inpainting with Local and Global Refinement." INTERANTIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 03 (March 17, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem29385.
Habeb, Mohamed H., May Salama, and Lamiaa A. Elrefaei. "Enhancing Video Anomaly Detection Using a Transformer Spatiotemporal Attention Unsupervised Framework for Large Datasets." Algorithms 17, no. 7 (July 1, 2024): 286. http://dx.doi.org/10.3390/a17070286.
Usmani, Shaheen, Sunil Kumar, and Debanjan Sadhya. "Spatio-temporal knowledge distilled video vision transformer (STKD-VViT) for multimodal deepfake detection." Neurocomputing 620 (March 2025): 129256. https://doi.org/10.1016/j.neucom.2024.129256.
Kumar, Yulia, Kuan Huang, Chin-Chien Lin, Annaliese Watson, J. Jenny Li, Patricia Morreale, and Justin Delgado. "Applying Swin Architecture to Diverse Sign Language Datasets." Electronics 13, no. 8 (April 16, 2024): 1509. http://dx.doi.org/10.3390/electronics13081509.
Li, Yixiao, Lixiang Li, Zirui Zhuang, Yuan Fang, Haipeng Peng, and Nam Ling. "Transformer-Based Data-Driven Video Coding Acceleration for Industrial Applications." Mathematical Problems in Engineering 2022 (September 27, 2022): 1–11. http://dx.doi.org/10.1155/2022/1440323.
Nikulina, Olena, Valerii Severyn, Oleksii Kondratov, and Oleksii Olhovoy. "MODELS OF REMOTE IDENTIFICATION OF PARAMETERS OF DYNAMIC OBJECTS USING DETECTION TRANSFORMERS AND OPTICAL FLOW." Bulletin of National Technical University "KhPI". Series: System Analysis, Control and Information Technologies, no. 1 (11) (July 30, 2024): 52–57. http://dx.doi.org/10.20998/2079-0023.2024.01.08.
El Moaqet, Hisham, Rami Janini, Tamer Abdulbaki Alshirbaji, Nour Aldeen Jalal, and Knut Möller. "Using Vision Transformers for Classifying Surgical Tools in Computer Aided Surgeries." Current Directions in Biomedical Engineering 10, no. 4 (December 1, 2024): 232–35. https://doi.org/10.1515/cdbme-2024-2056.
Jang, Hee-Deok, Seokjoon Kwon, Hyunwoo Nam, and Dong Eui Chang. "Chemical Gas Source Localization with Synthetic Time Series Diffusion Data Using Video Vision Transformer." Applied Sciences 14, no. 11 (May 23, 2024): 4451. http://dx.doi.org/10.3390/app14114451.
Mozaffari, M. Hamed, Yuchuan Li, Niloofar Hooshyaripour, and Yoon Ko. "Vision-Based Prediction of Flashover Using Transformers and Convolutional Long Short-Term Memory Model." Electronics 13, no. 23 (December 3, 2024): 4776. https://doi.org/10.3390/electronics13234776.
Geng, Xiaozhong, Cheng Chen, Ping Yu, Baijin Liu, Weixin Hu, Qipeng Liang, and Xintong Zhang. "OM-VST: A video action recognition model based on optimized downsampling module combined with multi-scale feature fusion." PLOS ONE 20, no. 3 (March 6, 2025): e0318884. https://doi.org/10.1371/journal.pone.0318884.
Kim, Nayeon, Sukhee Cho, and Byungjun Bae. "SMaTE: A Segment-Level Feature Mixing and Temporal Encoding Framework for Facial Expression Recognition." Sensors 22, no. 15 (August 1, 2022): 5753. http://dx.doi.org/10.3390/s22155753.
Lai, Derek Ka-Hei, Ethan Shiu-Wang Cheng, Bryan Pak-Hei So, Ye-Jiao Mao, Sophia Ming-Yan Cheung, Daphne Sze Ki Cheung, Duo Wai-Chi Wong, and James Chung-Wai Cheung. "Transformer Models and Convolutional Networks with Different Activation Functions for Swallow Classification Using Depth Video Data." Mathematics 11, no. 14 (July 12, 2023): 3081. http://dx.doi.org/10.3390/math11143081.
Liu, Yuqi, Luhui Xu, Pengfei Xiong, and Qin Jin. "Token Mixing: Parameter-Efficient Transfer Learning from Image-Language to Video-Language." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (June 26, 2023): 1781–89. http://dx.doi.org/10.1609/aaai.v37i2.25267.
Lorenzo, Javier, Ignacio Parra Alonso, Rubén Izquierdo, Augusto Luis Ballardini, Álvaro Hernández Saz, David Fernández Llorca, and Miguel Ángel Sotelo. "CAPformer: Pedestrian Crossing Action Prediction Using Transformer." Sensors 21, no. 17 (August 24, 2021): 5694. http://dx.doi.org/10.3390/s21175694.
Guo, Zizhao, and Sancong Ying. "Whole-Body Keypoint and Skeleton Augmented RGB Networks for Video Action Recognition." Applied Sciences 12, no. 12 (June 18, 2022): 6215. http://dx.doi.org/10.3390/app12126215.
Zhang, Renhong, Tianheng Cheng, Shusheng Yang, Haoyi Jiang, Shuai Zhang, Jiancheng Lyu, Xin Li, et al. "MobileInst: Video Instance Segmentation on the Mobile." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 7 (March 24, 2024): 7260–68. http://dx.doi.org/10.1609/aaai.v38i7.28555.
Zang, Chengbo, Mehmet Kerem Turkcan, Sanjeev Narasimhan, Yuqing Cao, Kaan Yarali, Zixuan Xiang, Skyler Szot, et al. "Surgical Phase Recognition in Inguinal Hernia Repair—AI-Based Confirmatory Baseline and Exploration of Competitive Models." Bioengineering 10, no. 6 (May 27, 2023): 654. http://dx.doi.org/10.3390/bioengineering10060654.
Liu, Hao, Jiwen Lu, Jianjiang Feng, and Jie Zhou. "Two-Stream Transformer Networks for Video-Based Face Alignment." IEEE Transactions on Pattern Analysis and Machine Intelligence 40, no. 11 (November 1, 2018): 2546–54. http://dx.doi.org/10.1109/tpami.2017.2734779.
Khan, Salman, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, and Mubarak Shah. "Transformers in Vision: A Survey." ACM Computing Surveys, January 6, 2022. http://dx.doi.org/10.1145/3505244.
Hsu, Tzu-Chun, Yi-Sheng Liao, and Chun-Rong Huang. "Video Summarization With Spatiotemporal Vision Transformer." IEEE Transactions on Image Processing, 2023, 1. http://dx.doi.org/10.1109/tip.2023.3275069.