A selection of scholarly literature on the topic "Video question answering"
Browse lists of relevant articles, books, dissertations, conference papers, and other scholarly sources on the topic "Video question answering".
Journal articles on the topic "Video question answering"
Lei, Chenyi, Lei Wu, Dong Liu, Zhao Li, Guoxin Wang, Haihong Tang, and Houqiang Li. "Multi-Question Learning for Visual Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11328–35. http://dx.doi.org/10.1609/aaai.v34i07.6794.
Ruwa, Nelson, Qirong Mao, Liangjun Wang, and Jianping Gou. "Affective question answering on video." Neurocomputing 363 (October 2019): 125–39. http://dx.doi.org/10.1016/j.neucom.2019.06.046.
Wang, Yueqian, Yuxuan Wang, Kai Chen, and Dongyan Zhao. "STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19215–23. http://dx.doi.org/10.1609/aaai.v38i17.29890.
Zong, Linlin, Jiahui Wan, Xianchao Zhang, Xinyue Liu, Wenxin Liang, and Bo Xu. "Video-Context Aligned Transformer for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19795–803. http://dx.doi.org/10.1609/aaai.v38i17.29954.
Huang, Deng, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, and Chuang Gan. "Location-Aware Graph Convolutional Networks for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11021–28. http://dx.doi.org/10.1609/aaai.v34i07.6737.
Gao, Lianli, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, Tao Mei, and Heng Tao Shen. "Structured Two-Stream Attention Network for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6391–98. http://dx.doi.org/10.1609/aaai.v33i01.33016391.
Kumar, Krishnamoorthi Magesh, and P. Valarmathie. "Domain and Intelligence Based Multimedia Question Answering System." International Journal of Evaluation and Research in Education (IJERE) 5, no. 3 (September 1, 2016): 227. http://dx.doi.org/10.11591/ijere.v5i3.4544.
Xue, Hongyang, Zhou Zhao, and Deng Cai. "Unifying the Video and Question Attentions for Open-Ended Video Question Answering." IEEE Transactions on Image Processing 26, no. 12 (December 2017): 5656–66. http://dx.doi.org/10.1109/tip.2017.2746267.
Jang, Yunseok, Yale Song, Chris Dongjoo Kim, Youngjae Yu, Youngjin Kim, and Gunhee Kim. "Video Question Answering with Spatio-Temporal Reasoning." International Journal of Computer Vision 127, no. 10 (June 18, 2019): 1385–412. http://dx.doi.org/10.1007/s11263-019-01189-x.
Zhuang, Yueting, Dejing Xu, Xin Yan, Wenzhuo Cheng, Zhou Zhao, Shiliang Pu, and Jun Xiao. "Multichannel Attention Refinement for Video Question Answering." ACM Transactions on Multimedia Computing, Communications, and Applications 16, no. 1s (April 28, 2020): 1–23. http://dx.doi.org/10.1145/3366710.
Повний текст джерелаДисертації з теми "Video question answering"
Engin, Deniz. "Video question answering with limited supervision." Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS016.
Video content has significantly increased in volume and diversity in the digital era, and this expansion has highlighted the necessity for advanced video understanding technologies. Driven by this necessity, this thesis explores the semantic understanding of videos, leveraging multiple perceptual modes similar to human cognitive processes and efficient learning with limited supervision similar to human learning capabilities. It focuses on video question answering as one of the main video understanding tasks. Our first contribution addresses long-range video question answering, which requires an understanding of extended video content. While recent approaches rely on human-generated external sources, we process raw data to generate video summaries. Our second contribution explores zero-shot and few-shot video question answering, aiming to enhance efficient learning from limited data. We leverage the knowledge of existing large-scale models by eliminating the challenges of adapting pre-trained models to limited data. We demonstrate that these contributions significantly enhance the capabilities of multimodal video question answering systems, particularly where human-annotated labeled data is limited or unavailable.
Chowdhury, Muhammad Iqbal Hasan. "Question-answering on image/video content." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205096/1/Muhammad%20Iqbal%20Hasan_Chowdhury_Thesis.pdf.
Повний текст джерелаZeng, Kuo-Hao, and 曾國豪. "Video titling and Question-Answering." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/a3a6sw.
National Tsing Hua University
Department of Electrical Engineering
Academic year 105 (2016–2017)
Video titling and question answering are two important tasks toward high-level visual data understanding. To address these two tasks, we propose a large-scale dataset and demonstrate several models on it in this work.

A great video title describes the most salient event compactly and captures the viewer's attention. In contrast, video captioning tends to generate sentences that describe the video as a whole. Although generating a video title automatically is a very useful task, it is much less addressed than video captioning. We address video title generation for the first time by proposing two methods that extend state-of-the-art video captioners to this new task. First, we make video captioners highlight-sensitive by priming them with a highlight detector. Our framework allows for jointly training a model for title generation and video highlight localization. Second, we induce high sentence diversity in video captioners, so that the generated titles are also diverse and catchy. This means that a large number of sentences might be required to learn the sentence structure of titles. Hence, we propose a novel sentence augmentation method to train a captioner with additional sentence-only examples that come without corresponding videos.

For the video question answering task, we propose to learn a deep model that answers a free-form natural language question about the contents of a video. We build a program that automatically harvests a large number of videos and descriptions freely available online. A large number of candidate QA pairs are then generated automatically from these descriptions rather than annotated manually. Next, we use these candidate QA pairs to train a number of video-based QA methods extended from MN, VQA, SA, and SS. To handle imperfect candidate QA pairs, we propose a self-paced learning procedure that iteratively identifies them and mitigates their effects in training.
To demonstrate our idea, we collected a large-scale Video Titles in the Wild (VTW) dataset of 18,100 automatically crawled user-generated videos and titles. We then utilize an automatic QA generator to produce a large number of QA pairs for training and collect manually generated QA pairs from Amazon Mechanical Turk. On VTW, our methods consistently improve title prediction accuracy and achieve the best performance in both automatic and human evaluation. Our sentence augmentation method also outperforms the baselines on the M-VAD dataset. Finally, the video question answering results show that our self-paced learning procedure is effective, and that the extended SS model outperforms various baselines.
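The self-paced learning idea the abstract describes can be illustrated with a minimal sketch. This is a generic hard-weighting scheme, not the thesis's actual implementation: the function names, the per-sample losses, and the threshold schedule (`lam0`, `growth`) are illustrative assumptions. The gist is that training begins with "easy" (low-loss, likely well-formed) candidate QA pairs and gradually admits harder, possibly noisy ones as the threshold grows.

```python
def self_paced_weights(losses, lam):
    # Hard self-paced weights: 1.0 for "easy" samples whose current
    # loss falls below the threshold lam, 0.0 (excluded) otherwise.
    return [1.0 if loss < lam else 0.0 for loss in losses]

def self_paced_schedule(losses_per_epoch, lam0=0.5, growth=1.5):
    # Raise the threshold each epoch so harder (possibly noisy) QA
    # pairs gradually enter training; return the number kept per epoch.
    lam, kept = lam0, []
    for losses in losses_per_epoch:
        weights = self_paced_weights(losses, lam)
        kept.append(int(sum(weights)))
        lam *= growth
    return kept

# With three samples of fixed loss 0.2, 0.6, 1.0, the curriculum admits
# one, then two, then all three samples across epochs.
print(self_paced_schedule([[0.2, 0.6, 1.0]] * 3))  # [1, 2, 3]
```

In a real training loop the weights would multiply each sample's loss before backpropagation, and the per-sample losses would be recomputed every epoch as the model improves.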
Xu, Huijuan. "Vision and language understanding with localized evidence." Thesis, 2018. https://hdl.handle.net/2144/34790.
Books on the topic "Video question answering"
McCallum, Richard. Evangelical Christian Responses to Islam. Bloomsbury Publishing Plc, 2024. http://dx.doi.org/10.5040/9781350418240.
Walker, Stephen. Digital Mediation. Bloomsbury Publishing Plc, 2024. http://dx.doi.org/10.5040/9781526525772.
Book chapters on the topic "Video question answering"
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Question Answering." In Visual Question Answering, 119–33. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_8.
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Representation Learning." In Visual Question Answering, 111–17. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_7.
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Advanced Models for Video Question Answering." In Visual Question Answering, 135–43. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_9.
Gao, Lei, Guangda Li, Yan-Tao Zheng, Richang Hong, and Tat-Seng Chua. "Video Reference: A Video Question Answering Engine." In Lecture Notes in Computer Science, 799–801. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-11301-7_92.
Xiao, Junbin, Pan Zhou, Tat-Seng Chua, and Shuicheng Yan. "Video Graph Transformer for Video Question Answering." In Lecture Notes in Computer Science, 39–58. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20059-5_3.
Piergiovanni, AJ, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, and Anelia Angelova. "Video Question Answering with Iterative Video-Text Co-tokenization." In Lecture Notes in Computer Science, 76–94. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20059-5_5.
Chen, Xuanwei, Rui Liu, Xiaomeng Song, and Yahong Han. "Locating Visual Explanations for Video Question Answering." In MultiMedia Modeling, 290–302. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-67832-6_24.
Gupta, Pranay, and Manish Gupta. "NewsKVQA: Knowledge-Aware News Video Question Answering." In Advances in Knowledge Discovery and Data Mining, 3–15. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-05981-0_1.
Ge, Yuanyuan, Youjiang Xu, and Yahong Han. "Video Question Answering Using a Forget Memory Network." In Communications in Computer and Information Science, 404–15. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7299-4_33.
Gao, Kun, Xianglei Zhu, and Yahong Han. "Initialized Frame Attention Networks for Video Question Answering." In Communications in Computer and Information Science, 349–59. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-8530-7_34.
Conference papers on the topic "Video question answering"
Zhao, Wentian, Seokhwan Kim, Ning Xu, and Hailin Jin. "Video Question Answering on Screencast Tutorials." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/148.
Jenni, Kommineni, M. Srinivas, Roshni Sannapu, and Murukessan Perumal. "CSA-BERT: Video Question Answering." In 2023 IEEE Statistical Signal Processing Workshop (SSP). IEEE, 2023. http://dx.doi.org/10.1109/ssp53291.2023.10207954.
Li, Hao, Peng Jin, Zesen Cheng, Songyang Zhang, Kai Chen, Zhennan Wang, Chang Liu, and Jie Chen. "TG-VQA: Ternary Game of Video Question Answering." In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/116.
Zhao, Zhou, Qifan Yang, Deng Cai, Xiaofei He, and Yueting Zhuang. "Video Question Answering via Hierarchical Spatio-Temporal Attention Networks." In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/492.
Chao, Guan-Lin, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tur, Jindong Chen, and Ian Lane. "Learning Question-Guided Video Representation for Multi-Turn Video Question Answering." In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/w19-5926.
Bhalerao, Mandar, Shlok Gujar, Aditya Bhave, and Anant V. Nimkar. "Visual Question Answering Using Video Clips." In 2019 IEEE Bombay Section Signature Conference (IBSSC). IEEE, 2019. http://dx.doi.org/10.1109/ibssc47189.2019.8973090.
Yang, Zekun, Noa Garcia, Chenhui Chu, Mayu Otani, Yuta Nakashima, and Haruo Takemura. "BERT Representations for Video Question Answering." In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2020. http://dx.doi.org/10.1109/wacv45572.2020.9093596.
Li, Yicong, Xiang Wang, Junbin Xiao, Wei Ji, and Tat-Seng Chua. "Invariant Grounding for Video Question Answering." In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022. http://dx.doi.org/10.1109/cvpr52688.2022.00294.
Fang, Jiannan, Lingling Sun, and Yaqi Wang. "Video question answering by frame attention." In Eleventh International Conference on Digital Image Processing, edited by Xudong Jiang and Jenq-Neng Hwang. SPIE, 2019. http://dx.doi.org/10.1117/12.2539615.
Lei, Jie, Licheng Yu, Mohit Bansal, and Tamara Berg. "TVQA: Localized, Compositional Video Question Answering." In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/d18-1167.