Selected scientific literature on the topic "Video question answering"
Cite a source in APA, MLA, Chicago, Harvard, and many other styles
Consult the list of current articles, books, theses, conference proceedings, and other scientific sources relevant to the topic "Video question answering".
Next to each source in the reference list there is an "Add to bibliography" button. Click it and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scientific publication in .pdf format and read the abstract (summary) of the work online, if it is included in the metadata.
Journal articles on the topic "Video question answering"
Lei, Chenyi, Lei Wu, Dong Liu, Zhao Li, Guoxin Wang, Haihong Tang, and Houqiang Li. "Multi-Question Learning for Visual Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11328–35. http://dx.doi.org/10.1609/aaai.v34i07.6794.
Ruwa, Nelson, Qirong Mao, Liangjun Wang, and Jianping Gou. "Affective question answering on video". Neurocomputing 363 (October 2019): 125–39. http://dx.doi.org/10.1016/j.neucom.2019.06.046.
Wang, Yueqian, Yuxuan Wang, Kai Chen, and Dongyan Zhao. "STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19215–23. http://dx.doi.org/10.1609/aaai.v38i17.29890.
Zong, Linlin, Jiahui Wan, Xianchao Zhang, Xinyue Liu, Wenxin Liang, and Bo Xu. "Video-Context Aligned Transformer for Video Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19795–803. http://dx.doi.org/10.1609/aaai.v38i17.29954.
Huang, Deng, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, and Chuang Gan. "Location-Aware Graph Convolutional Networks for Video Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11021–28. http://dx.doi.org/10.1609/aaai.v34i07.6737.
Gao, Lianli, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, Tao Mei, and Heng Tao Shen. "Structured Two-Stream Attention Network for Video Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6391–98. http://dx.doi.org/10.1609/aaai.v33i01.33016391.
Kumar, Krishnamoorthi Magesh, and P. Valarmathie. "Domain and Intelligence Based Multimedia Question Answering System". International Journal of Evaluation and Research in Education (IJERE) 5, no. 3 (September 1, 2016): 227. http://dx.doi.org/10.11591/ijere.v5i3.4544.
Xue, Hongyang, Zhou Zhao, and Deng Cai. "Unifying the Video and Question Attentions for Open-Ended Video Question Answering". IEEE Transactions on Image Processing 26, no. 12 (December 2017): 5656–66. http://dx.doi.org/10.1109/tip.2017.2746267.
Jang, Yunseok, Yale Song, Chris Dongjoo Kim, Youngjae Yu, Youngjin Kim, and Gunhee Kim. "Video Question Answering with Spatio-Temporal Reasoning". International Journal of Computer Vision 127, no. 10 (June 18, 2019): 1385–412. http://dx.doi.org/10.1007/s11263-019-01189-x.
Zhuang, Yueting, Dejing Xu, Xin Yan, Wenzhuo Cheng, Zhou Zhao, Shiliang Pu, and Jun Xiao. "Multichannel Attention Refinement for Video Question Answering". ACM Transactions on Multimedia Computing, Communications, and Applications 16, no. 1s (April 28, 2020): 1–23. http://dx.doi.org/10.1145/3366710.
Theses on the topic "Video question answering"
Engin, Deniz. "Video question answering with limited supervision". Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS016.
Video content has increased significantly in volume and diversity in the digital era, and this expansion has highlighted the need for advanced video understanding technologies. Driven by this need, this thesis explores semantic video understanding, leveraging multiple perceptual modes similar to human cognitive processes and efficient learning from limited supervision similar to human learning capabilities. The thesis focuses specifically on video question answering, one of the main video understanding tasks. Our first contribution addresses long-range video question answering, which requires an understanding of extended video content. While recent approaches rely on human-generated external sources, we process raw data to generate video summaries. Our subsequent contribution explores zero-shot and few-shot video question answering, aiming to enable efficient learning from limited data. We leverage the knowledge of existing large-scale models while eliminating the challenges of adapting pre-trained models to limited data. We demonstrate that these contributions significantly enhance the capabilities of multimodal video question-answering systems, particularly where human-annotated labeled data is limited or unavailable.
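The summarize-then-answer strategy this abstract describes for long-range video question answering can be illustrated with a toy sketch (not the thesis's actual method; all names are hypothetical, fixed segment relevance scores stand in for a learned summarizer, and word overlap stands in for a real answering model):

```python
def summarize(segment_captions, scores, k=2):
    """Keep the k highest-scoring segment captions as a compact summary
    of a long video (scores stand in for a learned relevance model)."""
    ranked = sorted(zip(scores, segment_captions), reverse=True)
    return [caption for _, caption in ranked[:k]]

def answer(question, summary):
    """Pick the summary sentence with the most word overlap with the
    question -- a stand-in for a model reading the summary."""
    question_words = set(question.lower().split())
    return max(summary, key=lambda s: len(question_words & set(s.lower().split())))

# Toy example: three video segments, two of which are deemed relevant.
captions = ["a man opens a door", "a dog chases a ball", "the man drinks coffee"]
scores = [0.9, 0.2, 0.7]
summary = summarize(captions, scores)
reply = answer("what does the man drink", summary)
```

The point of the design is that the answering model only ever sees the short summary rather than the full video, which is what keeps long-range inputs tractable.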
Chowdhury, Muhammad Iqbal Hasan. "Question-answering on image/video content". Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205096/1/Muhammad%20Iqbal%20Hasan_Chowdhury_Thesis.pdf.
Zeng, Kuo-Hao, and 曾國豪. "Video titling and Question-Answering". Thesis, 2017. http://ndltd.ncl.edu.tw/handle/a3a6sw.
National Tsing Hua University
Department of Electrical Engineering
105
Video titling and question answering are two important tasks toward high-level visual data understanding. To address these two tasks, we propose a large-scale dataset and demonstrate several models on it in this work. A great video title describes the most salient event compactly and captures the viewer's attention. In contrast, video captioning tends to generate sentences that describe the video as a whole. Although generating a video title automatically is a very useful task, it is much less addressed than video captioning. We address video title generation for the first time by proposing two methods that extend state-of-the-art video captioners to this new task. First, we make video captioners highlight-sensitive by priming them with a highlight detector. Our framework allows a model to be jointly trained for title generation and video highlight localization. Second, we induce high sentence diversity in video captioners, so that the generated titles are also diverse and catchy. This means that a large number of sentences might be required to learn the sentence structure of titles. Hence, we propose a novel sentence augmentation method to train a captioner with additional sentence-only examples that come without corresponding videos. For the video question-answering task, we propose to learn a deep model that answers free-form natural language questions about the contents of a video. A program automatically harvests a large number of videos and descriptions freely available online, and a large number of candidate QA pairs are then generated automatically from the descriptions rather than annotated manually. Next, we use these candidate QA pairs to train a number of video-based QA methods extended from MN, VQA, SA, and SS. To handle imperfect candidate QA pairs, we propose a self-paced learning procedure that iteratively identifies them and mitigates their effects during training.
To demonstrate our idea, we collected a large-scale Video Titles in the Wild (VTW) dataset of 18,100 automatically crawled user-generated videos and titles. We then utilize an automatic QA generator to produce a large number of QA pairs for training and collect manually generated QA pairs from Amazon Mechanical Turk. On VTW, our methods consistently improve title prediction accuracy and achieve the best performance in both automatic and human evaluation. Next, our sentence augmentation method also outperforms the baselines on the M-VAD dataset. Finally, the video question answering results show that our self-paced learning procedure is effective, and the extended SS model outperforms various baselines.
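The self-paced learning procedure mentioned above — iteratively identifying unreliable, automatically generated QA pairs and limiting their effect on training — can be sketched in a toy form (hypothetical names; fixed per-sample losses stand in for a real model's training losses, which would be recomputed every round):

```python
def self_paced_selection(losses, lam):
    """Admit only samples whose loss falls below the age threshold lam
    (binary self-paced weights: 1 = reliable/easy, 0 = excluded)."""
    return [1 if loss < lam else 0 for loss in losses]

def self_paced_training(sample_losses, initial_lam=0.5, growth=2.0, rounds=3):
    """Toy self-paced loop: start from easy samples, then relax the
    threshold each round so progressively harder samples are admitted."""
    lam = initial_lam
    history = []
    for _ in range(rounds):
        ids = sorted(sample_losses)
        weights = self_paced_selection([sample_losses[i] for i in ids], lam)
        history.append([i for i, w in zip(ids, weights) if w])
        lam *= growth  # "age" the model: admit harder samples next round
    return history

# Candidate QA pairs; noisy auto-generated pairs tend to keep high losses.
qa_losses = {"q1": 0.1, "q2": 0.4, "q3": 1.2, "q4": 3.5}
history = self_paced_training(qa_losses)
```

Here the persistently high-loss pair `q4` is never admitted, which is how a self-paced schedule suppresses the influence of imperfect auto-generated pairs.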
Xu, Huijuan. "Vision and language understanding with localized evidence". Thesis, 2018. https://hdl.handle.net/2144/34790.
Books on the topic "Video question answering"
McCallum, Richard. Evangelical Christian Responses to Islam. Bloomsbury Publishing Plc, 2024. http://dx.doi.org/10.5040/9781350418240.
Walker, Stephen. Digital Mediation. Bloomsbury Publishing Plc, 2024. http://dx.doi.org/10.5040/9781526525772.
Book chapters on the topic "Video question answering"
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Question Answering". In Visual Question Answering, 119–33. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_8.
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Representation Learning". In Visual Question Answering, 111–17. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_7.
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Advanced Models for Video Question Answering". In Visual Question Answering, 135–43. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_9.
Gao, Lei, Guangda Li, Yan-Tao Zheng, Richang Hong, and Tat-Seng Chua. "Video Reference: A Video Question Answering Engine". In Lecture Notes in Computer Science, 799–801. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-11301-7_92.
Xiao, Junbin, Pan Zhou, Tat-Seng Chua, and Shuicheng Yan. "Video Graph Transformer for Video Question Answering". In Lecture Notes in Computer Science, 39–58. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20059-5_3.
Piergiovanni, AJ, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, and Anelia Angelova. "Video Question Answering with Iterative Video-Text Co-tokenization". In Lecture Notes in Computer Science, 76–94. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20059-5_5.
Chen, Xuanwei, Rui Liu, Xiaomeng Song, and Yahong Han. "Locating Visual Explanations for Video Question Answering". In MultiMedia Modeling, 290–302. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-67832-6_24.
Gupta, Pranay, and Manish Gupta. "NewsKVQA: Knowledge-Aware News Video Question Answering". In Advances in Knowledge Discovery and Data Mining, 3–15. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-05981-0_1.
Ge, Yuanyuan, Youjiang Xu, and Yahong Han. "Video Question Answering Using a Forget Memory Network". In Communications in Computer and Information Science, 404–15. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7299-4_33.
Gao, Kun, Xianglei Zhu, and Yahong Han. "Initialized Frame Attention Networks for Video Question Answering". In Communications in Computer and Information Science, 349–59. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-8530-7_34.
Conference papers on the topic "Video question answering"
Zhao, Wentian, Seokhwan Kim, Ning Xu, and Hailin Jin. "Video Question Answering on Screencast Tutorials". In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/148.
Jenni, Kommineni, M. Srinivas, Roshni Sannapu, and Murukessan Perumal. "CSA-BERT: Video Question Answering". In 2023 IEEE Statistical Signal Processing Workshop (SSP). IEEE, 2023. http://dx.doi.org/10.1109/ssp53291.2023.10207954.
Li, Hao, Peng Jin, Zesen Cheng, Songyang Zhang, Kai Chen, Zhennan Wang, Chang Liu, and Jie Chen. "TG-VQA: Ternary Game of Video Question Answering". In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/116.
Zhao, Zhou, Qifan Yang, Deng Cai, Xiaofei He, and Yueting Zhuang. "Video Question Answering via Hierarchical Spatio-Temporal Attention Networks". In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/492.
Chao, Guan-Lin, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tur, Jindong Chen, and Ian Lane. "Learning Question-Guided Video Representation for Multi-Turn Video Question Answering". In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/w19-5926.
Bhalerao, Mandar, Shlok Gujar, Aditya Bhave, and Anant V. Nimkar. "Visual Question Answering Using Video Clips". In 2019 IEEE Bombay Section Signature Conference (IBSSC). IEEE, 2019. http://dx.doi.org/10.1109/ibssc47189.2019.8973090.
Yang, Zekun, Noa Garcia, Chenhui Chu, Mayu Otani, Yuta Nakashima, and Haruo Takemura. "BERT Representations for Video Question Answering". In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2020. http://dx.doi.org/10.1109/wacv45572.2020.9093596.
Li, Yicong, Xiang Wang, Junbin Xiao, Wei Ji, and Tat-Seng Chua. "Invariant Grounding for Video Question Answering". In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022. http://dx.doi.org/10.1109/cvpr52688.2022.00294.
Fang, Jiannan, Lingling Sun, and Yaqi Wang. "Video question answering by frame attention". In Eleventh International Conference on Digital Image Processing, edited by Xudong Jiang and Jenq-Neng Hwang. SPIE, 2019. http://dx.doi.org/10.1117/12.2539615.
Lei, Jie, Licheng Yu, Mohit Bansal, and Tamara Berg. "TVQA: Localized, Compositional Video Question Answering". In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/d18-1167.