Selection of scientific literature on the topic "Video question answering"
Cite a source in APA, MLA, Chicago, Harvard, and other citation styles
Consult the lists of current articles, books, dissertations, reports, and other scientific sources on the topic "Video question answering."
Next to every work in the list of references there is an "Add to bibliography" option. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).
You can also download the full text of the scientific publication in PDF format and read its online annotation, provided the relevant parameters are available in the metadata.
Journal articles on the topic "Video question answering"
Lei, Chenyi, Lei Wu, Dong Liu, Zhao Li, Guoxin Wang, Haihong Tang, and Houqiang Li. "Multi-Question Learning for Visual Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 7 (April 3, 2020): 11328–35. http://dx.doi.org/10.1609/aaai.v34i07.6794.
Ruwa, Nelson, Qirong Mao, Liangjun Wang, and Jianping Gou. "Affective question answering on video." Neurocomputing 363 (October 2019): 125–39. http://dx.doi.org/10.1016/j.neucom.2019.06.046.
Wang, Yueqian, Yuxuan Wang, Kai Chen, and Dongyan Zhao. "STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19215–23. http://dx.doi.org/10.1609/aaai.v38i17.29890.
Zong, Linlin, Jiahui Wan, Xianchao Zhang, Xinyue Liu, Wenxin Liang, and Bo Xu. "Video-Context Aligned Transformer for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19795–803. http://dx.doi.org/10.1609/aaai.v38i17.29954.
Huang, Deng, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, and Chuang Gan. "Location-Aware Graph Convolutional Networks for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 7 (April 3, 2020): 11021–28. http://dx.doi.org/10.1609/aaai.v34i07.6737.
Gao, Lianli, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, Tao Mei, and Heng Tao Shen. "Structured Two-Stream Attention Network for Video Question Answering." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6391–98. http://dx.doi.org/10.1609/aaai.v33i01.33016391.
Kumar, Krishnamoorthi Magesh, and P. Valarmathie. "Domain and Intelligence Based Multimedia Question Answering System." International Journal of Evaluation and Research in Education (IJERE) 5, no. 3 (September 1, 2016): 227. http://dx.doi.org/10.11591/ijere.v5i3.4544.
Xue, Hongyang, Zhou Zhao, and Deng Cai. "Unifying the Video and Question Attentions for Open-Ended Video Question Answering." IEEE Transactions on Image Processing 26, no. 12 (December 2017): 5656–66. http://dx.doi.org/10.1109/tip.2017.2746267.
Jang, Yunseok, Yale Song, Chris Dongjoo Kim, Youngjae Yu, Youngjin Kim, and Gunhee Kim. "Video Question Answering with Spatio-Temporal Reasoning." International Journal of Computer Vision 127, no. 10 (June 18, 2019): 1385–412. http://dx.doi.org/10.1007/s11263-019-01189-x.
Zhuang, Yueting, Dejing Xu, Xin Yan, Wenzhuo Cheng, Zhou Zhao, Shiliang Pu, and Jun Xiao. "Multichannel Attention Refinement for Video Question Answering." ACM Transactions on Multimedia Computing, Communications, and Applications 16, no. 1s (April 28, 2020): 1–23. http://dx.doi.org/10.1145/3366710.
Dissertations on the topic "Video question answering"
Engin, Deniz. "Video question answering with limited supervision." Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS016.
Video content has significantly increased in volume and diversity in the digital era, and this expansion has highlighted the necessity for advanced video understanding technologies. Driven by this necessity, this thesis explores semantic video understanding, leveraging multiple perceptual modes similar to human cognitive processes and efficient learning with limited supervision similar to human learning capabilities. The thesis specifically focuses on video question answering as one of the main video understanding tasks. Our first contribution addresses long-range video question answering, which requires an understanding of extended video content. While recent approaches rely on human-generated external sources, we process raw data to generate video summaries. Our next contribution explores zero-shot and few-shot video question answering, aiming to enable efficient learning from limited data. We leverage the knowledge of existing large-scale models while addressing the challenges of adapting pre-trained models to limited data. We demonstrate that these contributions significantly enhance the capabilities of multimodal video question-answering systems, particularly where human-annotated labeled data is limited or unavailable.
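The limited-supervision setting the abstract describes can be pictured with a short sketch: freeze a pre-trained video-language backbone and train only a lightweight answer head on the few labeled QA pairs available. This is a generic illustration of few-shot adaptation, not the thesis's actual architecture; the `backbone` module, `AnswerHead`, and the data loader are hypothetical.

```python
# Generic sketch of few-shot adaptation: keep a pre-trained video-language
# backbone frozen and train only a small answer head on scarce labeled data.
# All module and variable names here are hypothetical illustrations.
import torch
import torch.nn as nn

class AnswerHead(nn.Module):
    """Small bottleneck MLP mapping frozen video-question features to answer logits."""
    def __init__(self, feat_dim: int, num_answers: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_answers),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features)

def adapt_few_shot(backbone, head, loader, epochs=5, lr=1e-3):
    for p in backbone.parameters():              # preserve pre-trained knowledge
        p.requires_grad = False
    optimizer = torch.optim.Adam(head.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for video, question, answer in loader:   # a handful of labeled QA pairs
            with torch.no_grad():
                feats = backbone(video, question)  # frozen joint features
            loss = loss_fn(head(feats), answer)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```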
Chowdhury, Muhammad Iqbal Hasan. "Question-answering on image/video content." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205096/1/Muhammad%20Iqbal%20Hasan_Chowdhury_Thesis.pdf.
Zeng, Kuo-Hao, and 曾國豪. "Video titling and Question-Answering." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/a3a6sw.
Der volle Inhalt der Quelle國立清華大學
電機工程學系所
105
Video titling and question answering are two important tasks toward high-level visual data understanding. To address these two tasks, we propose a large-scale dataset and demonstrate several models on it in this work. A great video title describes the most salient event compactly and captures the viewer's attention. In contrast, video captioning tends to generate sentences that describe the video as a whole. Although generating a video title automatically is a very useful task, it is much less addressed than video captioning. We address video title generation for the first time by proposing two methods that extend state-of-the-art video captioners to this new task. First, we make video captioners highlight-sensitive by priming them with a highlight detector. Our framework allows for jointly training a model for title generation and video highlight localization. Second, we induce high sentence diversity in video captioners, so that the generated titles are also diverse and catchy. This means that a large number of sentences might be required to learn the sentence structure of titles. Hence, we propose a novel sentence augmentation method to train a captioner with additional sentence-only examples that come without corresponding videos. For the video question-answering task, we propose to learn a deep model that answers free-form natural-language questions about the contents of a video. We build a program that automatically harvests a large number of videos and descriptions freely available online. A large number of candidate QA pairs are then generated automatically from the descriptions rather than annotated manually. Next, we use these candidate QA pairs to train a number of video-based QA methods extended from MN, VQA, SA, and SS. To handle imperfect candidate QA pairs, we propose a self-paced learning procedure that iteratively identifies them and mitigates their effects in training. To demonstrate our idea, we collected a large-scale Video Titles in the Wild (VTW) dataset of 18,100 automatically crawled user-generated videos and titles. We then utilize an automatic QA generator to produce a large number of QA pairs for training, and collect manually generated QA pairs from Amazon Mechanical Turk. On VTW, our methods consistently improve title-prediction accuracy and achieve the best performance in both automatic and human evaluation. Our sentence augmentation method also outperforms the baselines on the M-VAD dataset. Finally, the video question-answering results show that our self-paced learning procedure is effective, and the extended SS model outperforms various baselines.
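The self-paced learning procedure mentioned in this abstract can be sketched in a few lines: train only on QA pairs whose current loss falls below a threshold, then raise the threshold so that harder (and potentially noisier) pairs are admitted gradually. This is a minimal sketch of the general self-paced idea, not the thesis's implementation; the model, loader, and per-sample loss function are assumed.

```python
# Minimal sketch of self-paced learning over noisy, auto-generated QA pairs.
# Assumes `loss_fn` returns one loss per example, e.g.
# nn.CrossEntropyLoss(reduction="none"); model/loader names are hypothetical.
import torch

def self_paced_epoch(model, loader, optimizer, loss_fn, lam):
    """Train one epoch, keeping only examples whose loss is below `lam`."""
    model.train()
    for videos, questions, answers in loader:
        losses = loss_fn(model(videos, questions), answers)  # per-sample losses
        keep = (losses.detach() < lam).float()               # 1 = trusted pair
        loss = (keep * losses).sum() / keep.sum().clamp(min=1.0)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Raising `lam` between epochs admits harder examples over time:
# lam = 0.5
# for epoch in range(10):
#     self_paced_epoch(model, loader, optimizer, loss_fn, lam)
#     lam *= 1.3
```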
Xu, Huijuan. "Vision and language understanding with localized evidence." Thesis, 2018. https://hdl.handle.net/2144/34790.
Books on the topic "Video question answering"
McCallum, Richard. Evangelical Christian Responses to Islam. Bloomsbury Publishing Plc, 2024. http://dx.doi.org/10.5040/9781350418240.
Walker, Stephen. Digital Mediation. Bloomsbury Publishing Plc, 2024. http://dx.doi.org/10.5040/9781526525772.
Book chapters on the topic "Video question answering"
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Question Answering." In Visual Question Answering, 119–33. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_8.
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Representation Learning." In Visual Question Answering, 111–17. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_7.
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Advanced Models for Video Question Answering." In Visual Question Answering, 135–43. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_9.
Gao, Lei, Guangda Li, Yan-Tao Zheng, Richang Hong, and Tat-Seng Chua. "Video Reference: A Video Question Answering Engine." In Lecture Notes in Computer Science, 799–801. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-11301-7_92.
Xiao, Junbin, Pan Zhou, Tat-Seng Chua, and Shuicheng Yan. "Video Graph Transformer for Video Question Answering." In Lecture Notes in Computer Science, 39–58. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20059-5_3.
Piergiovanni, AJ, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, and Anelia Angelova. "Video Question Answering with Iterative Video-Text Co-tokenization." In Lecture Notes in Computer Science, 76–94. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20059-5_5.
Chen, Xuanwei, Rui Liu, Xiaomeng Song, and Yahong Han. "Locating Visual Explanations for Video Question Answering." In MultiMedia Modeling, 290–302. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-67832-6_24.
Gupta, Pranay, and Manish Gupta. "NewsKVQA: Knowledge-Aware News Video Question Answering." In Advances in Knowledge Discovery and Data Mining, 3–15. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-05981-0_1.
Ge, Yuanyuan, Youjiang Xu, and Yahong Han. "Video Question Answering Using a Forget Memory Network." In Communications in Computer and Information Science, 404–15. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7299-4_33.
Gao, Kun, Xianglei Zhu, and Yahong Han. "Initialized Frame Attention Networks for Video Question Answering." In Communications in Computer and Information Science, 349–59. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-8530-7_34.
Conference papers on the topic "Video question answering"
Zhao, Wentian, Seokhwan Kim, Ning Xu, and Hailin Jin. "Video Question Answering on Screencast Tutorials." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/148.
Jenni, Kommineni, M. Srinivas, Roshni Sannapu, and Murukessan Perumal. "CSA-BERT: Video Question Answering." In 2023 IEEE Statistical Signal Processing Workshop (SSP). IEEE, 2023. http://dx.doi.org/10.1109/ssp53291.2023.10207954.
Li, Hao, Peng Jin, Zesen Cheng, Songyang Zhang, Kai Chen, Zhennan Wang, Chang Liu, and Jie Chen. "TG-VQA: Ternary Game of Video Question Answering." In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/116.
Zhao, Zhou, Qifan Yang, Deng Cai, Xiaofei He, and Yueting Zhuang. "Video Question Answering via Hierarchical Spatio-Temporal Attention Networks." In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/492.
Chao, Guan-Lin, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tur, Jindong Chen, and Ian Lane. "Learning Question-Guided Video Representation for Multi-Turn Video Question Answering." In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/w19-5926.
Bhalerao, Mandar, Shlok Gujar, Aditya Bhave, and Anant V. Nimkar. "Visual Question Answering Using Video Clips." In 2019 IEEE Bombay Section Signature Conference (IBSSC). IEEE, 2019. http://dx.doi.org/10.1109/ibssc47189.2019.8973090.
Yang, Zekun, Noa Garcia, Chenhui Chu, Mayu Otani, Yuta Nakashima, and Haruo Takemura. "BERT Representations for Video Question Answering." In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2020. http://dx.doi.org/10.1109/wacv45572.2020.9093596.
Li, Yicong, Xiang Wang, Junbin Xiao, Wei Ji, and Tat-Seng Chua. "Invariant Grounding for Video Question Answering." In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022. http://dx.doi.org/10.1109/cvpr52688.2022.00294.
Fang, Jiannan, Lingling Sun, and Yaqi Wang. "Video question answering by frame attention." In Eleventh International Conference on Digital Image Processing, edited by Xudong Jiang and Jenq-Neng Hwang. SPIE, 2019. http://dx.doi.org/10.1117/12.2539615.
Lei, Jie, Licheng Yu, Mohit Bansal, and Tamara Berg. "TVQA: Localized, Compositional Video Question Answering." In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/d18-1167.