Selected scientific literature on the topic "Video question answering"
Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles
Consult the list of current articles, books, theses, conference proceedings, and other scholarly sources relevant to the topic "Video question answering".
Journal articles on the topic "Video question answering"
Lei, Chenyi, Lei Wu, Dong Liu, Zhao Li, Guoxin Wang, Haihong Tang, and Houqiang Li. "Multi-Question Learning for Visual Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11328–35. http://dx.doi.org/10.1609/aaai.v34i07.6794.
Ruwa, Nelson, Qirong Mao, Liangjun Wang, and Jianping Gou. "Affective question answering on video". Neurocomputing 363 (October 2019): 125–39. http://dx.doi.org/10.1016/j.neucom.2019.06.046.
Wang, Yueqian, Yuxuan Wang, Kai Chen, and Dongyan Zhao. "STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19215–23. http://dx.doi.org/10.1609/aaai.v38i17.29890.
Zong, Linlin, Jiahui Wan, Xianchao Zhang, Xinyue Liu, Wenxin Liang, and Bo Xu. "Video-Context Aligned Transformer for Video Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19795–803. http://dx.doi.org/10.1609/aaai.v38i17.29954.
Huang, Deng, Peihao Chen, Runhao Zeng, Qing Du, Mingkui Tan, and Chuang Gan. "Location-Aware Graph Convolutional Networks for Video Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11021–28. http://dx.doi.org/10.1609/aaai.v34i07.6737.
Gao, Lianli, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, Tao Mei, and Heng Tao Shen. "Structured Two-Stream Attention Network for Video Question Answering". Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6391–98. http://dx.doi.org/10.1609/aaai.v33i01.33016391.
Kumar, Krishnamoorthi Magesh, and P. Valarmathie. "Domain and Intelligence Based Multimedia Question Answering System". International Journal of Evaluation and Research in Education (IJERE) 5, no. 3 (September 1, 2016): 227. http://dx.doi.org/10.11591/ijere.v5i3.4544.
Xue, Hongyang, Zhou Zhao, and Deng Cai. "Unifying the Video and Question Attentions for Open-Ended Video Question Answering". IEEE Transactions on Image Processing 26, no. 12 (December 2017): 5656–66. http://dx.doi.org/10.1109/tip.2017.2746267.
Jang, Yunseok, Yale Song, Chris Dongjoo Kim, Youngjae Yu, Youngjin Kim, and Gunhee Kim. "Video Question Answering with Spatio-Temporal Reasoning". International Journal of Computer Vision 127, no. 10 (June 18, 2019): 1385–412. http://dx.doi.org/10.1007/s11263-019-01189-x.
Zhuang, Yueting, Dejing Xu, Xin Yan, Wenzhuo Cheng, Zhou Zhao, Shiliang Pu, and Jun Xiao. "Multichannel Attention Refinement for Video Question Answering". ACM Transactions on Multimedia Computing, Communications, and Applications 16, no. 1s (April 28, 2020): 1–23. http://dx.doi.org/10.1145/3366710.
Theses / dissertations on the topic "Video question answering"
Engin, Deniz. "Video question answering with limited supervision". Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS016.
Video content has grown significantly in volume and diversity in the digital era, and this expansion has highlighted the need for advanced video understanding technologies. Driven by this need, this thesis explores semantic video understanding, leveraging multiple perceptual modes similar to human cognitive processes, and efficient learning with limited supervision similar to human learning capabilities. The thesis focuses specifically on video question answering as one of the main video understanding tasks. Our first contribution addresses long-range video question answering, which requires understanding extended video content. While recent approaches rely on human-generated external sources, we process raw data to generate video summaries. Our next contribution explores zero-shot and few-shot video question answering, aiming to enable efficient learning from limited data. We leverage the knowledge of existing large-scale models by eliminating the challenges of adapting pre-trained models to limited data. We demonstrate that these contributions significantly enhance the capabilities of multimodal video question-answering systems, particularly where human-annotated labeled data is limited or unavailable.
Chowdhury, Muhammad Iqbal Hasan. "Question-answering on image/video content". Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205096/1/Muhammad%20Iqbal%20Hasan_Chowdhury_Thesis.pdf.
Zeng, Kuo-Hao (曾國豪). "Video titling and Question-Answering". Thesis, National Tsing Hua University (國立清華大學), Department of Electrical Engineering (電機工程學系所), 2017. http://ndltd.ncl.edu.tw/handle/a3a6sw.
Video titling and question answering are two important tasks toward high-level visual data understanding. To address these two tasks, we propose a large-scale dataset and demonstrate several models on it in this work. A great video title describes the most salient event compactly and captures the viewer's attention. In contrast, video captioning tends to generate sentences that describe the video as a whole. Although generating a video title automatically is a very useful task, it is much less addressed than video captioning. We address video title generation for the first time by proposing two methods that extend state-of-the-art video captioners to this new task. First, we make video captioners highlight-sensitive by priming them with a highlight detector. Our framework allows for jointly training a model for title generation and video highlight localization. Second, we induce high sentence diversity in video captioners, so that the generated titles are also diverse and catchy. A large number of sentences would be required to learn the sentence structure of titles, so we propose a novel sentence augmentation method to train a captioner with additional sentence-only examples that come without corresponding videos. For the video question-answering task, we propose to learn a deep model that answers free-form natural language questions about the contents of a video. A program automatically harvests a large number of videos and descriptions freely available online, and a large number of candidate QA pairs are then generated automatically from these descriptions rather than annotated manually. Next, we use these candidate QA pairs to train a number of video-based QA methods extended from MN, VQA, SA, and SS. To handle imperfect candidate QA pairs, we propose a self-paced learning procedure to iteratively identify them and mitigate their effects in training.
To demonstrate our idea, we collected a large-scale Video Titles in the Wild (VTW) dataset of 18,100 automatically crawled user-generated videos and titles. We then utilize an automatic QA generator to produce a large number of QA pairs for training and collect manually generated QA pairs from Amazon Mechanical Turk. On VTW, our methods consistently improve title prediction accuracy and achieve the best performance in both automatic and human evaluation. Our sentence augmentation method also outperforms the baselines on the M-VAD dataset. Finally, the video question answering results show that our self-paced learning procedure is effective, and the extended SS model outperforms various baselines.
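The self-paced learning procedure described in the abstract above (iteratively identifying imperfect, automatically generated QA pairs and mitigating their effect during training) follows a generic pattern. The sketch below is not the thesis's implementation; it illustrates hard self-paced selection on a toy regression analogue, where a few deliberately corrupted samples stand in for noisy QA pairs and samples whose loss exceeds a growing threshold are excluded from each refit:

```python
import numpy as np

# Toy data: y = 2x, with every 10th sample corrupted (analogous to
# imperfect automatically generated QA pairs in the training set).
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x
y[::10] += 5.0

def fit(x, y, w):
    """Weighted least-squares slope for a line through the origin."""
    return np.sum(w * x * y) / np.sum(w * x * x)

# Self-paced loop: start with a small loss threshold lam so only
# "easy", well-fit samples participate, then grow lam each round.
lam, growth = 0.5, 2.0
w = np.ones_like(x)
slope = fit(x, y, w)
for _ in range(5):
    losses = (y - slope * x) ** 2
    w = (losses < lam).astype(float)  # hard self-paced weights: keep easy samples
    if w.sum() == 0:                  # guard against selecting nothing
        w = np.ones_like(x)
    slope = fit(x, y, w)
    lam *= growth

# The corrupted samples keep a large residual and stay excluded,
# so the final fit recovers the clean slope of 2.0.
```

The design choice mirrored here is that the model itself decides which examples are trustworthy: examples it already fits well are kept, while high-loss (likely mislabeled) ones are down-weighted, and the threshold is relaxed as training progresses.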
Xu, Huijuan. "Vision and language understanding with localized evidence". Thesis, 2018. https://hdl.handle.net/2144/34790.
Books on the topic "Video question answering"
McCallum, Richard. Evangelical Christian Responses to Islam. Bloomsbury Publishing Plc, 2024. http://dx.doi.org/10.5040/9781350418240.
Walker, Stephen. Digital Mediation. Bloomsbury Publishing Plc, 2024. http://dx.doi.org/10.5040/9781526525772.
Book chapters on the topic "Video question answering"
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Question Answering". In Visual Question Answering, 119–33. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_8.
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Representation Learning". In Visual Question Answering, 111–17. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_7.
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Advanced Models for Video Question Answering". In Visual Question Answering, 135–43. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_9.
Gao, Lei, Guangda Li, Yan-Tao Zheng, Richang Hong, and Tat-Seng Chua. "Video Reference: A Video Question Answering Engine". In Lecture Notes in Computer Science, 799–801. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-11301-7_92.
Xiao, Junbin, Pan Zhou, Tat-Seng Chua, and Shuicheng Yan. "Video Graph Transformer for Video Question Answering". In Lecture Notes in Computer Science, 39–58. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20059-5_3.
Piergiovanni, AJ, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, and Anelia Angelova. "Video Question Answering with Iterative Video-Text Co-tokenization". In Lecture Notes in Computer Science, 76–94. Cham: Springer Nature Switzerland, 2022. http://dx.doi.org/10.1007/978-3-031-20059-5_5.
Chen, Xuanwei, Rui Liu, Xiaomeng Song, and Yahong Han. "Locating Visual Explanations for Video Question Answering". In MultiMedia Modeling, 290–302. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-67832-6_24.
Gupta, Pranay, and Manish Gupta. "NewsKVQA: Knowledge-Aware News Video Question Answering". In Advances in Knowledge Discovery and Data Mining, 3–15. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-05981-0_1.
Ge, Yuanyuan, Youjiang Xu, and Yahong Han. "Video Question Answering Using a Forget Memory Network". In Communications in Computer and Information Science, 404–15. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-7299-4_33.
Gao, Kun, Xianglei Zhu, and Yahong Han. "Initialized Frame Attention Networks for Video Question Answering". In Communications in Computer and Information Science, 349–59. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-8530-7_34.
Conference papers on the topic "Video question answering"
Zhao, Wentian, Seokhwan Kim, Ning Xu, and Hailin Jin. "Video Question Answering on Screencast Tutorials". In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}. California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/148.
Jenni, Kommineni, M. Srinivas, Roshni Sannapu, and Murukessan Perumal. "CSA-BERT: Video Question Answering". In 2023 IEEE Statistical Signal Processing Workshop (SSP). IEEE, 2023. http://dx.doi.org/10.1109/ssp53291.2023.10207954.
Li, Hao, Peng Jin, Zesen Cheng, Songyang Zhang, Kai Chen, Zhennan Wang, Chang Liu, and Jie Chen. "TG-VQA: Ternary Game of Video Question Answering". In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/116.
Zhao, Zhou, Qifan Yang, Deng Cai, Xiaofei He, and Yueting Zhuang. "Video Question Answering via Hierarchical Spatio-Temporal Attention Networks". In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/492.
Chao, Guan-Lin, Abhinav Rastogi, Semih Yavuz, Dilek Hakkani-Tur, Jindong Chen, and Ian Lane. "Learning Question-Guided Video Representation for Multi-Turn Video Question Answering". In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/w19-5926.
Bhalerao, Mandar, Shlok Gujar, Aditya Bhave, and Anant V. Nimkar. "Visual Question Answering Using Video Clips". In 2019 IEEE Bombay Section Signature Conference (IBSSC). IEEE, 2019. http://dx.doi.org/10.1109/ibssc47189.2019.8973090.
Yang, Zekun, Noa Garcia, Chenhui Chu, Mayu Otani, Yuta Nakashima, and Haruo Takemura. "BERT Representations for Video Question Answering". In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2020. http://dx.doi.org/10.1109/wacv45572.2020.9093596.
Li, Yicong, Xiang Wang, Junbin Xiao, Wei Ji, and Tat-Seng Chua. "Invariant Grounding for Video Question Answering". In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2022. http://dx.doi.org/10.1109/cvpr52688.2022.00294.
Fang, Jiannan, Lingling Sun, and Yaqi Wang. "Video question answering by frame attention". In Eleventh International Conference on Digital Image Processing, edited by Xudong Jiang and Jenq-Neng Hwang. SPIE, 2019. http://dx.doi.org/10.1117/12.2539615.
Lei, Jie, Licheng Yu, Mohit Bansal, and Tamara Berg. "TVQA: Localized, Compositional Video Question Answering". In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/d18-1167.