Academic literature on the topic 'Change Captioning'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Change Captioning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Change Captioning"

1

Qiu, Yue, Yutaka Satoh, Ryota Suzuki, Kenji Iwata, and Hirokatsu Kataoka. "Indoor Scene Change Captioning Based on Multimodality Data." Sensors 20, no. 17 (August 23, 2020): 4761. http://dx.doi.org/10.3390/s20174761.

Abstract:
This study proposes a framework for describing a scene change using natural language text based on indoor scene observations conducted before and after a scene change. The recognition of scene changes plays an essential role in a variety of real-world applications, such as scene anomaly detection. Most scene understanding research has focused on static scenes. Most existing scene change captioning methods detect scene changes from single-view RGB images, neglecting the underlying three-dimensional structures. Previous three-dimensional scene change captioning methods use simulated scenes consisting of geometry primitives, making them unsuitable for real-world applications. To solve these problems, we automatically generated large-scale indoor scene change caption datasets. We propose an end-to-end framework for describing scene changes from various input modalities, namely, RGB images, depth images, and point cloud data, which are available in most robot applications. We conducted experiments with various input modalities and models and evaluated model performance using datasets with various levels of complexity. Experimental results show that the models that combine RGB images and point cloud data as input achieve high performance in sentence generation and caption correctness and are robust for change type understanding for datasets with high complexity. The developed datasets and models contribute to the study of indoor scene change understanding.
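For readers new to the task, the general pattern this abstract describes, encoding the "before" and "after" observations, fusing their difference, and decoding a sentence, can be sketched in a few lines. The following is a minimal, illustrative PyTorch skeleton with assumed feature sizes and a plain LSTM decoder; it is not the authors' implementation, and the RGB/depth/point-cloud encoders are reduced to a single generic feature encoder.

```python
# Minimal, illustrative change-captioning skeleton (not the paper's model):
# encode "before" and "after" observations, fuse their difference, and decode
# a caption word by word with an LSTM. Feature sizes are assumed.
import torch
import torch.nn as nn

class ChangeCaptioner(nn.Module):
    def __init__(self, vocab_size, feat_dim=512, embed_dim=256, hidden_dim=512):
        super().__init__()
        # Stand-in for the RGB / depth / point-cloud encoders discussed in the paper.
        self.encoder = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU())
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # The decoder sees [before; after; difference] features at every step.
        self.decoder = nn.LSTM(embed_dim + 3 * hidden_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, feat_before, feat_after, captions):
        h_before = self.encoder(feat_before)                  # (B, H)
        h_after = self.encoder(feat_after)                    # (B, H)
        h_diff = h_after - h_before                           # coarse change signal
        ctx = torch.cat([h_before, h_after, h_diff], dim=-1)  # (B, 3H)
        emb = self.embed(captions)                            # (B, T, E)
        ctx = ctx.unsqueeze(1).expand(-1, emb.size(1), -1)    # repeat per time step
        hidden, _ = self.decoder(torch.cat([emb, ctx], dim=-1))
        return self.out(hidden)                               # (B, T, vocab) logits

# Random features stand in for encoded scene observations.
model = ChangeCaptioner(vocab_size=1000)
logits = model(torch.randn(2, 512), torch.randn(2, 512), torch.randint(0, 1000, (2, 12)))
```

Conditioning every decoding step on the concatenation of before, after, and difference features is the simplest way to expose the change signal to the language model; published models typically replace the plain subtraction with learned attention over the two views.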
2

Qiu, Yue, Yutaka Satoh, Ryota Suzuki, Kenji Iwata, and Hirokatsu Kataoka. "3D-Aware Scene Change Captioning From Multiview Images." IEEE Robotics and Automation Letters 5, no. 3 (July 2020): 4743–50. http://dx.doi.org/10.1109/lra.2020.3003290.

3

Sun, Yaoqi, Liang Li, Tingting Yao, Tongyv Lu, Bolun Zheng, Chenggang Yan, Hua Zhang, Yongjun Bao, Guiguang Ding, and Gregory Slabaugh. "Bidirectional difference locating and semantic consistency reasoning for change captioning." International Journal of Intelligent Systems 37, no. 5 (January 19, 2022): 2969–87. http://dx.doi.org/10.1002/int.22821.

4

Secară, Alina. "Technology has its role to play in our work, and I believe technology is providing support in the fields that we operate in." CLINA Revista Interdisciplinaria de Traducción Interpretación y Comunicación Intercultural 7, no. 1 (January 18, 2022): 17–24. http://dx.doi.org/10.14201/clina2021711724.

Abstract:
Technologies such as speech recognition and machine translation are starting to change the audiovisual translation landscape, both in terms of the skills linguists need to successfully compete in the language services market and how training needs to adapt to this change. This interview highlights how such technologies have impacted audiovisual translation training, focusing specifically on subtitling and captioning.
5

Cho, Suhyun, and Hayoung Oh. "Generalized Image Captioning for Multilingual Support." Applied Sciences 13, no. 4 (February 14, 2023): 2446. http://dx.doi.org/10.3390/app13042446.

Abstract:
Image captioning is a problem of viewing images and describing images in language. This is an important problem that can be solved by understanding the image, and combining two fields of image processing and natural language processing into one. The purpose of image captioning research so far has been to create general explanatory captions in the learning data. However, various environments in reality must be considered for practical use, as well as image descriptions that suit the purpose of use. Image caption research requires processing new learning data to generate descriptive captions for specific purposes, but it takes a lot of time and effort to create learnable data. In this study, we propose a method to solve this problem. Popular image captioning can help visually impaired people understand their surroundings by automatically recognizing and describing images into text and then into voice and is an important issue that can be applied to many places such as image search, art therapy, sports commentary, and real-time traffic information commentary. Through the domain object dictionary method proposed in this study, we propose a method to generate image captions without the need to process new learning data by adjusting the object dictionary for each domain application. The method proposed in the study is to change the dictionary of the object to focus on the domain object dictionary rather than processing the learning data, leading to the creation of various image captions by intensively explaining the objects required for each domain. In this work, we propose a filter captioning model that induces generation of image captions from various domains while maintaining the performance of existing models.
6

Reddy, Kota Akshith, Satish C J, Jahnavi Polsani, Teja Naveen Chintapalli, and Gangapatnam Sai Ananya. "Analysis of the Fuzziness of Image Caption Generation Models due to Data Augmentation Techniques." International Journal of Recent Technology and Engineering (IJRTE) 10, no. 3 (September 30, 2021): 131–39. http://dx.doi.org/10.35940/ijrte.c6439.0910321.

Abstract:
Automatic Image Caption Generation is one of the core problems in the field of Deep Learning. Data Augmentation is a technique which helps in increasing the amount of data at hand, and this is done by augmenting the training data using various techniques like flipping, rotating, zooming, brightening, etc. In this work, we create an Image Captioning model and check its robustness on all the major types of Image Augmentation techniques. The results show the fuzziness of the model while working with the same image but a different augmentation technique, and because of this, a different caption is produced every time a different data augmentation technique is employed. We also show the change in the performance of the model after applying these augmentation techniques. The Flickr8k dataset is used for this study, along with the BLEU score as the evaluation metric for the image captioning model.
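As a rough illustration of the robustness check this abstract describes, one can caption the same image under several augmentations and measure how much the outputs drift. The snippet below is a hedged sketch: generate_caption is a hypothetical placeholder for any trained captioning model, example.jpg is an assumed input file, and BLEU against the caption of the unaugmented image is used as the drift measure (torchvision and NLTK).

```python
# Sketch of probing caption stability under augmentation (illustrative only).
# generate_caption is a hypothetical stand-in for a trained captioning model.
from PIL import Image
from torchvision import transforms
from nltk.translate.bleu_score import sentence_bleu

augmentations = {
    "original": transforms.Lambda(lambda img: img),
    "flip": transforms.RandomHorizontalFlip(p=1.0),
    "rotate": transforms.RandomRotation(degrees=(15, 15)),
    "zoom": transforms.RandomResizedCrop(size=224, scale=(0.8, 0.8)),
    "brighten": transforms.ColorJitter(brightness=(1.5, 1.5)),
}

def generate_caption(img):
    # Placeholder: replace with the inference call of a real captioning model.
    return "a placeholder caption"

image = Image.open("example.jpg").convert("RGB")   # assumed input image
reference = generate_caption(image).split()
for name, aug in augmentations.items():
    hypothesis = generate_caption(aug(image)).split()
    # BLEU against the caption of the unaugmented image measures caption drift.
    print(name, sentence_bleu([reference], hypothesis))
```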
7

Yao, Linli, Weiying Wang, and Qin Jin. "Image Difference Captioning with Pre-training and Contrastive Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3108–16. http://dx.doi.org/10.1609/aaai.v36i3.20218.

Abstract:
The Image Difference Captioning (IDC) task aims to describe the visual differences between two similar images with natural language. The major challenges of this task lie in two aspects: 1) fine-grained visual differences that require learning stronger vision and language association and 2) high-cost of manual annotations that leads to limited supervised data. To address these challenges, we propose a new modeling framework following the pre-training-finetuning paradigm. Specifically, we design three self-supervised tasks and contrastive learning strategies to align visual differences and text descriptions at a fine-grained level. Moreover, we propose a data expansion strategy to utilize extra cross-task supervision information, such as data for fine-grained image classification, to alleviate the limitation of available supervised IDC data. Extensive experiments on two IDC benchmark datasets, CLEVR-Change and Birds-to-Words, demonstrate the effectiveness of the proposed modeling framework. The codes and models will be released at https://github.com/yaolinli/IDC.
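The abstract does not spell out the contrastive objective, but a generic InfoNCE-style loss that pulls each visual-difference representation toward its paired description and away from the other captions in a batch looks roughly like the sketch below; the feature dimension and temperature are assumptions, not the paper's settings.

```python
# Generic InfoNCE-style contrastive loss for aligning visual-difference features
# with text features (an illustration, not the paper's exact formulation).
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(diff_feats, text_feats, temperature=0.07):
    # diff_feats, text_feats: (B, D), paired row by row.
    v = F.normalize(diff_feats, dim=-1)
    t = F.normalize(text_feats, dim=-1)
    logits = v @ t.t() / temperature                    # (B, B) similarity matrix
    targets = torch.arange(v.size(0), device=v.device)  # positives on the diagonal
    # Symmetric cross-entropy: match differences to captions and captions to differences.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = contrastive_alignment_loss(torch.randn(8, 256), torch.randn(8, 256))
```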
8

Qu, Shiru, Yuling Xi, and Songtao Ding. "Image Caption Description of Traffic Scene Based on Deep Learning." Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University 36, no. 3 (June 2018): 522–27. http://dx.doi.org/10.1051/jnwpu/20183630522.

Abstract:
It is a hard issue to describe the complex traffic scene accurately in computer vision. The traffic scene is changeable, which causes image captioning to be easily interfered with by light changes and object occlusion. To solve this problem, we propose an image caption generation model based on an attention mechanism, combining a convolutional neural network (CNN) and a recurrent neural network (RNN) to generate an end-to-end description for traffic images. To generate a semantic description with a distinct degree of discrimination, the attention mechanism is applied to the language model. We use the Flickr8K, Flickr30K and MS COCO benchmark datasets to validate the effectiveness of our method. The accuracy is improved by up to 8.6%, 12.4%, 19.3% and 21.5% on different evaluation metrics. Experiments show that our algorithm has good robustness in four different complex traffic scenarios, such as light change, abnormal weather environment, road marked target and various kinds of transportation tools.
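The soft attention this abstract refers to, weighting spatial CNN features by the current state of the language model, can be illustrated with a generic module like the one below; the dimensions are assumed and this is not the authors' exact formulation.

```python
# Generic soft attention over a grid of CNN features, conditioned on the RNN
# state of the language model (illustrative sketch; dimensions are assumed).
import torch
import torch.nn as nn

class SoftAttention(nn.Module):
    def __init__(self, feat_dim=512, hidden_dim=512, attn_dim=256):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, attn_dim)
        self.hidden_proj = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, feats, hidden):
        # feats: (B, N, feat_dim) spatial CNN features; hidden: (B, hidden_dim) RNN state.
        e = self.score(torch.tanh(self.feat_proj(feats) + self.hidden_proj(hidden).unsqueeze(1)))
        alpha = torch.softmax(e, dim=1)            # (B, N, 1) attention weights
        context = (alpha * feats).sum(dim=1)       # (B, feat_dim) attended context
        return context, alpha

context, alpha = SoftAttention()(torch.randn(2, 49, 512), torch.randn(2, 512))
```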
9

Vanner, Catherine, and Anuradha Dugal. "Personal, Powerful, Political." Girlhood Studies 13, no. 2 (June 1, 2020): vii–xv. http://dx.doi.org/10.3167/ghs.2020.130202.

Abstract:
“Today I met my role model,” tweeted climate change activist Greta Thunberg on 25 February 2020, captioning a picture of herself with girls’ education activist Malala Yousafzai, who also tweeted the picture, proclaiming that Greta was “the only friend I would skip school for.” The proclamations of mutual admiration illustrate a form of solidarity between the two most famous girl activists, who are often pointed to as examples of the power of the individual girl activist in spite of their intentionally collective approaches that connect young activists and civil society organizations around the world. These girl activists have garnered worldwide attention for their causes but have also been subject to problematic media representations that elevate voices of privilege and/or focus on girl activists as exceptional individuals (Gordon and Taft 2010; Hesford 2014), often obscuring the movements behind them. For this reason, this special issue explores activism networks by, for, and with girls and young women, examining and emphasizing girls’ activism in collective and collaborative spaces.
10

Atliha, Viktar, and Dmitrij Šešok. "Image-Captioning Model Compression." Applied Sciences 12, no. 3 (February 4, 2022): 1638. http://dx.doi.org/10.3390/app12031638.

Abstract:
Image captioning is a very important task, which is on the edge between natural language processing (NLP) and computer vision (CV). The current quality of the captioning models allows them to be used for practical tasks, but they require both large computational power and considerable storage space. Despite the practical importance of the image-captioning problem, only a few papers have investigated model size compression in order to prepare them for use on mobile devices. Furthermore, these works usually only investigate decoder compression in a typical encoder–decoder architecture, while the encoder traditionally occupies most of the space. We applied the most efficient model-compression techniques such as architectural changes, pruning and quantization to several state-of-the-art image-captioning architectures. As a result, all of these models were compressed by no less than 91% in terms of memory (including encoder), but lost no more than 2% and 4.5% in metrics such as CIDEr and SPICE, respectively. At the same time, the best model showed results of 127.4 CIDEr and 21.4 SPICE, with a size equal to only 34.8 MB, which sets a strong baseline for compression problems for image-captioning models, and could be used for practical applications.
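The two most portable of the techniques named here, magnitude pruning and post-training dynamic quantization, can be tried on a toy decoder with PyTorch's built-in utilities, as in the generic sketch below; the layer sizes and the 30% pruning ratio are arbitrary illustrative choices, not the paper's configuration.

```python
# Two of the named compression techniques applied to a toy decoder with PyTorch
# built-ins: L1 magnitude pruning and post-training dynamic INT8 quantization.
# Layer sizes and the 30% pruning ratio are arbitrary illustrative choices.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

decoder = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10000))

# Remove 30% of the smallest-magnitude weights in each Linear layer.
for module in decoder:
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

# Store and execute the remaining Linear weights in INT8.
quantized = torch.quantization.quantize_dynamic(decoder, {nn.Linear}, dtype=torch.qint8)
print(quantized)
```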

Dissertations / Theses on the topic "Change Captioning"

1

Hoxha, Genc. "IMAGE CAPTIONING FOR REMOTE SENSING IMAGE ANALYSIS." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/351752.

Abstract:
Image Captioning (IC) aims to generate a coherent and comprehensive textual description that summarizes the complex content of an image. It is a combination of computer vision and natural language processing techniques to encode the visual features of an image and translate them into a sentence. In the context of remote sensing (RS) analysis, IC has been emerging as a new research area of high interest since it not only recognizes the objects within an image but also describes their attributes and relationships. In this thesis, we propose several IC methods for RS image analysis. We focus on the design of different approaches that take into consideration the peculiarity of RS images (e.g. spectral, temporal and spatial properties) and study the benefits of IC in challenging RS applications. In particular, we focus our attention on developing a new decoder which is based on support vector machines. Compared to the traditional decoders that are based on deep learning, the proposed decoder is particularly interesting for those situations in which only a few training samples are available to alleviate the problem of overfitting. The peculiarity of the proposed decoder is its simplicity and efficiency. It has only one hyperparameter, does not require expensive power units and is very fast in terms of training and testing time, making it suitable for real-life applications. Despite the efforts made in developing reliable and accurate IC systems, the task is far from being solved. The generated descriptions are affected by several errors related to the attributes and the objects present in an RS scene. Once an error occurs, it is propagated through the recurrent layers of the decoders, leading to inaccurate descriptions. To cope with this issue, we propose two post-processing techniques with the aim of improving the generated sentences by detecting and correcting the potential errors. They are based on a Hidden Markov Model and the Viterbi algorithm. The former generates a set of possible states, while the latter finds the optimal sequence of states. The proposed post-processing techniques can be injected into any IC system at test time to improve the quality of the generated sentences. While all the captioning systems developed in the RS community are devoted to single-date RGB images, we propose two captioning systems that can be applied to multitemporal and multispectral RS images. The proposed captioning systems are able to describe the changes that have occurred in a given geographical area through time. We refer to this new paradigm of analysing multitemporal and multispectral images as change captioning (CC). To test the proposed CC systems, we construct two novel datasets composed of bitemporal RS images. The first is composed of very high-resolution RGB images, while the second consists of medium-resolution multispectral satellite images. To advance the task of CC, the constructed datasets are publicly available at the following link: https://disi.unitn.it/~melgani/datasets.html. Finally, we analyse the potential of IC for content-based image retrieval (CBIR) and show its applicability and advantages compared to the traditional techniques. Specifically, we focus our attention on developing a CBIR system that represents an image with generated descriptions and uses sentence similarity to search and retrieve relevant RS images. Compared to traditional CBIR systems, the proposed system is able to search and retrieve images using either an image or a sentence as a query, making it more convenient for end-users. The achieved results show the promising potential of our proposed methods compared to the baselines and state-of-the-art methods.
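For readers unfamiliar with the post-processing idea mentioned above, a minimal Viterbi decoder over a discrete HMM is sketched below; the toy transition and emission tables are assumptions for illustration, not the thesis's learned parameters.

```python
# Minimal Viterbi decoding over a discrete HMM, as a generic illustration of the
# post-processing idea mentioned above. Toy probabilities are assumed.
import numpy as np

def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for an observation sequence."""
    n_states, T = len(start_p), len(obs)
    logp = np.full((T, n_states), -np.inf)   # best log-probability ending in each state
    back = np.zeros((T, n_states), dtype=int)
    logp[0] = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for t in range(1, T):
        for s in range(n_states):
            scores = logp[t - 1] + np.log(trans_p[:, s]) + np.log(emit_p[s, obs[t]])
            back[t, s] = np.argmax(scores)
            logp[t, s] = np.max(scores)
    path = [int(np.argmax(logp[-1]))]        # backtrace the optimal state sequence
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 2 hidden states, 3 observable symbols.
start = np.array([0.6, 0.4])
trans = np.array([[0.7, 0.3], [0.4, 0.6]])
emit = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi([0, 1, 2], start, trans, emit))
```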

Book chapters on the topic "Change Captioning"

1

Qiu, Yue, Kodai Nakashima, Yutaka Satoh, Ryota Suzuki, Kenji Iwata, and Hirokatsu Kataoka. "Scene Change Captioning in Real Scenarios." In Artificial Intelligence in HCI, 405–19. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-05643-7_26.

2

Suzuki, Ippei, Kenta Yamamoto, Akihisa Shitara, Ryosuke Hyakuta, Ryo Iijima, and Yoichi Ochiai. "See-Through Captions in a Museum Guided Tour: Exploring Museum Guided Tour for Deaf and Hard-of-Hearing People with Real-Time Captioning on Transparent Display." In Lecture Notes in Computer Science, 542–52. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-08648-9_64.

Abstract:
Access to audible information for deaf and hard-of-hearing (DHH) people is an essential component as we move towards a diverse society. Real-time captioning is a technology with great potential to help the lives of DHH people, and various applications utilizing mobile devices have been developed. These technologies can improve the daily lives of DHH people and can considerably change the value of audio content provided in public facilities such as museums. We developed a real-time captioning system called See-Through Captions that displays subtitles on a transparent display and conducted a demonstration experiment to apply this system to a guided tour in a museum. Eleven DHH people participated in this demonstration experiment, and through questionnaires and interviews, we explored the possibility of utilizing the transparent subtitle system in a guided tour at the museum.
3

Shi, Xiangxi, Xu Yang, Jiuxiang Gu, Shafiq Joty, and Jianfei Cai. "Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning." In Computer Vision – ECCV 2020, 574–90. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-58568-6_34.

4

Olkin, Rhoda. "Immersion and Activism." In Teaching Disability, 150–89. Oxford University Press, 2021. http://dx.doi.org/10.1093/oso/9780190850661.003.0010.

Abstract:
The seven activities in this chapter are a deeper dive into the disability experience. The effects of disability-related fatigue are explored in an activity that takes place over 1 week and in a second activity that requires making difficult choices about ways to reduce fatigue. The third activity has students immerse themselves in a disability venue. The fourth activity, done in small groups, focuses on microaggressions, collecting data from 10 disabled people about microaggressions experienced. The fifth activity has students experience television using captioning to explore what is missed in information given to Deaf people. The sixth activity has students pick one problem for people with disabilities and to write the appropriate official requesting a specific change, using data, an example story, and a proposed solution. The last activity assumes some level of clinical skills and involves making clinical responses to eight vignettes involving individuals, couples, or families with disabilities.
5

Vajoczki, Susan, and Susan Watt. "Lecture Capture as a Tool to Enhance Student Accessibility." In Cases on Online Learning Communities and Beyond, 200–213. IGI Global, 2013. http://dx.doi.org/10.4018/978-1-4666-1936-4.ch011.

Abstract:
This case examines the incremental introduction of lecture-capture as a learning technology at a research-intensive university with the goal of addressing issues created by increases in both undergraduate enrolments and disability accommodation needs. This process began with podcasting lectures, leading ultimately to a lecture capture system with closed captioning. At each step, the changes were evaluated in terms of their impact on student learning, acceptability to students and faculty, and application to different disciplines. This evidence-based approach is in keeping with the research culture of the academy and has been helpful in advocating for budgetary support and encouraging faculty participation. As a result of this project, the authors unexpectedly gained substantial knowledge about the complexity of students’ lives, the impact of that complexity on their approach to learning, instructor misperceptions about the impact of this form of learning, the presence of many unreported disabilities, and the many different ways in which students used the system.

Conference papers on the topic "Change Captioning"

1

Park, Dong Huk, Trevor Darrell, and Anna Rohrbach. "Robust Change Captioning." In 2019 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2019. http://dx.doi.org/10.1109/iccv.2019.00472.

2

Kim, Hoeseong, Jongseok Kim, Hyungseok Lee, Hyunsung Park, and Gunhee Kim. "Viewpoint-Agnostic Change Captioning with Cycle Consistency." In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00210.

3

Liao, Zeming, Qingbao Huang, Yu Liang, Mingyi Fu, Yi Cai, and Qing Li. "Scene Graph with 3D Information for Change Captioning." In MM '21: ACM Multimedia Conference. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3474085.3475712.

4

Tu, Yunbin, Liang Li, Chenggang Yan, Shengxiang Gao, and Zhengtao Yu. "R^3Net: Relation-embedded Representation Reconstruction Network for Change Captioning." In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.emnlp-main.735.

5

Tu, Yunbin, Tingting Yao, Liang Li, Jiedong Lou, Shengxiang Gao, Zhengtao Yu, and Chenggang Yan. "Semantic Relation-aware Difference Representation Learning for Change Captioning." In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.findings-acl.6.

6

Hosseinzadeh, Mehrdad, and Yang Wang. "Image Change Captioning by Learning from an Auxiliary Task." In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2021. http://dx.doi.org/10.1109/cvpr46437.2021.00275.

7

Qiu, Yue, Shintaro Yamamoto, Ryosuke Yamada, Ryota Suzuki, Hirokatsu Kataoka, Kenji Iwata, and Yutaka Satoh. "3D Change Localization and Captioning from Dynamic Scans of Indoor Scenes." In 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, 2023. http://dx.doi.org/10.1109/wacv56688.2023.00123.

8

Chouaf, Seloua, Genc Hoxha, Youcef Smara, and Farid Melgani. "Captioning Changes in Bi-Temporal Remote Sensing Images." In IGARSS 2021 - 2021 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 2021. http://dx.doi.org/10.1109/igarss47720.2021.9554419.
