Academic literature on the topic 'Visual representation learning'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Visual representation learning.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.
Journal articles on the topic "Visual representation learning"
Lee, Jungmin, and Wongyoung Lee. "Aspects of A Study on the Multi Presentational Metaphor Education Using Online Telestration." Korean Society of Culture and Convergence 44, no. 9 (September 30, 2022): 163–73. http://dx.doi.org/10.33645/cnc.2022.9.44.9.163.
Yang, Chuanguang, Zhulin An, Linhang Cai, and Yongjun Xu. "Mutual Contrastive Learning for Visual Representation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3045–53. http://dx.doi.org/10.1609/aaai.v36i3.20211.
Khaerun Nisa, Rachmawati, and Reza Muhamad Zaenal. "Analysis Of Students' Mathematical Representation Ability in View of Learning Styles." Indo-MathEdu Intellectuals Journal 4, no. 2 (August 15, 2023): 99–109. http://dx.doi.org/10.54373/imeij.v4i2.119.
Kholilatun, Fiki, Nizaruddin Nizaruddin, and F. X. Didik Purwosetiyono. "Kemampuan Representasi Siswa SMP Kelas VIII dalam Menyelesaikan Soal Cerita Materi Peluang Ditinjau dari Gaya Belajar Visual." Jurnal Kualita Pendidikan 4, no. 1 (April 30, 2023): 54–59. http://dx.doi.org/10.51651/jkp.v4i1.339.
Rif'at, Mohamad, Sudiansyah Sudiansyah, and Khoirunnisa Imama. "Role of visual abilities in mathematics learning: An analysis of conceptual representation." Al-Jabar : Jurnal Pendidikan Matematika 15, no. 1 (June 10, 2024): 87. http://dx.doi.org/10.24042/ajpm.v15i1.22406.
Ruliani, Iva Desi, Nizaruddin Nizaruddin, and Yanuar Hery Murtianto. "Profile Analysis of Mathematical Problem Solving Abilities with Krulik & Rudnick Stages Judging from Medium Visual Representation." JIPM (Jurnal Ilmiah Pendidikan Matematika) 7, no. 1 (September 7, 2018): 22. http://dx.doi.org/10.25273/jipm.v7i1.2123.
Zha, B., and A. Yilmaz. "LEARNING MAPS FOR OBJECT LOCALIZATION USING VISUAL-INERTIAL ODOMETRY." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-1-2020 (August 3, 2020): 343–50. http://dx.doi.org/10.5194/isprs-annals-v-1-2020-343-2020.
Moghaddam, B., and A. Pentland. "Probabilistic visual learning for object representation." IEEE Transactions on Pattern Analysis and Machine Intelligence 19, no. 7 (July 1997): 696–710. http://dx.doi.org/10.1109/34.598227.
He, Xiangteng, and Yuxin Peng. "Fine-Grained Visual-Textual Representation Learning." IEEE Transactions on Circuits and Systems for Video Technology 30, no. 2 (February 2020): 520–31. http://dx.doi.org/10.1109/tcsvt.2019.2892802.
Liu, Qiyuan, Qi Zhou, Rui Yang, and Jie Wang. "Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8843–51. http://dx.doi.org/10.1609/aaai.v37i7.26063.
Dissertations / Theses on the topic "Visual representation learning"
Wang, Zhaoqing. "Self-supervised Visual Representation Learning." Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/29595.
Zhou, Bolei. "Interpretable representation learning for visual intelligence." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117837.
Full textThis electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 131-140).
Recent progress of deep neural networks in computer vision and machine learning has enabled transformative applications across robotics, healthcare, and security. However, despite the superior performance of the deep neural networks, it remains challenging to understand their inner workings and explain their output predictions. This thesis investigates several novel approaches for opening up the "black box" of neural networks used in visual recognition tasks and understanding their inner working mechanism. I first show that objects and other meaningful concepts emerge as a consequence of recognizing scenes. A network dissection approach is further introduced to automatically identify the internal units as the emergent concept detectors and quantify their interpretability. Then I describe an approach that can efficiently explain the output prediction for any given image. It sheds light on the decision-making process of the networks and why the predictions succeed or fail. Finally, I show some ongoing efforts toward learning efficient and interpretable deep representations for video event understanding and some future directions.
by Bolei Zhou.
Ph. D.
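The abstract above describes identifying internal units as concept detectors and explaining individual predictions. As a loose illustration of this family of techniques (not the thesis code), the sketch below computes a class activation map by projecting classifier weights onto convolutional feature maps; the ResNet-18 backbone, hook, and class index are assumptions for demonstration.

```python
# Illustrative sketch only: class activation mapping with an assumed
# torchvision ResNet-18; not code from the thesis.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None).eval()
features = {}

# Capture the last convolutional feature maps before global average pooling.
model.layer4.register_forward_hook(lambda m, i, o: features.update(conv=o))

def class_activation_map(image, class_idx):
    """Project the classifier weights for class_idx onto the conv feature maps."""
    with torch.no_grad():
        logits = model(image)                         # (1, 1000)
        fmap = features["conv"][0]                    # (512, 7, 7)
        weights = model.fc.weight[class_idx]          # (512,)
        cam = F.relu(torch.einsum("c,chw->hw", weights, fmap))
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return logits, cam                                # cam highlights supporting regions

logits, cam = class_activation_map(torch.randn(1, 3, 224, 224), class_idx=243)
```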
Ben-Younes, Hedi. "Multi-modal representation learning towards visual reasoning." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS173.
The quantity of images that populate the Internet is dramatically increasing. It becomes of critical importance to develop the technology for a precise and automatic understanding of visual contents. As image recognition systems are becoming more and more relevant, researchers in artificial intelligence now seek the next generation of vision systems that can perform high-level scene understanding. In this thesis, we are interested in Visual Question Answering (VQA), which consists of building models that answer any natural language question about any image. Because of its nature and complexity, VQA is often considered a proxy for visual reasoning. Classically, VQA architectures are designed as trainable systems that are provided with images, questions about them and their answers. To tackle this problem, typical approaches involve modern Deep Learning (DL) techniques. In the first part, we focus on developing multi-modal fusion strategies to model the interactions between image and question representations. More specifically, we explore bilinear fusion models and exploit concepts from tensor analysis to provide tractable and expressive factorizations of parameters. These fusion mechanisms are studied under the widely used visual attention framework: the answer to the question is provided by focusing only on the relevant image regions. In the last part, we move away from the attention mechanism and build a more advanced scene understanding architecture where we consider objects and their spatial and semantic relations. All models are thoroughly experimentally evaluated on standard datasets and the results are competitive with the literature.
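The first part of the abstract centers on bilinear fusion of image and question representations with tractable factorizations. A minimal sketch of a low-rank bilinear fusion layer in that spirit is given below; the dimensions, layer names, and tanh nonlinearity are illustrative assumptions, not the exact factorizations studied in the thesis.

```python
# Hedged sketch of low-rank bilinear fusion for VQA-style models (illustrative only).
import torch
import torch.nn as nn

class LowRankBilinearFusion(nn.Module):
    """Approximate a full bilinear interaction v^T W q with two low-rank projections."""
    def __init__(self, dim_v=2048, dim_q=1024, rank=512, num_answers=1000):
        super().__init__()
        self.proj_v = nn.Linear(dim_v, rank)       # image-side factor
        self.proj_q = nn.Linear(dim_q, rank)       # question-side factor
        self.classifier = nn.Linear(rank, num_answers)

    def forward(self, v, q):
        # Elementwise product of the projections keeps the parameter count tractable
        # while still modeling multiplicative image-question interactions.
        fused = torch.tanh(self.proj_v(v)) * torch.tanh(self.proj_q(q))
        return self.classifier(fused)

fusion = LowRankBilinearFusion()
v = torch.randn(8, 2048)        # e.g., pooled CNN image features
q = torch.randn(8, 1024)        # e.g., a recurrent question encoding
answer_logits = fusion(v, q)    # (8, 1000)
```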
Sharif Razavian, Ali. "Convolutional Network Representation for Visual Recognition." Doctoral thesis, KTH, Robotik, perception och lärande, RPL, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-197919.
Yu, Mengyang. "Feature reduction and representation learning for visual applications." Thesis, Northumbria University, 2016. http://nrl.northumbria.ac.uk/30222/.
Full textVenkataramanan, Shashanka. "Metric learning for instance and category-level visual representation." Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS022.
The primary goal in computer vision is to enable machines to extract meaningful information from visual data, such as images and videos, and leverage this information to perform a wide range of tasks. To this end, substantial research has focused on developing deep learning models capable of encoding comprehensive and robust visual representations. A prominent strategy in this context involves pretraining models on large-scale datasets, such as ImageNet, to learn representations that can exhibit cross-task applicability and facilitate the successful handling of diverse downstream tasks with minimal effort. To facilitate learning on these large-scale datasets and encode good representations, complex data augmentation strategies have been used. However, these augmentations can be limited in their scope, either being hand-crafted and lacking diversity, or generating images that appear unnatural. Moreover, the focus of these augmentation techniques has primarily been on the ImageNet dataset and its downstream tasks, limiting their applicability to a broader range of computer vision problems. In this thesis, we aim to tackle these limitations by exploring different approaches to enhance the efficiency and effectiveness in representation learning. The common thread across the works presented is the use of interpolation-based techniques, such as mixup, to generate diverse and informative training examples beyond the original dataset. In the first work, we are motivated by the idea of deformation as a natural way of interpolating images rather than using a convex combination. We show that geometrically aligning the two images in the feature space allows for more natural interpolation that retains the geometry of one image and the texture of the other, connecting it to style transfer. Drawing from these observations, we explore the combination of mixup and deep metric learning. We develop a generalized formulation that accommodates mixup in metric learning, leading to improved representations that explore areas of the embedding space beyond the training classes. Building on these insights, we revisit the original motivation of mixup and generate a larger number of interpolated examples beyond the mini-batch size by interpolating in the embedding space. This approach allows us to sample on the entire convex hull of the mini-batch, rather than just along linear segments between pairs of examples. Finally, we investigate the potential of using natural augmentations of objects from videos. We introduce a "Walking Tours" dataset of first-person egocentric videos, which capture a diverse range of objects and actions in natural scene transitions. We then propose a novel self-supervised pretraining method called DoRA, which detects and tracks objects in video frames, deriving multiple views from the tracks and using them in a self-supervised manner.
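One recurring idea in the abstract is interpolating training examples directly in the embedding space rather than in pixel space. A small sketch of that general mechanism follows; the function name, Beta-distributed mixing coefficient, and unit-norm assumption are illustrative choices, not the thesis implementation.

```python
# Hedged sketch: mixup-style interpolation of embeddings for metric learning.
import torch
import torch.nn.functional as F

def embedding_mixup(embeddings, labels, alpha=1.0):
    """Mix random pairs of embeddings and return both sets of labels plus the weight."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(embeddings.size(0))
    mixed = lam * embeddings + (1 - lam) * embeddings[perm]
    mixed = F.normalize(mixed, dim=-1)   # keep mixed points on the unit hypersphere
    return mixed, labels, labels[perm], lam

embeddings = F.normalize(torch.randn(32, 128), dim=-1)   # stand-in encoder outputs
labels = torch.randint(0, 10, (32,))
mixed, y_a, y_b, lam = embedding_mixup(embeddings, labels)
# A contrastive or proxy-based loss can then weight the two targets by lam and (1 - lam).
```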
Li, Nuo Ph D. Massachusetts Institute of Technology. "Unsupervised learning of invariant object representation in primate visual cortex." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/65288.
Cataloged from PDF version of thesis.
Includes bibliographical references.
Visual object recognition (categorization and identification) is one of the most fundamental cognitive functions for our survival. Our visual system has the remarkable ability to convey to us visual object and category information in a manner that is largely tolerant ("invariant") to the exact position, size, pose of the object, illumination, and clutter. The ventral visual stream in non-human primate has solved this problem. At the highest stage of the visual hierarchy, the inferior temporal cortex (IT), neurons have selectivity for objects and maintain that selectivity across variations in the images. A reasonably sized population of these tolerant neurons can support object recognition. However, we do not yet understand how IT neurons construct this neuronal tolerance. The aim of this thesis is to tackle this question and to examine the hypothesis that the ventral visual stream may leverage experience to build its neuronal tolerance. One potentially powerful idea is that time can act as an implicit teacher, in that each object's identity tends to remain temporally stable, thus different retinal images of the same object are temporally contiguous. In theory, the ventral stream could take advantage of this natural tendency and learn to associate together the neuronal representations of temporally contiguous retinal images to yield tolerant object selectivity in IT cortex. In this thesis, I report neuronal support for this hypothesis in IT of non-human primates. First, targeted alteration of temporally contiguous experience with object images at different retinal positions rapidly reshaped IT neurons' position tolerance. Second, similar temporal contiguity manipulation of experience with object images at different sizes similarly reshaped IT size tolerance. These instances of experience-induced effect were similar in magnitude, grew gradually stronger with increasing visual experience, and the size of the effect was large. Taken together, these studies show that unsupervised, temporally contiguous experience can reshape and build at least two types of IT tolerance, and that they can do so under a wide range of spatiotemporal regimes encountered during natural visual exploration. These results suggest that the ventral visual stream uses temporal contiguity visual experience with a general unsupervised tolerance learning (UTL) mechanism to build its invariant object representation.
by Nuo Li.
Ph.D.
Dalens, Théophile. "Learnable factored image representation for visual discovery." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE036.
This thesis proposes an approach for analyzing unpaired visual data annotated with time stamps by generating how images would have looked if they were from different times. To isolate and transfer time-dependent appearance variations, we introduce a new trainable bilinear factor separation module. We analyze its relation to classical factored representations and concatenation-based auto-encoders. We demonstrate this new module has clear advantages compared to standard concatenation when used in a bottleneck encoder-decoder convolutional neural network architecture. We also show that it can be inserted in a recent adversarial image translation architecture, enabling the image transformation to multiple different target time periods using a single network.
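As a rough illustration of a trainable bilinear factor-separation module of the kind described above, combining a content code with a time-period code, one might write the following; all names and dimensions are assumptions for the sketch, not the thesis architecture.

```python
# Hedged sketch: a bilinear module that recombines content and time factors
# inside a bottleneck encoder-decoder (illustrative only).
import torch
import torch.nn as nn

class BilinearFactorModule(nn.Module):
    """Combine a content code with a time-period code via a learned bilinear map."""
    def __init__(self, dim_content=256, dim_time=16, dim_out=256):
        super().__init__()
        # nn.Bilinear holds a (dim_out, dim_content, dim_time) weight tensor.
        self.bilinear = nn.Bilinear(dim_content, dim_time, dim_out)

    def forward(self, content, time_code):
        return self.bilinear(content, time_code)

module = BilinearFactorModule()
content = torch.randn(4, 256)           # encoder bottleneck features of input images
time_code = torch.randn(4, 16)          # embedding of a target time period
restyled = module(content, time_code)   # features to decode with that period's appearance
```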
Jonaityte, Inga <1981>. "Visual representation and financial decision making." Doctoral thesis, Università Ca' Foscari Venezia, 2014. http://hdl.handle.net/10579/4593.
This thesis experimentally examines the effects of visual representations on financial decisions. We hypothesize that visual representations of financial information can influence decisions. To test these hypotheses, we conducted online experiments and showed that the choice of visual representation leads to changes in attention, comprehension, and evaluation of the information. The second study concerns the ability of financial advisors to offer expert judgment to help inexperienced consumers with financial decisions. We found that advertising content significantly influences experts and novices alike, which offers a new perspective on financial advisors' decisions. The third topic concerns learning from multidimensional information, adapting to change, and developing new strategies. We investigated the effects of cue importance and of changes in the decision environment on learning. Abrupt transformations of the decision environment are more damaging than gradual ones.
Büchler, Uta [Verfasser], and Björn [Akademischer Betreuer] Ommer. "Visual Representation Learning with Minimal Supervision / Uta Büchler ; Betreuer: Björn Ommer." Heidelberg : Universitätsbibliothek Heidelberg, 2021. http://d-nb.info/1225868505/34.
Books on the topic "Visual representation learning"
Virk, Satyugjit Singh. Learning STEM Through Integrative Visual Representation. [New York, N.Y.?]: [publisher not identified], 2013.
Zhang, Zheng. Binary Representation Learning on Visual Images. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2.
Cheng, Hong. Sparse Representation, Modeling and Learning in Visual Recognition. London: Springer London, 2015. http://dx.doi.org/10.1007/978-1-4471-6714-3.
McBride, Kecia Driver, 1966-, ed. Visual media and the humanities: A pedagogy of representation. Knoxville: University of Tennessee Press, 2004.
Zareian, Alireza. Learning Structured Representations for Understanding Visual and Multimedia Data. [New York, N.Y.?]: [publisher not identified], 2021.
Rumiati, Raffaella I., and Alfonso Caramazza, eds. The Multiple functions of sensory-motor representations. Hove: Psychology Press, 2005.
Spiliotopoulou-Papantoniou, Vasiliki. The changing role of visual representations as a tool for research and learning. Hauppauge, N.Y: Nova Science Publishers, 2011.
Learning-Based Local Visual Representation and Indexing. Elsevier, 2015. http://dx.doi.org/10.1016/c2014-0-01997-1.
Ji, Rongrong, Yue Gao, Ling-Yu Duan, Qionghai Dai, and Hongxun Yao. Learning-Based Local Visual Representation and Indexing. Elsevier Science & Technology Books, 2015.
Learning-Based Local Visual Representation and Indexing. Elsevier Science & Technology Books, 2015.
Book chapters on the topic "Visual representation learning"
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Representation Learning." In Visual Question Answering, 111–17. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_7.
Zhang, Zheng. "Correction to: Binary Representation Learning on Visual Images: Learning to Hash for Similarity Search." In Binary Representation Learning on Visual Images, C1–C2. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_8.
Zhang, Zheng. "Deep Collaborative Graph Hashing." In Binary Representation Learning on Visual Images, 143–67. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_6.
Zhang, Zheng. "Scalable Supervised Asymmetric Hashing." In Binary Representation Learning on Visual Images, 17–50. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_2.
Zhang, Zheng. "Probability Ordinal-Preserving Semantic Hashing." In Binary Representation Learning on Visual Images, 81–109. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_4.
Zhang, Zheng. "Introduction." In Binary Representation Learning on Visual Images, 1–16. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_1.
Zhang, Zheng. "Semantic-Aware Adversarial Training." In Binary Representation Learning on Visual Images, 169–97. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_7.
Zhang, Zheng. "Inductive Structure Consistent Hashing." In Binary Representation Learning on Visual Images, 51–80. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_3.
Zhang, Zheng. "Ordinal-Preserving Latent Graph Hashing." In Binary Representation Learning on Visual Images, 111–41. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_5.
Guo, Tan, Lei Zhang, and Xiaoheng Tan. "Extreme Latent Representation Learning for Visual Classification." In Proceedings in Adaptation, Learning and Optimization, 65–75. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-23307-5_8.
Conference papers on the topic "Visual representation learning"
Chen, Guikun, Xia Li, Yi Yang, and Wenguan Wang. "Neural Clustering Based Visual Representation Learning." In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5714–25. IEEE, 2024. http://dx.doi.org/10.1109/cvpr52733.2024.00546.
Brack, Viktor, and Dominik Koßmann. "Local Representation Learning Using Visual Priors for Remote Sensing." In IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, 8263–67. IEEE, 2024. http://dx.doi.org/10.1109/igarss53475.2024.10641131.
Xie, Ruobing, Zhiyuan Liu, Huanbo Luan, and Maosong Sun. "Image-embodied Knowledge Representation Learning." In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/438.
Li, Zechao. "Understanding-oriented visual representation learning." In the 7th International Conference. New York, New York, USA: ACM Press, 2015. http://dx.doi.org/10.1145/2808492.2808572.
Lee, Donghun, Seonghyun Kim, Samyeul Noh, Heechul Bae, and Ingook Jang. "High-level Visual Representation via Perceptual Representation Learning." In 2023 14th International Conference on Information and Communication Technology Convergence (ICTC). IEEE, 2023. http://dx.doi.org/10.1109/ictc58733.2023.10393558.
Kolesnikov, Alexander, Xiaohua Zhai, and Lucas Beyer. "Revisiting Self-Supervised Visual Representation Learning." In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019. http://dx.doi.org/10.1109/cvpr.2019.00202.
Sariyildiz, Mert Bulent, Yannis Kalantidis, Diane Larlus, and Karteek Alahari. "Concept Generalization in Visual Representation Learning." In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00949.
Özçelik, Timoteos Onur, Berk Gökberk, and Lale Akarun. "Self-Supervised Dense Visual Representation Learning." In 2024 32nd Signal Processing and Communications Applications Conference (SIU). IEEE, 2024. http://dx.doi.org/10.1109/siu61531.2024.10600771.
Broscheit, Samuel. "Learning Distributional Token Representations from Visual Features." In Proceedings of The Third Workshop on Representation Learning for NLP. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/w18-3025.
Hong, Xudong, Vera Demberg, Asad Sayeed, Qiankun Zheng, and Bernt Schiele. "Visual Coherence Loss for Coherent and Visually Grounded Story Generation." In Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP 2023). Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.repl4nlp-1.27.
Reports on the topic "Visual representation learning"
Tarasenko, Rostyslav O., Svitlana M. Amelina, Yuliya M. Kazhan, and Olga V. Bondarenko. The use of AR elements in the study of foreign languages at the university. CEUR Workshop Proceedings, November 2020. http://dx.doi.org/10.31812/123456789/4421.
Full textTarasenko, Rostyslav O., Svitlana M. Amelina, Yuliya M. Kazhan, and Olga V. Bondarenko. The use of AR elements in the study of foreign languages at the university. CEUR Workshop Proceedings, November 2020. http://dx.doi.org/10.31812/123456789/4421.
Shukla, Indu, Rajeev Agrawal, Kelly Ervin, and Jonathan Boone. AI on digital twin of facility captured by reality scans. Engineer Research and Development Center (U.S.), November 2023. http://dx.doi.org/10.21079/11681/47850.
Iatsyshyn, Anna V., Valeriia O. Kovach, Yevhen O. Romanenko, Iryna I. Deinega, Andrii V. Iatsyshyn, Oleksandr O. Popov, Yulii G. Kutsan, Volodymyr O. Artemchuk, Oleksandr Yu Burov, and Svitlana H. Lytvynova. Application of augmented reality technologies for preparation of specialists of new technological era. [publisher not identified], February 2020. http://dx.doi.org/10.31812/123456789/3749.
Full textZerla, Pauline. Trauma, Violence Prevention, and Reintegration: Learning from Youth Conflict Narratives in the Central African Republic. RESOLVE Network, February 2024. http://dx.doi.org/10.37805/lpbi2024.1.