A selection of scholarly literature on the topic "Visual representation learning"
Format your source in APA, MLA, Chicago, Harvard, and other citation styles
Consult the lists of relevant articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Visual representation learning".
Next to each work in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic reference for the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of a publication in .pdf format and read its abstract online, where these are available in the metadata.
Journal articles on the topic "Visual representation learning"
Lee, Jungmin, and Wongyoung Lee. "Aspects of A Study on the Multi Presentational Metaphor Education Using Online Telestration." Korean Society of Culture and Convergence 44, no. 9 (September 30, 2022): 163–73. http://dx.doi.org/10.33645/cnc.2022.9.44.9.163.
Yang, Chuanguang, Zhulin An, Linhang Cai, and Yongjun Xu. "Mutual Contrastive Learning for Visual Representation Learning." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3045–53. http://dx.doi.org/10.1609/aaai.v36i3.20211.
Khaerun Nisa, Rachmawati, and Reza Muhamad Zaenal. "Analysis Of Students' Mathematical Representation Ability in View of Learning Styles." Indo-MathEdu Intellectuals Journal 4, no. 2 (August 15, 2023): 99–109. http://dx.doi.org/10.54373/imeij.v4i2.119.
Kholilatun, Fiki, Nizaruddin Nizaruddin, and F. X. Didik Purwosetiyono. "Kemampuan Representasi Siswa SMP Kelas VIII dalam Menyelesaikan Soal Cerita Materi Peluang Ditinjau dari Gaya Belajar Visual." Jurnal Kualita Pendidikan 4, no. 1 (April 30, 2023): 54–59. http://dx.doi.org/10.51651/jkp.v4i1.339.
Rif'at, Mohamad, Sudiansyah Sudiansyah, and Khoirunnisa Imama. "Role of visual abilities in mathematics learning: An analysis of conceptual representation." Al-Jabar: Jurnal Pendidikan Matematika 15, no. 1 (June 10, 2024): 87. http://dx.doi.org/10.24042/ajpm.v15i1.22406.
Ruliani, Iva Desi, Nizaruddin Nizaruddin, and Yanuar Hery Murtianto. "Profile Analysis of Mathematical Problem Solving Abilities with Krulik & Rudnick Stages Judging from Medium Visual Representation." JIPM (Jurnal Ilmiah Pendidikan Matematika) 7, no. 1 (September 7, 2018): 22. http://dx.doi.org/10.25273/jipm.v7i1.2123.
Zha, B., and A. Yilmaz. "Learning Maps for Object Localization Using Visual-Inertial Odometry." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-1-2020 (August 3, 2020): 343–50. http://dx.doi.org/10.5194/isprs-annals-v-1-2020-343-2020.
Moghaddam, B., and A. Pentland. "Probabilistic visual learning for object representation." IEEE Transactions on Pattern Analysis and Machine Intelligence 19, no. 7 (July 1997): 696–710. http://dx.doi.org/10.1109/34.598227.
He, Xiangteng, and Yuxin Peng. "Fine-Grained Visual-Textual Representation Learning." IEEE Transactions on Circuits and Systems for Video Technology 30, no. 2 (February 2020): 520–31. http://dx.doi.org/10.1109/tcsvt.2019.2892802.
Liu, Qiyuan, Qi Zhou, Rui Yang, and Jie Wang. "Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8843–51. http://dx.doi.org/10.1609/aaai.v37i7.26063.
Dissertations on the topic "Visual representation learning"
Wang, Zhaoqing. "Self-supervised Visual Representation Learning." Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/29595.
Zhou, Bolei. "Interpretable representation learning for visual intelligence." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117837.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 131-140).
Recent progress of deep neural networks in computer vision and machine learning has enabled transformative applications across robotics, healthcare, and security. However, despite the superior performance of the deep neural networks, it remains challenging to understand their inner workings and explain their output predictions. This thesis investigates several novel approaches for opening up the "black box" of neural networks used in visual recognition tasks and understanding their inner working mechanism. I first show that objects and other meaningful concepts emerge as a consequence of recognizing scenes. A network dissection approach is further introduced to automatically identify the internal units as the emergent concept detectors and quantify their interpretability. Then I describe an approach that can efficiently explain the output prediction for any given image. It sheds light on the decision-making process of the networks and why the predictions succeed or fail. Finally, I show some ongoing efforts toward learning efficient and interpretable deep representations for video event understanding and some future directions.
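For readers who want a concrete picture of what explaining an individual prediction by tracing it back to image regions can look like in practice, the sketch below computes a class activation map in PyTorch. It is only an illustrative approximation under stated assumptions (a torchvision ResNet-18, whose last convolutional features are globally average-pooled before the linear classifier), not code from the thesis; the input tensor is a random stand-in for a preprocessed image.

import torch
import torch.nn.functional as F
from torchvision import models

# Illustrative class-activation-map (CAM) computation; assumes a ResNet-18
# whose last conv features are globally average-pooled before the fc layer.
model = models.resnet18(weights=None).eval()  # load pretrained weights in real use

features = {}
def save_features(module, inputs, output):
    features["conv"] = output  # shape (1, 512, 7, 7) for a 224x224 input

model.layer4.register_forward_hook(save_features)

image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
with torch.no_grad():
    logits = model(image)
cls = logits.argmax(dim=1).item()

# CAM: weight the final feature maps by the classifier weights of that class.
cam = torch.einsum("c,chw->hw", model.fc.weight[cls], features["conv"][0])
cam = F.relu(cam)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
cam = F.interpolate(cam[None, None], size=image.shape[-2:], mode="bilinear")[0, 0]
print("predicted class:", cls, "heatmap shape:", tuple(cam.shape))

The resulting heatmap highlights which spatial locations contributed most to the predicted class, which is one simple way to visualize why a prediction succeeds or fails.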
by Bolei Zhou.
Ph. D.
Ben-Younes, Hedi. "Multi-modal representation learning towards visual reasoning." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS173.
The quantity of images that populate the Internet is dramatically increasing. It becomes of critical importance to develop the technology for a precise and automatic understanding of visual contents. As image recognition systems become more and more relevant, researchers in artificial intelligence now seek the next generation of vision systems that can perform high-level scene understanding. In this thesis, we are interested in Visual Question Answering (VQA), which consists in building models that answer any natural language question about any image. Because of its nature and complexity, VQA is often considered a proxy for visual reasoning. Classically, VQA architectures are designed as trainable systems that are provided with images, questions about them, and their answers. To tackle this problem, typical approaches involve modern Deep Learning (DL) techniques. In the first part, we focus on developing multi-modal fusion strategies to model the interactions between image and question representations. More specifically, we explore bilinear fusion models and exploit concepts from tensor analysis to provide tractable and expressive factorizations of parameters. These fusion mechanisms are studied under the widely used visual attention framework: the answer to the question is provided by focusing only on the relevant image regions. In the last part, we move away from the attention mechanism and build a more advanced scene understanding architecture where we consider objects and their spatial and semantic relations. All models are evaluated experimentally on standard datasets, and the results are competitive with the literature.
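As a rough illustration of the bilinear-fusion-with-factorized-parameters idea described above, the following PyTorch sketch fuses a question vector and an image vector through a low-rank bilinear interaction. All dimensions, the rank, and the classifier head are illustrative assumptions; this generic factorization is in the spirit of such models but is not the exact parameterization developed in the thesis.

import torch
import torch.nn as nn

# Generic low-rank bilinear fusion of a question vector and an image vector.
class LowRankBilinearFusion(nn.Module):
    def __init__(self, q_dim=2400, v_dim=2048, rank=320, n_answers=3000):
        super().__init__()
        self.proj_q = nn.Linear(q_dim, rank)   # project the question features
        self.proj_v = nn.Linear(v_dim, rank)   # project the image features
        self.classifier = nn.Linear(rank, n_answers)

    def forward(self, q, v):
        # The element-wise product of two projections is a rank-constrained
        # approximation of a full bilinear interaction between q and v.
        fused = torch.tanh(self.proj_q(q)) * torch.tanh(self.proj_v(v))
        return self.classifier(fused)

fusion = LowRankBilinearFusion()
q = torch.randn(8, 2400)   # batch of question embeddings (e.g. from an RNN)
v = torch.randn(8, 2048)   # batch of image embeddings (e.g. from a CNN)
print(fusion(q, v).shape)  # torch.Size([8, 3000])

The low-rank constraint is what keeps the interaction tensor tractable: a full bilinear map between vectors of these sizes would require billions of parameters.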
Sharif, Razavian Ali. "Convolutional Network Representation for Visual Recognition." Doctoral thesis, KTH, Robotik, perception och lärande, RPL, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-197919.
Повний текст джерелаQC 20161209
Yu, Mengyang. "Feature reduction and representation learning for visual applications." Thesis, Northumbria University, 2016. http://nrl.northumbria.ac.uk/30222/.
Venkataramanan, Shashanka. "Metric learning for instance and category-level visual representation." Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS022.
The primary goal in computer vision is to enable machines to extract meaningful information from visual data, such as images and videos, and leverage this information to perform a wide range of tasks. To this end, substantial research has focused on developing deep learning models capable of encoding comprehensive and robust visual representations. A prominent strategy in this context involves pretraining models on large-scale datasets, such as ImageNet, to learn representations that can exhibit cross-task applicability and facilitate the successful handling of diverse downstream tasks with minimal effort. To facilitate learning on these large-scale datasets and encode good representations, complex data augmentation strategies have been used. However, these augmentations can be limited in their scope, either being hand-crafted and lacking diversity, or generating images that appear unnatural. Moreover, the focus of these augmentation techniques has primarily been on the ImageNet dataset and its downstream tasks, limiting their applicability to a broader range of computer vision problems. In this thesis, we aim to tackle these limitations by exploring different approaches to enhance the efficiency and effectiveness of representation learning. The common thread across the works presented is the use of interpolation-based techniques, such as mixup, to generate diverse and informative training examples beyond the original dataset. In the first work, we are motivated by the idea of deformation as a natural way of interpolating images rather than using a convex combination. We show that geometrically aligning the two images in the feature space allows for more natural interpolation that retains the geometry of one image and the texture of the other, connecting it to style transfer. Drawing from these observations, we explore the combination of mixup and deep metric learning. We develop a generalized formulation that accommodates mixup in metric learning, leading to improved representations that explore areas of the embedding space beyond the training classes. Building on these insights, we revisit the original motivation of mixup and generate a larger number of interpolated examples beyond the mini-batch size by interpolating in the embedding space. This approach allows us to sample on the entire convex hull of the mini-batch, rather than just along linear segments between pairs of examples. Finally, we investigate the potential of using natural augmentations of objects from videos. We introduce a "Walking Tours" dataset of first-person egocentric videos, which capture a diverse range of objects and actions in natural scene transitions. We then propose a novel self-supervised pretraining method called DoRA, which detects and tracks objects in video frames, deriving multiple views from the tracks and using them in a self-supervised manner.
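The idea of interpolating in the embedding space to obtain extra training points, mentioned in the abstract, can be sketched as follows. This is a generic illustration with assumed dimensions and a Beta-distributed mixing coefficient, not the thesis's exact formulation.

import torch
import torch.nn.functional as F

# Interpolate pairs of L2-normalized embeddings to create virtual training
# points beyond the original batch.
def mixup_embeddings(z, alpha=1.0):
    lam = torch.distributions.Beta(alpha, alpha).sample((z.size(0), 1)).to(z.device)
    partner = torch.randperm(z.size(0), device=z.device)
    z_mix = lam * z + (1.0 - lam) * z[partner]
    return F.normalize(z_mix, dim=1), partner, lam.squeeze(1)

z = F.normalize(torch.randn(16, 128), dim=1)   # embeddings from some encoder
labels = torch.randint(0, 4, (16,))            # class labels used by the metric loss

z_mix, partner, lam = mixup_embeddings(z)
# Each mixed point can carry a soft target: lam of its own class and
# (1 - lam) of its partner's class.
print(z_mix.shape, labels[:4].tolist(), labels[partner][:4].tolist(), lam[:4])

Because the virtual points lie between existing embeddings, they populate regions of the embedding space that the original mini-batch never covers, which is the motivation the abstract describes.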
Li, Nuo. "Unsupervised learning of invariant object representation in primate visual cortex." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/65288.
Cataloged from PDF version of thesis.
Includes bibliographical references.
Visual object recognition (categorization and identification) is one of the most fundamental cognitive functions for our survival. Our visual system has the remarkable ability to convey to us visual object and category information in a manner that is largely tolerant ("invariant") to the exact position, size, pose of the object, illumination, and clutter. The ventral visual stream in non-human primate has solved this problem. At the highest stage of the visual hierarchy, the inferior temporal cortex (IT), neurons have selectivity for objects and maintain that selectivity across variations in the images. A reasonably sized population of these tolerant neurons can support object recognition. However, we do not yet understand how IT neurons construct this neuronal tolerance. The aim of this thesis is to tackle this question and to examine the hypothesis that the ventral visual stream may leverage experience to build its neuronal tolerance. One potentially powerful idea is that time can act as an implicit teacher, in that each object's identity tends to remain temporally stable, thus different retinal images of the same object are temporally contiguous. In theory, the ventral stream could take advantage of this natural tendency and learn to associate together the neuronal representations of temporally contiguous retinal images to yield tolerant object selectivity in IT cortex. In this thesis, I report neuronal support for this hypothesis in IT of non-human primates. First, targeted alteration of temporally contiguous experience with object images at different retinal positions rapidly reshaped IT neurons' position tolerance. Second, similar temporal contiguity manipulation of experience with object images at different sizes similarly reshaped IT size tolerance. These instances of experience-induced effect were similar in magnitude, grew gradually stronger with increasing visual experience, and the size of the effect was large. Taken together, these studies show that unsupervised, temporally contiguous experience can reshape and build at least two types of IT tolerance, and that they can do so under a wide range of spatiotemporal regimes encountered during natural visual exploration. These results suggest that the ventral visual stream uses temporal contiguity visual experience with a general unsupervised tolerance learning (UTL) mechanism to build its invariant object representation.
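As a machine-learning analogy to the temporal-contiguity principle studied in this thesis, and emphatically not the neurophysiological method itself, the sketch below pulls together the embeddings of temporally adjacent frames with a simple contrastive objective; the encoder, the random data, and the temperature are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Pull embeddings of temporally adjacent frames together with an InfoNCE-style
# loss; the other frames in the batch serve as negatives.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU(),
                        nn.Linear(256, 128))

frames_t  = torch.randn(32, 3, 64, 64)   # frames observed at time t
frames_t1 = torch.randn(32, 3, 64, 64)   # the temporally adjacent frames

z1 = F.normalize(encoder(frames_t), dim=1)
z2 = F.normalize(encoder(frames_t1), dim=1)

logits = z1 @ z2.t() / 0.1               # cosine similarities, temperature 0.1
loss = F.cross_entropy(logits, torch.arange(z1.size(0)))
loss.backward()
print(float(loss))

Under such an objective, different retinal images of the same object that occur close together in time are mapped to nearby representations, which is the computational intuition behind treating time as an implicit teacher.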
by Nuo Li.
Ph.D.
Dalens, Théophile. "Learnable factored image representation for visual discovery." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE036.
This thesis proposes an approach for analyzing unpaired visual data annotated with time stamps by generating how images would have looked had they been taken at different times. To isolate and transfer time-dependent appearance variations, we introduce a new trainable bilinear factor separation module. We analyze its relation to classical factored representations and concatenation-based auto-encoders. We demonstrate that this new module has clear advantages compared to standard concatenation when used in a bottleneck encoder-decoder convolutional neural network architecture. We also show that it can be inserted in a recent adversarial image translation architecture, enabling image transformation to multiple different target time periods using a single network.
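To give a rough sense of what a bilinear interaction between a content code and a time code can look like in a bottleneck, here is a generic PyTorch sketch. The dimensions, the low-rank parameterization, and the usage are assumptions for illustration only and do not reproduce the thesis's actual module.

import torch
import torch.nn as nn

# Generic bilinear factor-separation bottleneck: a time-independent content
# code and a time code interact through a low-rank bilinear map (rather than
# being concatenated) before decoding.
class BilinearBottleneck(nn.Module):
    def __init__(self, content_dim=256, time_dim=16, rank=128, out_dim=256):
        super().__init__()
        self.U = nn.Linear(content_dim, rank, bias=False)
        self.V = nn.Linear(time_dim, rank, bias=False)
        self.out = nn.Linear(rank, out_dim)

    def forward(self, content, time_code):
        return self.out(self.U(content) * self.V(time_code))

bottleneck = BilinearBottleneck()
content = torch.randn(1, 256)          # appearance-independent scene content
era_a = torch.randn(1, 16)             # embedding of one target time period
era_b = torch.randn(1, 16)             # embedding of another target time period

# In a full model, each output would be passed to a decoder to synthesize the
# same scene rendered under the two different time codes.
print(bottleneck(content, era_a).shape, bottleneck(content, era_b).shape)

Swapping the time code while keeping the content code fixed is what would let a single network render one scene as it might have appeared in different periods.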
Jonaityte, Inga. "Visual representation and financial decision making." Doctoral thesis, Università Ca' Foscari Venezia, 2014. http://hdl.handle.net/10579/4593.
This thesis experimentally investigates the effects of visual representations on financial decisions. We hypothesize that visual representations of financial information can influence decision making. To test these hypotheses, we conducted online experiments and showed that the choice of visual representation leads to changes in how information is attended to, comprehended, and evaluated. The second study concerns the ability of financial advisors to offer expert judgment that helps inexperienced consumers make financial decisions. We found that advertising content significantly influences experts as much as novices, which offers a new perspective on the decisions of financial advisors. The third topic concerns learning from multidimensional information, adapting to change, and developing new strategies. We investigated the effects of cue relevance and of changes in the decision environment on learning. Abrupt changes in the decision environment are more damaging than gradual ones.
Büchler, Uta. "Visual Representation Learning with Minimal Supervision." PhD thesis, supervised by Björn Ommer. Heidelberg: Universitätsbibliothek Heidelberg, 2021. http://d-nb.info/1225868505/34.
Books on the topic "Visual representation learning"
Virk, Satyugjit Singh. Learning STEM Through Integrative Visual Representation. [New York, N.Y.?]: [publisher not identified], 2013.
Zhang, Zheng. Binary Representation Learning on Visual Images. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2.
Cheng, Hong. Sparse Representation, Modeling and Learning in Visual Recognition. London: Springer London, 2015. http://dx.doi.org/10.1007/978-1-4471-6714-3.
McBride, Kecia Driver, ed. Visual media and the humanities: A pedagogy of representation. Knoxville: University of Tennessee Press, 2004.
Zareian, Alireza. Learning Structured Representations for Understanding Visual and Multimedia Data. [New York, N.Y.?]: [publisher not identified], 2021.
Rumiati, Raffaella I., and Alfonso Caramazza, eds. The Multiple functions of sensory-motor representations. Hove: Psychology Press, 2005.
Spiliotopoulou-Papantoniou, Vasiliki. The changing role of visual representations as a tool for research and learning. Hauppauge, N.Y.: Nova Science Publishers, 2011.
Learning-Based Local Visual Representation and Indexing. Elsevier, 2015. http://dx.doi.org/10.1016/c2014-0-01997-1.
Ji, Rongrong, Yue Gao, Ling-Yu Duan, Qionghai Dai, and Hongxun Yao. Learning-Based Local Visual Representation and Indexing. Elsevier Science & Technology Books, 2015.
Learning-Based Local Visual Representation and Indexing. Elsevier Science & Technology Books, 2015.
Book chapters on the topic "Visual representation learning"
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Representation Learning." In Visual Question Answering, 111–17. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_7.
Zhang, Zheng. "Correction to: Binary Representation Learning on Visual Images: Learning to Hash for Similarity Search." In Binary Representation Learning on Visual Images, C1–C2. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_8.
Zhang, Zheng. "Deep Collaborative Graph Hashing." In Binary Representation Learning on Visual Images, 143–67. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_6.
Zhang, Zheng. "Scalable Supervised Asymmetric Hashing." In Binary Representation Learning on Visual Images, 17–50. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_2.
Zhang, Zheng. "Probability Ordinal-Preserving Semantic Hashing." In Binary Representation Learning on Visual Images, 81–109. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_4.
Zhang, Zheng. "Introduction." In Binary Representation Learning on Visual Images, 1–16. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_1.
Zhang, Zheng. "Semantic-Aware Adversarial Training." In Binary Representation Learning on Visual Images, 169–97. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_7.
Zhang, Zheng. "Inductive Structure Consistent Hashing." In Binary Representation Learning on Visual Images, 51–80. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_3.
Zhang, Zheng. "Ordinal-Preserving Latent Graph Hashing." In Binary Representation Learning on Visual Images, 111–41. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_5.
Guo, Tan, Lei Zhang, and Xiaoheng Tan. "Extreme Latent Representation Learning for Visual Classification." In Proceedings in Adaptation, Learning and Optimization, 65–75. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-23307-5_8.
Conference papers on the topic "Visual representation learning"
Chen, Guikun, Xia Li, Yi Yang, and Wenguan Wang. "Neural Clustering Based Visual Representation Learning." In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5714–25. IEEE, 2024. http://dx.doi.org/10.1109/cvpr52733.2024.00546.
Brack, Viktor, and Dominik Koßmann. "Local Representation Learning Using Visual Priors for Remote Sensing." In IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, 8263–67. IEEE, 2024. http://dx.doi.org/10.1109/igarss53475.2024.10641131.
Xie, Ruobing, Zhiyuan Liu, Huanbo Luan, and Maosong Sun. "Image-embodied Knowledge Representation Learning." In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/438.
Li, Zechao. "Understanding-oriented visual representation learning." In the 7th International Conference. New York, New York, USA: ACM Press, 2015. http://dx.doi.org/10.1145/2808492.2808572.
Lee, Donghun, Seonghyun Kim, Samyeul Noh, Heechul Bae, and Ingook Jang. "High-level Visual Representation via Perceptual Representation Learning." In 2023 14th International Conference on Information and Communication Technology Convergence (ICTC). IEEE, 2023. http://dx.doi.org/10.1109/ictc58733.2023.10393558.
Kolesnikov, Alexander, Xiaohua Zhai, and Lucas Beyer. "Revisiting Self-Supervised Visual Representation Learning." In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019. http://dx.doi.org/10.1109/cvpr.2019.00202.
Sariyildiz, Mert Bulent, Yannis Kalantidis, Diane Larlus, and Karteek Alahari. "Concept Generalization in Visual Representation Learning." In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00949.
Özçelik, Timoteos Onur, Berk Gökberk, and Lale Akarun. "Self-Supervised Dense Visual Representation Learning." In 2024 32nd Signal Processing and Communications Applications Conference (SIU). IEEE, 2024. http://dx.doi.org/10.1109/siu61531.2024.10600771.
Broscheit, Samuel. "Learning Distributional Token Representations from Visual Features." In Proceedings of The Third Workshop on Representation Learning for NLP. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/w18-3025.
Hong, Xudong, Vera Demberg, Asad Sayeed, Qiankun Zheng, and Bernt Schiele. "Visual Coherence Loss for Coherent and Visually Grounded Story Generation." In Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP 2023). Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.repl4nlp-1.27.
Reports of organizations on the topic "Visual representation learning"
Tarasenko, Rostyslav O., Svitlana M. Amelina, Yuliya M. Kazhan, and Olga V. Bondarenko. The use of AR elements in the study of foreign languages at the university. CEUR Workshop Proceedings, November 2020. http://dx.doi.org/10.31812/123456789/4421.
Shukla, Indu, Rajeev Agrawal, Kelly Ervin, and Jonathan Boone. AI on digital twin of facility captured by reality scans. Engineer Research and Development Center (U.S.), November 2023. http://dx.doi.org/10.21079/11681/47850.
Iatsyshyn, Anna V., Valeriia O. Kovach, Yevhen O. Romanenko, Iryna I. Deinega, Andrii V. Iatsyshyn, Oleksandr O. Popov, Yulii G. Kutsan, Volodymyr O. Artemchuk, Oleksandr Yu Burov, and Svitlana H. Lytvynova. Application of augmented reality technologies for preparation of specialists of new technological era. [publisher not identified], February 2020. http://dx.doi.org/10.31812/123456789/3749.
Zerla, Pauline. Trauma, Violence Prevention, and Reintegration: Learning from Youth Conflict Narratives in the Central African Republic. RESOLVE Network, February 2024. http://dx.doi.org/10.37805/lpbi2024.1.