Selected scholarly literature on the topic "Visual representation learning"
Cite a source in APA, MLA, Chicago, Harvard, and many other citation styles
Consult the list of current articles, books, theses, conference proceedings, and other scholarly sources on the topic "Visual representation learning".
Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a .pdf and read the abstract (summary) of the work online if it is available in the metadata.
Journal articles on the topic "Visual representation learning"
Lee, Jungmin, and Wongyoung Lee. "Aspects of A Study on the Multi Presentational Metaphor Education Using Online Telestration". Korean Society of Culture and Convergence 44, no. 9 (September 30, 2022): 163–73. http://dx.doi.org/10.33645/cnc.2022.9.44.9.163.
Yang, Chuanguang, Zhulin An, Linhang Cai, and Yongjun Xu. "Mutual Contrastive Learning for Visual Representation Learning". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3045–53. http://dx.doi.org/10.1609/aaai.v36i3.20211.
Khaerun Nisa, Rachmawati, and Reza Muhamad Zaenal. "Analysis Of Students' Mathematical Representation Ability in View of Learning Styles". Indo-MathEdu Intellectuals Journal 4, no. 2 (August 15, 2023): 99–109. http://dx.doi.org/10.54373/imeij.v4i2.119.
Kholilatun, Fiki, Nizaruddin Nizaruddin, and F. X. Didik Purwosetiyono. "Kemampuan Representasi Siswa SMP Kelas VIII dalam Menyelesaikan Soal Cerita Materi Peluang Ditinjau dari Gaya Belajar Visual". Jurnal Kualita Pendidikan 4, no. 1 (April 30, 2023): 54–59. http://dx.doi.org/10.51651/jkp.v4i1.339.
Rif'at, Mohamad, Sudiansyah Sudiansyah, and Khoirunnisa Imama. "Role of visual abilities in mathematics learning: An analysis of conceptual representation". Al-Jabar : Jurnal Pendidikan Matematika 15, no. 1 (June 10, 2024): 87. http://dx.doi.org/10.24042/ajpm.v15i1.22406.
Ruliani, Iva Desi, Nizaruddin Nizaruddin, and Yanuar Hery Murtianto. "Profile Analysis of Mathematical Problem Solving Abilities with Krulik & Rudnick Stages Judging from Medium Visual Representation". JIPM (Jurnal Ilmiah Pendidikan Matematika) 7, no. 1 (September 7, 2018): 22. http://dx.doi.org/10.25273/jipm.v7i1.2123.
Zha, B., and A. Yilmaz. "LEARNING MAPS FOR OBJECT LOCALIZATION USING VISUAL-INERTIAL ODOMETRY". ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-1-2020 (August 3, 2020): 343–50. http://dx.doi.org/10.5194/isprs-annals-v-1-2020-343-2020.
Moghaddam, B., and A. Pentland. "Probabilistic visual learning for object representation". IEEE Transactions on Pattern Analysis and Machine Intelligence 19, no. 7 (July 1997): 696–710. http://dx.doi.org/10.1109/34.598227.
He, Xiangteng, and Yuxin Peng. "Fine-Grained Visual-Textual Representation Learning". IEEE Transactions on Circuits and Systems for Video Technology 30, no. 2 (February 2020): 520–31. http://dx.doi.org/10.1109/tcsvt.2019.2892802.
Liu, Qiyuan, Qi Zhou, Rui Yang, and Jie Wang. "Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8843–51. http://dx.doi.org/10.1609/aaai.v37i7.26063.
Testo completoTesi sul tema "Visual representation learning"
Wang, Zhaoqing. "Self-supervised Visual Representation Learning". Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/29595.
Testo completoZhou, Bolei. "Interpretable representation learning for visual intelligence". Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117837.
Testo completoThis electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 131-140).
Recent progress of deep neural networks in computer vision and machine learning has enabled transformative applications across robotics, healthcare, and security. However, despite the superior performance of the deep neural networks, it remains challenging to understand their inner workings and explain their output predictions. This thesis investigates several novel approaches for opening up the "black box" of neural networks used in visual recognition tasks and understanding their inner working mechanism. I first show that objects and other meaningful concepts emerge as a consequence of recognizing scenes. A network dissection approach is further introduced to automatically identify the internal units as the emergent concept detectors and quantify their interpretability. Then I describe an approach that can efficiently explain the output prediction for any given image. It sheds light on the decision-making process of the networks and why the predictions succeed or fail. Finally, I show some ongoing efforts toward learning efficient and interpretable deep representations for video event understanding and some future directions.
by Bolei Zhou.
Ph. D.
Ben-Younes, Hedi. "Multi-modal representation learning towards visual reasoning". Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS173.
The quantity of images that populate the Internet is dramatically increasing. It becomes of critical importance to develop the technology for a precise and automatic understanding of visual contents. As image recognition systems are becoming more and more relevant, researchers in artificial intelligence now seek the next generation of vision systems that can perform high-level scene understanding. In this thesis, we are interested in Visual Question Answering (VQA), which consists in building models that answer any natural language question about any image. Because of its nature and complexity, VQA is often considered as a proxy for visual reasoning. Classically, VQA architectures are designed as trainable systems that are provided with images, questions about them and their answers. To tackle this problem, typical approaches involve modern Deep Learning (DL) techniques. In the first part, we focus on developing multi-modal fusion strategies to model the interactions between image and question representations. More specifically, we explore bilinear fusion models and exploit concepts from tensor analysis to provide tractable and expressive factorizations of parameters. These fusion mechanisms are studied under the widely used visual attention framework: the answer to the question is provided by focusing only on the relevant image regions. In the last part, we move away from the attention mechanism and build a more advanced scene understanding architecture where we consider objects and their spatial and semantic relations. All models are thoroughly experimentally evaluated on standard datasets and the results are competitive with the literature.
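To give a concrete picture of the kind of fusion strategy this abstract refers to, the following Python (PyTorch) sketch shows a generic low-rank factorized bilinear fusion of an image vector and a question vector. It is an illustration only: the dimensions, rank, and module names are assumed, and this is not necessarily the thesis's exact factorization.

```python
import torch
import torch.nn as nn

class FactorizedBilinearFusion(nn.Module):
    """Low-rank bilinear fusion of an image vector and a question vector.

    Illustrative sketch: a full bilinear interaction is approximated by
    projecting both modalities to a shared low-rank space and taking an
    elementwise product there, which keeps the parameter count tractable.
    """
    def __init__(self, dim_v=2048, dim_q=1024, rank=256, dim_out=512):
        super().__init__()
        self.proj_v = nn.Linear(dim_v, rank)    # project image features
        self.proj_q = nn.Linear(dim_q, rank)    # project question features
        self.proj_out = nn.Linear(rank, dim_out)

    def forward(self, v, q):
        joint = self.proj_v(v) * self.proj_q(q)  # interaction in low-rank space
        return self.proj_out(torch.relu(joint))

# usage sketch with made-up feature sizes
fusion = FactorizedBilinearFusion()
v = torch.randn(8, 2048)   # batch of image features
q = torch.randn(8, 1024)   # batch of question features
out = fusion(v, q)          # (8, 512) fused representation
```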
Sharif, Razavian Ali. "Convolutional Network Representation for Visual Recognition". Doctoral thesis, KTH, Robotik, perception och lärande, RPL, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-197919.
Yu, Mengyang. "Feature reduction and representation learning for visual applications". Thesis, Northumbria University, 2016. http://nrl.northumbria.ac.uk/30222/.
Testo completoVenkataramanan, Shashanka. "Metric learning for instance and category-level visual representation". Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS022.
The primary goal in computer vision is to enable machines to extract meaningful information from visual data, such as images and videos, and leverage this information to perform a wide range of tasks. To this end, substantial research has focused on developing deep learning models capable of encoding comprehensive and robust visual representations. A prominent strategy in this context involves pretraining models on large-scale datasets, such as ImageNet, to learn representations that can exhibit cross-task applicability and facilitate the successful handling of diverse downstream tasks with minimal effort. To facilitate learning on these large-scale datasets and encode good representations, complex data augmentation strategies have been used. However, these augmentations can be limited in their scope, either being hand-crafted and lacking diversity, or generating images that appear unnatural. Moreover, the focus of these augmentation techniques has primarily been on the ImageNet dataset and its downstream tasks, limiting their applicability to a broader range of computer vision problems. In this thesis, we aim to tackle these limitations by exploring different approaches to enhance the efficiency and effectiveness of representation learning. The common thread across the works presented is the use of interpolation-based techniques, such as mixup, to generate diverse and informative training examples beyond the original dataset. In the first work, we are motivated by the idea of deformation as a natural way of interpolating images rather than using a convex combination. We show that geometrically aligning the two images in the feature space allows for more natural interpolation that retains the geometry of one image and the texture of the other, connecting it to style transfer. Drawing from these observations, we explore the combination of mixup and deep metric learning. We develop a generalized formulation that accommodates mixup in metric learning, leading to improved representations that explore areas of the embedding space beyond the training classes. Building on these insights, we revisit the original motivation of mixup and generate a larger number of interpolated examples beyond the mini-batch size by interpolating in the embedding space. This approach allows us to sample on the entire convex hull of the mini-batch, rather than just along linear segments between pairs of examples. Finally, we investigate the potential of using natural augmentations of objects from videos. We introduce a "Walking Tours" dataset of first-person egocentric videos, which capture a diverse range of objects and actions in natural scene transitions. We then propose a novel self-supervised pretraining method called DoRA, which detects and tracks objects in video frames, deriving multiple views from the tracks and using them in a self-supervised manner.
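As an illustration of the interpolation-based techniques this abstract mentions, the following Python (PyTorch) sketch applies pairwise mixup directly in the embedding space. The function name, Beta parameter, and dimensions are assumptions made for illustration; the thesis's own formulation (e.g., sampling the full convex hull of the mini-batch) is more general than this pairwise version.

```python
import torch
import torch.nn.functional as F

def embedding_mixup(z, y, alpha=0.2):
    """Interpolate random pairs of embeddings and their (soft) labels.

    Hypothetical sketch of embedding-space mixup: a single Beta-distributed
    coefficient mixes each embedding with a randomly permuted partner.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(z.size(0))
    z_mix = lam * z + (1 - lam) * z[perm]   # mixed embeddings
    y_mix = lam * y + (1 - lam) * y[perm]   # mixed soft labels
    return z_mix, y_mix

# usage sketch: z would come from an encoder, y are one-hot labels
z = torch.randn(16, 128)
y = F.one_hot(torch.randint(0, 10, (16,)), num_classes=10).float()
z_mix, y_mix = embedding_mixup(z, y)
```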
Li, Nuo Ph D. Massachusetts Institute of Technology. "Unsupervised learning of invariant object representation in primate visual cortex". Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/65288.
Cataloged from PDF version of thesis.
Includes bibliographical references.
Visual object recognition (categorization and identification) is one of the most fundamental cognitive functions for our survival. Our visual system has the remarkable ability to convey to us visual object and category information in a manner that is largely tolerant ("invariant") to the exact position, size, pose of the object, illumination, and clutter. The ventral visual stream in non-human primate has solved this problem. At the highest stage of the visual hierarchy, the inferior temporal cortex (IT), neurons have selectivity for objects and maintain that selectivity across variations in the images. A reasonably sized population of these tolerant neurons can support object recognition. However, we do not yet understand how IT neurons construct this neuronal tolerance. The aim of this thesis is to tackle this question and to examine the hypothesis that the ventral visual stream may leverage experience to build its neuronal tolerance. One potentially powerful idea is that time can act as an implicit teacher, in that each object's identity tends to remain temporally stable, thus different retinal images of the same object are temporally contiguous. In theory, the ventral stream could take advantage of this natural tendency and learn to associate together the neuronal representations of temporally contiguous retinal images to yield tolerant object selectivity in IT cortex. In this thesis, I report neuronal support for this hypothesis in IT of non-human primates. First, targeted alteration of temporally contiguous experience with object images at different retinal positions rapidly reshaped IT neurons' position tolerance. Second, similar temporal contiguity manipulation of experience with object images at different sizes similarly reshaped IT size tolerance. These instances of experience-induced effect were similar in magnitude, grew gradually stronger with increasing visual experience, and the size of the effect was large. Taken together, these studies show that unsupervised, temporally contiguous experience can reshape and build at least two types of IT tolerance, and that they can do so under a wide range of spatiotemporal regimes encountered during natural visual exploration. These results suggest that the ventral visual stream uses temporal contiguity visual experience with a general unsupervised tolerance learning (UTL) mechanism to build its invariant object representation.
by Nuo Li.
Ph.D.
Dalens, Théophile. "Learnable factored image representation for visual discovery". Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE036.
This thesis proposes an approach for analyzing unpaired visual data annotated with time stamps by generating how images would have looked if they were from different times. To isolate and transfer time-dependent appearance variations, we introduce a new trainable bilinear factor separation module. We analyze its relation to classical factored representations and concatenation-based auto-encoders. We demonstrate this new module has clear advantages compared to standard concatenation when used in a bottleneck encoder-decoder convolutional neural network architecture. We also show that it can be inserted in a recent adversarial image translation architecture, enabling the image transformation to multiple different target time periods using a single network.
Jonaityte, Inga <1981>. "Visual representation and financial decision making". Doctoral thesis, Università Ca' Foscari Venezia, 2014. http://hdl.handle.net/10579/4593.
This thesis experimentally investigates the effects of visual representations on financial decisions. We hypothesize that visual representations of financial information can influence decisions. To test these hypotheses, we conducted online experiments and showed that the choice of visual representation leads to changes in the attention paid to, comprehension of, and evaluation of the information. The second study concerns the ability of financial advisors to offer expert judgment to help inexperienced consumers with financial decisions. We found that advertising content significantly influences experts and non-experts alike, which offers a new perspective on the decisions of financial advisors. The third topic concerns learning from multidimensional information, adapting to change, and developing new strategies. We investigated the effects of cue importance and of changes in the decision environment on learning. Sudden transformations in the decision environment are more damaging than gradual ones.
Büchler, Uta [Verfasser], and Björn [Akademischer Betreuer] Ommer. "Visual Representation Learning with Minimal Supervision / Uta Büchler ; Betreuer: Björn Ommer". Heidelberg : Universitätsbibliothek Heidelberg, 2021. http://d-nb.info/1225868505/34.
Books on the topic "Visual representation learning"
Virk, Satyugjit Singh. Learning STEM Through Integrative Visual Representation. [New York, N.Y.?]: [publisher not identified], 2013.
Zhang, Zheng. Binary Representation Learning on Visual Images. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2.
Cheng, Hong. Sparse Representation, Modeling and Learning in Visual Recognition. London: Springer London, 2015. http://dx.doi.org/10.1007/978-1-4471-6714-3.
McBride, Kecia Driver, 1966-, ed. Visual media and the humanities: A pedagogy of representation. Knoxville: University of Tennessee Press, 2004.
Zareian, Alireza. Learning Structured Representations for Understanding Visual and Multimedia Data. [New York, N.Y.?]: [publisher not identified], 2021.
Rumiati, Raffaella I., and Alfonso Caramazza, eds. The Multiple functions of sensory-motor representations. Hove: Psychology Press, 2005.
Spiliotopoulou-Papantoniou, Vasiliki. The changing role of visual representations as a tool for research and learning. Hauppauge, N.Y.: Nova Science Publishers, 2011.
Learning-Based Local Visual Representation and Indexing. Elsevier, 2015. http://dx.doi.org/10.1016/c2014-0-01997-1.
Ji, Rongrong, Yue Gao, Ling-Yu Duan, Qionghai Dai, and Hongxun Yao. Learning-Based Local Visual Representation and Indexing. Elsevier Science & Technology Books, 2015.
Learning-Based Local Visual Representation and Indexing. Elsevier Science & Technology Books, 2015.
Book chapters on the topic "Visual representation learning"
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Representation Learning". In Visual Question Answering, 111–17. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_7.
Zhang, Zheng. "Correction to: Binary Representation Learning on Visual Images: Learning to Hash for Similarity Search". In Binary Representation Learning on Visual Images, C1–C2. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_8.
Zhang, Zheng. "Deep Collaborative Graph Hashing". In Binary Representation Learning on Visual Images, 143–67. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_6.
Zhang, Zheng. "Scalable Supervised Asymmetric Hashing". In Binary Representation Learning on Visual Images, 17–50. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_2.
Zhang, Zheng. "Probability Ordinal-Preserving Semantic Hashing". In Binary Representation Learning on Visual Images, 81–109. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_4.
Zhang, Zheng. "Introduction". In Binary Representation Learning on Visual Images, 1–16. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_1.
Zhang, Zheng. "Semantic-Aware Adversarial Training". In Binary Representation Learning on Visual Images, 169–97. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_7.
Zhang, Zheng. "Inductive Structure Consistent Hashing". In Binary Representation Learning on Visual Images, 51–80. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_3.
Zhang, Zheng. "Ordinal-Preserving Latent Graph Hashing". In Binary Representation Learning on Visual Images, 111–41. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_5.
Guo, Tan, Lei Zhang, and Xiaoheng Tan. "Extreme Latent Representation Learning for Visual Classification". In Proceedings in Adaptation, Learning and Optimization, 65–75. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-23307-5_8.
Conference papers on the topic "Visual representation learning"
Chen, Guikun, Xia Li, Yi Yang, and Wenguan Wang. "Neural Clustering Based Visual Representation Learning". In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5714–25. IEEE, 2024. http://dx.doi.org/10.1109/cvpr52733.2024.00546.
Brack, Viktor, and Dominik Koßmann. "Local Representation Learning Using Visual Priors for Remote Sensing". In IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, 8263–67. IEEE, 2024. http://dx.doi.org/10.1109/igarss53475.2024.10641131.
Xie, Ruobing, Zhiyuan Liu, Huanbo Luan, and Maosong Sun. "Image-embodied Knowledge Representation Learning". In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/438.
Li, Zechao. "Understanding-oriented visual representation learning". In the 7th International Conference. New York, New York, USA: ACM Press, 2015. http://dx.doi.org/10.1145/2808492.2808572.
Lee, Donghun, Seonghyun Kim, Samyeul Noh, Heechul Bae, and Ingook Jang. "High-level Visual Representation via Perceptual Representation Learning". In 2023 14th International Conference on Information and Communication Technology Convergence (ICTC). IEEE, 2023. http://dx.doi.org/10.1109/ictc58733.2023.10393558.
Kolesnikov, Alexander, Xiaohua Zhai, and Lucas Beyer. "Revisiting Self-Supervised Visual Representation Learning". In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019. http://dx.doi.org/10.1109/cvpr.2019.00202.
Sariyildiz, Mert Bulent, Yannis Kalantidis, Diane Larlus, and Karteek Alahari. "Concept Generalization in Visual Representation Learning". In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00949.
Özçelik, Timoteos Onur, Berk Gökberk, and Lale Akarun. "Self-Supervised Dense Visual Representation Learning". In 2024 32nd Signal Processing and Communications Applications Conference (SIU). IEEE, 2024. http://dx.doi.org/10.1109/siu61531.2024.10600771.
Broscheit, Samuel. "Learning Distributional Token Representations from Visual Features". In Proceedings of The Third Workshop on Representation Learning for NLP. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/w18-3025.
Hong, Xudong, Vera Demberg, Asad Sayeed, Qiankun Zheng, and Bernt Schiele. "Visual Coherence Loss for Coherent and Visually Grounded Story Generation". In Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP 2023). Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.repl4nlp-1.27.
Reports by organizations on the topic "Visual representation learning"
Tarasenko, Rostyslav O., Svitlana M. Amelina, Yuliya M. Kazhan, and Olga V. Bondarenko. The use of AR elements in the study of foreign languages at the university. CEUR Workshop Proceedings, November 2020. http://dx.doi.org/10.31812/123456789/4421.
Shukla, Indu, Rajeev Agrawal, Kelly Ervin, and Jonathan Boone. AI on digital twin of facility captured by reality scans. Engineer Research and Development Center (U.S.), November 2023. http://dx.doi.org/10.21079/11681/47850.
Iatsyshyn, Anna V., Valeriia O. Kovach, Yevhen O. Romanenko, Iryna I. Deinega, Andrii V. Iatsyshyn, Oleksandr O. Popov, Yulii G. Kutsan, Volodymyr O. Artemchuk, Oleksandr Yu Burov, and Svitlana H. Lytvynova. Application of augmented reality technologies for preparation of specialists of new technological era. [б. в.], February 2020. http://dx.doi.org/10.31812/123456789/3749.
Zerla, Pauline. Trauma, Violence Prevention, and Reintegration: Learning from Youth Conflict Narratives in the Central African Republic. RESOLVE Network, February 2024. http://dx.doi.org/10.37805/lpbi2024.1.