Selected scientific literature on the topic "Visual representation learning"
Create an accurate reference in APA, MLA, Chicago, Harvard, and other styles
Consult the list of current articles, books, theses, conference proceedings, and other scientific sources relevant to the topic "Visual representation learning".
Next to each source in the reference list there is an "Add to bibliography" button. Click it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scientific publication in .pdf format and read the abstract of the work online, if it is available in the metadata.
Journal articles on the topic "Visual representation learning"
Lee, Jungmin, and Wongyoung Lee. "Aspects of A Study on the Multi Presentational Metaphor Education Using Online Telestration". Korean Society of Culture and Convergence 44, no. 9 (September 30, 2022): 163–73. http://dx.doi.org/10.33645/cnc.2022.9.44.9.163.
Yang, Chuanguang, Zhulin An, Linhang Cai, and Yongjun Xu. "Mutual Contrastive Learning for Visual Representation Learning". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (June 28, 2022): 3045–53. http://dx.doi.org/10.1609/aaai.v36i3.20211.
Khaerun Nisa, Rachmawati, and Reza Muhamad Zaenal. "Analysis Of Students' Mathematical Representation Ability in View of Learning Styles". Indo-MathEdu Intellectuals Journal 4, no. 2 (August 15, 2023): 99–109. http://dx.doi.org/10.54373/imeij.v4i2.119.
Kholilatun, Fiki, Nizaruddin Nizaruddin, and F. X. Didik Purwosetiyono. "Kemampuan Representasi Siswa SMP Kelas VIII dalam Menyelesaikan Soal Cerita Materi Peluang Ditinjau dari Gaya Belajar Visual". Jurnal Kualita Pendidikan 4, no. 1 (April 30, 2023): 54–59. http://dx.doi.org/10.51651/jkp.v4i1.339.
Rif'at, Mohamad, Sudiansyah Sudiansyah, and Khoirunnisa Imama. "Role of visual abilities in mathematics learning: An analysis of conceptual representation". Al-Jabar : Jurnal Pendidikan Matematika 15, no. 1 (June 10, 2024): 87. http://dx.doi.org/10.24042/ajpm.v15i1.22406.
Ruliani, Iva Desi, Nizaruddin Nizaruddin, and Yanuar Hery Murtianto. "Profile Analysis of Mathematical Problem Solving Abilities with Krulik & Rudnick Stages Judging from Medium Visual Representation". JIPM (Jurnal Ilmiah Pendidikan Matematika) 7, no. 1 (September 7, 2018): 22. http://dx.doi.org/10.25273/jipm.v7i1.2123.
Zha, B., and A. Yilmaz. "LEARNING MAPS FOR OBJECT LOCALIZATION USING VISUAL-INERTIAL ODOMETRY". ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences V-1-2020 (August 3, 2020): 343–50. http://dx.doi.org/10.5194/isprs-annals-v-1-2020-343-2020.
Moghaddam, B., and A. Pentland. "Probabilistic visual learning for object representation". IEEE Transactions on Pattern Analysis and Machine Intelligence 19, no. 7 (July 1997): 696–710. http://dx.doi.org/10.1109/34.598227.
He, Xiangteng, and Yuxin Peng. "Fine-Grained Visual-Textual Representation Learning". IEEE Transactions on Circuits and Systems for Video Technology 30, no. 2 (February 2020): 520–31. http://dx.doi.org/10.1109/tcsvt.2019.2892802.
Liu, Qiyuan, Qi Zhou, Rui Yang, and Jie Wang. "Robust Representation Learning by Clustering with Bisimulation Metrics for Visual Reinforcement Learning with Distractions". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 7 (June 26, 2023): 8843–51. http://dx.doi.org/10.1609/aaai.v37i7.26063.
Theses / dissertations on the topic "Visual representation learning"
Wang, Zhaoqing. "Self-supervised Visual Representation Learning". Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/29595.
Zhou, Bolei. "Interpretable representation learning for visual intelligence". Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117837.
Texto completo da fonteThis electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 131-140).
Recent progress of deep neural networks in computer vision and machine learning has enabled transformative applications across robotics, healthcare, and security. However, despite the superior performance of the deep neural networks, it remains challenging to understand their inner workings and explain their output predictions. This thesis investigates several novel approaches for opening up the "black box" of neural networks used in visual recognition tasks and understanding their inner working mechanism. I first show that objects and other meaningful concepts emerge as a consequence of recognizing scenes. A network dissection approach is further introduced to automatically identify the internal units as the emergent concept detectors and quantify their interpretability. Then I describe an approach that can efficiently explain the output prediction for any given image. It sheds light on the decision-making process of the networks and why the predictions succeed or fail. Finally, I show some ongoing efforts toward learning efficient and interpretable deep representations for video event understanding and some future directions.
by Bolei Zhou.
Ph. D.
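To make the kind of per-image explanation described in the abstract above concrete, the following is a minimal, hypothetical sketch of a class-activation-style heatmap computed with NumPy. It assumes a convolutional network whose last convolutional feature maps are globally average-pooled before a linear classifier; the function name, shapes, and toy inputs are illustrative and are not taken from the thesis.

    import numpy as np

    def class_activation_map(feature_maps, class_weights):
        # feature_maps: (C, H, W) activations from the last convolutional layer
        # class_weights: (C,) linear-classifier weights for one target class
        # Weight each channel by its contribution to the class and sum spatially.
        cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))  # (H, W)
        cam = np.maximum(cam, 0.0)          # keep positive evidence only
        if cam.max() > 0:
            cam = cam / cam.max()           # normalize to [0, 1] for display
        return cam

    # Toy example: 4 channels over a 7x7 spatial grid.
    fmap = np.random.rand(4, 7, 7)
    w = np.array([0.5, -0.2, 0.8, 0.1])
    print(class_activation_map(fmap, w).shape)  # (7, 7)

Upsampling the resulting map to the input resolution and overlaying it on the image gives a coarse visualization of which regions supported the prediction.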
Ben-Younes, Hedi. "Multi-modal representation learning towards visual reasoning". Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS173.
The quantity of images that populate the Internet is increasing dramatically. It has become critically important to develop technology for the precise and automatic understanding of visual content. As image recognition systems become more and more relevant, researchers in artificial intelligence now seek the next generation of vision systems, able to perform high-level scene understanding. In this thesis, we are interested in Visual Question Answering (VQA), which consists in building models that answer any natural language question about any image. Because of its nature and complexity, VQA is often considered a proxy for visual reasoning. Classically, VQA architectures are designed as trainable systems that are provided with images, questions about them, and their answers. To tackle this problem, typical approaches involve modern Deep Learning (DL) techniques. In the first part, we focus on developing multi-modal fusion strategies to model the interactions between image and question representations. More specifically, we explore bilinear fusion models and exploit concepts from tensor analysis to provide tractable and expressive factorizations of parameters. These fusion mechanisms are studied under the widely used visual attention framework: the answer to the question is provided by focusing only on the relevant image regions. In the last part, we move away from the attention mechanism and build a more advanced scene understanding architecture where we consider objects and their spatial and semantic relations. All models are thoroughly evaluated experimentally on standard datasets, and the results are competitive with the literature.
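The factorized bilinear fusion mentioned in the abstract can be sketched as a low-rank bilinear interaction between an image vector and a question vector. The following is a simplified, hypothetical PyTorch module: the class name, the rank-space element-wise product, and all dimensions are illustrative assumptions, not the exact factorization developed in the thesis.

    import torch
    import torch.nn as nn

    class LowRankBilinearFusion(nn.Module):
        # Project both modalities to a shared rank-R space, combine them with an
        # element-wise product (a low-rank surrogate for a full bilinear map),
        # then project to the output dimension.
        def __init__(self, dim_v, dim_q, rank, dim_out):
            super().__init__()
            self.proj_v = nn.Linear(dim_v, rank)
            self.proj_q = nn.Linear(dim_q, rank)
            self.proj_out = nn.Linear(rank, dim_out)

        def forward(self, v, q):
            joint = self.proj_v(v) * self.proj_q(q)
            return self.proj_out(torch.tanh(joint))

    fusion = LowRankBilinearFusion(dim_v=2048, dim_q=1024, rank=256, dim_out=512)
    v = torch.randn(8, 2048)   # batch of image features
    q = torch.randn(8, 1024)   # batch of question features
    print(fusion(v, q).shape)  # torch.Size([8, 512])

A full bilinear map between a 2048-d and a 1024-d vector would need roughly two million parameters per output unit; the low-rank projection is what keeps this kind of fusion tractable.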
Sharif, Razavian Ali. "Convolutional Network Representation for Visual Recognition". Doctoral thesis, KTH, Robotik, perception och lärande, RPL, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-197919.
Yu, Mengyang. "Feature reduction and representation learning for visual applications". Thesis, Northumbria University, 2016. http://nrl.northumbria.ac.uk/30222/.
Texto completo da fonteVenkataramanan, Shashanka. "Metric learning for instance and category-level visual representation". Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS022.
The primary goal in computer vision is to enable machines to extract meaningful information from visual data, such as images and videos, and leverage this information to perform a wide range of tasks. To this end, substantial research has focused on developing deep learning models capable of encoding comprehensive and robust visual representations. A prominent strategy in this context involves pretraining models on large-scale datasets, such as ImageNet, to learn representations that can exhibit cross-task applicability and facilitate the successful handling of diverse downstream tasks with minimal effort. To facilitate learning on these large-scale datasets and encode good representations, complex data augmentation strategies have been used. However, these augmentations can be limited in their scope, either being hand-crafted and lacking diversity, or generating images that appear unnatural. Moreover, the focus of these augmentation techniques has primarily been on the ImageNet dataset and its downstream tasks, limiting their applicability to a broader range of computer vision problems. In this thesis, we aim to tackle these limitations by exploring different approaches to enhance the efficiency and effectiveness of representation learning. The common thread across the works presented is the use of interpolation-based techniques, such as mixup, to generate diverse and informative training examples beyond the original dataset. In the first work, we are motivated by the idea of deformation as a natural way of interpolating images rather than using a convex combination. We show that geometrically aligning the two images in the feature space allows for more natural interpolation that retains the geometry of one image and the texture of the other, connecting it to style transfer. Drawing from these observations, we explore the combination of mixup and deep metric learning. We develop a generalized formulation that accommodates mixup in metric learning, leading to improved representations that explore areas of the embedding space beyond the training classes. Building on these insights, we revisit the original motivation of mixup and generate a larger number of interpolated examples beyond the mini-batch size by interpolating in the embedding space. This approach allows us to sample on the entire convex hull of the mini-batch, rather than just along linear segments between pairs of examples. Finally, we investigate the potential of using natural augmentations of objects from videos. We introduce a "Walking Tours" dataset of first-person egocentric videos, which capture a diverse range of objects and actions in natural scene transitions. We then propose a novel self-supervised pretraining method called DoRA, which detects and tracks objects in video frames, deriving multiple views from the tracks and using them in a self-supervised manner.
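The embedding-space interpolation described above can be illustrated with a mixup-style routine over feature vectors and their labels. This is a hedged sketch under standard mixup assumptions (mixing coefficients drawn from a Beta distribution); the function name and shapes are hypothetical and do not reproduce the thesis's exact formulation.

    import numpy as np

    def embedding_mixup(embeddings, labels, num_classes, alpha=0.2, rng=None):
        # Mix random pairs of embeddings and their one-hot labels with
        # coefficients lambda ~ Beta(alpha, alpha), one per pair.
        rng = rng or np.random.default_rng()
        n = embeddings.shape[0]
        perm = rng.permutation(n)
        lam = rng.beta(alpha, alpha, size=(n, 1))
        one_hot = np.eye(num_classes)[labels]
        mixed_x = lam * embeddings + (1 - lam) * embeddings[perm]
        mixed_y = lam * one_hot + (1 - lam) * one_hot[perm]
        return mixed_x, mixed_y

    x = np.random.randn(16, 128)              # a mini-batch of 128-d embeddings
    y = np.random.randint(0, 10, size=16)
    mx, my = embedding_mixup(x, y, num_classes=10)
    print(mx.shape, my.shape)                 # (16, 128) (16, 10)

Mixing in the embedding space rather than in pixel space is what allows sampling many more interpolated examples than the mini-batch contains, as the abstract notes.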
Li, Nuo Ph D. Massachusetts Institute of Technology. "Unsupervised learning of invariant object representation in primate visual cortex". Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/65288.
Cataloged from PDF version of thesis.
Includes bibliographical references.
Visual object recognition (categorization and identification) is one of the most fundamental cognitive functions for our survival. Our visual system has the remarkable ability to convey to us visual object and category information in a manner that is largely tolerant ("invariant") to the exact position, size, pose of the object, illumination, and clutter. The ventral visual stream in the non-human primate has solved this problem. At the highest stage of the visual hierarchy, the inferior temporal cortex (IT), neurons have selectivity for objects and maintain that selectivity across variations in the images. A reasonably sized population of these tolerant neurons can support object recognition. However, we do not yet understand how IT neurons construct this neuronal tolerance. The aim of this thesis is to tackle this question and to examine the hypothesis that the ventral visual stream may leverage experience to build its neuronal tolerance. One potentially powerful idea is that time can act as an implicit teacher, in that each object's identity tends to remain temporally stable, so different retinal images of the same object are temporally contiguous. In theory, the ventral stream could take advantage of this natural tendency and learn to associate together the neuronal representations of temporally contiguous retinal images to yield tolerant object selectivity in IT cortex. In this thesis, I report neuronal support for this hypothesis in IT of non-human primates. First, targeted alteration of temporally contiguous experience with object images at different retinal positions rapidly reshaped IT neurons' position tolerance. Second, a similar temporal contiguity manipulation of experience with object images at different sizes similarly reshaped IT size tolerance. These experience-induced effects were similar in magnitude, grew gradually stronger with increasing visual experience, and were large in size. Taken together, these studies show that unsupervised, temporally contiguous experience can reshape and build at least two types of IT tolerance, and that it can do so under a wide range of spatiotemporal regimes encountered during natural visual exploration. These results suggest that the ventral visual stream uses temporally contiguous visual experience, through a general unsupervised tolerance learning (UTL) mechanism, to build its invariant object representation.
by Nuo Li.
Ph.D.
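The temporal-contiguity idea in the abstract above, that temporally adjacent retinal images of an object should map to similar representations, also underlies several self-supervised objectives in machine learning. The following is a minimal, hypothetical sketch of such an objective on frame embeddings; it illustrates the principle only and is not a model of IT cortex or of the experiments reported in the thesis.

    import torch
    import torch.nn.functional as F

    def temporal_contiguity_loss(z_t, z_next):
        # Pull embeddings of temporally adjacent frames together by
        # maximizing their cosine similarity (lower loss = more similar).
        z_t = F.normalize(z_t, dim=-1)
        z_next = F.normalize(z_next, dim=-1)
        return 1.0 - (z_t * z_next).sum(dim=-1).mean()

    z1 = torch.randn(32, 256)   # embeddings of frames at time t
    z2 = torch.randn(32, 256)   # embeddings of the same scenes at time t+1
    print(temporal_contiguity_loss(z1, z2).item())

In practice such an objective is paired with a mechanism that prevents all embeddings from collapsing to a single point, for example contrastive negatives or a stop-gradient branch.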
Dalens, Théophile. "Learnable factored image representation for visual discovery". Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE036.
This thesis proposes an approach for analyzing unpaired visual data annotated with time stamps by generating how images would have looked had they been created at different times. To isolate and transfer time-dependent appearance variations, we introduce a new trainable bilinear factor separation module. We analyze its relation to classical factored representations and concatenation-based auto-encoders. We demonstrate that this new module has clear advantages compared to standard concatenation when used in a bottleneck encoder-decoder convolutional neural network architecture. We also show that it can be inserted in a recent adversarial image translation architecture, enabling image transformation to multiple different target time periods using a single network.
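The trainable bilinear factor separation module mentioned in the abstract can be sketched, under loose assumptions, as a bilinear combination of a content code and a time code, so that time-dependent appearance is changed by swapping the time code. The module below is an illustrative reconstruction in PyTorch; the class name, dimensions, and the surrounding encoder-decoder are hypothetical rather than the architecture defined in the thesis.

    import torch
    import torch.nn as nn

    class BilinearFactorSeparation(nn.Module):
        # Combine a content code c with a time code t through a learned
        # bilinear map; swapping t re-renders the content for another period.
        def __init__(self, dim_content, dim_time, dim_out):
            super().__init__()
            self.bilinear = nn.Bilinear(dim_content, dim_time, dim_out)

        def forward(self, content, time_code):
            return self.bilinear(content, time_code)

    module = BilinearFactorSeparation(dim_content=64, dim_time=8, dim_out=64)
    c = torch.randn(4, 64)           # content codes for four images
    t = torch.randn(4, 8)            # embedding of a target time period
    print(module(c, t).shape)        # torch.Size([4, 64])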
Jonaityte, Inga <1981>. "Visual representation and financial decision making". Doctoral thesis, Università Ca' Foscari Venezia, 2014. http://hdl.handle.net/10579/4593.
This thesis experimentally investigates the effects of visual representations on financial decisions. We hypothesize that visual representations of financial information can influence decisions. To test these hypotheses, we conducted online experiments and showed that the choice of visual representation leads to changes in attention, comprehension, and evaluation of information. The second study concerns the ability of financial advisors to offer expert judgment to help inexperienced consumers with financial decisions. We found that advertising content significantly influences experts and non-experts alike, which offers a new perspective on the decisions of financial advisors. The third topic concerns learning from multidimensional information, adapting to change, and developing new strategies. We investigated the effects of cue importance and of changes in the decision environment on learning. Sudden changes in the decision environment are more damaging than gradual ones.
Büchler, Uta [Verfasser], and Björn [Akademischer Betreuer] Ommer. "Visual Representation Learning with Minimal Supervision / Uta Büchler ; Betreuer: Björn Ommer". Heidelberg : Universitätsbibliothek Heidelberg, 2021. http://d-nb.info/1225868505/34.
Books on the topic "Visual representation learning"
Virk, Satyugjit Singh. Learning STEM Through Integrative Visual Representation. [New York, N.Y.?]: [publisher not identified], 2013.
Zhang, Zheng. Binary Representation Learning on Visual Images. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2.
Cheng, Hong. Sparse Representation, Modeling and Learning in Visual Recognition. London: Springer London, 2015. http://dx.doi.org/10.1007/978-1-4471-6714-3.
McBride, Kecia Driver, 1966-, ed. Visual media and the humanities: A pedagogy of representation. Knoxville: University of Tennessee Press, 2004.
Zareian, Alireza. Learning Structured Representations for Understanding Visual and Multimedia Data. [New York, N.Y.?]: [publisher not identified], 2021.
Rumiati, Raffaella I., and Alfonso Caramazza, eds. The Multiple functions of sensory-motor representations. Hove: Psychology Press, 2005.
Spiliotopoulou-Papantoniou, Vasiliki. The changing role of visual representations as a tool for research and learning. Hauppauge, N.Y.: Nova Science Publishers, 2011.
Learning-Based Local Visual Representation and Indexing. Elsevier, 2015. http://dx.doi.org/10.1016/c2014-0-01997-1.
Ji, Rongrong, Yue Gao, Ling-Yu Duan, Qionghai Dai, and Hongxun Yao. Learning-Based Local Visual Representation and Indexing. Elsevier Science & Technology Books, 2015.
Learning-Based Local Visual Representation and Indexing. Elsevier Science & Technology Books, 2015.
Book chapters on the topic "Visual representation learning"
Wu, Qi, Peng Wang, Xin Wang, Xiaodong He, and Wenwu Zhu. "Video Representation Learning". In Visual Question Answering, 111–17. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0964-1_7.
Zhang, Zheng. "Correction to: Binary Representation Learning on Visual Images: Learning to Hash for Similarity Search". In Binary Representation Learning on Visual Images, C1–C2. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_8.
Zhang, Zheng. "Deep Collaborative Graph Hashing". In Binary Representation Learning on Visual Images, 143–67. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_6.
Zhang, Zheng. "Scalable Supervised Asymmetric Hashing". In Binary Representation Learning on Visual Images, 17–50. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_2.
Zhang, Zheng. "Probability Ordinal-Preserving Semantic Hashing". In Binary Representation Learning on Visual Images, 81–109. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_4.
Zhang, Zheng. "Introduction". In Binary Representation Learning on Visual Images, 1–16. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_1.
Zhang, Zheng. "Semantic-Aware Adversarial Training". In Binary Representation Learning on Visual Images, 169–97. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_7.
Zhang, Zheng. "Inductive Structure Consistent Hashing". In Binary Representation Learning on Visual Images, 51–80. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_3.
Zhang, Zheng. "Ordinal-Preserving Latent Graph Hashing". In Binary Representation Learning on Visual Images, 111–41. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-2112-2_5.
Texto completo da fonteGuo, Tan, Lei Zhang e Xiaoheng Tan. "Extreme Latent Representation Learning for Visual Classification". In Proceedings in Adaptation, Learning and Optimization, 65–75. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-23307-5_8.
Conference papers on the topic "Visual representation learning"
Chen, Guikun, Xia Li, Yi Yang, and Wenguan Wang. "Neural Clustering Based Visual Representation Learning". In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5714–25. IEEE, 2024. http://dx.doi.org/10.1109/cvpr52733.2024.00546.
Brack, Viktor, and Dominik Koßmann. "Local Representation Learning Using Visual Priors for Remote Sensing". In IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, 8263–67. IEEE, 2024. http://dx.doi.org/10.1109/igarss53475.2024.10641131.
Xie, Ruobing, Zhiyuan Liu, Huanbo Luan, and Maosong Sun. "Image-embodied Knowledge Representation Learning". In Twenty-Sixth International Joint Conference on Artificial Intelligence. California: International Joint Conferences on Artificial Intelligence Organization, 2017. http://dx.doi.org/10.24963/ijcai.2017/438.
Li, Zechao. "Understanding-oriented visual representation learning". In the 7th International Conference. New York, New York, USA: ACM Press, 2015. http://dx.doi.org/10.1145/2808492.2808572.
Lee, Donghun, Seonghyun Kim, Samyeul Noh, Heechul Bae, and Ingook Jang. "High-level Visual Representation via Perceptual Representation Learning". In 2023 14th International Conference on Information and Communication Technology Convergence (ICTC). IEEE, 2023. http://dx.doi.org/10.1109/ictc58733.2023.10393558.
Kolesnikov, Alexander, Xiaohua Zhai, and Lucas Beyer. "Revisiting Self-Supervised Visual Representation Learning". In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2019. http://dx.doi.org/10.1109/cvpr.2019.00202.
Sariyildiz, Mert Bulent, Yannis Kalantidis, Diane Larlus, and Karteek Alahari. "Concept Generalization in Visual Representation Learning". In 2021 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE, 2021. http://dx.doi.org/10.1109/iccv48922.2021.00949.
Özçelik, Timoteos Onur, Berk Gökberk, and Lale Akarun. "Self-Supervised Dense Visual Representation Learning". In 2024 32nd Signal Processing and Communications Applications Conference (SIU). IEEE, 2024. http://dx.doi.org/10.1109/siu61531.2024.10600771.
Broscheit, Samuel. "Learning Distributional Token Representations from Visual Features". In Proceedings of The Third Workshop on Representation Learning for NLP. Stroudsburg, PA, USA: Association for Computational Linguistics, 2018. http://dx.doi.org/10.18653/v1/w18-3025.
Hong, Xudong, Vera Demberg, Asad Sayeed, Qiankun Zheng, and Bernt Schiele. "Visual Coherence Loss for Coherent and Visually Grounded Story Generation". In Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP 2023). Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.repl4nlp-1.27.
Reports of organizations on the topic "Visual representation learning"
Tarasenko, Rostyslav O., Svitlana M. Amelina, Yuliya M. Kazhan, and Olga V. Bondarenko. The use of AR elements in the study of foreign languages at the university. CEUR Workshop Proceedings, November 2020. http://dx.doi.org/10.31812/123456789/4421.
Shukla, Indu, Rajeev Agrawal, Kelly Ervin, and Jonathan Boone. AI on digital twin of facility captured by reality scans. Engineer Research and Development Center (U.S.), November 2023. http://dx.doi.org/10.21079/11681/47850.
Iatsyshyn, Anna V., Valeriia O. Kovach, Yevhen O. Romanenko, Iryna I. Deinega, Andrii V. Iatsyshyn, Oleksandr O. Popov, Yulii G. Kutsan, Volodymyr O. Artemchuk, Oleksandr Yu Burov, and Svitlana H. Lytvynova. Application of augmented reality technologies for preparation of specialists of new technological era. [б. в.], February 2020. http://dx.doi.org/10.31812/123456789/3749.
Zerla, Pauline. Trauma, Violence Prevention, and Reintegration: Learning from Youth Conflict Narratives in the Central African Republic. RESOLVE Network, February 2024. http://dx.doi.org/10.37805/lpbi2024.1.