Academic literature on the topic "Multimodal embedding space"
Consult the lists of relevant articles, books, theses, conference papers, and other scholarly sources on the topic "Multimodal embedding space".
Journal articles on the topic "Multimodal embedding space"
Tyshchuk, Kirill, Polina Karpikova, Andrew Spiridonov, Anastasiia Prutianova, Anton Razzhigaev, and Alexander Panchenko. "On Isotropy of Multimodal Embeddings." Information 14, no. 7 (July 10, 2023): 392. http://dx.doi.org/10.3390/info14070392.
Mai, Sijie, Haifeng Hu, and Songlong Xing. "Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 01 (April 3, 2020): 164–72. http://dx.doi.org/10.1609/aaai.v34i01.5347.
Zhang, Linhai, Deyu Zhou, Yulan He, and Zeng Yang. "MERL: Multimodal Event Representation Learning in Heterogeneous Embedding Spaces." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 16 (May 18, 2021): 14420–27. http://dx.doi.org/10.1609/aaai.v35i16.17695.
Guo, Zhiqiang, Jianjun Li, Guohui Li, Chaoyang Wang, Si Shi, and Bin Ruan. "LGMRec: Local and Global Graph Learning for Multimodal Recommendation." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 8 (March 24, 2024): 8454–62. http://dx.doi.org/10.1609/aaai.v38i8.28688.
Moon, Jucheol, Nhat Anh Le, Nelson Hebert Minaya, and Sang-Il Choi. "Multimodal Few-Shot Learning for Gait Recognition." Applied Sciences 10, no. 21 (October 29, 2020): 7619. http://dx.doi.org/10.3390/app10217619.
Zhang, Rongchao, Yiwei Lou, Dexuan Xu, Yongzhi Cao, Hanpin Wang, and Yu Huang. "A Learnable Discrete-Prior Fusion Autoencoder with Contrastive Learning for Tabular Data Synthesis." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 15 (March 24, 2024): 16803–11. http://dx.doi.org/10.1609/aaai.v38i15.29621.
Merkx, Danny, and Stefan L. Frank. "Learning semantic sentence representations from visually grounded language without lexical knowledge." Natural Language Engineering 25, no. 4 (July 2019): 451–66. http://dx.doi.org/10.1017/s1351324919000196.
Fan, Yunpeng, Wenyou Du, Yingwei Zhang, and Xiaogang Wang. "Fault Detection for Multimodal Process Using Quality-Relevant Kernel Neighborhood Preserving Embedding." Mathematical Problems in Engineering 2015 (2015): 1–15. http://dx.doi.org/10.1155/2015/210125.
Ota, Kosuke, Keiichiro Shirai, Hidetoshi Miyao, and Minoru Maruyama. "Multimodal Analogy-Based Image Retrieval by Improving Semantic Embeddings." Journal of Advanced Computational Intelligence and Intelligent Informatics 26, no. 6 (November 20, 2022): 995–1003. http://dx.doi.org/10.20965/jaciii.2022.p0995.
Kim, Jongseok, Youngjae Yu, Hoeseong Kim, and Gunhee Kim. "Dual Compositional Learning in Interactive Image Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (May 18, 2021): 1771–79. http://dx.doi.org/10.1609/aaai.v35i2.16271.
Texto completoTesis sobre el tema "Multimodal embedding space"
Couairon, Guillaume. "Text-Based Semantic Image Editing." Electronic thesis or dissertation, Sorbonne université, 2023. http://www.theses.fr/2023SORUS248.
Texto completoThe aim of this thesis is to propose algorithms for the task of Text-based Image Editing (TIE), which consists in editing digital images according to an instruction formulated in natural language. For instance, given an image of a dog, and the query "Change the dog into a cat", we want to produce a novel image where the dog has been replaced by a cat, keeping all other image aspects unchanged (animal color and pose, background). The north-star goal is to enable anyone to edit their images using only queries in natural language. One specificity of text-based image editing is that there is practically no training data to train a supervised algorithm. In this thesis, we propose different solutions for editing images, based on the adaptation of large multimodal models trained on huge datasets. We first study a simplified editing setup, named Retrieval-based image edit- ing, which does not require to directly modify the input image. Instead, given the image and modification query, we search in a large database an image that corresponds to the requested edit. We leverage multimodal image/text alignment models trained on web-scale datasets (like CLIP) to perform such transformations without any examples. We also propose the SIMAT framework for evaluating retrieval-based image editing. We then study how to directly modify the input image. We propose FlexIT, a method which iteratively changes the input image until it satisfies an abstract "editing objective" defined in a multimodal embedding space. We introduce a variety of regularization terms to enforce realistic transformations. Next, we focus on diffusion models, which are powerful generative models able to synthetize novel images conditioned on a wide variety of textual prompts. We demonstrate their versatility by proposing DiffEdit, an algorithm which adapts diffusion models for image editing without finetuning. We propose a zero-shot strategy for finding automatically where the initial image should be changed to satisfy the text transformation query. Finally, we study a specific challenge useful in the context of image editing: how to synthetize a novel image by giving as constraint a spatial layout of objects with textual descriptions, a task which is known as Semantic Image Synthesis. We adopt the same strategy, consisting in adapting diffusion models to solve the task without any example. We propose the ZestGuide algorithm, which leverages the spatio-semantic information encoded in the attention layers of diffusion models
Book chapters on the topic "Multimodal embedding space"
Zhang, Chao, and Jiawei Han. "Data Mining and Knowledge Discovery." In Urban Informatics, 797–814. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-15-8983-6_42.
Zhao, Xiang, Weixin Zeng, and Jiuyang Tang. "Multimodal Entity Alignment." In Entity Alignment, 229–47. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-4250-3_9.
Valles-Perez, Ivan, Grzegorz Beringer, Piotr Bilinski, Gary Cook, and Roberto Barra-Chicote. "SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces." In Frontiers in Artificial Intelligence and Applications. IOS Press, 2023. http://dx.doi.org/10.3233/faia230540.
Texto completoActas de conferencias sobre el tema "Multimodal embedding space"
Bhattacharya, Indrani, Arkabandhu Chowdhury, and Vikas C. Raykar. "Multimodal Dialog for Browsing Large Visual Catalogs using Exploration-Exploitation Paradigm in a Joint Embedding Space." In ICMR '19: International Conference on Multimedia Retrieval. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3323873.3325036.
Rostami, Mohammad, and Aram Galstyan. "Cognitively Inspired Learning of Incremental Drifting Concepts." In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/341.
Gopalakrishnan, Sabarish, Premkumar Udaiyar, Shagan Sah, and Raymond Ptucha. "Multi Stage Common Vector Space for Multimodal Embeddings." In 2019 IEEE Applied Imagery Pattern Recognition Workshop (AIPR). IEEE, 2019. http://dx.doi.org/10.1109/aipr47015.2019.9174583.
Feng, LiWei, Hao Ai, and Yuan Li. "Multimode Process Monitoring Based on Density Space Clustering Locally Linear Embedding Technique." In 2023 2nd Conference on Fully Actuated System Theory and Applications (CFASTA). IEEE, 2023. http://dx.doi.org/10.1109/cfasta57821.2023.10243375.
Pasi, Piyush Singh, Karthikeya Battepati, Preethi Jyothi, Ganesh Ramakrishnan, Tanmay Mahapatra, and Manoj Singh. "Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration." In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/683.