Selected scientific literature on the topic "Multimodal embedding space"
Consult the list of current articles, books, theses, conference proceedings, and other scholarly sources relevant to the topic "Multimodal embedding space".
Journal articles on the topic "Multimodal embedding space"
Tyshchuk, Kirill, Polina Karpikova, Andrew Spiridonov, Anastasiia Prutianova, Anton Razzhigaev, and Alexander Panchenko. "On Isotropy of Multimodal Embeddings". Information 14, no. 7 (July 10, 2023): 392. http://dx.doi.org/10.3390/info14070392.
Mai, Sijie, Haifeng Hu, and Songlong Xing. "Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 1 (April 3, 2020): 164–72. http://dx.doi.org/10.1609/aaai.v34i01.5347.
Zhang, Linhai, Deyu Zhou, Yulan He, and Zeng Yang. "MERL: Multimodal Event Representation Learning in Heterogeneous Embedding Spaces". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 16 (May 18, 2021): 14420–27. http://dx.doi.org/10.1609/aaai.v35i16.17695.
Guo, Zhiqiang, Jianjun Li, Guohui Li, Chaoyang Wang, Si Shi, and Bin Ruan. "LGMRec: Local and Global Graph Learning for Multimodal Recommendation". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 8 (March 24, 2024): 8454–62. http://dx.doi.org/10.1609/aaai.v38i8.28688.
Moon, Jucheol, Nhat Anh Le, Nelson Hebert Minaya, and Sang-Il Choi. "Multimodal Few-Shot Learning for Gait Recognition". Applied Sciences 10, no. 21 (October 29, 2020): 7619. http://dx.doi.org/10.3390/app10217619.
Zhang, Rongchao, Yiwei Lou, Dexuan Xu, Yongzhi Cao, Hanpin Wang, and Yu Huang. "A Learnable Discrete-Prior Fusion Autoencoder with Contrastive Learning for Tabular Data Synthesis". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 15 (March 24, 2024): 16803–11. http://dx.doi.org/10.1609/aaai.v38i15.29621.
Merkx, Danny, and Stefan L. Frank. "Learning semantic sentence representations from visually grounded language without lexical knowledge". Natural Language Engineering 25, no. 4 (July 2019): 451–66. http://dx.doi.org/10.1017/s1351324919000196.
Fan, Yunpeng, Wenyou Du, Yingwei Zhang, and Xiaogang Wang. "Fault Detection for Multimodal Process Using Quality-Relevant Kernel Neighborhood Preserving Embedding". Mathematical Problems in Engineering 2015 (2015): 1–15. http://dx.doi.org/10.1155/2015/210125.
Ota, Kosuke, Keiichiro Shirai, Hidetoshi Miyao, and Minoru Maruyama. "Multimodal Analogy-Based Image Retrieval by Improving Semantic Embeddings". Journal of Advanced Computational Intelligence and Intelligent Informatics 26, no. 6 (November 20, 2022): 995–1003. http://dx.doi.org/10.20965/jaciii.2022.p0995.
Kim, Jongseok, Youngjae Yu, Hoeseong Kim, and Gunhee Kim. "Dual Compositional Learning in Interactive Image Retrieval". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 2 (May 18, 2021): 1771–79. http://dx.doi.org/10.1609/aaai.v35i2.16271.
Theses / dissertations on the topic "Multimodal embedding space"
Couairon, Guillaume. "Text-Based Semantic Image Editing". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS248.
The aim of this thesis is to propose algorithms for the task of Text-based Image Editing (TIE), which consists in editing digital images according to an instruction formulated in natural language. For instance, given an image of a dog and the query "Change the dog into a cat", we want to produce a novel image where the dog has been replaced by a cat, keeping all other aspects of the image unchanged (animal color and pose, background). The north-star goal is to enable anyone to edit their images using only queries in natural language. One specificity of text-based image editing is that there is practically no training data for a supervised algorithm. In this thesis, we propose different solutions for editing images, based on adapting large multimodal models trained on huge datasets. We first study a simplified editing setup, named retrieval-based image editing, which does not require directly modifying the input image. Instead, given the image and the modification query, we search a large database for an image that corresponds to the requested edit. We leverage multimodal image/text alignment models trained on web-scale datasets (like CLIP) to perform such transformations without any examples. We also propose the SIMAT framework for evaluating retrieval-based image editing. We then study how to directly modify the input image. We propose FlexIT, a method which iteratively changes the input image until it satisfies an abstract "editing objective" defined in a multimodal embedding space. We introduce a variety of regularization terms to enforce realistic transformations. Next, we focus on diffusion models, which are powerful generative models able to synthesize novel images conditioned on a wide variety of textual prompts. We demonstrate their versatility by proposing DiffEdit, an algorithm which adapts diffusion models for image editing without finetuning.
We propose a zero-shot strategy for automatically finding where the initial image should be changed to satisfy the text transformation query. Finally, we study a specific challenge useful in the context of image editing: how to synthesize a novel image given as constraint a spatial layout of objects with textual descriptions, a task known as Semantic Image Synthesis. We adopt the same strategy, which consists in adapting diffusion models to solve the task without any example. We propose the ZestGuide algorithm, which leverages the spatio-semantic information encoded in the attention layers of diffusion models.
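The abstract above describes FlexIT's core idea: treat editing as optimization of an abstract objective defined in a multimodal embedding space, pulling the image toward the target text while regularizers keep the edit realistic. The sketch below illustrates that idea only in spirit, with toy NumPy vectors standing in for real CLIP embeddings; the function names, the update rule, and the regularization weight are illustrative assumptions, not the thesis's actual algorithm (which optimizes the image itself by gradient descent with several regularization terms).

```python
import numpy as np

def normalize(v):
    """Return v scaled to unit length."""
    return v / np.linalg.norm(v)

def editing_objective(z_orig, z_edit, z_text, lam=0.1):
    """FlexIT-style abstract editing objective in a shared embedding space
    (illustrative): reward cosine similarity between the edited image
    embedding and the target text embedding, with a penalty that keeps
    the edit close to the original image embedding. Lower is better."""
    similarity = float(normalize(z_edit) @ normalize(z_text))
    drift = float(np.linalg.norm(z_edit - z_orig))
    return -similarity + lam * drift

# Toy 4-d vectors stand in for real CLIP image/text embeddings.
rng = np.random.default_rng(0)
z_orig = rng.normal(size=4)   # embedding of the input image
z_text = rng.normal(size=4)   # embedding of the target text query

# Crude iterative refinement: nudge the edited embedding toward the
# target text direction, step by step.
z_edit = z_orig.copy()
for _ in range(50):
    z_edit = z_edit + 0.05 * (normalize(z_text) - normalize(z_edit))

# The refined embedding scores better (lower) than the unedited one.
assert editing_objective(z_orig, z_edit, z_text) < editing_objective(z_orig, z_orig, z_text)
```

The regularization term plays the same role as FlexIT's realism constraints: without it, the objective is minimized by discarding the input image entirely and matching only the text.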
Book chapters on the topic "Multimodal embedding space"
Zhang, Chao, and Jiawei Han. "Data Mining and Knowledge Discovery". In Urban Informatics, 797–814. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-15-8983-6_42.
Zhao, Xiang, Weixin Zeng, and Jiuyang Tang. "Multimodal Entity Alignment". In Entity Alignment, 229–47. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-4250-3_9.
Valles-Perez, Ivan, Grzegorz Beringer, Piotr Bilinski, Gary Cook, and Roberto Barra-Chicote. "SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces". In Frontiers in Artificial Intelligence and Applications. IOS Press, 2023. http://dx.doi.org/10.3233/faia230540.
Texto completo da fonteTrabalhos de conferências sobre o assunto "Multimodal embedding space"
Bhattacharya, Indrani, Arkabandhu Chowdhury, and Vikas C. Raykar. "Multimodal Dialog for Browsing Large Visual Catalogs using Exploration-Exploitation Paradigm in a Joint Embedding Space". In ICMR '19: International Conference on Multimedia Retrieval. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3323873.3325036.
Rostami, Mohammad, and Aram Galstyan. "Cognitively Inspired Learning of Incremental Drifting Concepts". In Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23). California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/341.
Gopalakrishnan, Sabarish, Premkumar Udaiyar, Shagan Sah, and Raymond Ptucha. "Multi Stage Common Vector Space for Multimodal Embeddings". In 2019 IEEE Applied Imagery Pattern Recognition Workshop (AIPR). IEEE, 2019. http://dx.doi.org/10.1109/aipr47015.2019.9174583.
Feng, LiWei, Hao Ai, and Yuan Li. "Multimode Process Monitoring Based on Density Space Clustering Locally Linear Embedding Technique". In 2023 2nd Conference on Fully Actuated System Theory and Applications (CFASTA). IEEE, 2023. http://dx.doi.org/10.1109/cfasta57821.2023.10243375.
Pasi, Piyush Singh, Karthikeya Battepati, Preethi Jyothi, Ganesh Ramakrishnan, Tanmay Mahapatra, and Manoj Singh. "Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration". In Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23). California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/683.