Academic literature on the topic 'Zero-shot Retrieval'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Zero-shot Retrieval.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Zero-shot Retrieval"

1. Dutta, Titir, and Soma Biswas. "Generalized Zero-Shot Cross-Modal Retrieval." IEEE Transactions on Image Processing 28, no. 12 (2019): 5953–62. http://dx.doi.org/10.1109/tip.2019.2923287.

2. Seo, Sanghyun, and Juntae Kim. "Hierarchical Semantic Loss and Confidence Estimator for Visual-Semantic Embedding-Based Zero-Shot Learning." Applied Sciences 9, no. 15 (2019): 3133. http://dx.doi.org/10.3390/app9153133.

Abstract:
Traditional supervised learning depends on the labels of the training data, so class labels that are not included in the training data cannot be recognized properly. Therefore, zero-shot learning, which can recognize unseen classes that are not used in training, is gaining research interest. One approach to zero-shot learning is to embed visual data such as images and rich semantic data related to the text labels of that visual data into a common vector space, and then perform zero-shot cross-modal retrieval on newly input unseen-class data. This paper proposes a hierarchical semantic loss and a confidence estimator to perform zero-shot learning on visual data more efficiently. The hierarchical semantic loss improves learning efficiency by using hierarchical knowledge when selecting the negative sample of the triplet loss, and the confidence estimator estimates a confidence score to determine whether an input belongs to a seen class or an unseen class. These methodologies improve the performance of zero-shot learning by adjusting the distances from a semantic vector to a visual vector when performing zero-shot cross-modal retrieval. Experimental results show that the proposed method can improve the performance of zero-shot learning in terms of hit@k accuracy.
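For readers who want a concrete picture of the training scheme sketched in this abstract, the snippet below is a minimal illustration, not the authors' implementation: a triplet loss whose negative class is picked using a class-hierarchy distance matrix, plus a simple confidence threshold for routing queries to the seen-class or unseen-class branch. The margin, the hierarchy-distance matrix, the choice of the closest hierarchy neighbour as the hard negative, and the threshold value are all assumptions.

```python
# Illustrative sketch only (not the authors' code): hierarchy-aware negative selection
# for a triplet loss, plus a stand-in confidence threshold for the seen/unseen decision.
import torch
import torch.nn.functional as F

def hierarchy_aware_triplet_loss(img_emb, label_emb, labels, hierarchy_dist, margin=0.2):
    """img_emb: (B, D) visual embeddings; label_emb: (C, D) class/semantic embeddings;
    labels: (B,) class indices; hierarchy_dist: (C, C) distances in the class hierarchy."""
    pos = label_emb[labels]                               # (B, D) matching semantic vectors
    # Pick, for each sample, the negative class that is closest in the hierarchy
    # (a "hard" semantic negative); mask out the true class first.
    dist = hierarchy_dist[labels].clone()                 # (B, C)
    dist[torch.arange(len(labels)), labels] = float("inf")
    neg = label_emb[dist.argmin(dim=1)]                   # (B, D) hierarchy-hard negatives
    d_pos = 1.0 - F.cosine_similarity(img_emb, pos)       # distance to the correct class
    d_neg = 1.0 - F.cosine_similarity(img_emb, neg)       # distance to the hard negative
    return F.relu(d_pos - d_neg + margin).mean()

def is_seen_class(confidence_score, threshold=0.5):
    """Confidence-estimator stub: route a query to the seen-class or unseen-class branch."""
    return confidence_score >= threshold
```

In the paper the confidence estimator is a learned module; the fixed threshold here only stands in for that decision step.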
3. Wang, Xiao, Craig Macdonald, and Iadh Ounis. "Improving zero-shot retrieval using dense external expansion." Information Processing & Management 59, no. 5 (2022): 103026. http://dx.doi.org/10.1016/j.ipm.2022.103026.

4. Kumar, Sanjeev. "Phase retrieval with physics informed zero-shot network." Optics Letters 46, no. 23 (2021): 5942. http://dx.doi.org/10.1364/ol.433625.

5. Lin, Kaiyi, Xing Xu, Lianli Gao, Zheng Wang, and Heng Tao Shen. "Learning Cross-Aligned Latent Embeddings for Zero-Shot Cross-Modal Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 11515–22. http://dx.doi.org/10.1609/aaai.v34i07.6817.

Abstract:
Zero-Shot Cross-Modal Retrieval (ZS-CMR) is an emerging research hotspot that aims to retrieve data of new classes across different modalities. It is challenging not only because of the heterogeneous distributions across different modalities, but also because of the inconsistent semantics across seen and unseen classes. A handful of recently proposed methods typically borrow the idea from zero-shot learning, i.e., exploiting word embeddings of class labels (i.e., class-embeddings) as the common semantic space, and using a generative adversarial network (GAN) to capture the underlying multimodal data structures and strengthen the relations between the input data and the semantic space in order to generalize across seen and unseen classes. In this paper, we propose a novel method termed Learning Cross-Aligned Latent Embeddings (LCALE) as an alternative to these GAN-based methods for ZS-CMR. Unlike methods that use the class-embeddings as the semantic space, our method seeks a shared low-dimensional latent space for the input multimodal features and the class-embeddings via modality-specific variational autoencoders. Notably, we align the distributions learned from the multimodal input features and from the class-embeddings to construct latent embeddings that contain the essential cross-modal correlation associated with unseen classes. Effective cross-reconstruction and cross-alignment criteria are further developed to preserve class-discriminative information in the latent space, which benefits retrieval efficiency and enables knowledge transfer to unseen classes. We evaluate our model using four benchmark datasets on image-text retrieval tasks and one large-scale dataset on image-sketch retrieval tasks. The experimental results show that our method establishes new state-of-the-art performance for both tasks on all datasets.
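As a rough illustration of the idea described above (modality-specific variational autoencoders that share a latent space, with cross-reconstruction and distribution alignment), here is a minimal PyTorch sketch. It is not the published LCALE code: the layer sizes, loss weights, and the simple L2 alignment between Gaussian parameters are assumptions, and both VAEs are assumed to be built with the same latent dimension.

```python
# Minimal sketch of the general idea behind LCALE-style ZS-CMR (not the published implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityVAE(nn.Module):
    def __init__(self, in_dim, latent_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, in_dim))

    def encode(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return z, mu, logvar

def lcale_style_loss(vae_img, vae_cls, img_feat, cls_emb):
    """img_feat and cls_emb are paired (same-class) mini-batches."""
    z_i, mu_i, lv_i = vae_img.encode(img_feat)
    z_c, mu_c, lv_c = vae_cls.encode(cls_emb)
    # Within-modality reconstruction.
    recon = F.mse_loss(vae_img.dec(z_i), img_feat) + F.mse_loss(vae_cls.dec(z_c), cls_emb)
    # Cross-reconstruction: decode each modality from the other modality's latent code.
    cross = F.mse_loss(vae_img.dec(z_c), img_feat) + F.mse_loss(vae_cls.dec(z_i), cls_emb)
    # Standard VAE KL terms toward a unit Gaussian prior.
    kl = (-0.5 * (1 + lv_i - mu_i.pow(2) - lv_i.exp()).sum(1).mean()
          - 0.5 * (1 + lv_c - mu_c.pow(2) - lv_c.exp()).sum(1).mean())
    # Align the two modalities' latent distributions (simple parameter matching here).
    align = F.mse_loss(mu_i, mu_c) + F.mse_loss(lv_i, lv_c)
    return recon + cross + 0.1 * kl + align
```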
6. Zhang, Haofeng, Yang Long, and Ling Shao. "Zero-shot Hashing with orthogonal projection for image retrieval." Pattern Recognition Letters 117 (January 2019): 201–9. http://dx.doi.org/10.1016/j.patrec.2018.04.011.

7. Zhang, Zhaolong, Yuejie Zhang, Rui Feng, Tao Zhang, and Weiguo Fan. "Zero-Shot Sketch-Based Image Retrieval via Graph Convolution Network." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 12943–50. http://dx.doi.org/10.1609/aaai.v34i07.6993.

Abstract:
Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) has been proposed recently, putting traditional Sketch-Based Image Retrieval (SBIR) under the setting of zero-shot learning. Dealing with the challenges of both SBIR and zero-shot learning makes it a more difficult task. Previous works mainly focus on utilizing one kind of information, i.e., either the visual information or the semantic information. In this paper, we propose a SketchGCN model based on a graph convolutional network, which simultaneously considers both the visual information and the semantic information. Thus, our model can effectively narrow the domain gap and transfer knowledge. Furthermore, we generate the semantic information from the visual information using a Conditional Variational Autoencoder rather than only mapping it back from the visual space to the semantic space, which enhances the generalization ability of our model. In addition, a feature loss, a classification loss, and a semantic loss are introduced to optimize the proposed SketchGCN model. Our model achieves good performance on the challenging Sketchy and TU-Berlin datasets.
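The abstract above relies on graph convolution to propagate visual and semantic information across a graph. Purely as background, the following is a generic graph-convolution layer in the Kipf-and-Welling style, assuming a dense adjacency matrix; it does not reproduce how SketchGCN builds its graph or combines the CVAE branch.

```python
# Generic graph-convolution step: symmetric normalization of the adjacency with
# self-loops, followed by propagation and a linear transform.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) adjacency (binary or weighted).
        a_hat = adj + torch.eye(adj.size(0), device=adj.device)  # add self-loops
        deg = a_hat.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt               # D^-1/2 (A+I) D^-1/2
        return torch.relu(self.lin(norm_adj @ x))                # propagate, then transform
```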
8. Yang, Fan, Zheng Wang, Jing Xiao, and Shin'ichi Satoh. "Mining on Heterogeneous Manifolds for Zero-Shot Cross-Modal Image Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (2020): 12589–96. http://dx.doi.org/10.1609/aaai.v34i07.6949.

Abstract:
Most recent approaches to zero-shot cross-modal image retrieval map images from different modalities into a uniform feature space to exploit their relevance by using a pre-trained model. Based on the observation that the manifolds of zero-shot images are usually deformed and incomplete, we argue that the manifolds of unseen classes are inevitably distorted during the training of a two-stream model that simply maps images from different modalities into a uniform space. This issue directly leads to poor cross-modal retrieval performance. We propose a bi-directional random walk scheme to mine more reliable relationships between images by traversing heterogeneous manifolds in the feature space of each modality. Our proposed method benefits from intra-modal distributions to alleviate the interference caused by noisy similarities in the cross-modal feature space. As a result, we achieve great improvement in the performance of the thermal vs. visible image retrieval task. The code for this paper is available at https://github.com/fyang93/cross-modal-retrieval.
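One plausible reading of the bi-directional random walk described above is shown below as a NumPy sketch: intra-modal affinity matrices are turned into row-stochastic transition matrices and used to diffuse the raw cross-modal similarity matrix from both the query side and the gallery side. The blending weight, the number of steps, and the normalization are assumptions; the authors' real implementation is in the repository linked in the abstract.

```python
# Plausible sketch of a bi-directional random-walk refinement (not the authors' code).
import numpy as np

def random_walk_refine(S, W_query, W_gallery, alpha=0.8, steps=3):
    """S: (Nq, Ng) raw cross-modal similarities; W_query: (Nq, Nq) and W_gallery: (Ng, Ng)
    intra-modal affinity matrices; alpha blends walked scores with the raw scores."""
    P_q = W_query / W_query.sum(axis=1, keepdims=True)    # row-stochastic transitions (query side)
    P_g = W_gallery / W_gallery.sum(axis=1, keepdims=True)  # row-stochastic transitions (gallery side)
    R = S.copy()
    for _ in range(steps):
        # Walk one step on the query manifold and one step on the gallery manifold,
        # then mix with the original similarities to keep the result anchored to them.
        R = alpha * (P_q @ R @ P_g.T) + (1 - alpha) * S
    return R
```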
9. Xu, Rui, Zongyan Han, Le Hui, Jianjun Qian, and Jin Xie. "Domain Disentangled Generative Adversarial Network for Zero-Shot Sketch-Based 3D Shape Retrieval." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 3 (2022): 2902–10. http://dx.doi.org/10.1609/aaai.v36i3.20195.

Abstract:
Sketch-based 3D shape retrieval is a challenging task due to the large domain discrepancy between sketches and 3D shapes. Since existing methods are trained and evaluated on the same categories, they cannot effectively recognize the categories that have not been used during training. In this paper, we propose a novel domain disentangled generative adversarial network (DD-GAN) for zero-shot sketch-based 3D retrieval, which can retrieve the unseen categories that are not accessed during training. Specifically, we first generate domain-invariant features and domain-specific features by disentangling the learned features of sketches and 3D shapes, where the domain-invariant features are used to align with the corresponding word embeddings. Then, we develop a generative adversarial network that combines the domain-specific features of the seen categories with the aligned domain-invariant features to synthesize samples, where the synthesized samples of the unseen categories are generated by using the corresponding word embeddings. Finally, we use the synthesized samples of the unseen categories combined with the real samples of the seen categories to train the network for retrieval, so that the unseen categories can be recognized. In order to reduce the domain shift problem, we utilize unlabeled unseen samples to enhance the discrimination ability of the discriminator. With the discriminator distinguishing the generated samples from the unlabeled unseen samples, the generator can generate more realistic unseen samples. Extensive experiments on the SHREC'13 and SHREC'14 datasets show that our method significantly improves the retrieval performance of the unseen categories.
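To make the architecture described above easier to picture, here is a shape-level PyTorch sketch: the generator concatenates domain-specific features with domain-invariant features (which, for unseen categories, can be replaced by class word embeddings), and the discriminator scores real versus synthesized features. All dimensions and the wiring are assumptions, not the paper's code.

```python
# Shape-level sketch of the DD-GAN idea as summarized in the abstract.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, spec_dim=128, inv_dim=300, out_dim=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(spec_dim + inv_dim, 512), nn.ReLU(),
                                 nn.Linear(512, out_dim))

    def forward(self, domain_specific, domain_invariant):
        # For unseen categories, domain_invariant can be the class word embedding itself.
        return self.net(torch.cat([domain_specific, domain_invariant], dim=1))

class Discriminator(nn.Module):
    def __init__(self, feat_dim=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, feat):
        return self.net(feat)  # real/fake logit for the adversarial loss
```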
10. Xu, Xing, Jialin Tian, Kaiyi Lin, Huimin Lu, Jie Shao, and Heng Tao Shen. "Zero-shot Cross-modal Retrieval by Assembling AutoEncoder and Generative Adversarial Network." ACM Transactions on Multimedia Computing, Communications, and Applications 17, no. 1s (2021): 1–17. http://dx.doi.org/10.1145/3424341.

Abstract:
Conventional cross-modal retrieval models mainly assume the same scope of classes for both the training set and the testing set. This assumption limits their extensibility to zero-shot cross-modal retrieval (ZS-CMR), where the testing set consists of unseen classes that are disjoint from the seen classes in the training set. The ZS-CMR task is more challenging due to the heterogeneous distributions of different modalities and the semantic inconsistency between seen and unseen classes. A few recently proposed approaches are inspired by zero-shot learning: they estimate the distribution underlying multimodal data with generative models and transfer knowledge from seen classes to unseen classes by leveraging class embeddings. However, directly borrowing the idea from zero-shot learning (ZSL) is not fully suited to the retrieval task, since the core of the retrieval task is learning the common space. To address the above issues, we propose a novel approach named Assembling AutoEncoder and Generative Adversarial Network (AAEGAN), which combines the strengths of the AutoEncoder (AE) and the Generative Adversarial Network (GAN) to jointly incorporate common latent space learning, knowledge transfer, and feature synthesis for ZS-CMR. In addition, instead of utilizing class embeddings as the common space, the AAEGAN approach maps all multimodal data into a learned latent space with distribution alignment via three coupled AEs. We empirically show a remarkable improvement on the ZS-CMR task and establish state-of-the-art or competitive performance on four image-text retrieval datasets.
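Independently of the specific model, retrieval in the common-space approaches listed above comes down to the same final step. The sketch below shows it in generic PyTorch terms, assuming hypothetical placeholder encoders `query_encoder` and `gallery_encoder` that map raw features from each modality into the learned latent space; it is not tied to AAEGAN or any other method in this list.

```python
# Generic zero-shot cross-modal retrieval step in a learned common latent space.
import torch
import torch.nn.functional as F

def zero_shot_retrieve(query_feat, gallery_feats, query_encoder, gallery_encoder, top_k=10):
    """query_feat: (D_q,) one query in its raw feature space; gallery_feats: (N, D_g)."""
    with torch.no_grad():
        q = F.normalize(query_encoder(query_feat.unsqueeze(0)), dim=1)  # (1, latent)
        g = F.normalize(gallery_encoder(gallery_feats), dim=1)          # (N, latent)
        scores = (q @ g.T).squeeze(0)                                   # cosine similarities
    return scores.topk(top_k).indices                                   # indices of best matches
```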