Selection of scientific literature on the topic "Multimodal retrieval"

Create a reference in APA, MLA, Chicago, Harvard, and other citation styles

Choose a type of source:

Browse the lists of current articles, books, theses, reports, and other scholarly sources on the topic "Multimodal retrieval".

Next to every work in the list of references there is an "Add to bibliography" option. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read an online annotation of the work, provided the relevant parameters are available in its metadata.

Journal articles on the topic "Multimodal retrieval"

1. Cui, Chenhao, and Zhoujun Li. "Prompt-Enhanced Generation for Multimodal Open Question Answering". Electronics 13, no. 8 (April 10, 2024): 1434. http://dx.doi.org/10.3390/electronics13081434.

Annotation:
Multimodal open question answering involves retrieving relevant information from both images and their corresponding texts given a question and then generating the answer. The quality of the generated answer heavily depends on the quality of the retrieved image–text pairs. Existing methods encode and retrieve images and texts, inputting the retrieved results into a language model to generate answers. These methods overlook the semantic alignment of image–text pairs within the information source, which affects the encoding and retrieval performance. Furthermore, these methods are highly dependent on retrieval performance, and poor retrieval quality can lead to poor generation performance. To address these issues, we propose a prompt-enhanced generation model, PEG, which includes generating supplementary descriptions for images to provide ample material for image–text alignment while also utilizing vision–language joint encoding to improve encoding effects and thereby enhance retrieval performance. Contrastive learning is used to enhance the model’s ability to discriminate between relevant and irrelevant information sources. Moreover, we further explore the knowledge within pre-trained model parameters through prefix-tuning to generate background knowledge relevant to the questions, offering additional input for answer generation and reducing the model’s dependency on retrieval performance. Experiments conducted on the WebQA and MultimodalQA datasets demonstrate that our model outperforms other baseline models in retrieval and generation performance.
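
A minimal sketch can make the contrastive component concrete. The snippet below is not the PEG model itself; it is a generic, hedged PyTorch illustration (batch size, embedding dimension, and temperature are illustrative assumptions) of the symmetric InfoNCE objective commonly used to teach a retriever to score matching image-text pairs above mismatched ones.

```python
import torch
import torch.nn.functional as F

def symmetric_infonce(img_emb: torch.Tensor,
                      txt_emb: torch.Tensor,
                      temperature: float = 0.07) -> torch.Tensor:
    """Generic contrastive (InfoNCE) loss over a batch of paired embeddings.

    img_emb, txt_emb: (batch, dim) tensors; row i of each is a matching pair.
    Matching pairs are pulled together; all other in-batch pairs are pushed apart.
    """
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature             # (batch, batch) similarities
    targets = torch.arange(img.size(0))              # the diagonal holds true pairs
    loss_i2t = F.cross_entropy(logits, targets)      # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return (loss_i2t + loss_t2i) / 2

# Toy usage with random embeddings standing in for encoder outputs:
loss = symmetric_infonce(torch.randn(8, 256), torch.randn(8, 256))
```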

2. Xu, Hong. "Multimodal bird information retrieval system". Applied and Computational Engineering 53, no. 1 (March 28, 2024): 96–102. http://dx.doi.org/10.54254/2755-2721/53/20241282.

Annotation:
A multimodal bird information retrieval system can help popularize bird knowledge and support bird conservation. In this paper, we use a self-built bird dataset, the ViT-B/32 variant of the CLIP model as the training model, Python as the development language, and PyQt5 for the interface development. The system implements the uploading and display of bird pictures, a multimodal retrieval function for bird information, and the presentation of related bird information. Trial runs show that the system can accomplish multimodal retrieval of bird information: it can retrieve the species of a bird and other related information from a picture uploaded by the user, or retrieve the most similar bird information from a textual description provided by the user.
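
As a rough sketch of what a CLIP ViT-B/32 retrieval step can look like, the following hedged example uses the Hugging Face transformers implementation of CLIP. The candidate captions and the image path are placeholders, not the paper's self-built bird dataset, and the PyQt5 interface is omitted.

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical candidate descriptions; a real system would index its own dataset.
captions = ["a photo of a kingfisher", "a photo of a sparrow", "a photo of an owl"]
image = Image.open("bird.jpg")  # placeholder path

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds image-text similarity scores; softmax ranks the captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(captions[probs.argmax().item()], probs.tolist())
```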

3. Romberg, Stefan, Rainer Lienhart, and Eva Hörster. "Multimodal Image Retrieval". International Journal of Multimedia Information Retrieval 1, no. 1 (March 7, 2012): 31–44. http://dx.doi.org/10.1007/s13735-012-0006-4.

4. Kitanovski, Ivan, Gjorgji Strezoski, Ivica Dimitrovski, Gjorgji Madjarov, and Suzana Loskovska. "Multimodal medical image retrieval system". Multimedia Tools and Applications 76, no. 2 (January 25, 2016): 2955–78. http://dx.doi.org/10.1007/s11042-016-3261-1.

5. Kulvinder Singh, et al. "Enhancing Multimodal Information Retrieval Through Integrating Data Mining and Deep Learning Techniques". International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 9 (October 30, 2023): 560–69. http://dx.doi.org/10.17762/ijritcc.v11i9.8844.

Annotation:
Multimodal information retrieval, the task of retrieving relevant information from heterogeneous data sources such as text, images, and videos, has gained significant attention in recent years due to the proliferation of multimedia content on the internet. This paper proposes an approach to enhance multimodal information retrieval by integrating data mining and deep learning techniques. Traditional information retrieval systems often struggle to effectively handle multimodal data due to the inherent complexity and diversity of such data sources. In this study, we leverage data mining techniques to preprocess and structure multimodal data efficiently. Data mining methods enable us to extract valuable patterns, relationships, and features from different modalities, providing a solid foundation for subsequent retrieval tasks. To further enhance the performance of multimodal information retrieval, deep learning techniques are employed. Deep neural networks have demonstrated their effectiveness in various multimedia tasks, including image recognition, natural language processing, and video analysis. By integrating deep learning models into our retrieval framework, we aim to capture complex intermodal dependencies and semantically rich representations, enabling more accurate and context-aware retrieval.

6. Cao, Yu, Shawn Steffey, Jianbiao He, Degui Xiao, Cui Tao, Ping Chen, and Henning Müller. "Medical Image Retrieval: A Multimodal Approach". Cancer Informatics 13s3 (January 2014): CIN.S14053. http://dx.doi.org/10.4137/cin.s14053.

Annotation:
Medical imaging is becoming a vital component of the war on cancer. Tremendous amounts of medical image data are captured and recorded in digital format during cancer care and cancer research. Facing such an unprecedented volume of image data with heterogeneous image modalities, it is necessary to develop effective and efficient content-based medical image retrieval systems for cancer clinical practice and research. While substantial progress has been made in different areas of content-based image retrieval (CBIR) research, direct application of existing CBIR techniques to medical images has produced unsatisfactory results, because of the unique characteristics of medical images. In this paper, we develop a new multimodal medical image retrieval approach based on recent advances in statistical graphical models and deep learning. Specifically, we first investigate a new extended probabilistic Latent Semantic Analysis model to integrate the visual and textual information from medical images and bridge the semantic gap. We then develop a new deep Boltzmann machine-based multimodal learning model to learn the joint density model from multimodal information in order to derive the missing modality. Experimental results with a large volume of real-world medical images have shown that our new approach is a promising solution for next-generation medical image indexing and retrieval systems.

7. Rafailidis, D., S. Manolopoulou, and P. Daras. "A unified framework for multimodal retrieval". Pattern Recognition 46, no. 12 (December 2013): 3358–70. http://dx.doi.org/10.1016/j.patcog.2013.05.023.

8. Dong, Bin, Songlei Jian, and Kai Lu. "Learning Multimodal Representations by Symmetrically Transferring Local Structures". Symmetry 12, no. 9 (September 13, 2020): 1504. http://dx.doi.org/10.3390/sym12091504.

Annotation:
Multimodal representations play an important role in multimodal learning tasks, including cross-modal retrieval and intra-modal clustering. However, existing multimodal representation learning approaches focus on building one common space by aligning different modalities and ignore the complementary information across the modalities, such as the intra-modal local structures. In other words, they focus only on object-level alignment and ignore structure-level alignment. To tackle this problem, we propose a novel symmetric multimodal representation learning framework, MTLS, that transfers local structures across different modalities. A customized soft metric learning strategy and an iterative parameter learning process are designed to symmetrically transfer local structures and enhance the cluster structures in intra-modal representations. A bidirectional retrieval loss based on multi-layer neural networks is used to align the two modalities. MTLS is instantiated with image and text data and shows superior performance on image-text retrieval and image clustering, outperforming state-of-the-art multimodal learning methods by up to 32% in terms of R@1 on text-image retrieval and 16.4% in terms of AMI on clustering.
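
The bidirectional retrieval loss mentioned here is, in its generic form, a ranking objective applied in both the image-to-text and text-to-image directions. The PyTorch sketch below is a simplified hinge-based variant with in-batch negatives and an assumed margin hyperparameter, not the exact MTLS formulation.

```python
import torch
import torch.nn.functional as F

def bidirectional_hinge_loss(img: torch.Tensor, txt: torch.Tensor,
                             margin: float = 0.2) -> torch.Tensor:
    """Margin ranking loss in both retrieval directions with in-batch negatives.

    img, txt: (batch, dim) embeddings; row i of each is a true pair.
    """
    img = F.normalize(img, dim=-1)
    txt = F.normalize(txt, dim=-1)
    sim = img @ txt.t()                       # (batch, batch) cosine similarities
    pos = sim.diag().unsqueeze(1)             # similarity of the true pairs
    # Penalize negatives that come within `margin` of the positive:
    cost_i2t = (margin + sim - pos).clamp(min=0)      # rank texts for each image
    cost_t2i = (margin + sim - pos.t()).clamp(min=0)  # rank images for each text
    mask = torch.eye(sim.size(0), dtype=torch.bool)   # exclude the positives
    cost_i2t = cost_i2t.masked_fill(mask, 0)
    cost_t2i = cost_t2i.masked_fill(mask, 0)
    return cost_i2t.mean() + cost_t2i.mean()

loss = bidirectional_hinge_loss(torch.randn(8, 128), torch.randn(8, 128))
```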

9. Zhang, Guihao, and Jiangzhong Cao. "Feature Fusion Based on Transformer for Cross-modal Retrieval". Journal of Physics: Conference Series 2558, no. 1 (August 1, 2023): 012012. http://dx.doi.org/10.1088/1742-6596/2558/1/012012.

Annotation:
With the popularity of the Internet and the rapid growth of multimodal data, multimodal retrieval has gradually become a hot area of research. As one of the important branches of multimodal retrieval, image-text retrieval aims to design a model that learns and aligns two modalities, image and text, building a bridge of semantic association between the two heterogeneous data sources so as to achieve unified alignment and retrieval. Mainstream image-text cross-modal retrieval approaches have made good progress by designing deep learning-based models that find potential associations between different modal data. In this paper, we design a transformer-based feature fusion network that fuses the information of the two modalities during feature extraction, which enriches the semantic connection between the modalities. We conduct experiments on the benchmark dataset Flickr30k and obtain competitive results, with recall at 10 reaching 96.2% in image-to-text retrieval.
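
To give an idea of what transformer-based feature fusion can look like, here is a hedged PyTorch sketch of a single cross-attention block in which text tokens attend over image region features; the dimensions and layer layout are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """One fusion block: text queries attend over image region features."""
    def __init__(self, dim: int = 512, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, text_feats, image_feats):
        # text_feats: (batch, n_tokens, dim); image_feats: (batch, n_regions, dim)
        fused, _ = self.attn(query=text_feats, key=image_feats, value=image_feats)
        x = self.norm1(text_feats + fused)   # residual connection around attention
        return self.norm2(x + self.ffn(x))   # residual connection around the FFN

fusion = CrossModalFusion()
out = fusion(torch.randn(2, 16, 512), torch.randn(2, 36, 512))  # -> (2, 16, 512)
```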

10. Kompus, Kristiina, Tom Eichele, Kenneth Hugdahl, and Lars Nyberg. "Multimodal Imaging of Incidental Retrieval: The Low Route to Memory". Journal of Cognitive Neuroscience 23, no. 4 (April 2011): 947–60. http://dx.doi.org/10.1162/jocn.2010.21494.

Annotation:
Memories of past episodes frequently come to mind incidentally, without directed search. It has remained unclear how incidental retrieval processes are initiated in the brain. Here we used fMRI and ERP recordings to find brain activity that specifically correlates with incidental retrieval, as compared to intentional retrieval. Intentional retrieval was associated with increased activation in dorsolateral prefrontal cortex. By contrast, incidental retrieval was associated with a reduced fMRI signal in posterior brain regions, including extrastriate and parahippocampal cortex, and a modulation of a posterior ERP component 170 msec after the onset of visual retrieval cues. Successful retrieval under both intentional and incidental conditions was associated with increased activation in the hippocampus, precuneus, and ventrolateral prefrontal cortex, as well as increased amplitude of the P600 ERP component. These results demonstrate how early bottom–up signals from posterior cortex can lead to reactivation of episodic memories in the absence of strategic retrieval attempts.

Dissertations on the topic "Multimodal retrieval"

1. Adebayo, Kolawole John <1986>. "Multimodal Legal Information Retrieval". Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amsdottorato.unibo.it/8634/1/ADEBAYO-JOHN-tesi.pdf.

Annotation:
The goal of this thesis is to present a multifaceted way of inducing semantic representations from legal documents as well as accessing information in a precise and timely manner. The thesis explores approaches for semantic information retrieval (IR) in the legal context with a technique that maps specific parts of a text to the relevant concept. This technique relies on text segments: it uses Latent Dirichlet Allocation (LDA), a topic modeling algorithm, to perform text segmentation, expands the concept using Natural Language Processing techniques, and then associates the text segments with the concepts using a semi-supervised text similarity technique. This solves two problems, namely user specificity in formulating queries and information overload, since querying a large document collection with a set of concepts is more fine-grained: specific information, rather than full documents, is retrieved. The second part of the thesis describes our Neural Network Relevance Model for E-Discovery Information Retrieval. Our algorithm is essentially a feature-rich ensemble system in which different component neural networks extract different relevance signals. This model has been trained and evaluated on the TREC Legal track 2010 data. The performance of our models across the board proves that they capture the semantics and relatedness between query and document, which is important in the legal information retrieval domain.
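
As an illustration of the LDA building block named above (not the thesis's full segmentation-and-mapping pipeline), the following sketch fits a topic model with scikit-learn on a toy legal corpus and prints the top terms per topic.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy stand-in corpus; a real pipeline would feed in segmented legal documents.
docs = [
    "the contract was breached and damages were awarded",
    "the court dismissed the appeal for lack of jurisdiction",
    "patent infringement claims require evidence of prior art",
    "the licensing agreement covers software patents and royalties",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)                        # document-term count matrix

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)                  # per-document topic mixtures

terms = vec.get_feature_names_out()
for k, comp in enumerate(lda.components_):
    top = [terms[i] for i in comp.argsort()[-5:][::-1]]  # 5 strongest terms
    print(f"topic {k}: {top}")
```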

2. Chen, Jianan. "Deep Learning Based Multimodal Retrieval". Doctoral thesis, INSA Rennes, 2023. http://www.theses.fr/2023ISAR0019.

Annotation:
Multimodal tasks play a crucial role in the progression towards achieving general artificial intelligence (AI). The primary goal of multimodal retrieval is to employ machine learning algorithms to extract relevant semantic information, bridging the gap between different modalities such as visual images, linguistic text, and other data sources. It is worth noting that the information entropy associated with heterogeneous data for the same high-level semantics varies significantly, posing a significant challenge for multimodal models. Deep learning-based multimodal network models provide an effective solution to tackle the difficulties arising from substantial differences in information entropy. These models exhibit impressive accuracy and stability in large-scale cross-modal information matching tasks, such as image-text retrieval. Furthermore, they demonstrate strong transfer learning capabilities, enabling a well-trained model from one multimodal task to be fine-tuned and applied to a new multimodal task, even in scenarios involving few-shot or zero-shot learning. In our research, we develop a novel generative multimodal multi-view database specifically designed for the multimodal referential segmentation task. Additionally, we establish a state-of-the-art (SOTA) benchmark and multi-view metric for referring expression segmentation models in the multimodal domain. The results of our comparative experiments are presented visually, providing clear and comprehensive insights.

3. Böckmann, Christine, Jens Biele, Roland Neuber, and Jenny Niebsch. "Retrieval of multimodal aerosol size distribution by inversion of multiwavelength data". Universität Potsdam, 1997. http://opus.kobv.de/ubp/volltexte/2007/1436/.

Annotation:
The ill-posed problem of determining an aerosol size distribution from a small number of backscatter and extinction measurements was solved successfully with a mollifier method. The method is advantageous because the ill-posed part operates on exactly given quantities, and the points r at which n(r) is evaluated may be freely selected. A new two-dimensional model for the troposphere is also proposed.
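
A toy discretization shows why such inversions need care: the kernel is smooth, so the linear system is ill-conditioned, and some form of regularization is required. The NumPy sketch below uses plain Tikhonov regularization as a stand-in illustration; it is not the mollifier method of the paper, and the kernel and bimodal test distribution are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy forward model: a few measurements of integrals of K(lambda, r) * n(r) dr.
r = np.linspace(0.1, 2.0, 100)                            # radius grid
wavelengths = np.array([0.355, 0.532, 1.064, 0.4, 0.8])   # 5 "channels"
K = np.exp(-np.outer(1.0 / wavelengths, r))               # smooth, ill-conditioned kernel
# Bimodal ("multimodal") test size distribution:
n_true = np.exp(-((r - 0.5) ** 2) / 0.02) + 0.5 * np.exp(-((r - 1.3) ** 2) / 0.05)
y = K @ n_true + 1e-4 * rng.standard_normal(K.shape[0])   # noisy measurements

# Tikhonov-regularized solution: argmin ||K n - y||^2 + alpha ||n||^2
alpha = 1e-3
n_reg = np.linalg.solve(K.T @ K + alpha * np.eye(len(r)), K.T @ y)

print("residual:", np.linalg.norm(K @ n_reg - y))
```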

4. Zhu, Meng. "Cross-modal semantic-associative labelling, indexing and retrieval of multimodal data". Thesis, University of Reading, 2010. http://centaur.reading.ac.uk/24828/.

5. Kahn, Itamar. "Remembering the past: multimodal imaging of cortical contributions to episodic retrieval". Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/33171.

Annotation:
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 2005. This electronic version was submitted by the student author; the certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references.
What is the nature of the neural processes that allow humans to remember past events? The theoretical framework adopted in this thesis builds upon cognitive models that suggest that episodic retrieval can be decomposed into two classes of computations: (1) recovery processes that serve to reactivate stored memories, making information from a past episode readily available, and (2) control processes that serve to guide the retrieval attempt and monitor/evaluate information arising from the recovery processes. A multimodal imaging approach that combined fMRI and MEG was adopted to gain insight into the spatial and temporal brain mechanisms supporting episodic retrieval. Chapter 1 reviews major findings and theories in the episodic retrieval literature, grounding the open questions and controversies within the suggested framework. Chapter 2 describes an fMRI and MEG experiment that identified medial temporal cortical structures that signal item memory strength, thus supporting the perception of item familiarity. Chapter 3 describes an fMRI experiment that demonstrated that retrieval of contextual details involves reactivation of neural patterns engaged at encoding. Further, leveraging this pattern of reactivation, it was demonstrated that false recognition may be accompanied by recollection. The fMRI experiment reported in Chapter 3, when combined with an MEG experiment reported in Chapter 4, directly addressed questions regarding the control processes engaged during episodic retrieval. In particular, Chapter 3 showed that parietal and prefrontal cortices contribute to controlling the act of arriving at a retrieval decision. Chapter 4 then illuminates the temporal characteristics of parietal activation during episodic retrieval, providing novel evidence about the nature of parietal responses and thus constraints on theories of parietal involvement in episodic retrieval. The conducted research targeted distinct aspects of the multi-faceted act of remembering the past. The obtained data contribute to the building of an anatomical and temporal "blueprint" documenting the cascade of neural events that unfold during attempts to remember, as well as when such attempts are met with success or lead to memory errors. In the course of framing this research within the context of cognitive models of retrieval, the obtained neural data reflect back on and constrain these theories of remembering.

6. Nag Chowdhury, Sreyasi. "Text-image synergy for multimodal retrieval and annotation". Doctoral thesis, Saarbrücken: Saarländische Universitäts- und Landesbibliothek, 2021. http://d-nb.info/1240674139/34.

7. Lolich, María, and Susana Azzollini. "Phenomenological retrieval style of autobiographical memories in a sample of major depressed individuals". Pontificia Universidad Católica del Perú, 2016. http://repositorio.pucp.edu.pe/index/handle/123456789/99894.

Annotation:
Autobiographical memory retrieval involves different phenomenological features. Given the lack of previous work in Spanish-speaking populations, 34 in-depth interviews were carried out with individuals with and without Major Depressive Disorder in Buenos Aires, Argentina. Phenomenological components present during the evocation of autobiographical memories were explored, and the data were analyzed qualitatively using Grounded Theory. During the descriptive analysis, seven phenomenological categories emerged from the discourse. The axial and selective analyses revealed two main discursive axes: rhetoric-propositional and specificity-generality. The impact on affective regulation processes, derived from the assumption of an amodal or multimodal style of processing autobiographical information, merits further attention.

8. Valero-Mas, Jose J. "Towards Interactive Multimodal Music Transcription". Doctoral thesis, Universidad de Alicante, 2017. http://hdl.handle.net/10045/71275.

Annotation:
Computer-based music transcription is of vital importance for tasks in the field of Music Information Retrieval, since it yields a symbolic abstraction that encodes the musical content of an audio file. This dissertation studies the problem from a perspective different from the one typically adopted for such tasks, namely an interactive and multimodal one. In this paradigm the user plays a central role, being an active part of the solution process (interactivity); multimodality, in turn, means that different sources of information extracted from the same signal are combined to help solve the task more effectively.

9. Quack, Till. "Large scale mining and retrieval of visual data in a multimodal context". Doctoral thesis, Konstanz: Hartung-Gorre, 2009. http://d-nb.info/993614620/04.

10. Saragiotis, Panagiotis. "Cross-modal classification and retrieval of multimodal data using combinations of neural networks". Thesis, University of Surrey, 2006. http://epubs.surrey.ac.uk/843338/.

Annotation:
Current neurobiological thinking, supported in part by experimentation, stresses the importance of cross-modality. Uni-modal cognitive tasks, language and vision, for example, are performed with the help of many networks working simultaneously or sequentially, and for cross-modal tasks, like picture/object naming and word illustration, the output of these networks is combined to produce higher cognitive behaviour. The notion of multi-net processing is typically used in the pattern recognition literature, where ensemble networks of weak classifiers, typically supervised, appear to outperform strong classifiers. We have built a system, based on combinations of neural networks, that demonstrates how cross-modal classification can be used to retrieve multimodal data using one of the available modalities of information. Two multi-net systems were used in this work: one comprising Kohonen SOMs that interact with each other via a Hebbian network, and a fuzzy ARTMAP network where the interaction is through the embedded map field. The multi-nets were used for the cross-modal retrieval of images given keywords and for finding the appropriate keywords for an image. The systems were trained on two publicly available image databases that had collateral annotations on the images: the Hemera collection, comprising images of pre-segmented single objects, and the Corel collection, with images of multiple objects, were used for automatically generating various sets of input vectors. We have attempted to develop a method for evaluating the performance of multi-net systems using a monolithic network trained on modally-undifferentiated vectors as an intuitive benchmark. To this end, single SOM and fuzzy ART networks were trained using a concatenated visual/linguistic vector to compare the performance of multi-net systems with typical monolithic systems. Both multi-nets outperform the respective monolithic systems in terms of the information retrieval measures of precision and recall on test images drawn from both datasets; the SOM multi-net outperforms the fuzzy ARTMAP both in terms of convergence and precision-recall. The performance of the SOM-based multi-net in retrieval, classification and auto-annotation is on a par with that of state-of-the-art systems like "ALIP" and "Blobworld". Much of the neural network based simulation reported in the literature uses supervised learning algorithms. Such algorithms are suited to cases where classes of objects are predefined and objects in themselves are quite unique in terms of their attributes. We have compared the performance of our multi-net systems with that of a multi-layer perceptron (MLP). The MLP does show substantially greater precision and recall on a (fixed) class of objects when compared with our unsupervised systems. However, when 'lesioned' (the network connectivity deliberately damaged), the multi-net systems show a greater degree of robustness. Cross-modal systems appear to hold considerable intellectual and commercial potential, and the multi-net approach facilitates the simulation of such systems.
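
For readers unfamiliar with the Kohonen SOM building block, here is a hedged NumPy sketch of a tiny self-organizing map trained on toy feature vectors; the thesis's multi-net coupling (the Hebbian network and the ARTMAP map field) is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(42)

def train_som(data, grid=(5, 5), epochs=20, lr0=0.5, sigma0=2.0):
    """Train a small self-organizing map; returns (grid_h, grid_w, dim) weights."""
    h, w = grid
    dim = data.shape[1]
    weights = rng.random((h, w, dim))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            t = step / n_steps
            lr = lr0 * (1 - t)                    # decaying learning rate
            sigma = sigma0 * (1 - t) + 1e-3       # shrinking neighbourhood radius
            # Best-matching unit = grid cell whose weight vector is closest to x:
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(d.argmin(), d.shape)
            # Gaussian neighbourhood pull toward x, centred on the BMU:
            g = np.exp(-np.sum((coords - bmu) ** 2, axis=-1) / (2 * sigma**2))
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights

# Toy "visual/linguistic" vectors; a real system would use image or text features.
data = rng.random((200, 16))
som = train_som(data)
```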

Books on the topic "Multimodal retrieval"

1. Müller, Henning, Oscar Alfonso Jimenez del Toro, Allan Hanbury, Georg Langs, and Antonio Foncubierta Rodriguez, eds. Multimodal Retrieval in the Medical Domain. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-24471-6.

2. Peters, Carol, Valentin Jijkoun, Thomas Mandl, Henning Müller, Douglas W. Oard, Anselmo Peñas, Vivien Petras, and Diana Santos, eds. Advances in Multilingual and Multimodal Information Retrieval. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. http://dx.doi.org/10.1007/978-3-540-85760-0.

3. Kuo, C. C. Jay, ed. Video Content Analysis Using Multimodal Information: For Movie Content Extraction, Indexing, and Representation. Boston, MA: Kluwer Academic Publishers, 2003.

4. Li, Ying. Video Content Analysis Using Multimodal Information: For Movie Content Extraction, Indexing and Representation. Boston, MA: Springer US, 2003.

5. Peters, Carol, ed. Advances in Multilingual and Multimodal Information Retrieval: 8th Workshop of the Cross-Language Evaluation Forum, CLEF 2007, Budapest, Hungary, September 19-21, 2007: Revised Selected Papers. Berlin: Springer, 2008.

6. Forner, Pamela. Multilingual and Multimodal Information Access Evaluation: Second International Conference of the Cross-Language Evaluation Forum, CLEF 2011, Amsterdam, The Netherlands, September 19-22, 2011. Proceedings. Berlin, Heidelberg: Springer-Verlag GmbH Berlin Heidelberg, 2011.

7. Li, Ying. Video Content Analysis Using Multimodal Information: For Movie Content Extraction, Indexing, and Representation. Boston, MA: Kluwer Academic Publishers, 2003.

8. Bouma, Gosse, ed. Interactive Multi-modal Question-Answering. Berlin, Heidelberg: Springer-Verlag Berlin Heidelberg, 2011.

9. Esposito, Anna. Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. Theoretical and Practical Issues: Third COST 2102 International Training School, Caserta, Italy, March 15-19, 2010, Revised Selected Papers. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011.

10. Drygajlo, Andrzej, Anna Esposito, Javier Ortega-Garcia, and Marcos Faúndez-Zanuy, eds. Biometric ID Management and Multimodal Communication: Joint COST 2101 and 2102 International Conference, BioID_MultiComm 2009, Madrid, Spain, September 16-18, 2009. Proceedings. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009.

Book chapters on the topic "Multimodal retrieval"

1. Mihajlović, Vojkan, Milan Petković, Willem Jonker, and Henk Blanken. "Multimodal Content-based Video Retrieval". In Multimedia Retrieval, 271–94. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-72895-5_10.

2. Kitanovski, Ivan, Katarina Trojacanec, Ivica Dimitrovski, and Suzana Loskovska. "Multimodal Medical Image Retrieval". In ICT Innovations 2012, 81–89. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37169-1_8.

3. Pegia, Maria, Björn Þór Jónsson, Anastasia Moumtzidou, Sotiris Diplaris, Ilias Gialampoukidis, Stefanos Vrochidis, and Ioannis Kompatsiaris. "Multimodal 3D Object Retrieval". In MultiMedia Modeling, 188–201. Cham: Springer Nature Switzerland, 2024. http://dx.doi.org/10.1007/978-3-031-53302-0_14.

4. Schedl, Markus, and Peter Knees. "Personalization in Multimodal Music Retrieval". In Adaptive Multimedia Retrieval. Large-Scale Multimedia Retrieval and Evaluation, 58–71. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37425-8_5.

5. Toselli, Alejandro Héctor, Enrique Vidal, and Francisco Casacuberta. "Interactive Image Retrieval". In Multimodal Interactive Pattern Recognition and Applications, 209–26. London: Springer London, 2011. http://dx.doi.org/10.1007/978-0-85729-479-1_11.

6. Chang, Edward Y. "Multimodal Fusion". In Foundations of Large-Scale Multimedia Information Management and Retrieval, 121–40. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-20429-6_6.

7. Zhou, Liting, and Cathal Gurrin. "Multimodal Embedding for Lifelog Retrieval". In MultiMedia Modeling, 416–27. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-98358-1_33.

8. Hendriksen, Mariya. "Multimodal Retrieval in E-Commerce". In Lecture Notes in Computer Science, 505–12. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-030-99739-7_62.

9. Baeza-Yates, Ricardo. "Retrieval Evaluation in Practice". In Multilingual and Multimodal Information Access Evaluation, 2. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-15998-5_2.

10. Natsev, Apostol (Paul). "Multimodal Search for Effective Video Retrieval". In Lecture Notes in Computer Science, 525–28. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11788034_60.

Conference papers on the topic "Multimodal retrieval"

1. Kalpathy-Cramer, Jayashree, and William Hersh. "Multimodal medical image retrieval". In Proceedings of the international conference. New York, NY, USA: ACM Press, 2010. http://dx.doi.org/10.1145/1743384.1743415.

2. Slaney, Malcolm. "Multimodal retrieval and ranking". In Proceedings of the international conference. New York, NY, USA: ACM Press, 2010. http://dx.doi.org/10.1145/1743384.1743426.

3. Lisowska, Agnes. "Multimodal interface design for multimodal meeting content retrieval". In Proceedings of the 6th international conference. New York, NY, USA: ACM Press, 2004. http://dx.doi.org/10.1145/1027933.1028006.

4. Gasser, Ralph, Luca Rossetto, and Heiko Schuldt. "Multimodal Multimedia Retrieval with vitrivr". In ICMR '19: International Conference on Multimedia Retrieval. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3323873.3326921.

5. Agrawal, Rajeev, William Grosky, and Farshad Fotouhi. "Image Retrieval Using Multimodal Keywords". In 2006 8th IEEE International Symposium on Multimedia. IEEE, 2006. http://dx.doi.org/10.1109/ism.2006.91.

6. Alsan, Huseyin Fuat, Ekrem Yildiz, Ege Burak Safdil, Furkan Arslan, and Taner Arsan. "Multimodal Retrieval with Contrastive Pretraining". In 2021 International Conference on INnovations in Intelligent SysTems and Applications (INISTA). IEEE, 2021. http://dx.doi.org/10.1109/inista52262.2021.9548414.

7. Wehrmann, Jonatas, Mauricio A. Lopes, Martin D. More, and Rodrigo C. Barros. "Fast Self-Attentive Multimodal Retrieval". In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018. http://dx.doi.org/10.1109/wacv.2018.00207.

8. Kim, Taeyong, and Bowon Lee. "Multi-Attention Multimodal Sentiment Analysis". In ICMR '20: International Conference on Multimedia Retrieval. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3372278.3390698.

9. Singh, Vivek K., Siripen Pongpaichet, and Ramesh Jain. "Situation Recognition from Multimodal Data". In ICMR '16: International Conference on Multimedia Retrieval. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2911996.2930061.

10. Lin, Yen-Yu, and Chiou-Shann Fuh. "Multimodal kernel learning for image retrieval". In 2010 International Conference on System Science and Engineering (ICSSE). IEEE, 2010. http://dx.doi.org/10.1109/icsse.2010.5551790.