Dissertations / Theses on the topic 'Retrieval models'


Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Retrieval models.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Stathopoulos, Vassilios. "Generative probabilistic models for image retrieval." Thesis, University of Glasgow, 2012. http://theses.gla.ac.uk/3360/.

Full text
Abstract:
Searching for information is a recurring problem that almost everyone has faced at some point. Being in a library looking for a book, searching through newspapers and magazines for an old article, or searching through emails for an old conversation with a colleague are some examples of the searching activity. These are some of the many situations where someone (the "user") has some vague idea of the information he is looking for (an "information need") and is searching through a large number of documents, emails or articles ("information items") to find the item most relevant to his purpose. In this thesis we study the problem of retrieving images from large image archives. We consider two different approaches for image retrieval. The first approach is content-based image retrieval, where the user searches for images using a query image. The second approach is semantic retrieval, where the user expresses his query using keywords. We propose a unified framework that treats both approaches using generative probabilistic models in order to rank and classify images with respect to user queries. The methodology presented in this thesis is evaluated on a real image collection and compared against state-of-the-art methods.
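To make the generative ranking idea concrete, here is a minimal sketch (not taken from the thesis; a diagonal Gaussian stands in for its generative models, and the descriptors are synthetic): each database image is summarised by a generative model fitted to its local descriptors, and images are ranked by the likelihood they assign to the query's descriptors.

```python
# Illustrative sketch: rank database images by the log-likelihood of the query's
# local descriptors under each image's (here: diagonal-Gaussian) generative model.
import numpy as np

def fit_diag_gaussian(feats):
    """Fit a diagonal Gaussian to an (n, d) array of local descriptors."""
    mu = feats.mean(axis=0)
    var = feats.var(axis=0) + 1e-6          # small floor keeps variances positive
    return mu, var

def log_likelihood(query_feats, model):
    """Total log-likelihood of the query descriptors under one image model."""
    mu, var = model
    diff2 = (query_feats - mu) ** 2
    return float(np.sum(-0.5 * (np.log(2 * np.pi * var) + diff2 / var)))

rng = np.random.default_rng(0)
database = {f"img_{i}": rng.normal(loc=i, size=(50, 8)) for i in range(3)}  # toy descriptors
models = {name: fit_diag_gaussian(f) for name, f in database.items()}

query = rng.normal(loc=1, size=(40, 8))      # descriptors of the query image
ranking = sorted(models, key=lambda n: log_likelihood(query, models[n]), reverse=True)
print(ranking)                               # images ordered by query likelihood
```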
APA, Harvard, Vancouver, ISO, and other styles
2

Morgan, Richard. "Component library retrieval using property models." Thesis, Durham University, 1991. http://etheses.dur.ac.uk/6095/.

Full text
Abstract:
The re-use of products such as code, specifications, design decisions and documentation has been proposed as a method for increasing software productivity and reliability. A major problem that has still to be adequately solved is the storage and retrieval of re-usable 'components'. Current methods, such as keyword retrieval and catalogues, rely on the use of names to describe components or categories. This is inadequate for all but a few well-established components and categories; in the majority of cases names do not convey sufficient information on which to base a decision to retrieve. One approach to this problem is to describe components using a formal specification. However, this is impractical for two reasons: firstly, the limitations of theorem proving would severely restrict the complexity of components that could be retrieved; and secondly, the retrieval mechanism would need to have a method of retrieving components with 'similar' specifications. This thesis proposes the use of formal 'property' models to represent the key functionality of components. Retrieval of components can then take place on the basis of a property model produced by the library's users. These models only describe the key properties of a component, thereby making the task of comparing properties feasible. Views are introduced as a method of relating similar, non-identical property models, and the use of these views facilitates the re-use of components with similar properties. The language Miramod has been developed for the purpose of describing components, and a Miramod compiler and property prover, which allow Miramod models to be compared for similarity, have been designed and implemented. These tools have indicated that model-based component library retrieval is feasible at relatively low levels of the programming process, and future work is suggested to extend the method to encompass earlier stages in the development of large systems.
APA, Harvard, Vancouver, ISO, and other styles
3

Vasconcelos, Nuno Miguel Borges de Pinho Cruz de. "Bayesian models for visual information retrieval." Thesis, Massachusetts Institute of Technology, 2000. http://hdl.handle.net/1721.1/62947.

Full text
Abstract:
Thesis (Ph.D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2000.
Includes bibliographical references (leaves 192-208).
This thesis presents a unified solution to visual recognition and learning in the context of visual information retrieval. Realizing that the design of an effective recognition architecture requires careful consideration of the interplay between feature selection, feature representation, and similarity function, we start by searching for a performance criterion that can simultaneously guide the design of all three components. A natural solution is to formulate visual recognition as a decision-theoretic problem, where the goal is to minimize the probability of retrieval error. This leads to a Bayesian architecture that is shown to generalize a significant number of previous recognition approaches, solving some of the most challenging problems faced by them: joint modeling of color and texture, objective guidelines for controlling the trade-off between feature transformation and feature representation, and unified support for local and global queries without requiring image segmentation. The new architecture is shown to perform well on color, texture, and generic image databases, providing a good trade-off between retrieval accuracy, invariance, perceptual relevance of similarity judgments, and complexity. Because all that is needed to perform optimal Bayesian decisions is the ability to evaluate beliefs on the different hypotheses under consideration, a Bayesian architecture is not restricted to visual recognition. On the contrary, it establishes a universal recognition language (the language of probabilities) that provides a computational basis for the integration of information from multiple content sources and modalities. As a result, it becomes possible to build retrieval systems that can simultaneously account for text, audio, video, or any other content modalities. Since the ability to learn follows from the ability to integrate information over time, this language is also conducive to the design of learning algorithms. We show that learning is, indeed, an important asset for visual information retrieval by designing both short- and long-term learning mechanisms. Over short time scales (within a retrieval session), learning is shown to ensure faster convergence to the desired target images. Over long time scales (between retrieval sessions), it allows the retrieval system to tailor itself to the preferences of particular users. In both cases, all the necessary computations are carried out through Bayesian belief propagation algorithms that, although optimal in a decision-theoretic sense, are extremely simple, intuitive, and easy to implement.
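For readers unfamiliar with the decision-theoretic formulation, the standard statement behind "minimize the probability of retrieval error" is the Bayes (MAP) decision rule. The formula below is textbook decision theory rather than the thesis's own notation:

```latex
% Standard Bayes/MAP decision rule: choosing the hypothesis with maximal posterior
% minimises the probability of error (notation is generic, not the thesis's).
\[
  y^{*}(x) \;=\; \arg\max_{y}\, P(y \mid x) \;=\; \arg\max_{y}\, p(x \mid y)\, P(y),
  \qquad
  P(\mathrm{error}) \;=\; 1 - \mathbb{E}_{x}\!\left[\max_{y} P(y \mid x)\right].
\]
```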
by Nuno Miguel Borges de Pinho Cruz de Vasconcelos.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
4

Mensink, Thomas. "Learning Image Classification and Retrieval Models." Thesis, Grenoble, 2012. http://www.theses.fr/2012GRENM113/document.

Full text
Abstract:
We are currently experiencing an exceptional growth of visual data; for example, millions of photos are shared daily on social networks. Image understanding methods aim to facilitate access to this visual data in a semantically meaningful manner. In this dissertation, we define several detailed goals which are of interest for the image understanding tasks of image classification and retrieval, which we address in three main chapters. First, we aim to exploit the multi-modal nature of many databases, wherein documents consist of images with a form of textual description. In order to do so, we define similarities between the visual content of one document and the textual description of another document. These similarities are computed in two steps: first we find the visually similar neighbors in the multi-modal database, and then we use the textual descriptions of these neighbors to define a similarity to the textual description of any document. Second, we introduce a series of structured image classification models, which explicitly encode pairwise label interactions. These models are more expressive than independent label predictors and lead to more accurate predictions, especially in an interactive prediction scenario where a user provides the values of some of the image labels. Such an interactive scenario offers an interesting trade-off between accuracy and manual labeling effort. We explore structured models for multi-label image classification, for attribute-based image classification, and for optimizing specific ranking measures. Finally, we explore k-nearest-neighbor and nearest-class-mean classifiers for large-scale image classification. We propose efficient metric learning methods to improve classification performance, and use these methods to learn on a dataset of more than one million training images from one thousand classes. Since both classification methods allow for the incorporation of classes not seen during training at near-zero cost, we study their generalization performance. We show that the nearest-class-mean classification method can generalize from one thousand to ten thousand classes at negligible cost, and still perform competitively with the state of the art.
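As a small illustration of the nearest-class-mean idea mentioned above, the sketch below represents each class by its mean feature vector, so a new class costs only one mean to add. The metric here is an identity placeholder; the thesis learns it with dedicated metric-learning methods.

```python
# Minimal nearest-class-mean (NCM) sketch: classify by distance to class means.
import numpy as np

def ncm_fit(X, y):
    """Return a dict mapping class label -> mean feature vector."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def ncm_predict(x, means, W=None):
    """Assign x to the class with the closest (optionally W-transformed) mean."""
    W = np.eye(x.shape[0]) if W is None else W
    dists = {c: np.linalg.norm(W @ (x - m)) for c, m in means.items()}
    return min(dists, key=dists.get)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
means = ncm_fit(X, y)
means[2] = rng.normal(6, 1, (5,))            # adding a new class costs one mean vector
print(ncm_predict(rng.normal(3, 1, 5), means))
```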
APA, Harvard, Vancouver, ISO, and other styles
5

Rebai, Ahmed. "Interactive Object Retrieval using Interpretable Visual Models." PhD thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00608467.

Full text
Abstract:
This thesis is an attempt to improve visual object retrieval by allowing users to interact with the system. Our solution lies in constructing an interactive system that allows users to define their own visual concept from a concise set of visual patches given as input. These patches, which represent the most informative clues of a given visual category, are trained beforehand with a supervised learning algorithm in a discriminative manner. Then, and in order to specialize their models, users have the possibility to send their feedback on the model itself by choosing and weighting the patches they are confident of. The real challenge consists in how to generate concise and visually interpretable models. Our contribution relies on two points. First, in contrast to the state-of-the-art approaches that use bag-of-words, we propose embedding local visual features without any quantization, which means that each component of the high-dimensional feature vectors used to describe an image is associated to a unique and precisely localized image patch. Second, we suggest using regularization constraints in the loss function of our classifier to favor sparsity in the models produced. Sparsity is indeed preferable for concision (a reduced number of patches in the model) as well as for decreasing prediction time. To meet these objectives, we developed a multiple-instance learning scheme using a modified version of the BLasso algorithm. BLasso is a boosting-like procedure that behaves in the same way as Lasso (Least Absolute Shrinkage and Selection Operator). It efficiently regularizes the loss function with an additive L1-constraint by alternating between forward and backward steps at each iteration. The method we propose here is generic in the sense that it can be used with any local features or feature sets representing the content of an image region.
APA, Harvard, Vancouver, ISO, and other styles
6

Pérez-Sancho, Carlos. "Stochastic language models for music information retrieval." Doctoral thesis, Universidad de Alicante, 2009. http://hdl.handle.net/10045/14217.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Tam, Kwok Leung. "Indexing and retrieval of 3D articulated geometry models." Thesis, Durham University, 2009. http://etheses.dur.ac.uk/21/.

Full text
Abstract:
In this PhD research study, we focus on building a content-based search engine for 3D articulated geometry models. 3D models are essential components in today's graphics applications and are widely used in the game, animation and movie production industries. With the increasing number of these models, a search engine not only provides an entrance to explore such a huge dataset, it also facilitates sharing and reusing among different users. In general, it reduces the production cost and time to develop these 3D models. Though a lot of retrieval systems have been proposed in recent years, search engines for 3D articulated geometry models are still in their infancy. Among all the works that we have surveyed, reliability and efficiency are the two main issues that hinder the popularity of such systems. In this research, we have focused our attention mainly on addressing these two issues. We have discovered that most existing works design features and matching algorithms in order to reflect the intrinsic properties of these 3D models. For instance, to handle 3D articulated geometry models, it is common to extract skeletons and use graph matching algorithms to compute the similarity. However, since this kind of feature representation is complex, it leads to high complexity in the matching algorithms. As an example, sub-graph isomorphism can be NP-hard for model graph matching. Our solution is based on the understanding that skeletal matching seeks correspondences between the two models being compared. If we can define descriptive features, the correspondence problem can be solved by bag-based matching, where fast algorithms are available. In the first part of the research, we propose a feature extraction algorithm to extract such descriptive features. We then convert the skeletal matching problem into bag-based matching. We further define a metric similarity measure so as to support fast search. We demonstrate the advantages of this idea in our experiments: precision improves by 12% at high recall, and the indexed search of 3D models is 24 times faster than the state of the art if only the first relevant result is returned. However, improving the quality of descriptive features comes at the price of high dimensionality. The curse of dimensionality is a notorious problem in large multimedia databases: computation time scales exponentially as the dimension increases, and indexing techniques may not be useful in such situations. In the second part of the research, we focus on developing an embedding retrieval framework to solve the high-dimensionality problem. We first argue that our proposed matching method projects 3D models onto manifolds. We then use manifold learning techniques to reduce dimensionality and maximize intra-class distances. We further propose a numerical method to sub-sample databases and search them quickly. To preserve retrieval accuracy using fewer landmark objects, we propose an alignment method which is also beneficial to existing works for fast search. The advantages of the retrieval framework are demonstrated in our experiments: it alleviates the curse of dimensionality, and it improves the efficiency (3.4 times faster) and accuracy (30% more accurate) of the matching algorithm proposed above. In the third part of the research, we also study a closely related area, 3D motions. 3D motions are captured by attaching sensors to human beings. These captured data are real human motions that are used to animate 3D articulated geometry models. Creating realistic 3D motions is an expensive and tedious task. Although 3D motions are very different from 3D articulated geometry models, we observe that existing works also suffer from the problem of temporal structure matching, which leads to low efficiency in the matching algorithms. We apply the same idea of bag-based matching to the work on 3D motions. In our experiments, the proposed method shows a 13% improvement in precision at high recall and is 12 times faster than existing works. As a summary, we have developed algorithms for 3D articulated geometry models and 3D motions, covering feature extraction, feature matching, indexing and fast search methods. Through various experiments, our idea of converting restricted matching to bag-based matching improves matching efficiency and reliability. This has been shown for both 3D articulated geometry models and 3D motions. We have also connected 3D matching to the area of manifold learning. The embedding retrieval framework not only improves efficiency and accuracy, but has also opened a new area of research.
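To illustrate the bag-based matching idea in the simplest possible terms, the sketch below compares two "bags" of feature vectors (one per model) with a symmetric nearest-neighbour distance. The thesis's descriptive features and metric similarity measure are more elaborate; the feature bags here are random placeholders.

```python
# Illustrative bag-based matching: rank models by a symmetric nearest-neighbour
# distance between their bags of feature vectors.
import numpy as np

def one_way(a, b):
    """Mean distance from each vector in bag a to its nearest vector in bag b."""
    return float(np.mean([np.min(np.linalg.norm(b - v, axis=1)) for v in a]))

def bag_distance(a, b):
    """Symmetrised bag distance between two (n_i, d) feature bags."""
    return 0.5 * (one_way(a, b) + one_way(b, a))

rng = np.random.default_rng(2)
query = rng.normal(size=(6, 16))                         # bag for the query model
database = {f"model_{i}": rng.normal(loc=i, size=(5 + i, 16)) for i in range(4)}
ranking = sorted(database, key=lambda k: bag_distance(query, database[k]))
print(ranking)                                           # closest articulated models first
```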
APA, Harvard, Vancouver, ISO, and other styles
8

Mjali, Siyabonga Zimozoxolo. "Latent semantic models : a study of probabilistic models for text in information retrieval." Diss., University of Pretoria, 2020. http://hdl.handle.net/2263/73881.

Full text
Abstract:
Large volumes of text are being generated every minute, which necessitates effective and robust tools to retrieve relevant information. Supervised learning approaches have been explored extensively for this task, but it is difficult to secure large collections of labelled data to train this set of models. Since a supervised approach is too expensive in terms of annotating data, we consider unsupervised methods such as topic models and word embeddings in order to represent corpora in lower-dimensional semantic spaces. Furthermore, we investigate different distance measures to capture similarity between indexed documents based on their semantic distributions. These include cosine, soft cosine and Jensen-Shannon similarities. The collection of methods discussed in this work allows for the unsupervised association of semantically similar texts, which has a wide range of applications such as fake news detection, sociolinguistics and sentiment analysis.
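The distance measures named in the abstract are easy to state concretely. The sketch below applies cosine similarity and a Jensen-Shannon-based similarity to toy per-document topic distributions (e.g., from LDA); soft cosine is omitted because it additionally needs a term-similarity matrix.

```python
# Similarity measures between topic distributions: cosine and 1 - Jensen-Shannon divergence.
import numpy as np

def cosine_sim(p, q):
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))

def js_similarity(p, q, eps=1e-12):
    """1 minus the Jensen-Shannon divergence (base 2), so 1.0 means identical."""
    p, q = p + eps, q + eps
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log2(a / b))
    return float(1.0 - 0.5 * (kl(p, m) + kl(q, m)))

doc_a = np.array([0.70, 0.20, 0.10])     # toy topic proportions for two documents
doc_b = np.array([0.60, 0.30, 0.10])
print(cosine_sim(doc_a, doc_b), js_similarity(doc_a, doc_b))
```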
Mini Dissertation (MSc)--University of Pretoria, 2020.
The Hub Internship
Centre for Artificial Intelligence Research
Statistics
MSc (Mathematical Statistics)
Unrestricted
APA, Harvard, Vancouver, ISO, and other styles
9

Belkacem, Thiziri. "Neural models for information retrieval : towards asymmetry sensitive approaches based on attention models." Thesis, Toulouse 3, 2019. http://www.theses.fr/2019TOU30167.

Full text
Abstract:
This work is situated in the context of information retrieval (IR) using machine learning (ML) and deep learning (DL) techniques. It concerns different tasks requiring text matching, such as ad-hoc retrieval, question answering and paraphrase identification. The objective of this thesis is to propose new approaches, using DL methods, to construct semantic-based models for text matching and to overcome the problems of vocabulary mismatch related to the classical bag-of-words (BoW) representations used in traditional IR models. Indeed, traditional text matching methods are based on the BoW representation, which considers a given text as a set of independent words. The process of matching two sequences of text is based on exact matching between words. The main limitation of this approach is the vocabulary mismatch. This problem occurs when the text sequences to be matched do not use the same vocabulary, even if their subjects are related. For example, the query may contain several words that are not necessarily used in the documents of the collection, including the relevant documents. BoW representations ignore several aspects of a text sequence, such as its structure and the context of words. These characteristics are important and make it possible to differentiate between two texts that use the same words but express different information. Another problem in text matching is related to the length of documents. The relevant parts can be distributed in different ways across the documents of a collection. This is especially true in long documents, which tend to cover a large number of topics and include a variable vocabulary. A long document could thus contain several relevant passages that a matching model must capture. Unlike long documents, short documents are likely to concern a specific subject and tend to contain a more restricted vocabulary. Assessing their relevance is in principle simpler than assessing that of longer documents. In this thesis, we have proposed different contributions, each addressing one of the above-mentioned issues. First, in order to solve the problem of vocabulary mismatch, we used distributed representations of words (word embeddings) to allow semantic-based matching between different words. These representations have been used in IR applications where document/query similarity is computed by comparing all the term vectors of the query with all the term vectors of the document, indiscriminately. Unlike the models proposed in the state of the art, we studied the impact of query terms regarding their presence or absence in a document, and we adopted different document/query matching strategies. The intuition is that the absence of query terms from the relevant documents is in itself a useful signal to take into account in the matching process. Indeed, these terms do not appear in documents of the collection for two possible reasons: either their synonyms have been used, or they are not part of the context of the documents in question. The methods we have proposed make it possible, on the one hand, to perform an inexact matching between the document and the query and, on the other hand, to evaluate the impact of the different terms of a query in the matching process. Although the use of word embeddings allows semantic-based matching between different text sequences, these representations combined with classical matching models still consider the text as a list of independent elements (a bag of vectors instead of a bag of words). However, the structure of the text and the order of the words are important; any change in the structure of the text and/or the order of the words alters the information expressed. To address this problem, neural models were used in text matching.
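The embedding-based soft matching described here can be sketched very simply: exactly matched query terms score directly, while absent query terms contribute through their best cosine similarity to the document's terms. The 3-d "embeddings" below are toy placeholders, not real word vectors, and the scoring rule is illustrative rather than the thesis's actual model.

```python
# Illustrative semantic matching with explicit presence/absence handling of query terms.
import numpy as np

emb = {                                    # hypothetical word embeddings
    "car":    np.array([0.9, 0.1, 0.0]),
    "auto":   np.array([0.8, 0.2, 0.1]),
    "repair": np.array([0.1, 0.9, 0.2]),
    "fix":    np.array([0.2, 0.8, 0.3]),
}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score(query_terms, doc_terms):
    total = 0.0
    for q in query_terms:
        if q in doc_terms:                 # present term: exact match
            total += 1.0
        else:                              # absent term: best semantic match instead
            total += max(cos(emb[q], emb[d]) for d in doc_terms)
    return total / len(query_terms)

print(score(["car", "repair"], ["auto", "fix"]))   # high despite zero exact overlap
```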
APA, Harvard, Vancouver, ISO, and other styles
10

Wessel, Raoul [Verfasser]. "Shape Retrieval Methods for Architectural 3D Models / Raoul Wessel." Bonn : Universitäts- und Landesbibliothek Bonn, 2014. http://d-nb.info/1048091503/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Hörster, Eva. "Topic models for image retrieval on large-scale databases." Kostenfrei, 2009. http://d-nb.info/998079553/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Levy, Mark. "Retrieval and annotation of music using latent semantic models." Thesis, Queen Mary, University of London, 2012. http://qmro.qmul.ac.uk/xmlui/handle/123456789/2969.

Full text
Abstract:
This thesis investigates the use of latent semantic models for annotation and retrieval from collections of musical audio tracks. In particular latent semantic analysis (LSA) and aspect models (or probabilistic latent semantic analysis, pLSA) are used to index words in descriptions of music drawn from hundreds of thousands of social tags. A new discrete audio feature representation is introduced to encode musical characteristics of automatically-identified regions of interest within each track, using a vocabulary of audio muswords. Finally a joint aspect model is developed that can learn from both tagged and untagged tracks by indexing both conventional words and muswords. This model is used as the basis of a music search system that supports query by example and by keyword, and of a simple probabilistic machine annotation system. The models are evaluated by their performance in a variety of realistic retrieval and annotation tasks, motivated by applications including playlist generation, internet radio streaming, music recommendation and catalogue search.
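As a small illustration of the latent semantic analysis component, the sketch below factorises a toy tag-by-track count matrix with a truncated SVD and compares tracks in the latent space. The thesis's aspect models (pLSA) and audio "muswords" are not reproduced here.

```python
# Illustrative LSA: truncated SVD of a tag-by-track matrix, then cosine similarity.
import numpy as np

tags = ["rock", "guitar", "mellow", "piano"]
X = np.array([[5, 0, 1],      # rows: tags, columns: three tracks
              [4, 0, 2],
              [0, 3, 4],
              [0, 4, 3]], dtype=float)

U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
track_vecs = (np.diag(s[:k]) @ Vt[:k]).T          # each track as a k-dim latent vector

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(track_vecs[1], track_vecs[2]))        # tracks 2 and 3 share "mellow piano" tags
print(cosine(track_vecs[0], track_vecs[2]))        # track 1 is the rock/guitar outlier
```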
APA, Harvard, Vancouver, ISO, and other styles
13

Osodo, Jennifer Akinyi. "An extended vector-based information retrieval system to retrieve e-learning content based on learner models." Thesis, University of Sunderland, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.542053.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Rao, Ashwani Pratap. "Statistical information retrieval models: Experiments, evaluation on real time data." Thesis, University of Delaware, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=1567821.

Full text
Abstract:

We are all aware of the rise of the information age: heterogeneous sources of information and the ability to publish rapidly and indiscriminately are responsible for information chaos. In this work, we are interested in a system which can separate the "wheat" of vital information from the chaff within this information chaos. An efficient filtering system can accelerate meaningful utilization of knowledge. Consider Wikipedia, an example of community-driven knowledge synthesis. Facts about topics on Wikipedia are continuously being updated by users interested in a particular topic. Consider an automatic system (or an invisible robot) to which a topic such as "President of the United States" can be fed. This system will work ceaselessly, filtering new information created on the web in order to provide the small set of documents about the "President of the United States" that are vital to keeping the Wikipedia page relevant and up-to-date. In this work, we present an automatic information filtering system for this task. While building such a system, we have encountered issues related to scalability, retrieval algorithms, and system evaluation; we describe our efforts to understand and overcome these issues.

APA, Harvard, Vancouver, ISO, and other styles
15

Amati, Giambattista. "Probability models for information retrieval based on divergence from randomness." Thesis, University of Glasgow, 2003. http://theses.gla.ac.uk/1570/.

Full text
Abstract:
This thesis devises a novel methodology based on probability theory, suitable for the construction of term-weighting models of Information Retrieval. Our term-weighting functions are created within a general framework made up of three components. Each of the three components is built independently from the others. We obtain the term-weighting functions from the general model in a purely theoretical way, instantiating each component with different probability distribution forms. The thesis begins by investigating the nature of the statistical inference involved in Information Retrieval. We explore the estimation problem underlying the process of sampling. De Finetti's theorem is used to show how to convert the frequentist approach into Bayesian inference, and we present and employ the derived estimation techniques in the context of Information Retrieval. We initially pay close attention to the construction of the basic sample spaces of Information Retrieval. The notion of single or multiple sampling from different populations in the context of Information Retrieval is extensively discussed and used throughout the thesis. The language modelling approach and the standard probabilistic model are studied under the same foundational view and are experimentally compared to the divergence-from-randomness approach. In revisiting the main information retrieval models in the literature, we show that even the language modelling approach can be exploited to assign term-frequency normalization to the models of divergence from randomness. We finally introduce a novel framework for query expansion. This framework is based on the models of divergence from randomness, and it can be applied to arbitrary models of IR, including divergence-based, language modelling and probabilistic models. We have carried out a very large number of experiments, and the results show that the framework generates highly effective Information Retrieval models.
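For a flavour of how such term weights look in practice, the sketch below computes one commonly quoted divergence-from-randomness-style weight (an InL2-like instantiation: an inverse-document-frequency randomness model, Laplace first normalisation, and logarithmic term-frequency normalisation). Treat the exact formula and constants as an illustrative assumption rather than the thesis's definition; the framework instantiates many different components.

```python
# Hedged sketch of an InL2-style divergence-from-randomness term weight.
import math

def dfr_inl2_weight(tf, doc_len, avg_doc_len, n_docs, doc_freq, c=1.0):
    """Informative content of the term occurrence, damped by a risk normalisation."""
    tfn = tf * math.log2(1.0 + c * avg_doc_len / doc_len)      # tf normalisation
    inf1 = tfn * math.log2((n_docs + 1.0) / (doc_freq + 0.5))  # randomness model I(n)
    inf2 = 1.0 / (tfn + 1.0)                                   # Laplace (L) normalisation
    return inf2 * inf1

print(dfr_inl2_weight(tf=3, doc_len=120, avg_doc_len=200, n_docs=10_000, doc_freq=15))
```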
APA, Harvard, Vancouver, ISO, and other styles
16

Zhou, Xiaohua Hu Xiaohua. "Semantics-based language models for information retrieval and text mining /." Philadelphia, Pa. : Drexel University, 2008. http://hdl.handle.net/1860/2931.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Wang, Jun. "Probabilistic retrieval models : relationships, context-specific application, selection and implementation." Thesis, Queen Mary, University of London, 2011. http://qmro.qmul.ac.uk/xmlui/handle/123456789/655.

Full text
Abstract:
Retrieval models are the core components of information retrieval systems, which guide the document and query representations as well as the document ranking schemes. TF-IDF, the binary independence retrieval (BIR) model and language modelling (LM) are three of the most influential contemporary models due to their stability and performance. The BIR model and LM have probabilistic theory as their basis, whereas TF-IDF is viewed as a heuristic model whose theoretical justification always fascinates researchers. This thesis firstly investigates the parallel derivation of the BIR model, LM and the Poisson model with respect to event spaces, relevance assumptions and ranking rationales. It establishes a bridge between the BIR model and LM, and derives TF-IDF from the probabilistic framework. Then, the thesis presents the probabilistic logical modelling of the retrieval models. Various ways of estimating and aggregating probability, and alternative implementations of non-probabilistic operators, are demonstrated. Typical models have been implemented. The next contribution concerns the usage of context-specific frequencies, i.e., frequencies counted based on assorted element types or within different text scopes. The hypothesis is that they can help to rank the elements in structured document retrieval. The thesis applies context-specific frequencies to the term weighting schemes in these models, and the outcome is a generalised retrieval model with regard to both element and document ranking. The retrieval models behave differently on the same query set: for some queries, one model performs better; for other queries, another model is superior. Therefore, one idea to improve the overall performance of a retrieval system is to choose for each query the model that is likely to perform best. This thesis proposes and empirically explores a model selection method based on the correlation between query features and query performance, which contributes to the methodology of dynamically choosing a model. In summary, this thesis contributes a study of probabilistic models and their relationships, the probabilistic logical modelling of retrieval models, the usage and effect of context-specific frequencies in these models, and the selection of retrieval models.
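The sketch below contrasts two of the models discussed (TF-IDF and a Dirichlet-smoothed language model) on a toy two-document collection, and adds a placeholder per-query selection rule. The actual selection method in the thesis correlates query features with per-model performance; the length-based rule here is only an illustrative stand-in.

```python
# Illustrative scoring with TF-IDF and a Dirichlet-smoothed LM, plus a toy model selector.
import math
from collections import Counter

docs = {"d1": "probabilistic retrieval models for structured documents".split(),
        "d2": "language models and term weighting in retrieval".split()}
collection = Counter(w for d in docs.values() for w in d)
coll_len = sum(collection.values())

def tfidf_score(query, doc):
    tf = Counter(doc)
    df = lambda w: sum(w in d for d in docs.values())
    return sum(tf[w] * math.log((len(docs) + 1) / (df(w) + 1)) for w in query)

def lm_dirichlet_score(query, doc, mu=2000):
    tf = Counter(doc)
    p_coll = lambda w: collection[w] / coll_len
    return sum(math.log((tf[w] + mu * p_coll(w)) / (len(doc) + mu))
               for w in query if p_coll(w) > 0)

def choose_model(query):
    # hypothetical selection rule: short keyword queries -> TF-IDF, longer -> LM
    return tfidf_score if len(query) <= 2 else lm_dirichlet_score

q = "retrieval models".split()
scorer = choose_model(q)
print(sorted(docs, key=lambda d: scorer(q, docs[d]), reverse=True))
```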
APA, Harvard, Vancouver, ISO, and other styles
18

Teevan, Jaime B. (Jaime Brooks) 1976. "Improving information retrieval with textual analysis : Bayesian models and beyond." Thesis, Massachusetts Institute of Technology, 2001. http://hdl.handle.net/1721.1/86759.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Conser, Erik Timothy. "Improved Scoring Models for Semantic Image Retrieval Using Scene Graphs." PDXScholar, 2017. https://pdxscholar.library.pdx.edu/open_access_etds/3879.

Full text
Abstract:
Image retrieval via a structured query is explored in Johnson, et al. [7]. The query is structured as a scene graph and a graphical model is generated from the scene graph's object, attribute, and relationship structure. Inference is performed on the graphical model with candidate images and the energy results are used to rank the best matches. In [7], scene graph objects that are not in the set of recognized objects are not represented in the graphical model. This work proposes and tests two approaches for modeling the unrecognized objects in order to leverage the attribute and relationship models to improve image retrieval performance.
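A highly simplified view of this kind of scoring is sketched below: a structured query lists objects and relationships, each candidate image supplies per-object detection scores and per-relationship compatibility scores, and images are ranked by the summed score (the negative of an energy). The probabilistic inference machinery of the underlying work is not reproduced, and all names and scores are hypothetical.

```python
# Illustrative scene-graph scoring: sum unary (object) and pairwise (relationship) scores.
def image_score(query_objects, query_relations, detections, relation_scores):
    unary = sum(detections.get(obj, 0.0) for obj in query_objects)
    pairwise = sum(relation_scores.get(rel, 0.0) for rel in query_relations)
    return unary + pairwise          # higher score = lower energy = better match

query_objects = ["man", "horse"]
query_relations = [("man", "riding", "horse")]

candidates = {
    "img_a": ({"man": 0.9, "horse": 0.8}, {("man", "riding", "horse"): 0.7}),
    "img_b": ({"man": 0.95},              {}),
}
ranking = sorted(candidates,
                 key=lambda k: image_score(query_objects, query_relations, *candidates[k]),
                 reverse=True)
print(ranking)    # img_a matches both objects and the relationship
```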
APA, Harvard, Vancouver, ISO, and other styles
20

Zarrinkoub, Sahand. "Transfer Learning in Deep Structured Semantic Models for Information Retrieval." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-286310.

Full text
Abstract:
Recent approaches to IR include neural networks that generate query and document vector representations. The representations are used as the basis for document retrieval and are able to encode semantic features if trained on large datasets, an ability that sets them apart from classical IR approaches such as TF-IDF. However, the datasets necessary to train these networks are not available to the owners of most search services used today, since they are not used by enough users. Thus, methods for enabling the use of neural IR models in data-poor environments are of interest. In this work, a bag-of-trigrams neural IR architecture is used in a transfer learning procedure in an attempt to increase performance on a target dataset by pre-training on external datasets. The target dataset used is WikiQA, and the external datasets are Quora's Question Pairs, Reuters' RCV1 and SQuAD. When considering individual model performance, pre-training on Question Pairs and fine-tuning on WikiQA gives us the best individual models. However, when considering average performance, pre-training on the chosen external datasets results in lower performance on the target dataset, both when all datasets are used together and when they are used individually, with different average performance depending on the external dataset used. On average, pre-training on RCV1 and Question Pairs gives the lowest and highest average performance respectively, when considering only the pre-trained networks. Surprisingly, the performance of an untrained, randomly generated network is high, and beats the performance of all pre-trained networks on average. The best performing model on average is a neural IR model trained on the target dataset without prior pre-training.
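The bag-of-trigrams representation used by such DSSM-style models is easy to illustrate: words are padded, split into letter trigrams, hashed into a sparse vector, and query/document vectors are compared by cosine similarity. The neural layers and the transfer-learning procedure from the thesis are not reproduced in this sketch.

```python
# Illustrative bag-of-character-trigrams representation and cosine matching.
import numpy as np

def trigram_vector(text, dim=2000):
    vec = np.zeros(dim)
    for word in text.lower().split():
        padded = f"#{word}#"
        for i in range(len(padded) - 2):
            vec[hash(padded[i:i + 3]) % dim] += 1.0   # hashed trigram counts
    return vec

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

q = trigram_vector("who wrote the iliad")
d1 = trigram_vector("the iliad is an ancient greek epic poem attributed to homer")
d2 = trigram_vector("database indexing strategies for key value stores")
print(cosine(q, d1), cosine(q, d2))          # d1 should score higher than d2
```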
APA, Harvard, Vancouver, ISO, and other styles
21

Xu, Zhe. "Expertise Retrieval in Enterprise Microblogs with Enhanced Models and Brokers." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1399000265.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Deng, Jie. "Emotion-based music retrieval and recommendation." HKBU Institutional Repository, 2014. https://repository.hkbu.edu.hk/etd_oa/82.

Full text
Abstract:
The digital music industry has expanded dramatically during the past decades, which results in the generation of enormous amounts of music data. Along with the Internet, the growing volume of quantitative data about users (e.g., users' behaviors and preferences) can be easily collected nowadays. All these factors have the potential to produce big data in the music industry. By utilizing big data analysis of music-related data, music can be better understood semantically (e.g., genres and emotions), and the user's high-level needs such as automatic recognition and annotation can be satisfied. For example, many commercial music companies such as Pandora, Spotify, and Last.fm have already attempted to use big data and machine learning related techniques to drastically alter music search and discovery. According to musicology and psychology theories, music can reflect our heart and soul, while emotion is the core component of music that expresses the complex and conscious experience. However, there is insufficient research in this field. Consequently, due to the impact of emotion conveyed by music, retrieval and discovery of useful music information at the emotion level from big music data are extremely important. Over the past decades, researchers have made great strides in automated systems for music retrieval and recommendation. Music is a temporal art involving specific emotion expression. But while it is easy for human beings to recognize emotions expressed by music, it is still a challenge for automated systems to recognize them. Although some significant emotion models (e.g., Hevner's adjective circle, the Arousal-Valence model, the Pleasure-Arousal-Dominance model) established upon the discrete emotion theory and dimensional emotion theory have been widely adopted in the field of emotion research, they still suffer from limitations due to scalability and specificity in the music domain. As a result, the effectiveness and availability of music retrieval and recommendation at the emotion level are still unsatisfactory. This thesis makes contributions at the theoretical, technical, and empirical levels. First of all, a hybrid musical emotion model named "Resonance-Arousal-Valence (RAV)" is proposed and constructed; it explores the computational and time-varying expressions of musical emotions. Furthermore, based on the RAV musical emotion model, a joint emotion space model (JESM) combining musical audio features and emotion tag features is constructed. Second, corresponding to static and time-varying musical emotion representations, two methods of music retrieval at the emotion level are designed: (1) a unified framework for music retrieval in the joint emotion space; (2) dynamic time warping (DTW) for music retrieval using time-varying music emotions. Furthermore, automatic music emotion annotation and segmentation are naturally conducted. Third, following the theory of affective computing (e.g., emotion intensity decay and emotion state transition), an intelligent affective system for music recommendation is designed, where conditional random fields (CRF) are applied to predict the listener's dynamic emotion state based on his or her personal historical music listening list in a session. Finally, the experimental dataset is created and the proposed systems are implemented. Empirical results (recognition, retrieval, and recommendation) regarding accuracy compared to previous techniques are also presented, which demonstrate that the proposed methods enable an advanced degree of effectiveness of emotion-based music retrieval and recommendation.
Keywords: Music and emotion, Music information retrieval, Music emotion recognition, Annotation and retrieval, Music recommendation, Affective computing, Time series analysis, Acoustic features, Ranking, Multi-objective optimization
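Dynamic time warping, mentioned above for retrieval over time-varying emotions, can be sketched in a few lines: two emotion curves of different lengths are aligned by the classic dynamic-programming recurrence. The 1-d arousal series below are purely illustrative; the thesis works in a richer Resonance-Arousal-Valence space.

```python
# Classic DTW distance between two time-varying emotion curves.
import numpy as np

def dtw_distance(a, b):
    """O(len(a)*len(b)) DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

query_arousal = [0.2, 0.4, 0.8, 0.9, 0.5]          # builds up, then relaxes
track_a       = [0.1, 0.3, 0.7, 0.9, 0.9, 0.4]     # similar shape, different length
track_b       = [0.9, 0.8, 0.3, 0.2, 0.1]          # opposite trajectory
print(dtw_distance(query_arousal, track_a), dtw_distance(query_arousal, track_b))
```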
APA, Harvard, Vancouver, ISO, and other styles
23

Draeger, Marco. "Use of probabilistic topic models for search." Thesis, Monterey, California : Naval Postgraduate School, 2009. http://edocs.nps.edu/npspubs/scholarly/theses/2009/Sep/09Sep_Draeger.pdf.

Full text
Abstract:
Thesis (M.S. in Operations Research)--Naval Postgraduate School, September 2009.
Thesis Advisor(s): Squire, Kevin M. "September 2009." Description based on title screen as viewed on November 5, 2009. Author(s) subject terms: Document modeling, information retrieval, semantic search, Bayesian nonparametric methods, hierarchical Bayes. Includes bibliographical references (p. 67-71). Also available in print.
APA, Harvard, Vancouver, ISO, and other styles
24

Park, Byung Chun. "Analytical models and optimal strategies for automated storage/retrieval system operations." Diss., Georgia Institute of Technology, 1991. http://hdl.handle.net/1853/24568.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Beecks, Christian [Verfasser]. "Distance-based similarity models for content-based multimedia retrieval / Christian Beecks." Aachen : Hochschulbibliothek der Rheinisch-Westfälischen Technischen Hochschule Aachen, 2013. http://d-nb.info/1046647245/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Jonsson, Therese. "Data Retrieval Strategy for Modern Database Models in a Serverless Architecture." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279950.

Full text
Abstract:
The rising presence of social media platforms shapes modern system architectures to handle performance and scale. As user volumes increase, the need to keep data consistent and available at storage locations across the world adds complexity to distributed systems. The startup Leader Island has developed a communication platform for companies and organizations, using Amazon Web Services (AWS) as the cloud provider to overcome operational concerns. Within their platform, users share various content and interact with each other. Data retrieval is an essential component of the platform, as users should get various feeds and be able to search for specific content. For this functionality, Leader Island uses a combination of AWS Elasticsearch Service for data retrieval and Amazon DynamoDB for persistent storage. However, this setup has posed challenges within data models, mappings, and retrieval strategies. This project aims to find best practices for data models and mappings within both instances and to investigate new data retrieval strategies to optimize the latency of data retrieval. For this, three designs were configured with opposing models and mappings. The results of each design were collected and measured against each other. Mainly, a design where data retrieval was distributed over both Elasticsearch and DynamoDB, incorporating DynamoDB best practices and a reduced data volume propagated to Elasticsearch, outperformed the initial platform design by a factor of 1.9 in terms of platform latency.
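A hedged sketch of the kind of distributed retrieval strategy described above: Elasticsearch answers the search query with matching item ids only, and the full items are then fetched from DynamoDB in one batch. Index, table and field names here are hypothetical, and the calls assume an elasticsearch-py 7.x-style search with a body and a boto3 DynamoDB resource; adjust to the client versions actually in use.

```python
# Illustrative two-step retrieval: Elasticsearch for ids, DynamoDB for the full items.
import boto3
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
dynamodb = boto3.resource("dynamodb", region_name="eu-west-1")

def search_posts(text, size=20):
    # 1) Elasticsearch holds only the searchable fields; we ask for ids, not documents.
    res = es.search(index="posts", body={
        "query": {"match": {"content": text}},
        "_source": False,
        "size": size,
    })
    ids = [hit["_id"] for hit in res["hits"]["hits"]]
    if not ids:
        return []
    # 2) DynamoDB remains the source of truth; one batch read returns the full items.
    batch = dynamodb.batch_get_item(
        RequestItems={"PostsTable": {"Keys": [{"postId": i} for i in ids]}}
    )
    return batch["Responses"].get("PostsTable", [])

print(len(search_posts("quarterly report")))
```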
APA, Harvard, Vancouver, ISO, and other styles
27

Doan, Khoa Dang. "Generative models meet similarity search: efficient, heuristic-free and robust retrieval." Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/105052.

Full text
Abstract:
The rapid growth of digital data, especially visual and textual contents, brings many challenges to the problem of finding similar data. Exact similarity search, which aims to exhaustively find all relevant items through a linear scan in a dataset, is impractical due to its high computational complexity. Approximate-nearest-neighbor (ANN) search methods, especially the learning-to-hash or hashing methods, provide principled approaches that balance the trade-offs between the quality of the guesses and the computational cost for web-scale databases. In this era of data explosion, it is crucial for the hashing methods to be both computationally efficient and robust to various scenarios such as when the application has noisy data or data that slightly changes over time (i.e., out-of-distribution). This thesis focuses on the development of practical generative learning-to-hash methods and explainable retrieval models. We first identify and discuss the various aspects where the framework of generative modeling can be used to improve the model designs and generalization of the hashing methods. Then we show that these generative hashing methods similarly enjoy several appealing empirical and theoretical properties of generative modeling. Specifically, the proposed generative hashing models generalize better with important properties such as low-sample requirement, and out-of-distribution and data-corruption robustness. Finally, in domains with structured data such as graphs, we show that the computational methods in generative modeling have an interesting utility beyond estimating the data distribution and describe a retrieval framework that can explain its decision by borrowing the algorithmic ideas developed in these methods. Two subsets of generative hashing methods and a subset of explainable retrieval methods are proposed. For the first hashing subset, we propose a novel adversarial framework that can be easily adapted to a new problem domain and three training algorithms that learn the hash functions without several hyperparameters commonly found in the previous hashing methods. The contributions of our work include: (1) Propose novel algorithms, which are based on adversarial learning, to learn the hash functions; (2) Design computationally efficient Wasserstein-related adversarial approaches which have low computational and sample complexity; (3) Conduct extensive experiments on several benchmark datasets in various domains, including computational advertising, and text and image retrieval, for performance evaluation. For the second hashing subset, we propose energy-based hashing solutions which can improve the generalization and robustness of existing hashing approaches. The contributions of our work for this task include: (1) Propose data-synthesis solutions to improve the generalization of existing hashing methods; (2) Propose energy-based hashing solutions which exhibit better robustness against out-of-distribution and corrupted data; (3) Conduct extensive experiments for performance evaluations on several benchmark datasets in the image retrieval domain. Finally, for the last subset of explainable retrieval methods, we propose an optimal alignment algorithm that achieves a better similarity approximation for a pair of structured objects, such as graphs, while capturing the alignment between the nodes of the graphs to explain the similarity calculation.
The contributions of our work for this task include: (1) Propose a novel optimal alignment algorithm for comparing two sets of bag-of-vectors embeddings; (2) Propose a differentiable computation to learn the parameters of the proposed optimal alignment model; (3) Conduct extensive experiments, for performance evaluation of both the similarity approximation task and the retrieval task, on several benchmark graph datasets.
Doctor of Philosophy
Searching for similar items, or similarity search, is one of the fundamental tasks in this information age, especially when there is a rapid growth of visual and textual contents. For example, in a search engine such as Google, a user searches for images with similar content to a referenced image; in online advertising, an advertiser finds new users, and eventually targets these users with advertisements, where the new users have similar profiles to some referenced users who have previously responded positively to the same or similar advertisements; in the chemical domain, scientists search for proteins with a similar structure to a referenced protein. The practical search applications in these domains often face several challenges, especially when these datasets or databases contain a large number (e.g., millions or even billions) of complex-structured items (e.g., texts, images, and graphs). These challenges can be organized into three central themes: search efficiency (the economical use of resources such as computation and time), model-design effort (the ease of building the search model), and explainability. In particular, it is increasingly a requirement of a search model to possess the ability to explain its search results, especially in the scientific domains where the items are structured objects such as graphs. This dissertation tackles the aforementioned challenges in practical search applications by using computational techniques that learn to generate data. First, we overcome the need to scan the entire large dataset for similar items by considering an approximate similarity search technique called hashing. Then, we propose an unsupervised hashing framework that learns the hash functions with simpler objective functions directly from raw data. The proposed retrieval framework can be easily adapted to new domains with significantly lower effort in model design. When labeled data is available but limited (which is a common scenario in practical search applications), we propose a hashing network that can synthesize additional data to improve the hash function learning process. The learned model also exhibits significant robustness against data corruption and slight changes in the underlying data. Finally, in domains with structured data such as graphs, we propose a computational approach that can simultaneously estimate the similarity of structured objects, such as graphs, and capture the alignment between their substructures, e.g., nodes. The alignment mechanism can help explain why two objects are similar or dissimilar. This is a useful tool for domain experts who not only want to search for similar items but also want to understand how the search model makes its predictions.
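The basic mechanics of hashing-based approximate search can be sketched very simply: items are mapped to short binary codes, and retrieval ranks the database by Hamming distance to the query code. Here a thresholded random projection stands in for the learned hash function; the thesis's adversarial and energy-based training is not reproduced.

```python
# Illustrative learning-to-hash-style retrieval: binary codes + Hamming ranking.
import numpy as np

rng = np.random.default_rng(3)
d, n_bits = 64, 16
W = rng.normal(size=(d, n_bits))                 # placeholder for a learned projection

def hash_code(x):
    """Binary code: sign pattern of the projected feature vector(s)."""
    return (x @ W > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

database = rng.normal(size=(1000, d))
codes = hash_code(database)

query = database[42] + 0.05 * rng.normal(size=d)  # a noisy copy of item 42
qcode = hash_code(query)
ranking = np.argsort([hamming(qcode, c) for c in codes])
print(ranking[:5])                                # item 42 should appear near the top
```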
APA, Harvard, Vancouver, ISO, and other styles
28

Abdulahhad, Karam. "Information retrieval modeling by logic and lattice : application to conceptual information retrieval." Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM014/document.

Full text
Abstract:
This thesis is situated in the context of logic-based Information Retrieval (IR) models. The work presented in this thesis is mainly motivated by the inadequate term-independence assumption, which is widely accepted in IR although terms are normally related, and also by the inferential nature of the relevance judgment process. Since formal logics are well adapted for knowledge representation, and thus for representing relations between terms, and since formal logics are also powerful systems for inference, logic-based IR forms a promising avenue for building effective IR systems. However, a study of current logic-based IR models shows that these models generally have some shortcomings. First, logic-based IR models normally propose complex representations of documents and queries that are hard to obtain automatically. Second, the retrieval decision d->q, which represents the matching between a document d and a query q, can be difficult to verify. Finally, the uncertainty measure U(d->q) is either ad hoc or hard to implement. In this thesis, we propose a new logic-based IR model to overcome most of these limits. We use Propositional Logic (PL) as the underlying logical framework. We represent documents and queries as logical sentences written in Disjunctive Normal Form. We also argue that the retrieval decision d->q can be replaced by the validity of material implication. We then exploit the relation between PL and lattice theory to check whether d->q is valid. We first propose an intermediate representation of logical sentences, where they become nodes in a lattice whose partial order relation is equivalent to the validity of material implication. Accordingly, we transform the checking of the validity of d->q, which is a computationally intensive task, into a series of simple set-inclusion checks. To measure the uncertainty of the retrieval decision U(d->q), we use the degree-of-inclusion function Z, which is capable of quantifying partial order relations defined on lattices. Finally, our model is capable of working efficiently on any logical sentence without restriction, and is applicable to large-scale data. Our model also yields some theoretical conclusions, including formalizing and showing the adequacy of van Rijsbergen's assumption about estimating the logical uncertainty U(d->q) through the conditional probability P(q|d), redefining the notions of Exhaustivity and Specificity, and the possibility of reproducing most classical IR models as instances of our model. We build three operational instances of our model: one to study the importance of Exhaustivity and Specificity, and two others to show the inadequacy of the term-independence assumption. Our experimental results show a worthwhile gain in performance when integrating Exhaustivity and Specificity into one concrete IR model. However, the results of using semantic relations between terms were not sufficient to draw clear conclusions; on the other hand, experiments on exploiting structural relations between terms were promising. The work presented in this thesis can be extended either by further experiments, especially on the use of relations, or by a more in-depth theoretical study, especially of the properties of the Z function.
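As a rough illustration of the set-inclusion view of d->q described above, documents and queries in Disjunctive Normal Form can be treated as sets of conjunctive clauses; the scoring function below is a simplified stand-in for the thesis's degree-of-inclusion function Z, not its exact definition:

# Documents and queries as DNF sentences: sets of conjunctive clauses, each
# clause a frozenset of index terms. implies() is the set-inclusion check;
# inclusion_score() is a simplified stand-in for the Z function.
def implies(doc_clauses, query_clauses):
    # d -> q is valid iff every document clause contains some query clause.
    return all(any(q <= d for q in query_clauses) for d in doc_clauses)

def inclusion_score(doc_clauses, query_clauses):
    # Uncertainty U(d -> q) as the best fraction of query-clause terms covered,
    # averaged over document clauses (illustrative only).
    scores = []
    for d in doc_clauses:
        scores.append(max(len(q & d) / len(q) for q in query_clauses))
    return sum(scores) / len(scores)

doc = {frozenset({"lattice", "logic", "retrieval"})}
query = {frozenset({"logic", "retrieval"}), frozenset({"probabilistic"})}
print(implies(doc, query))          # True: the clause covers {logic, retrieval}
print(inclusion_score(doc, query))  # 1.0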
APA, Harvard, Vancouver, ISO, and other styles
29

Lupinetti, Katia. "Identification of shape and structural characteristics in assembly models for retrieval applications." Thesis, Paris, ENSAM, 2018. http://www.theses.fr/2018ENAM0003/document.

Full text
Abstract:
The wide use of CAD systems in many industrial fields, such as automotive, naval, and aerospace, has generated a number of 3D databases and made a large quantity of 3D digital models available. Within enterprises that use these technologies, it is common practice to access CAD models of previously developed products. In fact, the design of new products often refers to existing models, since similar products reveal common problems and their solutions in advance. It is therefore useful to have technological solutions that can evaluate the similarities between different products, so that the user can retrieve existing models and thus access the associated information useful for the new design. The concept of similarity has been widely studied in the literature, and it is well known that two objects can be similar under different perspectives. These multiple possibilities complicate the assessment of the similarity between two objects. Many methods have been proposed for recognizing different similarities between parts, but little research addresses this problem for assembly models. While the similarity between two parts can already be evaluated from different perspectives, for assemblies the number of viewpoints increases considerably, since more elements play a meaningful role. Based on these requirements, we propose a system for retrieving similar assemblies according to different similarity criteria. To achieve this goal, it is necessary to have an assembly description that includes all the information required to characterize the possible similarity criteria between two assemblies. Therefore, one of the main topics of this work is the definition of a descriptor capable of encoding the data needed for the evaluation of similarity, adaptable to different objectives. In addition, some of the information included in the descriptor may be available in CAD models, while other information has to be extracted appropriately. Algorithms are therefore proposed for extracting the information necessary to fill in the descriptor elements. Finally, for the evaluation of assembly similarity, several measures are defined, each evaluating a specific aspect of the similarity.
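A minimal illustration of that last point, combining several per-criterion similarity measures into one retrieval score; the criteria, scores, and weights below are hypothetical and are not the descriptor or measures defined in the thesis:

# Combine several per-criterion similarity measures between two assembly
# descriptors into one retrieval score, with user-chosen weights per criterion.
def combined_similarity(measures, weights):
    # measures and weights: dicts keyed by criterion name; measures in [0, 1].
    total = sum(weights.values())
    return sum(weights[c] * measures[c] for c in measures) / total

measures = {"shape": 0.8, "structure": 0.6, "joints": 0.9}  # hypothetical scores
weights = {"shape": 2.0, "structure": 1.0, "joints": 1.0}   # user's priorities
print(combined_similarity(measures, weights))               # 0.775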
APA, Harvard, Vancouver, ISO, and other styles
30

Petkova, Desislava I. "Cluster-based relevance models for automatic image annotation /." Connect to online version, 2005. http://ada.mtholyoke.edu/setr/websrc/pdfs/www/2005/124.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Fei, Qi. "Operation models for information systems /." View abstract or full-text, 2009. http://library.ust.hk/cgi/db/thesis.pl?IELM%202009%20FEI.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Udoyen, Nsikan. "Information Modeling for Intent-based Retrieval of Parametric Finite Element Analysis Models." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14084.

Full text
Abstract:
Adaptive reuse of parametric finite element analysis (FEA) models is a common form of reuse that involves integrating new information into an archived FEA model to apply it to a new, similar physical problem. Adaptive reuse of archived FEA models is often motivated by the need to assess the impact of minor improvements to component-based designs, such as the addition of new structural components, or the need to assess new failure modes that arise when a device is redesigned for new operating environments or loading conditions. Successful adaptive reuse of FEA models involves reference to supporting documents that capture the formulation of the model, to determine what new information can be integrated and how. However, FEA models and supporting documents are not stored in formats that are semantically rich enough to support automated inference of their relevance to a modeler's needs. The modeler's inability to precisely describe information needs and execute queries based on such requirements results in inefficient queries and time spent manually assessing irrelevant models. The central research question in this research is thus: how do we incorporate a modeler's intent into automated retrieval of FEA models for adaptive reuse? An automated retrieval method to support adaptive reuse of parametric FEA models has been developed in the research documented in this thesis. The method consists of a classification-based retrieval method based on ALE subsumption hierarchies, which classify models using semantically rich description logic representations of physical problem structure, and a reusability-based ranking method. Conceptual data models have been developed for the representations that support both retrieval and ranking of archived FEA models. The method is validated using representations of FEA models of several classes of electronic chip packages. Experimental results indicate that the properties of the representation methods support effective automation of retrieval functions for FEA models of component-based designs.
APA, Harvard, Vancouver, ISO, and other styles
33

Renners, Ingo. "Data-driven system identification via evolutionary retrieval of Takagi-Sugeno fuzzy models." [S.l. : s.n.], 2004. http://deposit.ddb.de/cgi-bin/dokserv?idn=974452351.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Lester, Neil. "Assisting the software reuse process through classification and retrieval of software models." Thesis, University of Ulster, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.311531.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Marcén, Terraza Ana Cristina. "Design of a Machine Learning-based Approach for Fragment Retrieval on Models." Doctoral thesis, Universitat Politècnica de València, 2021. http://hdl.handle.net/10251/158617.

Full text
Abstract:
Machine Learning (ML) is known as the branch of artificial intelligence that gathers statistical, probabilistic, and optimization algorithms which learn empirically. ML can exploit the knowledge and experience that have been generated for years to automatically perform different processes, and it has therefore been applied to a wide range of research areas, from medicine to software engineering. In fact, in the software engineering field, up to 80% of a system's lifetime is spent on the maintenance and evolution of the system. Companies that have been developing these software systems for a long time have gathered a huge amount of knowledge and experience, so ML is an attractive solution for reducing their maintenance costs by exploiting the gathered resources. Specifically, Traceability Link Recovery, Bug Localization, and Feature Location are amongst the most common and relevant tasks when maintaining software products. To tackle these tasks, researchers have proposed a number of approaches. However, most research focuses on traditional methods, such as Latent Semantic Indexing, which do not exploit the gathered resources. Moreover, most research targets code, neglecting other software artifacts such as models. In this dissertation, we present an ML-based approach for fragment retrieval on models (FRAME). The goal of this approach is to retrieve the model fragment that best realizes a specific query in a model. This allows engineers to retrieve the model fragment that must be traced, fixed, or located for software maintenance. Specifically, the FRAME approach combines evolutionary computation and ML techniques. In the FRAME approach, an evolutionary algorithm is guided by ML to effectively extract model fragments from a model. These model fragments are then assessed through ML techniques. To learn how to assess them, the ML techniques take advantage of the companies' knowledge (retrieved model fragments) and experience. Then, based on what was learned, the ML techniques determine which model fragment best realizes a query. However, model fragments are not understandable to most ML techniques. Therefore, the proposed approach encodes the model fragments through an ontological evolutionary encoding. In short, the FRAME approach is designed to extract model fragments, encode them, and assess which one best realizes a specific query. The approach has been evaluated with our industrial partner (CAF, an international provider of railway solutions) and compared to the most common and recent approaches. The results show that the FRAME approach achieved the best results for most performance indicators, providing a mean precision value of 59.91%, a recall value of 78.95%, a combined F-measure of 62.50%, and an MCC (Matthews correlation coefficient) value of 0.64. By leveraging retrieved model fragments, the FRAME approach is less sensitive to tacit knowledge and vocabulary mismatch than approaches based on semantic information. However, the approach is limited by the availability of retrieved model fragments to perform the learning. These aspects are further discussed, after the statistical analysis of the results, which assesses the magnitude of the improvement in comparison to the other approaches.
Marcén Terraza, AC. (2020). Design of a Machine Learning-based Approach for Fragment Retrieval on Models [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/158617
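For reference, the performance indicators reported above (precision, recall, F-measure, and MCC) can be computed from a confusion matrix as in the following sketch; the counts are invented and do not reproduce the thesis's figures:

import math

tp, fp, fn, tn = 60, 40, 16, 884  # hypothetical fragment-retrieval outcomes

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_measure = 2 * precision * recall / (precision + recall)
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
print(f"P={precision:.2%} R={recall:.2%} F={f_measure:.2%} MCC={mcc:.2f}")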
APA, Harvard, Vancouver, ISO, and other styles
36

Roos, Daniel. "Evaluation of BERT-like models for small scale ad-hoc information retrieval." Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177675.

Full text
Abstract:
Measuring semantic similarity between two sentences is an ongoing research field with big leaps being taken every year. This thesis looks at using modern methods of semantic similarity measurement for an ad-hoc information retrieval (IR) system. The main challenge tackled was answering the question "What happens when you don’t have situation-specific data?". Using encoder-based transformer architectures pioneered by Devlin et al., which excel at fine-tuning to situationally specific domains, this thesis shows just how well the presented methodology can work and makes recommendations for future attempts at similar domain-specific tasks. It also shows an example of how a web application can be created to make use of these fast-learning architectures.
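A minimal sketch of the kind of encoder-based similarity scoring discussed here, assuming the sentence-transformers package and a generic off-the-shelf checkpoint rather than the exact models evaluated in the thesis:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed off-the-shelf encoder

query = "reset a forgotten password"
docs = [
    "How to recover your account when the password is lost",
    "Pricing plans for enterprise customers",
]

q_emb = model.encode(query, convert_to_tensor=True)
d_emb = model.encode(docs, convert_to_tensor=True)
scores = util.cos_sim(q_emb, d_emb)[0]
ranking = scores.argsort(descending=True)
print([docs[int(i)] for i in ranking])

Fine-tuning such an encoder on in-domain pairs, when they exist, is what the thesis's "situation-specific data" question is about; the retrieval step itself stays a cosine ranking over embeddings.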
APA, Harvard, Vancouver, ISO, and other styles
37

Limbu, Dilip Kumar. "Contextual information retrieval from the WWW." Click here to access this resource online, 2008. http://hdl.handle.net/10292/450.

Full text
Abstract:
Contextual information retrieval (CIR) is a critical technique for today’s search engines in terms of facilitating queries and returning relevant information. Despite its importance, little progress has been made in its application, due to the difficulty of capturing and representing contextual information about users. This thesis details the development and evaluation of the contextual SERL search, designed to tackle some of the challenges associated with CIR from the World Wide Web. The contextual SERL search utilises a rich contextual model that exploits implicit and explicit data to modify queries to more accurately reflect the user’s interests as well as to continually build the user’s contextual profile and a shared contextual knowledge base. These profiles are used to filter results from a standard search engine to improve the relevance of the pages displayed to the user. The contextual SERL search has been tested in an observational study that has captured both qualitative and quantitative data about the ability of the framework to improve the user’s web search experience. A total of 30 subjects, with different levels of search experience, participated in the observational study experiment. The results demonstrate that when the contextual profile and the shared contextual knowledge base are used, the contextual SERL search improves search effectiveness, efficiency and subjective satisfaction. The effectiveness improves as subjects have actually entered fewer queries to reach the target information in comparison to the contemporary search engine. In the case of a particularly complex search task, the efficiency improves as subjects have browsed fewer hits, visited fewer URLs, made fewer clicks and have taken less time to reach the target information when compared to the contemporary search engine. Finally, subjects have expressed a higher degree of satisfaction on the quality of contextual support when using the shared contextual knowledge base in comparison to using their contextual profile. These results suggest that integration of a user’s contextual factors and information seeking behaviours are very important for successful development of the CIR framework. It is believed that this framework and other similar projects will help provide the basis for the next generation of contextual information retrieval from the Web.
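An illustrative sketch of the general re-ranking idea behind contextual profiles (this is not the SERL implementation; the profile representation and weighting are assumptions): results from a baseline engine are re-ordered by their overlap with a bag of weighted interest terms.

# Re-rank baseline engine results by their overlap with a user's contextual
# profile, here a simple bag of weighted interest terms.
def rerank(results, profile, alpha=0.5):
    # results: list of (url, engine_score, terms); profile: {term: weight}.
    def context_score(terms):
        return sum(profile.get(t, 0.0) for t in terms) / (len(terms) or 1)
    return sorted(results,
                  key=lambda r: alpha * r[1] + (1 - alpha) * context_score(r[2]),
                  reverse=True)

profile = {"python": 1.0, "retrieval": 0.8}
results = [
    ("a.example", 0.9, ["java", "gui"]),
    ("b.example", 0.7, ["python", "retrieval", "tutorial"]),
]
print([url for url, _, _ in rerank(results, profile)])  # b.example ranks first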
APA, Harvard, Vancouver, ISO, and other styles
38

Hart, James Brian. "An examination of two synthetic aperture radar wind retrieval models during NORCSEX '95." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 1996. http://handle.dtic.mil/100.2/ADA326275.

Full text
Abstract:
Thesis (M.S. in Meteorology and Physical Oceanography) Naval Postgraduate School, December 1996.
"December 1996." Thesis advisor(s): Kenneth Davidson, Carlyle H. Wash. Includes bibliographical references (p. 71-72). Also available online.
APA, Harvard, Vancouver, ISO, and other styles
39

Azzam, Hany. "Modelling semantic search : the evolution of knowledge modelling, retrieval models and query processing." Thesis, Queen Mary, University of London, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.538379.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

陸穎剛 and Wing-kong Luk. "Concept space approach for cross-lingual information retrieval." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2000. http://hub.hku.hk/bib/B30147724.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Kaliciak, Leszek. "Hybrid models for combination of visual and textual features in context-based image retrieval." Thesis, Robert Gordon University, 2013. http://hdl.handle.net/10059/924.

Full text
Abstract:
Visual Information Retrieval poses a challenge to intelligent information search systems. This is due to the semantic gap, the difference between human perception (information needs) and the machine representation of multimedia objects. Most existing image retrieval systems are monomodal, as they utilize only visual or only textual information about images. The semantic gap can be reduced by improving existing visual representations, making them suitable for large-scale generic image retrieval. The best current candidates for large-scale Content-based Image Retrieval are models based on the "Bag of Visual Words" framework. Existing approaches, however, produce high-dimensional and thus expensive representations for data storage and computation. Because the standard "Bag of Visual Words" framework disregards the relationships between the histogram bins, the model can be further enhanced by exploiting the correlations between the "visual words". Even improved visual features will find it hard to capture the abstract semantic meaning of some queries, e.g. "straight road in the USA". Textual features, on the other hand, would struggle with such queries as "church with more than two towers", as in many cases the information about the number of towers would be missing. Thus, visual and textual features represent complementary yet correlated aspects of the same information object, an image. Existing hybrid approaches for the combination of visual and textual features do not take these inherent relationships into account, and thus the performance improvement from the combination is limited. Visual and textual features can also be combined in the context of relevance feedback. Relevance feedback can help us narrow down and "correct" the search. The feedback mechanism would produce subsets of visual query and feedback representations as well as subsets of textual query and textual feedback representations. A meaningful feature combination in the context of relevance feedback should take the inherent inter (visual-textual) and intra (visual-visual, textual-textual) relationships into account. In this work, we propose a principled framework for semantic gap reduction in large-scale generic image retrieval. The proposed framework comprises the development and enhancement of novel visual features, a hybrid model for the combination of visual and textual features, and a hybrid model for the combination of features in the context of relevance feedback, with both fixed and adaptive weighting schemes (importance of a query and its context). Apart from the experimental evaluation of our models, theoretical validations of some interesting findings on feature fusion strategies were also performed. The proposed models were incorporated into our prototype system with an interactive user interface.
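A toy sketch of the baseline that hybrid approaches try to improve on, a fixed-weight late fusion of monomodal scores (not the thesis's hybrid model; the scores and weight are made up):

import numpy as np

def fuse(visual_scores, textual_scores, w_visual=0.4):
    # Combine per-image scores from the two modalities, each in [0, 1].
    v = np.asarray(visual_scores)
    t = np.asarray(textual_scores)
    return w_visual * v + (1 - w_visual) * t

visual = [0.9, 0.2, 0.6]    # e.g. bag-of-visual-words similarities
textual = [0.1, 0.8, 0.7]   # e.g. tag/annotation similarities
combined = fuse(visual, textual)
print(combined)                 # [0.42 0.56 0.66]
print(np.argsort(-combined))    # ranking: image 2, then 1, then 0

The thesis's point is precisely that such a fixed linear combination ignores the inter- and intra-modal correlations; adaptive weighting of the query and its context is one way to move beyond it.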
APA, Harvard, Vancouver, ISO, and other styles
42

El, Khoury Rachid. "Partial 3D-shape indexing and retrieval." Phd thesis, Institut National des Télécommunications, 2013. http://tel.archives-ouvertes.fr/tel-00834359.

Full text
Abstract:
A growing number of 3D graphics applications have an impact on today's society. These applications are used in several domains ranging from digital entertainment and computer-aided design to medical applications. In this context, a 3D object search engine with good performance in terms of both speed and quality of results becomes mandatory. We propose a novel approach for 3D-model retrieval based on closed curves, and then enhance the method to handle partial 3D-model retrieval. Our method starts with the definition of an invariant mapping function. The important properties of a mapping function are its invariance to rigid and non-rigid transformations, its correct description of the 3D-model, its insensitivity to noise, its robustness to topology changes, and its independence from parameters. However, current state-of-the-art methods do not respect all these properties. To respect these properties, we define our mapping function based on the diffusion and commute-time distances. To prove the properties of this function, we compute the Reeb graph of the 3D-models. To describe the whole 3D-model using our mapping function, we generate indexed closed curves from a source point detected automatically at the center of the 3D-model. Each curve describes a small region of the 3D-model. These curves are used to create a descriptor that is invariant to different transformations. To show the robustness of our method on various classes of 3D-models with different poses, we use shapes from SHREC 2012. We also compare our approach to existing state-of-the-art methods on a dataset from SHREC 2010. For partial 3D-model retrieval, we enhance the proposed method using a Bag-of-Features representation built from all the extracted closed curves, and show accurate performance on the same dataset.
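A toy sketch of the Bag-of-Features step used for partial retrieval; the codebook and curve descriptors below are random stand-ins, not the commute-time-based curves of the thesis, and histogram intersection is just one plausible comparison:

import numpy as np

rng = np.random.default_rng(1)
codebook = rng.normal(size=(32, 8))  # 32 "words" of dimension 8

def bag_of_features(curve_descriptors):
    # Quantize each curve descriptor to its nearest codeword and histogram.
    dists = np.linalg.norm(curve_descriptors[:, None, :] - codebook[None], axis=2)
    hist = np.bincount(dists.argmin(axis=1), minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def intersection(h1, h2):
    # Histogram intersection tolerates missing parts of a shape.
    return np.minimum(h1, h2).sum()

full_shape = bag_of_features(rng.normal(size=(120, 8)))
partial_shape = bag_of_features(rng.normal(size=(40, 8)))
print(intersection(full_shape, partial_shape))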
APA, Harvard, Vancouver, ISO, and other styles
43

Strunjas, Svetlana. "Algorithms and Models for Collaborative Filtering from Large Information Corpora." University of Cincinnati / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1220001182.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Yngve, Gary. "Visualization for biological models, simulation, and ontologies /." Thesis, Connect to this title online; UW restricted, 2008. http://hdl.handle.net/1773/6912.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Olivares, Ríos Ximena. "Large scale image retrieval base on user generated content." Doctoral thesis, Universitat Pompeu Fabra, 2011. http://hdl.handle.net/10803/22718.

Full text
Abstract:
Online photo sharing systems provide a valuable source of user generated content (UGC). Most Web image retrieval systems use textual annotations to rank the results, although these annotations do not only illustrate the visual content of an image but also describe subjective, spatial, temporal, and social dimensions, complicating the task of keyword-based search. The research in this thesis is focused on how to improve the retrieval of images in a large-scale context, i.e. the Web, using information provided by users combined with the visual content of the images. Different forms of UGC are explored, such as textual annotations, visual annotations, and click-through data, as well as different techniques to combine these data to improve the retrieval of images using visual information. In conclusion, the research conducted in this thesis focuses on the importance of including visual information in various steps of the retrieval of media content. Using visual information, in combination with various forms of UGC, can significantly improve retrieval performance and alter the user experience when searching for multimedia content on the Web.
APA, Harvard, Vancouver, ISO, and other styles
46

Gray, Brett. "Relational models of feature based concept formation, theory-based concept formation and analogical retrieval/mapping /." [St. Lucia, Qld.], 2003. http://www.library.uq.edu.au/pdfserve.php?image=thesisabs/absthe17450.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Zhang, Xiangmin. "A study of the effects of user characteristics on mental models of information retrieval systems." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0002/NQ41538.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

McPherson, Christopher. "Refinement of CALIPSO Aerosol Retrieval Models Through Analysis of Airborne High Spectral Resolution Lidar Data." Diss., The University of Arizona, 2011. http://hdl.handle.net/10150/145281.

Full text
Abstract:
The deepening of scientific understanding of atmospheric aerosols figures substantially into stated goals for climate change research and a variety of internationally collaborative earth observation missions. One such mission is the joint NASA/Centre National d'Études Spatiales (CNES) Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) mission, whose primary instrument is the Cloud-Aerosol LIdar with Orthogonal Polarization (CALIOP), a spaceborne two-wavelength, elastic-scatter lidar, which has been making continuous, global observations of atmospheric aerosols and clouds since June of 2006, shortly after its launch in April of the same year. The work presented in this dissertation consists of the development of an aerosol retrieval strategy to improve aerosol retrievals from lidar data from the CALIPSO mission, as well as a comprehensive formulation of accompanying aerosol models based on a thorough analysis of data from an airborne High Spectral Resolution Lidar (HSRL) instrument. The retrieval methodology, known as the Constrained Ratio Aerosol Model-fit (CRAM) technique, is a means of exploiting the available dual-wavelength information from CALIOP to constrain the possible solutions to the problem of aerosol retrieval from elastic-scatter lidar so as to be consistent with theoretically or empirically known aerosol models. Constraints applied via CRAM are manifested in spectral ratios of scattering parameters corresponding to observationally-based aerosol models. Consequently, accurate and representative models incorporating various spectral scattering parameters are instrumental to the successful implementation of a methodology like CRAM. The aerosol models arising from this work are derived from measurements made by the NASA Langley Research Center (LaRC) airborne HSRL instrument, which has the capability to measure both aerosol scattering parameters (i.e., backscatter and extinction) independently at 532 nm. The instrument also incorporates an elastic-scatter channel at 1064 nm, facilitating the incorporation of dual-wavelength information by way of particular constraints. The intent in developing these new models is to furnish as satisfactory a basis as possible for retrieval techniques such as CRAM, whose approach to the problem of aerosol retrieval attempts to make optimal use of the available spectral information from multi-wavelength lidar, thus providing a framework for improving aerosol retrievals from CALIPSO and furthering the scientific goals related to atmospheric aerosols.
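A schematic illustration of the kind of consistency check a constrained-ratio approach implies: a candidate retrieval is accepted only if its spectral ratios fall within the ranges of a known aerosol model. The ratio ranges and candidate values below are invented placeholders, not values from the dissertation or from HSRL data.

aerosol_models = {
    "marine": {"lidar_ratio_532": (15, 30), "color_ratio": (0.4, 0.8)},
    "dust":   {"lidar_ratio_532": (35, 55), "color_ratio": (0.6, 1.0)},
}

def consistent(candidate, model):
    # candidate: backscatter at 532/1064 nm (1/(km sr)) and extinction at 532 nm (1/km).
    lidar_ratio = candidate["extinction_532"] / candidate["backscatter_532"]
    color_ratio = candidate["backscatter_1064"] / candidate["backscatter_532"]
    lo_l, hi_l = model["lidar_ratio_532"]
    lo_c, hi_c = model["color_ratio"]
    return lo_l <= lidar_ratio <= hi_l and lo_c <= color_ratio <= hi_c

candidate = {"backscatter_532": 2e-3, "backscatter_1064": 1.2e-3, "extinction_532": 0.05}
print([name for name, m in aerosol_models.items() if consistent(candidate, m)])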
APA, Harvard, Vancouver, ISO, and other styles
49

Wu, Bruce Jiinpo. "The effects of data models and conceptual models of the structured query language on the task of query writing by end users." Thesis, University of North Texas, 1991. https://digital.library.unt.edu/ark:/67531/metadc332680/.

Full text
Abstract:
This research is an empirical investigation of human factors in the use of database systems. The problem motivating the study is the difficulty encountered by end users in retrieving data from a database.
APA, Harvard, Vancouver, ISO, and other styles
50

Murugesan, Keerthiram. "CLUSTER-BASED TERM WEIGHTING AND DOCUMENT RANKING MODELS." UKnowledge, 2011. http://uknowledge.uky.edu/gradschool_theses/651.

Full text
Abstract:
A term weighting scheme measures the importance of a term in a collection. A document ranking model uses these term weights to find the rank or score of a document in a collection. We present a series of cluster-based term weighting and document ranking models based on the TF-IDF and Okapi BM25 models. These term weighting and document ranking models update the inter-cluster and intra-cluster frequency components based on the generated clusters. These inter-cluster and intra-cluster frequency components are used for weighting the importance of a term in addition to the term and document frequency components. In this thesis, we will show how these models outperform the TF-IDF and Okapi BM25 models in document clustering and ranking.
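An illustrative sketch of the idea (the exact weighting formula in the thesis may differ): extend TF-IDF with an inter-cluster factor, reflecting in how many clusters a term appears, and an intra-cluster factor, reflecting how common the term is within the document's own cluster.

import math
from collections import Counter

docs = [["hash", "binary", "code"], ["hash", "function", "code"],
        ["aerosol", "lidar"], ["lidar", "retrieval", "aerosol"]]
clusters = [0, 0, 1, 1]            # assumed cluster assignments
N, C = len(docs), len(set(clusters))

df = Counter(t for d in docs for t in set(d))           # document frequency
cf = Counter()                                          # cluster frequency
for c in set(clusters):
    cf.update(set().union(*(set(d) for d, cl in zip(docs, clusters) if cl == c)))

def weight(term, doc_id):
    doc, cl = docs[doc_id], clusters[doc_id]
    tf = doc.count(term)
    idf = math.log(N / df[term])
    icf = math.log(1 + C / cf[term])                    # inter-cluster rarity
    cluster_docs = [d for d, c in zip(docs, clusters) if c == cl]
    intra = sum(term in d for d in cluster_docs) / len(cluster_docs)
    return tf * idf * icf * (1 + intra)

print(round(weight("hash", 0), 3), round(weight("binary", 0), 3))

The same two cluster-level factors can be folded into the Okapi BM25 score in place of plain TF-IDF.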
APA, Harvard, Vancouver, ISO, and other styles
